Your AI Found a Target. Now What?

Andy Cosgrove, CRO

Artificial intelligence has transformed the early stages of drug discovery. Machine learning models can now scan billions of molecular interactions, identify patterns across vast datasets, and generate lists of promising drug targets in hours. It's genuinely impressive — and genuinely incomplete.

Because generating a hypothesis is not the same as validating one.

Despite emerging successes from Insilico and Recursion, whose AI-driven drugs have demonstrated clinical validation through Phase I and Phase II trials, the uncomfortable truth is that most AI-generated targets still fail. Not because the AI is wrong, but because hypothesis generation and evidence generation are fundamentally different problems. One requires pattern recognition at scale. The other requires rigorous, multimodal, reproducible science: the kind that holds up in front of a CSO, a portfolio committee, or an FDA reviewer.

That gap — between AI-generated hypothesis and validated, actionable target — is where drug programs succeed or fail.

Bullseye First
Robert Plenge, Chief Research Officer at Bristol Myers Squibb, has written compellingly about this on his blog. His “Bullseye, Aim, Fire” framework argues that the most critical, and most underinvested, step in drug R&D is establishing causal human biology before committing to a program. Hit the bullseye first. Then aim. Then fire.

AI accelerates the “aim” and “fire” phases considerably. What it doesn’t yet solve is the bullseye: the rigorous, genetically anchored, multimodal evidence that a target is causally linked to disease in humans. That still requires deep analytical work across large, diverse, well-harmonized datasets.

Plenge’s framework is a useful lens for understanding why so many AI-generated targets are still failing in the clinic: the industry is aiming and firing faster than ever, but the bullseye work isn’t keeping pace.

What Rigorous Evidence Actually Requires
Validating an AI-generated target with the rigor that justifies a major program commitment requires several things working together:

Multimodal data integration. Genomic signals need to be corroborated by proteomic, transcriptomic, phenotypic, and clinical data. A target that shows up convincingly across multiple data modalities is a fundamentally different bet than one supported by a single signal.

Multi-cohort analysis. A finding that replicates across UK Biobank, All of Us, FinnGen, and your proprietary biobanks is a finding you can build on. A finding from a single cohort is a starting point.

Harmonization at scale. Getting diverse datasets to speak the same language — same variant nomenclature, same phenotype definitions, same quality standards — is unglamorous, time-consuming, and absolutely essential. It’s also where most DIY solutions quietly break down.

Reproducibility. In an era of increasing regulatory scrutiny and internal governance requirements, the ability to reproduce an analysis exactly — same inputs, same pipeline, same outputs — is not optional. It’s foundational.
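
To make three of these requirements concrete, here is a minimal Python sketch, offered as an illustration under stated assumptions rather than a description of any particular platform's pipeline. It harmonizes variant identifiers to one spelling, combines per-cohort effect estimates with a standard fixed-effect inverse-variance meta-analysis, and fingerprints the inputs so a rerun can be checked against the original. The `Assoc` schema, the function names, and the `chrom:pos:ref:alt` key format are all hypothetical choices made for this example, and the normalization deliberately ignores harder real-world cases such as strand flips, swapped alleles, and genome-build differences.

```python
import hashlib
import json
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Assoc:
    """One cohort's association estimate for one variant (illustrative schema)."""
    cohort: str
    variant: str  # cohort-specific spelling, e.g. "chr1:12345:a:g"
    beta: float   # effect size for the alt allele
    se: float     # standard error of beta


def normalize_variant(raw: str) -> str:
    """Map spellings like 'chr1:12345:a:g' or '1_12345_A_G' to '1:12345:A:G'."""
    chrom, pos, ref, alt = raw.replace("_", ":").split(":")
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]
    return f"{chrom}:{int(pos)}:{ref.upper()}:{alt.upper()}"


def meta_analyze(rows):
    """Fixed-effect inverse-variance meta-analysis per harmonized variant."""
    grouped = {}
    for r in rows:
        grouped.setdefault(normalize_variant(r.variant), []).append(r)
    results = {}
    for key, group in grouped.items():
        weights = [1.0 / (r.se ** 2) for r in group]  # precision weights
        total = sum(weights)
        beta = sum(w * r.beta for w, r in zip(weights, group)) / total
        se = math.sqrt(1.0 / total)
        results[key] = {
            "beta": beta,
            "se": se,
            "z": beta / se,
            "n_cohorts": len(group),
        }
    return results


def fingerprint(rows, pipeline_version: str) -> str:
    """SHA-256 over order-independent, harmonized inputs plus a pipeline version."""
    payload = {
        "pipeline_version": pipeline_version,
        "rows": sorted(
            [r.cohort, normalize_variant(r.variant), r.beta, r.se] for r in rows
        ),
    }
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    rows = [
        Assoc("cohort_a", "chr1:12345:a:g", 0.10, 0.05),
        Assoc("cohort_b", "1_12345_A_G", 0.20, 0.05),
        Assoc("cohort_c", "1:12345:A:G", 0.15, 0.10),
    ]
    print(meta_analyze(rows))
    print(fingerprint(rows, "v1"))
```

The point of the sketch is the ordering: the three cohort records only pool into a single estimate because their identifiers were harmonized first, and the fingerprint only means something because it is computed over the harmonized inputs and the pipeline version together.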

The Build vs. Buy Trap
Now more than ever, with AI coding assistants close at hand, many organizations respond to this challenge by building their own solution. The reasoning is understandable: you have data scientists, you have cloud infrastructure, you have smart people.

What you don’t have is time. Building a production-grade, multi-cohort, multimodal analytics environment that meets your reproducibility, security, and compliance requirements is a multi-year undertaking. The hidden costs compound: data harmonization, pipeline maintenance, dataset updates, staff turnover, and the opportunity cost of your best scientists doing data engineering and maintenance instead of science.

A Better Approach
Paradigm4’s REVEAL platform was built specifically for this problem. It is the multimodal evidence platform for rigorous, reproducible target validation — purpose-built for the complexity of population-scale biobank data, multi-cohort meta-analysis, and the integrative analytics that turn AI-generated hypotheses into confident decisions.

REVEAL brings together genomics, multi-omics, phenotypes, and EHR data across cohorts and institutions in a single, harmonized, analysis-ready environment. The data gravity is already there. The harmonization is already done. The pipelines are already validated.

Alnylam Pharmaceuticals uses REVEAL as the foundation of its target discovery efforts, enabling its team to identify novel, genetically validated targets from population-scale biobank data at a pace and depth that would be impossible to replicate with a homegrown solution.

We work with global biopharma teams as a platform, as a provider of purpose-built bioinformatics agents, or as a scientific services partner — meeting you where you are and scaling with your program.

Hit the Bullseye
Plenge’s framework ends with a simple imperative: you have to know what you’re aiming at before you fire. AI has made firing faster and cheaper. But the bullseye — causal human biology, rigorously established across multimodal data — still requires the right infrastructure, the right data, and the right scientific expertise.
That’s what REVEAL delivers. Your AI found a target. We help you prove it.

Garay post

Chris Garay, Ph.D., Director, Paradigm4 Life Science Applications

Imagine having easy access to public genetic association summary statistics, with the ability to meta-analyze them together with your expanding collection of proprietary association datasets. Each dataset provides a unique window into the role human genetic variation plays in disease. While datasets are valuable on their own, their true potential is unlocked when they are intelligently integrated to reveal patterns that remain hidden in individual analyses.

Dr. Matt Brauer - How to grow a biomed startup

Dr. Matt Brauer

Biobanks offer a rich source of data for biomed startups, but the route to deriving maximum benefit from them is not always straightforward. We talk to Matt Brauer, Vice President of Data Science at Maze Therapeutics, about his experience with the UK Biobank, FinnGen and other consortia, and the role that making data-management processes open-source has played in delivering business success.

Dr. Lygia Pereira - Thirty years in genetics

Dr. Lygia Pereira

The course of academic research rarely runs smoothly, and through her 30-year career in genetics, Professor Lygia V. Pereira has certainly seen plenty of challenges – but also lots of successes. We talk to her about the importance of genetic diversity to better serve the health needs of countries like Brazil but also to improve our understanding of disease and health across all populations, and why we need to make bioinformatics tools truly user-friendly.