Artificial intelligence has transformed the early stages of drug discovery. Machine learning models can now scan billions of molecular interactions, identify patterns across vast datasets, and generate lists of promising drug targets in hours. It’s genuinely impressive — and genuinely incomplete.
Because generating a hypothesis is not the same as validating one.
Despite emerging successes from Insilico and Recursion, whose AI-driven drugs have demonstrated clinical validation in Phase I and Phase II trials, the uncomfortable truth is that most AI-generated targets still fail. Not because the AI is wrong, but because hypothesis generation and evidence generation are fundamentally different problems. One requires pattern recognition at scale. The other requires rigorous, multimodal, reproducible science, the kind that holds up in front of a CSO, a portfolio committee, or an FDA reviewer.
That gap — between AI-generated hypothesis and validated, actionable target — is where drug programs succeed or fail.
Bullseye First
Robert Plenge, Chief Research Officer at Bristol Myers Squibb, has written compellingly about this on his blog. His “Bullseye, Aim, Fire” framework argues that the most critical, and most underinvested, step in drug R&D is establishing causal human biology before committing to a program. Hit the bullseye first. Then aim. Then fire.
AI accelerates the “aim” and “fire” phases considerably. What it doesn’t yet solve is the bullseye: the rigorous, genetically anchored, multimodal evidence that a target is causally linked to disease in humans. That still requires deep analytical work across large, diverse, well-harmonized datasets.
Plenge’s framework is a useful lens for understanding why so many AI-generated targets are still failing in the clinic: the industry is aiming and firing faster than ever, but the bullseye work isn’t keeping pace.
What Rigorous Evidence Actually Requires
Validating an AI-generated target with the rigor that justifies a major program commitment requires several things working together:
Multimodal data integration. Genomic signals need to be corroborated by proteomic, transcriptomic, phenotypic, and clinical data. A target that shows up convincingly across multiple data modalities is a fundamentally different bet than one supported by a single signal.
Multi-cohort analysis. A finding that replicates across UK Biobank, All of Us, FinnGen, and your proprietary biobanks is a finding you can build on. A finding from a single cohort is a starting point.
Harmonization at scale. Getting diverse datasets to speak the same language — same variant nomenclature, same phenotype definitions, same quality standards — is unglamorous, time-consuming, and absolutely essential. It’s also where most DIY solutions quietly break down.
Reproducibility. In an era of increasing regulatory scrutiny and internal governance requirements, the ability to reproduce an analysis exactly — same inputs, same pipeline, same outputs — is not optional. It’s foundational.
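To make the multi-cohort point concrete, here is a minimal sketch of one common way to combine per-cohort association results: a fixed-effect, inverse-variance-weighted meta-analysis, with Cochran's Q and I² as a rough check on whether the cohorts agree. It is an illustration under stated assumptions, not a description of any particular platform's pipeline. The `CohortEstimate` and `MetaResult` types, the `fixed_effect_meta` function, the cohort labels, and every number are hypothetical, and the sketch assumes effect estimates have already been harmonized to the same allele and phenotype definition.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class CohortEstimate:
    cohort: str   # hypothetical cohort label
    beta: float   # effect estimate, e.g. log odds ratio per allele
    se: float     # standard error of beta


@dataclass(frozen=True)
class MetaResult:
    beta: float       # pooled effect estimate
    se: float         # standard error of the pooled estimate
    p_value: float    # two-sided p-value under a normal approximation
    q: float          # Cochran's Q heterogeneity statistic
    i_squared: float  # share of variation attributed to heterogeneity (0..1)


def fixed_effect_meta(estimates):
    """Pool per-cohort estimates with inverse-variance weights."""
    if len(estimates) < 2:
        raise ValueError("need at least two cohorts to meta-analyze")
    if any(e.se <= 0 for e in estimates):
        raise ValueError("standard errors must be positive")

    # Each cohort is weighted by the inverse of its variance.
    weights = [1.0 / (e.se ** 2) for e in estimates]
    total_weight = sum(weights)

    pooled_beta = sum(w * e.beta for w, e in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    # Two-sided p-value: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))

    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (e.beta - pooled_beta) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0

    return MetaResult(pooled_beta, pooled_se, p_value, q, i_squared)


if __name__ == "__main__":
    # Made-up numbers for illustration only; not real biobank results.
    example = [
        CohortEstimate("cohort_A", beta=0.12, se=0.03),
        CohortEstimate("cohort_B", beta=0.09, se=0.05),
        CohortEstimate("cohort_C", beta=0.15, se=0.04),
    ]
    print(fixed_effect_meta(example))
```

A low I² alongside a consistent pooled effect is the kind of signal that distinguishes a replicated finding from a single-cohort starting point. A real analysis would also need to address sample overlap, ancestry differences, and whether a random-effects model is more appropriate.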
The Build vs. Buy Trap
With AI coding assistants making software easier to write than ever, many organizations respond to this challenge by building their own solution. The reasoning is understandable: you have data scientists, you have cloud infrastructure, you have smart people.
What you don’t have is time. Building a production-grade, multi-cohort, multimodal analytics environment that meets your reproducibility, security, and compliance requirements is a multi-year undertaking. The hidden costs compound: data harmonization, pipeline maintenance, dataset updates, staff turnover, and the opportunity cost of your best scientists doing data engineering and maintenance instead of science.
A Better Approach
Paradigm4’s REVEAL platform was built specifically for this problem. It is the multimodal evidence platform for rigorous, reproducible target validation — purpose-built for the complexity of population-scale biobank data, multi-cohort meta-analysis, and the integrative analytics that turn AI-generated hypotheses into confident decisions.
REVEAL brings together genomics, multi-omics, phenotypes, and EHR data across cohorts and institutions in a single, harmonized, analysis-ready environment. The data gravity is already there. The harmonization is already done. The pipelines are already validated.
Alnylam Pharmaceuticals uses REVEAL as the foundation of its target discovery efforts, enabling its team to identify novel, genetically validated targets from population-scale biobank data at a pace and depth that would be impossible to replicate with a homegrown solution.
We work with global biopharma teams as a platform, as a provider of purpose-built bioinformatics agents, or as a scientific services partner — meeting you where you are and scaling with your program.
Hit the Bullseye
Plenge’s framework ends with a simple imperative: you have to know what you’re aiming at before you fire. AI has made firing faster and cheaper. But the bullseye — causal human biology, rigorously established across multimodal data — still requires the right infrastructure, the right data, and the right scientific expertise.
That’s what REVEAL delivers. Your AI found a target. We help you prove it.