REVEAL Solutions maximize the productivity of limited bioinformatics and data science resources with out-of-the-box solutions for current challenges in translational data science.
Metadata and data are well-described with provenance so that they can be replicated and/or combined in different settings.
Versioning of data, algorithms, genome assembly, reference datasets, ontologies, and computing environment along with logging, provenance and permissions support reusability.
REVEAL: SingleCell is designed to be the single point of truth for spatial and suspension single cell data, including images, enabling scalable cohort selection and analytics at patient-population scale (billions of cells).
Findable: View and explore hundreds of single cell multiomics datasets (e.g., RNA-seq, CITE-seq, Spatial Transcriptomics) through R, Python, and REST APIs.
Accessible: Run analytical queries and algorithms across datasets and millions of cells within user-friendly, intuitive interfaces (e.g., search expression for gene signatures, run batch correction at scale, find all macrophage cells across hundreds of datasets).
Interoperable: Use visualization tools and GUIs to analyze results and correlate findings in multiomics data such as spatial transcriptomics.
Reusable: Organized storage of data removes repeated ETL overhead. Create custom precision cell atlases (e.g., kidney cells) that can be versioned and annotated for specific downstream analysis.
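The cross-dataset cell search mentioned above (for example, finding all macrophage cells across many datasets) can be pictured with a minimal sketch. It is not the REVEAL API: the `Cell` record, the `select_cells` and `count_by_dataset` helpers, and the toy records are all hypothetical and only illustrate metadata-based cohort selection over centrally organized data.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Hypothetical per-cell metadata record (not a REVEAL type)."""
    dataset_id: str
    cell_id: str
    cell_type: str
    tissue: str


def select_cells(cells, cell_type, tissue=None):
    """Return cells whose type matches (case-insensitively) and,
    if a tissue is given, whose tissue matches exactly."""
    wanted = cell_type.lower()
    return [
        c for c in cells
        if c.cell_type.lower() == wanted
        and (tissue is None or c.tissue == tissue)
    ]


def count_by_dataset(cells):
    """Count selected cells per dataset."""
    return dict(Counter(c.dataset_id for c in cells))


# Toy records standing in for metadata drawn from several datasets.
CELLS = [
    Cell("ds1", "c1", "macrophage", "kidney"),
    Cell("ds1", "c2", "T cell", "kidney"),
    Cell("ds2", "c3", "Macrophage", "lung"),
    Cell("ds3", "c4", "macrophage", "kidney"),
]

macrophages = select_cells(CELLS, "macrophage")
kidney_macrophages = select_cells(CELLS, "macrophage", tissue="kidney")
```

Here `count_by_dataset(macrophages)` summarizes how the selected cohort is spread across datasets, which is the kind of question a centralized store makes cheap to ask.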
Immediately available Population Scale Data
Immediately available solution to centralize all single cell data
Ease of use with low code
Data management layer with schema tailored for fast retrieval and compute based on metadata and scalability to 100s of TBs of data
Costs
Predictable costs
Domain expert workflows
Extensible: Build your own workflows in Python or R
Integration/Interoperability
APIs to retrieve data in a format that can be integrated with other data and return data in whatever formats are required by downstream analysis, e.g., data frames in Python or R, Parquet files, and bioinformatics-specific formats
Interoperable with flexFS and AI/ML packages
Governance
Integrated provenance, versioning, traceability, and logs
SSO
Granular, slice-level access control
Public Data Sets
Publicly available datasets come pre-loaded for simplified validation
REVEAL: VariantBank is a single point of truth for public biobanks and clinical studies with variant data, enabling cohort creation, target and biomarker discovery, off-target effects, scalable genomics analysis, and evidence generation with the lowest TCO on the market.
Are you using UK Biobank, Our Future Health, or other large biobanks? Public association data from the Broad or the NY Genome Center?
A Jupyter notebook interface lets users scale analyses to tens of billions of calculations at 10-1,000x lower cost than do-it-yourself approaches.
Interoperable: Allows any device-independent dataset to be loaded.
Findable: REVEAL: Reference provides your organization with a single source of feature and ontology names, integrated into a simple user interface and updated, with versioning, from government and organization source servers. By definition, reference data becomes instantly findable.
Accessible: REVEAL: Reference can be accessed directly through R and Python APIs or indirectly through application GUIs that automatically lift over names and provide requested annotations.
Interoperable: REVEAL: Reference is designed to be interoperable with REVEAL application APIs, GUIs, and containers.
Reusable: REVEAL: Reference persists versions and provides a single point of truth for annotations and ontology harmonization.
Challenge 1 – public reference data servers are often down or extremely slow
Challenge 2 – public reference data servers are difficult to query with more than one name
Challenge 3 – exceptions, versions, and ambiguities prevent an automated solution to name conversion, annotation, and data linking
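To make Challenges 2 and 3 concrete, here is a minimal sketch of resolving many names in one call against a versioned local snapshot, with ambiguous and unknown names reported rather than guessed. It is not the REVEAL: Reference API. The snapshot layout, the `resolve_names` function, and the placeholder symbols (`SYM_A`, `OLD_X`, and so on) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resolution:
    query: str      # the name exactly as the caller supplied it
    status: str     # "current", "renamed", "ambiguous", or "unknown"
    matches: tuple  # current names the query maps to (may be empty)


# Hypothetical versioned snapshot of a reference naming source.
# Placeholder symbols only; no real gene names are implied.
SNAPSHOT = {
    "version": "toy-2024-01",
    "current": {"SYM_A", "SYM_B", "SYM_C"},
    "aliases": {
        "OLD_A": {"SYM_A"},           # retired name with one successor
        "OLD_X": {"SYM_B", "SYM_C"},  # retired name claimed by two entries
    },
}


def resolve_names(names, snapshot):
    """Resolve a batch of names against one snapshot version.

    Simplifying assumption: names are compared after upper-casing,
    which would not be safe for case-sensitive identifiers.
    """
    results = []
    for name in names:
        key = name.strip().upper()
        if key in snapshot["current"]:
            results.append(Resolution(name, "current", (key,)))
            continue
        targets = tuple(sorted(snapshot["aliases"].get(key, ())))
        if len(targets) == 1:
            status = "renamed"
        elif targets:
            status = "ambiguous"  # surface the conflict instead of picking one
        else:
            status = "unknown"
        results.append(Resolution(name, status, targets))
    return results


resolved = resolve_names(["sym_a", "OLD_A", "OLD_X", "nope"], SNAPSHOT)
```

Recording `SNAPSHOT["version"]` alongside the results keeps a conversion reproducible when the upstream source later changes.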
It’s all about time. If your answers are counted in days, then you know why you should ask us about REVEAL: Reference.
<1 minute
2 days for expert bioinformaticist
?
<1 minute
2 days for expert bioinformaticist
?
Dataset | Size | Source
Biobank
Broad-Pan UKBB QTL dataset (all by all) | 40 TB | https://pan.ukbb.broadinstitute.org/docs/hail-format
LD | 40 TB | https://pan.ukbb.broadinstitute.org/docs/hail-format
TCGA VCFs (+gVCFs) | 10 TB | TCGA Data Portal + Client Pipeline
PPP pQTLs | 702 TB | UK Biobank
deCODE Proteomics summary stats | 5 TB | Ferkingstad, E. et al. Large-scale integration of the plasma proteome with genetics and disease.
deCODE GWAS | 48 TB | https://www.decode.com/ukbsummary/
GTEx QTL | 460 GB | https://console.cloud.google.com/storage/browser/gtex-resources/GTEx_Analysis_v8_QTLs/GTEx_Analysis_v8_EUR_eQTL_all_associations
ClinVar
dbNSFP
SingleCell
Human Cell Atlas, 40+ million cells | 4 TB (expression data only) | https://data.humancellatlas.org/explore/projects
Proteomics
PRIDE | 3 TB
Explore the latest papers and posters about REVEAL Solutions
ASHG: Exploring the genotypic and phenotypic significance of Polycystic Kidney Disease-2 (PKD2) variants in the UK Biobank using REVEAL.
ASHG: Ancestry determination using REVEAL: VariantBank, an efficient storage, management, & computational analysis platform for VCF files.
ASHG: Analysis of Computational Algorithms for Spatial Transcriptomics using REVEAL.