Solicitation Number: 75N95020S-NCATS-0001
NCATS established the NCATS Collaborative Scientific Platform to integrate, manage, secure, and analyze data across the translational spectrum, from basic research through pre-clinical research, clinical research, clinical implementation, and public health improvements. This platform-as-a-service (PaaS) currently supports NCATS, the National Cancer Institute (NCI), and the President’s Emergency Plan for AIDS Relief (PEPFAR). The Platform has integrated hundreds of live intramural and third-party data sources in support of dozens of ongoing, critical scientific projects that rely on continuous access to the data and analyses within the Platform. The Collaborative Scientific Platform is now the standard means for accessing and collaboratively analyzing NCATS screening data for dozens of investigators at both NCATS and NCI, and for accessing and analyzing RNA-Seq and several kinds of proteomics data at NCI. Pilot projects are underway within the Platform to extend its use to clinical applications that will support NCATS programs, namely the Clinical and Translational Science Awards (CTSA) and the Rare Diseases Clinical Research Network (RDCRN), as well as NCI and PEPFAR.
The Platform is based on the Foundry software platform from Palantir, Inc., under contract number HHSN271201800020I, awarded in 2018.
Purpose and Objectives: Provide a collaborative translational research scientific platform as a service (PaaS) to integrate, manage, secure, and analyze data across the translational research spectrum.
Project requirements:
- A commercial software solution that is deployable on day one of the project and can be configured within expedited timelines. Respondents must possess FedRAMP authorization so that an agency authority to operate (ATO) can be granted upon award of a contract.
- An open data architecture, where data always remains under the full control of NCATS and can be easily exported in open, non-proprietary data formats via open APIs. The software should be built on an open, distributed microservices architecture with open, well-documented REST APIs that are designed to seamlessly interface with other systems, adapt to meet evolving needs, and avoid system lock-in.
- Proven data integration capabilities, including the ability to rapidly ingest unprocessed high-throughput drug screening (HTS) outputs, genomic data (such as DNA sequencing, RNA-Seq, and miRNA), mass spectrometry, flow cytometry, and other data types used in basic and translational biosciences research.
- The ability to maintain data and scientific provenance and reproducibility for all integrated data sources, such that every resource (dataset, analysis, code, plot, report) carries provenance and metadata and can be traced back to the exact version of all upstream dependencies, and such that the dependency tree can be easily replayed given new data or updated analysis logic, while still retaining prior versions and branches…