Real-world health data are critical for Patient-Centered Outcomes Research (PCOR). However, it’s often difficult, expensive, and time-consuming for researchers to access real-world clinical health data because of privacy concerns, security restrictions, and usage issues. PCOR researchers, health information technology developers, and informaticists often depend on anonymized or de-identified clinical health data to test theories, data models, algorithms, and prototype innovations, but re-identification of anonymized data remains a possible security risk. Synthetic health data can provide a no-risk data source to complement research and support testing needs until real clinical health data are available.
In 2019, ONC launched the Synthetic Health Data Project (Project) to enable Synthea™, a synthetic health data generation engine created by MITRE Corporation, to produce high-quality synthetic health data in specific areas, as well as more varied and plentiful synthetic health records. The Project focused on creating synthetic data to support PCOR on complex care needs, opioid use, and pediatric populations. Synthea was chosen because it is free, open-source, community-driven, shareable, and reliable. It can also generate realistic health data for fictitious patients, and it generates data randomly and independently from publicly available datasets, so its output is free of protected health information (PHI) and personally identifiable information (PII)…