Data Integration

One of the challenges of big data research is integrating diverse data types from different sources, and this is especially true for electronic health record data. We have developed two software tools for integrating clinical data. The first, Carnival, is a graph database that links different data elements. The second, PennTURBO, maps data elements to a biomedical ontology that facilitates inference. We are working with the University of Pennsylvania Health System to integrate these tools into our health IT infrastructure so that all of our investigators can perform user-friendly cohort discovery and analysis.

Data Integration projects include:

Carnival is a data unification technology that aggregates disparate data into a unified property graph resource. Inspired by Open Biological and Biomedical Ontology (OBO) Foundry ontologies, the Carnival data model supports the execution of common investigatory tasks including patient cohort identification, automated case-control matching, and the production of data sets for scientific analysis.
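
To make the idea of a property graph concrete, the following is a minimal Python sketch of cohort identification over labelled vertices and edges. It does not use Carnival's own API: the PropertyGraph class, the find_cohort function, the has_diagnosis edge label, and all patient records and codes are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PropertyGraph:
    """A toy in-memory property graph (hypothetical, not Carnival's data model)."""
    vertices: dict = field(default_factory=dict)  # vertex id -> (label, properties)
    edges: list = field(default_factory=list)     # (source id, edge label, target id)

    def add_vertex(self, vid, label, **props):
        self.vertices[vid] = (label, props)

    def add_edge(self, src, label, dst):
        self.edges.append((src, label, dst))

    def neighbors(self, src, label):
        """Ids of vertices reachable from src over edges with the given label."""
        return [d for s, l, d in self.edges if s == src and l == label]


def find_cohort(graph, diagnosis_code, min_age):
    """Patients at least min_age years old with a diagnosis carrying the given code."""
    cohort = []
    for vid, (label, props) in graph.vertices.items():
        if label != "Patient" or props.get("age", 0) < min_age:
            continue
        codes = {
            graph.vertices[d][1].get("code")
            for d in graph.neighbors(vid, "has_diagnosis")
        }
        if diagnosis_code in codes:
            cohort.append(vid)
    return sorted(cohort)


g = PropertyGraph()
# Made-up records standing in for data aggregated from separate sources.
g.add_vertex("p1", "Patient", age=64)
g.add_vertex("p2", "Patient", age=41)
g.add_vertex("p3", "Patient", age=70)
g.add_vertex("d1", "Diagnosis", code="E11")
g.add_vertex("d2", "Diagnosis", code="I10")
g.add_edge("p1", "has_diagnosis", "d1")
g.add_edge("p2", "has_diagnosis", "d1")
g.add_edge("p3", "has_diagnosis", "d2")

cohort = find_cohort(g, diagnosis_code="E11", min_age=50)
print(cohort)
```

Once patients and diagnoses from different sources live in one graph, a cohort query reduces to a filter on vertex properties plus a traversal over edges.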

TURBO stands for Transforming and Unifying Research with Biomedical Ontologies. The PennTURBO group accelerates the finding and connecting of key information from clinical records for research by semantically associating the clinical data with the processes that generated them. These associations make it possible to discover previously unappreciated relations among the data.
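
As a rough illustration of how mapping data elements to an ontology can support inference, the Python sketch below walks a tiny is-a hierarchy to find every source code that falls under a broader term. It is not PennTURBO's implementation: the IS_A hierarchy, the LOCAL: codes, and the functions ancestors and codes_under are all hypothetical.

```python
# Hypothetical mini-ontology: child term -> parent term ("is a" relation).
IS_A = {
    "type 2 diabetes mellitus": "diabetes mellitus",
    "type 1 diabetes mellitus": "diabetes mellitus",
    "diabetes mellitus": "metabolic disease",
    "hypertension": "vascular disease",
}

# Hypothetical mapping from local source codes to ontology terms.
CODE_TO_TERM = {
    "LOCAL:001": "type 2 diabetes mellitus",
    "LOCAL:002": "type 1 diabetes mellitus",
    "LOCAL:003": "hypertension",
}


def ancestors(term):
    """All terms reachable from term by following is-a links upward."""
    found = []
    while term in IS_A:
        term = IS_A[term]
        found.append(term)
    return found


def codes_under(target):
    """Source codes whose mapped term is the target or one of its descendants."""
    return sorted(
        code
        for code, term in CODE_TO_TERM.items()
        if term == target or target in ancestors(term)
    )


diabetes_codes = codes_under("diabetes mellitus")
print(diabetes_codes)
```

A query for the broad term retrieves records coded with either specific subtype, even though no source record mentions the broad term directly.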