IBI Seminar: Ryan J. Urbanowicz, Ph.D.
June 1 @ 10:00 am - 11:00 am
“A new paradigm for data mining in bioinformatics: Embracing genetic heterogeneity”
Ryan J. Urbanowicz, Ph.D.
Research Associate, Department of Biostatistics, Epidemiology and Informatics
University of Pennsylvania
Thursday, June 1, 2017
JMB Reunion Auditorium
The goals for data mining in bioinformatics are to detect, model, interpret, and make predictions. In genomics, this typically involves connecting predictive genotypes with a disease phenotype. However, this task is regularly confounded by the complexity of underlying patterns of association as well as the scale of modern data. Genetic heterogeneity, in particular, is a source of complexity that is well recognized but has been almost universally sidestepped by genomic analyses that rely instead on data stratification or homogeneous sampling. Our biostatistics and machine learning tools adhere to a classic paradigm of modeling that is not equipped to deal with heterogeneous associations. In this talk, we will (1) discuss a new paradigm of ‘piece-wise’ modeling that successfully embraces genetic heterogeneity using a rule-based approach to machine learning, (2) review the advancements we have made in developing state-of-the-art bioinformatics methods that can detect both genetic heterogeneity and gene-gene interactions, as well as identify candidate clinical patient subgroups towards personalized medicine, and (3) delve deeper into our broadly applicable work on feature selection methods that must similarly be sensitive to both genetic heterogeneity and gene-gene interactions. We show that by embracing heterogeneous associations, we can solve problems that had not previously been solved, gain the potential to recover ‘missing heritability’ from existing research cohorts, and offer a unique, flexible new tool for the incoming wave of whole-genome sequencing analyses.