A major focus of my research is the analysis of genotype-phenotype relationships using tree-based statistical models (referred to as decision trees in the machine learning community) and a more recent extension of them, random forests. My investigations in this area examine how DNA or protein sequence data can be used to understand and predict variation in basic immune system functions at the molecular level (Segal et al. 2001), RNA editing in plant mitochondrial genomes (Cummings and Myers 2004), and drug resistance in tuberculosis (Cummings and Segal 2004). More generally, the relationship of genotype to phenotype is a fundamental problem in genetics, and I hope these investigations will yield a deeper understanding of that relationship.
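To illustrate the core idea behind a tree-based model applied to sequence data, here is a minimal sketch in Python of how a single split in a classification tree could be chosen from aligned sequences. It is not the implementation used in the studies cited above. The function names (`gini`, `best_split`), the toy alignment, and the phenotype labels are all hypothetical, and Gini impurity is assumed as the splitting criterion.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(sequences, phenotypes):
    """Find the (position, residue) test that most reduces Gini impurity.

    Each candidate split asks whether the residue at one aligned position
    equals one particular residue. Returns (position, residue, decrease),
    or (None, None, 0.0) if no split improves on the parent node.
    """
    n = len(sequences)
    parent_impurity = gini(phenotypes)
    best = (None, None, 0.0)
    for pos in range(len(sequences[0])):
        # Sort residues so that ties are broken deterministically.
        for residue in sorted({seq[pos] for seq in sequences}):
            left = [p for s, p in zip(sequences, phenotypes) if s[pos] == residue]
            right = [p for s, p in zip(sequences, phenotypes) if s[pos] != residue]
            if not left or not right:
                continue  # the test does not separate any observations
            weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
            decrease = parent_impurity - weighted
            if decrease > best[2]:
                best = (pos, residue, decrease)
    return best


# Hypothetical toy data: six aligned four-residue sequences with made-up labels.
toy_sequences = ["MKVA", "MKVA", "MRVA", "MRLA", "MKLA", "MRVA"]
toy_phenotypes = ["susceptible", "susceptible", "resistant",
                  "resistant", "susceptible", "resistant"]

position, residue, decrease = best_split(toy_sequences, toy_phenotypes)
print(position, residue, decrease)
```

In this toy data the residue at index 1 (K versus R) separates the two phenotype classes completely, so that position is selected. A full tree applies the same search recursively to each resulting subset, and a random forest averages many such trees, each grown on a resampled data set with a random subset of candidate positions.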
Much of our present work on genotype-phenotype relationships takes the protein structural context into account. One example is my National Science Foundation-funded research with Rebecca Gast and David Beaudoin on cold adaptation of tubulins from protists living in Arctic and Antarctic environments. Microtubules are highly conserved biological structures that are essential to eukaryotic cellular functions such as cell division, locomotion, and maintenance of cytoskeletal structure. A microtubule is formed by the assembly (polymerization) of tubulin subunits, along with a heterogeneous collection of microtubule-associated proteins (MAPs). Each tubulin subunit is composed of two related but distinct proteins, alpha and beta tubulin. Heterodimers of alpha and beta tubulin associate longitudinally to form protofilaments, and 13 protofilaments normally come together to form a microtubule. Microtubule assembly is mediated largely through hydrophobic interactions between the carboxy terminus of alpha tubulin and the amino terminus of beta tubulin. A well-documented characteristic of microtubules is that they disassemble (depolymerize) at low temperatures. Using machine learning methods, we have identified changes at specific residues that are associated with adaptation of tubulins to cold environments.
Our other current research on genotype-phenotype relationships considers ecological influences on phenotype. Studies in this area include our Department of Energy-funded research with Egbert Schwartz and Bruce Hungate, which combines data from microbial genomics and environmental measurements to predict nitrous oxide fluxes from soil.
Cummings, M. P., D. S. Myers, and M. Mangelson. 2004. Applying Permutation Tests to Tree-Based Statistical Models: Extending the R Package rpart. Technical Report CS-TR-4581, UMIACS-TR-2004-24, Center for Bioinformatics and Computational Biology, Institute for Advanced Computer Studies, University of Maryland.