Application of two machine learning algorithms to genetic association studies in the presence of covariates.

Bareng AS Nonyane; Andrea S Foulkes (2008) Application of two machine learning algorithms to genetic association studies in the presence of covariates. BMC Genetics, 9 (1). 71. ISSN 1471-2156 DOI: 10.1186/1471-2156-9-71

BACKGROUND: Population-based investigations aimed at uncovering genotype-trait associations often involve high-dimensional genetic polymorphism data as well as information on multiple environmental and clinical parameters. Machine learning (ML) algorithms offer a straightforward analytic approach for selecting subsets of these inputs that are most predictive of a pre-defined trait. The performance of these algorithms in the presence of covariates, however, is not well characterized.

METHODS AND RESULTS: In this manuscript, we investigate two approaches: Random Forests (RFs) and Multivariate Adaptive Regression Splines (MARS). Through multiple simulation studies, the performance under several underlying models is evaluated. An application to a cohort of HIV-1 infected individuals receiving anti-retroviral therapies is also provided.

CONCLUSION: Consistent with more traditional regression modeling theory, our findings highlight the importance of considering the nature of underlying gene-covariate-trait relationships before applying ML algorithms, particularly when there is potential confounding or effect mediation.
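The confounding concern raised in the conclusion can be illustrated with a small simulation. The sketch below is not the authors' simulation design and uses neither Random Forests nor MARS; it is a standard-library illustration under assumed, hypothetical parameters (a binary covariate `stratum`, allele frequencies 0.4 and 0.1, a trait shift of 1.0). The covariate drives both the genotype and the trait, while the genotype has no direct effect on the trait. A crude carrier versus non-carrier comparison then suggests an association that largely vanishes once the comparison is made within covariate strata.

```python
import random
from statistics import mean


def simulate(n=20000, seed=0):
    """Simulate a SNP confounded by a binary covariate (hypothetical setup).

    The covariate shifts both the minor-allele frequency and the trait;
    the SNP itself has no direct effect on the trait.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        stratum = rng.random() < 0.5                  # binary covariate
        maf = 0.4 if stratum else 0.1                 # allele frequency depends on covariate
        # Carrier status: at least one minor allele out of two draws
        carrier = (rng.random() < maf) or (rng.random() < maf)
        # Trait depends on the covariate only, plus Gaussian noise
        trait = (1.0 if stratum else 0.0) + rng.gauss(0.0, 1.0)
        rows.append((stratum, carrier, trait))
    return rows


def carrier_effect(rows):
    """Difference in mean trait between carriers and non-carriers."""
    carriers = [t for _, c, t in rows if c]
    others = [t for _, c, t in rows if not c]
    return mean(carriers) - mean(others)


rows = simulate()

# Crude comparison ignores the covariate entirely
crude = carrier_effect(rows)

# Adjusted comparison: estimate the effect within each stratum, then average
adjusted = mean(
    carrier_effect([r for r in rows if r[0] == s]) for s in (False, True)
)

print(f"crude carrier effect:    {crude:.3f}")
print(f"adjusted carrier effect: {adjusted:.3f}")
```

Any variable-selection method given only the genotype as input would see the crude signal, which is why the structure of the gene-covariate-trait relationship matters before a ranking of predictors is interpreted.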

Item Type	Article
Keywords	Algorithms; Artificial Intelligence; Cholesterol, HDL/blood; Computational Biology; Computer Simulation; Genetic Predisposition to Disease; Genotype; HIV Infections/blood/drug therapy/ethnology/genetics; Humans; Lipase/genetics; Models, Statistical; Polymorphism, Single Nucleotide/genetics; Regression Analysis
ISI	262191500001

File	1471-2156-9-71.pdf (Published Version). Available under Creative Commons: 3.0
