Clarke B, Fokoué E, Zhang HH: Principles and theory for data mining and machine learning. 2009, Springer, New York.

Inza I, Calvo B, Armananzas R, Bengoetxea E, Larranaga P, Lozano JA: Machine learning: an indispensable tool in bioinformatics. Methods Mol Biol. 2010, 593: 25-48. 10.1007/978-1-60327-194-3_2.

Witten I, Frank E: Data mining: practical machine learning tools and techniques. 2005, Morgan Kaufmann Publishers, San Francisco.

Holzinger E, Szymczak S, Malley J, Pugh E, Ling H, Griffith S, Zhang P, Li Q, Cropp C, Bailey-Wilson J: Comparison of parametric and machine methods for variable selection in simulated GAW19 data. BMC Proc. 2015, 9 Suppl 8: S15.

Ziegler A, DeStefano AL, König IR, on behalf of Group 6: Data mining, neural nets, trees—problems 2 and 3 of Genetic Analysis Workshop 15. Genet Epidemiol. 2007, 31: S51-S60. 10.1002/gepi.20280.

Yang HC, Lin YT: Homozygosity disequilibrium and its gene regulation. BMC Proc. 2015, 9 Suppl 8: S17.

Clark AG, Boerwinkle E, Hixson J, Sing CF: Determinants of the success of whole-genome association testing. Genome Res. 2005, 15: 1463-1467. 10.1101/gr.4244005.

Auerbach J, Agne M, Fan R, Lo A, Lo S, Zheng T, Wang P: Identifying regions of disease related variants in admixed populations with the summation partition approach. BMC Proc. 2015, 9 Suppl 8: S12.

Fan R, Lo SH: A robust model-free approach for rare variants association studies incorporating gene-gene and gene-environmental interactions. PLoS One. 2013, 8: e83057. 10.1371/journal.pone.0083057.

Yang HC, Chang LC, Liang YJ, Lin CH, Wang PL: A genome-wide homozygosity association study identifies runs of homozygosity associated with rheumatoid arthritis in the human major histocompatibility complex. PLoS One. 2012, 7: e34840. 10.1371/journal.pone.0034840.

Sun R, Deng Q, Hu I, Zee BC-Y, Wang MH: A clustering approach to identify rare variants associated with hypertension. BMC Proc. 2015, 9 Suppl 8: S16.

Wu MC, Lee S, Cai T, Li Y, Boehnke M, Lin X: Rare-variant association testing for sequencing data with the sequence kernel association test. Am J Hum Genet. 2011, 89: 82-93. 10.1016/j.ajhg.2011.05.029.

Ritchie MD, Holzinger ER, Li R, Pendergrass SA, Kim D: Methods of integrating data to uncover genotype-phenotype interactions. Nat Rev Genet. 2015, 16: 85-97. 10.1038/nrg3868.

Held E, Cape J, Tintle N: Comparing machine learning and logistic regression methods for predicting hypertension using a combination of gene expression and next-generation sequencing data. BMC Proc. 2015, 9 Suppl 8: S14.

Huang HH, Xu T, Yang J: Comparing logistic regression, support vector machines, and permanental classification methods in predicting hypertension. BMC Proc. 2014, 8: S96. 10.1186/1753-6561-8-S1-S96.

Li B, Leal SM: Methods for detecting associations with rare variants for common diseases: application to analysis of sequence data. Am J Hum Genet. 2008, 83: 311-321. 10.1016/j.ajhg.2008.06.024.

Dering C, König IR, Ramsey L, Relling M, Yang W, Ziegler A: A comprehensive evaluation of collapsing methods using simulated and real data: excellent annotation of functionality and large sample sizes required. Front Genet. 2014, 5: 323. 10.3389/fgene.2014.00323.

Kruppa J, Ziegler A, König IR: Risk estimation and risk prediction using machine-learning methods. Hum Genet. 2012, 131: 1639-1654. 10.1007/s00439-012-1194-y.

Haddow JE, Palomaki GE: A model process for evaluating data on emerging genetic tests. Human genome epidemiology: scope and strategies. Edited by: Khoury MJ, Little J, Burke W. 2004, Oxford University Press, New York, 217-233.

Blume J, Peipert JF: What your statistician never told you about p-values. J Am Assoc Gynecol Laparosc. 2003, 10: 439-444. 10.1016/S1074-3804(05)60143-0.

Simon R: Class probability estimation for medical studies. Biom J. 2014, 56: 597-600. 10.1002/bimj.201300296.

Fisher R: The logic of inductive inference. J R Stat Soc Series B Stat Methodol. 1935, 98: 39-54. 10.2307/2342435.

Gorlov IP, Moore JH, Peng B, Jin JL, Gorlova OY, Amos CI: SNP characteristics predict replication success in association studies. Hum Genet. 2014, 133: 1477-1486. 10.1007/s00439-014-1493-6.

Ziegler A, König IR: Mining data with random forests: current options for real-world applications. WIREs Data Mining Knowl Discov. 2014, 4: 55-63. 10.1002/widm.1114.

Breiman L: Random forests. Mach Learn. 2001, 45: 5-32. 10.1023/A:1010933404324.

Schwarz DF, König IR, Ziegler A: On safari to random jungle: a fast implementation of random forests for high dimensional data. Bioinformatics. 2010, 26: 1752-1758. 10.1093/bioinformatics/btq257.

Strobl C, Malley J, Tutz G: An introduction to recursive partitioning: rationale, application, and characteristics of classification and regression trees, bagging, and random forests. Psychol Methods. 2009, 14: 323-348. 10.1037/a0016973.

Yang HC, Li HW: Analysis of homozygosity disequilibrium using whole-genome sequencing data. BMC Proc. 2014, 8: S15. 10.1186/1753-6561-8-S1-S15.

Upstill-Goddard R, Eccles D, Fliege J, Collins A: Machine learning approaches for the discovery of gene-gene interactions in disease data. Brief Bioinform. 2013, 14: 251-260. 10.1093/bib/bbs024.

Gola D, König IR: Identification of interactions using model-based multifactor dimensionality. BMC Proc. 2015, 9 Suppl 8: S13.

Kira K, Rendell LA: The feature selection problem: traditional methods and a new algorithm. Proceedings of the tenth national conference on artificial intelligence. 1992.

Calle ML, Urrea V, Vellalta G, Malats N, Steen KV: Improving strategies for detecting genetic patterns of disease susceptibility in association studies. Stat Med. 2008, 27: 6532-6546. 10.1002/sim.3431.

Chen HS, Hutter CM, Mechanic LE, Amos CI, Bafna V, Hauser ER, Hernandez RD, Li C, Liberles DA, McAllister K, et al: Genetic simulation tools for post-genome wide association studies of complex diseases. Genet Epidemiol. 2015, 39: 11-19. 10.1002/gepi.21870.

Mjolsness E, DeCoste D: Machine learning for science: state of the art and future prospects. Science. 2001, 293 (5537): 2051-2055. 10.1126/science.293.5537.2051.

Kruppa J, Liu Y, Biau G, Kohler M, König IR, Malley JD, Ziegler A: Probability estimation with machine learning methods for dichotomous and multicategory outcome: theory. Biom J. 2014, 56: 534-563. 10.1002/bimj.201300068.

Kruppa J, Liu Y, Diener HC, Holste T, Weimar C, König IR, Ziegler A: Probability estimation with machine learning methods for dichotomous and multicategory outcome: applications. Biom J. 2014, 56: 564-583. 10.1002/bimj.201300077.

Ademuyiwa FO, Miller A, O’Connor T, Edge SB, Thorat MA, Sledge GW, Levine E, Badve S: The effects of Oncotype DX recurrence scores on chemotherapy utilization in a multi-institutional breast cancer cohort. Breast Cancer Res Treat. 2011, 126: 797-802. 10.1007/s10549-010-1329-6.

Cronin M, Sangli C, Liu ML, Pho M, Dutta D, Nguyen A, Jeong J, Wu J, Langone KC, Watson D: Analytical validation of the Oncotype DX genomic diagnostic test for recurrence prognosis and therapeutic response prediction in node-negative, estrogen receptor-positive breast cancer. Clin Chem. 2007, 53: 1084-1091. 10.1373/clinchem.2006.076497.

McKinney BA, Reif DM, Ritchie MD, Moore JH: Machine learning for detecting gene-gene interactions: a review. Appl Bioinformatics. 2006, 5: 77-88. 10.2165/00822942-200605020-00002.

Breiman L: Statistical modeling: the two cultures. Stat Sci. 2001, 16: 199-231. 10.1214/ss/1009213726.
