Clarke B, Fokoué E, Zhang HH: Principles and theory for data mining and machine learning. 2009, Springer, New York
Inza I, Calvo B, Armananzas R, Bengoetxea E, Larranaga P, Lozano JA: Machine learning: an indispensable tool in bioinformatics. Methods Mol Biol. 2010, 593: 25-48. 10.1007/978-1-60327-194-3_2.
Witten I, Frank E: Data mining: practical machine learning tools and techniques. 2005, Morgan Kaufmann Publishers, San Francisco
Holzinger E, Szymczak S, Malley J, Pugh E, Ling H, Griffith S, Zhang P, Li Q, Cropp C, Bailey-Wilson J: Comparison of parametric and machine methods for variable selection in simulated GAW19 data. BMC Proc. 2015, 9 Suppl 8: S15.
Ziegler A, DeStefano AL, König IR, on behalf of Group 6: Data mining, neural nets, trees—problems 2 and 3 of Genetic Analysis Workshop 15. Genet Epidemiol. 2007, 31: S51-S60. 10.1002/gepi.20280.
Yang HC, Lin YT: Homozygosity disequilibrium and its gene regulation. BMC Proc. 2015, 9 Suppl 8: S17.
Clark AG, Boerwinkle E, Hixson J, Sing CF: Determinants of the success of whole-genome association testing. Genome Res. 2005, 15: 1463-1467. 10.1101/gr.4244005.
Auerbach J, Agne M, Fan R, Lo A, Lo S, Zheng T, Wang P: Identifying regions of disease related variants in admixed populations with the summation partition approach. BMC Proc. 2015, 9 Suppl 8: S12.
Fan R, Lo SH: A robust model-free approach for rare variants association studies incorporating gene-gene and gene-environmental interactions. PLoS One. 2013, 8: e83057. 10.1371/journal.pone.0083057.
Yang HC, Chang LC, Liang YJ, Lin CH, Wang PL: A genome-wide homozygosity association study identifies runs of homozygosity associated with rheumatoid arthritis in the human major histocompatibility complex. PLoS One. 2012, 7: e34840. 10.1371/journal.pone.0034840.
Sun R, Deng Q, Hu I, Zee BC-Y, Wang MH: A clustering approach to identify rare variants associated with hypertension. BMC Proc. 2015, 9 Suppl 8: S16.
Wu MC, Lee S, Cai T, Li Y, Boehnke M, Lin X: Rare-variant association testing for sequencing data with the sequence kernel association test. Am J Hum Genet. 2011, 89: 82-93. 10.1016/j.ajhg.2011.05.029.
Ritchie MD, Holzinger ER, Li R, Pendergrass SA, Kim D: Methods of integrating data to uncover genotype-phenotype interactions. Nat Rev Genet. 2015, 16: 85-97. 10.1038/nrg3868.
Held E, Cape J, Tintle N: Comparing machine learning and logistic regression methods for predicting hypertension using a combination of gene expression and next-generation sequencing data. BMC Proc. 2015, 9 Suppl 8: S14.
Huang HH, Xu T, Yang J: Comparing logistic regression, support vector machines, and permanental classification methods in predicting hypertension. BMC Proc. 2014, 8: S96. 10.1186/1753-6561-8-S1-S96.
Li B, Leal SM: Methods for detecting associations with rare variants for common diseases: application to analysis of sequence data. Am J Hum Genet. 2008, 83: 311-321. 10.1016/j.ajhg.2008.06.024.
Dering C, König IR, Ramsey L, Relling M, Yang W, Ziegler A: A comprehensive evaluation of collapsing methods using simulated and real data: excellent annotation of functionality and large sample sizes required. Front Genet. 2014, 5: 323. 10.3389/fgene.2014.00323.
Kruppa J, Ziegler A, König IR: Risk estimation and risk prediction using machine-learning methods. Hum Genet. 2012, 131: 1639-1654. 10.1007/s00439-012-1194-y.
Haddow JE, Palomaki GE: A model process for evaluating data on emerging genetic tests. Human genome epidemiology: scope and strategies. Edited by: Khoury MJ, Little J, Burke W. 2004, Oxford University Press, New York, 217-233.
Blume J, Peipert JF: What your statistician never told you about p-values. J Am Assoc Gynecol Laparosc. 2003, 10: 439-444. 10.1016/S1074-3804(05)60143-0.
Simon R: Class probability estimation for medical studies. Biom J. 2014, 56: 597-600. 10.1002/bimj.201300296.
Fisher R: The logic of inductive inference. J R Stat Soc Series B Stat Methodol. 1935, 98: 39-54. 10.2307/2342435.
Gorlov IP, Moore JH, Peng B, Jin JL, Gorlova OY, Amos CI: SNP characteristics predict replication success in association studies. Hum Genet. 2014, 133: 1477-1486. 10.1007/s00439-014-1493-6.
Ziegler A, König IR: Mining data with random forests: current options for real-world applications. WIREs Data Mining Knowl Discov. 2014, 4: 55-63. 10.1002/widm.1114.
Breiman L: Random forests. Mach Learn. 2001, 45: 5-32. 10.1023/A:1010933404324.
Schwarz DF, König IR, Ziegler A: On safari to random jungle: a fast implementation of random forests for high dimensional data. Bioinformatics. 2010, 26: 1752-1758. 10.1093/bioinformatics/btq257.
Strobl C, Malley J, Tutz G: An introduction to recursive partitioning: rationale, application, and characteristics of classification and regression trees, bagging, and random forests. Psychol Methods. 2009, 14: 323-348. 10.1037/a0016973.
Yang HC, Li HW: Analysis of homozygosity disequilibrium using whole-genome sequencing data. BMC Proc. 2014, 8: S15. 10.1186/1753-6561-8-S1-S15.
Upstill-Goddard R, Eccles D, Fliege J, Collins A: Machine learning approaches for the discovery of gene-gene interactions in disease data. Brief Bioinform. 2013, 14: 251-260. 10.1093/bib/bbs024.
Gola D, König IR: Identification of interactions using model-based multifactor dimensionality. BMC Proc. 2015, 9 Suppl 8: S13.
Kira K, Rendell LA: The feature selection problem: traditional methods and a new algorithm. Proceedings of the tenth national conference on artificial intelligence. 1992
Calle ML, Urrea V, Vellalta G, Malats N, Steen KV: Improving strategies for detecting genetic patterns of disease susceptibility in association studies. Stat Med. 2008, 27: 6532-6546. 10.1002/sim.3431.
Chen HS, Hutter CM, Mechanic LE, Amos CI, Bafna V, Hauser ER, Hernandez RD, Li C, Liberles DA, McAllister K, et al: Genetic simulation tools for post-genome wide association studies of complex diseases. Genet Epidemiol. 2015, 39: 11-19. 10.1002/gepi.21870.
Mjolsness E, DeCoste D: Machine learning for science: state of the art and future prospects. Science. 2001, 293 (5537): 2051-2055. 10.1126/science.293.5537.2051.
Kruppa J, Liu Y, Biau G, Kohler M, König IR, Malley JD, Ziegler A: Probability estimation with machine learning methods for dichotomous and multicategory outcome: theory. Biom J. 2014, 56: 534-563. 10.1002/bimj.201300068.
Kruppa J, Liu Y, Diener HC, Holste T, Weimar C, König IR, Ziegler A: Probability estimation with machine learning methods for dichotomous and multicategory outcome: applications. Biom J. 2014, 56: 564-583. 10.1002/bimj.201300077.
Ademuyiwa FO, Miller A, O’Connor T, Edge SB, Thorat MA, Sledge GW, Levine E, Badve S: The effects of Oncotype DX recurrence scores on chemotherapy utilization in a multi-institutional breast cancer cohort. Breast Cancer Res Treat. 2011, 126: 797-802. 10.1007/s10549-010-1329-6.
Cronin M, Sangli C, Liu ML, Pho M, Dutta D, Nguyen A, Jeong J, Wu J, Langone KC, Watson D: Analytical validation of the Oncotype DX genomic diagnostic test for recurrence prognosis and therapeutic response prediction in node-negative, estrogen receptor-positive breast cancer. Clin Chem. 2007, 53: 1084-1091. 10.1373/clinchem.2006.076497.
McKinney BA, Reif DM, Ritchie MD, Moore JH: Machine learning for detecting gene-gene interactions: a review. Appl Bioinformatics. 2006, 5: 77-88. 10.2165/00822942-200605020-00002.
Breiman L: Statistical modeling: the two cultures. Stat Sci. 2001, 16: 199-231. 10.1214/ss/1009213726.