
Table 1 Comparison of the different methods.

From: The challenge for genetic epidemiologists: how to analyze large numbers of SNPs in relation to complex diseases

 

| | Logistic regression | Neural networks: PDM | Neural networks: GPNN | Set association | CPM | RPM | MDR | Random forests |
|---|---|---|---|---|---|---|---|---|
| Outcome variable | dichotomous | categorical, continuous | categorical, continuous | dichotomous | continuous | continuous | dichotomous | categorical |
| Dimensionality | no | no | yes | yes | yes | yes | yes | yes |
| Number of predictors | few | moderate | many | many | moderate | many* | moderate† | many |
| Power to detect important effects | low | no info | high | high | high | high | high | high |
| Detection of interactions | no | yes | yes | no | yes | yes | yes | yes‡ |
| Correlated predictors | no | no | yes | n.i.** | yes | yes | yes | no |
| Genetic heterogeneity | no | yes | yes | no | no | no | no | yes |

Software available

Open source

yes

yes

no

no

yes

yes

no

at request and under development

yes

yes

yes

yes

  1. For the problems of dimensionality, correlated predictors and genetic heterogeneity, "yes" and "no" indicate that a method is or is not able to handle the problem, respectively. For detection of interactions, "yes" and "no" indicate that a method is or is not able to detect interactions when the main effects of the loci involved in the interaction are small or absent.
  2. * RPM is subject to the multiple testing problem.
  3. † MDR can analyze a moderate number of factors, but filter methods that are part of the MDR software can be applied before using MDR, enabling the user of the MDR software to analyze large numbers of factors.
  4. ‡ Interactions contribute to the importance of predictors.
  5. ** n.i.: not implemented; the software does not adjust the test statistics for correlation between markers.