Do changes in DNA methylation mediate or interact with SNP variation? A pharmacoepigenetic analysis
BMC Genetics volume 19, Article number: 70 (2018)
Abstract
Background
In studies with multi-omics data available, there is an opportunity to investigate interdependent mechanisms of biological causality. The GAW20 data set includes both DNA genotype and methylation measures before and after fenofibrate treatment. Using the change in triglyceride (TG) levels from pretreatment to posttreatment as the outcome, we present a mediation analysis that incorporates methylation. This approach allows us to simultaneously consider a mediation hypothesis that genotype affects change in TG level by means of its effect on methylation, and an interaction hypothesis that the effect of change in methylation on change in TG levels differs by genotype. We select 322 single-nucleotide polymorphism–cytosine-phosphate-guanine (SNP–CpG) site pairs for mediation analysis on the basis of proximity and marginal genome-wide association study (GWAS) and epigenome-wide association study (EWAS) significance, and present results from the real-data sample of 407 individuals with complete genotype, methylation, TG levels, and covariate data.
Results
We identified 3 SNP–CpG site pairs with significant interaction effects at a Bonferroni-corrected significance threshold of 1.55E-4. None of the analyzed sites showed significant evidence of mediation. Power analysis by simulation showed that a sample size of at least 19,500 is needed to detect nominally significant indirect effects with true effect sizes equal to the point estimates at the locus with strongest evidence of mediation.
Conclusions
These results suggest that there is stronger evidence for interaction between genotype and methylation on change in triglycerides than for methylation mediating the effect of genotype.
Background
Epigenetic mechanisms, including DNA methylation, are known to influence the phenotypic consequences of genetic variation. To fully explain the biological mechanism of an outcome of interest, it is necessary to characterize the relationship between genetic and epigenetic effects. These relationships may be described as mediation, in which genetic variation influences methylation which then influences the phenotype, or interaction (also called effect modification) in which the average effect of methylation differs by genotype, or both.
Mediation analysis has been applied to epidemiological studies of genetic and epigenetic variation to investigate the first of these hypotheses [1, 2]. Previous studies found evidence that methylation may mediate genetic risk of rheumatoid arthritis, inflammatory bowel disease, and peanut allergy [3, 4]. Gene–environment interaction methods have also been adapted to pharmacogenetics trials to address the second hypothesis.
The GAW20 data set reports a single-arm clinical trial of a drug intended to lower triglyceride (TG) levels. TG and DNA methylation are observed both before and after drug treatment. In this article, we investigate the extent to which mediation and interaction effects between single-nucleotide polymorphisms (SNPs) and changes in methylation at nearby cytosine-phosphate-guanine (CpG) sites contribute to changes in TG levels. In this context, mediation effects represent a mechanism of drug action through context-specific methylation quantitative trait loci, while interaction effects may identify genetic subgroups in which drug-induced changes in methylation lead to changes in TG levels.
Methods
We analyzed the real GAW data set, comprising 407 individuals with complete TG, genotype, methylation, and covariate data. The sample of 679 individuals with TG, genotype, and covariate data was used for preliminary screening of SNPs for analysis. In the following, we present the details for an exposure A (SNP genotype alternate allele count), a continuous mediator M (difference in methylation posttreatment minus pretreatment), and a continuous outcome Y (difference in log TG posttreatment minus pretreatment). Relevant covariates C include age, sex, study center, and smoking status.
Mediation hypothesis
The counterfactual approach to mediation analysis provides methods to quantify these relationships [5, 6]. This approach is based on the potential outcomes of each subject, conditional on the levels of exposure and mediator. Only one of these potential outcomes is observed for each individual, but under certain assumptions, the others may be estimated from the data. Here, Y_{am} represents the potential outcome for exposure level A = a and mediator level M = m, and M(a) represents the level of the mediator that would be observed for a given subject with exposure level a. The total contribution of mediation through M to the effect of A on Y is given by the natural indirect effect (NIE): \( NIE={Y}_{aM(a)}-{Y}_{aM\left({a}^{\ast}\right)} \), which is the difference in potential outcomes for individuals with exposure level a, comparing the observed mediator level M(a) with the counterfactual mediator level M(a*) that they would have had if their exposure level had been a*. For notational simplicity, we take a = 1 and a* = 0 so the contrast is defined in terms of 1 additional alternate allele for the SNP under consideration. Note that this quantity will be zero if there is no effect of the exposure on the mediator [so that M(a) = M(a*)] or no effect of the mediator on the outcome (so that \( {Y}_{a{m}_1}={Y}_{a{m}_2} \) for any values m_{1}, m_{2} of the mediator). The NIE can be estimated from the simultaneous regression models as follows:
\( E\left[M|a,c\right]={\beta}_0+{\beta}_1a+{\beta}_2^{\prime }c \)  (1)
\( E\left[Y|a,m,c\right]={\theta}_0+{\theta}_1a+{\theta}_2m+{\theta}_3 am+{\theta}_4^{\prime }c \)  (2)
Under the assumptions described below, NIE = β_{1}(θ_{2} + θ_{3}). The SE of this estimate via the delta method is \( \sqrt{\Gamma \Sigma {\Gamma}^{\prime }} \), where Γ = (0, θ_{2} + θ_{3}, 0^{′}, 0, 0, β_{1}, β_{1}, 0^{′}) and Σ is the block-diagonal covariance matrix of the estimators from regression models (1) and (2).
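The point estimate and delta-method SE described here can be sketched in Python. This is a minimal illustration, not the authors' implementation: it assumes unrelated individuals and fits both models by ordinary least squares with NumPy, whereas the study itself used linear mixed models to account for relatedness. The function names `fit_ols` and `nie_delta` and the simulated data are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares: coefficients and their covariance matrix."""
    xtx_inv = np.linalg.inv(X.T @ X)
    coef = xtx_inv @ X.T @ y
    resid = y - X @ coef
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    return coef, sigma2 * xtx_inv

def nie_delta(a, m, y, c):
    """NIE = beta1 * (theta2 + theta3) for a = 1 vs a* = 0, with delta-method SE.

    a: allele counts, m: mediator, y: outcome, c: (n, k) covariate matrix.
    """
    ones = np.ones(len(a))
    X1 = np.column_stack([ones, a, c])            # mediator model: M ~ A + C
    X2 = np.column_stack([ones, a, m, a * m, c])  # outcome model: Y ~ A + M + A*M + C
    b, cov_b = fit_ols(X1, m)
    t, cov_t = fit_ols(X2, y)
    beta1, theta2, theta3 = b[1], t[2], t[3]
    nie = beta1 * (theta2 + theta3)
    k = c.shape[1]
    # Gradient of the NIE with respect to (beta0, beta1, beta2', theta0..theta3, theta4')
    gamma = np.concatenate([[0.0, theta2 + theta3], np.zeros(k),
                            [0.0, 0.0, beta1, beta1], np.zeros(k)])
    # Block-diagonal covariance of the two sets of estimators
    sigma = np.block([[cov_b, np.zeros((len(b), len(t)))],
                      [np.zeros((len(t), len(b))), cov_t]])
    se = float(np.sqrt(gamma @ sigma @ gamma))
    return float(nie), se

# Simulated example (all values are illustrative assumptions); true NIE = 0.5 * (0.4 + 0.2)
rng = np.random.default_rng(0)
n = 5000
a = rng.binomial(2, 0.3, n).astype(float)
c = rng.normal(size=(n, 2))
m = 0.5 * a + c @ np.array([0.2, -0.1]) + rng.normal(size=n)
y = 0.1 * a + 0.4 * m + 0.2 * a * m + c @ np.array([0.3, 0.1]) + rng.normal(size=n)
nie, se = nie_delta(a, m, y, c)
print(f"NIE = {nie:.3f}, SE = {se:.3f}, Wald z = {nie / se:.2f}")
```

A Wald test for the NIE then compares the ratio of the estimate to its SE against a standard normal distribution.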
This NIE estimator has a valid causal interpretation if models (1) and (2) are correctly specified and the following assumptions hold:

1. No unmeasured confounding for the exposure–outcome relationship.
2. No unmeasured confounding for the mediator–outcome relationship.
3. No unmeasured confounding for the exposure–mediator relationship.
4. No mediator–outcome confounder is affected by the exposure.
Similar assumptions are required for causal interpretation of any regression analysis.
Because the statistical power to detect indirect effects is low in studies with a small to moderate sample size, and because statistical hypothesis testing is not a valid method for qualitative assessment of confounding between the exposure and mediator, VanderWeele recommends comparing the magnitude of the total effect of the exposure on the outcome, estimated from a model that excludes the mediator, and the direct effect of exposure adjusting for the effect of the mediator and exposure–mediator interaction [6].
Interaction hypothesis
For the purpose of assessing mediation, the interaction term in model (2) is useful primarily to allow valid estimates in the presence of nonadditive contributions of the genetic and methylation effects. However, we are also interested in the interaction coefficient θ_{3} in its own right. The null hypothesis of no interaction, θ_{3} = 0, may be interpreted as follows: the effect of M on Y is the same at all levels of A. If this null hypothesis does not hold, we may identify genotypic subgroups with different methylation effects.
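To make the subgroup interpretation concrete: with additive allele coding, the effect of M on Y for a subject with a alternate alleles is θ_{2} + θ_{3}a. The sketch below computes this slope and its SE for each genotype from the outcome-model estimates. The function name and the numeric inputs are hypothetical and are not taken from the study results.

```python
import numpy as np

def methylation_effect_by_genotype(theta2, theta3, cov23, genotypes=(0, 1, 2)):
    """Slope of Y on M at each allele count a: theta2 + theta3 * a, with its SE.

    cov23 is the 2x2 covariance matrix of the (theta2, theta3) estimators.
    """
    cov23 = np.asarray(cov23, dtype=float)
    effects = {}
    for a in genotypes:
        w = np.array([1.0, float(a)])  # weights on (theta2, theta3)
        effects[a] = (theta2 + theta3 * a, float(np.sqrt(w @ cov23 @ w)))
    return effects

# Illustrative inputs only (assumed values, not estimates from the paper)
effects = methylation_effect_by_genotype(
    theta2=0.0, theta3=0.5, cov23=[[0.01, -0.005], [-0.005, 0.01]]
)
for a, (est, se) in effects.items():
    print(f"alleles={a}: effect={est:.2f}, SE={se:.3f}")
```

When θ_{3} = 0 the three slopes coincide, which is the null hypothesis described above.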
Implementation
The GAW20 real data set is drawn from a single-arm clinical trial of fenofibrate treatment in the Genetics of Lipid Lowering Drugs and Diet Network (GOLDN) study family-based cohort. We selected SNP–CpG site pairs by first running marginal association models with the phenotype:
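One form of these marginal models that is consistent with the rest of the text, in which γ_{1} denotes the total effect of the SNP in model (3), is sketched below. The δ coefficients for the methylation model (4) are assumed notation, and any random effects used to account for relatedness are omitted:

\( E\left[Y|a,c\right]={\gamma}_0+{\gamma}_1a+{\gamma}_2^{\prime }c \)  (3)
\( E\left[Y|m,c\right]={\delta}_0+{\delta}_1m+{\delta}_2^{\prime }c \)  (4)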
We then selected SNP–CpG site pairs meeting all 3 of the following criteria:

1. SNP p value < 1e-3
2. Methylation epigenome-wide association study p value < 0.05
3. Distance between SNP and CpG site < 50 kb
These criteria were chosen to balance the considerations of low statistical power resulting from multiple testing corrections against the possibility of failing to detect significant interactions when the marginal effects are negligible.
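The three screening criteria can be expressed as a simple filter. The sketch below uses only the Python standard library; the record layout, the function name `select_pairs`, and the example identifiers and positions are all hypothetical, and the requirement that both sites lie on the same chromosome is implied by the distance criterion rather than stated separately.

```python
from dataclasses import dataclass

@dataclass
class Site:
    """A SNP or CpG site with its marginal association p value."""
    id: str
    chrom: str
    pos: int      # base-pair position
    p: float      # p value from the marginal GWAS or EWAS model

def select_pairs(snps, cpgs, snp_p=1e-3, cpg_p=0.05, max_dist=50_000):
    """Return (SNP id, CpG id) pairs passing all three screening criteria."""
    snp_hits = [s for s in snps if s.p < snp_p]
    cpg_hits = [c for c in cpgs if c.p < cpg_p]
    return [(s.id, c.id)
            for s in snp_hits
            for c in cpg_hits
            if s.chrom == c.chrom and abs(s.pos - c.pos) < max_dist]

# Toy input (made-up identifiers and values)
snps = [Site("snpA", "3", 100_000, 5e-4), Site("snpB", "3", 100_500, 0.2)]
cpgs = [Site("cpgX", "3", 140_000, 0.01),   # 40 kb from snpA: kept
        Site("cpgY", "3", 160_000, 0.01),   # 60 kb from snpA: too far
        Site("cpgZ", "4", 100_000, 0.01)]   # different chromosome
pairs = select_pairs(snps, cpgs)
print(pairs)
```

Because a SNP can pair with several CpG sites and vice versa, the number of selected pairs can exceed the number of unique SNPs or CpG sites.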
The mediation–interaction model described above was then estimated for these SNP–CpG site pairs. The total effect refers to the coefficient γ_{1} in regression model (3). Models (3) and (4) were estimated genome-wide using EPACTS, and models (1) and (2) were estimated only at selected SNP–CpG pairs using the kinship and coxme packages in R.
Because of missing data in the posttreatment methylation data set, the sample for mediation analysis was a subset of the GWAS screening sample.
Power calculations
We used simulation to investigate the statistical power to detect mediation between genotype and change in methylation. Based on the SNP allele frequency and distribution of change in methylation at the SNP–CpG site pair with strongest evidence of nonzero NIE, we simulated genotypes, change in methylation, and outcome measures varying the sample size, effect of SNP on change in methylation (β_{1}), effect of methylation on outcome (θ_{2}), and interaction effect (θ_{3}), while holding all other model parameters constant at their observed point estimates. The simulated samples comprised unrelated individuals, so the parameters in models (1) and (2) were estimated by multiple linear regression rather than linear mixed models. All power calculations used a significance level of α = 0.05, with 500 replicates.
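A simulation of this kind might look like the following sketch. It is a simplified illustration under stated assumptions rather than the authors' code: covariates are omitted, and the allele frequency (`maf`), residual standard deviations (`sd_m`, `sd_y`), and direct effect (`theta1`) are placeholder values, since the observed point estimates for those quantities are not given in the text. The function names are hypothetical.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares: coefficients and their covariance matrix."""
    xtx_inv = np.linalg.inv(X.T @ X)
    coef = xtx_inv @ X.T @ y
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    return coef, sigma2 * xtx_inv

def nie_wald_z(a, m, y):
    """Wald statistic for NIE = beta1 * (theta2 + theta3), delta-method SE."""
    ones = np.ones_like(a)
    b, cov_b = ols(np.column_stack([ones, a]), m)            # mediator model
    t, cov_t = ols(np.column_stack([ones, a, m, a * m]), y)  # outcome model
    beta1, theta2, theta3 = b[1], t[2], t[3]
    nie = beta1 * (theta2 + theta3)
    var = ((theta2 + theta3) ** 2 * cov_b[1, 1]
           + beta1 ** 2 * (cov_t[2, 2] + cov_t[3, 3] + 2 * cov_t[2, 3]))
    return nie / np.sqrt(var)

def nie_power(n, beta1, theta2, theta3, theta1=0.0, maf=0.3,
              sd_m=1.0, sd_y=1.0, reps=500, z_crit=1.959964, seed=0):
    """Fraction of replicates in which the two-sided Wald test rejects at alpha = 0.05."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        a = rng.binomial(2, maf, n).astype(float)   # unrelated individuals
        m = beta1 * a + rng.normal(0.0, sd_m, n)
        y = theta1 * a + theta2 * m + theta3 * a * m + rng.normal(0.0, sd_y, n)
        hits += int(abs(nie_wald_z(a, m, y)) > z_crit)
    return hits / reps

# Illustrative scenario with a large genotype effect on the mediator (assumed values)
power_demo = nie_power(n=500, beta1=0.5, theta2=0.5, theta3=0.2, reps=200)
print(f"estimated power: {power_demo:.2f}")
```

Calling `nie_power` over a grid of `n`, `beta1`, `theta2`, and `theta3` values gives power curves of the kind the text describes; the resulting numbers depend on the assumed allele frequency and residual variances, so they should not be expected to reproduce the reported sample sizes.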
Results
Using the above criteria, 322 SNP–CpG site pairs were selected, including 156 unique SNPs and 223 unique CpG sites. The maximum number of significant CpG sites within the 50-kb radius of a given SNP was 7, and the maximum number of significant SNPs within 50 kb of a given CpG site was 16. These numbers presumably reflect linkage disequilibrium (LD) patterns among nearby variants. Tables 1 and 2, respectively, summarize the most significant mediation and interaction effects.
The Wald test for the NIE (see Table 1) reveals no SNP–CpG pairs with significant evidence of mediation at the α = 0.05 level. However, it is noteworthy that the total effect and natural direct effect show opposite direction in 3 of these 5 cases, and differ substantially in magnitude in all 5. For interaction, 3 SNP–CpG site pairs pass a Bonferroni-corrected significance threshold of 0.05/322 = 0.000155, adjusting for multiple testing at all selected pairs (see Table 2). For each of these pairs, the interaction effect was more significant than the total effect of the SNP from model (3), which excludes methylation. Estimated effects of methylation stratified by genotype are reported in Table 3, demonstrating differential responses to change in methylation. The top 2 interaction effects, both with p < 5e-8, were found with the same CpG site, cg21463380 on chromosome 3. The two SNPs involved in these interaction effects, rs4686740 and rs2575, are in high LD (R^{2} = 0.9175, D′ = 0.9675), so we assume that they are tagging the same signal. The lead SNP, rs4686740, is located in an intron of the gene DGKG (diacylglycerol kinase gamma), which codes an enzyme involved in lipid metabolism. The CpG site with which it interacts is located over 40 kb away, near the somatostatin-coding gene SST. This finding suggests a regulatory relationship between this methylation site and the DGKG gene.
The second SNP–CpG pair (rs17216446–cg15395354) with a significant interaction effect is located on chromosome 4; the SNP lies in an intron of the gene METAP1, which codes the methionyl aminopeptidase 1 protein. The interacting CpG site is located 19 kb away, in a long noncoding RNA, BX647984. The first SNP–CpG site pair displays substantial positive methylation effect estimates for individuals with 1 or 2 G alleles, but no effect of methylation among those with homozygous reference genotype. The second pair displays a positive effect of methylation only for those with homozygous reference genotype at the SNP. It is notable that positive effects are considered deleterious in this study, as the aim of the drug treatment is to reduce TG levels. Mediation, as measured by NIE, did not reach nominal significance (p < 0.05) at any of the SNP–CpG sites with significant interaction effects. In all these cases, the effect of genotype on change in methylation, one factor in the product formulation of the NIE, was not significant.
Figure 1 shows plots of the statistical power from simulations to detect NIE. Varying the components of the NIE independently within the range of parameter estimates observed in the study data, all scenarios showed power of less than 50%. The genotype effect on change in methylation, β_{1}, appears to be the greatest limitation on statistical power, as increasing this parameter leads to the greatest improvements in power to detect mediation. Sample size is also a limitation, with 10,000 unrelated subjects required to attain 50% power to detect NIE, and 19,500 unrelated subjects required for 80% power, given true effect sizes of β_{1} = 0.001, θ_{2} = −1.661, and θ_{3} = 0.713, equal to the point estimates at the rs12771141–cg04855826 site.
Discussion
The mediation analysis did not identify significant indirect effects with changes in methylation level mediating the effect of SNP genotype on change in TG levels. This may be the result of the genetic architecture of lipid traits; for example, short-term changes in DNA methylation may not be an effective mechanism for modifying TG levels. The moderately small sample size, especially in the real posttreatment methylation data, also limits our statistical power to detect indirect effects. The substantial changes in direct effect estimates after accounting for possible confounding and interaction with the nearby CpG site suggest that the effects of genotype and methylation are not independent at these sites, despite the failure to attain statistical significance. Further work is needed on hypothesis testing for mediation in the context of a heavy burden of multiple testing. In particular, statistical tests for the change in effect estimates between the unadjusted and interaction-adjusted models would provide overall quantification of the impact of methylation on genetic effects at a given locus. Furthermore, multiple-exposure or multiple-mediator models may be appropriate at loci where several SNP–CpG pairs were identified.
Conclusions
We found significant interaction effects between SNP genotypes and CpG methylation levels on chromosomes 3 and 4. For individuals with certain genotypes, increases in methylation at the identified CpG sites were strongly associated with increased TG levels after drug treatment. These findings provide evidence of regulatory relationships between DNA methylation and SNPs at these loci. However, none of these sites showed nominally significant evidence of mediation, a consequence of a lack of association between genotype and change in methylation. In other words, the distribution of change in methylation is the same across genotypes, but the effect of change in methylation differs. This paper demonstrates the utility of integrated analysis of genetic and epigenetic data to investigate the multiple sources of variation for complex traits.
Abbreviations
CpG: cytosine-phosphate-guanine
EWAS: epigenome-wide association study
GAW: Genetic Analysis Workshop
GWAS: genome-wide association study
LD: linkage disequilibrium
NIE: natural indirect effect
SNP: single-nucleotide polymorphism
TG: triglyceride
References
1. Millstein J, Zhang B, Zhu J, Schadt EE. Disentangling molecular relationships with a causal inference test. BMC Genet. 2009;10:23.
2. Bjornsson HT, Sigurdsson MI, Fallin MD, Irizarry RA, Aspelund T, Cui H, Yu W, Rongione MA, Ekström TJ, Harris TB, et al. Intra-individual change over time in DNA methylation with familial clustering. JAMA. 2008;299(24):2877–83.
3. Liu Y, Aryee MJ, Padyukov L, Fallin MD, Hesselberg E, Runarsson A, Reinius L, Acevedo N, Taub M, Ronninger M, et al. Epigenome-wide association data implicate DNA methylation as an intermediary of genetic risk in rheumatoid arthritis. Nat Biotechnol. 2013;31(2):142–7.
4. Ventham NT, Kennedy NA, Adams AT, Kalla R, Heath S, O’Leary KR, Drummond H, IBD BIOM consortium; IBD CHARACTER consortium, Wilson DC, et al. Integrative epigenome-wide analysis demonstrates that DNA methylation may mediate genetic risk in inflammatory bowel disease. Nat Commun. 2016;7:13507.
5. Fairchild AJ, MacKinnon DP. A general model for testing mediation and moderation effects. Prev Sci. 2009;10(2):87–99.
6. VanderWeele T. Explanation in causal inference: methods for mediation and interaction. New York: Oxford University Press; 2015.
Funding
Publication of the proceedings of Genetic Analysis Workshop 20 was supported by National Institutes of Health grant R01 GM031575.
Availability of data and materials
The data that support the findings of this study are available from the Genetic Analysis Workshop (GAW), but restrictions apply to the availability of these data, which were used under license for the current study. Qualified researchers may request these data directly from GAW.
About this supplement
This article has been published as part of BMC Genetics Volume 19 Supplement 1, 2018: Genetic Analysis Workshop 20: envisioning the future of statistical genetics by exploring methods for epigenetic and pharmacogenomic data. The full contents of the supplement are available online at https://bmcgenet.biomedcentral.com/articles/supplements/volume19supplement1.
Author information
Contributions
VF, LW, XD, CS, LAC, and CTL designed the study. VF, LW, and XD performed analysis. VF prepared the manuscript, and all authors reviewed and edited it. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Fisher, V., Wang, L., Deng, X. et al. Do changes in DNA methylation mediate or interact with SNP variation? A pharmacoepigenetic analysis. BMC Genet 19 (Suppl 1), 70 (2018). https://doi.org/10.1186/s1286301806356
DOI: https://doi.org/10.1186/s1286301806356
Keywords
 Causal modeling
 Genomic data integration
 Gene-methylation interaction
 Indirect effects
 Triglycerides
 Fenofibrate treatment