Gene-Environment Interactions and the Statistical Fallacy

One question that has long haunted behavioral geneticists is how to detect gene-environment interactions. Every developmentalist has known for the better part of a century that genes and environments interact, as countless experiments have demonstrated that genes and environments never act independently (Gottlieb 1995, 2007; Henderson 1970, 1972; Hood 2005, pp. 225-251; Hughes & Zubek 1956; Järvilehto 1998a, 1998b, 1999, 2000). The issue here is that the analysis of variance that behavioral geneticists perform seems to ignore any role for gene-environment interactions. There have been feeble attempts to model gene-environment interactions in development (Jinks & Fulker 1970), but many of these models have suffered from mathematical (Capron et al. 1999) and scientific (Charney 2012) errors that have persisted into the 21st century (van der Sluis et al. 2006). Some have posited reaction ranges (Turkheimer et al. 1995) as a unifying tool for behavioral geneticists and developmentalists, but these seem inadequate for bridging the gap (Gottlieb 1995, 2003).

Some have posited that the gene-environment interactions that developmentalists consider ubiquitous (and that, for some, comprise the entirety of the variance) simply do not exist. For them, we end up living in an additive world anyway! They typically cite the failure of gene-environment interactions to be found or to replicate in larger samples (Border et al. 2019; Caspi et al. 2003; Figlio et al. 2017; Turkheimer et al. 2003; Young et al. 2006) [1] as evidence that gene-environment interactions are less important or nonexistent in complex outcomes. Despite the failure of some research to find gene-environment interactions, we have every reason to believe they exist in humans and are substantial (DeYoung & Clark 2012; Buil et al. 2015; Molenaar, Boomsma & Dolan 1993; Molenaar & Raijmakers 1999, 2000; Mostafavi et al. 2019; Sauce & Matzel 2018; Wahlsten 2003; Young & Durbin 2014; Zuk et al. 2012). The additive view is a sorely mistaken interpretation of genes and environments, and it ends up showing exactly how family modeling can’t partition anything (Lickliter 2009).

First is the problem of actually detecting them. The idea that a gene-environment interaction can be detected statistically by merely observing some relatedness coefficient ($R=1$ for MZ twins, $R=\frac{1}{2}$ for DZ twins, etc.) and perhaps one or two environmental variables (Turkheimer et al. 2003), if that (van der Sluis et al. 2006), is as naive as it comes. Genes and environments combine in statistically heterogeneous and stochastic ways that evade detection without actual developmental research. Some methodologies for ‘detecting’ ‘interactions’ (Moore 2018) have statistical issues that make them difficult to rely on (Grigorenko 2005; Wahlsten 1990, 1994, 2000, 2003).
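For reference, the classical twin decomposition that these relatedness coefficients feed into can be written out in a few lines. This is the standard textbook ACE setup, not a model taken from any of the works cited above, and the symbols $a^2$, $c^2$ and $e^2$ (standardized additive genetic, shared environmental and non-shared environmental variance) are introduced here purely for illustration. With $r_{MZ}$ and $r_{DZ}$ the observed twin correlations, the model assumes

$$r_{MZ} = a^2 + c^2, \qquad r_{DZ} = \tfrac{1}{2}a^2 + c^2,$$

which gives

$$a^2 = 2\,(r_{MZ} - r_{DZ}), \qquad c^2 = 2\,r_{DZ} - r_{MZ}, \qquad e^2 = 1 - r_{MZ}.$$

Nothing in these equations corresponds to a gene-environment interaction. Whatever interaction variance exists has nowhere to go except into one of the three additive components.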

Now the issue is at least threefold when it comes to measurement.

The first issue is that the environment is poorly measured, when it is measured at all (Joober et al. 2007; Foley & Riley 2007; Thapar et al. 2007; Turkheimer 2008). The domain of environmental influences that can modify, constrain, amplify, mediate, or interact with genetic influences is not bounded, which presents a significant problem given that environmental influences are often non-obvious (Turvey & Sheya 2018; Wallman 1979). When we do think we have an idea of what an environmental influence may be, we never have a perfect measure of this environment (Kaufman et al. 1997), nor do we always actually measure what we think we’re measuring (Richardson 2002). The sociological, psychological and psychometric instruments and batteries are never true representations of some latent construct, but only an error-laden way of getting at it. When these errors and biases add up, they can lead to problems with statistical detection.

Second is the issue of actually measuring the genes. Obviously, as one of the two parts of a gene-environment interaction, there are going to be genetic effects on some trait. This is the first issue barring people from detecting gene-environment interactions: the fact that we have been unable to find reliable, replicable and large associations (let alone those that are causal) between genes and behavioral traits (Charney 2012; Chaufan & Joseph 2013; Colhoun et al. 2003; Curtis 2018a, 2018b; Johnson et al. 2017; Kaprio 2012; Genin 2019; Ioannidis et al. 2001; Matthews & Turkheimer 2019; Nolte et al. 2017; Young 2019). If we are unable to detect actual genetic associations sufficiently, then it will be hard to find a gene-environment interaction involving them! Moreover, even when we do find a reliable association for an SNP, there are other barriers standing in our way. The first is that SNPs are not present in all individuals, but are non-randomly distributed throughout the population (Haworth et al. 2019; Lawson et al. 2019). That means that the range of environments for individuals with a given SNP is restricted, sometimes severely, and the confounds and colliders that can result from this non-random distribution are endless (Schmitz & Conley 2017).

Third, there is the issue that we shouldn’t expect development to work in such a way that statistical detection will be easy. Genes’ effects on behaviors are subtle. Environmental influences are often subtle, and treatment effects heterogeneous (Angrist 2003). The combination of the two is itself subject to further mediation, moderation and interaction, and also to severe heterogeneity due to developmental stochasticity. This means that we may never be able to recover some of the interactions that produced a given phenotype, or set of phenotypes.

Beyond measurement, there is the simple problem that has plagued genetic research all along: sample size. Behavioral geneticists initially thought they would be able to detect the genes causing behavioral traits with relatively small sample sizes (hence the candidate gene era), and that there would be a small number of them. This belief turned out to be mistaken: sorely mistaken. Genome-wide association studies have marshaled near-millions of individuals (Gaziano et al. 2016; Keyes & Westreich 2019; Nagai et al. 2017; Zillikens et al. 2017) and are still failing to detect the necessary SNPs. This has led some to call for increasing sample sizes even more to find more SNPs, associate more SNPs, etc. (Goldman 2014). In environmental research, sample size has also been a proven problem. Interventions are typically done with small sample sizes (Bacharach & Baumeister 2000; Heckman & Karapakula 2019) and have issues of power. Psychology research has similar sample size issues, where canonical findings have failed to replicate with larger sample sizes. This issue is compounded by several orders of magnitude when it comes to gene-environment interaction research: the sample sizes required for robust interaction testing are much higher than the sample size needed for the gene alone, the sample size needed for the environmental treatment alone, or even their sum (Khoury, Beaty & Hwang 1995).
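To give a rough sense of why interaction tests are so sample-hungry, here is a minimal sketch under deliberately generous assumptions: a perfectly balanced 2x2 design (binary genotype by binary exposure, equal cell sizes), a normally distributed outcome with known standard deviation, and a normal-approximation power formula. The function name `total_n_required` and the effect sizes are made up for illustration; none of this is taken from Khoury, Beaty & Hwang (1995). The only inputs are the textbook facts that, with `n` people per cell, a difference between marginal means has variance `sigma**2 / n`, while a difference of differences has variance `4 * sigma**2 / n`.

```python
from statistics import NormalDist


def total_n_required(effect, variance_factor, sigma=1.0, alpha=0.05, power=0.80):
    """Total N for a balanced 2x2 design (normal approximation, two-sided test).

    The contrast estimate is assumed to have variance
    variance_factor * sigma**2 / n_per_cell.
    """
    std_normal = NormalDist()
    z_sum = std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)
    n_per_cell = variance_factor * (sigma * z_sum / effect) ** 2
    return 4 * n_per_cell


# Main effect: difference between two marginal means, each averaged over two cells.
MAIN_EFFECT_FACTOR = 1.0
# Interaction: difference of differences across all four cell means.
INTERACTION_FACTOR = 4.0

# Hypothetical effect sizes, in outcome standard deviations.
n_main = total_n_required(effect=0.2, variance_factor=MAIN_EFFECT_FACTOR)
n_same = total_n_required(effect=0.2, variance_factor=INTERACTION_FACTOR)
n_half = total_n_required(effect=0.1, variance_factor=INTERACTION_FACTOR)

print(f"main effect of 0.2 SD:  total N ~ {n_main:.0f}")
print(f"interaction of 0.2 SD:  {n_same / n_main:.0f}x that sample")
print(f"interaction of 0.1 SD:  {n_half / n_main:.0f}x that sample")
```

Even in this idealized setting, an interaction as large as the main effect needs four times the sample, and one half as large needs sixteen times. Unbalanced genotype or exposure frequencies, measurement error and multiple testing only push the requirement higher.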

Another severe issue is that far less research and research funding goes into interaction studies than into additive genetic association studies. As a result, fewer statisticians, biologists and geneticists have spent the time to construct robust models that don’t suffer from issues like:

a) population stratification and linkage disequilibrium (de los Campos, Sorensen & Toro 2019; Kruijer 2016; Zan, Forsberg & Carlborg 2018)

b) confounds (genetic, environmental, epigenetic, etc.) (Dudbridge & Fletcher 2014; Keller 2014; Vanderweele, Ko & Mukherjee 2013)

c) statistical properties (Eaves & Verhulst 2014; Niel et al. 2015; Su & Lee 2016; Sa, Liu, He, Liu & Cui 2016; Ritchie 2015; Wu, Zhong & Cui 2018)

d) computability & sample size requirements (Aschard et al. 2015; Cordell 2009; Huang et al. 2017; Lin et al. 2015; Wu et al. 2013)

Moreover, the concepts of ‘gene-environment interaction’ used by behavioral geneticists (particularly hereditarians) and by developmentalists, embryologists and ethologists differ significantly (Meaney 2010; Moore & Shenk 2017; Moore 2018; Oyama 1985). Developmentalists argue that the fact of ubiquitous, developmentally emergent gene-environment interactions makes the statisticalist concepts of heritability meaningless (Moore & Shenk 2017). It has been shown that the presence of gene-environment interactions makes heritability dependent on genetic effects and gene frequencies, on environmental effects and distributions, and on the interactions themselves (Guo 2000).
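This dependence can be seen in a toy calculation. The sketch below is not Guo's model; it assumes a hypothetical outcome `y = a*G + b*E + c*G*E` with an independent binary genotype `G` (frequency `p`) and binary exposure `E` (frequency `q`), and it uses the share of variance attributed to the genotype's main effect as a crude stand-in for a heritability-like quantity. All names and parameter values are invented for illustration.

```python
from itertools import product


def variance_components(a, b, c, p, q):
    """Variance decomposition for y = a*G + b*E + c*G*E with independent binary G, E.

    p = P(G = 1), q = P(E = 1); both must lie strictly between 0 and 1.
    """
    cells = []
    for g, e in product((0, 1), repeat=2):
        weight = (p if g else 1 - p) * (q if e else 1 - q)
        cells.append((g, e, weight, a * g + b * e + c * g * e))

    grand_mean = sum(w * y for _, _, w, y in cells)
    total = sum(w * (y - grand_mean) ** 2 for _, _, w, y in cells)

    def between_levels(index):
        # Variance of the conditional means E[y | factor]: the factor's main effect.
        var = 0.0
        for level in (0, 1):
            members = [cell for cell in cells if cell[index] == level]
            level_weight = sum(cell[2] for cell in members)
            level_mean = sum(cell[2] * cell[3] for cell in members) / level_weight
            var += level_weight * (level_mean - grand_mean) ** 2
        return var

    genetic = between_levels(0)
    environmental = between_levels(1)
    return {
        "genetic": genetic,
        "environmental": environmental,
        "interaction": total - genetic - environmental,
        "total": total,
    }


# Identical genetic, environmental and interaction effects; only the exposure frequency differs.
rare = variance_components(a=1.0, b=1.0, c=1.0, p=0.5, q=0.1)
common = variance_components(a=1.0, b=1.0, c=1.0, p=0.5, q=0.9)
share_rare = rare["genetic"] / rare["total"]
share_common = common["genetic"] / common["total"]

print(f"genetic share when exposure is rare:   {share_rare:.2f}")
print(f"genetic share when exposure is common: {share_common:.2f}")
```

In this model the genotype's average effect works out to `a + c*q`, so simply changing how common the exposure is changes the 'genetic' share of variance, even though nothing about the genes or their effects has changed.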

Kuo showed in the 1920s that it was naive to compute heritability coefficients as if they represented some sacrosanct part of development, some magical “explanation” of variance ‘due to genes’ (Kuo 1929). Sadly, this simple fact has not yet been explained to behavioral geneticists, who continue to publish papers estimating heritability coefficients when it has been amply demonstrated that they are of no meaning or use (Bailey 1997; Burt et al. 2015; Crusio 1990, 2012; Lewontin & Feldman 1976; Lewontin 1974; Moore 2006; Tabery 2014; Taylor 2014; Wahlsten 2003).

[1] This article was adapted from a rant I made in response to a Geoffrey Miller tweet.