The Equal Environment Assumption Is False

The most-employed tool in the “nature”-“nurture” wars is the twin study. Little did twins know, when they were first enrolled in Galton’s original investigation of “hereditary genius”, that they would be the subject of over a hundred years of debate. Alas, twin studies have been the first and ‘most successful’ methodology used in the biological behavioral sciences, particularly behavioral genetics. Essentially, the twin study logic goes like this:

There are two types of twins: DZ (dizygotic/fraternal) twins and MZ (monozygotic/identical) twins. DZ twins share, on average, only half of their segregating genes, while MZ twins share all of them. If we compare how similar monozygotic and dizygotic twins are on particular traits, then we should be able to estimate (somehow) how much of a trait is “due to” genes.

The way this is typically done is by comparing the correlations of dizygotic twins on a trait value to the correlations of monozygotic twins on a trait value. Then we can “decompose” the correlations into the components that make them up: the “heritability” component ($a^2$), the “shared environment” component ($c^2$, defined as any non-genetic influence that makes dyads more similar) and the “unshared environment” component ($e^2$, defined as any non-genetic influence that makes dyads less similar). Because the unshared environment only makes twins less similar, it does not appear in the expected correlations.
The equations are as follows:

$r_{MZ}=a^2+c^2$

$r_{DZ}=\frac{1}{2}\cdot a^2+c^2$

The assumption here is that monozygotic twins have exactly the same “shared environment” component as dizygotic twins; more specifically, it is the assumption that monozygotic twins do not experience more similar “environments” than dizygotic twins do. This is known as the “equal environment assumption” (EEA) and has been subject to considerable debate.
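
To make the stakes concrete, here is a minimal sketch of what happens to the estimates when the assumption fails. Solving the two equations above gives the usual Falconer-style estimates $a^2=2\cdot(r_{MZ}-r_{DZ})$ and $c^2=2\cdot r_{DZ}-r_{MZ}$. The extra term `delta` (additional shared-environment similarity enjoyed only by MZ twins) and all of the numeric values are illustrative assumptions of mine, not figures from any study.

```python
def twin_correlations(a2, c2, delta=0.0):
    """Expected MZ and DZ correlations.

    delta is a hypothetical extra shared-environment similarity that only
    MZ twins experience; delta = 0 means the EEA holds.
    """
    r_mz = a2 + c2 + delta
    r_dz = 0.5 * a2 + c2
    return r_mz, r_dz


def falconer(r_mz, r_dz):
    """Solve the two twin equations for a^2, c^2 and e^2, assuming the EEA."""
    a2_hat = 2 * (r_mz - r_dz)
    c2_hat = 2 * r_dz - r_mz
    e2_hat = 1 - r_mz
    return a2_hat, c2_hat, e2_hat


# Illustrative (made-up) true values.
TRUE_A2, TRUE_C2 = 0.4, 0.2

for delta in (0.0, 0.05, 0.1):
    r_mz, r_dz = twin_correlations(TRUE_A2, TRUE_C2, delta)
    a2_hat, c2_hat, e2_hat = falconer(r_mz, r_dz)
    print(f"delta={delta:.2f}  a2_hat={a2_hat:.2f}  c2_hat={c2_hat:.2f}  e2_hat={e2_hat:.2f}")
```

Working through the algebra, the estimated heritability comes out as $a^2+2\cdot\delta$ and the estimated shared environment as $c^2-\delta$: any extra MZ similarity is doubled and booked as “genes”, while the shared environment is pushed down.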

However, there is considerable and convergent evidence indicating that the equal environment assumption is, indeed, false, and that its violation has a fairly substantial impact on the estimates.

But, But

Behavioral geneticists have managed to marshal some “studies” purportedly demonstrating that the equal environment assumption is, in fact, true. However, this only appears true on a cursory look. Luckily, the defenders of the h2 studies have compiled a helpful list of these misleading studies for us to peruse and examine in an appendix of the paper “Demonstrating the Validity of Twin Research in Criminology”. There are a lot of studies here, and I don’t have the time or space to deeply examine each of them separately, so I’m clumping them into categories and analyzing the most important ones. Essentially, there are three typical tests of the EEA. The first uses some sort of proxy for environmental similarity (contact, confusability, closeness, etc.). The second utilizes twins whose zygosity is misperceived, to test whether the perception of twin pairs as a particular type (identical vs nonidentical) affects outcomes. And the third correlates physical similarity with environmental similarity to see if there is a direct effect here.

Misperceived zygosity studies

Felson gives a good review of the older misperceived zygosity studies above, noting Joseph’s criticism of them in The Gene Illusion, but there have been a few more recent studies employing this methodology. Most recent is Conley et al. (2013), which purported to find that the equal environment assumption was supported. A closer look at the data supports the opposite conclusion, which I detailed in this thread.

We should note that misperceived zygosity is only one possible environmental bias, and that there exist numerous other possible violations of the equal environment assumption (the space of possibilities is unbounded, as environmental influences on a trait are not bounded, nor are differential similarities between twin types). So to demonstrate that perceived zygosity is not one such confound is not to demonstrate the validity of the equal environment assumption in full, but only for one particular environmental influence.

Environmental similarity studies

These studies estimate how much contact twins have, the nature of the twins’ relationship, etc., to determine how these factors influence trait similarity. The issue here is threefold. First, these studies, when done properly, typically demonstrate that the EEA is false or lacks verisimilitude. Second, these studies are universally based on self-reports of psychological facts about the relationship, which could be systematically biased and are definitely filled with measurement error, such that very large sample sizes are necessary to actually determine the impact. Third, not a single one of these studies has examined all the relevant variables.
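
One standard way to see why the measurement error matters is the classical attenuation formula from test theory (the symbols here are my own notation, not that of the studies). If the environmental-similarity measure has reliability $\rho_{xx}$ and the trait-similarity measure has reliability $\rho_{yy}$, then the correlation we should expect to observe is

$r_{observed}=r_{true}\cdot\sqrt{\rho_{xx}\cdot\rho_{yy}}$

With, say, reliabilities of 0.6 and 0.7 (illustrative values only), a true correlation of 0.20 would show up as roughly 0.13, and the smaller the observed correlation, the larger the sample needed to distinguish it from zero.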

Physical similarity studies

The final type of study that attempts to test for violations of the EEA is that based on the physical similarity of twin dyads. The assumption here is that if twins are more physically similar, then individuals will treat them more similarly as a result, causing the greater physical similarity of twins to induce behavioral/psychological similarity. The issue here is threefold. First, there is, again, a large amount of measurement error in these variables, in addition to their estimation being based on subjective interpretations of individuals’ appearances. Second, these mechanisms of similarity are likely to be statistically undetectable, since they operate cumulatively, interactively (contingent upon certain social formations, presence of other variables, etc.) and subtly. The third is a statistical issue: because twins are, well, twins, there is an extremely severe restriction of range in the similarity of MZ twins, making it difficult to test for an effect.
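
The restriction-of-range point can be illustrated with a small simulation. This is only a sketch under assumptions I am making up for the purpose: physical similarity and trait similarity are standard-normal scores, the effect of one on the other (`slope`) is the same everywhere, and “MZ-like” pairs are simply the top 10% on physical similarity.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=20000, slope=0.5, seed=0):
    """Trait similarity depends on physical similarity with a fixed slope."""
    rng = random.Random(seed)
    physical = [rng.gauss(0, 1) for _ in range(n)]
    trait = [slope * p + rng.gauss(0, 1) for p in physical]
    return physical, trait


def restrict_to_top(physical, trait, fraction=0.1):
    """Keep only the pairs in the top `fraction` of physical similarity."""
    cutoff = sorted(physical)[int(len(physical) * (1 - fraction))]
    kept = [(p, t) for p, t in zip(physical, trait) if p >= cutoff]
    return [p for p, _ in kept], [t for _, t in kept]


physical, trait = simulate()
top_physical, top_trait = restrict_to_top(physical, trait)
r_full = pearson(physical, trait)
r_top = pearson(top_physical, top_trait)
print(f"full range: r = {r_full:.2f} (n = {len(physical)})")
print(f"top 10%:    r = {r_top:.2f} (n = {len(top_physical)})")
```

The underlying effect is identical in both groups; only the spread of physical similarity differs. The correlation in the restricted group is nonetheless much weaker, which is why a null result among near-identical-looking pairs says little about whether the effect exists.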

An issue for all of the above methods is that they all require both the absence of collider bias and random assortment into each of the test variables, conditional on certain background variables (chorionicity, developmental conditions); these requirements are extremely unlikely to hold. More interactive violations of statistical assumptions for these tests can be found in Richardson & Norgate (2005).

Genomics Data

Behavioral genetics has seen its reckoning following the investigation of its core assumptions with actual genetic material. Following the failed “candidate gene” era, geneticists have moved to genome-wide association studies. Various methodologies have been employed to estimate the heritability of traits using actual genetic data rather than inferred relatedness. These include methods such as Sib-Regression, other IBD-based estimators, admixture estimations and most recently RDR.

Peter Visscher developed a method known as Sib-Regression that can estimate the heritability of a trait by exploiting the random distribution of genetic relatedness between siblings. Essentially, because siblings inherit their genes from their parents at random, sibling pairs end up sharing anywhere from 30-70% of their genes. The average is the commonly reported 50%, but this distribution allows us to compare the similarity of siblings with particular values of genetic relatedness to estimate how this genetic relatedness influences phenotypic relatedness. Visscher et al. (2006) employed this method to estimate the heritability of height and obtained an estimate of 80%, in good agreement with twin studies. However, the method hasn’t really been applied to other traits [1] and is known to have some implausible assumptions (despite the title) that seem to affect the results. We should treat Sib-Regression estimates as upper bounds on heritability, and hope that more studies apply this method.
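
The logic can be sketched with a toy simulation. To be clear about the assumptions: this uses a simple Haseman-Elston-style regression of squared sibling differences on genetic sharing as a stand-in, which is not necessarily the estimator used in the paper, and the spread of sharing (SD of 0.04 around 0.5), the true `h2` and `c2`, and the number of pairs are all values I have chosen for illustration. For a standardized trait, the expected squared difference between siblings is $2\cdot(1-c^2)-2\cdot h^2\cdot\pi$, where $\pi$ is the pair's sharing, so the slope on $\pi$ is $-2\cdot h^2$.

```python
import math
import random


def simulate_sib_pairs(n_pairs, h2, c2, seed=1):
    """Return (sharing, squared phenotype difference) for simulated sib pairs."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_pairs):
        # Realized sharing varies around 0.5; SD of 0.04 is an assumed value.
        pi = min(0.7, max(0.3, rng.gauss(0.5, 0.04)))
        rho = pi * h2 + c2  # expected phenotypic correlation for this pair
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        y1 = z1
        y2 = rho * z1 + math.sqrt(1 - rho * rho) * z2  # corr(y1, y2) = rho
        rows.append((pi, (y1 - y2) ** 2))
    return rows


def ols_slope(rows):
    """Ordinary least squares slope of the second column on the first."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxy = sum((x - mx) * (y - my) for x, y in rows)
    sxx = sum((x - mx) ** 2 for x, _ in rows)
    return sxy / sxx


def sib_regression_h2(rows):
    """The slope of squared differences on sharing is -2 * h2."""
    return -ols_slope(rows) / 2


rows = simulate_sib_pairs(n_pairs=300_000, h2=0.8, c2=0.1)
h2_hat = sib_regression_h2(rows)
print(f"estimated h2 from sharing among siblings: {h2_hat:.2f}")
```

Note that the shared environment `c2` drops out of the slope only because, in this toy setup, it is unrelated to how much a pair happens to share genetically. The sketch also needs a very large number of pairs, because the spread of sharing around 0.5 is so narrow that the slope is estimated imprecisely.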

Another way of estimating heritability comes from admixture studies, which exploit differences in allele frequency between populations to estimate heritability. Zaitlen et al. (2014) used the variation in local ancestry of admixed African-Americans and estimated heritabilities of height and weight as ~55% and 23%, respectively, which are about 25% and 50% lower than the estimations from familial studies. This is strong evidence of the violation of twin study assumptions, either with respect to the equal environment assumption, or with the absence of epistasis and gene-environment interactions.

The most recent (and most robust, although still not perfect) way of estimating heritability exploits the random segregation of ova and sperm in comparison to parental genotypes, and is known as relatedness disequilibrium regression (RDR). In the pioneering paper, Young et al. (2018), the authors estimate using Icelandic genomic data that the twin estimates overestimated actual heritability by 33.2% on average.

Familial Data

Another way that we can estimate heritability without using twins is to employ other familial correlations: between cousins, grandparents-grandchildren, etc. These correlations have similar formulas (although they get progressively more complicated for more distant relationships and when adding more variance components) that can be used in conjunction with each other to provide a cumulative estimate of the heritability of a trait. This method is not typically employed anymore, but there are a few papers that have estimated the heritability of IQ and other traits using these models. I reported the results of this literature for the heritability of IQ previously, but there is another more recent (and interesting) paper for other traits. Zaitlen et al. (2013) used extended genealogy designs to estimate the heritability of various traits (BMI, height, menarche, fertility, etc.) from correlations between siblings, parents, half-siblings, avuncular pairs and other relationships. They estimated the heritability from each of their varying methods and found that heritability was overestimated when derived from closely related pairs of relatives, and that their pattern of correlations could only be explained by the shared environment (although it was consistent with small contributions of dominance and epistasis).
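
A rough sketch of why close relatives inflate the estimate follows. It assumes a purely additive model (no dominance, epistasis or assortative mating), in which the expected correlation for a pair is relatedness times heritability plus whatever shared environment that pair type has. The heritability and the shared-environment values below are invented for illustration and are not taken from the paper.

```python
# Expected proportion of segregating alleles shared identical by descent.
RELATEDNESS = {
    "full siblings": 0.5,
    "half siblings": 0.25,
    "first cousins": 0.125,
}

# Hypothetical shared-environment contributions: closer relatives share more.
SHARED_ENV = {
    "full siblings": 0.15,
    "half siblings": 0.05,
    "first cousins": 0.0,
}

TRUE_H2 = 0.4  # illustrative value only


def expected_correlation(relatedness, h2, shared_env):
    """Additive-model correlation: genetic part plus shared environment."""
    return relatedness * h2 + shared_env


def naive_h2(correlation, relatedness):
    """Heritability implied by a correlation if shared environment is ignored."""
    return correlation / relatedness


naive_estimates = {}
for pair_type, relatedness in RELATEDNESS.items():
    corr = expected_correlation(relatedness, TRUE_H2, SHARED_ENV[pair_type])
    naive_estimates[pair_type] = naive_h2(corr, relatedness)
    print(f"{pair_type:14s} r = {corr:.3f}  naive h2 = {naive_estimates[pair_type]:.2f}")
```

Under these made-up numbers, the estimate is recovered correctly only from the pairs with no shared environment, and the inflation grows with closeness because the shared-environment term is divided by the relatedness coefficient.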

Epigenetic Similarity

Even beyond the environmental variables that seem to impact twin trait similarity, there are violations of the equal environment assumption all the way down to the genetic environment. Recent research in epigenetics has demonstrated the importance of environmental mediation of gene expression, the cellular environment that genes are transcribed in, and various other mediators, moderators and interactions of genes and environments. Given that monozygotic twins are known to have more similar prenatal environments (and that the actual mechanism that creates monozygotic twins necessarily induces more similarity), it would make sense that they would have more similar epigenetic markings. This simple fact has been known since at least 2005, when Fraga et al. (2005) found not only that monozygotic twins have higher concordances for methylation, but also that the frequency of contact among twins was correlated with concordance, indicating environmental exposure as a cause. It has also been found in varying types of cells, and most studies support epigenomic, environmental or developmental causes (Kaminsky et al. 2009, Poulsen et al. 2007, Wong et al. 2010, Ollikainen et al. 2010, Martino et al. 2013). This was tested in Van Baak et al. (2018), who found that MZ twin concordance on methylation was 2-16 times greater than DZ twin concordance, and that this is very likely to be the result of a developmental process. They concluded that their findings could explain some of the missing heritability, as heritability of traits could be overestimated due to the violation of the equal environment assumption.

Environmental Estimates

There have also been several studies that use rough proxies of certain environmental variables to estimate the effect of the violation of the equal environment assumption on twin similarity. Felson (2009) systematically examined the literature purportedly in defense of the EEA and found that its results are often ambiguous or consistent with the violation of the EEA. He then performed his own examination using data from the Midlife Development in the United States survey and found violations for up to a third of the traits. However, we should be cautious in interpreting some of the commentary he presents alongside his analysis because of the imperfect reliability of the metrics in relation to actual environmental conditions, as well as the fact that only a limited number of environmental conditions were modeled. Fosse et al. (2015) also examined specific risk factors for mental ‘disorders’ and universally found that monozygotic twins experienced more similar environments than dizygotic twins.

A more thorough examination of the environmental violation of the equal environment assumption is given in Joseph (2015) and Richardson and Norgate (2005).

Prenatal Similarity

One aspect of the violation of the equal environment assumption that is often not discussed is the fact that monozygotic twins share more similar prenatal environments than dizygotic twins. Despite the assertion that “parenting doesn’t matter”, it has become pretty clear that there are abundant maternal effects on all sorts of behavioral, physical and psychological traits (Coe et al. 2003; Champagne 2010a; Field et al. 2004a; 2004b; 2006; Gluckman & Hanson 2005; 2006; Horton 2005; Heijmans et al. 2008; Huizink et al. 2002; Kemme et al. 2007; Langley-Evans et al. 1999; Lui et al. 2011; Maccari et al. 2003; Oberlander et al. 2008; O’Connor et al. 2005; Painter et al. 2005; Parnpiansil et al. 2003; Ryan & Vandenbergh 2002; Sandman et al. 2011; Schneider et al. 1999; Waterland & Michels 2007; Welberg & Seckl 2001; Weinstock 2008; Tamashiro & Moran 2010; Thompson 1957 [2]). If there were to be a violation of the EEA in utero, this would constitute a serious blow to the idea that twin studies can inform us on the causes of human variation. We have accumulating evidence that this is indeed the case.

First, we should make a few distinctions. Twins come in several forms. The first is monochorionic twins, where both twins share a single placenta. The other is dichorionic twins, where each twin has its own placenta. Monozygotic twins can be either monochorionic ($\frac{2}{3}$) or dichorionic ($\frac{1}{3}$), while dizygotic twins are always dichorionic. Even within each type of chorionicity, there are differing types of amnionicity, that is, whether or not the twins share an amnion. 5% of MZ twins are monochorionic and monoamnionic (MC-MA) while most DZ twins are dichorionic and diamnionic (DC-DA). It has been shown that monochorionic twin pregnancies are more likely to be lost (inducing a confound) (Gibson & Cameron 2008; Glinianaia et al. 2011; Sebire et al. 1997), to have congenital abnormalities (Glinianaia et al. 2008), to deliver preterm (Gibson & Cameron 2008), and to have fetal growth restriction and cerebral growth lesions (Adegbite et al. 2005). Moreover, there is abundant evidence that the prenatal environment of monozygotic twins is much more stressful than that of dizygotic twins due to chorionicity and amnionicity differences (Acosta-Rojas et al. 2007; Adegbite et al. 2004; Carroll et al. 2005; Gibson & Cameron 2008; Hack et al. 2008). There are also confounds induced by the prenatal environment in twin research on autism (Hallmayer et al. 2011) and brain volumes (Knickmeyer et al. 2011).

Summary

The EEA is, in essence, an unverifiable assumption, even when subjected to purported ‘tests’, because the list of environmental influences that could differ by zygosity is endless: we know that there are numerous nonobvious environmental influences on behavioral development (Turvey & Sheya 2017, Wallman 1979). To behavioral geneticists, I hope you seek a more robust method of estimating meaningless h2 coefficients.

Footnotes

[1] I have just discovered a paper that utilizes the methodology (as of 11/11/2019): Hemani et al. (2013).

[2] Taken right from Charney 2012