How to Strawman A Critique

When going through the comments on my blog today, I noticed that I got a strange pingback from a bizarre “g-loaded” blog attempting to respond to my article on the Wilson effect. It’s titled “DevelopmentalSystems Against The Wilson Effect”.

The first few arguments DevSyst makes are simply fallacies against heritability and the use of twin studies to obtain heritability estimates. I will not explain them at length here as better, longer works have already covered them in detail (see Sesardic, 2005).

Unfortunately, I have read Sesardic’s book in detail, and nearly every sentence of it is wrong. If this individual would like to have a substantive debate on the merits of heritability estimates, then they can cite the specific parts of Sesardic’s book that they are referencing here. There are reviews of the book by Tabery and Taylor that might be a useful introduction, but given how nonchalantly it is cited here, I do not think a full review is warranted.

The thing about GxE is it is kind of a cop out. It is an event which rarely occurs and when it does, may only make up miniscule portions of the variance. Little empirical support is available to claim that GxE interactions are significant enough to abandon classical twin studies.

The issue with “classical twin studies” is that they are unable to detect gene-environment interactions. This was established decades ago: despite claims like those of Jinks & Fulker (1970) that global analyses of gene-environment interactions are testable, such interactions can only be detected when environments are explicitly modeled (Purcell 2002), and even then numerous issues inhibit easy detection (Giangrande et al. 2019; Turkheimer 2008).
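To see why the classical design is blind here, consider a minimal simulation (all parameters invented purely for illustration): twin phenotypes are generated with a genuine G×E interaction, yet the standard Falconer decomposition returns clean-looking variance components, with the interaction variance silently absorbed into the “nonshared environment” term:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000  # twin pairs; large so the correlations are stable

def twin_correlation(r_g):
    """Phenotypic correlation for twin pairs whose genotypic values
    correlate r_g (1.0 for MZ, 0.5 for DZ). The phenotype contains a
    genuine gene-by-environment interaction term (G * E)."""
    shared = rng.standard_normal(n)
    g1 = np.sqrt(r_g) * shared + np.sqrt(1 - r_g) * rng.standard_normal(n)
    g2 = np.sqrt(r_g) * shared + np.sqrt(1 - r_g) * rng.standard_normal(n)
    e1, e2 = rng.standard_normal(n), rng.standard_normal(n)
    p1, p2 = g1 + e1 + g1 * e1, g2 + e2 + g2 * e2
    return np.corrcoef(p1, p2)[0, 1]

r_mz, r_dz = twin_correlation(1.0), twin_correlation(0.5)
h2_hat = 2 * (r_mz - r_dz)   # Falconer "heritability": comes out ~1/3
c2_hat = 2 * r_dz - r_mz     # Falconer "shared environment": ~0
e2_hat = 1 - r_mz            # "nonshared environment": ~2/3, quietly containing the GxE variance
```

Nothing in the two correlations flags the interaction; the analyst sees an ordinary-looking ACE partition.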

Plomin (1986) states:

“The Colorado Adoption Project analyses lead to the conclusion that genotype–environment interactions in infancy, if they exist at all, do not account for much variance. For this reason, Plomin and DeFries . . . proposed the following principle concerning the development of individual differences in infancy: “Genetic and environmental influences of infant development coact in an additive manner.””

I would point the author towards my article on adoption studies, and in particular Richardson & Norgate (2006)’s cogent analysis of the Colorado Adoption Project. Despite Plomin’s claims of the additivity of development, we have long known this to be false. Consider Gottlieb (1998); Gottlieb (2000); Gottlieb (2003); Gottlieb (2010) for a start.

Up to today, we are still incapable of finding substantial GxE effects which leads us to believe that if they are an issue at all, their only consequence is saying “roughly x% heritable” rather than “x% heritable”. For a full review on the non-problem of GxE effects, see Sesardic (2005).

The allegation that we are “incapable of finding substantial GxE effects” is not a statement of fact, but an exposition of ignorance. There are all sorts of well-documented gene-environment interaction effects, like the contingency of MAOA, as well as recent work on education polygenic scores (including by the very Plomin whom “g-loaded” cites to indict gene-environment interactions!) like Fletcher (2019). More of these are reviewed in the Tabery (2014) and Oyama (1985) books that “g-loaded” dismissed.

If we were to assume a significant GxE, it still could be circumvented through placing MZ twins and DZ twins in the same home environments, which is what we could call the “industry standard” for quantitative genetics.

It would not circumvent the issues, but since the author declines to elaborate on the point, so shall I.

This is Lewontin’s popular argument of locality and, put simply, it is pseudoscientific. There is no reason that ANOVA results can’t be generalized to populations (and conversely, we should never automatically extrapolate results). What this argument does is claim something which does not matter and does not hold any weight to rational people who understand sample generalizability. It is useless.

There is nothing pseudoscientific about Lewontin’s argument about locality. If the author would like to explain why developmental conditions throughout the entire world permit the generalization of heritability estimates in a single population to those in another, despite ample evidence that heritability estimates do not generalize (Nicholson 1990), then I’d love to hear the explanation. Otherwise, I’ll continue espousing the fact that the environmental and genetic architecture, as well as the ontogeny of particular traits, are not uniform in all regions and populations. This has long been recognized even by behavioral geneticists, as well as by the quantitative geneticists and evolutionary geneticists who literally use these facts in their research. For instance, recall that heritability goes to zero in a selective sweep since all variation is destroyed.
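That last point follows directly from the textbook single-locus formula V_A = 2p(1−p)a²: a sweep drives p to 1, so the additive variance, and with it the heritability, goes to zero even though the allele’s effect a is unchanged. A two-function sketch (illustrative values):

```python
def additive_variance(p, a):
    """Single-locus additive genetic variance, V_A = 2p(1-p)a^2,
    where p is the allele frequency and a is the additive effect."""
    return 2 * p * (1 - p) * a ** 2

def heritability(p, a, v_env):
    """Narrow-sense heritability at this locus given environmental variance v_env."""
    va = additive_variance(p, a)
    return va / (va + v_env)

h2_segregating = heritability(0.5, 1.0, 1.0)  # allele still varies: h^2 = 1/3
h2_post_sweep  = heritability(1.0, 1.0, 1.0)  # sweep fixed the allele: h^2 = 0
```

The locus matters just as much biologically in both cases; only the variance, and hence the heritability, has changed, which is exactly why the statistic is local to a population and its environments.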

So, DevSyst does give a theory here as to what may be happening instead of a change in heritability over time, but these aren’t as strong considering his underlying principles concerning heritability are flawed. More importantly, DevSyst does bring in an empirical question: do GWAS show a change in the heritability of intelligence over time? To be fair, DevSyst has simply not found a study showing this, but I have. A GWAS by Davies et al. (2015) identified the rs10119 SNP and “The effect was near to zero at younger mean ages and larger at older ages.” To be fair, this study was entirely focused on elderly people, hence we are not looking at a childhood-young adult sample, but we do get a look at a group where less environmental change may happen over time, and we must concede that differential heritability does, in fact, happen at all.

I appreciate the identification of a single SNP that shows differential age effects, but this is the APOE locus, which has long been known to be a robust genetic association with cognitive phenotypes (particularly Alzheimer’s) and does not represent variation of IQ scores in the normal range. Even worse, despite being found in the Davies et al. (2015) analysis, it does not seem to have been replicated in any of the subsequent intelligence GWAS. A quick perusal of the GWAS catalog will allow one to note that the variant rs10119 has only been associated with a phenotype in four studies. The studies investigated intelligence, Alzheimer’s, cerebral amyloid deposition, and “health study participation”, respectively. This is despite the fact that large intelligence GWAS are included in the GWAS catalog, including Hill et al. (2019) (N=199,242, Nₑ=248,482); Lam et al. (2017) (N=107,207); Savage et al. (2018) (N=269,867); Smeland et al. (2019) (N=269,867); Sniekers et al. (2017) (N=78,308); and Trampush et al. (2017) (N=35,298). All but one of these studies had a larger sample than Davies et al. (2015), who only had 53,949.

It is, again, worth emphasizing that the variant is an Alzheimer’s SNP, which does seem to have pleiotropic effects on intelligence. It is unsurprising that a disease that occurs late in life would be differentially expressed (see footnote [1] in my original article). However, the claim that the effect size is zero earlier in life is not supported at all. They concluded this based on a significance test, which is known to be inappropriate for this purpose in most circumstances (Gelman & Stern 2006; Hartgerink et al. 2017). A quick perusal of the plot indicates that the effect size does indeed overlap zero at the earlier ages, but that the majority of the point estimates are negative at all points in time. [Plot: rs10119 effect size by age, from Davies et al. (2015)]
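Gelman & Stern’s point is easy to demonstrate with made-up numbers of roughly the right magnitude (these are not the actual Davies et al. estimates): one cohort’s estimate clears the 5% threshold and the other’s does not, yet the two estimates are statistically indistinguishable from one another:

```python
import math

def z(beta, se):
    """Wald z-statistic for an effect estimate and its standard error."""
    return beta / se

# Hypothetical per-cohort SNP effect estimates (illustrative only).
b_young, se_young = -0.02, 0.012
b_old,   se_old   = -0.05, 0.015

sig_young = abs(z(b_young, se_young)) > 1.96   # fails the 5% threshold: "no effect"
sig_old   = abs(z(b_old, se_old)) > 1.96       # clears it: "effect"

# Gelman & Stern: the difference between "significant" and "not significant"
# is not itself statistically significant -- so test the difference directly.
diff = b_old - b_young
se_diff = math.hypot(se_young, se_old)
sig_diff = abs(z(diff, se_diff)) > 1.96        # the cohorts do NOT significantly differ
```

Declaring the younger cohort’s effect “zero” on the basis of its non-significance is precisely the fallacy the line-by-line comparison above exposes.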

However, we should be slightly cautious here. The first reason is that GWAS usually don’t identify causal SNPs, but tag SNPs that are in linkage disequilibrium with the true causal SNP (Young et al. 2019). The differential association could result from the tag SNP being in weak linkage disequilibrium with the true causal SNP in one age cohort while being in strong linkage disequilibrium in the other, inducing a spurious age-moderation effect. There are also other possible causes, like ascertainment bias by age, gene-environment interactions, etc.
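The first of these mechanisms is easy to illustrate. In the sketch below (LD values and effect size invented), the causal variant has an identical effect in both cohorts, but because the tag SNP’s correlation with it differs between cohorts, the tag’s estimated effect differs dramatically:

```python
import numpy as np

rng = np.random.default_rng(1)
beta = 0.3   # true, age-constant effect of the causal variant
n = 100_000  # individuals per cohort

def tag_effect(ld_r):
    """OLS slope of phenotype on a *tag* SNP whose correlation with the
    causal variant is ld_r. The causal effect itself never changes."""
    causal = rng.standard_normal(n)
    tag = ld_r * causal + np.sqrt(1 - ld_r**2) * rng.standard_normal(n)
    phenotype = beta * causal + rng.standard_normal(n)
    return np.cov(tag, phenotype)[0, 1] / np.var(tag, ddof=1)

b_young = tag_effect(0.1)  # weak LD in one cohort: near-zero apparent effect
b_old   = tag_effect(0.9)  # strong LD in the other: large apparent effect
```

An analyst looking only at the tag SNP would report a striking “age moderation” of the effect, even though nothing about the causal biology varies by age.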

This is correct, but, as Mayhew and Mayre (2017) states, “Despite the many advantages of SEM, issues can arise when including too many variables in the model, particularly with a small sample size, and the constraint of the variance values to certain parameters can cause bias increasing heritability estimates.” Coincidentally, this appears to be a problem for DevSyst’s studies. For example, Wright (1931) uses a very unrepresentative sample of just 100 Californian school children (who were essentially picked by convenience).

Ironically, the cited excerpt would be a problem not for me, but for “g-loaded”, as Mayhew & Mayre (2017) claim that the heritability estimates would be biased upwards. In any case, the representativeness of the large literature used in Rao et al. (1976) shouldn’t be a large issue.

There are other critiques of path analysis in heritability analyses, some of which are given by Karlin et al. (1983), such as that they are strictly limited to linear models, and path analysts all tend to approach the data with different assumptions about it. A full critique of path analysis for estimating heritability, especially when pitted against twin studies, is beyond the scope of what need be covered.

I’m well aware of Karlin et al. (1983)’s critiques of path analysis, as well as the responses (Cloninger et al. 1983; Wright 1983) and counter-responses, and happen to agree with them. However, in comparison to twin studies, the validity of path analyses is far superior: path-analysis studies require fewer assumptions than twin studies, and have greater power.

DevSyst cites Rao et al. (1976) which isn’t a terrible study, but still has notable flaws. For example, they only list genotype as a derived factor in their model, rather than causal of some of the listed variables. This may be problematic for assuming outcomes (taking this as a critique may be a judgement call). Additionally, they fall into a trap that Karlin et al. describes which puts cultural effects on a pedestal even though they are generally non-linear (which is not good for path analysis; additionally, this critique is broad to just about all path analysts).

I don’t have any commentary on this other than to say that the twin model similarly assumes linearity, so it is quite strange to mount an attack here against the path-analysis models of Rao et al. while defending the validity of twin studies.

I think citing Rao et al. (1982) was quite a bold move on DevSyst’s part. The paper generally claims their model holds a lot of uncertainty and purports an agnostic position, not an anti-hereditarian position. Rao et al. state, “Because of these unresolved differences, there is considerable uncertainty in the estimates . . . Variable as they are, the estimates are strikingly consistent in implicating both genetic and cultural inheritance, with no clear preponderance of one over the other. There are no grounds for a strongly hereditarian or environmentalist position. Thus the conclusions as well as the methods are synarchic.”

I don’t have any dog in the “hereditarian” vs “anti-hereditarian” game, as I think that the dichotomy between genes and environments is nonsensical. As such, it is not quite a “bold move” on my part to cite Rao et al. (1982).

What happened here was DevSyst relied on path analyses to provide contrary evidence and did not even refute the modern meta-analyses suitably, considering the critiques against twin studies were not good.

The author seems to have missed something: the citation of Devlin et al. (1997), which found in their model-fitting analysis of the family resemblances from Bouchard and McGue (1981) (though see Capron et al. 1997) that a Wilson-effect model did not fit better (i.e. have a better Bayes factor) than their preferred model. Notably, Devlin et al. (1997) did not employ path analysis, but simply the standard model fitting that all behavioral geneticists employ in their work (see Schonemann 1989; 1997; Schonemann & Schonemann 1991; 1994).

1: This first point is against everything up until “The fact is that”. The issue is this would need to be empirically validated and proven. The tests are still loading towards the same g factor, as we can tell.

I’m not sure what this argument is supposed to consist of. If the author can clarify the actual point being made here, I’d love to respond.

As for the claim that “the tests are still loading towards the same factor”, I would love for the author to clarify whether they think g is a real variable having causal influences on scores on particular tests, or whether they live in the real world.

“nor is there a theory of ‘intelligence’ . . . that one can verify ‘increasing heritability of IQ’ with” This is one of those traps that anti-IQ people get sucked into (working on a response to Richardson), and it’s really just a non-argument. Even if there were no general theory of intelligence that solidified the modern IQ tests, this would not mean that the heritability of IQ scores can’t be measured, nor would it invalidate IQ as a measure. This sentence really doesn’t follow: “nor is there a theory of ‘intelligence’ . . . that one can verify the ‘increasing heritability of IQ’ with”.

The claim that “even if there were no general theory of intelligence that solidified the modern IQ tests, this would not mean that the heritability of IQ scores can’t be measured” violates the basic principles of science, as well as scientific psychology (Schonemann 1997).

The easiest way to defend classical twin studies is to point out that they are done under the Equal Environments Assumption (that is, the environments for MZ twins and DZ twins are assumed equal post-CoVGE), but DevSyst does not believe that the EEA is being held true. There is a great deal of debate to be had here, so it is best to simplify it by referring to different forms of twin studies, such as twins reared apart, which tend to find the same results.

Twins-reared-apart studies are of less scientific value than studies of twins reared together. That is to say that publishing these studies should count as a black mark on someone’s resume, given that they have produced not a single estimate worthy of scientific consideration and have instead confused people as to how development works (Kamin & Goldberger 2002; Joseph 2015).

However, note that in order for CoVGE to occur, genes must be playing a role and for it to increase, the genetic influence must increase as well. While Dickens and Flynn’s model (which is essentially just replicated in the papers that DevSyst cites) is based around the hidden environmental influences behind increasing population mean IQs, it is still dependent on direct and indirect genetic influence increasing. Dickens and Flynn state within the article, “As the fraction of variance explained by genes (both directly and indirectly) grows with age, the constancy of IQ over time grows—because genetic differences are stable. Therefore, it may not be a change in the child that increases the constancy of IQ with age, but rather a change in how much control the child is exercising over his or her environment.”

The author here is confused as to what gene-environment covariance consists of. It actually constitutes a violation of the equal environments assumption (see Carey 2019 on the smorgasbord model). The direct genetic effect does not change in the presence of gene-environment correlation; the indirect genetic effect, which is mediated by the environment, does. The Wilson effect here would not be the result of “genes” in a proximate sense (only in the ultimate sense), but the result of the environment. What this would appear as in a GWAS study is that the effect size for any given allele increases over the course of the lifespan, but for environmental rather than genetic reasons. This is precisely the amplification mechanism described by Dickens & Flynn (2001).

 First of all, no matter what the type of effect on IQ, something is almost always being controlled. Therefore, even though you have little control as a child, your shared environment will still be stabile, to some extent, and therefore whatever IQ score you acquire will be as well

The author also seems to misconstrue the actual point of Dickens and Flynn’s model, which is not about “control”, but about the fact that the environments twins experience in childhood are similar, yet become more dissimilar as they grow up (as a result of the initial phenotypic trajectory – see Turkheimer & Gottesman 1996; Turkheimer 2004), such that the shared environment component gets “transformed” into the nonshared environment.

Second of all, there is no reason to argue that you gain more control of your life by young adulthood. In fact, any young adult will tell you this is the opposite of true; college, dating, finding a job, becoming a mature adult and quitting drugs, etc. etc. Even high IQ people won’t have their lives fully figured out by their early 20’s and will all vary in lifestyle, yet they will still remain stable in IQ

I have no clue what relevance the stability of IQ that the author keeps bringing up is supposed to have, but I will simply clarify that this is an extension of the widespread models of gene-environment covariance wherein individuals increasingly choose their environments over the course of their lifespan (Scarr & McCartney 1983). What college, dating, finding a job, etc. have to do with the model is beyond me.
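The amplification mechanism at issue can be sketched numerically. In this toy model (every number invented for illustration), the direct effect of the genotype on the phenotype is held constant while the weight on a gene-correlated environment grows with age; a naive regression of phenotype on genotype nonetheless shows the allele’s apparent effect rising over the lifespan, exactly the pattern that gets read as “increasing heritability”:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
g = rng.standard_normal(n)                 # genotypic score
env = 0.5 * g + rng.standard_normal(n)     # environment correlated with genotype (rGE)

def apparent_effect(age_weight):
    """Regression slope of phenotype on genotype. The direct genetic effect
    is fixed at 1; only the environment's weight grows with 'age'."""
    phenotype = g + age_weight * env + rng.standard_normal(n)
    return np.cov(g, phenotype)[0, 1] / np.var(g, ddof=1)

b_child = apparent_effect(0.2)  # analytically 1 + 0.2 * 0.5 = 1.1
b_adult = apparent_effect(1.0)  # analytically 1 + 1.0 * 0.5 = 1.5
```

The “genes” here are doing exactly the same proximate work at every age; it is the environmental mediation that inflates the estimate, which is the distinction between proximate and ultimate causes drawn above.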

Of course, to DevSyst, this logical point may not matter, because there are multiple studies where they employ model-fitting and show that the reciprocal causation model is the best-fit model. There are two replies to this claim. First of all, as I have explained, the reciprocal causation model very well may pick up additional genetic effects.

The author doesn’t seem to have “explained” that the reciprocal causation model is “pick[ing] up additional genetic effects”; they seem merely to have asserted it.

Second of all, the method of SEM (model fitting, path analysis) has already been discussed as being flawed above.

The author seems to have created an arbitrary distinction between “good” SEMs (fitting ACE models to twin data) and “bad” SEMs (fitting more complex ACE models to twin data) such that the abundant evidence that gene-environment correlations are the cause of increasing h^2 figures over the lifespan can be waved away. There is no mathematical distinction, and current twin modeling for both is done in the same programs (OpenMx, lavaan, etc.).

All in all, the reciprocal causation model is not mutually exclusive to the Wilson Effect, rather it is a non-detrimental explanation which even Jensen may give credence to.

Of course: Jensen posited that the reciprocal causation model may be an adequate explanation for the (alleged) observation. However, the reason why a particular observation occurs is vitally important to our scientific conclusions. People like Bouchard and Jensen have falsely concluded that environmental variables do not influence IQ scores because of the (alleged) disappearance of the shared environment component in adulthood and the attenuation of the nonshared environment component (setting aside the logical question of whether these latent variable models adequately represent actual genes or environments; cf. Taylor 2012).

DevSyst does not provide compelling evidence that we can disregard the Wilson Effect. Arguments are made that may be useful in developing a foundation, but we will need much better studies in the future to determine the accuracy of any of these claims. For now, some things hold true. The first is that twin studies generally do their job well. The second is that genetic effects can still arise in the reciprocal causation model, and that it is generally faced with flaws if it purports to massively strengthen the environmental hypothesis. The third is that, clearly, more research needs to be done concerning the Wilson Effect to gain a full outlook of its effects. But, for now, it does hold enough empirical backing to be considered a legitimate theory, particularly relative to the data attempting to oppose it.

There are three things to be taken from this response article. The first is that selectively reading the evidence put forth by one author does not give anyone faith that the selective reader is engaging in good faith. The second is that the logical underpinnings behind hereditarian hypotheses are profoundly confused, given the belief that gene-environment correlations are helpful to, rather than a hindrance to, them. The last is that speaking on topics you do not adequately understand (model fitting, twin studies [1]) will cause the general populace to think you are blustering your way through complex scientific hypotheses rather than engaging in legitimate scientific inquiry.


[1] For a review of how twin studies are flawed, see Taylor (2007), Taylor (2008), Taylor (2009).