The Jewish Advantage

Jews are Intelligent and High-Achieving

Ashkenazi Jews are undoubtedly one of the most successful ethnic groups in the world. Richard Lynn (2011) and Steven Pinker (2011) have both written and made remarks to that effect. Lynn records that wherever Jews have gone they’ve shown exceptional educational, socioeconomic, and intellectual achievement. The extent of Jewish overachievement Lynn reports is sometimes astounding: Jews constituted 0.8% of Germany’s 1930s population and 24% of its Nobel laureates, 0.075% of Italy’s population and 24% of its laureates, and 2% of Russia’s population and 70% of its laureates, along with 10 of 14 Fields Prize or Wolf Prize winners for outstanding mathematical work and 15 of the 33 Russian grandmasters. Perhaps nowhere is Jewish achievement better highlighted than in America, where they’ve made up less than 3% of the 20th-century population but 62 of 200 Nobel Prize winners for science, literature, and economics, and 32% of Forbes’ 2009 400 richest Americans.

Figure 1 (Murray, 2003, p. 297). Note that philosophy is excluded because Jewish achievement in that area was so great from 1900–1950 that it distorted the other trendlines.

These scholars are not alone; the eminence of Jews has invited speculation for hundreds of years, from left and right, and both contemptuously and in admiration. Werner Sombart claimed that Jewish prominence was a result of an exceptional attachment to and drive for money; Houston Stewart Chamberlain said Jews had “abnormally developed will” and strong kin networks; Thorstein Veblen, admiring Zionists, believed that the marginal position of Jews motivated them to achieve success unconventionally; Daniel Moynihan ascribed exceptional skill in finance and business to America’s then-nascent Jewish minority; Friedrich Nietzsche regarded Jews as a race of potential-masters that freely decided against becoming Europe’s rulers (presumably because he believed they wanted to be absorbed into Europe, which may explain why he called for the expulsion of anti-Semites and the incorporation of Jews into the Prussian nobility); Cormac O’Grada writes that Jews have “bourgeois virtues such as sobriety, a desire to succeed, a dislike of violence, an emphasis on education and learning, and high self-esteem.” In contrast, Kevin MacDonald argues that Jewish achievement is explicable in terms of Judaism being a group evolutionary strategy that selected for ethnic nepotism and high verbal intelligence, which he argues are critical for their success (this viewpoint has been disputed recently by Nathan Cofnas, 2018).

Lynn also argues pointedly that Jewish achievement is biological, evolved. Instead of work ethics, culture, or relative functionalism, Lynn advances the theory that Jewish achievement is a natural consequence of a high level of intelligence. How they came to possess such natural talents is ultimately unknown, but Lynn advances three plausible theories:

  1. Eugenics: Wherein Jewish customs promoted the more intelligent to have more surviving children. This includes rabbis who were permitted to marry, in contrast with Christianity, which required celibacy among priests;
  2. Discrimination: Wherein Christian discrimination against Jews in the Middle Ages forced Jews into white-collar occupations that required a high level of intelligence, and;
  3. Persecution: Wherein the historical travails of Jews have selected against those who are unprepared. In this version of the story, less-intelligent Jews paid a higher price when pogroms, expulsions, and holocausts happened.

These theories are not mutually exclusive and any or all of them could be true to various degrees. Alternatively, none could be true and Jewish advantages could be explained by other evolutionary hypotheses or by some facet of their environments.

The second theory — Discrimination — is the same one advanced by Cochran, Hardy & Harpending (2006). These authors also add that the cognitive ability of Jews may be shaped by heterozygote advantage, explaining the high rates of genetic disorders like Tay-Sachs, Gaucher’s and torsional dystonia in the Jewish community (see Rivas et al., 2018). Rutgers anthropologist R. Brian Ferguson (2007) disputes this thesis, arguing instead that the Talmudic tradition is a coherent and plausible explanation for the high IQs of Jews. Notably, Lynn, Cochran, Hardy, and Harpending appear to be wrong on this second theory, because, as Botticini & Eckstein (2012) note, there is no evidence the laws necessary for this theory to be right even existed. Instead, they argue that Jews entered white-collar occupations because they were more lucrative and because religious custom made them both literate and advantaged when it came to this sort of business. This is the theory that Nicholas Wade (2014) believes.

But how big is the Jewish intellectual advantage and can it explain their high levels of achievement? The answer seems to be that it explains most of it, but it can be reasonably argued that it does not explain all of it. This post does not focus on the latter question, as it is discussed amply in Lynn (2011) and elsewhere.

Regarding the first question, Lynn has supplied a wealth of data suggesting that the average IQ of Ashkenazi Jews is between 109 and 115 depending on the type of cognitive battery they’re tested with. Lynn also provides some data on the IQs of Mizrahim, Sephardim, and Ethiopian Jews. The IQs of the Mizrahim are indistinguishable from those of Arabs, centering around 87 when sampled together; Sephardim have IQs of around 100; and Ethiopian Jews have an average IQ of 69. For comparison, White Gentiles have IQs of around 100. Their high average IQ makes Ashkenazi Jews the most intelligent group for which there are data sufficient to form such a conclusion. The 12-point advantage of Jews is seen in every country in which they reside; it travels with them whether they’ve moved to America, Russia, Latin America, or Israel. (It is important to note that the myth of low Jewish IQ at Ellis Island is just that, a myth; see Rushton, 1997; Snyderman & Herrnstein, 1983, 1985; the low national IQ of Israel is plausibly explained by the fact that the population is only partially Ashkenazi.)

To understand the Jewish advantage, we must understand Spearman’s hypothesis. Spearman’s hypothesis, in a sentence, is the idea that group differences in IQ are concentrated on the common (or general) factor of all cognitive ability tests, g. The greater a test’s affinity for this common factor, the greater its ‘g-loading.’ If something has an effect on g, it is called a Jensen effect (named after Berkeley educational psychologist Arthur R. Jensen by J. Philippe Rushton in 1998; in another vein, Dunkel et al. found evidence of a Jensen effect on the general factor of personality, suggesting that a Jewish personality advantage may contribute to their success). g is the most heritable component of cognitive tests, the component that constitutes almost all of a test’s predictive validity, and g-loadings have identity with heritability. Consequently, the difference between various groups (primarily Blacks and Whites) has been shown to be confined primarily to g. This is argued to be proof that group differences in IQ are partially genetic. Cognitive differences are not entirely a product of g, as the ‘strong’ form of Spearman’s hypothesis would hold, because there are consistent differences between groups on specific abilities once g has been removed from the results of a cognitive test battery. IQ scores thus primarily represent g, but also other capacities like memory, visuospatial, or mathematical ability.

The Jewish IQ advantage appears to conform with the ‘weak’ Spearman’s hypothesis where group differences are mostly due to g, but also to secondary abilities (te Nijenhuis et al., 2014; additional data on Jewish intelligence are provided by Lynn & Longley, 2006; Lynn, 2004; Dunkel, 2014; Backman, 1972; Storfer, 1990, p. 314; MacDonald, 1994, p. 190; Herrnstein & Murray, 1994, p. 275). (Note also that the Ashkenazi advantage seems to be recent because non-Ashkenazi Jews are not as intelligent. These other groups of Jews seem to resemble the populations near them.) Richard Nisbett (2009) writes humorously about the typical profile of Jewish IQ and why he believes the non-g differences to be influenced by genetics:

Before leaving the topic of Jewish IQ, I should note that there is an anomaly concerning Jewish intelligence. The major random samples of Americans having large numbers of Jewish participants show that whereas verbal and mathematical IQ run 10 to 15 points above the non-Jewish average, scores on tests requiring spatial-relations ability (ability to mentally manipulate objects in two- and three-dimensional space) are about 10 points below the non-Jewish average (Flynn, 1991a). This is an absolutely enormous discrepancy and I know of no ethnic group that comes close to having this 20 to 25-point difference among Jews. I do not for a minute doubt that the discrepancy is real. I know half a dozen Jews who are at the top of their fields who are as likely to turn in the wrong direction as in the right direction when leaving a restaurant. The single ethnic difference that I believe is likely to have a genetic basis is the relative Jewish incapacity for spatial reasoning. I have no theory about why this should be the case, but I note that it casts an interesting light on the Jews’ wandering in the desert for forty years!

If it is reasonable to draw conclusions about the etiology of the Jewish-White difference in cognitive profiles, it may also be valid to do the same for the differences between other groups. Jews are not the only group with a consistent ability profile net of g (note: nearly all of the difference in IQ for all groups is due to g alone, despite small profile differences when g is removed). Arthur Jensen & Cecil Reynolds (1982) recorded that when g was partialled out of tests, there was a consistent pattern of Black-White differences in certain specific abilities like memory, in which Blacks have an advantage relative to Whites (again, net of g):

Figure 2

This difference was not due to SES differences between the groups, as the profile differences were negatively correlated with within-race SES differences. This profile and the Black memory advantage combined with a disadvantage in g and spatial ability have been reliably found (Mayfield & Reynolds, 1997; Rushton, 1998, 2003; Naglieri & Jensen, 1987; Frisby & Beaujean, 2015). Thomas Jefferson (1781) actually noted that “Comparing them by their faculties of memory, reason, and imagination, it appears to me, that in memory they are equal to the whites….”

An important feature of g is that, unlike IQ scores, it does not appear to be malleable. Given that g is the predictive part of tests, this is a significant finding because it implies when we see gains on IQ tests, they’re not likely to confer substantial benefits. The relationships between g and a variety of factors are given in a dissertation by van Bloois & Geutjes (2009):

Table 1

Comprehensive literature reviews (e.g., Metzen, 2012) have established general relationships between certain classes of variables and g. The effect of biological variables like brain volume and body symmetry on g is nearly 1, the effect of biological-environmental variables like lead poisoning, prenatal cocaine exposure, neurotoxins (Woodley of Menie et al., 2018), and malnutrition is nearly 0, and the effect of non-biological, cultural, and socioeconomic variables is nearly -1. There is no empirical support for the claim that, for instance, moving from low to high socioeconomic status, as when children are adopted (Jensen, 1997; te Nijenhuis, Jongeneel-Grimen & Armstrong, 2015), will enhance general cognitive ability, even if observed (as opposed to latent) IQ scores are increased. Similarly, IQ gains from programs like Head Start (te Nijenhuis, Jongeneel-Grimen & Kirkegaard, 2014), pre-schooling and vitamin supplementation (Protzko, 2015), or phenomena such as the Flynn effect (te Nijenhuis & van der Flier, 2013; Rushton & Jensen, 2010) are unrelated to g, and in the former cases, fade away. Race is a Jensen effect, while lead poisoning is not.

It is possible that g-loadings are explained by confounding with culture (Kan et al., 2013). However, it is uncertain how this possibility comports with earlier works (e.g., Jensen & McGurk, 1987; above), the possibility that the mediation is itself due to g, recent and forthcoming data from the study of non-human cognition (e.g., Woodley of Menie et al., 2017; notably, g also appears to be a human universal per Warne & Burningham, 2019), life history theory (e.g., Woodley of Menie et al., 2013), culture fair test results, chronometric (Jensen, 1985, 1993; McGue et al., 1984; Shigehisa & Lynn, 1991; Chan & Lynn, 1989; Chan, Eysenck & Lynn, 1991; Ja-Song & Lynn, 1992; Lynn, Chan & Eysenck, 1991; Lynn & Holmshaw, 1990; Lynn & Ja-Song, 1993; Lynn & Shigehisa, 1991; for a summary see Rushton & Jensen, 2005, p. 245; cf. van de Vijver, 2008) and elementary cognitive task results (Pesta & Poznanski, 2008), and the fact that Spearman’s hypothesis remains valid when tests of crystallized intelligence are excluded (see Metzen, 2012). The lead author of this piece (Kan) was offered a chance to test his hypothesis more rigorously but neglected to respond. See Jensen (1998, p. 390) for more about the “vehicles” for g; Frisby & Beaujean (2015) and Urbach (1974) also deal briefly with this subject.

To review: Ashkenazi Jews have a 12-point advantage in cognitive ability compared to White Gentiles. The IQs of Sephardic Jews are not different from the IQs of White Gentiles. The Jewish advantage conforms with the weak form of Spearman’s hypothesis and thus may be expected to possess a genetic component. The Jewish advantage compared to White Gentiles is of the same psychometric nature as the Black-White difference. Differences in g are not explicable by normal aspects of the environment.

Returning to causes, Lynn & Satoshi Kanazawa (2008) argue against a cultural interpretation of the Jewish intellectual advantage. Their data are summarized in five points:

  1. The Jewish advantage in their sample is 9.25 IQ points. This implies that 9% of Jews have IQs above 130, compared to 2% of White Gentiles. At an IQ level of 145+, there are seven times as many Jews as White Gentiles: “An IQ advantage of this magnitude is sufficient to explain most and perhaps all of the high Jewish achievement.”
  2. Jews attach less importance to success and to studiousness than non-Jews, though these results are not significant. In the opposite direction, this is consistent with the observation in Rushton & Skuy (2000) (see also Jencks & Phillips, 1998) that Blacks are more studious than Whites;
  3. Jews attach more importance to four values than non-Jews: Considerateness, interest in how and why things happen (curiosity), judgment, and responsibility. It is difficult to see how these could impact Jewish achievement: “The results that Jewish parents are more likely to foster interest in how and why things happen suggest that this might contribute to the high Jewish achievement in science, but Jews have been equally successful in law, the humanities and business, for which an interest in how and why things happen would not seem to confer any obvious advantage.”
  4. Jews do not differ much in the values they would like their children to have: “Jews and non-Jews attach most importance to their children having good judgement, being considerate, honest and responsible, and Jews and non-Jews attach least importance to their children valuing cleanliness and appropriate sex role behaviour.”
  5. “[T]he results clearly support the high intelligence theory of Jewish achievement while at the same time provide no support for the cultural values theory as an explanation for Jewish success. Although the high Jewish IQ has been known for many decades, it has typically been ignored by historians, sociologists, and economists who have written on the high achievements of the Jews.”
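
The tail proportions quoted in the first point can be checked with a short calculation. The following is a minimal sketch, not taken from Lynn & Kanazawa: it assumes both distributions are normal with a common SD of 15 and a reference mean of 100, none of which is stated in the text, and the helper name `share_above` is hypothetical.

```python
from statistics import NormalDist


def share_above(threshold, mean, sd=15.0):
    """Fraction of a normal(mean, sd) population scoring above threshold."""
    return 1.0 - NormalDist(mean, sd).cdf(threshold)


# Assumed values: reference mean of 100 and a common SD of 15.
reference_mean = 100.0
shifted_mean = reference_mean + 9.25  # the 9.25-point gap reported in the text

for cutoff in (130, 145):
    high = share_above(cutoff, shifted_mean)
    low = share_above(cutoff, reference_mean)
    print(f"IQ > {cutoff}: {high:.2%} vs {low:.2%} (ratio {high / low:.1f})")
```

Under these assumptions the calculation gives roughly 8% versus 2% above 130 and a ratio a little above six at 145, close to the rounded figures quoted. Small departures would be expected if the original authors used different standard deviations or rounding.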

More generally it appears that vertical cultural transfer, where parents culturally bequeath their traits to children (a necessity for explaining the consistency of Jewish achievement without genetics), lacks support as a strong variance component in the wider behavior genetic literature (Martin et al., 1986; Eaves et al., 1999; Wadsworth et al., 2002; van Leeuwen, van den Berg & Boomsma, 2008; Hatemi et al., 2010; Vinkhuyzen et al., 2012; Hicks et al., 2013; Kandler, Gottschling & Spinath, 2016; Swagerman et al., 2017; Lyngstad, Ystrøm & Zambrana, 2018; Kornadt et al., 2018; Bell, Kandler & Riemann, 2018; Bergen et al., 2018). This is consistent with Scarr & Weinberg’s (1978) investigation of the effects of familial environments on intelligence in adopted and biological children. Thomas Bouchard (2018, p. 24) regards their study as “dispositive” evidence in favor of the non-influence of families on intelligence; it states:

The conclusion that we feel is justified by our data is that intellectual differences among children at the end of the rearing period have little to do with environmental differences among families that range from solid working class to upper middle class…. The persistent finding that differences in class background bias adult achievements has been interpreted to mean that differences in family environments during the child-rearing period enhance or impede the intellectual, educational, and occupational achievements of the offspring for a lifetime. From our data, it appears to us that these linkages should be reinterpreted to mean that differences in family background that affect IQ are largely the result of genetic differences among parents, which affect their own status attainments and which are passed on genetically to their offspring, whose status attainments are subsequently affected.

So where does this leave the search for causes? Presumably at a genetic conclusion. However, to many, that conclusion remains underdetermined without proper analysis using molecular genetic techniques. Luckily, new data are available which make possible not only the analysis of social class differences (e.g., Belsky et al., 2018; Belsky et al., 2016) in the genes predisposing individuals towards higher cognitive ability and more years in education (so-called ‘polygenic scores’ or PGS) but also ethnic group differences. These PGS are gaining in predictive validity each year and are capable of predicting market (Papageorge & Thom, 2017; Barth, Papageorge & Thom, 2018) and educational outcomes (Domingue et al., 2015; Schmitz & Conley, 2016), but also levels of general cognitive ability, g (Allegrini et al., 2018).


A Genetic Advantage?

Enter Dunkel, Woodley of Menie, Pallesen & Kirkegaard (2019). This paper purports to have assessed the mediation of the Jewish advantage via PGS in a comparison with White Gentiles in the Wisconsin Longitudinal Study (WLS). This was immediately replicated in the Health and Retirement Study (HRS). (Code is available here.)

The authors first compared the two largest Christian denominations in the WLS (all-White Lutherans and Catholics) in terms of their PGS and IQ scores. These groups were not significantly different in either respect, so they were combined for an average IQ of just under 100 (SD 14.8) and an n of 4630. The Jewish group had an average IQ of 108, an n of 53, and an SD of 14.6. The HRS had 6517 Christians and 212 Jews with similar statistics.

The small Jewish sample size may raise concerns over its representativeness. To address this possibility, the authors compared the salaries of Jews in Wisconsin to Jews in other states and found no significant difference. A comparison to the HRS also revealed no significant differences. Given these facts and the above data on Jewish IQ and SES, representativeness does not appear to be an issue.

General sample size complaints are not meaningful; anyone with a passing familiarity with statistical power realizes that this sample is more than large enough. To be sure, the authors conducted a power analysis and found that they had 0.998 power to detect the observed differences in the WLS (power was greater than 0.999 in the HRS). The mediation p-values make this abundantly clear.
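
For readers who want to see how such a power figure arises, here is a minimal sketch of a two-sample power calculation using a normal approximation. It is not the authors' analysis: the function name `two_sample_power`, the two-sided alpha of 0.05, and the choice of effect sizes are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(d, n1, n2, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test.

    d is the standardized mean difference; n1 and n2 are the group sizes.
    """
    z = NormalDist()
    se = sqrt(1.0 / n1 + 1.0 / n2)         # standard error of d under equal SDs
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    shift = abs(d) / se                    # expected test statistic
    return (1.0 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Group sizes from the WLS description; effect sizes are illustrative inputs.
for d in (0.54, 1.33):
    print(f"d = {d}: power = {two_sample_power(d, 53, 4630):.4f}")
```

The result depends heavily on which effect size is supplied (for example, the IQ gap of about half a standard deviation versus the larger PGS gap), so this sketch should not be expected to reproduce the 0.998 figure exactly.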

The results are unsurprising for those who expected a Jewish advantage. The higher average IQ of the Jews in the samples was mediated by substantially higher PGS than those of the White Christians. In the WLS, 71.6% of the White-Jewish difference was attributable to PGS and in the HRS this number was 73.1%. Differences remained mediated when SES was controlled (Jews had a 1.4 d advantage in SES), though it is worth noting that PGS also contributed to SES differences. Assortative mating on other genes for these traits may explain why so much of the mean group difference was mediated by a PGS that explains only a small proportion of individual variance. The high level of mediation could also be explained by group-level aggregation. The fact that these samples reached the same conclusions in different generations may be important to note.

The Jewish average PGS score of nearly 2 (d = 1.33) is higher than the average PGS for high-SES groups in other samples using the same PGS (see Belsky, above). (Without the exact numbers from these other studies it’s uncertain whether or not this is a valid inference.) This may vindicate Jensen’s (1973, 1998) assertion that groups with higher genetic potential for IQ should show disproportionately higher cognitive ability at each level of SES, just as Whites do when compared with Blacks. To confirm this conjecture, tests of regression to the mean comparing Jews and White Gentiles should be conducted.

It is possible that this result is explained by population stratification or that the predictive validity of the PGS is reduced in Jews because they’re distant from the group the PGS was trained in. The authors claim that this is unlikely because linkage decay occurs approximately linearly with fixation index (Fst) between a training and target population (Scutari, Mackay & Balding, 2016) and Ashkenazi Jews exhibit a negligible amount (defined as an Fst of between 0 and 0.05) of genetic differentiation with the training population (Tian et al., 2008). This concern may still apply but the likelihood is low.

A notable failure of PGS to generalize between populations gives credence to this concern, but it may be false credence. Uricchio et al. (2019) note that the height PGS derived from the GIANT consortium does not work in the UK Biobank (UKBB), while the educational attainment PGS does. If height has only been under indirect selection through being in linkage or having pleiotropy with intelligence (assumed to be the real target of selection), then it is likely that controls for ancestry will routinely eliminate the utility of height PGS across populations. This sort of control can be spurious; if it were to be applied to Kukekova et al.’s (2018) sample of foxes, for instance, it would almost certainly eliminate the differences between the groups even though they are known to have undergone differential selection. Ancestry can in some cases be thought of as a very strong signal of selection, particularly indirect selection.

The expected level of decay for this PGS in Ashkenazi Jews is small, but it could be large if there’s substantial non-European ancestry, as there’s expected to be in some Jewish communities. If the validity of the PGS were reduced in Jews, it would be a mystery why they score higher than Whites — as expected — when national intelligence level correlates so strongly with national PGS frequency (Piffer, 2015). To confirm the validity of the PGS, the authors could regress PGS on IQ in both groups and as a whole. The correlation for the combined group— r(PGS, IQ) — is 0.31. Disaggregated, the correlation in Jews is 0.07 (0.05 in the HRS; this value is lower because the IQ test is much less reliable) and the correlation in Christians is 0.3 (0.15). Correction for unequal variances raises the Jewish correlation to 0.071 (0.06). Restriction of range clearly does not justify the lower correlation in either sample. Part of the low correlation in the WLS sample is driven by three outliers, the removal of which increases the Jewish correlation to 0.2. The assessment of validity via regressing PGS against IQ may be biased against Jews because of a threshold effect for the PGS. When individuals with PGS below 0 are excluded, the correlation for the whole sample drops to 0.2 (0.07). At this point, there is barely any difference in the PGS-IQ correlation for Jews or Christians in either sample as far fewer Jews had a PGS below 0. Kurtosis is lower for Jews in both samples. The ρ in the WLS excluding sub-zero PGS is 0.185 for Jews and 0.18 for Christians. In the HRS, the weak IQ test — probably only good for determining disability and not giftedness — makes the Spearman correlation unreliable when the groups are restricted to values above 0. Unfortunately, more granular test information which could be used to correct for this is unavailable to the authors. 
More important for the possibility of predictive bias, however, is that there is no difference in the slopes between the groups, with or without these corrections; all of the difference is in the intercepts.
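
The "correction for unequal variances" mentioned above is not spelled out in the text. One standard candidate is Thorndike's Case II correction for range restriction, sketched here as an assumption rather than as the authors' procedure; the function name is hypothetical, and using the two reported IQ standard deviations as the variance ratio is a guess.

```python
from math import sqrt


def correct_range_restriction(r, sd_ratio):
    """Thorndike Case II correction.

    r is the correlation observed in the restricted group and sd_ratio is
    SD(unrestricted) / SD(restricted) on the selection variable.
    """
    return r * sd_ratio / sqrt(1.0 - r * r + r * r * sd_ratio * sd_ratio)


# Assumed inputs: the observed WLS correlation of 0.07 and the two IQ SDs
# reported earlier (14.8 and 14.6) as a stand-in for the variance ratio.
corrected = correct_range_restriction(0.07, 14.8 / 14.6)
print(f"corrected r = {corrected:.3f}")
```

Because the two standard deviations are nearly equal, a correction of this kind can only move the correlation slightly, which agrees with the statement that restriction of range does not account for the lower correlation.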

Figure 3

When the PGS and IQ means of all of the denominations in the WLS are plotted, there’s even less evidence of bias. Jews are an outlier, but only because when compared to Whites they have much higher IQs and PGS; they otherwise fit closely to the regression line.

The replication may have been less than ideal and better data are certainly still needed to reach a dispositive conclusion. There will be regression bias because the Jewish mean PGS is higher and the reliability will be systematically lower among Jews. That noted, the higher correlations for Jews with a few controls seem to contradict the idea that the PGS has lost substantial predictive validity for whatever reason. Even if these ad hoc corrections are not justifiable, the result is probably robust, as Figure 3 makes clear (though the success of the replication could be due to the source of error being population structure; the authors ought to investigate the PGS frequencies in Arabs, Mizrahim, and Sephardim). Assuming that there is no test bias (which could be assessed by computing congruence coefficients or constructing a measurement invariance model), the PGS result is probably causal.

One might be tempted to theorize that differences in environmental circumstances continue to confound this association, however. It can be imagined that, for instance, North and South Koreans would have different correlations between height and PGS because, even with a higher PGS, environmental deprivation is substantial enough that a greater genetic propensity among North Koreans would be suppressed. The correlation between the PGS and height in both populations may be the same but environmental advantage has uplifted South relative to North Koreans. This is irrelevant to the issue of mediation of group differences. Furthermore, in this example, an equivalent regression would likely yield the same slope; it is the intercept which would be shifted downwards in randomly-sampled North Koreans, implicating environmental factors as the cause of mean differences. This result would be immediately determinable in this sort of analysis.

The North/South Korea example may be irrelevant either way because while educational attainment is environmentally malleable (with part of the PGS effect being due to uninherited parental PGS via the SES niche they create for their children), IQ does not appear to be similarly molded (Willoughby & Lee, 2017; Bates et al., 2018; Bates et al., 2019). The fact that PGS seem to principally affect g (e.g., Trzaskowski et al., 2013; Rimfeld et al., 2015; for the “generalist genes” concept see Plomin & Spinath, 2002) and that g does not display malleability is significant for this analysis: it implies that environmental confounding is not a probable explanation for at least the IQ results.


Implications for Other Group Differences

The fact that a racial group difference (between Ashkenazi Jews and Whites) is mediated by known genes and that this is likely an underestimation of the true mediation (because of the limited variance accounted for by current PGS and issues of transethnic validity) has disruptive implications for the study of group differences. This finding militates against:

  1. Explanations of ethnic group differences in IQ which are based on SES;
  2. Explanations based on appearance-based discrimination;
  3. Cultural explanations;
  4. X-factors.

Regarding SES, the Jewish group has a 1.4 d advantage over the Christian sample. The equivalent for Blacks and Whites is 0.65 d (Warne, personal communication). The larger SES difference between Whites and Jews compared to Whites and Blacks, coupled with a smaller IQ difference, evinces that SES either has diminishing returns at higher levels or is otherwise not the full story. The fact that the Black-White gap has remained unchanged despite a shrinking SES gap and the fact that the Jewish-White gap has remained unchanged despite a growing SES gap means that skepticism regarding SES as a cause for group differences is warranted.

With SES controlled, Jews continued to show a PGS and IQ advantage. Similarly, the Black-White gap remains and even grows larger with higher SES. This result can be explained in terms of regression to the mean or discrimination in favor of Blacks relative to Whites. Because the Jewish advantage remains, it may also be the case that the White IQ-SES relationship will be flat compared to Jews, as it is for Blacks compared to Whites. This makes discrimination in favor of Blacks a less likely explanation if true. The difficulty of finding large samples of Jewish siblings and parents with IQ and SES data makes this hypothesis hard to test.

Appearance-based discrimination is hard to reconcile with these results. Analysis of the NLSY79 and 97 already reveals that darker skin color among Black sibling pairs is not related to lower IQ like it is in the general population. Other work has found that mixed-race children incorrectly identified as being Black still score in between Blacks and Whites (Scarr & Weinberg, 1976) and that more African-looking mixed-race individuals still score between Blacks and Whites (Rowe, 2002). More importantly, the more similar appearances and environments of monozygotic twins appear not to bias their similarity (Breland, 1972; Matheny, 1979; Barnes et al., 2014, supplement).

How should Jews (who are distinguishable or indistinguishable from Whites depending on who you ask) be affected by appearance-based discrimination, and why should this increase their IQ? One of the implications of appearance discrimination-based theories of ethnic differences in IQ is that there should be a substantial expression of skin color GWAS in the brain and equally, there should be high expression of IQ GWAS in the integumentary system in affected groups. It is not likely that Whites are benefitted by positive appearance-based discrimination effects on IQ (or penalized relative to Jews) because IQ GWAS are not enriched in the integumentary system. It is uncertain if this is also true for Blacks and Jews (obviously in opposite directions). If we consider Jews to be indistinguishable from White Gentiles, then the discrimination explanation is undermined. Of course, the Black-White and Jewish-White differences can have different causes, but measurement invariance in comparisons means that discrimination is unlikely to affect the differences, as it would have to be a factor absent in Whites and present in Blacks. If it played a role in Jewish-White differences, the same would be true.

The fact that there was no Jewish effect net of PGS and SES suggests that culture is not playing a large causal role in Jewish IQ. The differences in cognitive ability between Sephardic and Ashkenazi Jews are also consistent with an evolutionary but not a cultural theory. Culture may only play a role insofar as Jewish culture is related to a better education or SES niche. This is not likely to affect levels of g, though it may lead to greater heritability in Jews relative to Whites and Blacks, and it will certainly lead to higher educational attainment. The likelihood that IQ heritability is enhanced in Jews relative to Whites is minimal given the minimal differences in the effects of PGS and small SES effect on genetic expressivity and heritability (see Woodley of Menie, Pallesen & Sarraf, 2018; Figlio et al., 2017), and the fact that Blacks and Whites have the same heritabilities despite their SES differences (Fuerst, 2014). Sue & Okazaki (1990) have emphasized that when it comes to a similar difference, that between Asians and Whites, the cultural values surrounding education in Asian homes are associated with lower test scores in other groups, and values associated with achievement are more common in White families. The East Asian IQ advantage seems to persist in adoption.

X-factors are already a tenuous — indeed, pseudoscientific — explanation of group differences. Proponents of X-factors recognize that within-group heritabilities are high, but maintain that elusive, empirically unidentified non-genetic variables that contribute to differences between groups, but not within them, affect the observed group differences. When Blacks and Whites are matched (in a proper hierarchical or stepwise regression, not a multiple regression, which is susceptible to something called the Sociologist’s Fallacy) on virtually all environmental variables that associate with IQ within races and the difference remains, X-factors are generally proposed to explain the rest of the difference despite the fact that genetic factors constitute the largest source of variance within groups. The concept of an X-factor is ad hoc, empirically unfounded, unfalsifiable, and unparsimonious, but it has maintained clout — both tacitly and explicitly — because it’s favored over a genetic explanation. There are no reasons to believe in X-factors. Among the empirical reasons to doubt them are:

  1. The Rowe studies. In 1994, Rowe and colleagues (a, b) demonstrated that the equivalence of covariance matrices in Blacks, Whites, Asians, and Hispanics could be extended beyond the normal 2x2 format of equivalent regressions for outcomes such as [(estimated true) IQ, grades]. The authors evinced equivalent group matrices for up to 10x10 matrices in seven large studies (Tucson Substance Use Study, NLSY, Wisconsin/California Study, Bowling Green Study, Richmond Youth Project, and Prevention Study). If there were X-factors (or race-specific gene-environment correlations and interactions) at play in the group differences between Whites and Blacks, presumably they would ramify throughout the covariance matrices and produce differences between the different groups. It was even tenable to split the group matrices and randomly add them to the other groups; practically, this implies no novelties in the developmental process for Whites or Blacks. Rowe and friends went further with this in 1995, replicating their 1994 results with 8x8 matrices in the NLSY and then fitting an SEM to a correlation matrix pooled across groups. The SEM fit, confirming a lack of evidence for minority-specific developmental processes, and it was also shown, using data from multiple years, that the associations between family environments and academic achievement were non-causal. Rowe & Cleveland (1996) again repeated these experiments and found invariant variance and covariances among three achievement tests in samples of White and Black full- and half-siblings. They then compared different models, where similarities and dissimilarities were explained variously by genetic and non-genetic influences, and the model which fit best for the groups included genetics.
This suggested the most likely scenario was the same influences producing within-group differences in academic achievement (genes + environments) and between-group differences, in the same magnitudes, with no racially-unique factors involved in the observed differences. Jensen (1998, p. 465) described this method and these papers and replicated these results using data from the Georgia Twin Study. More detail on this form of biometric decomposition can be found in Dolan (1992), Dolan, Molenaar & Boomsma (1992, 1994), Rodgers & McGue (1994), and Rodgers, Rowe & Li (1994).
  2. The invariance of the Flynn effect between racial groups within the same country. Ang, Rodgers & Wänström (2010) found no differences in the magnitude of the Flynn effect across demographic groups, implying that the environmental improvements underlying it reach both races equally and in a similar manner. The factors influencing cognitive ability are likely highly similar between races, or at least change similarly over time. This does not prove the non-existence of incredibly consistent X-factors, but it does constrain them to be the same factors, or multiple influences with the same impacts, across generations. This makes influences like racism very unlikely unless they’re specified in a likely ad hoc and unfalsifiable manner.
  3. Nothing else is affected. X-factors seem to have no impact on Black psychomotor development, self-esteem, suicide rates, aspirations, body satisfaction, &c. Some of these results are given in Table 2 (from Dalliard, 2014).
Table 2

4. The structure of cognitive abilities doesn’t vary much by group. This is relevant and it is no strawman. An X-factor might be expected to reverberate in such a way as to change the structure of cognitive abilities by race. There are also those who claim that the Black-White score gap is due to fundamental mental differences in the structure of cognitive abilities, the thought processes, and not simply the level in each group. Craig Frisby (1999, p. 199) describes this as the belief that “Lower-scoring American minority groups are exotic, having cultural traits and background experiences that are so unusual as to lay waste to traditional interpretations of cognitive abilities and its measurement with traditional instruments.” This patent absurdity is an excuse to dismiss the results of IQ tests and nothing more. It has no substantive value and as a hypothesis, it is not only wanting but insulting. Empirically, the facts are very different (see Wilson et al., 1975; Carretta & Ree, 1995; Ree & Carretta, 1995; Kaufman, Kaufman & McLean, 1995; Kush et al., 2001; Frisby & Beaujean, 2015).

5. Measurement invariance almost always holds for natives of the same country and native users of the same languages (more on this concept can be read here). In a multiple-group confirmatory factor model in which measurement invariance holds and residual variances are equated, it is almost certain that the factors being compared and the sources of variance in those factors are the same in the groups being compared (Lubke et al., 2003, 2010; Raykov, 2004). Gitta Lubke and colleagues write:

Consider a variation of the widely cited thought experiment provided by Lewontin (1974), in which between-group differences are in fact due to entirely different factors than individual differences within a group. The experiment is set up as follows. Seeds that vary with respect to the genetic make-up responsible for plant growth are randomly divided into two parts. Hence, there are no mean differences with respect to the genetic quality between the two parts, but there are individual differences within each part. One part is then sown in soil of high quality, whereas the other seeds are grown under poor conditions. Differences in growth are measured with variables such as height, weight, etc. Differences between groups in these variables are due to soil quality, while within-group differences are due to differences in genes. If an MI model were fitted to data from such an experiment, it would be very likely rejected for the following reason. Consider between-group differences first. The outcome variables (e.g., height and weight of the plants, etc.) are related in a specific way to the soil quality, which causes the mean differences between the two parts. Say that soil quality is especially important for the height of the plant. In the model, this would correspond to a high factor loading. Now consider the within-group differences. The relation of the same outcome variables to an underlying genetic factor are very likely to be different. For instance, the genetic variation within each of the two parts may be especially pronounced with respect to weight-related genes, causing weight to be the observed variable that is most strongly related to the underlying factor. The point is that a soil quality factor would have different factor loadings than a genetic factor, which means that… [t]he MI model would be rejected.
In the second scenario, the within-factors are a subset of the between-factors. For instance, a verbal test is taken in two groups from neighborhoods that differ with respect to SES. Suppose further that the observed mean differences are partially due to differences in SES. Within groups, SES does not play a role since each of the groups is homogeneous with respect to SES. Hence, in the model for the covariances, we have only a single factor, which is interpreted in terms of verbal ability. To explain the between-group differences, we would need two factors, verbal ability and SES. This is inconsistent with the MI model because, again, in that model the matrix of factor loadings has to be the same for the mean and the covariance model. This excludes a situation in which loadings are zero in the covariance model and nonzero in the mean model.
As a last example, consider the opposite case where the between-factors are a subset of the within-factors. For instance, an IQ test measuring three factors is administered in two groups and the groups differ only with respect to two of the factors. As mentioned above, this case is consistent with the MI model. The covariances within each group result in a three-factor model. As a consequence of fitting a three-factor model, the vector with factor means, α in Eq. (9), contains three elements. However, only two of the element corresponding to the factors with mean group differences are nonzero. The remaining element is zero. In practice, the hypothesis that an element of α is zero can be investigated by inspecting the associated standard error or by a likelihood ratio test.
In summary, the MI model is a suitable tool to investigate whether within- and between-group differences are due to the same factors. The model is likely to be rejected if the two types of differences are due to entirely different factors or if there are additional factors affecting between-group differences. Testing the hypothesis that only some of the within factors explain all between differences is straightforward. Tenability of the MI model provides evidence that measurement bias is absent and that, consequently, within- and between-group differences are due to factors with the same conceptual interpretation.

In that study, measurement invariance was found to be tenable in the Georgia Twin Study data which Jensen used for his earlier biometric decomposition (among many other published uses). Measurement invariance nearly always holds (Gustafsson, 1992; Pandolfi, 1997; Keith et al., 1999; Reed, 2000; Dolan, 2000; Dolan & Hamaker, 2001; Kush et al., 2001; Floyd, Gathercoal & Roid, 2004; Edwards & Oakland, 2006; Drasgow et al., 2010; Kane & Oakland, 2010; Trundt, 2013; Beaujean & McLaughlin, 2014; Blankson & McArdle, 2015; Barnes et al., 2016; Scheiber, 2015, 2016a, b; Trundt et al., 2017); an X-factor would presumably violate this condition and cause the model to be rejected. An X-factor would have to be specified in such a way as to only affect singular variables on their own, to avoid affecting multiple traits, and, given that the weak form of Spearman’s hypothesis has been evidenced (Frisby & Beaujean, 2015; Hu & Woodley of Menie, 2019), affect only the g factor. Unfortunately for an X-factor theory, there have been no environmental Jensen effects identified which could explain observed racial differences (Metzen, 2012).
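
The first scenario in the Lubke et al. passage quoted above (seeds split between rich and poor soil) can be illustrated with a small simulation. This is only a rough sketch under stated assumptions, not the procedure used in that paper: the coefficients, sample size, and names such as `grow` are hypothetical, and a principal component of the pooled within-plot covariance stands in for the within-group factor loadings instead of a fitted multi-group confirmatory factor model. Under a single-factor MI model, the between-group mean differences should be roughly proportional to the within-group loadings; here they are not.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000  # seeds per plot (hypothetical sample size)


def grow(soil_quality, n, rng):
    """Simulate height and weight for one plot of seeds.

    Assumed setup: within a plot, plants differ through a genetic factor
    that loads mostly on weight; soil quality mostly shifts height.
    """
    genes = rng.normal(size=n)
    height = 0.3 * genes + 1.0 * soil_quality + rng.normal(scale=0.5, size=n)
    weight = 0.9 * genes + 0.2 * soil_quality + rng.normal(scale=0.5, size=n)
    return np.column_stack([height, weight])


rich = grow(1.0, n, rng)  # seeds sown in high-quality soil
poor = grow(0.0, n, rng)  # seeds sown in poor soil

# Between-group pattern: mean difference on each observed variable.
mean_diff = rich.mean(axis=0) - poor.mean(axis=0)

# Within-group pattern: first principal component of the pooled
# within-plot covariance, used as a stand-in for factor loadings.
pooled_cov = (np.cov(rich, rowvar=False) + np.cov(poor, rowvar=False)) / 2
eigvals, eigvecs = np.linalg.eigh(pooled_cov)  # eigenvalues in ascending order
within_loadings = eigvecs[:, -1] * np.sqrt(eigvals[-1])

# Height-to-weight ratios; these are unaffected by the eigenvector's sign.
between_ratio = mean_diff[0] / mean_diff[1]
within_ratio = within_loadings[0] / within_loadings[1]

print("between-group height:weight ratio:", round(float(between_ratio), 2))
print("within-group  height:weight ratio:", round(float(within_ratio), 2))
```

In this toy setup height dominates the between-plot differences while weight dominates the within-plot variation, so the two patterns are not proportional, which is the kind of mismatch the quoted passage says would lead the MI model to be rejected.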

6. Jensen effects occur as well in animals, where no X-factor plausibly explains differences (Fernandes, Woodley & te Nijenhuis, 2014; Woodley of Menie, Fernandes & Hopkins, 2015; Burkhart, Schubiger & van Schaik, 2017). Indeed, even the biological correlates (Hopkins, Li & Roberts, 2018) and effects of neurotoxins (Rice, 1998) are the same in animals. Different species observed to have different behavioral repertoires and brain sizes in the wild also maintain their differences in the homogeneous environment of zoos (Forss et al., 2016; see also Damerius et al., 2018).

7. It must vary with partial ancestry, so, for instance, Mulattoes have half the X-factor exposure of full-Blacks (Kirkegaard, Fuerst & Meisenberg, 2018).

8. It must have been consistent since the recording of group levels of intelligence began (Kirkegaard, Fuerst & Meisenberg, 2018, p. 49).

9. It must actually have an effect (see Flynn, 1980, p. 47–50) and to be consistent with the rest of the literature, it will need to affect g and only g, unlike any other discovered environmental factor (Metzen, 2012).

10. It must be consistent with the fact that regression to the mean results (the context in which Thoday and Jensen originally debated this concept in 1973) have been constant in different generations and that the column of White-Black differences is highly correlated with the column of sibling differences, known to be due to genes, and the columns of sibling differences for Blacks and Whites differ by the amount expected from their mean differences (Jensen, 1998, p. 471–472). This is certainly explicable by an X-factor, but it would have to be such that it always mimicked genetic effects in its stability, its relationship to genetic variables elsewhere, and so on. It would also have to operate in this way while being contained within the unique or non-shared environment (Murray, 1999; note also that if the only environmental component of individual differences in intelligence in adulthood is unshared environment [see Bouchard, 2013; Briley & Tucker-Drob, 2013; Tucker-Drob & Briley, 2014; Segal et al., 2007; Allegrini et al., 2018; Hunt, 2010, p. 227; Plomin et al., 1997; Lee, 2010, p. 248; Wilson, 1976] and there are no race-specific non-genetic factors at play, the group mean differences will be due entirely to the non-stationary variance component, genes, as equal error and randomness do not make systematic contributions to mean differences; see Rushton, Brainerd & Pressley, 1983; Ossenkopp & Mazmanian, 1985; van der Linden et al., 2018). Murray writes:

The sibling results presented here add another constraint to the burden on those who take a strict environmentalist position. Proponents of the convergence hypothesis must not only posit an environmental Factor X that affects blacks but not whites and that is relatively uniform in its effects across the range of IQ (possibly having a greater effect as IQ goes up). Because the shared environment is the same for both siblings by definition, they must also posit a causal mechanism that is expressed through the nonshared environment. This requirement is accentuated by the results… showing that differential regression to the mean is virtually unaffected by matching for parental income and education along with subject IQ. As a final consideration, a Factor X that satisfies the rest of the conditions must also be quite powerful if it is to produce BW differences in regression to the mean commensurate with those observed in the sibling samples.
Combining these constraints — a Factor X that is powerful, pervasive, uniform across the range of IQ, located in the nonshared environment, and consistent with equivalence of within-race heritability and within-race developmental processes — poses difficult problems, beginning with a basic problem of description. A defining feature of the nonshared environment is that it is random across siblings. How does one conceptualize a nonshared environment that is random with respect to siblings and both powerful and systematic with respect to race?

And now, with the present study illustrating that the Jewish-White difference is mediated substantially by genes and that culture has no significant residual effect on ability, an X-factor must be compatible with the fact that the same source of genetic variance that gives rise to individual cognitive differences also contributes substantially to the group difference. The EA1 PGS also had substantial validity in both Africans and Europeans, mediating that difference, though only marginally because of the small amount of variance it explained. This implies common causal variants (Marigorta & Navarro, 2013; see also Zanetti & Weale, 2018). (Much of the above assumes the same X-factor effects between, e.g., Blacks and Whites, and Jews and Whites, although this may not be tenable.)

Other research has discovered a relationship between genetically and ecologically-assessed admixture and group differences in ability (Lima-Costa et al., 2018) and socioeconomic status (Kirkegaard, Wang & Fuerst, 2016). The effects of admixture on various outcomes are also consistent — regardless of socioeconomic and geographical confounding — across the Americas (Fuerst & Kirkegaard, 2016; Christainsen, 2016). A large literature on color-based discrimination has shown that sibling controls virtually eliminate the negative relationships between color and outcomes (e.g., Francis-Tan, 2016; Mill & Stein, 2016; Marteleto & Dondero, 2016; Rangel, 2015; Kizer, 2017; Telles, 2006); more, similar research is forthcoming.


This study adds to a large, consistent, and underappreciated literature illustrating that genetics are important for individual differences, and that group differences are probably not more than individual differences. The authors should be commended.