IQ is largely a pseudoscientific swindle

Nassim Nicholas Taleb
Published in INCERTO, Jan 2, 2019. 14 min read.


For some technical backbone to this piece, see here.

(Revised draft: added comments on sinister country profiling. Also: 1) Used the same data as the researchers to find that R² for IQ-wealth and IQ-income is effectively 0, in spite of the circularity. 2) It turns out that IQ beats random selection in the best of applications by less than 6%, typically <2%, as the computation of correlations has a flaw and psychologists do not seem to know the informational value of correlation in terms of “how much information do I gain about B knowing A” and propagation of error (intra-test variance for a single individual). 3) Added information showing that the story behind the effectiveness of Average National IQ is, statistically, a fraud. The psychologists who engaged me on this piece, with verbose writeups, made the mistake of showing me the best they got: papers with the strongest pro-IQ arguments. They do not seem to grasp what noise/signal really means in practice.)

Background: “IQ” is a stale test meant to measure mental capacity but in fact mostly measures extreme unintelligence (learning difficulties), as well as, to a lesser extent (with a lot of noise), a form of intelligence, stripped of 2nd order effects: how good someone is at taking some type of exams designed by unsophisticated nerds. It is via negativa not via positiva. Designed for learning disabilities, and given that it is not much needed there (see argument further down), it ends up selecting for exam-takers, paper shufflers, obedient IYIs (intellectuals yet idiots), ill adapted for “real life”. (The fact that it correlates with general incompetence makes the overall correlation look high, even when it is random, see Figures 1 and 2.) The concept is poorly thought out mathematically by the field (commits a severe flaw in correlation under fat tails and asymmetries; fails to properly deal with dimensionality; treats the mind as an instrument not a complex system), and seems to be promoted by

  • Racists/eugenicists, people bent on showing that some populations have inferior mental abilities based on IQ test = intelligence; those have been upset with me for suddenly robbing them of a “scientific” tool, as evidenced by the bitter reactions to the initial post on Twitter and the smear campaigns by such mountebanks as Charles Murray. (Something observed by the great Karl Popper: psychologists have a tendency to pathologize people who bust them by tagging them with some type of disorder, or personality flaw such as “childish”, “narcissist”, “egomaniac”, or something similar.) Note the online magazine Quillette seems to be a cover for a sinister eugenics program (with tendencies I’ve called “neo-Nazi” under the cover of “free thought”). Note I am finding statistical flaws in the work of Robert Plomin, the pope of twin studies (see intransitivity of correlation in my technical addendum; he doesn’t get it).
  • Psychometrics peddlers looking for suckers (military, large corporations) buying the “this is the best measure in psychology” argument when it is not even technically a measure: it explains at best between 2 and 13% of the performance in some tasks (those tasks that are similar to the test itself) [see interpretation of .5 correlation further down], minus the data massaging and statistical cherrypicking by psychologists; it doesn’t satisfy the monotonicity and transitivity required to have a measure (at best it is a concave measure). No measure that fails 80–95% of the time should be part of “science” (nor should psychology, owing to its sinister track record, be part of science (rather scientism), but that’s another discussion).
Typical confusion: Graphs in Intelligence showing an effect of IQ on income for a large cohort. Even ignoring circularity (test takers get clerical and other boring jobs), injecting noise would show the lack of information in the graph. Note that the effect shown is lower than the variance between tests for the same individual!
The asymmetry is clear here.
Fig 1: The graph that summarizes the first flaw (assuming thin tailed situations), showing that “correlation” is meaningless in the absence of symmetry. We construct (in red) an intelligence test (horizontal), that is 100% correlated with negative performance (when IQ is, say, below 100) and 0% with upside, positive performance. We progressively add noise (with a 0 mean) and see correlation (on top) drop but shift to both sides. Performance is on the vertical axis. The problem gets worse with the “g” intelligence based on principal components. By comparison we show (graph below) the distribution of IQ and SAT scores. Most “correlations” entailing IQ suffer the same pathology. Note: this is in spite of the fact that IQ tests overlap with the SAT! (To echo Haldane, one ounce of rigorous algebra is worth more than a century of verbalistic statisticopsycholophastering).
  • It is at the bottom an immoral measure that, while not working, can put people (and, worse, groups) in boxes for the rest of their lives.
  • There is no significant statistical association between IQ and hard measures such as wealth. Most “achievements” linked to IQ are measured in circular stuff s.a. bureaucratic or academic success, things for test takers and salary earners in structured jobs that resemble the tests. Wealth may not mean success but it is the only “hard” number, not some discrete score of achievements. You can buy food with $30, not with other “successes” s.a. rank, social prominence, or having had a selfie with the Queen.
The informational interpretation of correlation, in terms of “how much information do I get about A knowing B”. Add to that the variance in results of IQ tests for the very same person.
An extension of the first flaw that shows how correlations are overestimated. Probability is hard.
  • Psychologists do not realize that the effect of IQ (if any, ignoring circularity) is smaller than the difference between IQ tests for the same individual (correlation is 80% between test and retest, meaning you being you explains less than 64% of your test results and, worse, you are two thirds of a standard deviation away from yourself).
  • Some argue that IQ measures intellectual capacity — real world results come from, in addition, “wisdom” or patience, or “conscientiousness”, or decision-making or something of the sort. No. It does not even measure intellectual capacity/mental powers.
  • Dead Man Bias: Even if there were linearity and symmetry to IQ, the mere fact that on the left there is an absorbing state (dead is 0 IQ) without an equivalent to the right induces a severe bias.
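
The construction described in the Fig 1 caption can be reproduced in a few lines. What follows is a rough sketch and not the code behind the figure: the threshold of 100, the standard deviation of 15, the noise level, and the function names are all assumptions picked for illustration. Performance tracks the score only below the threshold and is pure noise above it, yet the overall correlation still comes out high.

```python
import math
import random

def pearson(xs, ys):
    """Plain sample Pearson correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate(n=20000, threshold=100.0, noise_sd=5.0, seed=1):
    """Hypothetical test: informative below the threshold, uninformative above."""
    rng = random.Random(seed)
    scores, perf = [], []
    for _ in range(n):
        s = rng.gauss(100.0, 15.0)
        # Below the threshold performance follows the score one for one;
        # above it the score contributes nothing, only zero-mean noise remains.
        p = min(s - threshold, 0.0) + rng.gauss(0.0, noise_sd)
        scores.append(s)
        perf.append(p)
    return scores, perf

scores, perf = simulate()
low = [(s, p) for s, p in zip(scores, perf) if s < 100.0]
high = [(s, p) for s, p in zip(scores, perf) if s >= 100.0]

corr_all = pearson(scores, perf)
corr_low = pearson(*zip(*low))
corr_high = pearson(*zip(*high))
print(f"overall {corr_all:.2f} | below 100: {corr_low:.2f} | above 100: {corr_high:.2f}")
```

Raising `noise_sd` lowers all three numbers, but the single overall figure never reveals that all of the association sits on one side.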

If you want to detect how someone fares at a task, say loan sharking, tennis playing, or random matrix theory, make him/her do that task; we don’t need theoretical exams for a real world function by probability-challenged psychologists. Traders get it right away: hypothetical P/L from “simulated” paper strategies doesn’t count. Performance=actual. What goes on in people’s heads as a reaction to an image on a screen doesn’t exist (except via negativa).

IQ and wealth at low scale (outside the tail). Mostly Noise and no strikingly visible effect above $40K, but huge noise. Psychologists responding to this piece do not realize that statistics is about not interpreting noise. From Zagorsky (2007)
There is little information in IQ/Income. From Zagorsky (2007). I redid the data and found suspicious selection from the NLS database that truncates income, wealth, and IQ in the tails, which artificially boosts R². [Will follow up with my own study since R² appears to be <.01 for income and <.02 for wealth, in spite of the circularity of test taking!]
R² is effectively zero!

Fat Tails: If IQ is Gaussian by construction (well, almost) and if real world performance were, net, fat tailed (it is), then either the covariance between IQ and performance doesn’t exist or it is uninformational. It will show a finite number in sample but doesn’t exist statistically, and the metrics will overestimate the predictability. Another problem: when they say “black people are x standard deviations away”, they don’t know what they are talking about. Different populations have different variances, even different skewness and these comparisons require richer models. These are severe, severe mathematical flaws (a billion papers in psychometrics wouldn’t count if you have such a flaw). See the formal treatment in my next book.
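
A small simulation gives a feel for the "finite in sample, nonexistent statistically" point. This is a minimal sketch under assumptions made up for the purpose: a Gaussian score, performance equal to the score plus noise, and a symmetric Pareto noise with tail index 1.5 (finite mean, infinite variance) standing in for the fat-tailed case. Every sample returns a correlation, but under fat tails the number jumps from one sample to the next far more than under Gaussian noise.

```python
import math
import random

def pearson(xs, ys):
    """Plain sample Pearson correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def thin_noise(rng):
    return rng.gauss(0.0, 1.0)

def fat_noise(rng):
    # Symmetric Pareto, tail index 1.5: the variance does not exist.
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return sign * rng.paretovariate(1.5)

def correlation_spread(noise, n=1000, trials=200, seed=7):
    """Standard deviation of the sample correlation across repeated samples."""
    rng = random.Random(seed)
    rs = []
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [x + noise(rng) for x in xs]
        rs.append(pearson(xs, ys))
    mean = sum(rs) / trials
    return math.sqrt(sum((r - mean) ** 2 for r in rs) / trials)

thin_spread = correlation_spread(thin_noise)
fat_spread = correlation_spread(fat_noise)
print(f"spread of sample correlation, Gaussian noise:   {thin_spread:.3f}")
print(f"spread of sample correlation, fat-tailed noise: {fat_spread:.3f}")
```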

Mensa members: typically high “IQ” losers in Birkenstocks.

But the “intelligence” in IQ is determined by academic psychologists (no geniuses) like the “paper trading” we mentioned above, via statistical constructs s.a. correlation that I show here (see Fig. 1) that they patently don’t understand. It does correlate to very negative performance (as it was initially designed to detect special learning needs) but then any measure would work there. A measure that works in the left tail not the right tail (IQ decorrelates as it goes higher) is problematic. We have gotten similar results since the famous Terman longitudinal study, even with massaged data for later studies. To get the point, consider that if someone has mental needs, there will be 100% correlation between performance and IQ tests. But the performance doesn’t correlate as well at higher levels, though, unaware of the effect of the nonlinearity, the psychologists will think it does. (The statistical spin, as a marketing argument, is that a person with an IQ of 70 cannot prove theorems, which is obvious for a measure of unintelligence, but they fail to reveal how many IQs of 150 are doing menial jobs. So “very low IQ” may provide information, while “very high IQ” may convey nothing better than random; it is not even a necessary condition.)

It is a false comparison to claim that IQ “measures the hardware” rather than the software. It can measure some arbitrarily selected mental abilities (in a testing environment) believed to be useful. However, if you take a Popperian-Hayekian view on intelligence, you would realize that to measure future needs you would need to know the mental skills needed in a future ecology, which requires predictability of said future ecology. It also requires some ergodicity, the skills to make it to the future (hence the need for mental “biases” for survival). Example: you are designing a car for “performance”. A Maserati will perform best on a track and beat a goat there. But what if you need to cross the Corsican garigue? A goat will be ideal then. In NYC during traffic, pedestrians beat cars. So the notion of “performance” needs to be associated with a specific environment and necessarily predictive of it. (Footnote: Herbert Simon’s notion of scissors: one blade represents capabilities, the other blade the situational context.) The “g” because of its mathematical flaws fails to produce a general solution to this.

The Best Map Fallacy (Technical Incerto)

Real Life: In academia there is no difference between academia and the real world; in the real world there is. 1) When someone asks you a question in the real world, you focus first on “why is he/she asking me that?”, which shifts you to the environment (see Fat Tony vs Dr John in The Black Swan) and distracts you from the problem at hand. Philosophers have known about that problem forever. Only suckers don’t have that instinct. Further, take the sequence {1,2,3,4,x}. What should x be? Only someone who is clueless about induction would answer 5 as if it were the only answer (see Goodman’s problem in a philosophy textbook or ask your closest Fat Tony) [Note: We can also apply here Wittgenstein’s rule-following problem, which states that any of an infinite number of functions is compatible with any finite sequence. Source: Paul Boghossian]. Not only clueless, but obedient enough to want to think in a certain way. 2) Real life never, never offers crisp questions with crisp answers (most questions don’t have answers; perhaps the worst problem with IQ is that it seems to select for people who don’t like to say “there is no answer, don’t waste time, find something else”.) 3) It takes a certain type of person to waste intelligent concentration on classroom/academic problems. These are lifeless bureaucrats who can muster sterile motivation. Some people can only focus on problems that are real, not fictional textbook ones (see the note below where I explain that I can only concentrate with real not fictional problems). 4) IQ doesn’t detect convexity of mistakes (by an argument similar to bias-variance you need to make a lot of small inconsequential mistakes in order to avoid a large consequential one. See Antifragile and how any measure of “intelligence” w/o convexity is sterile edge.org/conversation/n…). To do well you must survive; survival requires some mental biases directing to some errors.
5) Fooled by Randomness: seeing shallow patterns is not a virtue; it leads to naive interventionism. Some psychologist wrote back to me: “IQ selects for pattern recognition, essential for functioning in modern society”. No. Not seeing patterns except when they are significant is a virtue in real life. 6) To do well in life you need depth and ability to select your own problems and to think independently. And one has to be a lunatic (or a psychologist) to believe that a standardized test will reveal independent thinking.
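
The {1,2,3,4,x} point in item 1 is easy to make concrete. The sketch below is one illustrative family of rules (the polynomial form and the name `rule` are choices made here, not something from the text): for any value you want in fifth position, there is a perfectly well-defined function that produces 1, 2, 3, 4 and then that value.

```python
from fractions import Fraction

def rule(n, fifth):
    """A polynomial rule giving 1, 2, 3, 4 at n = 1..4 and `fifth` at n = 5."""
    # The product vanishes at n = 1, 2, 3, 4 and equals 24 at n = 5,
    # so dividing by 24 makes the correction exactly (fifth - 5) at n = 5.
    bump = Fraction((n - 1) * (n - 2) * (n - 3) * (n - 4), 24)
    return n + (fifth - 5) * bump

for fifth in (5, 42, -7):
    print([int(rule(n, fifth)) for n in range(1, 6)])
```

Each printed list starts 1, 2, 3, 4 and ends with the chosen value, so "5" is only one continuation among infinitely many.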

This is no longer a regression. It is scientific fraud. A few random points from the same distribution can invert the slope of the regression. (From Jones and Schneider, 2010 attempting to make sense of the race-motivated notion of Average National IQ).
Upper bound: discount the massaging and correlation effects. Note that 50% correlation corresponds to 13% improvement over random picks. Figure from the highly unrigorous Intelligence: All That Matters by S. Ritchie.
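
One way to arrive at the 13% figure for a .5 correlation, offered as an assumption about the intended computation rather than as the derivation used in the piece: for jointly Gaussian A and B with correlation ρ, knowing A shrinks the standard deviation of B from σ to σ·sqrt(1 − ρ²), a reduction of 1 − sqrt(1 − ρ²). The sketch also prints the Gaussian mutual information, −½·ln(1 − ρ²), as a second informational reading; the function names are hypothetical.

```python
import math

def uncertainty_reduction(rho):
    """Fractional drop in the std. dev. of B once A is known (bivariate Gaussian)."""
    return 1.0 - math.sqrt(1.0 - rho ** 2)

def mutual_information_nats(rho):
    """Mutual information between A and B in nats (bivariate Gaussian)."""
    return -0.5 * math.log(1.0 - rho ** 2)

for rho in (0.1, 0.3, 0.5, 0.8):
    print(f"rho={rho:.1f}  reduction={uncertainty_reduction(rho):.3f}  "
          f"MI={mutual_information_nats(rho):.3f} nats")
```

Under this reading the gain is strongly nonlinear in ρ: a correlation has to be very high before knowing A tells you much about B.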

National IQ is a Fraud. From engaging participants (who throw buzzwords at you), I realized that the concept has huge variance, enough to be uninformative. See graph. And note that the variance within populations is not used to draw conclusions (you average over functions, don’t use the function over averages), a problem acute for tail contributions.
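
The "average over functions, not the function over averages" remark is Jensen's inequality at work. A toy illustration, with scores and a payoff function invented purely for the example: when the payoff comes from the tail, the payoff of the average person and the average payoff of the population are different numbers.

```python
def payoff(x, strike=130.0):
    """Hypothetical tail payoff: nothing below the strike, linear above it."""
    return max(x - strike, 0.0)

population = [70, 85, 100, 115, 160]  # made-up scores with mean 106

mean_score = sum(population) / len(population)
payoff_of_average = payoff(mean_score)
average_of_payoffs = sum(payoff(x) for x in population) / len(population)

print("payoff of the average:", payoff_of_average)
print("average of the payoffs:", average_of_payoffs)
```

The whole contribution comes from the single observation in the tail, which the mean erases.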

[In fact the seminal study says “for 104 of the 185 countries, no studies were available” and they computed the numbers… from ethnicity. Aside from intransitivity of correlation, this is pure fraud.]

Notice the noise: the top 25% of janitors have higher IQ than the bottom 25% of college professors, even counting the circularity. The circularity bias shows most strikingly with MDs as medical schools require a higher SAT score.

Recall from Antifragile that if wealth were fat tailed, you’d need to focus on the tail minority (for which IQ has unpredictable payoff), never the average. Further, it is leading to racist imbeciles who think that if a country has an IQ of 82 (assuming it is true and not the result of a lack of such training), it means politically that all the people there have an IQ of 82, hence let’s ban them from immigrating. As I said, they don’t even get elementary statistical notions such as variance. Some people use National IQ as a basis for genetic differences: it doesn’t explain the sharp changes in Ireland and Croatia upon European integration, or, in the other direction, the difference between Israeli and U.S. Ashkenazis.
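
To see why an average says little about individuals, here is a back-of-the-envelope sketch. It assumes a Gaussian score with a standard deviation of 15, which is a simplification adopted only for the illustration (the text itself argues that variances and skewness differ across populations); `share_above` is a hypothetical helper.

```python
from statistics import NormalDist

def share_above(level, mean, sd=15.0):
    """Share of a Gaussian population scoring above `level`."""
    return 1.0 - NormalDist(mean, sd).cdf(level)

share_sd15 = share_above(100, 82)
share_sd20 = share_above(100, 82, sd=20.0)
print(f"mean 82, sd 15: {share_sd15:.1%} score above 100")
print(f"mean 82, sd 20: {share_sd20:.1%} score above 100")
```

Even under these simplified assumptions a sizable share of the population sits well above the average, and that share moves with the variance, which the average alone does not report.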

Additional Variance: Let us return to the point of the correlation test-retest. Unlike measurements of height or wealth, which carry a tiny relative error, many people get yuugely different results for the same IQ test (I mean the same person!), up to 2 standard deviations as measured across people, higher than the sampling error in the population itself! This additional source of sampling error weakens the effect by propagation of uncertainty way beyond its predictability when applied to the evaluation of a single individual. It also tells you that you as an individual are vastly more diverse than the crowd, at least with respect to that measure!
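
The test-retest numbers quoted earlier (80% correlation, "less than 64%", "two thirds of a standard deviation") can be checked with elementary algebra. The sketch assumes both sittings have the same variance, so that the difference between two sittings has variance 2(1 − ρ) in units of the population variance; this is one plausible reading of how those figures arise, not a derivation taken from the piece.

```python
import math

def retest_gap_sd(test_retest_corr):
    """Std. dev. of (test - retest), in units of the population std. dev."""
    return math.sqrt(2.0 * (1.0 - test_retest_corr))

def shared_variance(test_retest_corr):
    """Squared correlation: share of one sitting's variance 'explained' by the other."""
    return test_retest_corr ** 2

rho = 0.8
gap = retest_gap_sd(rho)
shared = shared_variance(rho)
print(f"shared variance: {shared:.2f}")
print(f"typical gap between two sittings: {gap:.2f} population SDs")
```

On a scale with a standard deviation of 15 points, a gap of about 0.63 SD is roughly 9 to 10 points between two sittings by the same person.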

There is a severe nonlinearity in the correlation test-retest, in addition to the problem of intransitivity of correlation discussed in the technical note. Imagine a chronometer that varies by 1 hour per measurement!

Biases in Research: If, as psychologists show (see figure), MDs and academics tend to have a higher “IQ” that is slightly informative (higher, but on a noisy average), it is largely because to get into schools you need to score on a test similar to “IQ”. The mere presence of such a filter increases the visible mean and lowers the visible variance. Probability and statistics confuse fools.
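
The effect of an admission filter on the visible mean and variance has a closed form when the score is Gaussian. The sketch below simplifies by filtering directly on the score itself (in the text the filter is a similar test, not the same one) and uses the standard truncated-normal formulas; the function name and cutoffs are illustrative assumptions.

```python
from statistics import NormalDist

def stats_after_filter(cutoff_z):
    """Mean and variance of a standard normal score among those above cutoff_z."""
    std = NormalDist()
    # Inverse Mills ratio: density at the cutoff over the surviving tail mass.
    hazard = std.pdf(cutoff_z) / (1.0 - std.cdf(cutoff_z))
    mean = hazard
    variance = 1.0 + cutoff_z * hazard - hazard ** 2
    return mean, variance

for cutoff in (0.0, 0.5, 1.0):
    mean, variance = stats_after_filter(cutoff)
    print(f"cutoff {cutoff:+.1f} SD -> visible mean {mean:.2f} SD, visible variance {variance:.2f}")
```

Before the filter the mean is 0 and the variance is 1; after it the group looks both higher and more homogeneous, purely as a consequence of the selection.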

Functionary Quotient: If you renamed IQ from “Intelligence Quotient” to FQ “Functionary Quotient” or SQ “Salaryperson Quotient”, then some of the stuff would be true. It measures best the ability to be a good slave confined to linear tasks. “IQ” is good for @davidgraeber’s “BS jobs”.

Metrification: If someone came up w/a numerical “Well Being Quotient” WBQ or “Sleep Quotient”, SQ, trying to mimic temperature or a physical quantity, you’d find it absurd. But put enough academics w/physics envy and race hatred on it and it will become an official measure.

Notes And Technical Notes

  • The argument by psychologists to make IQ useful is of the sort: who would you like to do brain surgery on you/who would you hire in your company/who would you recommend, someone with a 90 IQ or one with 130? It is... academic. Well, you pick people on task-specific performance, which should include some filtering. In the real world you interview people from their CV (not from some IQ number sent to you as in a thought experiment), and, once you have their CV, the 62 IQ fellow is naturally eliminated. So the only thing for which IQ can select, the mentally disabled, is already weeded out in real life: he/she can’t have a degree in engineering or medicine. Which explains why IQ is unnecessary and using it is risky because you miss out on the Einsteins and Feynmans.
  • “IQ” is most predictive of performance in military training, with correlation ~.5 (which is circular since hiring isn’t random and training is another test).
  • Plomin, who studies heredity, doesn’t seem aware of the intransitivity of correlation, among other flaws (he does not seem to know how to extract heredity, but that’s another problem).
  • There are contradictory stories about whether IQ ceases to work past a threshold, since Terman’s longitudinal study of “geniuses”. What these researchers don’t get is these contradictions come from the fact that the variance of the IQ measure increases with IQ. Not a good thing.
  • The argument that “some races are better at running” hence [some inference about the brain] is stale: mental capacity is much more dimensional and not defined in the same way running the 100 m dash is.
  • I have here no psychological references in this piece (except via negativa, taking their “best”), not that I didn’t read these crap papers: simply, the field is bust. So far ~ 50% of the research does not replicate, and papers that do have weaker effects. Not counting the poor transfer to reality (psychological papers are ludic). On how P values are often, rather almost always, fraudulent, see my paper arxiv.org/pdf/1603.07532…
  • The Flynn effect should warn us not just that IQ is somewhat environment dependent, but that it is at least partly circular.
  • Verbalism: Psychologists have a skin-deep statistical education & can’t translate something as trivial as “correlation” or “explained variance” into meaning, esp. under nonlinearities (see paper at the end).
  • The “best measure” charlatans: IQ is reminiscent of risk charlatans insisting on selling “value at risk”, VaR, and RiskMetrics saying “it’s the best measure”. That “best” measure, being unreliable, blew them up many, many times. Note the class of suckers for whom a bad measure is better than no measure across domains.
  • You can’t do statistics without probability.
  • Much of the stuff about IQ of physicists is suspicious, from self-reporting biases/selection in tests.
  • If you looked at Northern Europe from Ancient Babylon/Ancient Med/Egypt, you would have written the inhabitants off as losers who are devoid of potential... Then look at what happened after 1600. Be careful when you discuss populations.
  • The same people hold that IQ is heritable, that it determines success, that Asians have higher IQs than Caucasians, degrade Africans, then don’t realize that China for about a century had one order of magnitude lower GDP than the West.

Responses by Psychologists

  • Alt-Right groups such as James Thompson
Reactions to this piece in the Alt-Right Media: all they got is a psychologist who still hasn’t gotten to the basics of elementary correlation and noise/signal. The fact that psychologists selected him to defend them (via retweets) speaks volumes about their sophistication.
  • Hack job by one Jonatan Pallesen, full of mistakes about this piece (and the “empiricism”), promoted by mountebanks such as Murray. He didn’t get that of course one can produce “correlation” from data. It is the interpretation of these correlations that is full of BS. Pallesen also produces some lies about what I said which have been detected in online comments (e.g. the quiz I gave and using Log vs X ).

Mathematical Considerations

The above is NOT a representation of IQ/Correlation but the mathematical consequences of correlation not being constant.

CURSE OF DIMENSIONALITY: A flaw in the attempts to identify “intelligence” genes. You can get monogenic traits, not polygenic (note: additive monogenic used in animal breeding is NOT polygenic).

The Skin in the game issue

Note from Skin in the Game, 1
Note from Skin in the Game, 2

How social scientists have trouble translating a statistical construct into its practical meaning.
