Another Mismeasurement

An article in the current issue of Scientific American Mind, “How Birth Order Affects Your Personality,” discusses research that ostensibly examines how a person’s family position—whether she is a firstborn or a middle child, for example—may directly affect her intelligence and personality. Articles like these are why I rarely read Scientific American Mind. The “birth order theory” is yet another residual splinter from the dried shrub of pseudoscientific drivel claiming that one’s intelligence can be understood by manipulating a single independent variable, in this case, birth order. These outdated approaches to studying intelligence have long since been disposed of and rendered as kindling for better theories, and I question the journalistic integrity of SciAm Mind for so frequently promoting these bogus headline-grabbing theories.

To be fair to SciAm, the article itself can hardly be considered an important contribution in support of a correlation between intelligence and birth order. Perhaps noting this, its editors wisely changed the title for the online edition to emphasize the “personality” aspect of the research, rather than the “intelligence” angle emphasized in the print edition. Buried beneath the rhetorical headline and bolded pull quotes are only a few paragraphs discussing new research in weak support of the theory, prefaced by a brief account of the decades of failed attempts to show any correlation between birth order and intelligence or personality beyond intuition. These sparse supporting studies mostly claimed to observe a correlation between birth order and relationship patterns (firstborns are more likely to associate with other firstborns, and so on).

However, the author—and to an extent, the journal in which he’s published—invalidates himself with an embarrassing logical fallacy about the effect that birth order has on spousal attraction. He writes (somewhat cloaking the fallacy by reordering his reasoning, but the quote below is syntactically accurate),

…If birth order affects personality, [and] spouses correlate on personality…, spouses should correlate on birth order.
This is a textbook case of the fallacy of the undistributed middle. To illustrate the fallacy with a more straightforward example:
all lily pads are green, and this frog is green; therefore, this frog is a lily pad. The conclusion does not follow: sharing a property with lily pads does not make the frog one. (Please correct me if I am the one who has made an embarrassing fallacy.)
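Spelled out as a schema (my notation, not the article’s), the invalid inference looks like this: the shared predicate, greenness, never covers everything that is green, so nothing links the frog back to the lily pads.

```latex
% Undistributed middle: Green is the middle term, and it is never
% "distributed" (quantified over all green things), so the premises
% cannot connect the frog to the lily pads.
\begin{align*}
  &\forall x\,\bigl(\mathrm{LilyPad}(x) \rightarrow \mathrm{Green}(x)\bigr) && \text{all lily pads are green}\\
  &\mathrm{Green}(\mathit{frog})                                            && \text{this frog is green}\\
  &\not\therefore\ \mathrm{LilyPad}(\mathit{frog})                          && \text{does not follow}
\end{align*}
```

The article’s version has the same shape: even granting that birth order affects personality and that spouses correlate on personality, spouses could correlate on components of personality that birth order does not touch.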

If there is one thing to know about the history of intelligence testing, it is that it is notoriously flawed. Stephen Jay Gould’s The Mismeasure of Man (a title he chose deliberately for its inherent irony) serves to methodically critique and abolish one particular aspect of the ranking of human groups: “the argument that intelligence can be meaningfully abstracted as a single number capable of ranking all people on a linear scale of intrinsic and unalterable mental worth.” The book chronicles the history of fallacious intelligence measurements, specifically biological determinism, or the belief that “shared behavioral norms, and the social and economic differences between human groups—primarily races, classes, and sexes—arise from inherited, inborn distinctions and that society, in this sense, is an accurate reflection of biology.”

Although the SciAm article’s author is interested in the environmental factors that affect personality and intelligence (the “nurture” in nature vs. nurture) rather than the biological factors, he still suffers from the fundamental lapse in reasoning about intelligence of which Gould is so critical: the assumption that it is a single, measurable thing. This mistake, often called the reification fallacy, is captured in a famous John Stuart Mill line: “To believe that whatever received a name must be an entity or being, having an independent existence of its own.”1 It’s fairly difficult to measure a value when that value—in this case intelligence—is itself an abstraction. In this sense, Gould’s critique of intelligence testing is completely relevant. It’s not entirely unreasonable for a scientist to pursue research on the factors that affect intelligence; after all, we can intuitively recognize the importance of mentality and cognition in our lives. However, in terms of contemporary neuroscience (and not pulpy articles directed toward non-scientists), the concept of “intelligence” is a gross oversimplification.

The idea of measuring intelligence by measuring heads seemed ridiculous... I was on the point of abandoning this work, and I didn’t want to publish a single line of it. - Alfred Binet, creator of the IQ test

The development of the IQ test itself has a rather interesting story.2 In 1898, French psychologist Alfred Binet, the father of the Intelligence Quotient (IQ) test, was convinced that craniometry was the best way to measure intelligence. At the time, he wrote, “the relationship between the intelligence of subjects and the volume of their head… is very real. We conclude that the preceding proposition must be considered as incontestable.” However, as he pursued his own research, Binet began to doubt the validity of craniometry. Even more damning, he began to question the ability of any researcher to perform “objective” research at all. After performing a study of his own bias, Binet concluded, “suggestibility… works less on an act of which we have full consciousness, than on a half-conscious act and this is precisely its danger.” In the case of craniometry, he believed that previous research, including his own, had been tainted by experimental bias, often against the poor. Discouraged, Binet wrote in 1900, “The idea of measuring intelligence by measuring heads seemed ridiculous… I was on the point of abandoning this work, and I didn’t want to publish a single line of it.”

A few years later, Binet had his big break: in 1904 he was commissioned by the minister of public education to develop a series of techniques to identify children who did not succeed in a normal classroom setting and required special education. Unlike previous tests for identifying such students, which relied on physical measurement, Binet had his subjects perform a broad array of short tasks. Instead of focusing on learned skills, he chose to test different abilities and assign each child a single score. The test was a success, and Binet eventually published three versions of it before his death in 1911.

The 1908 version assigned an age level to each task, defined as the youngest age at which a child of normal intelligence should be able to complete the task successfully. The age associated with the most difficult tasks she could perform was her mental age, and her intellectual level was calculated by subtracting this mental age from her true age. German psychologist W. Stern argued that mental age should be divided by, not subtracted from, chronological age, and thus the Intelligence Quotient (IQ) score was formed.
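The two scoring conventions can be sketched numerically (a minimal illustration; the function names are my own):

```python
def binet_level(mental_age, chronological_age):
    """Binet's 1908 'intellectual level': a simple difference in years.
    A positive result means the child tests behind her true age."""
    return chronological_age - mental_age

def stern_iq(mental_age, chronological_age):
    """Stern's quotient, later scaled by 100 to give the familiar IQ score."""
    return 100 * mental_age / chronological_age

# A ten-year-old performing at an eight-year-old's level:
print(binet_level(8, 10))  # 2 (two years behind)
print(stern_iq(8, 10))     # 80.0
```

Stern’s argument for the quotient was that the same raw difference means different things at different ages: a two-year lag at age four (quotient 50) is far more severe than a two-year lag at age fourteen (quotient roughly 86), yet Binet’s subtraction scores them identically.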

We are of the opinion that the most valuable use of this test will not be its application to the normal pupils, but rather to those of inferior grades of intelligence.

Binet believed that intelligence was a value too complex to be defined by a single number. As far as he was concerned, his test should have been used only as a practical tool to help identify students who deserved additional help in school. Thomas Jefferson once mused about persons with low intelligence that “whatever be their degree of talents, it is no measure of their rights.” Following this line of thinking, Binet’s test was intended to increase the equity of education, not to serve as a tool for division. He was of the opinion that “the most valuable use of [his IQ test] will not be its application to the normal pupils, but rather to those of inferior grades of intelligence.” Binet insisted that the scores of his test serve as a practical assessment device and not be used to construct any concept of a person’s intellect.

Alas, Binet’s intelligence test was quickly exploited and re-formatted, and is still used today as a method to define a person’s overall intelligence. Standardized testing has become a divisive measure in school systems across the country, operating under a paradigm that reverses Binet’s and Jefferson’s goals: school systems with higher scores are rewarded with more money. Thus, the schools without the resources to adequately prepare students for standardized tests are stigmatized, and their students suffer.

Recent evidence continues to confirm the inadequacy of intelligence testing as a proxy for anything that is relevant in the real world. For example, a SciAm Mind article published last year reported that a woman who is reminded of the stereotype that women as a group do worse than men in math will tend to perform worse on math tests as a result. Other experiments have indicated that during math testing, skilled mathematicians “choke” under pressure on problems that, under other circumstances, they solve easily.

What can we learn from this? Mostly, to be wary of studies that seem to make strong causal claims about “intelligence.” The author of the birth-order SciAm article ends with a strange call for support while admitting the spinelessness of his claims:

It’s fine for scientists to say “more study is needed,” but we must find love, gain self-knowledge and parent children now. In that sense, a great deal about who we are and how we think can be learned reading those shelves of birth order-related self-help books, even if the actual content is not yet—or will never be—experimentally confirmed.

I’m a huge supporter of saving the journalism industry. But if Scientific American Mind continues to publish flimsy puff pieces of this sort, perhaps it will help us decide which magazines are not worth our attention in the salvation effort.



Footnotes

  1. Quote from John Stuart Mill, Analysis of the Phenomena of the Human Mind (1829), p. 5.
  2. Information about the history of the IQ test is primarily from Gould’s The Mismeasure of Man unless otherwise credited.

Published on March 9, 2010
