Detterman, D. & Ruthsatz, J.
N. Colangelo, & S. Assouline (Eds.), Talent Development IV. Scottsdale, AZ: Great Potential Press
This article is a book chapter by Douglas Detterman and Joanne Ruthsatz. It addresses the sometimes unaccepted fact that everybody is different. In America, equality is pursued to such an extent that we can fail to recognize that people are not all the same: individuals do not have the same abilities and intellects and should not be treated the same in all cases. The achievement of the brightest is hindered by America's one-size-fits-all approach.
Few appreciate the broad range of individual differences that exist and the effect they have on achievement. This is due to the American orientation toward equality which partly results from a misunderstanding of historical antecedents. With increasing levels of education, we are forced to confront the broad range of individual differences that exist. Intelligence is one ability that has been extensively investigated. The range of individual differences in intelligence is from 8 to 15 standard deviations. Attempts to increase intelligence have produced effects up to 1.5 standard deviations. Much of the achievement in nearly any domain can be traced to individual differences that exist before training begins. To understand exceptional behavior, we must understand the origins of individual differences.
The Importance of Individual Differences for Exceptional Achievement
Americans believe in equality. It is an idea that has permeated the American fiber since the European settlers arrived. Thomas Paine, in his pamphlet "Common Sense," wrote the following, "Where there are no distinctions, there can be no superiority; perfect equality affords no temptations." (Foner, 1945, p. 27). This statement was first published January 10, 1776.
"Common Sense" laid out the plans for the American revolution against England and, in the view of many, along with Paine's other writings, was a major force in gaining popular support for the coming war. The title of the work is a pun. Common sense refers to a logical way of thinking. More importantly, it refers to the common beliefs held by Americans of the time. It presented the sense of common thinking or the common sense.
It was only months after publication of Paine's pamphlet that the Declaration of Independence was written. It included the famous phrase, "We hold these truths to be self-evident, that all men are created equal, that they are endowed by their creator with certain unalienable rights, that among these are life, liberty, and the pursuit of happiness." Though every school child learns this phrase, it has become so well-known that its origin and true meaning are often forgotten.
Recently, in a meeting at Ohio State University where equality in America was discussed, a college student began a question to Secretary of State Albright with the statement that, according to the Constitution, all men are created equal. Secretary Albright pointed out that the Constitution of the United States does not include that phrase and, in fact, makes no statement about equality.
The point here is that, as a people, Americans believe that we are guaranteed equality. Some might argue that what we believe in is equality of opportunity not equality of outcome or ability, but this is a minor distinction. If we took a poll of Americans, almost certainly we would find that equality is something they strongly believe in. Many would even argue that anyone can be anything they like if they work hard enough. In fact, this is the basic position advocated by Ericsson and his colleagues (e.g., Ericsson & Charness, 1994).
In this paper, we will argue that there is no such thing as equality. People differ immensely in what they are capable of and in what they accomplish. Until we accept this fact, many of our social institutions will not receive the alterations they badly need. We will further argue that this mistaken and naive belief that all people are equal in ability is actually simultaneously discriminatory and punitive. It discriminates against those of low ability who are expected to succeed to the same extent as those of higher ability. If persons of low ability don't succeed, then according to the equality view, it is because they didn't work hard enough. On the other hand, the equality idea is punitive to those of high ability because it lumps them in with everyone else denying them education appropriate to their ability level.
The idea of equality that permeates American thought is really based on a misunderstanding of what the original argument was about. Not many people realize the argument actually was about the divine rights and hereditary succession of monarchs and not the ability levels of Americans. The English believed that their king ruled the US and other colonies by divine right passed on through hereditary succession. In "Common Sense," Paine said that the king of England's authority over the US was illegitimate. He argued that God did not create two classes of people, kings and others. Using quotations from the Bible and logical argument, he said that there was only one kind of person created by God and that none were given divine right to rule. Quoting the Old Testament of the Bible, he showed that Judaic and Christian doctrine were opposed to the divine rights of kings.
The arguments made by Paine were ultimately reflected in the American Constitution. One of Paine's arguments against hereditary succession was that there would be times when a very young king, without experience or the wisdom that comes with years, would rule. He felt this was not in the best interest of the country. As a result of this argument, the American Constitution requires that the President be 35 years of age or older.
The equality argument that Paine made against the monarchy was an exceptionally popular one at the time since the king of England was the source of Americans' perceived problems. Trying to recoup expenses for the military protection and political leadership he had provided, he had imposed unpopular taxes. It was, of course, these taxes that served as the precipitating event that began the revolution.
When the framers of the Declaration of Independence stated that all men are created equal, it was a direct attack on the monarchy, a slap in the face to the king. The line following the famous "all men are created equal" line in the Declaration of Independence is, "That to secure these rights, governments are instituted among men, deriving their just powers from the consent of the governed..." The equality line was directed at a monarchy. Those who wrote the Constitution were not stating that everyone is the same. It is obvious that they didn't believe everyone was the same. Many were slave owners. None advocated the rights of women though they were urged to do so.
The idea of equality crept into the American psyche and became a bedrock assumption of Americans' beliefs. Slowly, over time, like many popular ideas, it has drifted to an entirely different meaning. Today, if you ask most people what Paine and the writers of the Declaration of Independence had in mind when they discussed equality, they will almost certainly tell you that it has to do with the equality of Americans. I am sure that both Paine and the writers of the Declaration of Independence would be shocked to find how their ideas have changed in the minds of average Americans.
Do Differences Exist?
Though most Americans believe in the naive view of equality discussed earlier, some are well aware of individual differences. One of those is the radio shock jock, Howard Stern, who likes nothing better than to attack sacred cows wherever he finds them. He frequently asks people simple questions that would seem to be within the experience of everyone. The results are often surprising. Here is an example of a quiz Stern gave to two former Playmates of the Year who were on his show recently. They were Julie Cialini (1996 Playmate of the Year) and Stacey Sanchez (1997 Playmate of the Year). We obtained this interview when a former student sent one of us a copy by e-mail. Evidently people think the results are surprising enough to circulate.
Who is the President of Russia?
(correct answer: Boris Yeltsin)
Define the meaning of NAACP.
Julie: "Something, something, something, for Certified Pianists"
Stacey: "It's some kind of police organization."
(correct answer: National Association for the Advancement of Colored People)
Who was the inventor of the light bulb?
Julie: "I know Edison invented the telephone, but I can't remember the light bulb guy."
Stacey: "I don't know."
(correct answer: Thomas A. Edison. Alexander Graham Bell was the phone guy!)
Who is the Speaker of the House?
Julie: "Gore something-or-other."
Stacey: "Bill Clinton."
(correct answer: Newt Gingrich)
Define the meaning of the letters CIA.
Julie: "I don't know."
Stacey: "Certified Investigation Association."
(correct answer: Central Intelligence Agency)
What is the center of our solar system?
Julie: "The Equator"
Stacey: "The Moon"
(correct answer: The Sun)
What this interview shows is that not everyone knows the same things. What is humorous about this interview is that these women do not know what most would consider some very simple information. While this particular interview reinforces the "dumb model" stereotype, Stern has tested other models who answer correctly all of the questions. Some have even done better than Howard and his pals on the show. However these little quizzes come out, it is clear that there are large differences between people in what they know.
The differences among people are more systematically explored in intelligence or IQ tests. One only need look at the results from a typical IQ vocabulary test to see that the differences in what people of the same age know are enormous.
The Wechsler Adult Intelligence Scale-Revised contains such a vocabulary test. There are 32 items on the test. Compare the actual performance of three high school seniors with IQs of 70 (two standard deviations below the mean), 100 (at the mean), and 130 (two standard deviations above the mean). Each of the words on the test is given to the subject who is asked to define it. Each item is given zero, one, or two points, depending on the correctness and completeness of the answer. The person with a 70 IQ obtained a raw score of 5. The person with an IQ of 100 got a score of 36. Finally, the person with an IQ of 130 got a raw score of 62.
The person with a 70 IQ barely knows any of the words while the person with a 130 IQ knows nearly all of them. Further, there is a difference not only in the number of words known but also in the quality of their responses. The person with a 70 IQ has trouble providing a complete definition for any but the simplest words. Only two words on the vocabulary test are defined well enough by the person with a 70 IQ to get a full two points. On the other hand, the person with a 130 IQ seldom gets a score of one.
Though these results are from a vocabulary test, there are subtests of other kinds of knowledge on this same IQ test. The results from these subtests are correlated with each other so people tend to perform similarly from subtest to subtest. The kind of variation seen on the vocabulary subtest can be seen on every other subtest that makes up the WAIS-R. In fact, the same kind of variation can be found on any test of mental ability that one would care to examine. Further, how one does on any of these tests or subtests is a good predictor of how that person will do on any other mental test. It is impossible to deny that systematic variation in knowledge or ability is a fact of human existence.
What Is the Range of Differences?
Determining the range of ability in the population would seem to be an easy matter. Just examine the normal curve and find a range that fits the population size. For example, one would expect about 6 cases in 100,000 to fall more than 4 standard deviations from the mean (3 at each end of the distribution). However, ability is not exactly normally distributed. Further, the occurrence of exceptional ability may not follow the same rules as the distribution of talent in the rest of the population.
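The tail frequency quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch that assumes an exactly normal distribution (which, as noted, ability only approximates); the function names are illustrative and not drawn from any source.

```python
import math

def two_tailed_fraction(z: float) -> float:
    """Fraction of a standard normal population lying more than z
    standard deviations from the mean, counting both tails."""
    # erfc(z / sqrt(2)) equals P(|Z| > z) for a standard normal Z.
    return math.erfc(z / math.sqrt(2))

def expected_cases(z: float, population: int) -> float:
    """Expected number of cases beyond +/- z SD in a population of the given size."""
    return two_tailed_fraction(z) * population

if __name__ == "__main__":
    total = expected_cases(4.0, 100_000)
    # Report the two-tailed count and the count at each end separately.
    print(f"total beyond 4 SD: {total:.1f}; per tail: {total / 2:.1f}")
```

Under this assumption the count comes to a little over 6 per 100,000, or about 3 at each end.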
By examining the normative data on IQ tests, we can approximate the range of ability that exists in the human population. Many tests allow for IQs between 30 and 180. That is a range of approximately 10 standard deviations. One problem with standard IQ tests is that they are not very good at the extremes. Because particularly high and low scores are rare, they tend to have too few really hard or easy items so the extremes are poorly measured.
Some have attempted to estimate the IQs of particularly talented individuals and these estimates often range above 200. At the other end of the distribution, IQs of some mentally retarded persons are estimated to be below 20. This is a range of 12 standard deviations.
There is no doubt that the range of ability in human populations is immense. As a conservative estimate, we would say that 10 standard deviations would be a minimum of the range of human talent existing in the population. We are seldom aware of the wide range of abilities because we are seldom exposed to the full range. The properties of the normal distribution dictate that a majority of the scores fall close to the mean. The more extreme the score, the fewer cases there are. If any two members of the population are drawn at random, the expected difference in their scores will be about 16 points, a little over one standard deviation. The average interaction of two adults is within a standard deviation of their ability level. This is one reason we do not appreciate the gigantic range of ability that actually exists.
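The expected gap between two randomly drawn people can be sketched as follows. This assumes IQ is exactly normal with a mean of 100 and a standard deviation of 15, and that the two draws are independent; both are simplifying assumptions, and the function names are illustrative. Under them, the difference of two scores is normal with standard deviation 15 times the square root of 2, and its mean absolute value is 2 x 15 / sqrt(pi).

```python
import math
import random

def expected_abs_difference(sd: float) -> float:
    """Closed-form mean absolute difference of two independent draws
    from a normal distribution with the given standard deviation."""
    # X - Y ~ Normal(0, sd * sqrt(2)), and E|Normal(0, s)| = s * sqrt(2 / pi).
    return 2.0 * sd / math.sqrt(math.pi)

def simulated_abs_difference(mean: float, sd: float, pairs: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity, as a sanity check."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(pairs):
        total += abs(rng.gauss(mean, sd) - rng.gauss(mean, sd))
    return total / pairs

if __name__ == "__main__":
    print(f"closed form: {expected_abs_difference(15.0):.2f} IQ points")
    print(f"simulation:  {simulated_abs_difference(100.0, 15.0, 20_000):.2f} IQ points")
```

The closed form gives a value just under 17 points, in line with the rough figure of a little over one standard deviation.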
Though we have confined our discussion to IQ, the extreme variation in talent is even more obvious when an instrument is designed to identify exceptional talent. Elo (1978) designed a rating system to identify and rank top chess players. The system ranks players according to how they perform against other ranked players. The Elo scale has a mean of 1500 and a standard deviation of 200. To be ranked as a Class C amateur, you must score 1400, or one half standard deviation below the mean. If the Elo scale were transformed into an IQ scale with mean of 100 and standard deviation of 15 this would correspond to an IQ score of 93.
At the other extreme, to be a Grandmaster requires an Elo score of 2500, which is 5 standard deviations above the mean. This would correspond to an IQ score of 175. The highest ranking so far achieved on the Elo scale was 2810 by Kasparov in 1994. This is approximately 6.6 standard deviations above the mean and would correspond to an IQ score of 198.
In the short life of the Elo scale (about 40 years), rankings have ranged from 0 to 2810. This is a range of 14 standard deviations! Two points have to be kept in mind. First, the Elo scale has been used for only a short time so we might expect the range to increase with time as even better players come along. Second, the Elo scale is not applied to the full range of talent because only those persons who compete in sanctioned events are rated. The scale is only applied to those chess players who think they are good enough to compete in sanctioned events and so represent an already highly selected group.
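The conversions between the Elo scale and an IQ-style scale are simple linear rescalings through standard scores. The sketch below uses the parameters stated in the text (Elo mean 1500 and SD 200; IQ mean 100 and SD 15); the function names are illustrative.

```python
def rescale(score: float, from_mean: float, from_sd: float,
            to_mean: float = 100.0, to_sd: float = 15.0) -> float:
    """Convert a score to a standard (z) score, then express it on another scale."""
    z = (score - from_mean) / from_sd
    return to_mean + z * to_sd

def range_in_sds(low: float, high: float, sd: float) -> float:
    """Width of a score range expressed in standard deviation units."""
    return (high - low) / sd

if __name__ == "__main__":
    for elo in (1400, 2500, 2810):
        print(elo, "->", round(rescale(elo, 1500.0, 200.0), 1))
    print("range:", round(range_in_sds(0, 2810, 200.0), 1), "SD")
```

Under those parameters a rating of 1400 maps to 92.5, a rating of 2500 to 175, and a rating of 2810 to about 198, while the span from 0 to 2810 is roughly 14 standard deviations.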
This cursory look at the range of talent indicates that the measurements of ability show a range falling somewhere between 10 and 15 standard deviations. Even though extreme cases are rare, they are also extremely important. It is the extreme cases that produce both problems and progress for mankind. We neglect this enormous variability at our own risk.
Why Are We Sensitive to these Differences Now?
Over the last century there has been a growing awareness of differences among people. But before that you will find relatively little written about individual differences in either science or literature. Given the huge range of differences that exist, why has it been relatively recently that we have become sensitive to these differences? Even now there is only a vague awareness of individual differences. Otherwise, a book like The Bell Curve (Herrnstein & Murray, 1994) would not take us by surprise.
We believe that the most important force in making us aware of the individual differences has been the rapid increase in education, particularly in the US. While there have been very few changes in the way people are educated in the last 2000 years, in the last 100 years there have been dramatic increases in the number of people who are receiving an education.
Figure 1 shows the number of high school graduates as a percentage of 17 year-olds from 1890 to 1990 (data from Lagemann, 1997). In 1890, fewer than 5% of all 17 year-olds graduated from high school. If we assume that only the brightest persons attained this level of education, then the minimum IQ for a high school graduate would be 125. In 1990, approximately 80% of 17 year-olds graduated from high school. Again assuming only the smartest graduate, the minimum IQ now needed for graduation from high school would be about 87. This is an incredible demographic trend that has gone largely ignored.
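The two cutoff IQs follow from the inverse of the normal cumulative distribution. The sketch below assumes IQ is exactly normal with a mean of 100 and a standard deviation of 15, and that graduates are strictly the top-scoring fraction of the cohort, which is the simplifying assumption made in the text; the function name is illustrative.

```python
from statistics import NormalDist

# Assumed IQ distribution: normal, mean 100, SD 15.
IQ = NormalDist(mu=100, sigma=15)

def min_iq_for_top_fraction(fraction: float) -> float:
    """Lowest IQ among the top `fraction` of the population,
    assuming selection strictly by IQ."""
    return IQ.inv_cdf(1.0 - fraction)

if __name__ == "__main__":
    print(f"top 5%  (1890): {min_iq_for_top_fraction(0.05):.1f}")
    print(f"top 80% (1990): {min_iq_for_top_fraction(0.80):.1f}")
```

The results, roughly 124.7 and 87.4, agree with the rounded figures of 125 and 87.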
The implications are staggering. In 1890, high schools were roughly as selective as professional and graduate schools are now. As we end this century, we are approaching nearly asymptotic levels of high school graduation. Figure 2 shows the increase in high school graduation that has occurred in the last 25 years alone (data from Smith, 1997). (Figures are slightly different here because of different methods of ascertainment.) These same trends are reflected in college graduation and in the granting of graduate and professional degrees.
*FIGURE 1. Number of 17 year-old high school graduates from 1890 to 1990.
*FIGURE 2. Number of 25-29 year-olds who have completed high school or its equivalent.
There is an obvious trend to educate everyone to the highest level possible. In our opinion, this trend is certainly in the world's best interest, but it will also inevitably bring us face to face with the differences among us.
How Much Can These Differences Be Changed?
An important question is how malleable individual differences are. This question actually has two parts. Part one of the question is if it is possible to move the mean by intervention. That is, can we increase the level of performance of some ability or skill? Note that changing mean performance says nothing about the range of talent existing in the population. It is entirely possible to raise the mean and at the same time increase the standard deviation. It would also be possible to raise the mean and decrease the standard deviation. The second question is to what extent it is possible to change the range of talent.
To examine this issue, we will consider areas of research that have made a systematic effort to assess these questions. We will confine our discussion to intelligence. We do this because there has been more research on this ability than any other. Even so, the studies that have been done still have major methodological problems. We also confine ourselves to studies where large groups of subjects have been selected on well-specified criteria. While there are many single-subject reports and case studies of people who appear to have increased their skill levels dramatically, these studies seldom assess initial performance levels or show that the skill level is universally attainable. The question we are asking is not if there are individuals who display exceptional skills. There certainly are such people. Rather we are asking what changes can be uniformly produced with systematic training.
Early intervention studies. There have been many efforts to intervene to increase IQ. These studies have involved thousands of subjects. Some have been very intensive with up to 40 hours or more of intervention per week and lasting years. Spitz (1992) reviewed all of this work. It is clear that intervention can increase mean performance by about one standard deviation. A typical early intervention program can produce average IQ changes of 15 points. These programs typically do not change the standard deviation of those groups given intervention.
The problem with intervention studies is that the effects are not lasting. Once intervention is discontinued, much if not all of the gain is lost. This "wash out" effect occurs even when the intervention is followed up with some support. Sustained effects seem to require sustained intervention. In this discussion, we are not interested in the long term effects of change, only to what extent it is optimally possible.
One of the most powerful kinds of intervention study is the adoption study. Adopted children are raised in an environment that is constant and consistent until they reach adulthood. Capron and Duyme (1989) used this feature of adoption studies to assess the impact of environment on intelligence. They retrospectively assessed the birth and adoptive placement status of children who had been adopted. There were four groups. Children were born to low or high status parents and were then placed in low or high status adoptive homes.
Initial status had the expected effects. Children born to parents with higher educational levels had higher IQs. The environment of the home also had an impact on the child's IQ. Children raised in high status homes had IQs about a standard deviation higher than children raised in low status homes. This was true regardless of the status of the children's biological parents. Since adoption had the same incremental effect on all children, regardless of initial status, it is clear that adoption had no impact on the amount of variation in the population. Adoption into a high quality home seems to add a constant to the child's IQ when compared to being raised in a low quality home. When a constant is added to a score, the standard deviation of the intervened group will not change. However, one could speculate on various scenarios that could occur with different percentages of children being placed into different qualities of homes. Depending on the assumptions, the variance in the population could either decrease or increase. However, with random placement, the variance would be expected to be unchanged by adoption.
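The point that adding a constant leaves the standard deviation unchanged can be illustrated with a small sketch. The scores below are hypothetical values chosen for illustration only, not data from any adoption study, and the 15-point gain is an assumed stand-in for a one standard deviation effect.

```python
from statistics import mean, pstdev

def shift_scores(scores, constant):
    """Add the same constant to every score."""
    return [s + constant for s in scores]

# Hypothetical IQ scores for illustration only; not data from any study.
low_status_home = [78, 85, 92, 100, 104, 111]
# Assumed uniform gain of 15 points (one SD) for every child.
high_status_home = shift_scores(low_status_home, 15)

if __name__ == "__main__":
    print("low: ", mean(low_status_home), round(pstdev(low_status_home), 2))
    print("high:", mean(high_status_home), round(pstdev(high_status_home), 2))
```

The mean rises by exactly the constant while the spread of scores, and therefore each child's standing relative to the others in the group, stays the same.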
Other adoption studies have found very similar effects. Scarr and Weinberg (1976) studied black children adopted into white families. They found that the mean IQ of black children adopted into white families was higher than that of the population of blacks by about two-thirds of a standard deviation.
As with intervention studies, there are many caveats to the interpretation of adoption studies. Many of these have been discussed by Jensen (1998). In adoption studies, the IQ of the biological parents is frequently not known. There is also substantial evidence that intelligence becomes increasingly genetically determined with age. Many adoption studies do not include assessment in adulthood. Finally, an interpretation of outcome often is made on the basis of a comparison to normative data. For example, in the Scarr and Weinberg study, there is substantial debate on whether adopted black children should be compared to mid-western or national normative performance. Conclusions are substantially different based on which of these norms are used. Since our purpose here is to determine some upper limit on the malleability of ability in the population, we have chosen to overlook these problems.
Malleability can also be assessed in situations where the methods or quality of education can be expected to differ. Recently, there have been a number of cross-national studies on educational outcomes designed to assess each participating country's educational system. Mathematics seems to be the field most subject to large teaching effects, so this is the area most frequently studied. One typical cross-national study compared 13 year-olds on a standardized mathematics test (New York Times, July 21, 1996). Mean performance scores on the test varied from a low performance by the US of about 260 to a high performance by Taiwan of about 290. Means varied by about 30 points, or less than 1.5 standard deviations. Ranges of scores varied from 210 to over 330. While cross-cultural differences in mean mathematics performance were about 30 points, the range of performance of 13 year-olds within a culture is four or five times larger and appears to be generally about the same across cultures.
Once again there are caveats to the interpretation of these data. Asians are known to generally score higher on tests of mathematical skill. The two Asian countries tested, Taiwan and Korea, placed first and second in mathematics proficiency. Some might argue that a portion of the differences in math skills shown by this study are due to race differences (the superiority of Asians in math) and not entirely to the effectiveness of instruction. Another problem that is difficult to eliminate is the potential of differential sample selection across countries. Small differences in who takes the test in each country could add to mean differences.
One study that is free of many of the problems of the studies previously discussed was conducted by Anghoff and Johnson (1988). Their interest was in determining to what extent different colleges and different curricula affected achievement. They selected over 20,000 students who had taken the Scholastic Aptitude Test (SAT) and then, about four years later, had taken the Graduate Record Exams (GRE). Since the math portions of each of these exams appeared most affected by instruction, they confined their investigation to math scores on each test. They then asked to what extent students' incoming SAT score, college major, and gender predicted their math performance on the GRE they took four years later. Any variance remaining could be attributed to a combination of test unreliability and differences in college attended.
What they found was a correlation between SAT-Math, major, and gender with GRE-Math of between .93 and .97. In other words, between 86% and 94% of the variance in GRE-Math was predictable by the variables used. About 74% of the variance in GRE-Math was predictable from SAT-Math alone. The authors conclude that there is very little room for the effect of college on these outcome measures. This is good news for parents. It indicates that a student receives as good an education at a less expensive school as at a more prestigious and more expensive one. But it is bad news for colleges because it suggests that differences in colleges are not translated into differences in academic achievement. In fact, the study indicates that most of the variance in academic achievement is accounted for by characteristics of the incoming student. Prestigious colleges are prestigious because they attract good students.
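The step from a correlation to a share of variance is a matter of squaring. The sketch below simply reproduces that arithmetic for the figures quoted above; the function names are illustrative, and the correlation implied for SAT-Math alone is derived here from the 74% figure rather than reported in the text.

```python
import math

def variance_explained(r: float) -> float:
    """Proportion of variance in the outcome predictable from a
    (multiple) correlation r: the coefficient of determination."""
    return r * r

def correlation_from_variance(share: float) -> float:
    """Correlation implied by a given proportion of explained variance."""
    return math.sqrt(share)

if __name__ == "__main__":
    for r in (0.93, 0.97):
        print(f"r = {r:.2f} -> {variance_explained(r):.1%} of variance")
    print(f"74% of variance -> r = {correlation_from_variance(0.74):.2f}")
```

Correlations of .93 and .97 correspond to roughly 86% and 94% of the variance, and 74% of the variance corresponds to a correlation of about .86.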
This situation can be understood by a simple analogy. There are two bars on our campus that are quite different from each other. The Euclid Tavern is what is commonly called a student hang out. The furniture is functional and mismatching and the restroom walls are covered with graffiti. It is frequented by graduate students in psychology. A beer can be purchased for $1.50 or less. The Greenhouse is more up-scale. Wood, stained glass, and green plants characterize the decor. A beer is $3.00. It is frequented mostly by medical and law students. A beer in either place has the same alcohol content. So the premium paid at The Greenhouse is for the atmosphere. That is true of colleges, too. Since the Anghoff and Johnson study convincingly showed that a student will get the same education at every university, what university a student attends will largely be due to preferences for atmosphere. Outcome will be largely due to the incoming characteristics of the student.
What do the Anghoff and Johnson results suggest for the malleability of ability? This study suggests far less malleability than the areas of research previously examined. There may be several reasons for this. Perhaps most importantly, subjects are older. Because they are older, tests are likely to be more reliable and subjects' behavior more stable. In fact, the Anghoff and Johnson study may be the best indicator of malleability. If so, then expected changes are substantially less than one standard deviation.
Conclusion about Malleability
When interventions are life-long and intensive, the maximum change in IQ may be about 1 to 1.5 standard deviations, at most. However, this change may leave the range unchanged or may increase the range. There is very little evidence that any interventions considered here would change the range of ability in the population. At best, we can hope to shift the entire distribution up. This does not change a person's standing in the distribution. It raises the mean. I estimate that approximately 60% to 80% of the variance in outcomes related to education or intervention is due to individual differences at onset when interventions are the same for all persons.
We can now compare the degree of malleability to the range of individual differences in the population. If we take an upper limit of 1.5 standard deviations for the range of possible change in ability and compare it to a lower limit of 8 standard deviations for the range of ability in the population, it is obvious that even the most powerful interventions are extremely limited in their effectiveness. Even when interventions are effective, they are very likely to raise the mean of the population and leave an individual's standing in the distribution the same as before the intervention. This does not mean that interventions are useless and should be abandoned.
Obviously, it would be ridiculous to argue against the power or importance of education. However, it does mean that until we have a full understanding of what produces differences among individuals, we will only be able to provide very limited interventions.
Are Differences Equally Malleable at All Levels of Ability?
There is one hopeful aspect concerning intervention. So far, we have considered interventions with low IQ subjects or the population as a whole. Is it possible that highly able, very bright subjects would demonstrate more flexibility than low IQ subjects? The answer to this question seems to be a limited yes.
Detterman and Daniel (1989) examined the correlational structure of IQ subtests for subjects of different IQ levels. They divided the IQ distribution into five equal parts. Within each part, they computed the correlations among subtests and then averaged across all of the subtests. Average correlations among subtests were twice as high for low IQ subjects as for high IQ subjects. For subjects with IQs below 70, the correlation among subtests averaged about .60; but for subjects with IQs above 130, the correlations among subtests were about .30. The three IQ categories between 70 and 130 showed intermediate average correlations among subtests. This finding has been replicated several times.
Detterman and Daniel interpreted these results within a theory suggested by Detterman (1987, 1994). This theory suggests that some central cognitive processes within an integrated system of cognitive processes are deficient in lower IQ subjects. Because the deficient processes are central ones, the entire system is brought to a lower level of operation. This means that low IQ subjects will be much more similar in all aspects of cognitive functioning. On the other hand, high IQ subjects are much more likely to show higher variability in the different cognitive processes which contribute to intelligence. In simple terms, higher IQ subjects are more likely to show distinct strengths and weaknesses, whereas low IQ subjects are not.
This has important implications for education, especially for high IQ subjects. For high IQ subjects it suggests the possibility that there may be a better fit between some areas of academic pursuit and an individual's pattern of abilities. If the strengths of a high IQ person could be determined early, it might be possible to be more efficient in designing their education. While this is supposition, one thing the Detterman and Daniel study certainly shows is that educational practice should be much different for subjects with high IQs than for subjects with low IQs.
Where Do Individual Differences Originate?
So far we have argued that there is huge variability in human intelligence and that our ability to increase intelligence is, at best, limited. Further, we have argued that, in general, most people are insensitive to the huge span of human ability and have only recently been made vaguely aware of the differences that exist by the dramatic increase in education that has occurred in the last century. But where do these differences come from?
Heredity. There is no question that half or more of the individual differences we are discussing are the result of genetic differences. This is clearly shown in a review by Bouchard and McGue (1981) in which they compiled the world's known studies of familial resemblance on IQ. These studies included tens of thousands of subjects and several hundred correlations. The results were quite clear: As family resemblance increased, so did the correlation of IQ between family members. Based on the kind of data presented by Bouchard and McGue, most estimates of the contributions of genes to intelligence run between 50% and 80%.
More recently there has been even more direct evidence of the importance of genetics in cognitive ability. A team of investigators led by Plomin (Chorney et al., in press) has identified a locus on chromosome 6 that has a different frequency in higher and lower IQ subjects. If this exciting finding is supported, it means that in a short time we may have a catalogue of the genes that contribute to human intelligence.
Environment. There is also no doubt that the environment has an effect on intelligence. Unfortunately, there are few unambiguous results in this area (see Detterman, 1996, for an overview). The major problem with this area of study is that most research confounds environmental and genetic effects, so it is impossible to clearly attribute the effects found to one or the other. Those studies which have had the appropriate controls have been unsuccessful in identifying specific environmental variables. In genetic studies it is possible to separate sources of environmental variance into common and unique environmental influences. Common environmental influences are those which act on all members within a family. Unique environmental influences are those specific to an individual within a family. By far the largest portion of environmental variance seems to be due to unique effects. It may be that environmental influences will be the most difficult to understand because they are unique and, individually, small.
What we have said to this point is obviously in marked contrast to a line of research that seems to be getting a good deal of attention, particularly in educational circles. This position has been advanced by Ericsson and his colleagues (Ericsson & Charness, 1995; Ericsson, Krampe, & Tesch-Romer, 1993). According to this position, expertise is developed by investing 10,000 hours of deliberate practice. Further, the authors say that there is no evidence that differences in ability level affect final outcome. In other words, with the appropriate amount of deliberate practice, anyone can be an expert at anything.
The position we have presented here is obviously quite different. We have concluded that differences among people are immense, that malleability of these differences is limited, and that the differences among people are determined, at least in part, by genetics. In contrast to the position of deliberate practice we propose the following model: 1
Skilled Performance =
General Ability (g) + Specific Abilities (s) + Practice (p) + gs + gp + sp + gsp
We propose that skilled performance results from a combination of factors. First is general intellectual ability. We believe that this contributes to all skilled performance and may be the most important single component. Specific abilities would be things like spatial ability, fine motor ability, gross motor ability, personality traits, and other specific skills that are factorially separate from general intellectual ability. Next, practice or training is obviously an essential part of acquiring any skilled performance ability. Finally, there is a set of interactions including all of the previous terms. These interactions are quite important. In fact, we believe that an individual's proclivity for practice is probably the result of one of these interactions. The gsp term defines the interaction of general ability X specific ability X practice. It may be this term that provides some people with the inclination to engage in extended practice. Without the appropriate pattern of general ability and specific skills, no one could endure the practice required to be an expert violinist or wrestler. Only those with the appropriate initial skills would find such a tedious effort reinforcing.
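Read as a statistical model, the sum above can be written as a regression with main effects and interaction terms. This is only a sketch: the weights (the betas) and the error term are notational assumptions added here for illustration. The model as stated lists the terms without weights.

```latex
\[
\text{Skilled Performance} = \beta_0 + \beta_g\, g + \beta_s\, s + \beta_p\, p
  + \beta_{gs}\, gs + \beta_{gp}\, gp + \beta_{sp}\, sp + \beta_{gsp}\, gsp + \varepsilon
\]
```

Under this reading, a nonzero weight on the gsp term would mean that the payoff from practice depends jointly on general and specific ability, which is the kind of interaction described in the paragraph above.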
We should add that our model does not invalidate the importance of practice. The research on practice can add greatly to our understanding of what it takes to become an expert. However, we believe that the study of practice alone will never provide the complete story of expertise. The truth is that some people are better suited to certain pursuits than others. To say that anybody can become anything with the appropriate amount of practice denies the obvious reality of differences in human abilities.
How have reasonable researchers come to this idea that practice alone can produce experts? We believe that the research supporting this position has a number of methodological flaws. These flaws mask the individual differences that would otherwise be obvious contributors to expertise. Some of these problems include highly selected samples, restricted range, retrospective reports, and the use of highly selected domains.
Selected samples. Much of the research on expertise involves the use of highly selected samples. For example, expert violinists, expert wrestlers, or expert chess players are frequently studied. Such highly selected groups are likely to be quite different from a group of persons that are randomly selected from the general population. Selecting a group of expert violinists and then studying what they do is much different than selecting two groups randomly and giving one of those groups deliberate practice. The selection of people who are already on their way to becoming experts has the potential for innumerable confoundings.
One serious confounding is that the groups may not be equal when they begin. For example, in the study by Ericsson, Krampe, and Tesch-Romer (1993), there are differences detectable between their three groups early on. The three groups consisted of students in a conservatory who were considered elite or good violinists or music teachers. We obtained the data and analyzed the early performance of these three groups in competitions.2 Between the time they began taking lessons and the time of the study, the elite group entered and won more competitions than the other groups. Beginning at age 8, the elite group won 67% of the competitions they performed in. In contrast, the good group won 54% and the music teachers won 18% of the early competitions they entered. These differences in winning competitions reflect differences among the groups that are apparent from the beginning. If differences are apparent from the beginning, they cannot be caused by differences in practice.
Restricted range. When samples are highly selected, there is another problem that results. Whatever abilities were necessary to achieve this high performance will, by definition, be very high in all persons in the selected group. In the worst case, all persons in the selected group will be at the highest ability level on the traits required for the skill. That means that differences in performance among members of this group cannot be attributed to the trait that actually caused the skill. For example, there is no doubt that height is an important attribute for playing basketball. We correlated the height of National Basketball Association (NBA) players with the number of points they scored. The correlation was about zero. How could this be? Players in the NBA are selected for height and have a mean height greater than the general population. They are so heavily selected on this attribute that, within the group of highly skilled players, the range of height is too restricted for height to predict differences in performance.
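The effect of range restriction can be illustrated with a small simulation. The sketch below uses no NBA data. It assumes a standardized trait that correlates 0.6 with performance in the full population, and it assumes selection keeps only the top 1% on the trait itself. All names and parameter values are hypothetical choices for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_range_restriction(n=100_000, true_r=0.6, top_fraction=0.01, seed=0):
    """Return (correlation in full sample, correlation in selected sample).

    Assumption: performance = true_r * trait + noise, with both variables
    standardized, so the population correlation equals true_r.
    """
    rng = random.Random(seed)
    noise_sd = math.sqrt(1.0 - true_r ** 2)
    trait = [rng.gauss(0.0, 1.0) for _ in range(n)]
    perf = [true_r * t + noise_sd * rng.gauss(0.0, 1.0) for t in trait]

    full_r = pearson_r(trait, perf)

    # Select the top fraction on the trait (illustrative stand-in for
    # heavy selection on an attribute such as height).
    n_keep = max(2, round(n * top_fraction))
    pairs = sorted(zip(trait, perf), key=lambda tp: tp[0], reverse=True)
    selected = pairs[:n_keep]
    selected_r = pearson_r([t for t, _ in selected], [p for _, p in selected])
    return full_r, selected_r


if __name__ == "__main__":
    full_r, selected_r = simulate_range_restriction()
    print(f"full sample r = {full_r:.2f}, selected sample r = {selected_r:.2f}")
```

Under these assumptions the correlation in the selected group is much weaker than in the full sample, even though the trait matters just as much for every individual. The more severe the selection, the closer the within-group correlation moves toward zero.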
Retrospective reports. Much of the study of expert performance is based on the retrospective reports of practice or performance. Retrospective reports are notoriously unreliable. The best example of this problem comes from a study by Hodges and Starkes (1996). They studied wrestlers who competed on an international or club level. Both groups began to wrestle at the age of 13. Historical, retrospective reports of international wrestlers indicated that they devote more time to practice than the club wrestlers. But diaries kept of current activities do not confirm this difference. International wrestlers devote 24.9 hours per week to wrestling while club wrestlers spend 24.6 hours. When it comes to deliberate practice, international wrestlers spend 9.0 hours per week but club wrestlers spend 9.8 hours per week according to their diaries. The best wrestlers actually spend less time at deliberate practice than the less capable wrestlers.
The problem with all self-reports is that they are easily influenced by extraneous factors such as the subject's expectations about what the experimenter wants and the subject's own perceptions and beliefs. We would all prefer to believe that our success is due to our own hard work and not some lucky confluence of abilities. People who believe that anybody could accomplish what they have accomplished are very likely to report working hard at what they are good at.
Selective domains. Another problem with the research on deliberate practice is the domains selected. Much of the research has been carried out on domains in which there is no general training of the population such as music, chess, and wrestling. In these highly specialized domains, there is really no indication about what attracts people to become experts in these areas. Therefore, anyone who participates in these specialized activities is already differentiated from the general population.
Studies of these very specialized activities can be contrasted to areas, like math, in which nearly everyone receives some training. As the Anghoff and Johnson (1988) study clearly showed, how a person performs after four years of instruction is closely related to how they performed at the start of instruction. This same conclusion can be applied to other fields in which everyone receives instruction.
Given all of these methodological problems with the study of deliberate practice, is there any methodology that would make a convincing case for the position? In our opinion, the position would be very easy to confirm or reject. One need only take a random sample of persons from the general population, measure initial differences in ability, divide the sample into two groups, and then give one group deliberate practice and the other group practice activities of a different kind. We believe such a study would confirm the model we have presented above. Our prediction is that there would be differences between deliberate practice and other forms of practice but that these differences would be swamped by preexisting differences in general and specific abilities.
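The design just described can be sketched in code. The example below is not an analysis of real data. It shows random assignment to two practice conditions and one simple way to compare how much outcome variance is associated with initial ability versus practice condition. The synthetic data generator and its weights (0.8 for ability, 0.3 for practice type) are hypothetical values that merely encode the prediction so the analysis can be run end to end.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def assign_groups(n, seed=0):
    """Randomly assign n participants to two groups.

    1 = deliberate practice, 0 = another kind of practice.
    """
    rng = random.Random(seed)
    labels = [1] * (n // 2) + [0] * (n - n // 2)
    rng.shuffle(labels)
    return labels


def variance_shares(initial, group, outcome):
    """Squared correlations of outcome with initial ability and with group.

    With random assignment, group is roughly uncorrelated with initial
    ability, so the two shares can be compared directly.
    """
    return pearson_r(initial, outcome) ** 2, pearson_r(group, outcome) ** 2


def synthetic_study(n=2000, ability_weight=0.8, practice_effect=0.3, seed=0):
    """Generate illustrative data; all weights are assumptions, not findings."""
    rng = random.Random(seed)
    initial = [rng.gauss(0.0, 1.0) for _ in range(n)]
    group = assign_groups(n, seed=seed + 1)
    outcome = [
        ability_weight * a + practice_effect * g + rng.gauss(0.0, 0.5)
        for a, g in zip(initial, group)
    ]
    return initial, group, outcome


if __name__ == "__main__":
    initial, group, outcome = synthetic_study()
    ability_share, practice_share = variance_shares(initial, group, outcome)
    print(f"ability share = {ability_share:.2f}, practice share = {practice_share:.2f}")
```

In a real study the two shares would be estimated from measured data, and the result could support or contradict the prediction. The simulation only demonstrates the form of the comparison.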
Evidence for the Contribution of Individual Differences to Skilled Performance
It is true that the study of the development of specialized skills is not far advanced. However, there is sufficient evidence to indicate that the development of expertise in almost any area is dependent on differences in general intellectual ability as well as specialized skills. The following are just a few examples.
Music. Lynn, Wilson, and Gault (1989) found that musical ability tests loaded on a general intelligence factor with loadings ranging between .49 and .59. This finding shows that musical ability is correlated with general intelligence. Phillips (1978) found correlations between IQ and tests of musical ability between .61 and .69. Finally, McLeish and Higgs (1982) found that mentally retarded persons were predictably low in musical ability.
There are also studies showing the influence of specific musical talent independent of general intelligence. Coon and Carey (1989) surveyed twins using subjective items about musical interest. They found heritabilities between .1 and .7, and some differed by gender. Given the subjective nature of the scales, the results are surprisingly strong. Shutter (1966) compared the musical abilities of five pairs of identical twins that had been reared apart on the Wing Tests of Musical Intelligence. Though each member of the twin pairs had very different musical training, their test scores were very similar. These studies show that there are heritable musical skills.
There is also extensive literature on topics such as perfect pitch, spatial ability in musicians, and other specialized tests of musical ability that confirms the role of abilities in the development of musical expertise. All of these lines of research, though perhaps not as extensive as we might wish, support the model we presented above. Musical accomplishment is dependent on general intelligence; on special skills such as pitch, rhythm, and perhaps spatial ability; on personality variables; and on the opportunity to practice.
Music is an area where the idea of deliberate practice can intuitively be seen to be incorrect. In our experience, people tend to know if they are musically talented or not. Many people who like music and listen to it regularly have no talent to perform it despite intensive instruction. Not everyone can become an expert musician, even if they practice for 10,000 hours.
Athletics. Both Geron (1979) and Levin (1984) found that they were able to differentiate child athletes on their profile scores on an intelligence test composed of multiple subtests. Children with different profiles participated in different sports. For example, swimmers were highest in abstract analysis and analogy while track and field participants were highest on concrete and abstract analytical thinking. Interestingly, tests were given to children before they began participation in these sports so participation in sports could not have caused these differences.
There is also evidence that children who participate in sports are significantly higher in general intelligence than children who do not. Boulton (1966) found that 66% of higher IQ children participated in sports while only 25% of lower IQ children did so.
Chess. Several studies have shown no relationship between chess ability and practice time. Charness (1981) tested nationally ranked chess players and found no correlation between Elo ranking and either starting age or total years of chess playing. Pfau (1983) found the same.
The work of Elo (1978) is some of the best with regard to chess ability. He plotted the rankings throughout the careers of 27 tournament chess players. Those who eventually became grand masters had a first tournament ranking of about 1450. Those who became international masters were first ranked approximately 1400, and those who became national masters were ranked 1350. These small differences were significant and became larger over years.
To further test this notion of ability in chess, Elo took a group of Milwaukee public school children and taught them chess. Elo claimed that differences in ability were apparent after only two weeks of instruction and that the differences widened as time went on. Higher ability subjects profited more from instruction.
Several studies also show that children who play chess have higher IQs than children who do not (Frydman & Lynn, 1992; Horgan & Morgan, 1990). In one study, children who were precocious chess players were found to have an average IQ of 121. As in the other areas we have examined, there is substantial evidence that attaining expertise at chess requires abilities well above those of the general population.
In this paper, we have argued that individual differences are large. Though we illustrated our argument using intelligence as the ability, we believe similar differences would be found with other abilities. Typically, the range of individual differences will be 10 standard deviations or more. Next, we examined the degree to which differences were malleable. The research we surveyed indicates that 1 to 1.5 standard deviations is the upper limit for increasing an ability. In cases where this has been accomplished, the effects frequently do not persist once the intervention is discontinued. Other studies, like the one by Anghoff and Johnson, suggest there is very limited flexibility for instructional effects. We estimate that between 60% and 80% of outcome is due to preexisting individual differences. In situations where practice, instruction, or intervention is applied, the most important determinant of a person's final position in a distribution will be their position in that distribution before practice, instruction, or intervention.
Finally, we considered the position that deliberate practice is all that is necessary for expertise. We rejected that position in favor of one that includes general intellectual ability, specific abilities, practice, and the interactions of these factors. There is already a substantial amount of literature that supports this position.
Americans have always had a warm spot in their heart for equality. We would like to believe that we are all equally capable of anything. The truth is, we are not. There are marked and obvious differences among people. Until we are able to accept these differences and deal with them in a rational way, many of the problems that these differences currently produce will go unresolved.
Author Note: Parts of this work were supported by Grant No. HD07176 from the National Institute of Child Health and Human Development, Office of Mental Retardation.
1This model was suggested by Joanne Ruthsatz.
2 We would like to thank Prof. Krampe for making these data available.
*Please see original book for figure 1 and figure 2.
Anghoff, W.H., & Johnson, E.G. (1988). A study of the differential impact of curriculum on aptitude test scores. Research Report 88-46. Princeton, NJ: Educational Testing Service.
Bouchard, T.J., Jr., & McGue, M. (1981). Familial studies of intelligence: A review. Science, 212, 1055-1059.
Boulton, S.M. (1966). Relationships between mental ability, physique, and various competitive game activities of adolescent boys in comprehensive school. Research Papers in Physical Education, 3, 3-13.
Capron, C., & Duyme, M. (1989). Assessment of effects of socioeconomic status on IQ in a full cross-fostering study. Nature, 340, 552-554.
Charness, N. (1981). Aging and skilled problem solving. Journal of Experimental Psychology: General, 110, 21-28.
Chorney, M. J., Chorney, K., Seese, N., Owen, M. J., Daniels, J., McGuffin, P., Thompson, L. A., Detterman, D. K., Benbow, C. P., Lubinski, D., Eley, T., & Plomin, R. (in press). A quantitative trait locus (QTL) associated with cognitive ability in children. Psychological Science.
Detterman, D. K. (1987). Theoretical notions of intelligence and mental retardation. American Journal of Mental Deficiency, 92, 2-11.
Detterman, D. K. (1994). A system theory of intelligence. In D. K. Detterman (Ed.). Current topics in human intelligence: Vol 4. Theories of Intelligence (pp. 85-115). Norwood, NJ: Ablex Publishing Corporation.
Detterman, D. K. (Ed.). (1996). Current topics in human intelligence: Vol. 5. The environment. Norwood, NJ: Ablex Publishing Corporation.
Detterman, D. K., & Daniel, M. H. (1989). Correlations of mental tests with each other and with cognitive variables are highest for low IQ groups. Intelligence, 13, 349-359.
Elo, A. (1978). The rating of chess players, past and present. New York: Arco.
Ericsson, K. A., & Charness, N. (1995). Abilities: Evidence for talent or characteristics acquired through engagement in relevant activities. American Psychologist, 50, 803-804.
Ericsson, K.A., Krampe, R., & Tesch-Romer, C. (1993). The role of deliberate practice in the acquisition of expert performance. Psychological Review, 100, 363-406.
Foner, P. S. (1945). The complete writings of Thomas Paine. New York: Citadel Press.
Geron, E. (1979). Sport giftedness in gymnastics and intelligence in children. International Journal of Sports Psychology, 10, 18-30.
Herrnstein, R. J., & Murray, C. (1994). The bell curve. New York: The Free Press.
Hodges, N. J., & Starkes, J. L. (1996). Wrestling with the nature of expertise: A sport-specific test of Ericsson, Krampe and Tesch-Romer's (1993) theory of "deliberate practice." International Journal of Sport Psychology, 27, 400-424.
Horgan, D. D., & Morgan, D. (1990). Chess expertise in children. Applied Cognitive Psychology, 24, 109-128.
Jensen, A. R. (1998). The g factor. New London, CN: Praeger.
Lagemann, E. C. (1997). Contested terrain: A history of education research in the United States, 1890-1990. Educational Researcher, 26(9), 5-17.
Levin, Z. (1984). Intelligence structure of elementary school age athletes. In U. Sinui, D. Eldear, & S. Lieberman, (Eds.). Preschool and elementary school children and physical activity (pp. 79-88). Proceedings of the 26th ICHIPER Congress.
Lynn, R., Wilson, G. R., & Gault, A. (1989). Simple musical tests as measures of Spearman's g. Personality and Individual Differences, 10, 25-28.
McLeish, J., & Higgs, G. (1982). Musical ability and mental subnormality: An experimental investigation. British Journal of Educational Psychology, 52, 370-373.
Morgan, F. (1998). Data file available for completions in postsecondary institutions. U.S. Department of Education, Office of Educational Research and Improvement, National Center for Education Statistics, Announcement NCES-97-411.
Pfau, D. (1982). Chess skill and its development: The role of semantic knowledge, episodic memory, tactical skill, and positional judgement. Unpublished master's thesis, University of Akron, OH.
Phillips (1978). An investigation of the relationship between musicality and intelligence. Psychology of Music, 4(2), 16-31.
Scarr, S., & Weinberg, R. A. (1976). IQ test performance of black children adopted by white families. American Psychologist, 31, 726-739.
Smith, T. M. (1997). The condition of education, 1997. Washington: National Center for Education Statistics, NCES# 97388, GPO# 065-000-00997-8. Also available at http://nces.ed.gov/
Spitz, H. (1986). The raising of intelligence: A selected history of attempts to raise retarded intelligence. Hillsdale, NJ: Lawrence Erlbaum Associates.
Starkes, J. L., Deakin, J. M., Allard, F., Hodges, N. J., & Hayes, A. (1996). Deliberate practice in sports: What is it anyway? In K. A. Ericsson (Ed.), The road to excellence: The acquisition of expert performance in the arts and sciences, sports and games (pp. 81-106). Mahwah, NJ: Lawrence Erlbaum Associates.