evolved minds and education: intelligence

The second vigorously debated area that Geary refers to in Educating the Evolved Mind is intelligence. In the early 1900s statistician Charles Spearman developed a technique called factor analysis. When he applied it to measures of a range of cognitive abilities he found a strong correlation between them, and concluded that there must be some underlying common factor that he called general intelligence (g). General intelligence was later subdivided into crystallised intelligence (gC) resulting from experience, and fluid intelligence (gF) representing a ‘biologically-based ability to acquire skills and knowledge’ (p.25). The correlation has been replicated many times and is reliable – at the population level, at least. What’s also reliable is the finding that intelligence, as Robert Plomin puts it, “is one of the best predictors of important life outcomes such as education, occupation, mental and physical health and illness, and mortality”.
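For readers who haven't met the statistics, here is a minimal sketch of the kind of correlation Spearman started from. It is not factor analysis itself, only the pairwise Pearson correlation between two sets of sub-test scores. The scores, the variable names and the `pearson` helper are all invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 items)")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sum of products of deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical scores for six children on two sub-tests (invented data)
vocabulary = [8, 9, 10, 11, 12, 13]
arithmetic = [7, 10, 9, 12, 11, 14]

r = pearson(vocabulary, arithmetic)
print(f"correlation between the two sub-tests: {r:.2f}")
```

A value close to 1 means children who score high on one sub-test tend to score high on the other. Factor analysis goes a step further and asks whether a whole table of such correlations can be accounted for by a smaller number of underlying factors.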

The first practical assessment of intelligence was developed by French psychologist Alfred Binet, commissioned by his government to devise a way of identifying the additional needs of children in need of remedial education. Binet first published his methods in 1903, the year before Spearman’s famous paper on intelligence. The Binet-Simon scale (Theodore Simon was Binet’s assistant) was introduced to the US and translated into English by Henry H Goddard. Goddard had a special interest in ‘feeble-mindedness’ and used a version of Binet’s scale for a controversial screening test for would-be immigrants. The Binet-Simon scale was standardised for American children by Lewis Terman at Stanford University and published in 1916 as the Stanford-Binet test. Later, the concept of intelligence quotient (IQ – mental age divided by chronological age and multiplied by 100) was introduced, and the rest, as they say, is history.
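The ratio IQ described above is simple arithmetic, and a short sketch makes it concrete. The function name `ratio_iq` and the example ages are my own, chosen only for illustration.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ: mental age divided by chronological age, multiplied by 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age / chronological_age * 100


# An 8-year-old performing at the level typical of different ages
print(ratio_iq(10, 8))  # 125.0
print(ratio_iq(8, 8))   # 100.0
print(ratio_iq(6, 8))   # 75.0
```

So a child whose mental age matches their chronological age scores 100 by definition, and the score rises or falls with the gap between the two.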

what’s the correlation?

Binet’s original scale was used to identify specific cognitive difficulties in order to provide specific remedial education. Although it has been superseded by tests such as the Wechsler Intelligence Scale for Children (WISC), what all intelligence tests have in common is that they contain a number of sub-tests that test different abilities. The 1905 Binet-Simon scale had 30 sub-tests and the WISC-IV has 15. Although the scores in sub-tests tend to be strongly correlated, Early Years teachers, Educational Psychologists and special education practitioners will be familiar with the child with the ‘spiky profile’ who has high scores on some sub-tests but low ones on others. Their overall IQ might be average, but that can mask considerable variation in cognitive sub-skills. Deirdre Lovecky, who runs a resource centre in Providence, Rhode Island for gifted children with learning difficulties, reports in her book Different Minds having to essentially pick ‘n’ mix sub-tests from different assessment instruments because children were scoring at ceiling on some sub-tests and at floor on others. In short, Spearman’s correlation might be true at the population level, but it doesn’t hold for some individuals. And education systems have to educate individuals.
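Here is a rough sketch of how an average can hide a spiky profile. The two sets of sub-test scores are invented for illustration and are not taken from any real assessment or child; the `profile_summary` helper is hypothetical too.

```python
from statistics import mean, pstdev


def profile_summary(scores):
    """Return the mean, the spread (population SD) and the range of sub-test scores."""
    return {
        "mean": mean(scores),
        "spread": pstdev(scores),
        "range": max(scores) - min(scores),
    }


# Hypothetical sub-test scores for two children (invented data)
flat_profile = [10, 10, 11, 9, 10, 10]
spiky_profile = [18, 3, 17, 2, 16, 4]

for name, scores in [("flat", flat_profile), ("spiky", spiky_profile)]:
    summary = profile_summary(scores)
    print(name, summary)
```

Both children come out with the same mean score, but the spread and range show that the second child is near the top on some sub-tests and near the bottom on others. A single overall figure can't tell the two apart.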

is it valid?

A number of issues have been vigorously debated in relation to intelligence. One is its construct validity. There’s no doubt intelligence tests measure something – but whether that something is a single biologically determined entity is another matter. We could actually be measuring several biologically determined functions that are strongly dependent on each other. Or some biologically determined functions interacting with culturally determined ones. As the psychologist Edwin Boring famously put it way back in 1923, “intelligence is what the tests test”.

is it cultural?

Another contentious issue is the cultural factors implicit in the tests.  Goddard attempted to measure the ‘intelligence’ of European immigrants using sub-tests that included items culturally specific to the USA.  Stephen Jay Gould goes into detail in his criticism of this and other aspects of intelligence research in his book The Mismeasure of Man.  (Gould himself has been widely criticised so be aware you’re venturing into a conceptual minefield.)  You could just about justify culture-specificity in tests for children who had grown up in a particular culture, on the grounds that understanding cultural features contributed to overall intelligence. But there are obvious problems with the conclusions that can be drawn about gF in the case of children whose cultural background might be different.

I’m not going to venture into bell-curve territory because the vigorous debate in that area is due to how intelligence tests are applied, rather than the content of the tests. Suffice it to say that much of the controversy about application has arisen because of assumptions made about what intelligence tests tell us. The Wikipedia discussion of Herrnstein & Murray’s book is a good starting point if you’re interested in following this up.

multiple intelligences?

There’s little doubt that intelligence tests are valid and reliable measures of the core abilities required to successfully acquire the knowledge and skills taught in schools in the developed industrialised world; knowledge and skills that are taught in schools because they are valued in the developed industrialised world.

But as Howard Gardner points out in his (also vigorously debated) book Frames of mind: The theory of multiple intelligences, what’s considered to be intelligence in different cultures depends on what abilities are valued by different cultures. In the developed industrialised world, intelligence is what intelligence tests measure. If, on the other hand, you live on a remote Pacific Island and are reliant for your survival on your ability to catch fish and navigate across the ocean using only the sun, moon and stars for reference, you might value other abilities. What would those abilities tell you about someone’s ‘intelligence’? Many people place a high value on the ability to kick a football, sing in tune or play stringed instruments; what do those abilities tell you about ‘intelligence’?

it’s all about the constructs

If intelligence tests are a good measure of the abilities necessary for learning what’s taught in school, then fine, let’s use them for that purpose. What we shouldn’t be using them for is drawing conclusions about a speculative entity we’ve named ‘intelligence’. Or assuming, on the basis of those tests, that we can label some people more or less ‘intelligent’ than others, as Geary does, e.g.:

“Intelligent individuals identify and apprehend bits of social and ecological information more easily and quickly than do other people” (p.26)


“Individuals with high IQ scores learned the task more quickly than their less-intelligent peers” (p.59)


What concerned me most about Geary’s discussion of intelligence wasn’t what he had to say about accuracy and speed of processing, or about the reliability and predictive validity of intelligence tests, which are pretty well supported. It was the fact that he appears to accept the concepts of g, gC and gF without question. And the ‘vigorous debate’ that’s raged for over a century is reduced to ‘details to be resolved’ (p.25), which doesn’t quite do justice to the furore over the concept, or the devastation resulting from the belief that intelligence is a ‘thing’. Geary’s apparently unquestioning acceptance of intelligence brings me to the subject of the next post: his model of the education system.



Gardner, H (1983). Frames of Mind: The theory of multiple intelligences. Fontana (1993).

Geary, D (2007).  Educating the evolved mind: Conceptual foundations for an evolutionary educational psychology, in Educating the evolved mind: Conceptual foundations for an evolutionary educational psychology, JS Carlson & JR Levin (Eds). Information Age Publishing.

Gould, SJ (1996).  The Mismeasure of Man.  WW Norton.

Lovecky, D V (2004).  Different minds: Gifted children with AD/HD, Asperger Syndrome and other learning deficits.  Jessica Kingsley.


learning styles: the evidence

The PTA meeting was drawing to a close. The decision to buy more books for the library instead of another interactive whiteboard had been unanimous, and the conversation had turned to educational fads.

“Now, of course,” the headteacher was saying, “it’s all learning styles. We’re visual, auditory or kinaesthetic learners – you know, Howard Gardner’s Multiple Intelligences.” His comment caught my attention because I was familiar with Gardner’s multiple intelligences, but couldn’t recall them having anything to do with sensory modalities and I didn’t know they’d made their way into primary education. My curiosity piqued, I read Gardner’s book Frames of Mind: The Theory of Multiple Intelligences. It prompted me to delve into his intriguing earlier account of working with brain-damaged patients – The Shattered Mind.

Where does the VAK model come from?

Gardner’s multiple intelligences model was clearly derived from his pretty solid knowledge of brain function, but wherever the idea of visual, auditory and kinaesthetic (VAK) learning styles had come from, it didn’t look like it came from Gardner. A bit of Googling learning styles kept bringing up the names Dunn and Dunn, but I couldn’t find anything on the VAK model’s origins. So I phoned a friend. “It’s based on Neuro-Linguistic Programming”, she said.

This didn’t bode well. Neuro-Linguistic Programming (NLP) is a therapeutic approach devised in the 1970s by Richard Bandler, a psychology graduate, and John Grinder, then an assistant professor of linguistics who, like Frank Smith, had worked in George magical-number-seven-plus-or-minus-two Miller’s lab and been influenced by Noam Chomsky’s ideas about linguistics.

If I’ve understood Bandler and Grinder’s idea correctly, they proposed that insights into people’s internal, subjective sensory representations can be gleaned from their eye movements and the words they use. According to their model, this makes it possible to change those internal representations to reduce anxiety or eliminate phobias. Although there are some valid elements in the theory behind NLP, evaluations of the model have in the main been critical and evidence supporting the effectiveness of NLP as a therapeutic approach has been notable by its absence (see e.g. Witkowski, 2010).

So the VAK Learning Styles model appeared to be an educational intervention derived from a debatable theory and a therapeutic technique that doesn’t work too well.

Evaluating the evidence

Soon after I’d phoned my friend, in 2004 Frank Coffield and colleagues published a systematic and rigorous evaluation of 13 learning styles models used in post-16 learning and found the reliability and validity of many of them wanting. They didn’t evaluate the VAK model as such, but did review the Dunn and Dunn Learning Styles Inventory which is very similar, and it didn’t come out with flying colours. I mentally consigned VAK Learning Styles to my educational fads wastebasket.

Fast forward a decade. Teachers using social media were becoming increasingly dismissive of VAK Learning Styles and of learning styles in general. Their objections appeared to trace back to Tom Bennett’s 2013 book Teacher Proof. Tom doesn’t like learning styles. In Separating neuromyths from science in education, an article on the New Scientist website, he summarises his ‘hitlist’ of neuromyths. He claims the VAK model is “the most popular version” of the learning styles theory, and that it originated in Neil Fleming’s VARK (visual, auditory, read-write, kinaesthetic) concept. According to Fleming, a teacher from New Zealand, his model does indeed derive from Neuro-Linguistic Programming. Bennett says the Coffield review “found up to 71 learning styles had been described, mostly not backed by credible evidence”.

This is where things started to get a bit confusing. The Coffield review identified 71 different learning styles models and evaluated 13 of them against four basic criteria: internal consistency, test-retest reliability, construct validity and predictive validity. The results were mixed, ranging from one model that met all four criteria to two that met none. Five of the 13 use the words ‘learning style(s)’ in their name. They included Dunn and Dunn’s Learning Styles Inventory that features visual, auditory, kinaesthetic and tactile (VAKT) modalities, but not Fleming’s VARK model nor the popular VAK Learning Styles model as such.

Having cited John Hattie’s research on the effect size of educational interventions that found the impact of individualisation to be relatively low, Coffield et al concluded “it seems sensible to concentrate limited resources and staff efforts on those interventions that have the largest effect sizes” (p.134).

A later review of learning styles by Pashler et al (2008) took a different approach. The authors evaluated the evidence for what they call the meshing hypothesis: the claim that individualizing instruction to the learner’s style can enable them to achieve a better learning outcome. They found “plentiful evidence arguing that people differ in the degree to which they have some fairly specific aptitudes for different kinds of thinking and for processing different types of information” (p.105). But like the Coffield team, Pashler et al concluded “at present, there is no adequate evidence base to justify incorporating learning-styles assessments into general educational practice. Thus, limited education resources would better be devoted to adopting other educational practices that have a strong evidence base, of which there are an increasing number” (p.105).

Populations, groups and individuals

The research by Coffield, Pashler and Hattie highlights a core challenge for any research relating to large populations: that what is true at the population level might not hold for minority groups or specific individuals – and vice versa. Behavioural studies that compare responses to different treatments usually present results at the group level (see for example Pashler et al’s Fig 1). Results from individuals that differ substantially from the group are usually treated as ‘outliers’ and overlooked. But a couple of high or low scores in a small group can make a substantial difference to the mean. It’s useful to know how the average student behaves if you’re researching teaching methods or developing educational policy, but the challenge for teachers is that they don’t teach the average student – they have to teach students across the range – including the outliers.
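The point about a couple of scores shifting a small group's mean can be shown with a quick sketch. The scores below are invented for illustration and don't come from any of the studies discussed.

```python
from statistics import mean, median

# Hypothetical test scores for a group of six students (invented data)
typical_group = [48, 49, 50, 50, 51, 52]

# The same group, but with the two highest scorers replaced by two outliers
group_with_outliers = [48, 49, 50, 50, 90, 95]

for label, scores in [("typical", typical_group),
                      ("with outliers", group_with_outliers)]:
    print(f"{label}: mean={mean(scores):.1f}, median={median(scores):.1f}")
```

Two unusual scores out of six pull the mean up by more than ten points, while the median stays where it was. Neither figure on its own tells a teacher that four of the students are clustered around 50 and two are far above it.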

So although it makes sense at the population level to focus on Hattie’s top types of intervention, those interventions might not yield the best outcomes for particular classes, groups or individual students. And although the effect sizes of interventions involving the personal attributes of students are relatively low, they are far from non-existent.

In short, reviewers have noted that:
• there is evidence to support the idea that people have particular aptitudes for particular types of learning,
• some learning styles models have some validity and reliability,
• there is little evidence that teaching children in their ‘best’ sensory modality will improve learning outcomes,
• given the limited resources available, the evidence doesn’t warrant teachers investing a lot of time and effort in learning styles assessments.

But you wouldn’t know that from reading some commentaries on learning styles. In the next couple of posts, I want to look at what Daniel Willingham and Tom Bennett have to say about them.

Bandler, R. & Grinder, J (1975). The structure of magic I: A book about language and therapy. Science & Behaviour Books, Palo Alto.

Bandler, R. & Grinder, J (1979). Frogs into Princes: The introduction to Neuro-Linguistic Programming. Eden Grove Editions (1990).

Bennett, T. (2013). Teacher Proof: Why research in education doesn’t always mean what it claims, and what you can do about it, Routledge.

Coffield, F., Moseley, D., Hall, E. & Ecclestone, K. (2004). Learning styles and pedagogy in post-16 learning: A systematic and critical review. Learning and Skills Research Centre.

Fleming, N. & Mills, C. (1992). Not another invention, rather a catalyst for reflection. To Improve the Academy. Professional and Organizational Development Network in Higher Education. Paper 246.

Gardner, H. (1977). The Shattered Mind: The person after brain damage. Routledge & Kegan Paul.

Gardner, H. (1983). Frames of Mind: The theory of multiple intelligences. Fontana (1993).

Pashler, H., McDaniel, M., Rohrer, D. & Bjork, R. (2008). Learning Styles: Concepts and Evidence. Psychological Science in the Public Interest, 9, 105-119.

Witkowski, T (2010). Thirty-Five Years of Research on Neuro-Linguistic Programming. NLP Research Data Base. State of the Art or Pseudoscientific Decoration? Polish Psychological Bulletin 41, 58-66.