seven myths about education: cognitive psychology & levels of abstraction

In her book Seven Myths about Education, Daisy Christodoulou claims that a certain set of ideas dominant in English education are misguided and presents evidence to support her claim. She says “Essentially, the evidence here is fairly straightforward and derives mostly from cognitive psychology”.

Whilst reading Daisy’s book, I found it difficult at several points to follow her argument, despite the clarity of her writing style and the validity of the findings from cognitive psychology to which she appeals. It then occurred to me that Daisy and some of the writers she cites were using the same terminology to refer to different things, and different terminology to refer to the same thing. This is almost inevitable if you are drawing together ideas from different knowledge domains, but obviously definitions need to be clarified or you end up with people misunderstanding each other.

In the next few posts, I want to compare the model of cognition that Daisy outlines with a framework for analysing knowledge that’s been proposed by researchers in several different fields. I’ve gone into some detail because of the need to clarify terms.

why cognitive psychology?

Cognitive psychology addresses the way people think, so has obvious implications for education. In Daisy’s view its findings challenge the assumptions implicit in her seven myths. In the final section of her chapter on myth 1, having recapped on what Rousseau, Dewey and Freire have to say, Daisy provides a brief introduction to cognitive psychology. Or at least to the interface between information theory and cognitive psychology in the 1960s and 70s that produced some important theoretical models of human cognition. Typically, researchers would look at how people perceived or remembered things or solved problems, infer a model that explained how the brain must have processed the information involved, and then test the model by running computer simulations. Not only did this approach give some insights into how the brain worked, it also meant that software might be developed that could do some of the perceiving, remembering or problem-solving for us. At the time, there was a good deal of interest in expert systems – software that could mimic the way experts thought.

Much of the earlier work in cognitive psychology had involved the biology of the brain. Researchers knew that different parts of the brain specialised in processing different types of information, and that the parts were connected by nerve fibres (the axons of neurons) activated by tiny electrical impulses. A major breakthrough came when they realised the brain wasn’t constructed like a railway network, with the nerve fibres connecting parts of the brain as a track connects stations, but in complex networks that were more like the veins in a leaf. Another breakthrough came when they realised information isn’t stored and retrieved in the form of millions of separate representations, like books in a vast library, but in the patterns of connections between the neurons. It’s like the way the same pixels on a computer monitor can display an infinite number of images, depending on which pixels are activated. A third breakthrough occurred when it was found that the brain doesn’t start off with all its neurons already connected – it creates and dissolves connections as it learns. So connections between facts and concepts aren’t just metaphorical, they are biological too.

Because it’s difficult to investigate functioning brains, computers offered a way of figuring out how information was being processed by the brain. Although this was a fruitful area of research in the 1960s and 70s, researchers kept running into difficulties. Problems arose because the human brain isn’t built like a computer; it’s more like a Heath Robinson contraption cobbled together from spare parts. It works after a fashion, and some parts of it are extremely efficient, but if you want to understand how it works, you have to get acquainted with its idiosyncrasies. The idiosyncrasies exist because the brain is a biological organ with all the quirky features that biological organs tend to have. Trying to figure out how it works from the way people use it has limitations; information about the biological structure and function of the brain is needed to explain why brains work in some rather odd ways.

Since the development of scanning techniques in the 1980s, the attention of cognitive science has shifted back towards the biological mechanisms involved. This doesn’t mean that the information theory approach is defunct – far from it – there’s been considerable interest in computational models of cognition and in cognitive errors and biases, for example. But the information theory and biological approaches are complementary; each approach makes more sense in the light of the other.

more than artificial intelligence

Daisy points out that “much of the modern research into intelligence was inspired and informed by research into artificial intelligence” (p.18). Yes, it was, but work on biological mechanisms, perception, attention and memory was going on simultaneously. Then “in the 1960s and 1970s researchers agreed on a basic mental model of cognition that has been refined and honed since then.” That’s one way of describing the sea change in cognitive science that’s happened since the introduction of scanning techniques, but it’s something of an understatement. Daisy then quotes Kirschner, Sweller and Clark: ‘working memory can be equated with consciousness’. In a way it can, but facts and rules and digits are only a tiny fraction of what consciousness involves, though you wouldn’t know that to read Daisy’s account. Then there’s the nature of long-term memory. According to Daisy, “when we try to solve any problem, we draw on all the knowledge that we have committed to long-term memory” (p.63). Yes, we do in a sense, but long-term memory is notoriously unreliable.

What Daisy didn’t say about cognitive psychology is as important as what she did say. Aside from all the cognitive research that wasn’t about artificial intelligence, Daisy fails to mention a model of working memory that’s dominated cognitive psychology for 40 years – the one proposed by Baddeley and Hitch in 1974. Recent research has shown that it’s an accurate representation of what happens in the brain. But despite being a leading authority on working memory, Baddeley gets only one mention in an endnote in Daisy’s book (the same ‘more technical’ reference that Willingham cites – also in an endnote) and isn’t mentioned at all in the Kirschner, Sweller and Clark paper. At the ResearchED conference in Birmingham in April this year, one teacher who’d given a presentation on memory told me he’d never heard of Baddeley. I’m drawing attention to this not because I have a special interest in Baddeley’s model, but because omitting his work from a body of evidence about working memory is a bit like discussing the structure of DNA without mentioning Crick and Watson’s double helix, or discussing 19th century literature without mentioning Dickens. Also noticeable by her absence is Susan Gathercole, a professor of cognitive psychology at York, who researches working memory problems in children. Her work couldn’t be more relevant to education if it tried, but it’s not mentioned. Another missing name is Antonio Damasio, a neurologist who’s tackled the knotty problem of consciousness – highly relevant to working memory. Because of his background in biology, Damasio takes a strongly embodied view of consciousness; what we are aware of is affected by our physiology and emotions as well as our perceptions and memory. Daisy can’t write about everything, obviously, but it seemed odd to me that her model of cognition is drawn only from concepts central to one strand of one discipline at one period of time, not from an overview of the whole field. It was also odd that she cited secondary sources when work by people who have actually done the relevant research is readily accessible.

does this matter?

On her blog, Daisy sums up the evidence from cognitive psychology in three principles: “working memory is limited; long-term memory is powerful; and we remember what we think about”. When I’ve raised the issue of memory and cognition being more complex than Willingham’s explicitly ‘very simple’ model, teachers who support Daisy’s thesis have asked me if that makes any difference.

Other findings from cognitive psychology don’t make any difference to the three principles as they stand. Nor do they make it inappropriate for teachers to apply those principles, as they stand, to their teaching. But they do make a difference to the conclusions Daisy draws about facts, schemata and the curriculum. Whether they refute the myths or not depends on those conclusions.

a model of cognition

If I’ve understood correctly, Daisy is saying that working memory (WM) has limited capacity and limited duration, but long-term memory (LTM) has a much greater capacity and duration. If we pay attention to the information in WM, it’s stored permanently in LTM. The brain ‘chunks’ associated information in LTM, so that several smaller items can be retrieved into WM as one larger item, in effect increasing the capacity of WM. Daisy illustrates this by comparing the difficulty of recalling a string of 16 numerals

4871947503858604

with a string of 16 letters

the cat is on the mat

The numerals are difficult to recall, but the letters are easily recalled because our brains have already chunked those frequently encountered letter patterns into words, the capacity of WM is large enough to hold six words, and once the words are retrieved we can quickly decompose them into their component letters. So in Daisy’s model, memorising information increases the amount of information WM can handle.
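To make the chunking point concrete, here’s a minimal sketch in Python (my own illustration, not anything from Daisy’s book – the seven-item figure is just the conventional rule of thumb for working memory capacity) of how the same sixteen characters can count as sixteen items or as six, depending on whether the brain has already chunked them into familiar words:

```python
WM_CAPACITY = 7  # conventional rule-of-thumb limit, purely illustrative

def wm_items(symbols, chunker=None):
    """Return the items working memory would have to hold.

    With no chunker, every character is a separate item; with a chunker,
    each already-learned chunk (here, a whole word) counts as one item.
    """
    return chunker(symbols) if chunker else [s for s in symbols if s != " "]

digits = "4871947503858604"          # 16 unrelated numerals
sentence = "the cat is on the mat"   # 16 letters, but 6 familiar words

unchunked = wm_items(digits)                     # 16 items - over capacity
chunked = wm_items(sentence, chunker=str.split)  # 6 items - well within capacity

print(len(unchunked), len(unchunked) <= WM_CAPACITY)  # 16 False
print(len(chunked), len(chunked) <= WM_CAPACITY)      # 6 True
```

The letters haven’t gone anywhere; they’re simply packaged into units that have already been learned, which is the whole point of the example.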

I was with her so far. It was the conclusions that Daisy then went on to draw about facts, schemata and the curriculum that puzzled me. The aha! moment came when I re-read her comments on Bloom’s taxonomy of educational objectives. Bloom adopts a concept that’s important in many fields, including information theory and cognitive psychology. It’s the concept of levels of abstraction, sometimes referred to as levels of granularity.

levels of abstraction

Levels of abstraction form an integral part of some knowledge domains. Chemists are familiar with thinking about their subject at the subatomic, atomic and molecular levels; biologists with thinking about a single organism at the molecular, cellular, organ, system or whole body level; geographers and sociologists with thinking about a population at the household, city or national level. It’s important to note three things about levels of abstraction:

First, the same fundamental entities are involved at different levels of abstraction. The subatomic ‘particles’ in a bowl of common salt are the same particles whether you’re observing their behaviour as subatomic particles, as atoms of sodium and chlorine or as molecules of sodium chloride. Cells are particular arrangements of chemicals, organs are particular arrangements of cells, and the circulatory or respiratory systems are particular arrangements of organs. The same people live in households, cities or nations.

Secondly, entities behave differently at different levels of abstraction. Molecules behave differently to their component atoms (think of the differences between sodium, chlorine and sodium chloride), the organs of the body behave differently to the cells they are built from, and nations behave differently to the populations of cities and households.

Thirdly, what happens at one level of abstraction determines what happens at the next level up. Sodium chloride has its properties because it’s formed from sodium and chlorine – if you replaced the sodium with potassium you’d get a chemical compound that tastes very different to salt. And if you replaced the cells in the heart with liver cells you wouldn’t have a heart, you’d have a liver. The behaviour of nations depends on how the population is made up.
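The three points can be summed up in a toy sketch (my own illustration, nothing to do with Daisy or Bloom): the same lower-level entities appear at every level, properties like taste belong to the arrangement rather than to the components, and swapping a component changes what you get at the level above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Atom:
    element: str

@dataclass(frozen=True)
class Molecule:
    atoms: tuple   # the same fundamental entities, one level down
    taste: str     # a property that only exists at the molecular level

sodium, potassium, chlorine = Atom("Na"), Atom("K"), Atom("Cl")

salt = Molecule(atoms=(sodium, chlorine), taste="salty")
potassium_chloride = Molecule(atoms=(potassium, chlorine), taste="bitter")

# Same atoms are present at both levels of description:
print(salt.atoms)                            # (Atom(element='Na'), Atom(element='Cl'))
# Behaviour differs by level, and changing a lower-level component
# changes what you get at the level above:
print(salt.taste, potassium_chloride.taste)  # salty bitter
```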

Bloom’s taxonomy

The levels of abstraction Bloom uses in his taxonomy are (starting from the bottom) knowledge, comprehension, application, analysis, synthesis and evaluation. In her model of cognition Daisy refers to several levels of abstraction, although she doesn’t call them that and doesn’t clearly differentiate between them. That might be intentional. She describes Bloom’s taxonomy as a ‘metaphor’ and says it’s a misleading one because it implies that ‘the skills are somehow separate from knowledge’ and that ‘knowledge is somehow less worthy and important’ (p.21). Whether Bloom’s taxonomy is accurate or not, it looks as if Daisy’s perception of it as a ‘metaphor’, and her focus on the current popular emphasis on higher-level skills, mean that she overlooks the core principle implicit in Bloom’s taxonomy: that you can’t evaluate without synthesis, synthesise without analysis, analyse without application, or apply without comprehension. And you can’t do any of those things without knowledge. The various processes are described as ‘lower’ and ‘higher’ not because a value judgement is being made about their importance or because they involve different things entirely, but because the higher ones are derived from the lower ones in the taxonomy.

It’s possible, of course, that educational theorists have also got hold of the wrong end of the stick and have seen Bloom’s six levels of abstraction not as dependent on one another but as independent from each other. Daisy’s comments on Bloom explained why I’ve had some confusing conversations with teachers about ‘skills’. I’ve been using the term in a generic sense to denote facility in handling knowledge; the teachers have been using it in the narrow sense of specific higher-level skills required by the national curriculum.

Daisy appears to be saying that the relationship between knowledge and skills isn’t hierarchical. She provides two alternative ‘metaphors’: ED Hirsch’s scrambled egg and Joe Kirby’s double helix, representing the dynamic, interactive relationship between knowledge and skills (p.21). I think Joe’s metaphor is infinitely better than Hirsch’s, but it doesn’t take into account the different levels of abstraction of knowledge.

Bloom’s taxonomy is a framework for analysing educational objectives that are dependent on knowledge. In the next post, I look at a framework for analysing knowledge itself.

folk categorisation and implicit assumptions

In his second response to critics, Robert [Peal] tackles the issue of the false dichotomy. He says:

…categorisation invariably simplifies. This can be seen in all walks of life: music genres; architectural styles; political labels. However, though imprecise, categories are vital in allowing discussion to take place. Those who protest over their skinny lattes that they are far too sophisticated to use such un-nuanced language … are more often than not just trying to shut down debate.

Categorisation does indeed simplify. And it does allow discussion to take place. Grouping together things that have features in common and labelling the groups means we can refer to large numbers of things by their collective labels, rather than having to list all their common features every time we want to discuss them. Whether all categorisation is equally helpful is another matter.

folk categorisation

The human brain categorises things as if that was what it was built for; not surprising really, because grouping things according to their similarities and differences and referring to them by a label is a very effective way of reducing cognitive load.

The things we detect with our senses are categorised by our brains quickly, automatically and pre-verbally (e.g. Haxby, Gobbini & Montgomery, 2004; Greene & Fei-Fei, 2014) – by which I mean that language isn’t necessary in order to form the categories – although language is often involved in categorisation. We also categorise pre-verbally in the sense that babies start to categorise things visually (such as toy trucks and toy animals) at between 7 and 10 months of age, before they acquire language (Younger, 2003). And babies acquire language itself by forming categories.

Once we do start to get the hang of language, we learn about how things are categorised and labelled by the communities we live in; we develop shared ways of categorising things. All human communities have these shared ‘folk’ categorisations, but not all groups categorise the same things in the same way. Nettles and chickweed would have been categorised as vegetables in the middle ages, but to most modern suburban gardeners they are ‘weeds’.

Not all communities agree on the categorisations they use either; political and religious groups are notorious for disagreements about the core features of their categories, who adheres to them and who doesn’t. Nor are folk categorisations equally useful in all circumstances. Describing a politician’s views as ‘right wing’ gives us a rough idea of what her views are likely to be, but doesn’t tell us what she thinks about specific policies.

Biologists have run into problems with folk categorisations too. Mushrooms/toadstools, frogs/toads and horses/ponies are all folk classifications. Biologists could distinguish between species of mushrooms/toadstools, but grouping the species together as either mushrooms or toadstools was impossible because the differences between the folk categories ‘mushrooms’ and ‘toadstools’ aren’t clear enough. Biologists neatly sidestepped the problem by ignoring the folk category distinctions and grouping mushrooms and toadstools together as a phylum. The same principle applies to frogs/toads, which form an order of their own. Horses and ponies, by contrast, are members of the same subspecies.

Incidentally 18th and 19th century biologists weren’t categorising these organisms just because of an obsessive interest in taxonomy. Their classification had a very practical purpose – to differentiate between species and identify the relationships between them. In a Europe that was fast running out of natural resources, farmers, manufacturers and doctors all had a keen interest in the plants and animals being brought back from far-flung parts of the world by traders, and accurate identification of different species was vital.

In short, folk categories do allow discussion to take place, but they have limitations. They’re not so useful when one needs to get down to specifics – how are particular MPs likely to vote, or is this fungus toxic or not? The catch is in the two words Robert uses to describe categories – ‘though imprecise’. My complaint about his educational categorisation is not categorisation per se, but its imprecision.

‘though imprecise’

The categories people use for their own convenience don’t always have clear-cut boundaries, nor do they map neatly on to the real world. They don’t always map neatly onto other people’s categories either. Eleanor Rosch’s work on prototype theory shed some light on this. What she found was that people’s mental categories have prototypical features – features that the members of the category share – but not all members of the category have all the prototypical features, and category members can have prototypical features to different extents. For example, the prototypical features of most people’s category {birds} are a beak, wings, feathers and being able to fly. A robin has a beak, wings and feathers and is able to fly, so it’s strongly prototypical of the category {birds}. A penguin can’t fly but uses its wings for swimming, so it’s weakly prototypical, although still a bird.
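A minimal sketch of the idea (my own toy scoring scheme, not Rosch’s actual method) makes the graded membership explicit:

```python
# Prototypical features of most people's category {birds}
BIRD_PROTOTYPE = {"beak", "wings", "feathers", "can fly"}

def typicality(features, prototype=BIRD_PROTOTYPE):
    """Fraction of the prototypical features this category member actually has."""
    return len(features & prototype) / len(prototype)

robin   = {"beak", "wings", "feathers", "can fly"}
penguin = {"beak", "wings", "feathers", "swims"}   # a bird, but it can't fly

print(typicality(robin))    # 1.0  -> strongly prototypical
print(typicality(penguin))  # 0.75 -> weakly prototypical, but still a bird
```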

Mushrooms and toadstools have several prototypical features in common, as do frogs and toads, and horses and ponies. The prototypical features that differentiate mushrooms from toadstools, frogs from toads and horses from ponies are the ideas that toadstools are poisonous and often brightly coloured, that toads have a warty skin, sometimes containing toxins, and that horses are much larger than ponies. Although these differential features are useful for conversational purposes, they are not helpful for more specific ones such as putting edible fungi on your restaurant menu, using a particular toxin for medicinal purposes or breeding characteristics in or out of horses.

traditional vs progressive education

Traditional and progressive education are both types of education, obviously, so they have some prototypical features in common – teachers, learners, knowledge, schools etc. Robert proposes some core features of progressive education that differentiate it from traditional education; it is child-centered, focuses on skills rather than knowledge, sees strict discipline and moral education as oppressive and assumes that socio-economic background dictates success (pp. 5-8). He distilled these features from what’s been said and written about progressive education over the last fifty years, so it’s likely there’s a high degree of consensus on these core themes. The same might not be true for traditional education. Robert defines it only in terms of its core characteristics being the polar opposite of progressive education, although he appears to include in the category ‘traditional’ a list of other more peripheral features including blazers, badges and ties and class rankings.

Robert says “though imprecise, categories are vital in allowing discussion to take place.” No doubt about that, but if the categories are imprecise the discussion can be distinctly unfruitful. A lot of time and energy can be expended trying to figure out precise definitions and how accurately those definitions map onto the real world. Nor are imprecise categories helpful if we want to do something with them other than have a discussion. Categorising education as ‘traditional’ or ‘progressive’ is fine for referring conversationally to a particular teacher’s pedagogical approach or the type of educational philosophy favoured by a government minister, but those constructs are too complex and too imprecise to be of use in research.

implicit assumptions

An implicit assumption is, by definition, an assumption that isn’t made explicit. Implicit assumptions are sneaky things because if they are used in a discussion, people following the argument often overlook the fact that an implicit assumption is being made. An implicit assumption that’s completely wrong can easily slip by unnoticed. Implicit assumptions get even more sneaky; often the people making the argument aren’t aware of their implicit assumptions either. In the case of mushrooms and toadstools, any biologists who tried to group certain types of fungi into one or other of these categories would be on a hiding to nothing because of an implicit, but wrong, assumption that the fungi could be sorted into one or other of these categories.

Robert’s thesis appears to rest on an implicit assumption that because the state education system in the last fifty years has had shortcomings, some of them serious, and because progressive educational ideas have proliferated during the same period, it follows that progressive ideas must be the cause of the lack of effectiveness. This isn’t even the ever-popular ‘correlation equals causality’ error, because as far as I can see, Robert hasn’t actually established a correlation between progressive ideas and educational effectiveness. He can’t compare current traditional and progressive state schools because traditional state schools are a thing of the past. And he can’t compare current progressive state schools with historical traditional state schools because the relevant data isn’t available. Ironically, what data we do have suggest that numeracy and literacy rates have improved overall during this period. The reliability of the figures is questionable because of grade drift, but numeracy and literacy rates have clearly not plummeted.

What he does implicitly compare is state schools that he sees as broadly progressive, with independent schools that he sees as having “withstood the wilder extremes of the [progressive] movement”. The obvious problem with this comparison is that a progressive educational philosophy is not the only difference between the state and independent sectors.

In my previous post, I agreed with Robert that the education system in England leaves much to be desired, but making an implicit assumption that there’s only one cause and that other possible causes can be ignored is a risky approach to policy development. It would be instructive to compare schools that are effective (however you measure effectiveness) with schools that are less effective, to find out how the latter could be improved. But the differences between them could boil down to some very specific issues relating to the quality of teaching, classroom management, availability of additional support or allocation of budgets, rather than whether the schools take a ‘traditional’ or ‘progressive’ stance overall.

References
Greene, M.R. & Fei-Fei, L. (2014). Visual categorization is automatic and obligatory: Evidence from Stroop-like paradigm. Journal of Vision, 14, article 14.
Haxby, J.V., Gobbini, M.I. & Montgomery, K. (2004). Spatial and temporal distribution of face and object representations in the human brain. In M.S. Gazzaniga (Ed.), The Cognitive Neurosciences (3rd edn.). Cambridge, MA: MIT Press.
Kuhl, P. (2004). Early language acquisition: Cracking the speech code. Nature Reviews Neuroscience, 5, 831-843.
Younger, B. (2003). Parsing objects into categories: Infants’ perception and use of correlated attributes. In Rakison & Oakes (Eds.), Early Category and Concept Development: Making Sense of the Blooming, Buzzing Confusion. Oxford University Press.

mixed methods for teaching reading (1)

Many issues in education are treated as either/or options and the Reading Wars have polarised opinion into synthetic phonics proponents on the one hand and those supporting the use of whole language (or ‘mixed methods’) on the other. I’ve been asked on Twitter what I think of ‘mixed methods’ for teaching reading. Apologies for the length of this reply, but I wanted to explain why I wouldn’t dismiss mixed methods outright and why I have some reservations about synthetic phonics. I wholeheartedly support the idea of using synthetic phonics (SP) to teach children to read. However, I have reservations about some of the assumptions made by SP proponents about the effectiveness of SP and about the quality of the evidence used to justify its use.

the history of mixed methods

As far as I’m aware, when education became compulsory in England in the late 19th century, reading was taught predominantly via letter-sound correspondence and analytic phonics – ‘the cat sat on the mat’ etc. A common assumption was that if people couldn’t read it was usually because they’d never been taught. What was found was that a proportion of children didn’t learn to read despite being taught in the same way as others in the class. The Warnock committee reported that teachers in England at the time were surprised by the numbers of children turning up for school with disabilities or learning difficulties. That resulted in special schools being set up for those with the most significant difficulties with learning. In France, Alfred Binet was commissioned to devise a screening test to identify learning difficulties, which evolved into the ‘intelligence test’. In Italy, Maria Montessori adapted for mainstream education methods that had been used to teach hearing-impaired children.

Research into acquired reading difficulties in adults generated an interest in developmental problems with learning to read, pioneered by James Hinshelwood and Samuel Orton in the early 20th century. The term developmental dyslexia began as a descriptive label for a range of problems with reading and gradually became reified into a ‘disorder’. Because using the alphabetic principle and analytic phonics clearly wasn’t an effective approach for teaching all children to read, and because of an increased interest in child development, researchers began to look at what adults and children actually did when reading and learning to read, rather than what it had been thought they should do.

What they found was that people use a range of cues (‘mixed methods’) to decode unfamiliar words; letter-sound correspondence, analytic phonics, recognising words by their shape, using key letters, grammar, context and pictures, for example. Educators reasoned that if some children hadn’t learned to read using alphabetic principles and/or analytic phonics, applying the strategies that people actually used when reading new words might be a more effective approach.

This idea, coinciding with an increased interest in child-led pedagogy and a belief that a species-specific genetic blueprint meant that children would follow the same developmental trajectory but at different rates, resulted in the concept of ‘reading-readiness’. The upshot was that no one panicked if children couldn’t read by 7, 9 or 11; they often did learn to read when they were ‘ready’. It’s impossible to compare the long-term outcomes of analytic phonics and mixed methods because the relevant data aren’t available. We don’t know for instance, whether children’s educational attainment suffered more if they got left behind by whole-class analytic phonics, or if they got left alone in schools that waited for them to become ‘reading-ready’.

Eventually, as is often the case, the descriptive observations about how people tackle unfamiliar words became prescriptive. Whole word recognition began to supersede analytic phonics after WW2, and in the 1960s Ken Goodman formalised mixed methods in a ‘whole language’ approach. Goodman was strongly influenced by Noam Chomsky, who believes that the structure underpinning language is essentially ‘hard-wired’ in humans. Goodman’s ideas chimed with the growing social constructivist approach to education that emphasises the importance of meaning mediated by language.

At the same time as whole language approaches were gaining ground, in England the national curriculum and standardised testing were introduced, which meant that children whose reading didn’t keep up with their peers were far more visible than they had been previously, and the complaints that had followed the introduction of whole language in the USA began to be heard here. In addition, the national curriculum appears to have focussed on the mechanics of understanding ‘texts’ rather than on reading books for enjoyment. What has also happened is that with the advent of multi-channel TV and electronic gadgets, reading has nowhere near the popularity it once had as a leisure activity amongst children, so children tend to get a lot less reading practice than they did in the past. These developments suggest that any decline in reading standards might have multiple causes, rather than ‘mixed methods’ being the only culprit.

what do I think about mixed methods?

I think Chomsky has drawn the wrong conclusions about his linguistic theory, so I don’t subscribe to Goodman’s reading theory either. Although meaning is undoubtedly a social construction, it’s more than that. Social constructivists tend to emphasise the mind at the expense of the brain. The mind is such a vague concept that you can say more or less what you like about it, but we’re very constrained by how our brains function. I think marginalising the brain is an oversight on the part of social constructivists, and I can’t see how a child can extract meaning from a text if they can’t read the words.

Patricia Kuhl’s work suggests that babies acquire language computationally, from the frequency of sound patterns within speech. This is an implicit process; the baby’s brain detects the sounds and learns the patterns, but the baby isn’t aware of the learning process, nor of phonemes. What synthetic phonics does is to make the speech sounds explicit, develop phonemic awareness and allow children to learn phoneme-grapheme correspondence and how words are constructed.

My reservations about SP are not about the approach per se, but rather about how it’s applied and the reasons assumed to be responsible for its effectiveness. In cognitive terms, SP has three main components:

• phonemic and graphemic discrimination
• grapheme-phoneme correspondence
• building up phonemes/graphemes into words – blending

How efficient children become at these tasks is a function of the frequency of their exposure to the tasks and how easy they find them. Most children pick up the skills with little effort, but anyone who has problems with any or all of the tasks could need considerably more rehearsals. Problems with the cognitive components of SP aren’t necessarily a consequence of ineffective teaching or the child not trying hard enough. Specialist SP teachers will usually be aware of this, but policy-makers, parents, or schools that simply adopt a proprietary SP course might not.
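For what it’s worth, the three components can be caricatured in a few lines of code (a toy sketch of my own, not a description of any actual SP programme – real English grapheme-phoneme correspondences are far messier than a one-letter-one-phoneme table):

```python
# Hypothetical mini grapheme-phoneme correspondence table.
# Discriminating /k/ from /g/, or 'b' from 'd', is the prior step that
# the table takes for granted.
GRAPHEME_TO_PHONEME = {
    "c": "/k/", "a": "/a/", "t": "/t/", "s": "/s/", "m": "/m/", "o": "/o/",
}

def decode(word):
    """Grapheme-phoneme correspondence: map each letter to its phoneme."""
    return [GRAPHEME_TO_PHONEME[letter] for letter in word]

def blend(phonemes):
    """Blending: build the separate phonemes back up into a spoken word."""
    return "".join(p.strip("/") for p in phonemes)

print(decode("cat"))         # ['/k/', '/a/', '/t/']
print(blend(decode("cat")))  # 'kat' - the blended pronunciation
```

A child who struggles with the discrimination step, or with holding more than three phonemes while blending, will stall however accurate the table is – which is the point made above about rehearsal.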

My son’s school taught reading using Jolly Phonics. Most of the children in his class learned to read reasonably quickly. He took 18 months over it. He had problems with each of the three elements of SP. He couldn’t tell the difference between similar-sounding phonemes – i/e or b/d, for example. He couldn’t tell the difference between similar-looking graphemes either – such as b/d, h/n or i/j. As a consequence, he struggled with some grapheme-phoneme correspondences. Even in words where his grapheme-phoneme correspondences were secure, he couldn’t blend more than three letters.

After 18 months of struggling and failing, he suddenly began to read using whole word recognition. I could tell he was doing this because of the errors he was making; he was using initial and final letters and word shape and length as cues. Recognising patterns is what the human brain does for a living, and once it’s recognised a pattern it’s extremely difficult to get it to unrecognise it. Brains are so good at recognising patterns that they often see patterns that aren’t really there – as in pareidolia or the behaviourists’ ‘superstition’. Once my son could recognise word-patterns, he was reading and there was no way he was going to be persuaded to carry on with all that tedious sounding-out business. He just wanted to get on with reading, and that’s what he did.

[Edited to add: I should point out that the reason the apparent failure of an SP programme to teach my son to read led to me supporting SP rather than dismissing it, was because after conversations with specialist SP teachers, I realised that he hadn’t had enough training in phonemic and graphemic discrimination. His school essentially put the children through the course, without identifying any specific problems or providing additional training that might have made a significant difference for him.]

When I trained as a teacher ‘mixed methods’ included a substantial phonics component – albeit as analytic phonics. I get the impression that the phonics component has diminished over time so ‘mixed methods’ aren’t what they once were. Even if they included phonics, I wouldn’t recommend ‘mixed methods’ prescriptively as an approach to teaching reading. Having said that, I think mixed methods have some validity descriptively, because they reflect the way adults/children actually read. I would recommend the use of SP for teaching reading, but I think some proponents of SP underestimate the way the human brain tends to cobble together its responses to challenges, rather than to follow a neat, straight pathway.

Advocacy of mixed methods and opposition to SP is often based on accurate observations of the strategies children use to read, not on evidence of what teaching methods are most effective. Our own personal observations tend to be far more salient to us than schools we’ve never visited reporting stunning SATs results. That’s why I think SP proponents need to ensure that the evidence they refer to as supporting SP is of a high enough quality to be convincing to sceptics.

getting it wrong from the beginning: natural learning

In my previous post, I said that I felt that in Getting It Wrong From The Beginning: Our Progressive Inheritance from Herbert Spencer, John Dewey and Jean Piaget Kieran Egan was too hard on Herbert Spencer and didn’t take sufficient account of the context in which Spencer formulated his ideas. In this post, I look in more detail at the ideas in question and Egan’s critique of them.

natural learning

Egan says that the “holy grail of progressiveness … has been to discover methods of school instruction derived from and modelled on children’s effortless learning … in households, streets and fields” (pp.38-39). In essence, progressives like Spencer see all learning as occurring in the same way, implying that children find school learning difficult only because it doesn’t take into account how they learn naturally. Their critics see school learning as qualitatively different to natural learning; it requires thinking, and thinking doesn’t come naturally and is effortful so students don’t like it.

It’s inaccurate to describe the learning children do in ‘households, streets and fields’ as ‘effortless’. Apparently effortless would be more accurate. That’s because a key factor in learning is rehearsal. Babies and toddlers spend many, many hours rehearsing their motor, language, and sensory processing skills and in acquiring information about the world around them. Adolescents do the same in respect of interacting with peers, using video games or playing in a band. Adults can become highly competent in the workplace or at cooking, motor mechanics or writing novels in their spare time. What makes this learning appear effortless is that the individuals are highly motivated to put in the effort, so the learning doesn’t feel like work. I think there are three main motivational factors in so-called ‘natural learning’: sensory satisfaction (in which I’d include novelty-seeking and mastery), social esteem and sheer necessity – if it’s a case of acquiring knowledge and skills or starving, the acquisition of knowledge and skills usually wins.

School learning tends to differ from ‘natural’ learning in two main respects. One is motivational. School learning is essentially enforced – someone else decides what you’re going to learn about, regardless of whether you want to learn about it or see an immediate need to learn about it. The other is that the breadth of the school curriculum means that there isn’t enough time for learning to occur ‘naturally’. If I were to spend a year living with a Spanish family or working for a chemist, I would learn more Spanish or chemistry naturally than I would if I had two Spanish or chemistry lessons a week at school, simply because the amount of rehearsal time would be far greater in the Spanish family or in the chemistry lab than it would be in school. Schools generally teach the rules of languages or of science explicitly, and students have to spend more time actively memorising vocabulary and formulae because there simply isn’t the time available to pick them up ‘naturally’.

progressive ‘myths’

Egan’s criticism of Spencer’s ideas centres around three core principles of progressive education; simple to complex, concrete to abstract and known to unknown – Egan calls the principles ‘myths’. Egan presents what at first appears to be a convincing demolition job on all three principles, but the way he uses the constructs involved is different to the way in which they are used by Spencer and/or by developmental psychology. Before unpacking Egan’s criticism of the core principles, I think it would be worth looking at the way he views cognition.

the concept of mind

Egan frequently refers to the concept of ‘mind’. ‘Mind’ is a useful shorthand term when referring to activities like feeling, thinking and learning, but it’s too vague a concept to be helpful when trying to figure out the fine detail of learning. Gilbert Ryle points out that even in making a distinction between mind and body, as Descartes did, we make a category error – a ‘mind’ isn’t the same sort of thing as a body, so we can’t make valid comparisons between them. If I’ve understood Ryle correctly, what he’s saying is that ‘mind’ isn’t just a different type of thing to a body, ‘mind’ doesn’t exist in the way a body exists, but is rather an emergent property of what a person does – of their ‘dispositions’, as he calls them.

Emergent properties that appear complex and sophisticated can result from some very simple interactions. An example is flocking behaviour. At first glance, the V-formation in flight adopted by geese and ducks, or the extraordinary patterns made by flocks of starlings before roosting or by fish evading a predator, look pretty complex and clever. But in fact these apparently complex behaviours can emerge from some very simple rules of thumb (heuristics), such as each bird or fish maintaining a certain distance from the birds or fish on either side of it, and moving in the general direction of its neighbours. Similarly, some human thinking can appear complex and sophisticated when in fact it’s the outcome of some simple biological processes. ‘Minds’ might not exist in the same way as bodies do, but brains are the same kind of thing as bodies and do exist in the same way as bodies do, and brains have a significant impact on how people feel, think, and learn.
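A minimal sketch of the flocking idea (standard ‘boids’-style rules of thumb, my own illustration rather than anything from Egan’s book) shows how little each individual needs to ‘know’ for coherent group behaviour to emerge:

```python
import random
from math import cos, sin

def step(positions, headings, min_dist=1.0, turn=0.1, speed=0.5):
    """One update: each agent applies two simple local rules of thumb."""
    new_headings = []
    for i, (pos, heading) in enumerate(zip(positions, headings)):
        others = [j for j in range(len(positions)) if j != i]
        too_close = [j for j in others
                     if abs(positions[j][0] - pos[0]) + abs(positions[j][1] - pos[1]) < min_dist]
        if too_close:
            # Rule 1: keep your distance - veer away when neighbours crowd in
            new_headings.append(heading + turn)
        else:
            # Rule 2: drift towards the general direction of your neighbours
            avg = sum(headings[j] for j in others) / len(others)
            new_headings.append(heading + turn * (avg - heading))
    new_positions = [(x + speed * cos(h), y + speed * sin(h))
                     for (x, y), h in zip(positions, new_headings)]
    return new_positions, new_headings

random.seed(0)
positions = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(20)]
headings = [random.uniform(0, 6.28) for _ in range(20)]
for _ in range(100):
    positions, headings = step(positions, headings)
# No agent 'knows' about the flock, yet coherent group movement emerges.
```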

the brain and learning

Egan appeals to Fodor’s model of the brain in which “we have fast input systems and a slower, more deliberative central processor” (p.39). Fodor’s fast and ‘stupid’ input systems are dedicated to processing particular types of information and work automatically, meaning that we can’t not learn things like motor skills or language. Fodor is broadly correct in his distinction, but I think Egan has drawn the wrong conclusions from this idea. A core challenge in research is that often more than one hypothesis offers a plausible explanation for a particular phenomenon. The genius of research is in eliminating the hypotheses that actually don’t explain the phenomenon. But if you’re not familiar with a field and you’re not aware that there are competing hypotheses, it’s easy to assume that there’s only one explanation for the data. This is what Egan appears to do in relation to cognitive processes; he sees the cognitive data through the spectacles of a model that construes natural learning as qualitatively different to the type of learning that happens in school.

Egan assumes that the apparent ease with which children learn to recognise faces or pick up languages, and the fact that there are dedicated brain areas for face recognition and for language, imply that those functions are inbuilt automatic systems that result in effortless learning. But that’s not the only hypothesis in town. What’s equally possible is that face-recognition and language need to be learned. There’s general agreement that the human brain is hard-wired to extract signals from noise – to recognise patterns – but the extent to which patterns are identified and learned depends on the frequency of exposure to the patterns. For most babies, human facial features are the first visual pattern they see, and it’s one they see a great many times during their first day of life, so it’s not surprising that, even at a few hours old, they ‘prefer’ facial features the right way up rather than upside down. It’s a relatively simple pattern, so would be learned quickly. Patricia Kuhl’s work on infants’ language acquisition suggests that a similar principle is in operation in relation to auditory information – babies’ brains extract patterns from the speech they hear, and the rate at which the patterns are extracted is a function of the frequency of exposure to speech. The patterns in speech are much more complex than facial features, so language takes much longer to learn.
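The frequency-of-exposure idea can be illustrated with a toy model of statistical learning (in the spirit of the infant research, but my own sketch, not Kuhl’s method): the learner simply tracks how often one syllable follows another, and the strongest transitions are the ones that have been heard most often.

```python
from collections import Counter
from itertools import pairwise  # Python 3.10+

# A stream of syllables with no pauses, as an infant might hear it
stream = "ba by go ba by milk ba by dog go milk".split()

pair_counts = Counter(pairwise(stream))
syllable_counts = Counter(stream)

def transitional_probability(a, b):
    """How often syllable b follows syllable a, given a occurred."""
    return pair_counts[(a, b)] / syllable_counts[a]

print(transitional_probability("ba", "by"))  # 1.0  - 'by' always follows 'ba'
print(transitional_probability("by", "go"))  # ~0.33 - a weaker transition
```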

Egan’s understanding of mind and brain colours the way he views Spencer’s principles. He also uses the constructs embedded in the principles in a different way to Spencer. As a consequence, I feel his case against the principles is considerably weakened.

the three principles of progressive education

simple to complex

Spencer’s moment of epiphany with regard to education was when he realised that the gradual transition from simple to complex observed in the evolution of living organisms, the way human societies have developed and the pre-natal development of the foetus, also applied to the way human beings learn. Egan points out that this idea was challenged by the discovery of the second law of thermodynamics which states that isolated systems evolve towards maximum entropy – in other words complexity tends to head towards simplicity, the opposite of what Spencer and the evolutionists were claiming. What critics overlook is that although the second law of thermodynamics applies to the isolated system of the universe as a whole and any isolated system within it, most systems in the universe aren’t isolated. Within the vast, isolated universe system, subatomic particles, chemicals and living organisms are interacting with each other all the time. If that wasn’t the case, complex chemical reactions wouldn’t happen, organisms wouldn’t change their structure and babies wouldn’t be born. I think Egan makes a valid point about early human societies not consisting of simple savages, but human societies, like the evolution of living organisms, chemical reactions, the development of babies and the way people learn if left to their own devices, do tend to start simple and move towards complex.

Egan challenges the application of this principle to education by suggesting that the thinking of young children can be very complex, as exemplified by their vivid imaginations and “mastering language and complex social rules when most adults can’t program a VCR” (p.62). He also claims this principle has “hidden and falsified those features of children’s thinking that are superior to adults’” (p.90), namely children’s use of metaphor, which he says declines once they become literate (p.93). I think Egan is right that Spencer’s idea of cognition unfolding along a predetermined straight developmental line from simple to complex is too simplistic and doesn’t pay enough attention to the role of the environment. But I think he’s mistaken in suggesting that language, social behaviour and metaphor are examples of complex thinking in children. Egan himself attributes young children’s mastery of language and complex social rules to Fodor’s ‘stupid’ systems, which is why they are often seen as a product of ‘natural’ learning. Children might use metaphor more frequently than adults, but that could equally well be because adults have wider vocabularies, more precise terminology and simply don’t need to use metaphor so often. Frequency isn’t the same as complexity. Research into children’s motor, visuo-spatial, auditory, and cognitive skills all paints the same picture: their development starts simple and gets more complex over time.

concrete to abstract

By ‘abstract’ Spencer appears to have meant the abstraction of rules from concrete examples: the rules of grammar from speech, algebraic rules from mathematical relationships, the laws of physics and chemistry from empirical observations and so on. Egan’s idea of ‘abstract’ is different – he appears to construe it as meaning ‘intangible’. He claims that children are capable of abstract thought because they have no problem imagining things that don’t exist, giving the example of Beatrix Potter’s Peter Rabbit (p.61). Peter Rabbit certainly isn’t concrete in the sense of actually existing in the real world, but all the concepts children need to comprehend his story are very concrete indeed; they include rabbits, items of clothing, tools, vegetables and gardens. And the ‘abstract’ emotions involved – anger, fear, security – are all ones with which children would be very familiar. Egan isn’t using ‘abstract’ in the same way as Spencer. Egan also claims that children’s ability to understand symbolic relationships means that Spencer was wrong. However, as Egan points out, symbols are ‘arbitrarily connected with what they symbolize’ and the ‘ready grasp of symbols’ is found in ‘children who are exposed to symbols’, which suggests that actually the children’s thinking does start with the concrete (what the symbols represent) and moves towards the abstract (the symbols and their arbitrary connection with what they symbolize). Spencer might have over-egged the pudding with respect to the concrete-to-abstract principle, but I don’t think Egan manages to demonstrate that he was wrong.

known to unknown

Spencer was also insistent that education should start with what children knew – the things that were familiar to them in their own homes and communities. Egan raises several objections to this idea (pp.63-64):

1. “if this is a fundamental principle of human learning, there is no way the process can begin”
2. ‘if novelty – that is things unconnected with what is already known – is the problem … reducing the amount of novelty doesn’t solve the problem”
3. this principle has dumbed down the curriculum and comes close to “contempt for children’s intelligence”
4. “ this is the four-legged fly item … no one’s understanding of the world … expands according to this principle of gradual content association”

With regard to point 1, Spencer clearly wasn’t saying we have to know something in order to know anything else. What he was saying is that trying to get children to learn things that are completely unconnected with what they already know is likely to end in failure.

I can’t see how, in point 2, reducing the amount of novelty doesn’t solve the problem. If I were to attend a lecture delivered in Portuguese about the Higgs boson, the amount of novelty involved would be so high (I know only one Portuguese word and little about sub-atomic physics) that I would be likely to learn nothing. If, however, it was a Royal Institution Christmas Lecture in English for a general audience, the amount of novelty would be considerably reduced and I would probably learn a good deal. Exactly how much would depend on my prior knowledge about sub-atomic physics.

I do agree with Egan’s point 3, in the sense that taking this principle to extremes would result in an impoverished curriculum, but that’s a problem with implementation rather than the principle itself.

It’s ironic that Egan describes point 4 as the ‘four-legged fly’ item, since work on brain plasticity suggests that gradual content association, via the formation of new synapses, is precisely the way in which human beings do expand their understanding of the world. If we come across information with massive novel content, we tend to simply ignore it because of the time required to gather the additional information we need in order to make sense of it.
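Gradual content association can be caricatured with a Hebbian-style toy model (my own sketch, not a claim about Egan’s or anyone else’s account): each co-occurrence strengthens a connection a little, so well-rehearsed associations build up gradually while one-off pairings with lots of novel content leave only a weak trace.

```python
from collections import defaultdict

weights = defaultdict(float)   # association strength between pairs of 'concepts'
LEARNING_RATE = 0.1

def co_activate(a, b):
    """Each co-activation nudges the connection a little closer to 1."""
    key = tuple(sorted((a, b)))
    weights[key] += LEARNING_RATE * (1.0 - weights[key])

# Something encountered in many contexts becomes strongly associated...
for _ in range(20):
    co_activate("dog", "bark")
# ...while a one-off pairing leaves only a weak trace.
co_activate("dog", "quark")

print(round(weights[("bark", "dog")], 2))   # ~0.88
print(round(weights[("dog", "quark")], 2))  # 0.1
```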

a traditional-liberal education

Egan’s critique of Spencer’s ideas is a pretty comprehensive one. For him, Spencer’s ideas are like the original version of the curate’s egg – not that parts of them are excellent, but that they are totally inedible. Egan says “I have already indicated that I consider the traditional-liberal principles equally as problematic as the progressive beliefs I am criticising” (p.54), but I couldn’t see where he’d actually done so.

A number of times Egan refers with apparent approval to some of the features commonly associated with a traditional-liberal education. He’s clearly uneasy about framing education in utilitarian terms, as Spencer did, but then Spencer was criticising a curriculum that was based on tradition and “the ornamental culture of the leisured class”. In the section entitled “What is wrong with Spencer’s curriculum?” (p.125ff) Egan highlights Spencer’s dismissal of grammar, history, Latin and the ‘useless arts’. In doing so, I think he has again overlooked the situation that Spencer was addressing.

As I understand it, the reason that Greek and Latin were originally considered essential to education was that for centuries in Europe, ancient Greek and Latin texts were the principal source of knowledge, as well as Latin being the lingua franca. From the Greek and Latin texts, you could get a broad understanding of what was known about literature, history, geography, theology, science, mathematics, politics, economics and law. If they understood what worked and what went wrong in Greek and Roman civilisations, boys from well-to-do families – the future movers and shakers – would be less likely to repeat the errors of previous generations. Over time, as contemporary knowledge increased and books were more frequently written in the vernacular, the need to learn Greek and Latin became less important; it persisted often because it was traditional, rather than because it was useful.

I’ve noticed that the loudest cries for reform of the education system in the English-speaking world have come from those with a background in subjects that involve high levels of abstraction; English, history, mathematics, philosophy. Egan’s special interest is in imaginative education. I’ve heard hardly a peep from scientists, geographers or PE teachers. It could be that highly abstracted subjects have been victims of the worst excesses of progressivism – or that in highly abstracted subjects there’s simply more scope for differences of opinion about subject content. I can understand why Egan is wary of utility being the guiding principle for education; it’s too open to exploitation by business and politicians, and education needs to do more than train an efficient workforce. But I’m not entirely clear what Egan wants to see in its place. He appears to see education as primarily for cultural purposes; so we can all participate in what Oakeshott called ‘the conversation of mankind’, a concept mentioned by other new traditionalists, such as Robert Peal and Toby Young. Egan sees a good education as needing to include grammar, Latin and history because they are pieces of the complex image that makes up ‘what we expect in an educated person'(p.160). I can see what he’s getting at, but this guiding principle for education is demonstrably unhelpful. We’ve been arguing about it at least since Spencer’s day, and have yet to reach a consensus.

In my view, education isn’t about a cultural conversation or about utility, although it involves both. But it should be useful. The more people who get a good knowledge and understanding of all aspects of how the world works, the more likely our communities are to achieve a good, sustainable standard of living and a decent quality of life. We need our education system to produce people who make the world a better place, not just people who can talk about it.

the curate’s egg, the emperor’s new clothes and Aristotle’s flies: getting it wrong from the beginning

Alongside a recommendation to read Robert Peal’s Progressively Worse, came another to read Kieran Egan’s Getting It Wrong From The Beginning: Our Progressive Inheritance from Herbert Spencer, John Dewey and Jean Piaget. Egan’s book is in a different league to Peal’s; it’s scholarly, properly referenced and published by a mainstream publisher not a think-tank. Although it appears to be about Spencer, Dewey and Piaget, Egan’s critique is aimed almost solely at Spencer; Piaget’s ideas are addressed, but Dewey hardly gets a look in. During the first chapter – a historical sketch of Spencer and his ideas – Egan and I got along swimmingly. Before I read this book my knowledge of Spencer would have just about filled a postage stamp (I knew he was a Victorian polymath who coined the term ‘survival of the fittest’) so I found Egan’s account of Spencer’s influence illuminating. But once his analysis of Spencer’s ideas got going, we began to part company.

My first problem with Egan’s analysis was that I felt he was unduly hard on Spencer. There is a sense in which he has to be, because he lays at Spencer’s feet the blame for most of the ills of the education systems in the English-speaking world. Spencer is portrayed as someone who dazzled the 19th century public in the UK and America with his apparently brilliant ideas, which were then rapidly discredited towards the end of his life; soon after his death he was forgotten. Yet Spencer, according to Egan, laid the foundation for the progressive ideas that form the basis for the education system in the US and the UK. That poses a problem for Egan, because he then has to explain why, if Spencer’s ideas were so bad that academia and the public dismissed them, in education they have not only persisted but flourished in the century since his death.

misleading metaphors

Egan tackles this conundrum by appealing to three metaphors: the curate’s egg, the emperor’s new clothes and Aristotle’s flies. The curate’s egg – ‘good in parts’ – is often used to describe something of variable quality, but Egan refers to the original Punch cartoon in which the curate, faced with a rotten egg for breakfast, tries to be polite to his host, the bishop. The emperor’s new clothes require no explanation. In other words, Egan explains the proliferation of Spencer’s educational theories as partly down to deference to someone who was once considered a great thinker, and partly to people continuing to believe something despite the evidence of their own eyes.

Bishop: “I’m afraid you’ve got a bad egg, Mr Jones”; Curate: “Oh, no, my Lord, I assure you that parts of it are excellent!”

Aristotle’s flies

The Aristotle’s flies metaphor does require more explanation. Egan claims “Aristotle’s spells are hard to break. In a careless moment he wrote that flies have four legs. Despite the easy evidence of anyone’s eyes, his magisterial authority ensured that this “fact” was repeated in natural history texts for more than a thousand years” (p.42). In other words, Spencer’s ideas, derived ultimately from Aristotle’s, have, like Aristotle’s, been perpetuated because of his ‘magisterial authority’ – something which Egan claims Spencer lost.

It’s certainly true that untruths can be perpetuated for many years through lazy copying from one text to another. But these are usually untruths that are hard to disprove – the causes of fever, the existence of the Loch Ness monster or, in Aristotle’s case, the idea that the brain cooled the blood – not untruths that could be dispelled by a few seconds’ observation by any child capable of counting to six. Aristotle’s alleged ‘careless moment’ caught my attention because ‘legs’ pose a particular challenge for comparative anatomists. Aristotle was interested in comparative anatomy and was a keen and careful observer of nature. It’s unlikely that he would have had such a ‘careless moment’; it’s much more likely that the error was due to a mistranslation.

The challenge of ‘legs’ is that in nature they have a tendency over time to morph into other things – arms in humans and wings in birds, for example. Anyone who has observed a housefly for a few seconds will know that houseflies frequently use their first pair of legs for grooming – in other words, as arms. I thought it quite possible that Aristotle had categorised the first pair of fly legs as ‘arms’, so I looked for the reference. Egan doesn’t give it, but the story about the four-legged fly idea being perpetuated for a millennium is a popular one. In 2005 it appeared in an article in the journal European Molecular Biology Organisation Reports and was subsequently challenged in 2008 in a zoology blog.

male mayfly

Aristotle’s observation is in a passage on animal locomotion, and the word for ‘fly’ – ephemeron – is translated by D’Arcy Thompson as ‘dayfly’, also commonly known as the mayfly (order Ephemeroptera, named for their short adult life). In mayflies the first pair of legs is enlarged and often held forward off the ground, as the males use them for grasping the female during mating. So the fly does walk on four legs – the point Aristotle was making. Egan’s book was published in 2002, before this critique was written, but even before the advent of the internet it wouldn’t have been difficult to check Aristotle’s text – in Greek or in translation.

Spencer in context

I felt also that much of Egan’s criticism of Spencer was made from the vantage point of hindsight. Spencer was formulating his ideas whilst arguments about germ theory were ongoing, before the publication of On the Origin of Species, before the American Civil War, before all men (never mind women) were permitted to vote in the UK or the US, before state education was implemented in England, and a century before the discovery of the structure of DNA. His ideas were widely criticised by his contemporaries, but that doesn’t mean he was wrong about everything.

It’s also important to set Spencer’s educational ideas in context. He was writing in an era when mass education systems were in their infancy and schools were often significantly under-resourced. Textbooks and exercise books were unaffordable not just for most families, but for many schools. Consequently schools frequently resorted to the age-old practice of getting children to memorise, not just the alphabet and multiplication tables, but everything they were taught. Text committed to memory could be the only access to books that many people might get during their lifetime. If the children didn’t have books, they couldn’t take material home to learn, so the memorising had to be done in school. Memorisation takes time, so teachers were faced with a time constraint and a dilemma – whether to prioritise remembering or explaining. Not surprisingly, memorisation tended to win, on the grounds that understanding could always come later. Consequently, many children could recite a lot of text but hadn’t got a clue what it meant. For many, having at least learned to read and write at school, their education actually began after they left, once they had earned enough money to buy books themselves or could borrow them from libraries. This is the rote learning referred to as ‘vicious’ by early progressive educators.

The sudden demand for teachers when mass education systems were first rolled out meant that schools had to take whatever teachers they could get. Many had experience but no training, and simply expected children from very different backgrounds to those they had previously taught to learn the same material – reciting the grammatical rules of standard English, for example, when the children knew only their local dialect, with its different pronunciation, vocabulary and grammatical structure. For children in other parts of the UK it was literally a different language. The history of England, with its list of Kings and Queens, was essentially meaningless to children whose only prior access to their nation’s history was a few stories passed down orally.

This was why Spencer placed so much emphasis on the principles of simple to complex, concrete to abstract and known to unknown. Without those starting points, many children’s experience of education was one of bobbing about in a sea of incomprehension and getting more lost as time went by – and Spencer was thinking of middle-class children, not working-class ones, for whom the challenge would have been greater. The problem with Spencer’s ideas was that they were extended beyond what George Kelly called their ‘range of convenience’; they were taken to unnecessary extremes that were indeed at risk of insulting children’s intelligence.

In the next post, I take a more detailed look at Egan’s critique of Spencer’s ideas.

Kirschner, Sweller & Clark: a summary of my critique

It’s important not just to know things, but to understand them, which is why I took three posts to explain my unease about the paper by Kirschner, Sweller & Clark. From the responses I’ve received, I appear to have over-elaborated my explanation but understated my key points, so for the benefit of anybody unable or unwilling to read all the words, here’s a summary.

1. I have not said that Kirschner, Sweller & Clark are wrong to claim that working memory has a limited capacity. I’ve never come across any evidence that says otherwise. My concerns are about other things.

2. The complex issue of approaches to learning and teaching is presented as a two-sided argument. Presenting complex issues in an oversimplified way invariably obscures rather than clarifies the debate.

3. The authors appeal to a model of working memory that’s almost half a century old, rather than one revised six years before their paper came out and widely accepted as more accurate. Why would they do that?

4. They give the distinct impression that long-term memory isn’t subject to working memory constraints, when it is very much subject to them.

5. They completely omit any mention of the biological mechanisms involved in processing information. Understanding the mechanisms is key if you want to understand how people learn.

6. They conclude that explicit, direct instruction is the only viable teaching approach based on the existence of a single constraining factor – the capacity of working memory to process yet-to-be-learned information (though exactly what they mean by yet-to-be-learned isn’t explained). In a process as complex as learning, it’s unlikely that there will be only one constraining factor.

Kirschner, Sweller & Clark appear to have based their conclusion on a model of memory that was current in the 1970s (I know because that’s when I first learned about it), to have ignored subsequent research, and to have oversimplified the picture at every available opportunity.

What also concerns me is that some teachers appear to be taking what Kirschner, Sweller & Clark say at face value, without making any attempt to check the accuracy of their model, to question their presentation of the problem or the validity of their conclusion. There’s been much discussion recently about ‘neuromyths’. Not much point replacing one set of neuromyths with another.

Reference
Kirschner, PA, Sweller, J & Clark, RE (2006). Why Minimal Guidance During Instruction Does Not Work: An Analysis of the Failure of Constructivist, Discovery, Problem-Based, Experiential, and Inquiry-Based Teaching. Educational Psychologist, 41, 75-86.

cognitive load and learning

In the previous two posts I discussed the model of working memory used by Kirschner, Sweller & Clark and how working memory and long-term memory function. The authors emphasise that their rejection of minimal guidance approaches to teaching is based on the limited capacity of working memory in respect of novel information, and that even if experts might not need much guidance, “…nearly everyone else thrives when provided with full, explicit instructional guidance (and should not be asked to discover any essential content or skills)” (Clark, Kirschner & Sweller, p.6). Whether they are right or not depends on what they mean by ‘novel’ information.

So what’s new?

Kirschner, Sweller & Clark define novel information as ‘new, yet to be learned’ information that has not been stored in long-term memory (p.77). But novelty isn’t a simple case of information being either yet-to-be-learned or stored-in-long-term-memory. If I see a Russian sentence written in Cyrillic script, its novelty value to me on a scale of 1-10 would be about 9. I can recognise some Cyrillic letters and know a few Russian words, but my working memory would be overloaded after about the third letter because of the multiple operations involved in decoding, blending and translating. A random string of Arabic numerals would have a novelty value of about 4, however, because I am very familiar with Arabic numerals; the only novelty would be in their order in the string. The sentence ‘the cat sat on the mat’ would have a novelty value close to zero because I’m an expert at chunking the letter patterns in English and I’ve encountered that sentence so many times.

Because novelty isn’t an either/or thing but sits on a sliding scale, and because the information coming into working memory can vary between simple and complex, ‘new, yet to be learned’ information can vary in both complexity and novelty.

You could map it on a 2×2 matrix like this:

novelty, complexity & cognitive load

A sentence such as ‘the monopsonistic equilibrium at M should now be contrasted with the equilibrium that would obtain under competitive conditions’ is complex (it contains many bits of information) but its novelty content would depend on the prior knowledge of the reader. It would score high on both the novelty and complexity scales for the average 5-year-old. I don’t understand what the sentence means, but I do understand many of the words, so it would be mid-range in both novelty and complexity for me. An economist would probably give it a 3 for complexity but 0 for novelty. Trying to teach a 5-year-old what the sentence meant would completely overload their working memory, but it would be a manageable challenge for mine, and an economist would probably just feel bored.
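To make the idea concrete, here’s a toy sketch of my own – not anything proposed by Kirschner, Sweller & Clark – in which novelty and complexity are treated as rough 0-10 scores for a particular learner and their combination is used to flag a likely working memory overload. The scoring function and the overload threshold are invented purely for illustration.

```python
# Toy illustration of the novelty/complexity point made above.
# The scores, the formula and the overload threshold are invented for
# illustration; they are not taken from Kirschner, Sweller & Clark or
# from any model of working memory.

def cognitive_load(novelty: float, complexity: float) -> float:
    """Crude proxy: load rises with both novelty and complexity (0-10 scales)."""
    return (novelty * complexity) / 10  # 0 = trivial, 10 = hopeless

def describe(learner: str, novelty: float, complexity: float,
             overload_at: float = 6.0) -> str:
    load = cognitive_load(novelty, complexity)
    verdict = "likely overload" if load >= overload_at else "manageable"
    return f"{learner}: novelty={novelty}, complexity={complexity}, load~{load:.1f} ({verdict})"

# The 'monopsonistic equilibrium' sentence, with guessed scores for three readers
print(describe("five-year-old", novelty=9, complexity=8))
print(describe("me",            novelty=5, complexity=5))
print(describe("economist",     novelty=0, complexity=3))
```

The formula itself is arbitrary; any function that increases with both scores would make the same point, which is that the identical sentence lands in different cells of the matrix for different learners.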

Kirschner, Sweller & Clark reject ‘constructivist, discovery, problem-based, experiential and inquiry-based approaches’ on the basis that they overload working memory and the excessive cognitive load means that learners don’t learn as efficiently as they would using explicit direct instruction. If only it were that simple.

‘Constructivist, discovery, problem-based, experiential and inquiry-based approaches’ were adopted initially not because teachers preferred them or because philosophers thought they were a good idea, but because by the end of the 19th century explicit, direct instruction – the only game in town for fledgling mass education systems – clearly wasn’t as effective as people had thought it would be. Alternative approaches were derived from three strategies that young children apply when learning ‘naturally’.

How young children learn

Human beings are mammals and young mammals learn by applying three key learning strategies which I’ll call ‘immersion’, trial-and-error and modelling (imitating the behaviour of other members of their species). By ‘strategy’, I mean an approach that they use, not that the baby mammals sit down and figure things out from first principles; all three strategies are outcomes of how mammals’ brains work.

Immersion

Most young children learn to walk, talk, feed and dress themselves and acquire a vast amount of information about their environment with very little explicit, direct instruction. And they acquire those skills pretty quickly and apparently effortlessly. The theory was that if you put school-age children in a suitable environment, they would pick up other skills and knowledge equally effortlessly, without the boredom of rote-learning and the grief of repeated testing. Unfortunately, what advocates of discovery, problem-based, experiential and inquiry-based learning overlooked was the sheer amount of repetition involved in young children learning ‘naturally’.

Although babies’ learning is kick-started by some hard-wired processes such as reflexes, babies have to learn to do almost everything. They repeatedly rehearse their gross motor skills, fine motor skills and sensory processing. They practise babbling, crawling, toddling and making associations at every available opportunity. They observe things and detect patterns. A relatively simple skill like face-recognition, grasping an object or rolling over might take only a few attempts. More complex skills like using a spoon, crawling or walking take more. Very complex skills like using language require many thousands of rehearsals; it’s no coincidence that children’s speech and reading ability take several years to mature and their writing ability (an even more complex skill) doesn’t usually mature until adulthood.

The reason why children don’t learn to read, do maths or learn foreign languages as ‘effortlessly’ as they learn to walk or speak in their native tongue is largely down to the number of opportunities they have to rehearse those skills. An hour a day of reading or maths and a couple of French lessons a week bear no resemblance to the ‘immersion’ in motor development and their native language that young children are exposed to. Inevitably, it will take them longer to acquire those skills. And if they take an unusually long time, it’s the child, the parent, the teacher or the teaching method that tends to get the blame, not the mechanism by which the skill is acquired.
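The size of that gap in rehearsal opportunity is easy to underestimate, so here’s a back-of-envelope comparison. All of the figures (waking hours, lesson lengths, school weeks) are my own rough assumptions, not data from any study.

```python
# Back-of-envelope comparison of rehearsal opportunities per year.
# Every figure here is a rough assumption chosen for illustration only.

waking_hours_per_day = 10      # assumed hours a young child spends surrounded by spoken native language
native_language_hours = waking_hours_per_day * 365

reading_hours = 1 * 365        # 'an hour a day' of reading or maths practice
french_hours = 2 * 0.75 * 39   # two 45-minute lessons a week over ~39 school weeks

print(f"native language immersion : ~{native_language_hours:,.0f} hours/year")
print(f"daily reading practice    : ~{reading_hours:,.0f} hours/year")
print(f"weekly French lessons     : ~{french_hours:,.0f} hours/year")
```

On those assumptions the ratio between native-language immersion and French lessons is roughly 60:1, which is the scale of difference that the ‘effortless’ comparison glosses over.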

Trial-and-error

The second strategy is trial-and-error. It plays a key role in the rehearsals involved in immersion, because it provides feedback to the brain about how the skill or knowledge is developing. Some skills, like walking, talking or handwriting, can only be acquired through trial-and-error because of the fine-grained motor feedback that’s required. Learning by trial-and-error can offer very vivid, never-forgotten experiences, regardless of whether the initial outcome is success or failure.

Modelling

The third strategy is modelling – imitating the behaviour of other members of the species (and sometimes other species or inanimate objects). In some cases, modelling is the most effective way of teaching because it’s difficult to explain (or understand) a series of actions in verbal terms.

Cognitive load

This brings us back to the issue of cognitive load. It isn’t the case that immersion, trial-and-error and modelling, or discovery, problem-based, experiential and inquiry-based approaches, always impose a high cognitive load, and that explicit direct instruction doesn’t. If that were true, young children would have to be actively taught to walk and talk, and older ones would never forget anything. The problem with all these educational approaches is that each has initially been seen as appropriate for teaching all knowledge and skills, and has subsequently been rejected as ineffective. That’s not at all surprising, because different types of knowledge and skill require different strategies for effective learning.

Cognitive load is also affected by the complexity of incoming information and by how novel it is to the learner. Nor is cognitive load confined to the capacity of working memory. Forty minutes of explicit, direct instruction in novel material, even if presented in well-paced, working-memory-sized chunks, would pose a significant challenge to most brains. The reason, as I pointed out previously, is that the transfer of information from working memory to long-term memory is a biological process that takes time, resources and energy. Research into changes in the motor cortex suggests that the time involved might be as little as a few hours, but even that has implications for the pace at which students are expected to learn and for how much new information they can process. There’s a reason why someone would find acquiring large amounts of new information tiring – their brain uses up a considerable amount of glucose embedding that information in the form of neural connections. The inevitable delay between information coming into the brain and being embedded in long-term memory suggests that down-time is as important as learning time – calling into question the assumption that the longer children spend actively ‘learning’, the more they will know.
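To put rough numbers on that, here’s a sketch of the arithmetic. The working memory capacity figure is a commonly cited ballpark (around four chunks of genuinely novel material); the pacing figure and the lesson length are my own assumptions for illustration.

```python
# Rough arithmetic: how much novel material piles up in one well-paced lesson.
# ~4 chunks is a commonly cited ballpark for working memory capacity with
# novel material; the pacing figure is purely an assumption for illustration.

lesson_minutes = 40
chunks_per_load = 4        # ballpark working-memory capacity for novel items
minutes_per_load = 2       # assumed time to present and rehearse one load

loads = lesson_minutes // minutes_per_load
items_presented = loads * chunks_per_load

print(f"{loads} working-memory loads in a {lesson_minutes}-minute lesson")
print(f"~{items_presented} novel items presented")
print("Consolidation into long-term memory takes hours, so presenting the")
print("material is not the same thing as the material having been learned.")
```

The point of the sketch is simply that even well-chunked presentation can outrun consolidation; it is not an estimate of what anyone actually retains.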

Final thoughts

If I were forced to choose between constructivist, discovery, problem-based, experiential and inquiry-based approaches to learning on the one hand and explicit, direct instruction on the other, I’d plump for explicit, direct instruction, because the world we live in works according to discoverable principles and it makes sense to teach kids what those principles are rather than expect them to figure them out for themselves. However, it would have to be a forced choice, because we do learn through constructing our knowledge and through discovery, problem-solving, experiencing and inquiring, as well as by explicit, direct instruction. The most appropriate learning strategy will depend on the knowledge or skill being learned.

The Kirschner, Sweller & Clark paper left me feeling perplexed and rather uneasy. I couldn’t understand why the authors frame the debate about educational approaches in terms of minimal guidance ‘on one side’ and direct instructional guidance ‘on the other’, when the debate is self-evidently more complex than that. Nor why they refer to Atkinson & Shiffrin’s model of working memory when Baddeley & Hitch’s more complex model is so widely accepted as more accurate. Nor why they omit any mention of the biological mechanisms involved in learning; not only are those mechanisms responsible for the way working memory and long-term memory operate, they also shed light on why no single educational approach works for all knowledge, all skills – or even all students.
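For readers who haven’t met the two models, here’s a deliberately bare sketch of what each one names as its parts. It lists components only; the real differences lie in how the models say those components interact, which a few lines of code can’t capture.

```python
# A deliberately bare comparison of the two models referred to above.
# Only the named components are listed; the models' claims about how the
# components interact are what actually matter, and are not shown here.

ATKINSON_SHIFFRIN_1968 = {
    "sensory register": "very brief storage of incoming sensory information",
    "short-term store": "limited capacity; information is lost unless rehearsed",
    "long-term store": "durable storage of effectively unlimited capacity",
}

BADDELEY_HITCH_WORKING_MEMORY = {
    "central executive": "directs attention and coordinates the subsystems",
    "phonological loop": "holds and rehearses speech-based information",
    "visuospatial sketchpad": "holds visual and spatial information",
    "episodic buffer": "added in 2000; binds material into integrated episodes",
}

for name, components in [("Atkinson & Shiffrin (1968)", ATKINSON_SHIFFRIN_1968),
                         ("Baddeley & Hitch (as revised in 2000)", BADDELEY_HITCH_WORKING_MEMORY)]:
    print(name)
    for part, role in components.items():
        print(f"  - {part}: {role}")
```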

I felt it was ironic that the authors place so much emphasis on the way novices think but present a highly complex debate in binary terms – a classic feature of the way novices organise their knowledge. What was also ironic was that despite their emphasis on explicit, direct instruction, they failed to mention several important features of memory that would have helped a lay readership understand how memory works. This is all the more puzzling because some of these omissions (and a more nuanced model of instruction) are referred to in a paper on cognitive load by Paul Kirschner published four years earlier.

In order to fully understand what Kirschner, Sweller & Clark are saying, and to decide whether or not they are right, you’d need a fair amount of background knowledge about how brains work. To explain that clearly to a lay readership, and to address possible objections to their thesis, the authors would have had to extend the paper’s length by at least 50%. Their paper is just over 10,000 words long, which suggests that word-count constraints might have forced them to omit some points. That said, Educational Psychologist doesn’t currently apply a word limit, so maybe the authors were simply trying to keep the concepts as simple as possible.

Simplifying complex concepts for the benefit of a lay readership can certainly make things clearer, but over-simplifying them runs the risk of giving the wrong impression, and I think there’s a big risk of that happening here. Although the authors make it clear that explicit direct instruction can take many forms, they do appear to be proposing a one-size-fits-all approach that might not be appropriate for all knowledge, all skills or all students.

References

Clark, RE, Kirschner, PA & Sweller, J (2012). Putting students on the path to learning: The case for fully guided instruction. American Educator, Spring.

Kirschner, PA (2002). Cognitive load theory: implications of cognitive load theory on the design of learning. Learning and Instruction, 12, 1–10.

Kirschner, PA, Sweller, J & Clark, RE (2006). Why Minimal Guidance During Instruction Does Not Work: An Analysis of the Failure of Constructivist, Discovery, Problem-Based, Experiential, and Inquiry-Based Teaching. Educational Psychologist, 41, 75-86.