A tale of two Blobs

The think-tank Civitas has just published a 53-page pamphlet written by Toby Young and entitled ‘Prisoners of The Blob’. ‘The Blob’, for the uninitiated, is the name applied by the UK’s Secretary of State for Education, Michael Gove, to ‘leaders of the teaching unions, local authority officials, academic experts and university education departments’, described by Young as ‘opponents of educational reform’. The name’s not original. Young says it was coined by William J Bennett, a former US Education Secretary; it was also used by Chris Woodhead, a former Chief Inspector of Ofsted, in his book Class War.

It’s difficult to tell whether ‘The Blob’ is actually an amorphous fog-like mass whose members embrace an identical approach to education, as Young claims, or whether such a diverse range of people espouse such a diverse range of views that people who would like life to be nice and straightforward find the differences hard to grasp.

Young says;

“They all believe that skills like ‘problem-solving’ and ‘critical thinking’ are more important than subject knowledge; that education should be ‘child-centred’ rather than ‘didactic’ or ‘teacher-led’; that ‘group work’ and ‘independent learning’ are superior to ‘direct instruction’; that the way to interest children in a subject is to make it ‘relevant’; that ‘rote-learning’ and ‘regurgitating facts’ is bad, along with discipline, hierarchy, routine and anything else that involves treating the teacher as an authority figure. The list goes on.” (p.3)

It’s obvious that this is a literary device rather than a scientific analysis, but that’s what bothers me about it.

Initially, I had some sympathy with the advocates of ‘educational reform’. The national curriculum had a distinctly woolly appearance in places, enforced group-work and being required to imagine how historical figures must have felt drove my children to distraction, and the approach to behaviour management at their school seemed incoherent. So when I started to come across references to evidence-based educational reform, and to the importance of knowledge and of skills being domain-specific, I was relieved. When I found that applying findings from cognitive science to education was being advocated, I got quite excited.

My excitement was short-lived. I had imagined that a community of researchers had been busily applying cognitive science findings to education, that the literatures on learning and expertise were being thoroughly mined and that an evidence-based route-map was beginning to emerge. Instead, I kept finding references to the same small group of people.

Most fields of discourse are dominated by a few individuals. Usually they are researchers responsible for significant findings or major theories. A new or specialist field might be dominated by only two or three people. The difference here is that education straddles many different fields of discourse (biology, psychology, sociology, philosophy and politics, plus a range of subject areas) so I found it a bit odd that the same handful of names kept cropping up. I would have expected a major reform of the education system to have had a wider evidence base.

Evaluating the evidence

And then there was the evidence itself. I might be looking in the wrong place, but so far, although I’ve found a few references, I’ve uncovered no attempts by proponents of educational reform to evaluate the evidence they cite.

A major flaw in human thinking is confirmation bias. To represent a particular set of ideas, we develop a mental schema. Every time we encounter the same set of ideas, the neural network that carries the schema is activated. The more it’s activated, the more readily it’s activated in future. This means that any configuration of ideas that contradicts a pre-existing schema has, almost literally, to swim against the electrochemical tide. It’s going to take a good few reiterations of the new idea set before a strongly embedded pre-existing schema is likely to be overridden by a new one. Consequently we tend to favour evidence that confirms our existing views, and find it difficult to see things in a different way.

The best way we’ve found to counteract confirmation bias in the way we evaluate evidence is through hypothesis testing. Essentially you come up with a hypothesis and then try to disprove it. If you can’t, it doesn’t mean your hypothesis is right, it just means you can’t yet rule it out. Hypothesis testing as such is mainly used in the sciences, but the same principle underlies formal debating, the adversarial approach in courts of law, and having an opposition to government in parliament. The last two examples are often viewed as needlessly combative, when actually their job is to spot flaws in what other people are saying. How well they do that job is another matter.

It’s impossible to tell at first glance whether a small number of researchers have made a breakthrough in education theory, or whether their work is simply being cited to affirm a set of beliefs. My suspicion that it might be the latter was strengthened when I checked out the evidence.

The evidence

John Hattie conducted a synthesis of over 800 meta-analyses of studies of student achievement. My immediate thought when I came across his work was of the well-documented problems associated with meta-analyses. Hattie does discuss these, but I’m not convinced he disposes of one key issue; the garbage-in-garbage-out problem. A major difficulty with meta-analyses is ensuring that all the studies involved use the same definitions for the constructs they are measuring; and I couldn’t find a discussion of what Hattie (or other researchers) mean by ‘achievement’. I assume that Hattie uses test scores as a proxy measure of achievement. This is fine if you think the job of schools is to ensure that children learn what somebody has decided they should learn. But that assumption poses problems. One is who determines what students should learn. Another is what happens to students who, for whatever reason, can’t learn at the same rate as the majority. And a third is how the achievement measured in Hattie’s study maps on to achievement in later life. What’s noticeable about the biographies of many ‘great thinkers’ – Darwin and Einstein are prominent examples – is how many of them didn’t do very well in school. It doesn’t follow that Hattie is wrong – Darwin and Einstein might have been even greater thinkers if their schools had adopted his recommendations – but it’s an outcome Hattie doesn’t appear to address.

Siegfried Engelmann and Wesley C Becker developed a system called Direct Instruction System for Teaching Arithmetic and Reading (DISTAR) that was shown to be effective in Project Follow Through – an evaluation of a number of educational approaches in the US education system over a 30-year period starting in the 1960s. There’s little doubt that Direct Instruction is more effective than many other systems at raising academic achievement and self-esteem. The problem is, again, who decides what students learn, what happens to students who don’t benefit as much as others, and what’s meant by ‘achievement’.

ED Hirsch developed the Core Knowledge sequence – essentially an off-the-shelf curriculum that’s been adapted for the UK and is available from Civitas. The US Core Knowledge sequence has a pretty obvious underlying rationale even if some might question its stance on some points. The same can’t be said of the UK version. Compare, for example, the content of US Grade 1 History and Geography with that of the UK version for Year 1. The US version includes Early People and Civilisations and the History of World Religion – all important for understanding how human geography and cultures have developed over time. The UK version focuses on British Pre-history and History (with an emphasis on the importance of literacy), followed by Kings and Queens, Prime Ministers, then Symbols and Figures – namely the Union Jack, Buckingham Palace, 10 Downing Street and the Houses of Parliament – despite the fact that few children in Y1 are likely to understand how or why these people or symbols came to be important. Although the strands of world history and British history are broadly chronological, Y4s study Ancient Rome alongside the Stuarts, and Y6s the American Civil War potentially before the Industrial Revolution.

Daniel Willingham is a cognitive psychologist and the author of Why don’t students like school? A cognitive scientist answers questions about how the mind works and what it means for the classroom and When can you trust the experts? How to tell good science from bad in education. He also writes a column in American Educator magazine. I found Willingham informative on cognitive psychology. However, I felt his view of education was a rather narrow one. There’s nothing wrong with applying cognitive psychology to how teachers teach the curriculum in schools – it’s just that learning and education involve considerably more than that.

Kirschner, Sweller and Clark have written several papers about the limitations of working memory and its implications for education. In my view, their analysis has three key weaknesses; they arbitrarily lump together a range of education methods as if they were essentially the same, they base their theory on an outdated and incomplete model of memory, and they conclude that only one teaching approach is effective – explicit, direct instruction – ignoring the fact that knowledge comes in different forms.

Conclusions

I agree with some of the points made by the reformers:
• I agree with the idea of evidence-based education – the more evidence the better, in my view.
• I have no problem with children being taught knowledge. I don’t subscribe to a constructivist view of education – in the sense that we each develop a unique understanding of the world and everybody’s worldview is as valid as everybody else’s – although cognitive science has shown that everybody’s construction of knowledge is unique. We know that some knowledge is more valid and/or more reliable than other knowledge and we’ve developed some quite sophisticated ways of figuring out what’s more certain and what’s less certain.
• The application of findings from cognitive science to education is long overdue.
• I have no problem with direct instruction (as distinct from Direct Instruction) per se.

However, some of what I read gave me cause for concern:
• The evidence-base presented by the reformers is limited and parts of it are weak and flawed. It’s vital to evaluate evidence, not just to cite evidence that at face-value appears to support what you already think. And a body of evidence isn’t a unitary thing; some parts of it can be sound whilst other parts are distinctly dodgy. It’s important to be able to sift through it and weigh up the pros and cons. Ignoring contradictory evidence can be catastrophic.
• Knowledge, likewise, isn’t a unitary thing; it can vary in terms of validity and reliability.
• The evidence from cognitive science also needs to be evaluated. It isn’t OK to assume that just because cognitive scientists say something it must be right; cognitive scientists certainly don’t do that. Being able to evaluate cognitive science might entail learning a fair bit about cognitive science first.
• Direct instruction, like any other educational method, is appropriate for acquiring some types of knowledge. It isn’t appropriate for acquiring all types of knowledge. The problem with approaches such as discovery learning and child-led learning is not that there’s anything inherently wrong with the approaches themselves, but that they’re not suitable for acquiring all types of knowledge.

What has struck me most forcibly about my exploration of the evidence cited by the education reformers is that, although I agree with some of the reformers’ reservations about what’s been termed ‘minimal instruction’ approaches to education, the reformers appear to be ignoring their own advice. They don’t have extensive knowledge of the relevant subject areas, they don’t evaluate the relevant evidence, and the direct instruction framework they are advocating – certainly the one Civitas is advocating – doesn’t appear to have a structure derived from the relevant knowledge domains.

Rather than a rational, evidence-based approach to education, the ‘educational reform’ movement has all the hallmarks of a belief system that’s using evidence selectively to support its cause; and that’s what worries me. This new Blob is beginning to look suspiciously like the old one.

Daisy debunks myths: or does she?

At the beginning of this month, Daisy Christodoulou, star performer on University Challenge, CEO of The Curriculum Centre and a governor of the forthcoming Michaela Community School, published a book entitled Seven Myths about Education. Daisy has summarised the myths on her blog, The Wing to Heaven. There are few things I like better than seeing a myth debunked, but I didn’t rush to buy Daisy’s book. In fact I haven’t read it yet. Here’s why.

Debunking educational ‘myths’ is currently in vogue. But some of the debunkers have replaced the existing myths with new myths of their own; kind of second-order myths. The first myth is at least partly wrong, but the alternative proposed isn’t completely right either, which really doesn’t help. I’ve pointed this out previously in relation to ‘neuromyths’. One of the difficulties involved in debunking educational myths is that they are often not totally wrong, but in order to tease out what’s wrong and what’s right, you need to go into considerable detail, and busy teachers are unlikely to have the time or background knowledge to judge whether or not the criticism is valid.

Human beings have accumulated a vast body of knowledge about ourselves and the world we inhabit, which suggests strongly that the world operates according to knowable principles. It’s obviously necessary to be familiar with the structure and content of any particular knowledge domain in order to have a good understanding of it. And I agree with some of Daisy’s criticisms of current approaches to learning. So why do I feel so uneasy about what she’s proposing to put in its place?

Daisy’s claims

Daisy says she makes two claims in her book and presents evidence to support them. The claims and the evidence are:

Claim one: “that in English education, a certain set of ideas about education are predominant…” Daisy points out that it’s difficult to prove or disprove the first claim, but cites a number of sources to support it.

Claim two: “that these ideas are misguided”. Daisy says “Finding the evidence to prove the second point was relatively straightforward” and lists a number of references relating to working and long-term memory.

Daisy’s reasoning

The responses to claim one suggest that Daisy is probably right that ‘certain ideas’ are predominant in English education.

She is also broadly right when she says “it is scientifically well-established that working memory is limited and that long-term memory plays a significant role in the human intellect” – although she doesn’t define what she means by ‘intellect’.

She then says “this has clear implications for classroom practice, implications which others have made and which I was happy to recap.”

Her reasoning appears to follow that of Kirschner, Sweller & Clark, who lump together ‘constructivist, discovery, problem-based, experiential, and inquiry-based teaching’ under the heading ‘minimal instruction’ and treat them all as one. The authors then make the assumption that because some aspects of ‘minimal instruction’ might impose a high cognitive load on students, it should be discarded in favour of ‘direct instruction’ that takes into account the limitations of working memory.

This is the point at which I parted company with Daisy (and Kirschner, Sweller & Clark). Lumping together a set of complex and often loosely defined ideas and approaches to learning is hardly helpful, since it’s possible that some of their components might overload working memory, but others might not. I can see how what we know about working and long-term memory demonstrates that some aspects of the predominant ‘certain set of ideas’ might be ‘misguided’, but not how it demonstrates that they are misguided en masse.

The nature of the evidence

I also had reservations about the evidence Daisy cites in support of claim two.

First on the list is Dan Willingham’s book Why Don’t Students Like School? Willingham is a cognitive psychologist interested in applying scientific findings to education. I haven’t read his book either*, but I’ve yet to come across anything else he’s written that has appeared flawed. Why Don’t Students Like School? appears to be a reliable, accessible book written for a wide readership. So far, so good.

Next, Daisy cites Kirschner, Sweller and Clark’s paper “Why minimal guidance during instruction does not work: an analysis of the failure of constructivist, discovery, problem-based, experiential, and inquiry-based teaching”. This paper is obviously harder going than Willingham’s book, but is published in Educational Psychologist, so would be accessible to many teachers. I have several concerns about this paper and have gone through its arguments in detail.

My main reservations are;
• the simplistic way in which the pedagogical debate is presented;
• what’s left out of the discussion;
• the reliance on a model of memory that’s half a century out of date.

That last point could apply to the next three items on Daisy’s list; two papers by Herb Simon, a Nobel prizewinner whose ideas have been highly influential in information theory, and one by John Anderson on his Adaptive Character of Thought model. Simon’s papers were published in 1973 and 1980 respectively, and Anderson’s in 1996 although his model dates from the 1970s.

Another feature of these papers is that they’re not easy reading – if you can actually get access to them, that is. Daisy’s links were to more links and I couldn’t get the Simon papers to open. And although Anderson’s paper is entitled ‘A simple theory of complex cognition’, what he means by that is that an apparently complex cognitive process can be explained by a simple information processing heuristic, not that his theory is easy to understand. He and Simon both write lucidly, but their material isn’t straightforward.

I completely agree with Daisy that the fundamentals of a knowledge domain don’t date – as she points out elsewhere, Pythagoras and Euripides have both stood the test of time. There’s no question that Simon’s and Anderson’s papers are key ones – for information scientists at least – and that the principles they set out have stood the test of time too. But quite why she should cite them, rather than more accessible material that takes into account several further decades of research into brain function, is puzzling.

It could be that there simply aren’t any publications that deal specifically with recent findings about memory and apply them to pedagogy. But even if there aren’t, it’s unlikely that most teachers would find Simon and Anderson the most accessible alternatives; for example Rita Carter’s Mapping the Mind is a beautifully illustrated, very informative description of how the brain works. (It’s worth forking out for the University of California Press edition because of the quality of the illustrations). Stanislas Dehaene’s Reading in the Brain is about reading, but is more recent and explains in more detail how the brain chunks, stores and accesses information.

It looks to me as if someone has given Daisy some key early references about working memory and she’s dutifully cited them, rather than ensuring that she has a thorough grasp of the knowledge domain of which they are part. If that’s true, it’s ironic, because having a thorough grasp of a knowledge domain is something Daisy advocates.

So Daisy’s logic is a bit flaky and her evidence base is a bit out of date. So what? Daisy’s logic and evidence base are important because they form the foundation for an alternative curriculum being used by a chain of academies and a high-profile free school.

Implications for curriculum design

Daisy’s name doesn’t appear in the ‘who we are’ or ‘our advisors’ sections of the website of The Curriculum Centre (which supports Future Academies), although their blog refers to her as their CEO. That might indicate the site simply needs updating. But disappointingly for an organisation describing itself as The Curriculum Centre, their ‘complete offer’ – The Future Curriculum™ – is described as ‘information coming soon’, and the page about the three-year KS2 curriculum is high on criticism of other approaches but low on information about itself.

Daisy is also ‘governor for knowledge’ at the Michaela Community School (headteacher Katherine Birbalsingh), a free school that’s already attracted press criticism even though it doesn’t open until September. Their curriculum page is a bit more detailed than that of The Curriculum Centre, but has some emphases that aren’t self-evident and aren’t explained, such as:

“Our emphasis on traditional academic subjects will provide a solid base on which young people can build further skills and future careers, thus enabling them to grow into thinkers, authors, leaders, orators or whatever else they wish.”

One has to wonder why the ‘traditional academic subjects’ don’t appear to be preparing pupils for careers with a more practical bent – as doctors, economists or engineers, say.

“Michaela recognises that English and Maths are fundamental to all other learning.”

No, they’re not. They are useful tools for accessing other learning, but non-English speakers who aren’t good at maths can still be extremely knowledgeable.

“Michaela Community School will teach knowledge sequentially so that the entire body of knowledge for a subject will be coherent and meaningful. The History curriculum will follow a chronological sequence of events. The English curriculum will follow a similar chronology of the history of literature, and will also build up knowledge of grammar and the parts of speech.”

The rationale for teaching history chronologically is obvious, but history is more than a sequence of events, and it’s not clear why it’s framed in that way. Nor is there an explanation for why literature should be taught chronologically. Nor why other subjects shouldn’t be. As it happens, I’m strongly in favour of structuring the curriculum chronologically, but I know from experience it’s impossible to teach English, Maths, Science, History, Geography, a modern foreign language (French/Spanish), Music and Art chronologically and in parallel because your chronology will be out of synch across the different subject areas. I’ve used a chronological curriculum with my own children and it gave them an excellent understanding of how everything connects. We started with the Big Bang and worked forward from there. But it meant that for about a year our core focus was on physics, chemistry and geography because for much of the earth’s history nothing else existed. I don’t get the impression Michaela or the Curriculum Centre have actually thought through curriculum development from first principles.

Then there was:

“The Humanities curriculum at Michaela Community School will develop a chronologically secure knowledge and understanding of British, local and world history and introduce students to the origins and evolution of the major world religions and their enduring influence.”

I couldn’t help wondering why ‘British’ came before local and world history. And why highlight religions and ‘their enduring influence’? It could be that the curriculum section doesn’t summarise the curriculum very well, or it could be that there’s an agenda here that isn’t being made explicit.

I’m not convinced that Daisy has properly understood how human memory works, has used what’s been scientifically established about it to debunk any educational myths, or has thoroughly thought through its implications for classroom practice. Sorry, Daisy, but I think you need to have another go.

References
Carter, R (2010). Mapping the Mind. University of California Press.
Dehaene, S (2010). Reading in the Brain. Penguin.
Willingham, DT (2010). Why Don’t Students Like School? Jossey-Bass.

* My bookshelves are groaning under the weight of books I’ve bought solely for the purpose of satisfying people who’ve told me I can’t criticise what someone’s saying until I’ve read their book. Very occasionally I come across a gem. More often than not, one can read between the lines of reviews.

all work and no play will make Jack and Jill bored and frustrated

Another educational dichotomy revealed by a recent Twitter conversation is learning vs play. Although I know people make this distinction, I found myself wondering why, traditionally, work and play have been contrasted, as in the old adage All work and no play makes Jack a dull boy, and when learning might have slipped into the place of work.

The function of play

Hunting and gathering

For many thousands of years, human beings have been hunter-gatherers. Most human infants are capable of gathering (foraging for berries, leaves, shoots etc) before they can walk, although they might need a bit of support and guidance in doing so. Hunting is a more complex skill and needs the dexterity, attentional control, tuition and rehearsal that only older children can handle.

The primary function of typical play in humans, like that seen in other mammals, is to develop the skills required to obtain food and to make sure you don’t become food for anyone else. All that chasing, hiding, running, fighting, climbing, observing, collecting and pulling things apart can make the difference between survival and starvation. Of course human beings are also social animals; hunter-gatherers forage, hunt and eat in groups because that increases everyone’s chances of survival. So humans, like many other mammals, have another characteristic in their play repertoire – mimicry. Copying the behaviour of older children and adults forms the foundation for a wide variety of adult skills not confined to acquiring food.

Hunting and gathering involves effort, but the effort is closely related to the reward of eating. The delay between expending the effort and eating the food is rarely more than a few hours, and in foraging, the food immediately follows the effort. The effort could be described as work, and a child who’s poking an anthill or fighting another child when they should be gathering or hunting could be considered to be playing as opposed to working, but the play of hunter-gatherer children is so closely aligned to their ‘work’, and the consequences of playing rather than working are so obvious, that the distinction between play and work is rather blurred.

Farming

For a few thousand years, human beings have been farmers. Farming has advantages over hunting and gathering, which is why it’s been so widely adopted. It increases food security considerably, especially in areas that experience cold or dry seasons, because surplus food can be produced and stored for future use. It also reduces, but doesn’t eliminate, the risk of territorial conflict – having to compete for food with another tribe.

In contrast to hunting and gathering, farming involves a great deal of effort that isn’t immediately rewarded. There’s a delay of months or even years before food results from the effort expended to produce it. Human children, like other mammals, aren’t good at delayed gratification. In addition, their default play patterns, apart from mimicry, don’t closely resemble the skills needed to produce food via agriculture. Ploughing, sowing, irrigating, weeding, protecting, harvesting and storing food involve hard, repetitive effort for no immediate reward to an extent that rarely occurs in hunter-gatherer societies. In addition, farming requires a lot of equipment – tools, containers, buildings, furniture etc – all requiring repetitive effort in their manufacture and maintenance. Communities that survive by subsistence farming can do so only if children do some of the work; they don’t have the spare capacity to allow children to spend their childhood only in play. This means that for farming communities, there’s a clear divide between children’s play and the work involved in producing food.

Industrialisation

In England, subsistence farming was a way of life for thousands of years. As the population increased, pressure was put on land use, and areas of common land used for grazing animals were increasingly ‘enclosed’ – landowners were given legal rights to take them out of public use. Following the Enclosure Acts of the late 18th/early 19th centuries, thousands of families found they didn’t have access to enough land to sustain themselves. They couldn’t survive by making and selling goods either, because of competition from the mass-production of cheap items in factories, made possible by the invention of the steam engine.

This double-whammy resulted in a mass migration to towns and cities to find work, which often consisted of hard, repetitive, dangerous labour in factories, or, because of the huge increase in demand for coal, in mines. Child labour was in great demand because it was cheap and plentiful, and many families couldn’t survive without their children’s earnings. Working in factories or in coal mines put children’s health in jeopardy. Previous generations of children working on the family smallholding might have found the work boring, repetitive and unpaid, but, poor harvests aside, would have had a reasonably good diet, plenty of fresh air and exercise and free time to play with their friends. In industrial settings, children were working for twelve hours or more a day in dangerous environments, and, in the case of mines, almost complete darkness. The opportunity to play became a luxury.

Education

The terrible working conditions for children didn’t last that long; a series of Factory Acts in the 19th century was followed by the 1870 Education Act, which paved the way for compulsory education, and further legislation made it free of charge. Increasing prosperity (as a result of the industrial revolution, ironically) meant that most communities had sufficient resources to allow children to spend their childhood learning rather than working.

Learning vs play

Not everybody saw education in the same light, however. For some at one extreme, education was a means to an end; it produced a literate, numerate workforce that would increase national and individual prosperity. For others, education offered a once-in-a-lifetime opportunity to be archetypally human; to be free of responsibility and engage only in learning and play – what children do naturally anyway. Not surprisingly, many popular children’s authors (popular because of the increase in child literacy) subscribed to the latter view, including Mark Twain, Louisa M Alcott, Lucy M Montgomery, Edith Nesbit, Enid Blyton and CS Lewis.

Education has essentially been dominated by these two viewpoints ever since; the ‘traditionalists’ on the one hand and the ‘progressives’ on the other. It’s easy to see how the clear distinction between work and play that emerged with the advent of agriculture, and that became even more stark in the industrial revolution, could carry over into education. And how in some quarters, learning might be seen as children’s ‘work’.

In highly developed industrialised societies, the default play patterns of hunter-gatherers bear little resemblance to the skills children will need in later life. But children’s play is very versatile; they observe, mimic and learn from whatever they see around them, they experiment with technology and become skilled in using it. Children are still ‘doing it for themselves’ as they always have done. The informal education they would get if they didn’t attend school would still provide them, as it has for millennia, with the knowledge and skills they would need to survive as adults.

Of course for most people survival isn’t enough. The lives of people in communities that ‘survive’ tend to be nasty, brutish and short, and most people don’t want a life like that. The best way we’ve found to improve our quality of life and standard of living beyond ‘survival’ is to increase the efficiency with which we produce food, goods and services. In theory, at least, this frees up time to find ways of improving our quality of life further. In practice, the costs and benefits of increased efficiency tend to be rather unevenly distributed, with some people bearing most of the costs and others enjoying most of the benefits, but that’s another story.

The best way we’ve found to improve efficiency is for communities to have access to the knowledge we’ve acquired about how the world works. It isn’t necessary for everyone to know all about everything; what is necessary is for people to have access to knowledge as and when they need it. Having said that, childhood and adolescence present a golden opportunity, before the responsibilities of adulthood kick in, to ensure that everyone has a good basic knowledge about how the world works.

Learning

A core characteristic of learning is the acquisition of new information in the form of knowledge and/or skills. But human beings aren’t robots; acquiring knowledge isn’t simply a matter of feeding in the knowledge, pressing a button and off we go. We are biological organisms; acquiring knowledge changes our brains via a biological process and it’s a process that takes time and that varies between individuals.

Play

One of the ways in which humans naturally acquire, assimilate and apply new knowledge is through play. A core characteristic of play is that it isn’t directly related to what you do to survive. Play essentially consists of rehearsing and experimentally applying knowledge and skills in a safe environment – one where the outcomes of your rehearsal and experimentation are unlikely to end in disaster.

The amount of learning in play varies. Sometimes the play can consist almost entirely of learning – repetition of knowledge or skills until perfect, for example. Sometimes there’s very little learning – the play is primarily for rest and relaxation. And rest and relaxation play can provide the ‘down-time’ the brain needs in order for new information to be assimilated.

Young humans play more than older ones because they have more new knowledge and skills to assimilate and experiment with, and their play tends to incorporate more learning. For very young children all play is learning.

Older humans tend to play for rest and relaxation purposes because they don’t have to acquire so much knowledge. They do learn through play, but it often isn’t recognised as such; it’s ‘kicking an idea around’ or imagining different scenarios, or experimenting with new knowledge in different configurations. In other words learning through play in adults is often seen as a corollary of work – what you get paid to do – not as play per se.

What emerges from this is that construing learning and play as different things, and assuming that children and young people must either be learning or playing, is not a valid way of classifying learning and play. Learning can include play and play can include learning. Since play is one of the ways through which human beings learn anyway, it makes sense to incorporate it into learning rather than to see it as something that distracts from learning.

Kirschner, Sweller & Clark: a summary of my critique

It’s important not just to know things, but to understand them, which is why I took three posts to explain my unease about the paper by Kirschner, Sweller & Clark. From the responses I’ve received I appear to have overstated my explanation but understated my key points, so for the benefit of anybody unable or unwilling to read all the words, here’s a summary.

1. I have not said that Kirschner, Sweller & Clark are wrong to claim that working memory has a limited capacity. I’ve never come across any evidence that says otherwise. My concerns are about other things.

2. The complex issue of approaches to learning and teaching is presented as a two-sided argument. Presenting complex issues in an oversimplified way invariably obscures rather than clarifies the debate.

3. The authors appeal to a model of working memory that’s almost half a century old, rather than one revised six years before their paper came out and widely accepted as more accurate. Why would they do that?

4. They give the distinct impression that long-term memory isn’t subject to working memory constraints, when it is very much subject to them.

5. They completely omit any mention of the biological mechanisms involved in processing information. Understanding the mechanisms is key if you want to understand how people learn.

6. They conclude that explicit, direct instruction is the only viable teaching approach based on the existence of a single constraining factor – the capacity of working memory to process yet-to-be learned information (though exactly what they mean by yet-to-be learned isn’t explained). In a process as complex as learning, it’s unlikely that there will be only one constraining factor.

Kirschner, Sweller & Clark appear to have based their conclusion on a model of memory that was current in the 1970s (I know because that’s when I first learned about it), to have ignored subsequent research, and to have oversimplified the picture at every available opportunity.

What also concerns me is that some teachers appear to be taking what Kirschner, Sweller & Clark say at face value, without making any attempt to check the accuracy of their model, to question their presentation of the problem or the validity of their conclusion. There’s been much discussion recently about ‘neuromyths’. Not much point replacing one set of neuromyths with another.

Reference
Kirschner, PA, Sweller, J & Clark, RE (2006). Why minimal guidance during instruction does not work: An analysis of the failure of constructivist, discovery, problem-based, experiential, and inquiry-based teaching. Educational Psychologist, 41, 75-86.

cognitive load and learning

In the previous two posts I discussed the model of working memory used by Kirschner, Sweller & Clark and how working memory and long-term memory function. The authors emphasise that their rejection of minimal guidance approaches to teaching is based on the limited capacity of working memory in respect of novel information, and that even if experts might not need much guidance “…nearly everyone else thrives when provided with full, explicit instructional guidance (and should not be asked to discover any essential content or skills)” (Clark, Kirschner & Sweller, p.6). Whether they are right or not depends on what they mean by ‘novel’ information.

So what’s new?

Kirschner, Sweller & Clark define novel information as ‘new, yet to be learned’ information that has not been stored in long-term memory (p.77). But novelty isn’t a simple case of information either being yet-to-be-learned or stored-in-long-term-memory. If I see a Russian sentence written in Cyrillic script, its novelty value to me on a scale of 1-10 would be about 9. I can recognise some Cyrillic letters and know a few Russian words, but my working memory would be overloaded after about the third letter because of the multiple operations involved in decoding, blending and translating. A random string of Arabic numerals would have a novelty value of about 4, however, because I am very familiar with Arabic numerals; the only novelty would be in their order in the string. The sentence ‘the cat sat on the mat’ would have a novelty value close to zero because I’m an expert at chunking the letter patterns in English and I’ve encountered that sentence so many times.

Because novelty isn’t an either/or thing but sits on a sliding scale, and because the information coming into working memory can vary between simple and complex, ‘new, yet to be learned’ information can vary in both complexity and novelty.

You could map it on a 2×2 matrix, with novelty on one axis and complexity on the other:

[Figure: novelty, complexity & cognitive load]

A sentence such as ‘the monopsonistic equilibrium at M should now be contrasted with the equilibrium that would obtain under competitive conditions’ is complex (it contains many bits of information) but its novelty content would depend on the prior knowledge of the reader. It would score high on both the novelty and complexity scales for the average 5 year-old. I don’t understand what the sentence means, but I do understand many of the words, so it would be mid-range in both novelty and complexity for me. An economist would probably give it a 3 for complexity but 0 for novelty. Trying to teach a 5 year-old what the sentence meant would completely overload their working memory. But it would be a manageable challenge for mine, and an economist would probably feel bored.
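To make the sliding-scale point concrete, here’s a toy sketch in Python. It’s my own illustration, not anything from Kirschner, Sweller & Clark; the capacity figure and the familiarity scores are invented for the example. The idea is simply that the more of the material a reader has already chunked in long-term memory, the fewer working-memory slots it consumes.

```python
WM_CAPACITY = 7  # assumed item limit of working memory (illustrative only)

def effective_load(raw_items, familiarity):
    """Chunks consumed in working memory: familiarity (0-1) shrinks the load."""
    return raw_items * (1 - familiarity)

SENTENCE_ITEMS = 20  # invented figure for the 'monopsonistic equilibrium' sentence

# Invented familiarity scores for three hypothetical readers
for reader, familiarity in [("five year-old", 0.0),
                            ("lay adult", 0.7),
                            ("economist", 0.95)]:
    load = effective_load(SENTENCE_ITEMS, familiarity)
    verdict = "overload" if load > WM_CAPACITY else "manageable"
    print(f"{reader}: {load:.1f} chunks -> {verdict}")
```

On these made-up numbers the sentence overloads the five year-old, is a manageable challenge for the lay adult, and barely registers for the economist – the same ordering as in the paragraph above.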

Kirschner, Sweller & Clark reject ‘constructivist, discovery, problem-based, experiential and inquiry-based approaches’ on the basis that they overload working memory and the excessive cognitive load means that learners don’t learn as efficiently as they would using explicit direct instruction. If only it were that simple.

‘Constructivist, discovery, problem-based, experiential and inquiry-based approaches’ weren’t adopted initially because teachers preferred them or because philosophers thought they were a good idea, but because by the end of the 19th century explicit, direct instruction – the only game in town for fledgling mass education systems – clearly wasn’t as effective as people had thought it would be. Alternative approaches were derived from three strategies that young children apply when learning ‘naturally’.

How young children learn

Human beings are mammals and young mammals learn by applying three key learning strategies which I’ll call ‘immersion’, trial-and-error and modelling (imitating the behaviour of other members of their species). By ‘strategy’, I mean an approach that they use, not that the baby mammals sit down and figure things out from first principles; all three strategies are outcomes of how mammals’ brains work.

Immersion

Most young children learn to walk, talk, feed and dress themselves and acquire a vast amount of information about their environment with very little explicit, direct instruction. And they acquire those skills pretty quickly and apparently effortlessly. The theory was that if you put school age children in a suitable environment, they would pick up other skills and knowledge equally effortlessly, without the boredom of rote-learning and the grief of repeated testing. Unfortunately, what advocates of discovery, problem-based, experiential and inquiry-based learning overlooked was the sheer amount of repetition involved in young children learning ‘naturally’.

Although babies’ learning is kick-started by some hard-wired processes such as reflexes, babies have to learn to do almost everything. They repeatedly rehearse their gross motor skills, fine motor skills and sensory processing. They practise babbling, crawling, toddling and making associations at every available opportunity. They observe things and detect patterns. A relatively simple skill like face-recognition, grasping an object or rolling over might only take a few attempts. More complex skills like using a spoon, crawling or walking take more. Very complex skills like using language require many thousands of rehearsals; it’s no coincidence that children’s speech and reading ability take several years to mature and their writing ability (an even more complex skill) doesn’t usually mature until adulthood.

The reason why children don’t learn to read, do maths or learn foreign languages as ‘effortlessly’ as they learn to walk or speak in their native tongue is largely because of the number of opportunities they have to rehearse those skills. An hour a day of reading or maths and a couple of French lessons a week bears no resemblance to the ‘immersion’ in motor development and their native language that children are exposed to. Inevitably, it will take them longer to acquire those skills. And if they take an unusually long time, it’s the child, the parent, the teacher or the method that tends to be blamed, not the mechanism by which the skill is acquired.

Trial-and-error

The second strategy is trial-and-error. It plays a key role in the rehearsals involved in immersion, because it provides feedback to the brain about how the skill or knowledge is developing. Some skills, like walking, talking or handwriting, can only be acquired through trial-and-error because of the fine-grained motor feedback that’s required. Learning by trial-and-error can offer very vivid, never-forgotten experiences, regardless of whether the initial outcome is success or failure.

Modelling

The third strategy is modelling – imitating the behaviour of other members of the species (and sometimes other species or inanimate objects). In some cases, modelling is the most effective way of teaching because it’s difficult to explain (or understand) a series of actions in verbal terms.

Cognitive load

This brings us back to the issue of cognitive load. It isn’t the case that immersion, trial-and-error and modelling or discovery, problem-based, experiential and inquiry-based approaches always impose a high cognitive load, and that explicit direct instruction doesn’t. If that were true, young children would have to be actively taught to walk and talk and older ones would never forget anything. The problem with all these educational approaches is that they have all initially been seen as appropriate for teaching all knowledge and skills and have subsequently been rejected as ineffective. That’s not at all surprising, because different types of knowledge and skill require different strategies for effective learning.

Cognitive load is also affected by the complexity of incoming information and how novel it is to the learner. Nor is cognitive load confined to the capacity of working memory. 40 minutes of explicit, direct instruction in novel material, even if presented in well-paced, working-memory-sized chunks, would pose a significant challenge to most brains. The reason, as I pointed out previously, is that the transfer of information from working memory to long-term memory is a biological process that takes time, resources and energy. Research into changes in the motor cortex suggests that the time involved might be as little as hours, but even that has implications for the pace at which students are expected to learn and how much new information they can process. There’s a reason why someone would find acquiring large amounts of new information tiring – their brain uses up a considerable amount of glucose getting that information embedded in the form of neural connections. The inevitable delay between information coming into the brain and being embedded in long-term memory suggests that down-time is as important as learning time – calling into question the assumption that the longer children spend actively ‘learning’, the more they will know.

Final thoughts

If I were forced to choose between constructivist, discovery, problem-based, experiential and inquiry-based approaches to learning on the one hand and explicit, direct instruction on the other, I’d plump for explicit, direct instruction, because the world we live in works according to discoverable principles and it makes sense to teach kids what those principles are, rather than to expect them to figure them out for themselves. However, it would have to be a forced choice, because we do learn through constructing our knowledge and through discovery, problem-solving, experiencing and inquiring, as well as by explicit, direct instruction. The most appropriate learning strategy will depend on the knowledge or skill being learned.

The Kirschner, Sweller & Clark paper left me feeling perplexed and rather uneasy. I couldn’t understand why the authors frame the debate about educational approaches in terms of minimal guidance ‘on one side’ and direct instructional guidance ‘on the other’, when self-evidently the debate is more complex than that. Nor why they refer to Atkinson & Shiffrin’s model of working memory when Baddeley & Hitch’s more complex model is so widely accepted as more accurate. Nor why they omit any mention of the biological mechanisms involved in learning; not only are the biological mechanisms responsible for the way working memory and long-term memory operate, they also shed light on why any single educational approach doesn’t work for all knowledge, all skills – or even all students.

I felt it was ironic that the authors place so much emphasis on the way novices think but present a highly complex debate in binary terms – a classic feature of the way novices organise their knowledge. What was also ironic was that despite their emphasis on explicit, direct instruction, they failed to mention several important features of memory that would have helped a lay readership understand how memory works. This is all the more puzzling because some of these omissions (and a more nuanced model of instruction) are referred to in a paper on cognitive load by Paul Kirschner published four years earlier.

In order to fully understand what Kirschner, Sweller & Clark are saying, and to decide whether they were right or not, you’d need to have a fair amount of background knowledge about how brains work. To explain that clearly to a lay readership, and to address possible objections to their thesis, the authors would have had to extend the paper’s length by at least 50%. Their paper is just over 10 000 words long, suggesting that word-count issues might have resulted in them having to omit some points. That said, Educational Psychologist doesn’t currently apply a word limit, so maybe the authors were trying to keep the concepts as simple as possible.

Simplifying complex concepts for the benefit of a lay readership can certainly make things clearer, but over-simplifying them runs the risk of giving the wrong impression, and I think there’s a big risk of that happening here. Although the authors make it clear that explicit direct instruction can take many forms, they do appear to be proposing a one-size-fits-all approach that might not be appropriate for all knowledge, all skills or all students.

References
Clark, RE, Kirschner, PA & Sweller, J (2012). Putting students on the path to learning: The case for fully guided instruction. American Educator, Spring.
Kirschner, PA (2002). Cognitive load theory: Implications of cognitive load theory on the design of learning. Learning and Instruction, 12, 1-10.
Kirschner, PA, Sweller, J & Clark, RE (2006). Why minimal guidance during instruction does not work: An analysis of the failure of constructivist, discovery, problem-based, experiential, and inquiry-based teaching. Educational Psychologist, 41, 75-86.

how working memory works

In my previous post I wondered why Kirschner, Sweller & Clark based their objections to minimal guidance in education on Atkinson & Shiffrin’s 1968 model of memory; it’s a model that assumes a mechanism for memory that’s now considerably out of date. A key factor in Kirschner, Sweller & Clark’s advocacy of direct instructional guidance is the limited capacity of working memory, and that’s what I want to look at in this post.

Other models are available

Atkinson & Shiffrin describe working memory as a ‘short-term store’. It has a limited capacity (around 4-9 items of information) that it can retain for only a few seconds. It’s also a ‘buffer’; unless information in the short-term store is actively maintained, by rehearsal for example, it will be displaced by incoming information. Kirschner, Sweller & Clark note that ‘two well-known characteristics’ of working memory are its limited duration and capacity when ‘processing novel information’ (p.77), suggesting that their model of working memory is very similar to Atkinson & Shiffrin’s short-term store.
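As a minimal sketch of that buffer idea – my own gloss, not anything from Atkinson & Shiffrin or from Kirschner, Sweller & Clark – you could model the short-term store as a fixed-size queue in which unrehearsed items are displaced by incoming ones:

```python
from collections import deque

CAPACITY = 7  # roughly the item limit attributed to the short-term store

store = deque(maxlen=CAPACITY)  # when full, appending displaces the oldest item

def attend(item):
    """New information enters the short-term store."""
    store.append(item)

def rehearse(item):
    """Rehearsal keeps an item active by returning it to the newest position."""
    if item in store:
        store.remove(item)
        store.append(item)

for digit in "31415926535":  # eleven digits: more than the store can hold
    attend(digit)

print(list(store))  # only the most recent CAPACITY digits survive
```

The point of the sketch is only that displacement, not deliberate forgetting, empties the buffer: the first four digits are gone simply because newer material arrived.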


In 1974 Alan Baddeley and Graham Hitch proposed a more sophisticated model for working memory that included dedicated auditory and visual information processing components. Their model has been revised in the light of more recent discoveries relating to the function of the prefrontal areas of the brain – the location of ‘working memory’. The Baddeley and Hitch model now looks a bit more complex than Atkinson & Shiffrin’s.

[Figure: the Baddeley & Hitch model of working memory]

You could argue that it doesn’t matter how complex working memory is, or how the prefrontal areas of the brain work; neither alters the fact that the capacity of working memory is limited. Kirschner, Sweller & Clark question the effectiveness of educational methods involving minimal guidance because they increase cognitive load beyond the capacity of working memory. But Kirschner, Sweller & Clark’s model of working memory appears to be oversimplified and doesn’t take into account the biological mechanisms involved in learning.

Biological mechanisms involved in learning

Making connections

Learning is about associating one thing with another, and making associations is what the human brain does for a living. Associations are represented in the brain by connections formed between neurons; the ‘information’ is carried in the pattern of connections. A particular stimulus will trigger a series of electrical impulses through a particular network of connected neurons. So, if I spot my cat in the garden, that sight will trigger a series of electrical impulses that activates a particular network of neurons; the connections between the neurons represent all the information I’ve ever acquired about my cat. If I see my neighbour’s cat, much of the same neural pathway will be triggered because both cats are cats; it will then diverge slightly because I have acquired different information about each cat.

Novelty value

Neurons make connections with other neurons via synapses. Our current understanding of the role of synapses in information storage and retrieval suggests that new information triggers the formation of new synapses between neurons. If the same associations are encountered repeatedly, the relevant synapses are used repeatedly and those connections between neurons are strengthened, but if synapses aren’t active for a while, they are ‘pruned’. Toddlers form huge numbers of new synapses, but from the age of three through to adulthood, the number reduces dramatically as pruning takes place. It’s not clear whether synapse formation and pruning are pre-determined developmental phases or whether they happen in response to the kind of information that the brain is processing. Toddlers are exposed to vast amounts of novel information, but novelty rapidly tails off as they get older. Older adults tend to encounter very little novel information, often complaining that they’ve ‘seen it all before’.

The way working memory works

Most of the associations made by the brain occur in the cortex, the outer layer of the brain. Sensory information processed in specialised areas of cortex is ‘chunked’ into coherent wholes – what we call ‘perception’. Perceptual information is further chunked in the frontal areas of the brain to form an integrated picture of what’s going on around and within us. The picture that’s emerging from studies of prefrontal cortex is that this area receives, attends to, evaluates and responds to information from many other areas of the brain. It can do this because patterns of the electrical activity from other brain areas are maintained in prefrontal areas for a short time whilst evaluation takes place. As Antonio Damasio points out in Descartes’ Error, the evaluation isn’t always an active, or even a conscious process; there’s no little homunculus sitting at the front of the brain figuring out what information should take priority. What does happen is that streams of incoming information compete for attention. What gets attention depends on what information is coming in at any one time. If something happens that makes you angry during a maths lesson, you’re more likely to pay attention to that than to solving equations. During an exam, you might be concentrating so hard that you are unaware of anything happening around you.

The information coming into prefrontal cortex varies considerably. There’s a constant inflow from three main sources:

• real-time information from the environment via the sense organs;
• information about the physiological state of the body, including emotional responses to incoming information;
• information from the neural pathways formed by previous experience and activated by that sensory and physiological input (Kirschner, Sweller & Clark would call this long-term memory).

Working memory and long-term memory

‘Information’ and models of information processing are abstract concepts. You can’t pick them up or weigh them, so it’s tempting to think of information processing in the brain as an abstract process, involving rather abstract forces like electrical impulses. It would be easy to form the impression from Kirschner, Sweller & Clark’s model that well-paced, bite-sized chunks of novel information will flow smoothly from working memory to long-term memory, like water between two tanks. But the human brain is a biological organ, and it retains and accesses information using some very biological processes. Developing new synapses involves physical changes to the structure of neurons, and those changes take time, resources and energy. I’ll return to that point later, but first I want to focus on something that Kirschner, Sweller & Clark say about the relationship between working memory and long-term memory that struck me as a bit odd;

The limitations of working memory only apply to new, yet to be learned information that has not been stored in long-term memory. New information such as new combinations of numbers or letters can only be stored for brief periods with severe limitations on the amount of such information that can be dealt with. In contrast, when dealing with previously learned information stored in long-term memory, these limitations disappear.” (p77)

This statement is odd because it doesn’t tally with Atkinson & Shiffrin’s concept of the short-term store, and it’s contradicted by decades of experimental work showing that capacity limitations apply to all information in working memory, regardless of its source. But Kirschner, Sweller & Clark go on to qualify their claim;

In the sense that information can be brought back from long-term memory to working memory over indefinite periods of time, the temporal limits of working memory become irrelevant.” (p77).

I think I can see what they’re getting at; because information is stored permanently in long-term memory it doesn’t rapidly fade away and you can access it any time you need to. But you have to access it via working memory, so it’s still subject to working memory constraints. I think the authors are referring implicitly to two ways in which the brain organises information that increase the effective capacity of working memory – chunking and schemata.
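The point is easier to see as a sketch. Suppose working memory is a small fixed-capacity buffer (the traditional ballpark figure of seven is used here purely for illustration): long-term knowledge survives indefinitely, but it still has to pass through that buffer to be used.

```python
from collections import deque

# Working memory as a small fixed-capacity buffer: retrieving material
# from long-term memory displaces whatever is already being held.
working_memory = deque(maxlen=7)

long_term_memory = {"apple": ["fruit", "orchard", "windfalls", "pie recipe", "Newton"]}

def retrieve(cue):
    for item in long_term_memory.get(cue, []):
        working_memory.append(item)  # retrieval still passes through the buffer

retrieve("apple")
print(list(working_memory))
# The knowledge is permanent, but only a handful of items can be 'in play' at once.
```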

Chunking

If the brain frequently encounters small items of information that are usually associated with each other, it eventually ‘chunks’ them together and then processes them automatically as single units. George Miller, who in the 1950s did some pioneering research into working memory capacity, noted that people familiar with the binary notation then in widespread use by computer programmers didn’t memorise random lists of 1s and 0s as random lists, but as numbers in the decimal system. So 10 would be remembered as 2, 100 as 4, 101 as 5 and so on. In this way, very long strings of 1s and 0s could be held in working memory in the form of decimal numbers that would automatically be translated back into 1s and 0s when the people taking part in the experiments were asked to recall the list. Morse code experts do the same; they don’t read messages as a series of dots and dashes, but chunk up the patterns of dots and dashes into letters and then into words. Exactly the same process occurs in reading, but we don’t call it chunking, we call it learning to read. Chunking effectively increases the capacity of working memory – but it doesn’t increase it by very much. Curiously, although Kirschner, Sweller & Clark refer to a paper by Egan and Schwartz that’s explicitly about chunking, they don’t mention chunking as such.
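Here’s roughly what that recoding looks like, grouping the bits three at a time so that each chunk becomes a single digit (the grouping size and the test string are arbitrary):

```python
# Recode a binary string three bits at a time, so each chunk is held
# as one digit - twelve items to remember become four.
def chunk_binary(bits, size=3):
    groups = [bits[i:i + size] for i in range(0, len(bits), size)]
    return [str(int(g, 2)) for g in groups]

bits = "101000111010"
digits = chunk_binary(bits)
print(digits)                         # ['5', '0', '7', '2']
print(len(bits), "->", len(digits))   # 12 -> 4

# Recall is just the reverse translation:
recalled = "".join(format(int(d), "03b") for d in digits)
assert recalled == bits
```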

Schemata

What they do mention is the concept of the schema, particularly those of chess players. In the 1940s Adriaan de Groot discovered that expert chess players memorise a vast number of configurations of chess pieces on a board; he called each particular configuration a schema. I get the impression that Kirschner, Sweller & Clark see schemata and chunking as synonymous, even though a schema usually refers to a meta-level way of organising information, like a life-script or an overview, rather than an automatic processing of several bits of information as one unit. It’s quite possible that expert chess players do automatically read each configuration of chess pieces as one unit, but de Groot didn’t call it ‘chunking’ because his research was carried out a decade before George Miller coined the term.

Thinking about everything at once

Whether you call them chunks or schemata, what’s clear is that the brain has ways of increasing the amount of information held in working memory. Expert chess players aren’t limited to thinking about the four or five possible moves for one piece, but can think about four or five possible configurations for all pieces. But it doesn’t follow that the limitations of working memory in relation to long-term memory disappear as a result.

I mentioned in my previous post what information is made accessible via my neural networks if I see an apple. If I free-associate, I think of apples – apple trees – should we cover our apple trees if it’s wet and windy after they blossom? – will there be any bees to pollinate them? – bee viruses – viruses in ancient bodies found in melted permafrost – bodies of climbers found in melted glaciers, and so on. Because my neural connections represent multiple associations I can indeed access vast amounts of information stored in my brain. But I don’t access it all simultaneously. That’s just as well, because if I could access all that information at once my attempts to decide what to do with our remaining windfall apples would be thwarted by totally irrelevant thoughts about mountain rescue teams and St Bernard dogs. In short, if information stored in long-term memory weren’t subject to the capacity constraints of working memory, we’d never get anything done.
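A sketch of what that constraint buys us – the association strengths and the capacity are invented, but the principle is that a cue brings to mind only its strongest few associates, not the whole web:

```python
# The association web is huge, but retrieval is cue-driven and capped:
# only the strongest few associates of the current cue come to mind.
web = {
    "apple": [("apple tree", 0.9), ("windfalls", 0.8), ("bees", 0.5),
              ("permafrost viruses", 0.2), ("St Bernard dogs", 0.05)],
}

def bring_to_mind(cue, capacity=3):
    associates = sorted(web.get(cue, []), key=lambda a: a[1], reverse=True)
    return [name for name, _ in associates[:capacity]]

print(bring_to_mind("apple"))
# ['apple tree', 'windfalls', 'bees'] - the mountain-rescue dogs stay dormant
```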

Chess masters (or ornithologists or brain surgeons) have access to vast amounts of information, but in any given situation they don’t need to access it all at once. In fact, accessing it all at once would be disastrous, because it would take forever to eliminate the information they didn’t need. At any point in any chess game, only a few configurations of pieces are possible, and that number is unlikely to exceed the capacity of working memory. Similarly, even if an ornithologist/brain surgeon can recognise thousands of species of birds/types of brain injury, in any given environment most of those species/injuries are likely to be irrelevant, so they don’t even need to be considered. There’s a good reason why working memory’s capacity is limited, and why all the information we process is subject to that limit.

In the next post, I want to look at how the limits of working memory impact on learning.

References

Atkinson, R, & Shiffrin, R (1968). Human memory: A proposed system and its control processes. In K. Spence & J. Spence (Eds.), The psychology of learning and motivation (Vol. 2, pp. 89–195). New York: Academic Press
Damasio, A (1994). Descartes’ Error, Vintage Books.
Kirschner, PA, Sweller, J & Clark, RE (2006). Why Minimal Guidance During Instruction Does Not Work: An Analysis of the Failure of Constructivist, Discovery, Problem-Based, Experiential, and Inquiry-Based Teaching. Educational Psychologist, 41, 75–86.

memories are made of this

Education theory appears to be dominated by polarised debates. I’ve just come across another; minimal guidance vs direct instruction. Harry Webb has helpfully brought together what he calls the Kirschner, Sweller & Clark cycle of papers that seem to encapsulate it. The cycle consists of papers by these authors and responses to them, mostly published in Educational Psychologist during 2006-7.

Kirschner, Sweller & Clark are opposed to minimal guidance approaches in education and base their case on the structure of human cognitive architecture. As they rightly observe “Any instructional procedure that ignores the structures that constitute human cognitive architecture is not likely to be effective” (p.76). I agree completely, so let’s have a look at the structures of human cognitive architecture they’re referring to.

Older models

Kirschner, Sweller & Clark claim that “Most modern treatments of human cognitive architecture use the Atkinson and Shiffrin (1968) sensory memory–working memory–long-term memory model as their base” (p.76).

That depends on how you define ‘using a model as a base’. Atkinson and Shiffrin’s model is 45 years old. 45 years is a long time in the fast-developing field of brain research, so claiming that modern treatments use it as their base is a bit like claiming that modern treatments of blood circulation are based on William Harvey’s work (1628) or that modern biological classification is based on Carl Linnaeus’ system (1735). It would be true to say that modern treatments are derived from those models, but our understanding of circulation and biological classification has changed significantly since then, so the early models are almost invariably referred to only in an historical context. A modern treatment of cognitive architecture might mention Atkinson & Shiffrin if describing the history of memory research, but I couldn’t see why anyone would use it as a base for an educational theory – because the reality has turned out to be a lot more complicated than Atkinson and Shiffrin could have known at the time.

Atkinson and Shiffrin’s model was influential because it provided a coherent account of some apparently contradictory research findings about the characteristics of human memory. It was also based on the idea that features of information processing systems could be universally applied; that computers worked according to the same principles as did the nervous systems of sea slugs or the human brain. That idea wasn’t wrong, but the features of information processing systems have turned out to be a bit more complex than was first imagined.

The ups and downs of analogies

Theoretical models are rather like analogies; they are useful in explaining a concept that might otherwise be difficult for people to grasp. Atkinson and Shiffrin’s model essentially made the point that human memory wasn’t a single thing that behaved in puzzlingly different ways in different circumstances, but that it could have three components, each of which behaved consistently but differently.

But there’s a downside to analogies (and theoretical models); sometimes people forget that analogies are for illustrative purposes only, and that models show what hypotheses need to be tested. So they remember the analogy/model and forget what it’s illustrating, or they assume the analogy/model is an exact parallel of the reality, or, as I think has happened in this case, the analogy/model takes on a life of its own.

You can read most of Atkinson & Shiffrin’s chapter about their model here. There’s a diagram on p. 113. Atkinson and Shiffrin’s model is depicted as consisting of three boxes. One box is the ‘sensory register’ – sensory memory that persists for a very short time and then fades away. The second box is a short-term store with a very limited capacity (5–9 items of information) that can retain that information for a few seconds. The third box is a long-term store, where information is retained indefinitely. The short-term and long-term stores are connected to each other and information can be transferred between them in both directions. The model was based on what was known in 1968 about how memory behaved, but Atkinson and Shiffrin are quite explicit that there was a lot that wasn’t known.
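For readers who find a sketch clearer than a diagram, here’s the three-box model as toy code. The lifetimes and capacities are schematic stand-ins, not empirical values:

```python
import time
from collections import deque

# A toy rendering of the 1968 three-box diagram.
class SensoryRegister:
    def __init__(self, lifetime=0.5):
        self.trace, self.stamp, self.lifetime = None, 0.0, lifetime
    def register(self, stimulus):
        self.trace, self.stamp = stimulus, time.monotonic()
    def read(self):
        # The sensory trace fades unless it's read almost immediately.
        fresh = time.monotonic() - self.stamp < self.lifetime
        return self.trace if fresh else None

short_term_store = deque(maxlen=7)  # limited capacity, seconds-long retention
long_term_store = set()             # indefinite retention

def rehearse(item):
    short_term_store.append(item)   # displaces older items if the store is full
    long_term_store.add(item)       # transfer to the long-term store
    # ...and retrieval runs the other way: long-term -> short-term

sensory = SensoryRegister()
sensory.register("teacher's voice")
item = sensory.read()               # still fresh, so it can be passed on
if item:
    rehearse(item)
print(list(short_term_store), long_term_store)
```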

Memories are made of this

Anyone looking at Atkinson & Shiffrin’s model for the first time could be forgiven for thinking that the long-term memory ‘store’ is like a library where memories are kept. That was certainly how many people thought about memory at the time. One of the problems with that way of thinking about memory is that the capacity required to store all the memories that people clearly do store would exceed the number of cells in the brain, and that accessing the memories by systematically searching through them would take a very long time – which it often doesn’t.

This puzzle was solved by the gradual realisation that the brain didn’t store individual memories in one place as if they were photographs in a huge album, but that ‘memories’ were activated via a vast network of interconnected neurons. A particular stimulus would activate a particular part of the neural network and that activation is the ‘memory’.
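The difference between the two views is essentially the difference between searching through records and letting the cue itself do the work. A minimal sketch, with invented records:

```python
# The 'library' view: retrieval means searching through stored records.
album = [{"cue": f"event {i}", "detail": f"what happened at event {i}"}
         for i in range(100_000)]
wanted = next(r for r in album if r["cue"] == "event 99999")  # slow linear search

# The 'network' view: the cue itself activates the relevant connections,
# so retrieval time doesn't grow with the amount already stored.
network = {r["cue"]: r["detail"] for r in album}
detail = network["event 99999"]  # direct, content-addressed access
```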

For example, if I see an apple, the pattern of light falling on my retina will trigger a chain of electrical impulses that activates all the neurons that have previously been activated in response to my seeing an apple. Or hearing about or reading about or eating apples. I will recall other apples I’ve seen, how they smell and taste, recipes that use apples, what the word ‘apple’ sounds like, how it’s spelled and written, ‘apple’ in other languages etc. That’s why memories can (usually) be retrieved so quickly. You don’t have to search through all memories to find the one you want. As Antonio Damasio puts it;

Images are not stored as facsimile pictures of things, or events or words, or sentences…In brief, there seem to be no permanently held pictures of anything, even miniaturized, no microfiches or microfilms, no hard copies… as the British psychologist Frederic Bartlett noted several decades ago, when he first proposed that memory is essentially reconstructive.” (p.100)

But Atkinson and Shiffrin don’t appear to have thought of memory in this way when they developed their model. Their references to ‘store’ and ‘search’ suggest they saw memory as more of a library than a network. That’s also how Kirschner, Sweller & Clark seem to view it. Although they say “our understanding of the role of long-term memory in human cognition has altered dramatically over the last few decades” (p.76), they repeatedly refer to long-term memory as a ‘store’ ‘containing huge amounts of information’. I think that description is misleading. Long-term memory is a property of neural networks – if any information is ‘stored’ it’s stored in the pattern and strength of the connections between neurons.

This is especially noticeable in the article the authors published in 2012 in American Educator, from which it’s difficult not to draw the conclusion that long-term memory is a store that contains many thousands of schemas, rather than a highly flexible network of connections that can be linked in an almost infinite number of ways.

Where did I put my memory?

In the first paper I mentioned, Kirschner, Sweller & Clark also refer to long-term memory and working memory as ‘structures’. Although they could mean ‘configurations’, the use of ‘structures’ does give the impression that there’s a bit of the brain dedicated to storing information long-term and another where it’s just passing through. Although some parts of the brain do have dedicated functions, those localities should be thought of as localities within a network of neurons. Information isn’t stored in particular locations in the brain; it’s distributed across the network, although particular connections are located in particular places.

Theories having a life of their own

Atkinson and Shiffrin’s model isn’t exactly wrong; human memory does encompass short-lived sensory traces, short-term buffering and information that’s retained indefinitely. But implicit in their model are some assumptions about the way memory functions that have been superseded by later research.

At first I couldn’t figure out why anyone would base an educational theory on an out-dated conceptual model. Then it occurred to me that that’s exactly what’s happened in respect of theories about child development and autism. In both cases, someone has come up with a theory based on Freud’s ideas about children. Freud’s ideas in turn were based on his understanding of genetics and how the brain worked. Freud died in 1939, over a decade before the structure of DNA was discovered, and two decades before we began to get a detailed understanding of how brains process information. But what happened to the theories of child development and autism based on Freud’s understanding of genetics and brain function, is that they developed an independent existence and carried on regardless, instead of constantly being revised in the light of new understandings of genetics and brain function. Theories dominating autism research are finally being presented with a serious challenge from geneticists, but child development theories still have some way to go. Freud did a superb job with the knowledge available to him, but that doesn’t mean it’s a good idea to base a theory on his ideas as if new understandings of genetics and brain function haven’t happened.

Again I completely agree with Kirschner, Sweller & Clark that “any instructional procedure that ignores the structures that constitute human cognitive architecture is not likely to be effective”, but basing an educational theory on one aspect of human cognitive architecture – memory – and on an outdated concept of memory at that, is likely to be counterproductive.

A Twitter discussion of the Kirschner, Sweller & Clark model centred around the role of working memory, which is what I plan to tackle in my next post.

References

Atkinson, R, & Shiffrin, R (1968). Human memory: A proposed system and its control processes. In K. Spence & J. Spence (Eds.), The psychology of learning and motivation (Vol. 2, pp. 89–195). New York: Academic Press
Clark, RE, Kirschner, PA & Sweller, J (2012). Putting students on the path to learning: The case for fully guided instruction, American Educator, Spring.
Damasio, A (1994). Descartes’ Error, Vintage Books.
Kirschner, PA, Sweller, J & Clark, RE (2006). Why Minimal Guidance During Instruction Does Not Work: An Analysis of the Failure of Constructivist, Discovery, Problem-Based, Experiential, and Inquiry-Based Teaching. Educational Psychologist, 41, 75–86.