Kieran Egan’s “The educated mind” 2

The second post in a two-part review of Kieran Egan’s book The Educated Mind: How Cognitive Tools Shape Our Understanding.

For Egan, a key point in the historical development of understanding was the introduction by the Greeks of a fully alphabetic representation of language – it included symbols for vowels as well as consonants. He points out that being able to represent speech accurately in writing gives people a better understanding of how they use language and therefore of the concepts that language represents. Egan attributes the flowering of Greek reasoning and knowledge to their alphabet “from which all alphabetic systems are derived” (p.75).

This claim would be persuasive if it were accurate. But it isn’t. As far as we know, the Phoenicians – renowned traders – invented the first alphabetic representation of language. It was a consonantal alphabet that reflected the structure of Semitic languages and it spread through the Middle East. The Greeks adapted it, introducing symbols for vowels. This wasn’t a stroke of genius on their part – Semitic writing systems also used symbols for vowels where required for disambiguation – but a necessary addition because Greek is an Indo-European language with a syllabic structure. The script used by the Mycenaean civilisation that preceded the Greeks was a syllabic one.

“a distinctive kind of literate thinking”

Egan argues that this alphabet enabled the Greeks to develop “extended discursive writing” that “is not an external copy of a kind of thinking that goes on in the head; it represents a distinctive kind of literate thinking” (p.76). I agree that extended discursive writing changes thinking, but I’m not convinced that it’s distinctive nor that it results from literacy.

There’s been some discussion amongst teachers recently about the claim that committing facts to long-term memory mitigates the limitations of working memory. Thorough memorisation of information certainly helps – we can recall it quickly and easily when we need it – but we can still only juggle half-a-dozen items at a time in working memory. The pre-literate and semi-literate civilisations that preceded the Greeks relied on long-term memory for the storage and transmission of information because they didn’t have an alternative. But long-term memory has its own limitations in the form of errors, biases and decay. Even people who had memorisation down to a fine art were obliged to develop writing in order to have an accurate record of things that long-term memory isn’t good at handling, such as what’s in sealed sacks and jars and how old it is.

Being able to represent spoken language in writing takes things a step further. Written language not only circumvents the weaknesses of long-term memory, it helps with the limitations of working memory too. Extended discursive writing can encompass thousands of facts, ideas and arguments that a speaker and a listener would find it impossible to keep track of in conversation. So extended discursive writing doesn’t represent “a distinctive kind of literate thinking” so much as a significant extension of pre-literate thinking.

the Greek miracle

It’s true that the sudden arrival in Greece of “democracy, logic, philosophy, history, drama [and] reflective introspection” is, as Egan says, “explainable in large part as an implication of the development and spread of alphabetic literacy” (p.76). But although alphabetic literacy might be a necessary condition for the “Greek miracle”, it isn’t a sufficient one.

Like the economies of all the civilisations that preceded them, the economies of the Greek city states were predominantly agricultural, although they also supported thriving industries in mining, metalwork, leatherwork and pottery. Over time agricultural communities had figured out more efficient ways of producing, storing and trading food. Communities learn from each other, so sooner or later one of them would produce enough surplus food to free up some of its members to focus on thinking and problem-solving, and would have the means to make a permanent record of the thoughts and solutions that emerged. The Greeks used agricultural methods employed across the Middle East and adapted the Phoenician alphabet, and slavery fuelled the Greek economy as it had those of previous civilisations. The literate Greeks were standing on the shoulders of pre-literate Middle Eastern giants.

The ability to make a permanent record of thoughts and solutions gave the next generation of thinkers and problem-solvers a head start and created the virtuous cycle of understanding that’s continued almost unabated to the present day. I say almost unabated, because there have been periods during which it’s been impossible for communities to support thinkers and problem-solvers; earthquakes, volcanic eruptions, drought, flood, disease, war and invasion have all had a devastating and long-term impact on food production and on the infrastructure that relies on it.

language, knowledge and understanding

Egan’s types of understanding – Somatic, Mythic, Romantic, Philosophic and Ironic – have descriptive validity; they do reflect the way understanding has developed historically, and the way it develops in children. But from a causal perspective, although those phases correlate with literacy they also correlate with the complexity of knowledge. As complexity of knowledge increases, so understanding shifts from binary to scalar to systematic to the exceptions to systems; binary classifications, for example, are characteristic of the way people, however literate they are, tend to categorise knowledge in a domain that’s new to them (e.g. Lewandowski et al, 2005).

Egan doesn’t just see literacy as an important factor in the development of understanding, he frames understanding in terms of literacy. What this means is that in Egan’s framework, knowledge (notably pre-verbal and non-verbal knowledge) has to get in line behind literacy when it comes to the development of understanding. It also means that Egan overlooks the key role of agriculture and trade in the development of writing systems and of the cultures that invented them. And that apprenticeship, for millennia widely used as a means of passing on knowledge, is considered only in relation to ‘aboriginal’ cultures (p.49). And that Somatic understanding is relegated to a few pages at the end of the chapter on the Ironic.

non-verbal knowledge

These are significant oversights. Non-verbal knowledge is a sine qua non for designers, artisans, architects, builders, farmers, engineers, mariners, surgeons, physiotherapists, artists, chefs, parfumiers, musicians – the list goes on and on. It’s true that much of the knowledge associated with these occupations is transmitted verbally, but much of it can’t be transmitted through language at all; it can be acquired only by looking, listening or doing. Jenny Uglow in The Lunar Men attributes the speed at which the industrial revolution took place not to literacy, but to the development of a way to reproduce technical drawings accurately.

Egan appears sceptical about practical people and practical things because when

“those who see themselves as practical people engaging in practical things [who] tend not to place any value on acquiring the abstract languages framed to deal with an order that underlies surface diversity” are “powerful in government, education departments and legislatures, pressures mount for an increasingly down-to-earth, real-world curriculum. Abstractions and theories are seen as idle, ivory-tower indulgences removed from the gritty reality of sensible life.” (p.228)

We’re all familiar with the type of people Egan refers to, and I’d agree that the purpose of education isn’t simply to produce a workforce for industry. But there are other practical people engaging in practical things who are noticeable by their absence from this book: farmers, craftspeople, traders and engineers who are very interested in abstractions, theories and the order that underlies surface diversity. The importance of knowledge that’s difficult to verbalise has significant implications for the curriculum and for the traditional academic/vocational divide. Although there is clearly a difference between ‘abstractions and theories’ and their application, theory and application are interdependent; neither is more important than the other, something that policy-makers often find difficult to grasp.

Egan acknowledges that there’s a problem with emphasising the importance of non-verbal knowledge in circles that assume that language underpins understanding. As he points out “Much modernist and postmodernist theory is built on the assumption that human understanding is essentially languaged understanding” (p.166). Egan’s framework elbows aside language to make room for non-verbal knowledge, but it’s a vague, incoherent “ineffable” sort of non-verbal knowledge that’s best expressed linguistically through irony (p.170). It doesn’t appear to include the very coherent, concrete kind of non-verbal knowledge that enables us to grow food, build bridges or carry out heart-transplants.

the internal coherence of what’s out there

Clearly, bodies of knowledge transmitted from person to person via language will be shaped by language and by the thought-processes that produce it, so the knowledge transmitted won’t be 100% complete, objective or error-free. But a vast amount of knowledge refers to what’s out there, and what’s out there has an existence independent of our thought-processes and language. What’s out there also has an internally coherent structure that becomes clearer the more we learn about it, so over time our collective bodies of knowledge more accurately reflect what’s out there and become more internally coherent despite their incompleteness, subjectivity and errors.

The implication is that in education, the internal coherence of knowledge itself should play at least some part in shaping the curriculum. But because the driving force behind Egan’s framework is literacy rather than knowledge, the internal coherence of knowledge can’t get a word in edgeways. During the Romantic phase of children’s thinking, for example, Egan recommends introducing topics randomly to induce ‘wonder and awe’ (p.218), rather than introducing them systematically to help children make sense of the world. To me this doesn’t look very different from the “gradual extension from what is already familiar” (p.86) approach of which Egan is pretty critical. I thought the chapter on Philosophic understanding might have something to say about this but it’s about how people think about knowledge rather than the internal coherence of knowledge itself – not quite the same thing.

the cherries on the straw hat of society

The sociologist Jacques Ellul once described hippies as the cherries on the straw hat of society*, meaning that they were in a position to be critical of society only because of the nature of the society of which they were critical. I think this also serves as an analogy for Egan’s educational framework. He’s free to construct an educational theory framed solely in terms of literacy only because of the non-literate knowledge of practical people like farmers, craftspeople, traders and engineers. That brings me back to my original agricultural analogy; wonder and awe, like apple blossom and the aroma of hops, might make our experience of education and of agriculture transcendent, but if it wasn’t for coherent bodies of non-verbal knowledge and potatoes, swedes and Brussels sprouts, we wouldn’t be in a position to appreciate transcendence at all.


Lewandowski, G., Gutschow, A., McCartney, R., Sanders, K. & Shinners-Kennedy, D. (2005). What novice programmers don’t know. Proceedings of the First International Workshop on Computing Education Research, 1-12. ACM, New York, NY.

Uglow, J. (2003). The Lunar Men: The Friends Who Made the Future. Faber & Faber.

*I can’t remember which of Ellul’s books this reference is from and can’t find it quoted anywhere. If anyone knows, I’d be grateful for the source.

Kieran Egan’s “The educated mind” 1

I grew up in a small hamlet on the edge of the English Fens. The clay soil it was built on retains nutrients and moisture, so, well-drained, it provides an ideal medium for arable farming. Arable crops aren’t very romantic. The backdrop to my childhood wasn’t acres of lacy apple blossom in spring or aromatic hops in summer, although there were a few fields of waving golden wheat. I grew up amongst potatoes, swedes and Brussels sprouts. Not romantic at all, but the produce of East Anglia has long contributed to the UK population getting through the winter.

A few weeks ago on Twitter Tim Taylor (@imagineinquiry) asked me what I thought about Kieran Egan’s book The Educated Mind: How Cognitive Tools Shape Our Understanding. This book is widely cited by teachers, so I read it. It reminded me of the sticky clay and root vegetables of my childhood – because sticky clay, root vegetables and other mundane essentials are noticeable by their absence from Egan’s educational and cultural framework. For Egan, minds aren’t grounded in the earth, but in language. To me the educational model he proposes is the equivalent of clouds of apple blossom and heady hops; breathtakingly beautiful and dizzying, but only if you’ve managed to get through the winter living on swedes and potatoes. My agricultural allusion isn’t just a simile.


Egan begins by claiming there’s a crisis in mass education systems in the West due to their being shaped by three fundamentally incompatible ideas: socialisation, Plato’s concept of reason and knowledge, and Rousseau’s focus on the fulfilment of individual potential. To resolve this inherent conflict, Egan proposes an alternative educational framework based on the concept of recapitulation. Recapitulation was a popular idea in the 19th century, fuelled by the theory of evolution and the discovery that during gestation human embryos go through phases that look remarkably like transformations from simple life forms to more complex ones. As Ernst Haeckel put it, ‘ontogeny recapitulates phylogeny’.

The theory of recapitulation has been largely abandoned by biologists, but is still influential in other domains. Egan applies it to the intellectual tools – the sign systems that children first encounter in others and then internalise – that Vygotsky claimed shape our understanding of the world. Egan maps the historical ‘culturally accumulated complexity in language’ onto the ways that children’s understanding changes as they get older and proposes that what children are taught and the way they are taught should be shaped by five distinct, though not always separate, phases of understanding:

Somatic; pre-linguistic understanding
Mythic; binary opposites – good/bad, strong/weak, right/wrong
Romantic; transcendent qualities – heroism, bravery, wickedness
Philosophic; the principles underlying patterns in information
Ironic; being able to challenge philosophic principles – seeing alternatives.

At first glance Egan’s arguments appear persuasive but I think they have several fundamental weaknesses, all originating in flawed implicit assumptions. First, the crisis in education.

crisis? what crisis?

I can see why a fundamentally incoherent education system might run into difficulties, but Egan observes:

“…today we are puzzled by the schools’ difficulty in providing even the most rudimentary education to students”… “the costs of…social alienation, psychological rootlessness and ignorance of the world and possibilities of human experience within it, are incalculable and heartbreaking.” (p.1)

Wait a minute. There’s no doubt that Western education systems fail to provide even the most rudimentary education for some students, but those form a tiny minority. And although some school pupils could be described as socially alienated, psychologically rootless or ignorant of the world and possibilities of human experience within it, that description wouldn’t apply to many others. So what exactly is the crisis Egan refers to? The only clue I could find was on page 2 where he describes ‘the educational ineffectiveness of our schools’ as a ‘modern social puzzle’ and defines ‘modern’ as beginning with the ‘late nineteenth century development of mass schooling’.

To claim an educational system is in crisis, you have to compare it to something. Critics often make comparisons with other nations, with the best schools (depending on how you define ‘best’) or with what they think the education system should be like. Egan appears to fall into the last category, but to overlook the fact that prior to mass schooling children did well if they managed to learn to read and write at all, and that girls and children with disabilities often got no education whatsoever.

Critics often miss a crucial point. Mass education systems, unlike specific schools, cater for entire populations, with all their genetic variation, socio-economic fluctuations, dysfunctional families, unexpected illnesses and disruptive life events. In a recent radio interview, Tony Little, headmaster of Eton College, was asked if he thought the very successful Eton model could be rolled out elsewhere. He pointed out, dryly, that Eton is a highly selective school, which might just be a factor in its academic success. One obvious reason for the perceived success of schools outside state systems is that those schools are not obliged to teach whichever children happen to live nearby. Even the best education system won’t be problem-free because life is complex and problems are inextricably woven into the fabric of life itself. I’m not suggesting that we tolerate bad schools or have low aspirations. What I am suggesting is that our expectations for mass education systems need to be realistic, not based on idealised speculation.

incompatible ideas

Speculation also comes into play with regard to the incompatibility of the three ideas Egan claims shape mass education in the West. They have certainly shaped education historically and you could see them as in tension. But the ideas are incompatible only if you believe that one idea should predominate or that the aims inherent in each idea can be perfectly met. There’s no reason why schools shouldn’t inculcate social values, teach reason and knowledge and develop individual potential. Indeed, it would be difficult for any school that taught reasoning and knowledge to avoid socialisation because of the nature of schools, and in developing reasoning and knowledge children would move towards realising their potential anyway.

If, as Egan argues, Western mass education systems have been ineffective since they started, his complaint appears to be rooted in assumptions about what the system should be like rather than in evidence about its actual potential. And as long as different constituencies have different opinions about the aims of the education system, someone somewhere will be calling ‘Crisis!’. That doesn’t mean there is one. But Egan believes there is, hence his new framework. The framework is based on the development of written language and its impact on thinking and understanding. For Egan, written language marked a crucial turning point in human history.

why write?

There’s no doubt that written language is an important factor in knowledge and understanding. Spoken language enables us to communicate ideas about things that aren’t right here right now. Written language enables us to communicate with people who aren’t right here right now. The increasing sophistication of written language as it developed from pictograms to syllabaries to alphabets enabled increasingly sophisticated ideas to be communicated. But the widely held belief that language is the determining factor when it comes to knowledge and understanding is open to question.

The earliest known examples of writing were not representations of language as such but records of agricultural products; noting whether it was wheat or barley in the sacks, wine or oil in the jars, when the produce was harvested and how many sacks and jars were stored where. Early writing consisted of pictograms (images of what the symbols represent) and ideograms (symbols for ideas). It was centuries before these were to develop into the alphabetic representations of language we’re familiar with today. To understand why it took so long, we need to put ourselves in the shoes (or sandals) of the early adopters of agriculture.

food is wealth

Farming provides a more reliable food supply than hunting and gathering. Farming allows food that’s surplus to requirements to be stored in case the next harvest is a bad one, or to be traded. Surplus food enables a community to support people who aren’t directly involved in food production: rulers, administrators, artisans, traders, scribes, teachers, a militia to defend its territory. The militia has other uses too. Conquering and enslaving neighbouring peoples has for millennia been a popular way of increasing food production in order to support a complex infrastructure.

But for surplus food to be turned into wealth, storage and trade are required. Storage and trade require written records, and writing is labour-intensive. While scribes are being trained and are maintaining records they can’t do much farming; writing is costly. So communities that can’t predict when a series of bad harvests will next leave them living hand-to-mouth will focus on writing about things that are difficult to remember – what’s in a sealed container, when it was harvested and so on. They won’t need to keep records of how to grow food or look after animals, or of histories, myths, poems or general knowledge, if that information can be transmitted reliably from person to person orally. It’s only when oral transmission stops being reliable that written language, as distinct from record-keeping, starts to look like a good idea. And the more you trade, the more oral transmission gets to be a problem. Travellers might need detailed written descriptions of people, places and things. Builders and engineers using imported designs or materials might need precise instructions.

Spoken language wasn’t the only driving force behind the development of written language – economic and technical factors played a significant role. I don’t think Egan gives these factors sufficient weight in his account of the development of human understanding nor in his model for education, as I explain in the next post.

seven myths about education: cognitive psychology & levels of abstraction

In her book Seven Myths about Education, Daisy Christodoulou claims that a certain set of ideas dominant in English education are misguided and presents evidence to support her claim. She says “Essentially, the evidence here is fairly straightforward and derives mostly from cognitive psychology”.

Whilst reading Daisy’s book, I found it difficult at several points to follow her argument, despite the clarity of her writing style and the validity of the findings from cognitive psychology to which she appeals. It then occurred to me that Daisy and some of the writers she cites were using the same terminology to refer to different things, and different terminology to refer to the same thing. This is almost inevitable if you are drawing together ideas from different knowledge domains, but obviously definitions need to be clarified or you end up with people misunderstanding each other.

In the next few posts, I want to compare the model of cognition that Daisy outlines with a framework for analysing knowledge that’s been proposed by researchers in several different fields. I’ve gone into some detail because of the need to clarify terms.

why cognitive psychology?

Cognitive psychology addresses the way people think, so has obvious implications for education. In Daisy’s view its findings challenge the assumptions implicit in her seven myths. In the final section of her chapter on myth 1, having recapped on what Rousseau, Dewey and Freire have to say, Daisy provides a brief introduction to cognitive psychology. Or at least to the interface between information theory and cognitive psychology in the 1960s and 70s that produced some important theoretical models of human cognition. Typically, researchers would look at how people perceived or remembered things or solved problems, infer a model that explained how the brain must have processed the information involved and would then test it by running computer simulations. Not only did this approach give some insights into how the brain worked, it also meant that software might be developed that could do some of the perceiving, remembering or problem-solving for us. At the time, there was a good deal of interest in expert systems – software that could mimic the way experts thought.

Much of the earlier work in cognitive psychology had involved the biology of the brain. Researchers knew that different parts of the brain specialised in processing different types of information, and that those parts were connected by nerve fibres (the axons of nerve cells, or neurons) activated by tiny electrical impulses. A major breakthrough came when they realised the brain wasn’t constructed like a railway network, with nerve fibres connecting parts of the brain as a track connects stations, but in complex networks that were more like the veins in a leaf. Another breakthrough came when they realised information isn’t stored and retrieved in the form of millions of separate representations, like books in a vast library, but in the patterns of connections between the neurons. It’s like the way the same pixels on a computer monitor can display a virtually unlimited number of images, depending on which pixels are activated. A third breakthrough occurred when it was found that the brain doesn’t start off with all its neurons already connected – it creates and dissolves connections as it learns. So connections between facts and concepts aren’t just metaphorical, they are biological too.

Because it’s difficult to investigate functioning brains, computers offered a way of figuring out how information was being processed by the brain. Although this was a fruitful area of research in the 1960s and 70s, researchers kept running into difficulties. Problems arose because the human brain isn’t built like a computer; it’s more like a Heath Robinson contraption cobbled together from spare parts. It works after a fashion, and some parts of it are extremely efficient, but if you want to understand how it works, you have to get acquainted with its idiosyncrasies. The idiosyncrasies exist because the brain is a biological organ with all the quirky features that biological organs tend to have. Trying to figure out how it works from the way people use it has limitations; information about the biological structure and function of the brain is needed to explain why brains work in some rather odd ways.

Since the development of scanning techniques in the 1980s, the attention of cognitive science has shifted back towards the biological mechanisms involved. This doesn’t mean that the information theory approach is defunct – far from it – there’s been considerable interest in computational models of cognition and in cognitive errors and biases, for example. But the information theory and biological approaches are complementary; each approach makes more sense in the light of the other.

more than artificial intelligence

Daisy points out that “much of the modern research into intelligence was inspired and informed by research into artificial intelligence” (p.18). Yes, it was, but work on biological mechanisms, perception, attention and memory was going on simultaneously. Then “in the 1960s and 1970s researchers agreed on a basic mental model of cognition that has been refined and honed since then.” That’s one way of describing the sea change in cognitive science that’s happened since the introduction of scanning techniques, but it’s something of an understatement. Daisy then quotes Kirschner, Sweller and Clark: “working memory can be equated with consciousness”. In a way it can, but facts and rules and digits are only a tiny fraction of what consciousness involves, though you wouldn’t know that from reading Daisy’s account. Then there’s the nature of long-term memory. According to Daisy, “when we try to solve any problem, we draw on all the knowledge that we have committed to long-term memory” (p.63). Yes, we do in a sense, but long-term memory is notoriously unreliable.

What Daisy didn’t say about cognitive psychology is as important as what she did say. Aside from all the cognitive research that wasn’t about artificial intelligence, Daisy fails to mention a model of working memory that’s dominated cognitive psychology for 40 years – the one proposed by Baddeley and Hitch in 1974. Recent research has shown that it’s an accurate representation of what happens in the brain. But despite being a leading authority on working memory, Baddeley gets only one mention in an endnote in Daisy’s book (the same ‘more technical’ reference that Willingham cites – also in an endnote) and isn’t mentioned at all in the Kirschner, Sweller and Clark paper. At the ResearchED conference in Birmingham in April this year, one teacher who’d given a presentation on memory told me he’d never heard of Baddeley. I’m drawing attention to this not because I have a special interest in Baddeley’s model, but because omitting his work from a body of evidence about working memory is a bit like discussing the structure of DNA without mentioning Crick and Watson’s double helix, or discussing 19th century literature without mentioning Dickens. Also noticeable by her absence is Susan Gathercole, a professor of cognitive psychology at York, who researches working memory problems in children. Her work couldn’t be more relevant to education if it tried, but it’s not mentioned. Another missing name is Antonio Damasio, a neurologist who’s tackled the knotty problem of consciousness – highly relevant to working memory. Because of his background in biology, Damasio takes a strongly embodied view of consciousness; what we are aware of is affected by our physiology and emotions as well as our perceptions and memory. Daisy can’t write about everything, obviously, but it seemed odd to me that her model of cognition is drawn only from concepts central to one strand of one discipline at one period of time, not from an overview of the whole field. It was also odd that she cited secondary sources when work by people who have actually done the relevant research is readily accessible.

does this matter?

On her blog, Daisy sums up the evidence from cognitive psychology in three principles: “working memory is limited; long-term memory is powerful; and we remember what we think about”. When I’ve raised the issue of memory and cognition being more complex than Willingham’s explicitly ‘very simple’ model, teachers who support Daisy’s thesis have asked me if that makes any difference.

Other findings from cognitive psychology don’t make any difference to the three principles as they stand. Nor do they make it inappropriate for teachers to apply those principles, as they stand, to their teaching. But they do make a difference to the conclusions Daisy draws about facts, schemata and the curriculum. Whether they refute the myths or not depends on those conclusions.

a model of cognition

If I’ve understood correctly, Daisy is saying that working memory (WM) has limited capacity and limited duration, but long-term memory (LTM) has a much greater capacity and duration. If we pay attention to the information in WM, it’s stored permanently in LTM. The brain ‘chunks’ associated information in LTM, so that several smaller items can be retrieved into WM as one larger item, in effect increasing the capacity of WM. Daisy illustrates this by comparing the difficulty of recalling a string of 16 numerals


with a string of 16 letters

the cat is on the mat

The numerals are difficult to recall, but the letters are easily recalled because our brains have already chunked those frequently encountered letter patterns into words, the capacity of WM is large enough to hold six words, and once the words are retrieved we can quickly decompose them into their component letters. So in Daisy’s model, memorising information increases the amount of information WM can handle.
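A toy calculation shows the effect of chunking on the number of items working memory has to hold (my illustration, not Daisy’s – the capacity figure is the usual rough estimate):

```python
# Toy illustration of chunking: the same 16 letters impose a much smaller
# working-memory load once they are chunked into familiar words.
WM_CAPACITY = 7  # roughly Miller's "seven, plus or minus two" items

sentence = "the cat is on the mat"

letters = [c for c in sentence if c != " "]  # unchunked: one item per letter
words = sentence.split()                     # chunked: one item per word

print(len(letters))  # 16 items - exceeds working-memory capacity
print(len(words))    # 6 items - comfortably within capacity
```

The information is identical in both cases; only the number of units working memory has to juggle changes.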

I was with her up to that point. It was the conclusions Daisy then went on to draw about facts, schemata and the curriculum that puzzled me. The aha! moment came when I re-read her comments on Bloom’s taxonomy of educational objectives. Bloom adopts a concept that’s important in many fields, including information theory and cognitive psychology: the concept of levels of abstraction, sometimes referred to as levels of granularity.

levels of abstraction

Levels of abstraction form an integral part of some knowledge domains. Chemists are familiar with thinking about their subject at the subatomic, atomic and molecular levels; biologists with thinking about a single organism at the molecular, cellular, organ, system or whole body level; geographers and sociologists with thinking about a population at the household, city or national level. It’s important to note three things about levels of abstraction:

First, the same fundamental entities are involved at different levels of abstraction. The subatomic ‘particles’ in a bowl of common salt are the same particles whether you’re observing their behaviour as subatomic particles, as atoms of sodium and chlorine or as molecules of sodium chloride. Cells are particular arrangements of chemicals, organs are particular arrangements of cells, and the circulatory or respiratory systems are particular arrangements of organs. The same people live in households, cities or nations.

Second, entities behave differently at different levels of abstraction. Molecules behave differently to their component atoms (think of the differences between sodium, chlorine and sodium chloride), the organs of the body behave differently to the cells they are built from, and nations behave differently to the populations of cities and households.

Third, what happens at one level of abstraction determines what happens at the next level up. Sodium chloride has its properties because it’s formed from sodium and chlorine – if you replaced the sodium with potassium you’d get a chemical compound that tastes very different to salt. And if you replaced the cells in the heart with liver cells you wouldn’t have a heart, you’d have a liver. The behaviour of nations depends on how the population is made up.

Bloom’s taxonomy

The levels of abstraction Bloom uses in his taxonomy are (starting from the bottom) knowledge, comprehension, application, analysis, synthesis and evaluation. In her model of cognition Daisy refers to several levels of abstraction, although she doesn’t call them that and doesn’t clearly differentiate between them. That might be intentional. She describes Bloom’s taxonomy as a ‘metaphor’ and says it’s a misleading one because it implies that ‘the skills are somehow separate from knowledge’ and that ‘knowledge is somehow less worthy and important’ (p.21). Whether or not Bloom’s taxonomy is accurate, Daisy’s perception of it as a ‘metaphor’, together with her focus on the current popular emphasis on higher-level skills, seems to lead her to overlook the core principle implicit in the taxonomy: you can’t evaluate without synthesis, synthesise without analysis, analyse without application, or apply without comprehension. And you can’t do any of those things without knowledge. The various processes are described as ‘lower’ and ‘higher’ not because a value judgement is being made about their importance or because they involve entirely different things, but because the higher ones are derived from the lower ones.

It’s possible, of course, that educational theorists have also got hold of the wrong end of the stick and have seen Bloom’s six levels of abstraction not as dependent on one another but as independent from each other. Daisy’s comments on Bloom explained why I’ve had some confusing conversations with teachers about ‘skills’. I’ve been using the term in a generic sense to denote facility in handling knowledge; the teachers have been using it in the narrow sense of specific higher-level skills required by the national curriculum.

Daisy appears to be saying that the relationship between knowledge and skills isn’t hierarchical. She provides two alternative ‘metaphors’: ED Hirsch’s scrambled egg and Joe Kirby’s double helix, representing the dynamic, interactive relationship between knowledge and skills (p.21). I think Joe’s metaphor is infinitely better than Hirsch’s, but it doesn’t take into account the different levels of abstraction of knowledge.

Bloom’s taxonomy is a framework for analysing educational objectives that are dependent on knowledge. In the next post, I look at a framework for analysing knowledge itself.

cognitive load and learning

In the previous two posts I discussed the model of working memory used by Kirschner, Sweller & Clark and how working memory and long-term memory function. The authors emphasise that their rejection of minimal guidance approaches to teaching is based on the limited capacity of working memory in respect of novel information, and that even if experts might not need much guidance “…nearly everyone else thrives when provided with full, explicit instructional guidance (and should not be asked to discover any essential content or skills)” (Clark, Kirschner & Sweller, p.6). Whether they are right or not depends on what they mean by ‘novel’ information.

So what’s new?

Kirschner, Sweller & Clark define novel information as ‘new, yet to be learned’ information that has not been stored in long-term memory (p.77). But novelty isn’t a simple case of information being either yet-to-be-learned or stored-in-long-term-memory. If I see a Russian sentence written in Cyrillic script, its novelty value to me on a scale of 0-10 would be about 9. I can recognise some Cyrillic letters and know a few Russian words, but my working memory would be overloaded after about the third letter because of the multiple operations involved in decoding, blending and translating. A random string of Arabic numerals would have a novelty value of about 4, however, because I am very familiar with Arabic numerals; the only novelty would be in their order in the string. The sentence ‘the cat sat on the mat’ would have a novelty value close to zero because I’m an expert at chunking the letter patterns in English and I’ve encountered that sentence so many times.

Because novelty isn’t an either/or thing but sits on a sliding scale, and because the information coming into working memory can vary from simple to complex, ‘new, yet to be learned’ information can vary in both novelty and complexity.

You could map it on a 2×2 matrix like this:

novelty, complexity & cognitive load

A sentence such as ‘the monopsonistic equilibrium at M should now be contrasted with the equilibrium that would obtain under competitive conditions’ is complex (it contains many bits of information) but its novelty content would depend on the prior knowledge of the reader. It would score high on both the novelty and complexity scales for the average five-year-old. I don’t understand what the sentence means, but I do understand many of the words, so it would be mid-range in both novelty and complexity for me. An economist would probably give it a 3 for complexity but 0 for novelty. Trying to teach a five-year-old what the sentence meant would completely overload their working memory, but it would be a manageable challenge for mine, and an economist would probably feel bored.
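You could caricature that matrix in a few lines of code (the 0-10 scoring scale and the threshold are my own illustration, not anything from the paper):

```python
def quadrant(novelty, complexity, threshold=5):
    """Place a piece of information on the 2x2 novelty/complexity matrix.

    Scores run 0-10; the threshold splitting 'low' from 'high' is arbitrary.
    """
    nov = "high novelty" if novelty >= threshold else "low novelty"
    cpx = "high complexity" if complexity >= threshold else "low complexity"
    return f"{nov}, {cpx}"

# The monopsony sentence, scored for two hypothetical readers:
print(quadrant(9, 9))  # five-year-old: high novelty, high complexity
print(quadrant(0, 3))  # economist: low novelty, low complexity
```

The point of the sketch is simply that cognitive load depends on both axes, and on the reader, not on the information alone.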

Kirschner, Sweller & Clark reject ‘constructivist, discovery, problem-based, experiential and inquiry-based approaches’ on the basis that they overload working memory and the excessive cognitive load means that learners don’t learn as efficiently as they would using explicit direct instruction. If only it were that simple.

‘Constructivist, discovery, problem-based, experiential and inquiry-based approaches’ weren’t adopted initially because teachers preferred them or because philosophers thought they were a good idea, but because by the end of the 19th century explicit, direct instruction – the only game in town for fledgling mass education systems – clearly wasn’t as effective as people had thought it would be. Alternative approaches were derived from three strategies that young children apply when learning ‘naturally’.

How young children learn

Human beings are mammals, and young mammals learn by applying three key learning strategies which I’ll call ‘immersion’, trial-and-error and modelling (imitating the behaviour of other members of their species). By ‘strategy’, I mean an approach that they use, not that baby mammals sit down and figure things out from first principles; all three strategies are outcomes of how mammals’ brains work.


Most young children learn to walk, talk, feed and dress themselves and acquire a vast amount of information about their environment with very little explicit, direct instruction. And they acquire those skills pretty quickly and apparently effortlessly. The theory was that if you put school age children in a suitable environment, they would pick up other skills and knowledge equally effortlessly, without the boredom of rote-learning and the grief of repeated testing. Unfortunately, what advocates of discovery, problem-based, experiential and inquiry-based learning overlooked was the sheer amount of repetition involved in young children learning ‘naturally’.

Although babies’ learning is kick-started by some hard-wired processes such as reflexes, babies have to learn to do almost everything. They repeatedly rehearse their gross motor skills, fine motor skills and sensory processing. They practise babbling, crawling, toddling and making associations at every available opportunity. They observe things and detect patterns. A relatively simple skill like face-recognition, grasping an object or rolling over might take only a few attempts. More complex skills like using a spoon, crawling or walking take more. Very complex skills like using language require many thousands of rehearsals; it’s no coincidence that children’s speech and reading ability take several years to mature and their writing ability (an even more complex skill) doesn’t usually mature until adulthood.

The reason why children don’t learn to read, do maths or learn foreign languages as ‘effortlessly’ as they learn to walk or speak in their native tongue is largely down to the number of opportunities they have to rehearse those skills. An hour a day of reading or maths and a couple of French lessons a week bears no resemblance to the ‘immersion’ in motor development and their native language that children are exposed to. Inevitably, it will take them longer to acquire those skills. And if they take an unusually long time, it’s the child, the parent, the teacher or the method that tends to be blamed, not the mechanism by which the skill is acquired.


The second strategy is trial-and-error. It plays a key role in the rehearsals involved in immersion, because it provides feedback to the brain about how the skill or knowledge is developing. Some skills, like walking, talking or handwriting, can only be acquired through trial-and-error because of the fine-grained motor feedback that’s required. Learning by trial-and-error can offer very vivid, never-forgotten experiences, regardless of whether the initial outcome is success or failure.


The third strategy is modelling – imitating the behaviour of other members of the species (and sometimes other species or inanimate objects). In some cases, modelling is the most effective way of teaching because it’s difficult to explain (or understand) a series of actions in verbal terms.

Cognitive load

This brings us back to the issue of cognitive load. It isn’t the case that immersion, trial-and-error and modelling or discovery, problem-based, experiential and inquiry-based approaches always impose a high cognitive load, and that explicit direct instruction doesn’t. If that were true, young children would have to be actively taught to walk and talk, and older ones would never forget anything. The problem with all these educational approaches is that each has initially been seen as appropriate for teaching all knowledge and skills, and has subsequently been rejected as ineffective. That’s not at all surprising, because different types of knowledge and skill require different strategies for effective learning.

Cognitive load is also affected by the complexity of incoming information and how novel it is to the learner. Nor is cognitive load confined to the capacity of working memory. 40 minutes of explicit, direct instruction in novel material, even if presented in well-paced, working-memory-sized chunks, would pose a significant challenge to most brains. The reason, as I pointed out previously, is that the transfer of information from working memory to long-term memory is a biological process that takes time, resources and energy. Research into changes in the motor cortex suggests that the time involved might be as little as a few hours, but even that has implications for the pace at which students are expected to learn and how much new information they can process. There’s a reason why someone would find acquiring large amounts of new information tiring – their brain uses up a considerable amount of glucose getting that information embedded in the form of neural connections. The inevitable delay between information coming into the brain and being embedded in long-term memory suggests that down-time is as important as learning time – calling into question the assumption that the longer children spend actively ‘learning’ the more they will know.

Final thoughts

If I were forced to choose between constructivist, discovery, problem-based, experiential and inquiry-based approaches to learning and explicit, direct instruction, I’d plump for explicit, direct instruction, because the world we live in works according to discoverable principles and it makes sense to teach kids what those principles are rather than expect them to figure them out for themselves. However, it would have to be a forced choice, because we do learn through constructing our knowledge and through discovery, problem-solving, experiencing and inquiring, as well as by explicit, direct instruction. The most appropriate learning strategy will depend on the knowledge or skill being learned.

The Kirschner, Sweller & Clark paper left me feeling perplexed and rather uneasy. I couldn’t understand why the authors frame the debate about educational approaches in terms of minimal guidance ‘on one side’ and direct instructional guidance ‘on the other’, when self-evidently the debate is more complex than that. Nor why they refer to Atkinson & Shiffrin’s model of working memory when Baddeley & Hitch’s more complex model is so widely accepted as more accurate. Nor why they omit any mention of the biological mechanisms involved in learning; not only are the biological mechanisms responsible for the way working memory and long-term memory operate, they also shed light on why any single educational approach doesn’t work for all knowledge, all skills – or even all students.

I felt it was ironic that the authors place so much emphasis on the way novices think but present a highly complex debate in binary terms – a classic feature of the way novices organise their knowledge. What was also ironic was that despite their emphasis on explicit, direct instruction, they failed to mention several important features of memory that would have helped a lay readership understand how memory works. This is all the more puzzling because some of these omissions (and a more nuanced model of instruction) are referred to in a paper on cognitive load by Paul Kirschner published four years earlier.

In order to fully understand what Kirschner, Sweller & Clark are saying, and to decide whether they were right or not, you’d need a fair amount of background knowledge about how brains work. To explain that clearly to a lay readership, and to address possible objections to their thesis, the authors would have had to extend the paper’s length by at least 50%. Their paper is just over 10,000 words long, so word-count constraints might have forced them to omit some points. That said, Educational Psychologist doesn’t currently apply a word limit, so maybe the authors were simply trying to keep the concepts as simple as possible.

Simplifying complex concepts for the benefit of a lay readership can certainly make things clearer, but over-simplifying them runs the risk of giving the wrong impression, and I think there’s a big risk of that happening here. Although the authors make it clear that explicit direct instruction can take many forms, they do appear to be proposing a one-size-fits-all approach that might not be appropriate for all knowledge, all skills or all students.

Clark, RE, Kirschner, PA & Sweller, J (2012). Putting students on the path to learning: The case for fully guided instruction, American Educator, Spring.
Kirschner, PA (2002). Cognitive load theory: implications of cognitive load theory on the design of learning. Learning and Instruction, 12, 1–10.
Kirschner, PA, Sweller, J & Clark, RE (2006). Why Minimal Guidance During Instruction Does Not Work: An Analysis of the Failure of Constructivist, Discovery, Problem-Based, Experiential, and Inquiry-Based Teaching. Educational Psychologist, 41, 75–86.

how working memory works

In my previous post I wondered why Kirschner, Sweller & Clark based their objections to minimal guidance in education on Atkinson & Shiffrin’s 1968 model of memory; it’s a model that assumes a mechanism for memory that’s now considerably out of date. A key factor in Kirschner, Sweller & Clark’s advocacy of direct instructional guidance is the limited capacity of working memory, and that’s what I want to look at in this post.

Other models are available

Atkinson & Shiffrin describe working memory as a ‘short-term store’. It has a limited capacity (around 4-9 items of information) that it can retain for only a few seconds. It’s also a ‘buffer’; unless information in the short-term store is actively maintained, by rehearsal for example, it will be displaced by incoming information. Kirschner, Sweller & Clark note that ‘two well-known characteristics’ of working memory are its limited duration and capacity when ‘processing novel information’ (p.77), suggesting that their model of working memory is very similar to Atkinson & Shiffrin’s short-term store.
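Atkinson & Shiffrin’s short-term store behaves rather like a small fixed-size buffer. A minimal sketch, assuming a capacity of seven items and no rehearsal, might look like this:

```python
from collections import deque

# Minimal sketch of Atkinson & Shiffrin's short-term store: a fixed-size
# buffer in which new items displace the oldest unrehearsed ones.
short_term_store = deque(maxlen=7)  # capacity somewhere in the 4-9 item range

for item in ["cat", "dog", "fish", "bird", "cow", "pig", "hen", "fox", "bee"]:
    short_term_store.append(item)  # each arrival may push out the oldest item

print(list(short_term_store))
# ['fish', 'bird', 'cow', 'pig', 'hen', 'fox', 'bee'] - 'cat' and 'dog'
# were displaced without rehearsal, so they never made it any further
```

This is the ‘buffer’ behaviour in miniature: nothing is wrong with the displaced items, there simply isn’t room to maintain them.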


In 1974 Alan Baddeley and Graham Hitch proposed a more sophisticated model for working memory that included dedicated auditory and visual information processing components. Their model has been revised in the light of more recent discoveries relating to the function of the prefrontal areas of the brain – the location of ‘working memory’. The Baddeley and Hitch model now looks a bit more complex than Atkinson & Shiffrin’s.

Baddeley & Hitch model

You could argue that it doesn’t matter how complex working memory is, or how the prefrontal areas of the brain work; neither alters the fact that the capacity of working memory is limited. Kirschner, Sweller & Clark question the effectiveness of educational methods involving minimal guidance because they increase cognitive load beyond the capacity of working memory. But Kirschner, Sweller & Clark’s model of working memory appears to be oversimplified and doesn’t take into account the biological mechanisms involved in learning.

Biological mechanisms involved in learning

Making connections

Learning is about associating one thing with another, and making associations is what the human brain does for a living. Associations are represented in the brain by connections formed between neurons; the ‘information’ is carried in the pattern of connections. A particular stimulus will trigger a series of electrical impulses through a particular network of connected neurons. So, if I spot my cat in the garden, that sight will trigger a series of electrical impulses that activates a particular network of neurons; the connections between the neurons represent all the information I’ve ever acquired about my cat. If I see my neighbour’s cat, much of the same neural pathway will be triggered because both cats are cats; it will then diverge slightly because I’ve acquired different information about each cat.

Novelty value

Neurons make connections with other neurons via synapses. Our current understanding of the role of synapses in information storage and retrieval suggests that new information triggers the formation of new synapses between neurons. If the same associations are encountered repeatedly, the relevant synapses are used repeatedly and those connections between neurons are strengthened, but if synapses aren’t active for a while, they are ‘pruned’. Toddlers form huge numbers of new synapses, but from the age of three through to adulthood, the number reduces dramatically as pruning takes place. It’s not clear whether synapse formation and pruning are pre-determined developmental phases or whether they happen in response to the kind of information that the brain is processing. Toddlers are exposed to vast amounts of novel information, but novelty rapidly tails off as they get older. Older adults tend to encounter very little novel information, often complaining that they’ve ‘seen it all before’.
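The strengthen-or-prune cycle can be caricatured in a few lines (a cartoon of the principle, not a model of real synapses; the association names and counts are invented):

```python
# Cartoon of synaptic strengthening and pruning: connections that are used
# repeatedly get stronger; connections that go unused are removed.
synapses = {"cat-purr": 1, "cat-whiskers": 1, "cat-algebra": 1}  # new, weak links

experiences = ["cat-purr", "cat-whiskers", "cat-purr"]  # repeated associations
for assoc in experiences:
    synapses[assoc] += 1  # repeated use strengthens the connection

pruned = {k: v for k, v in synapses.items() if v > 1}  # unused links are pruned
print(pruned)  # {'cat-purr': 3, 'cat-whiskers': 2} - 'cat-algebra' is gone
```

The upshot is that what survives in the network depends on what the brain actually encounters, which is why repetition matters so much.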

The way working memory works

Most of the associations made by the brain occur in the cortex, the outer layer of the brain. Sensory information processed in specialised areas of cortex is ‘chunked’ into coherent wholes – what we call ‘perception’. Perceptual information is further chunked in the frontal areas of the brain to form an integrated picture of what’s going on around and within us. The picture that’s emerging from studies of prefrontal cortex is that this area receives, attends to, evaluates and responds to information from many other areas of the brain. It can do this because patterns of the electrical activity from other brain areas are maintained in prefrontal areas for a short time whilst evaluation takes place. As Antonio Damasio points out in Descartes’ Error, the evaluation isn’t always an active, or even a conscious process; there’s no little homunculus sitting at the front of the brain figuring out what information should take priority. What does happen is that streams of incoming information compete for attention. What gets attention depends on what information is coming in at any one time. If something happens that makes you angry during a maths lesson, you’re more likely to pay attention to that than to solving equations. During an exam, you might be concentrating so hard that you are unaware of anything happening around you.

The information coming into prefrontal cortex varies considerably. There’s a constant inflow from three main sources:

• real-time information from the environment via the sense organs;
• information about the physiological state of the body, including emotional responses to incoming information;
• information from the neural pathways formed by previous experience and activated by that sensory and physiological input (Kirschner, Sweller & Clark would call this long-term memory).

Working memory and long-term memory

‘Information’ and models of information processing are abstract concepts. You can’t pick them up or weigh them, so it’s tempting to think of information processing in the brain as an abstract process, involving rather abstract forces like electrical impulses. It would be easy to form the impression from Kirschner, Sweller & Clark’s model that well-paced, bite-sized chunks of novel information will flow smoothly from working memory to long-term memory, like water between two tanks. But the human brain is a biological organ, and it retains and accesses information using some very biological processes. Developing new synapses involves physical changes to the structure of neurons, and those changes take time, resources and energy. I’ll return to that point later, but first I want to focus on something that Kirschner, Sweller & Clark say about the relationship between working memory and long-term memory that struck me as a bit odd:

“The limitations of working memory only apply to new, yet to be learned information that has not been stored in long-term memory. New information such as new combinations of numbers or letters can only be stored for brief periods with severe limitations on the amount of such information that can be dealt with. In contrast, when dealing with previously learned information stored in long-term memory, these limitations disappear.” (p.77)

This statement is odd because it doesn’t tally with Atkinson & Shiffrin’s concept of the short-term store, and isn’t supported by decades of experimental work showing that capacity limitations apply to all information in working memory, regardless of its source. But Kirschner, Sweller & Clark go on to qualify their claim:

“In the sense that information can be brought back from long-term memory to working memory over indefinite periods of time, the temporal limits of working memory become irrelevant.” (p.77)

I think I can see what they’re getting at; because information is stored permanently in long-term memory it doesn’t rapidly fade away and you can access it any time you need to. But you have to access it via working memory, so it’s still subject to working memory constraints. I think the authors are referring implicitly to two ways in which the brain organises information and which increase the effective capacity of working memory – chunking and schemata.


If the brain frequently encounters small items of information that are usually associated with each other, it eventually ‘chunks’ them together and then processes them automatically as single units. George Miller, who in the 1950s did some pioneering research into working memory capacity, noted that people familiar with the binary notation then in widespread use by computer programmers didn’t memorise random lists of 1s and 0s as random lists, but as numbers in the decimal system. So 10 would be remembered as 2, 100 as 4, 101 as 5 and so on. In this way, very long strings of 1s and 0s could be held in working memory in the form of decimal numbers that would automatically be translated back into 1s and 0s when the people taking part in the experiments were asked to recall the list. Morse code experts do the same; they don’t read messages as a series of dots and dashes, but chunk up the patterns of dots and dashes into letters and then into words. Exactly the same process occurs in reading, but we don’t call it chunking, we call it learning to read. Chunking effectively increases the capacity of working memory – but it doesn’t increase it by very much. Curiously, although Kirschner, Sweller & Clark refer to a paper by Egan and Schwartz that’s explicitly about chunking, they don’t mention chunking as such.


What they do mention is the concept of the schema, particularly those of chess players. In the 1940s Adriaan de Groot discovered that expert chess players memorise a vast number of configurations of chess pieces on a board; he called each particular configuration a schema. I get the impression that Kirschner, Sweller & Clark see schemata and chunking as synonymous, even though a schema usually refers to a meta-level way of organising information, like a life-script or an overview, rather than an automatic processing of several bits of information as one unit. It’s quite possible that expert chess players do automatically read each configuration of chess pieces as one unit, but de Groot didn’t call it ‘chunking’ because his research was carried out a decade before George Miller coined the term.

Thinking about everything at once

Whether you call them chunks or schemata, what’s clear is that the brain has ways of increasing the amount of information held in working memory. Expert chess players aren’t limited to thinking about the four or five possible moves for one piece, but can think about four or five possible configurations for all pieces. But it doesn’t follow that the limitations of working memory in relation to long-term memory disappear as a result.

I mentioned in my previous post what information is made accessible via my neural networks if I see an apple. If I free-associate, I think of apples – apple trees – should we cover our apple trees if it’s wet and windy after they blossom? – will there be any bees to pollinate them? – bee viruses – viruses in ancient bodies found in melted permafrost – bodies of climbers found in melted glaciers, and so on. Because my neural connections represent multiple associations I can indeed access vast amounts of information stored in my brain. But I don’t access it all simultaneously. That’s just as well, because if I could access all that information at once my attempts to decide what to do with our remaining windfall apples would be thwarted by totally irrelevant thoughts about mountain rescue teams and St Bernard dogs. In short, if information stored in long-term memory weren’t subject to the capacity constraints of working memory, we’d never get anything done.
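One way to picture that constraint is as a walk over an association network in which only a handful of items are ever active at once (the network and the capacity figure are my own invented illustration):

```python
# Sketch: long-term memory as an association network, accessed through a
# working memory that can only hold a handful of items at a time.
associations = {
    "apple": ["apple tree", "windfalls"],
    "apple tree": ["blossom", "bees"],
    "bees": ["viruses"],
    "viruses": ["permafrost bodies"],
    "permafrost bodies": ["glacier climbers"],
}

def free_associate(start, capacity=4):
    """Follow associations, never holding more than `capacity` items."""
    active = [start]  # current contents of working memory
    while True:
        nxt = [n for item in active for n in associations.get(item, [])
               if n not in active]
        if not nxt:
            return active
        active = (active + nxt)[-capacity:]  # older items drop out of WM

print(free_associate("apple"))
# ['bees', 'viruses', 'permafrost bodies', 'glacier climbers']
```

The whole network is ‘accessible’, but at any moment only a few items are in play – which is exactly why the windfall-apples decision doesn’t get swamped by St Bernard dogs.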

Chess masters (or ornithologists or brain surgeons) have access to vast amounts of information, but in any given situation they don’t need to access it all at once. In fact, accessing it all at once would be disastrous because it would take forever to eliminate information they didn’t need. At any point in any chess game, only a few configurations of pieces are possible, and that number is unlikely to exceed the capacity of working memory. Similarly, even if an ornithologist/brain surgeon can recognise thousands of species of birds/types of brain injury, in any given environment, most of those species/injuries are likely to be irrelevant, so don’t even need to be considered. There’s a good reason for working memory’s limited capacity and why all the information we process is subject to that limit.

In the next post, I want to look at how the limits of working memory impact on learning.


Atkinson, R, & Shiffrin, R (1968). Human memory: A proposed system and its control processes. In K. Spence & J. Spence (Eds.), The psychology of learning and motivation (Vol. 2, pp. 89–195). New York: Academic Press.
Damasio, A (1994). Descartes’ Error. Vintage Books.
Kirschner, PA, Sweller, J & Clark, RE (2006). Why Minimal Guidance During Instruction Does Not Work: An Analysis of the Failure of Constructivist, Discovery, Problem-Based, Experiential, and Inquiry-Based Teaching. Educational Psychologist, 41, 75–86.