evolved minds and education: evolved minds

At the recent Australian College of Educators conference in Melbourne, John Sweller summarised his talk as follows:  “Biologically primary, generic-cognitive skills do not need explicit instruction.  Biologically secondary, domain-specific skills do need explicit instruction.”

[image: sweller.png – Biologically primary and biologically secondary cognitive skills]

This distinction was proposed by David Geary, a cognitive developmental and evolutionary psychologist at the University of Missouri. In a recent blogpost, Greg Ashman refers to a chapter by Geary that sets out his theory in detail.

If I’ve understood it correctly, here’s the idea at the heart of Geary’s model:

*****

The cognitive processes we use by default have evolved over millennia to deal with information (e.g. about predators, food sources) that has remained stable for much of that time. Geary calls these biologically primary knowledge and abilities. The processes involved are fast, frugal, simple and implicit.

But we also have to deal with novel information, including knowledge we’ve learned from previous generations, so we’ve evolved flexible mechanisms for processing what Geary terms biologically secondary knowledge and abilities. The flexible mechanisms are slow, effortful, complex and explicit/conscious.

Biologically secondary processes are influenced by an underlying factor we call general intelligence, or g, related to the accuracy and speed of processing novel information. We use biologically primary processes by default, so they tend to hinder the acquisition of the biologically secondary knowledge taught in schools. Geary concludes the best way for students to acquire the latter is through direct, explicit instruction.

*****

On the face of it, Geary’s model is a convincing one.   The errors and biases associated with the cognitive processes we use by default do make it difficult for us to think logically and rationally. Children are not going to automatically absorb the body of human knowledge accumulated over the centuries, and will need to be taught it actively. Geary’s model is also coherent; its components make sense when put together. And the evidence he marshals in support is formidable; there are 21 pages of references.

However, on closer inspection the distinction between biologically primary and secondary knowledge and abilities begins to look a little blurred. It rests on some assumptions that are the subject of what Geary terms ‘vigorous debate’. Geary does note the debate, but because he plumps for one view, doesn’t evaluate the supporting evidence, and doesn’t go into detail about competing theories, teachers unfamiliar with the domains in question could easily remain unaware of possible flaws in his model. In addition, Geary adopts a particular cultural frame of reference; essentially that of a developed, industrialised society that places high value on intellectual and academic skills. There are good reasons for adopting that perspective; and equally good reasons for not doing so. In a series of three posts, I plan to examine two concepts that have prompted vigorous debate – modularity and intelligence – and to look at Geary’s cultural frame of reference.

Modularity

The concept of modularity – that particular parts of the brain are dedicated to particular functions – is fundamental to Geary’s model. Physicians have known for centuries that some parts of the brain specialise in processing specific information. Some stroke patients, for example, have been reported as being able to write but no longer able to read (alexia without agraphia), to be able to read symbols but not words (pure alexia), or to be unable to recall some types of words (anomia). Language isn’t the only ability involving specialised modules; different areas of the brain are dedicated to processing the visual features of, for example, faces, places and tools.

One question that has long perplexed researchers is how modular the brain actually is. Some functions clearly occur in particular locations and in those locations only; others appear to be more distributed. In the early 1980s, Jerry Fodor tackled this conundrum head-on in his book The modularity of mind. What he concluded is that at the perceptual and linguistic level functions are largely modular, i.e. specialised and stable, but at the higher levels of association and ‘thought’ they are distributed and unstable.  This makes sense; you’d want stability in what you perceive, but flexibility in what you do with those perceptions.

Geary refers to the ‘vigorous debate’ (p.12) between those who lean towards specialised brain functions being evolved and modular, and those who see specialised brain functions as emerging from interactions between lower-level stable mechanisms. Although he acknowledges the importance of interaction and emergence during development (pp. 14, 18), you wouldn’t know that from Fig. 1.2, showing his ‘evolved cognitive modules’.

At first glance, Geary’s distinction between stable biologically primary functions and flexible biologically secondary functions appears to be the same as Fodor’s stable/unstable distinction. But it isn’t.  Fodor’s modules are low-level perceptual ones; some of Geary’s modules in Fig. 1.2 (e.g. theory of mind, language, non-verbal behaviour) engage frontal brain areas used for the flexible processing of higher-level information.

Novices and experts; novelty and automation

Later in his chapter, Geary refers to research involving these frontal brain areas. Two findings are particularly relevant to his modular theory. The first is that frontal areas of the brain are initially engaged whilst people are learning a complex task, but as the task becomes increasingly automated, frontal area involvement decreases (p.59). Second, research comparing experts’ and novices’ perceptions of physical phenomena (p.69) showed that if there is a conflict between what people see and their current schemas, frontal areas of their brains are engaged to resolve the conflict. So, when physics novices are shown a scientifically accurate explanation, or when physics experts are shown a ‘folk’ explanation, both groups experience conflict.

In other words, what’s processed quickly, automatically and pre-consciously is familiar, overlearned information. If that familiar and overlearned information consists of incomplete and partially understood bits and pieces that people have picked up as they’ve gone along, errors in their ‘folk’ psychology, biology and physics concepts (p.13) are unsurprising. But it doesn’t follow that there must be dedicated modules in the brain that have evolved to produce those concepts.

If the familiar overlearned information is, in contrast, extensive and scientifically accurate, the ‘folk’ concepts get overridden and the scientific concepts become the ones that are accessed quickly, automatically and pre-consciously. In other words, the line between biologically primary and secondary knowledge and abilities might not be as clear as Geary’s model implies.  Here’s an example; the ability to draw what you see.

The eye of the beholder

Most of us are able to recognise, immediately and without error, the face of an old friend, the front of our own house, or the family car. However, if asked to draw an accurate representation of those items, even if they were in front of us at the time, most of us would struggle. That’s because the processes involved in visual recognition are fast, frugal, simple and implicit; they appear to be evolved, modular systems. But there are people who can draw accurately what they see in front of them; some can do so ‘naturally’, others train themselves to do so, and still others are taught to do so via direct instruction. It looks as if the ability to draw accurately straddles Geary’s biologically primary and secondary divide. The extent to which modules are actually modular is further called into question by recent research involving the fusiform face area (FFA).

Fusiform face area

The FFA is one of the visual processing areas of the brain. It specialises in processing information about faces. What wasn’t initially clear to researchers was whether it processed information about faces only, or whether faces were simply a special case of the type of information it processes. There was considerable debate about this until a series of experiments found that various experts used their FFA for differentiating subtle visual differences within classes of items as diverse as birds, cars, chess configurations, x-ray images, Pokémon, and objects named ‘greebles’ invented by researchers.

What these experiments tell us is that an area of the brain apparently dedicated to processing information about faces is also used to process information about modern artifacts with features that require fine-grained differentiation in order to tell them apart. They also tell us that modules in the brain don’t seem to draw a clear line between biologically primary information such as faces (no explicit instruction required), and biologically secondary information such as x-ray images or fictitious creatures (where initial explicit instruction is required).

What the experiments don’t tell us is whether the FFA evolved to process information about faces and is being co-opted to process other visually similar information, or whether it evolved to process fine-grained visual distinctions, of which faces happen to be the most frequent example most people encounter.

We know that brain mechanisms have evolved and that has resulted in some modular processing. What isn’t yet clear is exactly how modular the modules are, or whether there is actually a clear divide between biologically primary and biologically secondary abilities. Another component of Geary’s model about which there has been considerable debate is intelligence – the subject of the next post.

Incidentally, it would be interesting to know how Sweller developed his summary because it doesn’t quite map on to a concept of modularity in which the cognitive skills are anything but generic.

References

Fodor, J (1983).  The modularity of mind.  MIT Press.

Geary, D (2007).  Educating the evolved mind: Conceptual foundations for an evolutionary educational psychology, in Educating the evolved mind: Conceptual foundations for an evolutionary educational psychology, JS Carlson & JR Levin (Eds). Information Age Publishing.

Acknowledgements

I thought the image was from @greg_ashman’s Twitter timeline but can’t now find it.  Happy to acknowledge correctly if notified.

synthetic phonics, dyslexia and natural learning

Too intense a focus on the virtues of synthetic phonics (SP) can, it seems, result in related issues getting a bit blurred. I discovered that some whole language supporters do appear to have been ideologically motivated but that the whole language approach didn’t originate in ideology. And as far as I can tell we don’t know if SP can reduce adult functional illiteracy rates. But I wouldn’t have known either of those things from the way SP is framed by its supporters. SP proponents also make claims about how the brain is involved in reading. In this post I’ll look at two of them; dyslexia and natural learning.

Dyslexia

Dyslexia started life as a descriptive label for the reading difficulties adults can develop due to brain damage caused by a stroke or head injury. Some children were observed to have similar reading difficulties despite otherwise normal development. The adults’ dyslexia was acquired (they’d previously been able to read) but the children’s dyslexia was developmental (they’d never learned to read). The most obvious conclusion was that the children also had brain damage – but in the early 20th century when the research started in earnest there was no easy way to determine that.

Medically, developmental dyslexia is still only a descriptive label meaning ‘reading difficulties’ (causes unknown, might/might not be biological, might vary from child to child). However, dyslexia is now also used to denote a supposed medical condition that causes reading difficulties. This new usage is something that Diane McGuinness complains about in Why Children Can’t Read and What We Can Do About It.

I completely agree with McGuinness that this use isn’t justified and has led to confusion and unintended and unwanted outcomes. But I think she muddies the water further by peppering her discussion of dyslexia (pp. 132-140) with debatable assertions such as:

“We call complex human traits ‘talents’”.

“Normal variation is on a continuum but people working from a medical or clinical model tend to think in dichotomies…”.

“Reading is definitely not a property of the human brain”.

“If reading is a biological property of the brain, transmitted genetically, then this must have occurred by Lamarckian evolution.”

Why debatable? Because complex human traits are not necessarily ‘talents’; clinicians tend to be more aware of normal variation than most people; reading must be a ‘property of the brain’ if we need a brain to read; and the research McGuinness refers to didn’t claim that ‘reading’ was transmitted genetically.

I can understand why McGuinness might be trying to move away from the idea that reading difficulties are caused by a biological impairment that we can’t fix. After all, the research suggests SP can improve the poor phonological awareness that’s strongly associated with reading difficulties. I get the distinct impression, however, that she’s uneasy with the whole idea of reading difficulties having biological causes. She concedes that phonological processing might be inherited (p.140) but then denies that a weakness in discriminating phonemes could be due to organic brain damage. She’s right that brain scans had revealed no structural brain differences between dyslexics and good readers. And in scans that show functional variations, the ability to read might be a cause, rather than an effect.

But as McGuinness herself points out, reading is a complex skill involving many brain areas, and biological mechanisms tend to vary between individuals. In a complex biological process there’s a lot of scope for variation. Poor phonological awareness might be a significant factor, but it might not be the only factor. A child with poor phonological awareness plus visual processing impairments plus limited working memory capacity plus slow processing speed – all factors known to be associated with reading difficulties – would be unlikely to find those difficulties eliminated by SP alone. The risk in conceding that reading difficulties might have biological origins is that using teaching methods to remediate them might then be called into question – just what McGuinness doesn’t want to happen, and for good reason.

Natural and unnatural abilities

McGuinness’s view of the role of biology in reading seems to be derived from her ideas about the origin of skills. She says;

“It is the natural abilities of people that are transmitted genetically, not unnatural abilities that depend upon instruction and involve the integration of many subskills”. (p.140, emphasis McGuinness)

This is a distinction often made by SP proponents. I’ve been told that children don’t need to be taught to walk or talk because these abilities are natural and so develop instinctively and effortlessly. Written language, in contrast, is a recent man-made invention; there hasn’t been time to evolve a natural mechanism for reading, so we need to be taught how to do it and have to work hard to master it. Steven Pinker, who wrote the foreword to Why Children Can’t Read, seems to agree. He says “More than a century ago, Charles Darwin got it right: language is a human instinct, but written language is not” (p.ix).

Although that’s a plausible model, what Pinker and McGuinness fail to mention is that it’s also a controversial one. The part played by nature and nurture in the development of language (and other abilities) has been the subject of heated debate for decades. The reason for the debate is that the relevant research findings can be interpreted in different ways. McGuinness is entitled to her interpretation but it’s disingenuous in a book aimed at a general readership not to tell readers that other researchers would disagree.

Research evidence suggests that the natural/unnatural skills model has got it wrong. The same natural/unnatural distinction was made recently in the case of part of the brain called the fusiform gyrus. In the fusiform gyrus, visual information about objects is categorised. Different types of objects, such as faces, places and small items like tools, have their own dedicated locations. Because those types of objects are naturally occurring, researchers initially thought their dedicated locations might be hard-wired.

But there’s also a word recognition area. And in experts, the faces area is also used for cars, chess positions, and specially invented items called greebles. To become an expert in any of those things you require some instruction – you’d need to learn the rules of chess or the names of cars or greebles. But your visual system can still learn to accurately recognise, discriminate between and categorise many thousands of items like faces, places, tools, cars, chess positions and greebles simply through hours and hours of visual exposure.

Practice makes perfect

What claimants for ‘natural’ skills also tend to overlook is how much rehearsal goes into them. Most parents don’t actively teach children to talk, but babies hear and rehearse speech for many months before they can say recognisable words. Most parents don’t teach toddlers to walk, but it takes young children years to become fully stable on their feet despite hours of daily practice.

There’s no evidence that as far as the brain is concerned there’s any difference between ‘natural’ and ‘unnatural’ knowledge and skills. How much instruction and practice knowledge or skills require will depend on their transparency and complexity. Walking and bike-riding are pretty transparent; you can see what’s involved by watching other people. But they take a while to learn because of the complexity of the motor-co-ordination and balance involved. Speech and reading are less transparent and more complex than walking and bike-riding, so take much longer to master. But some children require intensive instruction in order to learn to speak, and many children learn to read with minimal input from adults. The natural/unnatural distinction is a false one and it’s as unhelpful as assuming that reading difficulties are caused by ‘dyslexia’.

Multiple causes

What underpins SP proponents’ reluctance to admit biological factors as causes for reading difficulties is, I suspect, an error often made when assessing cause and effect. It’s an easy one to make, but one that people advocating changes to public policy need to be aware of.

Let’s say for the sake of argument that we know, for sure, that reading difficulties have three major causes, A, B and C. The one that occurs most often is A. We can confidently predict that children showing A will have reading difficulties. What we can’t say, without further investigation, is whether a particular child’s reading difficulties are due to A. Or if A is involved, that it’s the only cause.
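A toy simulation makes the asymmetry concrete. Everything here is invented for illustration – the three causes and their prevalences are hypothetical – but it shows why being able to predict difficulties from A is not the same as being able to attribute a given child’s difficulties to A:

```python
import random

random.seed(0)

# Hypothetical prevalences of three independent causes of reading
# difficulty; A is the most common. All figures are invented.
P = {"A": 0.10, "B": 0.04, "C": 0.02}

population = []
for _ in range(100_000):
    causes = {c for c, p in P.items() if random.random() < p}
    population.append(causes)

# A child struggles if any cause is present.
struggling = [c for c in population if c]

only_a = sum(1 for c in struggling if c == {"A"})
a_plus_other = sum(1 for c in struggling if "A" in c and len(c) > 1)
no_a = sum(1 for c in struggling if "A" not in c)

total = len(struggling)
print(f"struggling children: {total}")
print(f"  A alone:        {only_a / total:.0%}")
print(f"  A plus another: {a_plus_other / total:.0%}")
print(f"  no A at all:    {no_a / total:.0%}")
```

Among the simulated strugglers, a substantial minority have no A at all, and some who do have A also have B or C – which is why individual assessment matters.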

We know that poor phonological awareness is frequently associated with reading difficulties. Because SP trains children to be aware of phonological features in speech, and because that training improves word reading and spelling, it’s a safe bet that poor phonological awareness is also a cause of reading difficulties. But because reading is a complex skill, there are many possible causes for reading difficulties. We can’t assume that poor phonological awareness is the only cause, or that it’s a cause in all cases.

The evidence that SP improves children’s decoding ability is persuasive. However, the evidence also suggests that 12–15% of children will still struggle to learn to decode using SP, and that around 15% of children will struggle with reading comprehension. Having a method of reading instruction that works for most children is great, but education should benefit all children, and since the minority of children who struggle are the ones people keep complaining about, we need to pay attention to what causes reading difficulties for those children – as individuals. In education, one size might fit most, but it doesn’t fit all.

Reference

McGuinness, D. (1998). Why Children Can’t Read and What We Can Do About It. Penguin.

truth and knowledge

A couple of days ago I became embroiled in a long-running Twitter debate about the nature of truth and knowledge, during which at least one person fell asleep. @EdSacredProfane has asked me where I ‘sit’ on truth. So, for the record, here’s what I think about truth and knowledge.

1. I think it’s safe to assume that reality and truth are out there. Even if they’re not out there and we’re all experiencing a collective hallucination we might as well assume that reality is real and that truth is true because if we don’t, our experience – whether real or imagined – is likely to get pretty unpleasant.

2. I’m comfortable with the definition of knowledge as justified true belief. But that’s a definition of an abstract concept. The extent to which people can actually justify or demonstrate the truth of their beliefs (collectively or individually) varies considerably.

3. The reason for this is the way perception works. All incoming sensory information is interpreted by our brains, and brains aren’t entirely reliable when it comes to interpreting sensory information. So we’ve devised methods of cross-checking what our senses tell us to make sure we haven’t got it disastrously wrong. One approach is known as the scientific method.

4. Science works on the basis of probability. We can never say for sure that A or B exists or that C definitely causes D. But for the purposes of getting on with our lives if there’s enough evidence suggesting that A or B exists and that C causes D, we assume those things to be true and justified to varying extents.

5. Even though our perception is a bit flaky and we can’t be 100% sure of anything, it doesn’t follow that reality is flaky or not 100% real. Just that our knowledge about it isn’t 100% reliable. The more evidence we’ve gathered, the more consistent and predictable reality looks. Unfortunately it’s also complicated, which, coupled with our flaky and uncertain perceptions, makes life challenging.

seven myths about education: deep structure

deep structure and understanding

Extracting information from data is crucially important for learning; if we can’t spot patterns that enable us to identify changes and make connections and predictions, no amount of data will enable us to learn anything. Similarly, spotting patterns within and between facts – patterns that enable us to identify changes and connections and make predictions – helps us understand how the world works. Understanding is a concept that crops up a lot in information theory and education. Several of the proposed hierarchies of knowledge have included the concept of understanding – almost invariably at or above the knowledge level of the DIKW pyramid. Understanding is often equated with what’s referred to as the deep structure of knowledge. In this post I want to look at deep structure in two contexts; when it involves a small number of facts, and when it involves a very large number, as in an entire knowledge domain.

When I discussed the DIKW pyramid, I referred to information being extracted from a ‘lower’ level of abstraction to form a ‘higher’ one. Now I’m talking about ‘deep’ structure. What’s the difference, if any? The concept of deep structure comes from the field of linguistics. The idea is that you can say the same thing in different ways; the surface features of what you say might be different, but the deep structure of the statements could still be the same. So the sentences ‘the cat is on the mat’ and ‘the mat is under the cat’ have different surface features but the same deep structure. Similarly, ‘the dog is on the box’ and ‘the box is under the dog’ share the same deep structure. From an information-processing perspective the sentences about the dog and the cat share the same underlying schema.
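One way to make the idea concrete is to represent it explicitly. Here’s a minimal sketch – a toy parser that handles only these two sentence patterns, not a general model of language – in which surface-different sentences reduce to the same relational representation:

```python
# Toy illustration: different surface forms, same deep structure.
# parse() is hand-written for exactly these two patterns.

def parse(sentence: str) -> tuple:
    words = sentence.lower().rstrip(".").split()
    # "the X is on the Y"    ->  ('on', X, Y)
    if words[2:5] == ["is", "on", "the"]:
        return ("on", words[1], words[5])
    # "the Y is under the X" ->  ('on', X, Y)
    if words[2:5] == ["is", "under", "the"]:
        return ("on", words[5], words[1])
    raise ValueError("pattern not recognised")

# Different surface features, identical underlying schema:
assert parse("The cat is on the mat") == parse("The mat is under the cat")
assert parse("The dog is on the box") == parse("The box is under the dog")
```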

In the DIKW knowledge hierarchy, extracted information is at a ‘higher’ level, not a ‘deeper’ one. The two different terminologies are used because the concepts of ‘higher’ level extraction of information and ‘deep’ structure have different origins, but essentially they are the same thing. All you need to remember is that in terms of information-processing ‘high’ and ‘deep’ both refer to the same vertical dimension – which term you use depends on your perspective. Higher-level abstractions, deep structure and schemata refer broadly to the same thing.

deep structure and small numbers of facts

Daniel Willingham devotes an entire chapter of his book Why don’t students like school? to the deep structure of knowledge when addressing students’ difficulty in understanding abstract ideas. Willingham describes mathematical problems presented in verbal form that have different surface features but the same deep structure – in his opening example they involve the calculation of the area of a table top and of a soccer pitch (Willingham, p.87). What he is referring to is clearly the concept of a schema, though he doesn’t call it that.

Willingham recognises that students often struggle with deep structure concepts and recommends providing them with many examples and using analogies they’re familiar with. These strategies would certainly help, but as we’ve seen previously, because the surface features of facts aren’t consistent in terms of sensory data, students’ brains are not going to spot patterns automatically and pre-consciously in the way they do with consistent low-level data and information. To the human brain, a cat on a mat is not the same as a dog on a box. And a couple trying to figure out whether a dining table would be big enough involves very different sensory data to that involved in a groundsman working out how much turf will be needed for a new football pitch.

Willingham’s problems involve several levels of abstraction. Note that the levels of abstraction only provide an overall framework; they’re not set in stone. I’ve had to split the information level into two to illustrate how information needs to be extracted at several successive levels before students can even begin to calculate the area of the table or the football pitch. The levels of abstraction are;

• data – the squiggles that make up letters and the sounds that make up speech
• first-order information – letters and words (chunked)
• second-order information – what the couple is trying to do and what the groundsman is trying to do (not chunked)
• knowledge – the deep structure/schema underlying each problem.

To anyone familiar with calculating area, the problems are simple ones; to anyone unfamiliar with the schema involved, they impose a high cognitive load because the brain is trying to juggle information about couples, tables, groundsmen and football pitches and can’t see the forest for the trees. Most brains would require quite a few examples before they had enough information to be able to spot the two patterns, so it’s not surprising that students who haven’t had much practical experience of buying tables, fitting carpets, painting walls or laying turf take a while to cotton on.
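Stripped of their surface features, both of Willingham’s problems reduce to the same schema. A minimal sketch (the measurements here are invented for illustration):

```python
# The deep structure both problems share: area = length x width.

def area(length: float, width: float) -> float:
    """The schema underlying both surface stories."""
    return length * width

# Surface story 1: is a 1.5 m x 0.9 m table top big enough?
table_top = area(1.5, 0.9)   # 1.35 square metres

# Surface story 2: how much turf for a 105 m x 68 m pitch?
pitch = area(105, 68)        # 7140 square metres

print(table_top, pitch)
```

The couples, tables, groundsmen and pitches are all surface noise; the one-line function is the pattern students have to spot.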

visual vs verbal representations

What might help students further is making explicit the deep structure of groups of facts with the help of visual representations. Visual representations have one huge advantage over verbal representations. Verbal representations, by definition, are processed sequentially – you can only say, hear or read one word at a time. Most people can process verbal information at the same rate at which they hear it or read it, so most students will be able to follow what a teacher is saying or what they are reading, even if it takes a while to figure out what the teacher or the book is getting at. However, if you can’t process verbal information quickly enough, can’t recall earlier sentences whilst processing the current one, miss a word, or don’t understand a crucial word or concept, it will be impossible to make sense of the whole thing. In visual representations, you can see all the key units of information at a glance, most of the information can be processed in parallel and the underlying schema is more obvious.

The concept of calculating area lends itself very well to visual representation; it is a geometry problem after all. Getting the students to draw a diagram of each problem would not only focus their attention on the deep structure rather than its surface features, it would also demonstrate clearly that problems with different surface features can have the same underlying deep structure.

It might not be so easy to make visual representations of the deep structure of other groups of facts, but it’s an approach worth trying because it makes explicit the deep structure of the relationship between the facts. In Seven Myths about Education, one of Daisy’s examples of a fact is the date of the battle of Waterloo. Battles are an excellent example of deep structure/schemata in action. There is a large but limited number of ways two opposing forces can position themselves in battle, whoever they are and whenever and wherever they are fighting, which is why ancient battles are studied by modern military strategists. The configurations of forces and what subsequent configurations are available to them are very similar to the configurations of pieces and next possible moves in chess. Of course chess began as a game of military strategy – as a visual representation of the deep structure of battles.

Deep structure/underlying schemata are a key factor in other domains too. Different atoms and different molecules can share the same deep structure in their bonding and reactions and chemists have developed formal notations for representing that visually; the deep structure of anatomy and physiology can be the same for many different animals – biologists rely heavily on diagrams to convey deep structure information. Historical events and the plots of plays can follow similar patterns even if the events occurred or the plays were written thousands of years apart. I don’t know how often history or English teachers use visual representations to illustrate the deep structure of concepts or groups of facts, but it might help students’ understanding.

deep structure of knowledge domains

It’s not just single facts or small groups of facts that have a deep structure or underlying schema. Entire knowledge domains have a deep structure too, although not necessarily in the form of a single schema; many connected schemata might be involved. How they are connected will depend on how experts arrange their knowledge or how much is known about a particular field.

Making students aware of the overall structure of a knowledge domain – especially if that’s via a visual representation so they can see the whole thing at once – could go a long way to improving their understanding of whatever they happen to be studying at any given time. It’s like the difference between Google Street View and Google Maps. Google Street View is invaluable if you’re going somewhere you’ve never been before and you want to see what it looks like. But Google Maps tells you where you are in relation to where you want to be – essential if you want to know how to get there. Having a mental map of an entire knowledge domain shows you how a particular fact or group of facts fits in to the big picture, and also tells you how much or how little you know.

Daisy’s model of cognition

Daisy doesn’t go into detail about deep structure or schemata. She touches on these concepts only a few times; once in reference to forming a chronological schema of historical events, then when referring to Joe Kirby’s double-helix metaphor for knowledge and skills and again when discussing curriculum design.

I don’t know if Daisy emphasises facts but downplays deep structure and schemata to highlight the point that the educational orthodoxy does essentially the opposite, or whether she doesn’t appreciate the importance of deep structure and schemata compared to surface features. I suspect it’s the latter. Daisy doesn’t provide any evidence to support her suggestion that simply memorising facts reduces cognitive load when she says;

“So when we commit facts to long-term memory, they actually become part of our thinking apparatus and have the ability to expand one of the biggest limitations of human cognition”(p.20).

The examples she refers to immediately prior to this assertion are multiplication facts that meet the criteria for chunking – they are simple and highly consistent and if they are chunked they’d be treated as one item by working memory. Whether facts like the dates of historical events meet the criteria for chunking or whether they occupy less space in working memory when memorised is debatable.

What’s more likely is that if more complex and less consistent facts are committed to memory, they are accessed more quickly and reliably than those that haven’t been memorised. Research evidence suggests that neural connections that are activated frequently become stronger and are accessed faster. Because information is carried in networks of neural connections, the more frequently we access facts or groups of facts, the faster and more reliably we will be able to access them. That’s a good thing. It doesn’t follow that those facts will occupy less space in working memory.

It certainly isn’t the case that simply committing to memory hundreds or thousands of facts will enable students to form a schema, or if they do, that it will be the schema their teacher would like them to form. Teachers might need to be explicit about the schemata that link facts. Since hundreds or thousands of facts tend to be linked by several different schemata – you can arrange the same facts in different ways – being explicit about the different ways they can be linked might be crucial to students’ understanding.

Essentially, deep structure schemata play an important role in three ways;

First, students’ pre-existing schemata will affect their understanding of new information – they will interpret it in the light of the way they currently organise their knowledge. Teachers need to know about common misunderstandings as well as what they want students to understand.

Secondly, being able to identify the schema underlying one fact or small group of facts is the starting point for spotting similarities and differences between several groups of facts.

Thirdly, having a bird’s-eye view of the schemata involved in an entire knowledge domain increases students’ chances of understanding where a particular fact fits in to the grand scheme of things – and their awareness of what they don’t know.

Having a bird’s-eye view of the curriculum can help too, because it can show how different subject areas are linked. Subject areas and the curriculum are the subjects of the next post.

seven myths about education: facts and schemata

Knowledge occupies the bottom level of Bloom’s taxonomy of educational objectives. In the 1950s, Bloom and his colleagues would have known a good deal about the strategies teachers use to help students to acquire knowledge. What they couldn’t have known is how students formed their knowledge; how they extracted information from data and knowledge from information. At the time cognitive psychologists knew a fair amount about learning but had only a hazy idea about how it all fitted together. The DIKW pyramid I referred to in the previous post explains how the bottom layer of Bloom’s taxonomy works – how students extract information and knowledge during learning. Anderson’s simple theory of cognition explains how people extract low-level information. More recent research at the knowledge and wisdom levels is beginning to shed light on Bloom’s higher-level skills, why people organise the same body of knowledge in different ways and why they misunderstand and make mistakes.

Seven Myths about Education addresses the knowledge level of Bloom’s taxonomy. Daisy Christodoulou presents a model of cognition that she feels puts the higher-level skills in Bloom’s taxonomy firmly into context. Her model also forms the basis for a pedagogical approach and a structure for a curriculum, which I’ll discuss in another post. Facts are a core feature of Daisy’s model. I’ve mentioned previously that many disciplines find facts problematic because facts, by definition, have to be valid (true), and it’s often difficult to determine their validity. In this post I want to focus instead on the information processing entailed in learning facts.

a simple theory of cognition

Having explained the concept of chunking and the relationship between working and long-term memory, Daisy introduces Anderson’s paper;

“So when we commit facts to long-term memory, they actually become part of our thinking apparatus and have the ability to expand one of the biggest limitations of human cognition. Anderson puts it thus:

‘All that there is to intelligence is the simple accrual and tuning of many small units of knowledge that in total produce complex cognition. The whole is no more than the sum of its parts, but it has a lot of parts.’”

She then says “a lot is no exaggeration. Long-term memory is capable of storing thousands of facts, and when we have memorised thousands of facts on a specific topic, these facts together form what is known as a schema” (p. 20).

facts

This was one of the points where I began to lose track of Daisy’s argument. I think she’s saying this:

Anderson shows that low-level data can be chunked into a ‘unit of knowledge’ that is then treated as one item by WM – in effect increasing the capacity of WM. In the same way, thousands of memorised facts can be chunked into a more complex unit (a schema) that is then treated as one item by WM – this essentially bypasses the limitations of WM.

I think Daisy assumes that the principle Anderson found pertaining to low-level ‘units of knowledge’ applies to all units of knowledge at whatever level of abstraction. It doesn’t. Before considering why it doesn’t, it’s worth noting a problem with the use of the word ‘facts’ when describing data. Some researchers have equated data with ‘raw facts’. The difficulty with defining data as ‘facts’ is that by definition a fact has to be valid (true) and not all data is valid, as the GIGO (garbage-in-garbage-out) principle that bedevils computer data processing and the human brain’s often flaky perception of sensory input demonstrate. In addition, ‘facts’ are more complex than raw (unprocessed) data or raw (unprocessed) sensory input.

It’s clear from Daisy’s examples of facts that she isn’t referring to raw data or raw sensory input. Her examples include the date of the battle of Waterloo, key facts about numerous historical events and ‘all of the twelve times tables’. She makes it clear in the rest of the book that in order to understand such facts, students need prior knowledge. In terms of the DIKW hierarchy, Daisy’s ‘facts’ are at a higher level to Anderson’s ‘units of knowledge’ and are unlikely to be processed automatically and pre-consciously in the same way as Anderson’s units. To understand why, we need to take another look at Anderson’s units of knowledge and why chunking happens.

chunking revisited

Data that can be chunked easily have two key characteristics; they involve small amounts of information and the patterns within them are highly consistent. As I mentioned in the previous post, one of Anderson’s examples of chunking is the visual features of upper case H. As far as the brain is concerned, the two parallel vertical lines and linking horizontal line that make up the letter H don’t involve much information. Also, although fonts and handwriting vary, the core features of all the Hs the brain perceives are highly consistent. So the brain soon starts perceiving all Hs as the same thing and chunks up the core features into a single unit – the letter H. If H could also be written Ĥ and Ħ in English, it would take a bit longer for the brain to chunk the three different configurations of lines and to learn the association between them, but not much longer, since the three variants involve little information and are still highly consistent.

understanding facts

But the letter H isn’t a fact; it’s a symbol. So are + and the numerals 1 and 2. ‘1+2’ isn’t a fact in the sense that Daisy uses the term; it’s a series of symbols. ‘1+2=3’ could be considered a fact because it consists of symbols representing two entities and the relationship between them. If you know what the symbols refer to, you can understand it. It could probably be chunked because it contains a small amount of information and has consistent visual features. Each multiplication fact in multiplication tables could probably be chunked, too, since they meet the same criteria. But that’s not true for all the facts that Daisy refers to, because they are more complex and less consistent.

‘The cat is on the mat’ is a fact, but in order to understand it, you need some prior knowledge about cats, mats and what ‘on’ means. These would be treated by working memory as different items. Most English-speaking 5 year-olds would understand the ‘cat is on the mat’ fact, but because there are different sorts of cats, different sorts of mats and different ways in which the cat could be on the mat, each child could have a different mental image of the cat on the mat. A particular child might conjure up a different mental image each time he or she encountered the fact, meaning that different sensory data were involved each time, the mental representations of the fact would be low in consistency, and the fact’s component parts couldn’t be chunked into a single unit in the same way as lower-level more consistent representations. Consequently the fact is less likely to be treated as one item in working memory.

Similarly, in order to understand a fact like ‘the battle of Waterloo was in 1815’ you’d need to know what a battle is, where Waterloo is (or at least that it’s a place), what 1815 means and how ‘of’ links a battle and a place name. If you’re learning about the Napoleonic wars, your perception of the battle is likely to keep changing and the components of the facts would have low consistency meaning that it couldn’t be chunked in the way Anderson describes.

The same problem involving inconsistency would prevent two or more facts being chunked into a single unit. But clearly people do mentally link facts and the components of facts. They do it using a schema, but not quite in the way Daisy describes.

schemata

Before discussing how people use schemata (schemas), a comment on the biological structures that enable us to form them. I mentioned in an earlier post that the neurons in the brain form complex networks a bit like the veins in a leaf. Physical connections are formed between neighbouring neurons when the neurons are activated simultaneously by incoming data. If the same or very similar data are encountered repeatedly, the same neurons are activated repeatedly, connections between them are strengthened and eventually networks of neurons are formed that can carry a vast amount of information in their patterns of connections. The patterns of connections between the neurons represent the individual’s perception of the patterns in the data.
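A toy sketch of that strengthening process – the learning rate and the ‘facts’ are arbitrary illustrative choices, not a claim about real neural parameters:

```python
# 'Fire together, wire together': repeated co-activation strengthens
# a connection weight. All values are invented for illustration.

weights = {}  # (unit_a, unit_b) -> connection strength

def co_activate(a: str, b: str, rate: float = 0.1):
    """Strengthen the connection between two co-active units."""
    key = tuple(sorted((a, b)))
    weights[key] = weights.get(key, 0.0) + rate

# 'cat' and 'mat' co-occur often; 'cat' and 'turbine' almost never.
for _ in range(50):
    co_activate("cat", "mat")
co_activate("cat", "turbine")

print(weights[("cat", "mat")])      # 5.0 -> strong, springs to mind quickly
print(weights[("cat", "turbine")])  # 0.1 -> weak, slow and hazy to recall
```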

So if I see a cat on a mat, or read a sentence about a cat on a mat, or imagine a cat on a mat, my networks of neurons carrying information about cats and mats will be activated. Facts and concepts about cats, mats and things related to them will readily spring to mind. But I won’t have access to all of those facts and concepts at once. That would completely overload my working memory. Instead, what I recall is a stream of facts and concepts about cats and mats that takes time to access. It’s only a short time, but it doesn’t happen all at once. Also, some facts and concepts will be activated immediately and strongly and others will take longer and might be a bit hazy. In essence, a schema is a network of related facts and concepts, not a chunked ‘unit of knowledge’.

Daisy says “when we have memorised thousands of facts on a specific topic, these facts together form what is known as a schema” (p. 20). It doesn’t work quite like that, for several reasons.

the structure of a schema

A schema is what it sounds like – a schematic plan or framework. It doesn’t consist of facts or concepts, but it’s a representation of how someone mentally arranges facts or concepts. In the same way the floor-plan of a building doesn’t consist of actual walls, doors and windows, but it does show you where those things are in the building in relation to each other. The importance of this apparently pedantic point will become clear when I discuss deep structure.

implicit and explicit schemata

Schemata can be implicit – the brain organises facts and concepts in a particular way but we’re not aware of what it is – or explicit – we actively organise facts and concepts in a particular way and we’re aware of how they are organised.

the size of a schema

Schemata can vary in size and complexity. The configuration of the three lines that make up the letter H is a schema; so is the way a doctor organises his or her knowledge about the human circulatory system. A schema doesn’t have to represent all the facts or concepts it links together. If it did, a schema involving thousands of facts would be so complex it wouldn’t be much help in showing how the facts were related. And in order to encompass all the different relationships between thousands of facts, a single schema for them would need to be very simple.

For example, a simple schema for chemistry would be that different chemicals are formed from different configurations of the sub-atomic ‘particles’ that make up atoms and configurations of atoms that form molecules. Thousands of facts can be fitted into that schema. In order to have a good understanding of chemistry, students would need to know about schemata other than just that simple one, and would need to know thousands of facts about chemistry before they would qualify as experts, but the simple schema plus a few examples would give them a basic understanding of what chemistry was about.
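To make the floor-plan point concrete, here’s a minimal sketch of a schema stored as relations between concepts rather than as the facts themselves; the individual facts slot into the framework separately (the entries are simplified illustrations, not real chemistry notation):

```python
# A schema as a framework of relations, not a container of facts --
# like a floor plan, it shows the arrangement, not the walls.

chemistry_schema = [
    ("sub-atomic particles", "configure into", "atoms"),
    ("atoms", "configure into", "molecules"),
    ("configuration", "determines", "chemical behaviour"),
]

# Thousands of individual facts can hang off the same framework:
facts = [
    ("a water molecule", "is formed from", "2 hydrogen atoms + 1 oxygen atom"),
    ("common salt", "is formed from", "sodium + chlorine"),
]

# The schema tells you where a new fact fits; it doesn't contain the fact.
for subject, relation, obj in facts:
    print(f"{subject} {relation} {obj}  (fits the atoms->molecules slot)")
```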

experts’ schemata

Research into expertise (e.g. Chi et al, 1981) shows that experts don’t usually have one single schema for all the facts they know, but instead use different schemata for different aspects of their body of knowledge. Sometimes those schemata are explicitly linked, but sometimes they’re not. Sometimes they can’t be linked because no one knows how the linkage works yet.

chess experts

Daisy refers to research showing that expert chess players memorise thousands of different configurations of chess pieces (p.78). This is classic chunking; although in different chess sets specific pieces vary in appearance, their core visual features and the moves they can make are highly consistent, so frequently-encountered configurations of pieces are eventually treated by the brain as single units – the brain chunks the positions of the chess pieces in essentially the same way as it chunks letters into words.

De Groot’s work showed that chess experts initially identified the configurations of pieces that were possible as a next move, and then went through a process of eliminating the possibilities. The particular configuration of pieces on the board would activate several associated schemata involving possible next and subsequent moves.

So, each of the different configurations of chess pieces that are encountered so frequently they are chunked, has an underlying (simple) schema. Expert chess players then access more complex schemata for next and subsequent possible moves. Even if they have an underlying schema for chess as a whole, it doesn’t follow that they treat chess as a single unit or that they recall all possible configurations at once. Most people can reliably recognise thousands of faces and thousands of words and have schemata for organising them, but when thinking about faces or words, they don’t recall all faces or all words simultaneously. That would rapidly overload working memory.
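A toy sketch of that two-stage process – recognition of a chunked configuration is a fast lookup, while choosing among the candidate moves is the slower elimination De Groot observed. The patterns and moves below are invented placeholders, not real chess theory:

```python
# Stage 1: a familiar configuration is recognised as a single chunk.
# Stage 2: the chunk activates schemata for candidate next moves.

chunks = {
    "back-rank pattern": ["Rook to e8", "make luft for the king"],
    "fianchetto pattern": ["Bishop to g7", "castle kingside"],
}

def candidate_moves(recognised_chunk: str) -> list[str]:
    # Recognition is fast (a lookup); deliberating over the returned
    # candidates is the slow, effortful part.
    return chunks.get(recognised_chunk, [])

print(candidate_moves("fianchetto pattern"))
```

Only the recognised chunk's schemata are activated; nothing requires the player to recall every stored configuration at once.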

Compared to most knowledge domains, chess is pretty simple. Chess expertise consists of memorising a large but limited number of configurations and having schemata that predict the likely outcomes from a selection of them. Because of the rules of chess, although lots of moves are possible, the possibilities are clearly defined and limited. Expertise in medicine, say, or history, is considerably more complex and less certain. A doctor might have many schemata for human biology; one for each of the skeletal, nervous, circulatory, respiratory and digestive systems, for cell metabolism, biochemistry and genetics etc. Not only is human biology more complex than chess, there’s also more uncertainty involved. Some of those schemata we’re pretty sure about, some we’re not so sure about and some we know very little about. There’s even more uncertainty involved in history. Evaluating evidence about how the human body works might be difficult, but the evidence itself is readily available in the form of human bodies. Historical evidence is often absent and likely to stay that way, which makes establishing facts and developing schemata more challenging.

To illustrate her point about schemata Daisy claims that learning a couple of key facts about 150 historical events from 3000 BC to the present will form “the fundamental chronological schema that is the basis of all historical understanding” (p.20). Chronological sequencing could certainly form a simple schema for history, but you don’t need to know about many events in order to grasp that principle – two or three would suffice. Again, although this simple schema would give students a basic understanding of what history was about, in order to have a good understanding of history, students would need to know not only thousands of facts, but to develop many schemata about how those facts were linked before they would qualify as experts. This brings us on to the deep structure of knowledge, the subject of the next post.

references
Chi, MTH, Feltovich, PJ & Glaser, R (1981). Categorisation and representation of physics problems by experts and novices. Cognitive Science, 5, 121-152.

de Groot, AD (1978). Thought and Choice in Chess. Mouton.

Edited for clarity 8/1/17.

seven myths about education: a knowledge framework

In Seven Myths about Education Daisy Christodoulou refers to Bloom’s taxonomy of educational objectives as a metaphor that leads to two false conclusions; that skills are separate from knowledge and that knowledge is ‘somehow less worthy and important’ (p.21). Bloom’s taxonomy was developed in the 1950s as a way of systematising what students need to do with their knowledge. At the time, quite a lot was known about what people did with knowledge because they usually process it actively and explicitly. Quite a lot less was known about how people acquire knowledge, because much of that process is implicit; students usually ‘just learned’ – or they didn’t. Daisy’s book focuses on how students acquire knowledge, but her framework is an implicit one; she doesn’t link up the various stages of acquiring knowledge in an explicit formal model like Bloom’s. Although I think Daisy makes some valid points about the educational orthodoxy, some features of her model lead to conclusions that are open to question. In this post, I compare the model of cognition that Daisy describes with an established framework for analysing knowledge with origins outside the education sector.

a framework for knowledge

Researchers from a variety of disciplines have proposed frameworks involving levels of abstraction in relation to how knowledge is acquired and organised. The frameworks are remarkably similar. Although there are differences of opinion about terminology and how knowledge is organised at higher levels, there’s general agreement that knowledge is processed along the lines of the catchily named DIKW pyramid – DIKW stands for data, information, knowledge and wisdom. The Wikipedia entry gives you a feel for the areas of agreement and disagreement involved. In the pyramid, each level except the data level involves the extraction of information from the level below. I’ll start at the bottom.



Data

As far as the brain is concerned, data don’t actually tell us anything except whether something is there or not. For computers, data are a series of 0s and 1s; for the brain, data are largely in the form of sensory input – light, dark and colour, sounds, tactile sensations, etc.

Information

It’s only when we spot patterns within data that the data can tell us anything. Information consists of patterns that enable us to identify changes, identify connections and make predictions. For computers, information involves detecting patterns in all the 0s and 1s. For the brain it involves detecting patterns in sensory input.

Knowledge

Knowledge has proved more difficult to define, but involves the organisation of information.

Wisdom

Although several researchers have suggested that knowledge is also organised at a meta-level, this hasn’t been extensively explored.

The processes involved in the lower levels of the hierarchy – data and information – are well-established thanks to both computer modelling and brain research. We know a fair bit about the knowledge level largely due to work on how experts and novices think, but how people organise knowledge at a meta-level isn’t so clear.

The key concept in this framework is information. Used in this context, ‘information’ tells you whether something has changed or not, whether two things are the same or not, and identifies patterns. The DIKW hierarchy is sometimes summarised as; information is information about data, knowledge is information about information, and wisdom is information about knowledge.
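A minimal sketch of that summary – each level extracts a pattern from the level below. The data stream and the 20-unit threshold are invented for illustration:

```python
# A toy pass through the DIKW levels: information is a pattern in the
# data; knowledge is an organisation of that information.

data = [12, 13, 12, 30, 31, 30, 12, 13, 30, 31]   # 'raw' readings

# information: spot a pattern -- the readings fall into two bands
low = [x for x in data if x < 20]
high = [x for x in data if x >= 20]

# knowledge: organise that information into a usable rule
knowledge = {
    "low band mean": sum(low) / len(low),
    "high band mean": sum(high) / len(high),
    "rule": "readings alternate between two stable bands",
}
print(knowledge)
```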

a simple theory of complex cognition

Daisy begins her exploration of cognitive psychology with a quote by John Anderson, from his paper ACT: A simple theory of complex cognition (p.20). Anderson’s paper tackles the mystique often attached to human intelligence when compared to that of other species. He demonstrates that it isn’t as sophisticated or as complex as it appears, but is derived from a simple underlying principle. He goes on to explain how people extract information from data, deduce production rules and make predictions about commonly occurring patterns, which suggests that the more examples of particular data the brain perceives, the more quickly and accurately it learns. He demonstrates the principle using examples from visual recognition, mathematical problem solving and prediction of word endings.

natural learning

What Anderson describes is how human beings learn naturally; the way brains automatically process any information that happens to come their way unless something interferes with that process. It’s the principle we use to recognise and categorise faces, places and things. It’s the one we use when we learn to talk, solve problems and associate cause with effect. Scattergrams provide a good example of how we extract information from data in this way.

Scatterplot of longitudinal measurements of total brain volume for males (N=475 scans, shown in dark blue) and females (N=354 scans, shown in red). From Lenroot et al (2007).

Although the image consists of a mass of dots and lines in two colours, we can see at a glance that the different coloured dots and lines form two clusters.
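What the eye does ‘at a glance’ can be sketched as a simple clustering pass – here a toy k-means loop that recovers two clusters from a handful of invented points standing in for the scatterplot’s dots:

```python
# Extracting information (two clusters) from data (a mass of points).
# The coordinates are invented stand-ins for the scatterplot.

points = [(1.0, 1.2), (1.1, 0.9), (0.9, 1.0),   # cluster 1
          (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]   # cluster 2

def nearest(p, centroids):
    return min(range(len(centroids)),
               key=lambda i: (p[0] - centroids[i][0]) ** 2
                           + (p[1] - centroids[i][1]) ** 2)

centroids = [points[0], points[-1]]              # crude initialisation
for _ in range(5):                               # a few k-means iterations
    groups = [[], []]
    for p in points:
        groups[nearest(p, centroids)].append(p)
    centroids = [
        (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g))
        for g in groups
    ]

print(centroids)  # ~ (1.0, 1.0) and (5.0, 5.0): the two clusters
```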

Note that I’m not making the same distinction that Daisy makes between ‘natural’ and ‘not natural’ learning (p.36). Anderson is describing the way the brain learns, by default, when it encounters data. Daisy, in contrast, claims that we learn things like spoken language without visible effort because language is ‘natural’, whereas we need to be taught ‘formally and explicitly’ inventions like the alphabet and numbers. That distinction, although frequently made, isn’t necessarily a valid one. It’s based on an assumption that the brain has evolved mechanisms to process some types of data e.g. to recognise faces and understand speech, but can’t have had time to evolve mechanisms to process recent inventions like writing and mathematics. This assumption about brain hardwiring is a contentious one, and the evidence about how brains learn (including the work that’s developed from Anderson’s theory) makes it look increasingly likely that it’s wrong. If formal and explicit instruction is necessary in order to learn man-made skills like writing and mathematics, that raises the question of how these skills were invented in the first place; nor would Anderson have been able to use mathematical problem-solving and word prediction as his examples of the underlying mechanism of human learning.

The theory that the brain is hardwired to process some types of information but not others, and the theory that the same mechanism processes all information, both explain how people appear to learn some things automatically and ‘naturally’. Which theory is right (or whether both are right) is still the subject of intense debate. I’ll return to the second theory later when I discuss schemata.

data, information and chunking

Chunking is a core concept in Daisy’s model of cognition. Chunking occurs when the brain links together several bits of data it encounters frequently and treats them as a single item – groups of letters that frequently co-occur are chunked into words, for example. Anderson’s paper is about the information processing involved in chunking. One of his examples is how the brain chunks the three lines that make up an upper case H. Although Anderson doesn’t make an explicit distinction between data and information, in his examples the three lines would be categorised as data in the DIKW framework, as would the curves and lines that make up numerals. When the brain figures out the production rule for the configuration of the lines in the letter H, it’s extracting information from the data – spotting a pattern. Because the pattern is highly consistent – H is almost always written using this configuration of lines – the brain can chunk the configuration into the single unit we call the letter H. The letters A and Z also consist of three lines, but have different production rules for their configurations.

Anderson shows that chunking can also occur at a slightly higher level: letters (already chunked) can be chunked again into words that are processed as single units, and numerals (already chunked) can be chunked into numbers to which production rules can be applied to solve problems. Again, chunking can take place because the patterns of letters in the words, and the patterns of numerals in Anderson’s mathematical problems, are highly consistent. Anderson calls these chunked units and production rules ‘units of knowledge’. He doesn’t use the same nomenclature as the DIKW model, but it’s clear from his model that initial chunking occurs at the data level and further chunking can occur at the information level.
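As an illustration of the general idea (not of Anderson’s actual mechanism), here is a toy chunker that repeatedly merges whichever pair of adjacent units co-occurs most often, so highly consistent configurations end up treated as single units – much as frequently co-occurring letters get chunked into words.

```python
# Toy frequency-based chunking: repeatedly merge the most frequent pair of
# adjacent units, so consistent patterns ('th', 'the') fuse into single chunks.

from collections import Counter

def chunk(units, merges=6):
    for _ in range(merges):
        pair_counts = Counter(zip(units, units[1:]))
        if not pair_counts:
            break
        (a, b), _ = pair_counts.most_common(1)[0]  # most frequent adjacent pair
        merged, i = [], 0
        while i < len(units):
            if i + 1 < len(units) and (units[i], units[i + 1]) == (a, b):
                merged.append(a + b)               # treat the pair as one unit
                i += 2
            else:
                merged.append(units[i])
                i += 1
        units = merged
    return units

stream = list("thecatandthedogandthecat")
print(chunk(stream))   # frequent letter pairs have fused into larger chunks
```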

The brain chunks data and low-level units of information automatically; evidence for this comes from research showing that babies begin to identify and categorise objects using visual features and categorise speech sounds using auditory features by about the age of 9 months (Younger, 2003). Chunking also occurs pre-consciously (e.g. Lamme, 2003); we know that people are often aware of changes to a chunked unit like a face, a landscape or a piece of music, but don’t know what has changed – someone has shaved off their moustache, a tree has been felled, the song is a cover version with different instrumentation. In addition, research into visual and auditory processing shows that sensory information initially feeds forward in the brain; a lot of processing occurs before the information reaches the location of working memory in the frontal lobes. So at this level, what we are talking about is an automatic, usually pre-conscious process that we use by default.

knowledge – the organisation of information

Anderson’s paper was published in 1996 – some twenty years ago – at about the time the DIKW framework was gaining currency, which explains why he doesn’t use the same terminology. He calls the chunked units and production rules ‘units of knowledge’ rather than ‘units of information’ because they are the fundamental low-level units from which higher-level knowledge is formed.

Although Anderson’s model of information processing for low-level units still holds true, what has puzzled researchers in the intervening couple of decades is why that process doesn’t scale up. The way people process low-level ‘units of knowledge’ is logical and rational enough to be accurately modelled using computer software, but when the brain handles large amounts of information – the concepts involved in day-to-day life, say – or tries to comprehend, apply, analyse, synthesise or evaluate that information, it goes a bit haywire. People (including experts) exhibit a number of errors and biases in their thinking. These aren’t just occasional idiosyncrasies – everybody shows the same errors and biases, to varying extents. Since complex information isn’t inherently different to simple information – there’s just more of it – researchers suspected that the errors and biases were due to the wiring of the brain. Work on judgement and decision-making, and on the biological mechanisms involved in processing information at higher levels, has demonstrated that brains are indeed wired up differently to computers. The reason is that what has shaped the evolution of the human brain isn’t the need to produce logical, rational solutions to problems, but the need to survive; and overall, quick-and-dirty information processing tends to result in higher survival rates than slow, precise processing.
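The trade-off in that last sentence can be made concrete with a toy simulation (entirely my own illustration, with made-up numbers): a fast but sloppy predator detector versus a slow but accurate one, where slow deciders are sometimes caught while still deliberating.

```python
# Toy simulation of quick-and-dirty versus slow-and-precise processing.
# All parameters are invented purely for illustration.

import random

def survival_rate(accuracy, risk_while_deciding, trials=100_000):
    survived = 0
    for _ in range(trials):
        predator = random.random() < 0.2       # predator present 20% of the time
        if not predator:
            survived += 1                      # nothing to flee from
        elif random.random() < risk_while_deciding:
            pass                               # caught while still deliberating
        elif random.random() < accuracy:
            survived += 1                      # detected the predator and fled
    return survived / trials

print(survival_rate(accuracy=0.80, risk_while_deciding=0.05))  # fast, sloppy: ~0.95
print(survival_rate(accuracy=0.99, risk_while_deciding=0.50))  # slow, precise: ~0.90
```

The precise strategy classifies better, but under time pressure the sloppy one survives more often – which is the sense in which evolution can favour quick-and-dirty processing.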

What this means is that Anderson’s information processing principle can be applied directly to low-level units of information, but might not be directly applicable to the way people process information at a higher level – the way they process facts, for example. Facts are the subject of the next post.

References
Anderson, J (1996). ACT: A simple theory of complex cognition. American Psychologist, 51, 355-365.
Lamme, VAF (2003). Why visual attention and awareness are different. Trends in Cognitive Sciences, 7, 12-18.
Lenroot, RK, Gogtay, N, Greenstein, DK, Molloy, E, Wallace, GL, Clasen, LS, Blumenthal, JD, Lerch, J, Zijdenbos, AP, Evans, AC, Thompson, PM & Giedd, JN (2007). Sexual dimorphism of brain developmental trajectories during childhood and adolescence. NeuroImage, 36, 1065-1073.
Younger, B (2003). Parsing objects into categories: Infants’ perception and use of correlated attributes. In Rakison & Oakes (eds.) Early Category and Concept Development: Making Sense of the Blooming, Buzzing Confusion. Oxford University Press.

folk categorisation and implicit assumptions

In his second response to critics, Robert [Peal] tackles the issue of the false dichotomy. He says:

…categorisation invariably simplifies. This can be seen in all walks of life: music genres; architectural styles; political labels. However, though imprecise, categories are vital in allowing discussion to take place. Those who protest over their skinny lattes that they are far too sophisticated to use such un-nuanced language … are more often than not just trying to shut down debate.

Categorisation does indeed simplify. And it does allow discussion to take place. Grouping together things that have features in common and labelling the groups means we can refer to large numbers of things by their collective labels, rather than having to list all their common features every time we want to discuss them. Whether all categorisation is equally helpful is another matter.

folk categorisation

The human brain categorises things as if that was what it was built for; not surprising, really, because grouping things according to their similarities and differences and referring to the groups by a label is a very effective way of reducing cognitive load.

The things we detect with our senses are categorised by our brains quickly, automatically and pre-verbally (e.g. Haxby, Gobbini & Montgomery, 2004; Greene & Fei-Fei, 2014) – by which I mean that language isn’t necessary in order to form the categories, although language is often involved in categorisation. We also categorise pre-verbally in the sense that babies start to categorise things visually (such as toy trucks and toy animals) at between 7 and 10 months of age, before they acquire language (Younger, 2003). And babies acquire language itself by forming categories (Kuhl, 2004).

Once we do start to get the hang of language, we learn how things are categorised and labelled by the communities we live in; we develop shared ways of categorising things. All human communities have these shared ‘folk’ categorisations, but not all groups categorise the same things in the same way. Nettles and chickweed would have been categorised as vegetables in the Middle Ages, but to most modern suburban gardeners they are ‘weeds’.

Not all communities agree on the categorisations they use either; political and religious groups are notorious for disagreements about the core features of their categories, who adheres to them and who doesn’t. Nor are folk categorisations equally useful in all circumstances. Describing a politician’s views as ‘right wing’ gives us a rough idea of what her views are likely to be, but doesn’t tell us what she thinks about specific policies.

Biologists have run into problems with folk categorisations too. Mushrooms/toadstools, frogs/toads and horses/ponies are all folk classifications. Biologists could distinguish between individual species, but sorting those species into either ‘mushrooms’ or ‘toadstools’ proved impossible because the differences between the two folk categories aren’t clear enough; biologists neatly sidestepped the problem by ignoring the folk distinction and grouping mushrooms and toadstools together in a single phylum. The same principle applies to frogs/toads – they form an order of their own. Horses and ponies, by contrast, are members of the same subspecies.

Incidentally, 18th and 19th century biologists weren’t categorising these organisms just because of an obsessive interest in taxonomy. Their classification had a very practical purpose: to differentiate between species and identify the relationships between them. In a Europe that was fast running out of natural resources, farmers, manufacturers and doctors all had a keen interest in the plants and animals being brought back from far-flung parts of the world by traders, and accurate identification of the different species was vital.

In short, folk categories do allow discussion to take place, but they have limitations. They’re not so useful when one needs to get down to specifics – how are particular MPs likely to vote? is this fungus toxic or not? The catch is in the two words Robert uses to describe categories – ‘though imprecise’. My complaint about his educational categorisation isn’t with categorisation per se, but with its imprecision.

‘though imprecise’

The categories people use for their own convenience don’t always have clear-cut boundaries, nor do they map neatly onto the real world. They don’t always map neatly onto other people’s categories either. Eleanor Rosch’s work on prototype theory shed some light on this. What she found was that people’s mental categories have prototypical features – features that members of the category typically share – but not every member has every prototypical feature, and members can have the features to different extents. For example, the prototypical features of most people’s category {birds} are a beak, wings, feathers and being able to fly. A robin has a beak, wings and feathers and is able to fly, so it’s strongly prototypical of the category {birds}. A penguin can’t fly but uses its wings for swimming, so it’s weakly prototypical, although still a bird.
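A toy sketch (my illustration, not Rosch’s formalism) shows how prototypicality can be thought of as degree of feature overlap with the category prototype:

```python
# Prototypicality as feature overlap: a member's typicality is the proportion
# of the category's prototypical features it possesses.

BIRD_PROTOTYPE = {"beak", "wings", "feathers", "can fly"}

def typicality(member_features, prototype=BIRD_PROTOTYPE):
    return len(member_features & prototype) / len(prototype)

robin = {"beak", "wings", "feathers", "can fly"}
penguin = {"beak", "wings", "feathers", "can swim"}

print(typicality(robin))    # 1.0  - strongly prototypical of {birds}
print(typicality(penguin))  # 0.75 - weakly prototypical, but still a bird
```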

Mushrooms and toadstools have several prototypical features in common, as do frogs and toads, and horses and ponies. The prototypical features that differentiate them are the ideas that toadstools are poisonous and often brightly coloured, that toads have a warty skin sometimes containing toxins, and that horses are much larger than ponies. Although these differential features are useful for conversational purposes, they are not much help for more specific ones, such as putting edible fungi on your restaurant menu, using a particular toxin for medicinal purposes, or breeding characteristics in or out of horses.

traditional vs progressive education

Traditional and progressive education are both types of education, obviously, so they have some prototypical features in common – teachers, learners, knowledge, schools and so on. Robert proposes some core features of progressive education that differentiate it from traditional education: it is child-centred, focuses on skills rather than knowledge, sees strict discipline and moral education as oppressive, and assumes that socio-economic background dictates success (pp. 5-8). He distilled these features from what’s been said and written about progressive education over the last fifty years, so it’s likely there’s a high degree of consensus on these core themes. The same might not be true for traditional education; Robert defines it only in terms of its core characteristics being the polar opposite of progressive education, although he appears to include in the category ‘traditional’ a list of more peripheral features, including blazers, badges and ties, and class rankings.

Robert says “though imprecise, categories are vital in allowing discussion to take place.” No doubt about that, but if the categories are imprecise the discussion can be distinctly unfruitful. A lot of time and energy can be expended trying to figure out precise definitions and how accurately those definitions map onto the real world. Nor are imprecise categories helpful if we want to do something with them other than have a discussion. Categorising education as ‘traditional’ or ‘progressive’ is fine for referring conversationally to a particular teacher’s pedagogical approach or the type of educational philosophy favoured by a government minister, but those constructs are too complex and too imprecise to be of use in research.

implicit assumptions

An implicit assumption is, by definition, an assumption that isn’t made explicit. Implicit assumptions are sneaky things: when one is used in a discussion, people following the argument often overlook the fact that it’s being made, so an assumption that’s completely wrong can easily slip by unnoticed. Sneakier still, the people making the argument are often unaware of their own implicit assumptions. In the case of mushrooms and toadstools, any biologist who tried to sort particular fungi into one or other of those categories would have been on a hiding to nothing, because the exercise rests on the implicit – but wrong – assumption that such a sorting is possible.

Robert’s thesis appears to rest on an implicit assumption that because the state education system in the last fifty years has had shortcomings, some of them serious, and because progressive educational ideas have proliferated during the same period, it follows that progressive ideas must be the cause of the lack of effectiveness. This isn’t even the ever-popular ‘correlation equals causality’ error, because as far as I can see, Robert hasn’t actually established a correlation between progressive ideas and educational effectiveness. He can’t compare current traditional and progressive state schools because traditional state schools are a thing of the past. And he can’t compare current progressive state schools with historical traditional state schools because the relevant data isn’t available. Ironically, what data we do have suggest that numeracy and literacy rates have improved overall during this period. The reliability of the figures is questionable because of grade drift, but numeracy and literacy rates have clearly not plummeted.

What he does implicitly compare is state schools that he sees as broadly progressive, with independent schools that he sees as having “withstood the wilder extremes of the [progressive] movement”. The obvious problem with this comparison is that a progressive educational philosophy is not the only difference between the state and independent sectors.

In my previous post, I agreed with Robert that the education system in England leaves much to be desired, but making an implicit assumption that there’s only one cause and that other possible causes can be ignored is a risky approach to policy development. It would be instructive to compare schools that are effective (however you measure effectiveness) with schools that are less effective, to find out how the latter could be improved. But the differences between them could boil down to some very specific issues relating to the quality of teaching, classroom management, availability of additional support or allocation of budgets, rather than whether the schools take a ‘traditional’ or ‘progressive’ stance overall.

References
Greene, MR & Fei-Fei, L (2014). Visual categorization is automatic and obligatory: Evidence from Stroop-like paradigm. Journal of Vision, 14, article 14.
Haxby, JV, Gobbini, MI & Montgomery, K (2004). Spatial and temporal distribution of face and object representations in the human brain. In MS Gazzaniga (ed.) The Cognitive Neurosciences (3rd edn.). Cambridge, MA: MIT Press.
Kuhl, P (2004). Early language acquisition: Cracking the speech code. Nature Reviews Neuroscience, 5, 831-843.
Younger, B (2003). Parsing objects into categories: Infants’ perception and use of correlated attributes. In Rakison & Oakes (eds.) Early Category and Concept Development: Making Sense of the Blooming, Buzzing Confusion. Oxford University Press.