seven myths about education: a knowledge framework

In Seven Myths about Education Daisy Christodoulou refers to Bloom’s taxonomy of educational objectives as a metaphor that leads to two false conclusions: that skills are separate from knowledge and that knowledge is ‘somehow less worthy and important’ (p.21). Bloom’s taxonomy was developed in the 1950s as a way of systematising what students need to do with their knowledge. At the time, quite a lot was known about what people did with knowledge because they usually process it actively and explicitly. Quite a lot less was known about how people acquire knowledge, because much of that process is implicit; students usually ‘just learned’ – or they didn’t. Daisy’s book focuses on how students acquire knowledge, but her framework is an implicit one; she doesn’t link up the various stages of acquiring knowledge in an explicit formal model like Bloom’s. Although I think Daisy makes some valid points about the educational orthodoxy, some features of her model lead to conclusions that are open to question. In this post, I compare the model of cognition that Daisy describes with an established framework for analysing knowledge that has its origins outside the education sector.

a framework for knowledge

Researchers from a variety of disciplines have proposed frameworks involving levels of abstraction in relation to how knowledge is acquired and organised. The frameworks are remarkably similar. Although there are differences of opinion about terminology and how knowledge is organised at higher levels, there’s general agreement that knowledge is processed along the lines of the catchily named DIKW pyramid – DIKW stands for data, information, knowledge and wisdom. The Wikipedia entry gives you a feel for the areas of agreement and disagreement involved. In the pyramid, each level except the data level involves the extraction of information from the level below. I’ll start at the bottom.



Data

As far as the brain is concerned, data don’t actually tell us anything except whether something is there or not. For computers, data are a series of 0s and 1s; for the brain, data are largely in the form of sensory input – light, dark and colour, sounds, tactile sensations, etc.

Information
It’s only when we spot patterns within data that the data can tell us anything. Information consists of patterns that enable us to identify changes, identify connections and make predictions. For computers, information involves detecting patterns in all the 0s and 1s. For the brain it involves detecting patterns in sensory input.

Knowledge
Knowledge has proved more difficult to define, but involves the organisation of information.

Wisdom
Although several researchers have suggested that knowledge is also organised at a meta-level, this hasn’t been extensively explored.

The processes involved in the lower levels of the hierarchy – data and information – are well-established thanks to both computer modelling and brain research. We know a fair bit about the knowledge level largely due to work on how experts and novices think, but how people organise knowledge at a meta-level isn’t so clear.

The key concept in this framework is information. Used in this context, ‘information’ tells you whether something has changed or not, whether two things are the same or not, and identifies patterns. The DIKW hierarchy is sometimes summarised as follows: information is information about data, knowledge is information about information, and wisdom is information about knowledge.
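The ‘information about data, information about information’ summary can be made concrete with a toy sketch in Python. This is only an illustration under stated assumptions, not part of the DIKW literature: the function names `changes`, `intervals` and `dominant_interval` are hypothetical, and treating ‘the most common gap between changes’ as a stand-in for knowledge is a deliberate oversimplification.

```python
from collections import Counter

def changes(data):
    """Information about data: the positions where a value differs from the one before it."""
    return [i for i in range(1, len(data)) if data[i] != data[i - 1]]

def intervals(change_points):
    """Information about information: the gaps between successive changes."""
    return [b - a for a, b in zip(change_points, change_points[1:])]

def dominant_interval(gaps):
    """A crude stand-in for 'knowledge': the most common gap, a regularity we can predict from."""
    return Counter(gaps).most_common(1)[0][0] if gaps else None

# Data level: a series of 0s and 1s that, on their own, say only "present" or "absent".
data = [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0]

change_points = changes(data)        # where something changed
gaps = intervals(change_points)      # how the changes are spaced
period = dominant_interval(gaps)     # the regularity behind the spacing

# Prediction: if the pattern holds, the next change falls one period after the last one.
predicted_next_change = change_points[-1] + period
```

Each function only ever looks at the output of the level below it, which is the point of the pyramid: the raw 0s and 1s never need to be revisited once the pattern has been extracted.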

a simple theory of complex cognition

Daisy begins her exploration of cognitive psychology with a quote by John Anderson, from his paper ACT: A simple theory of complex cognition (p.20). Anderson’s paper tackles the mystique often attached to human intelligence when compared to that of other species. He demonstrates that it isn’t as sophisticated or as complex as it appears, but is derived from a simple underlying principle. He goes on to explain how people extract information from data, deduce production rules and make predictions about commonly occurring patterns, which suggests that the more examples of particular data the brain perceives, the more quickly and accurately it learns. He demonstrates the principle using examples from visual recognition, mathematical problem solving and prediction of word endings.

natural learning

What Anderson describes is how human beings learn naturally; the way brains automatically process any information that happens to come their way unless something interferes with that process. It’s the principle we use to recognise and categorise faces, places and things. It’s the one we use when we learn to talk, solve problems and associate cause with effect. Scattergrams provide a good example of how we extract information from data in this way.

Scatterplot of longitudinal measurements of total brain volume for males (N=475 scans, shown in dark blue) and females (N=354 scans, shown in red).  From Lenroot et al (2007).


Although the image consists of a mass of dots and lines in two colours, we can see at a glance that the different coloured dots and lines form two clusters.

Note that I’m not making the same distinction that Daisy makes between ‘natural’ and ‘not natural’ learning (p.36). Anderson is describing the way the brain learns, by default, when it encounters data. Daisy, in contrast, claims that we learn things like spoken language without visible effort because language is ‘natural’, whereas we need to be taught inventions like the alphabet and numbers ‘formally and explicitly’. That distinction, although frequently made, isn’t necessarily a valid one. It’s based on an assumption that the brain has evolved mechanisms to process some types of data, e.g. to recognise faces and understand speech, but can’t have had time to evolve mechanisms to process recent inventions like writing and mathematics. This assumption about brain hardwiring is a contentious one, and the evidence about how brains learn (including the work that’s developed from Anderson’s theory) makes it look increasingly likely that it’s wrong. If formal and explicit instruction is necessary in order to learn man-made skills like writing and mathematics, that raises the question of how these skills were invented in the first place, and Anderson would not have been able to use mathematical problem-solving and word prediction as his examples of the underlying mechanism of human learning. The theory that the brain is hardwired to process some types of information but not others, and the theory that the same mechanism processes all information, both explain how people appear to learn some things automatically and ‘naturally’. Which theory is right (or whether both are right) is still the subject of intense debate. I’ll return to the second theory later when I discuss schemata.

data, information and chunking

Chunking is a core concept in Daisy’s model of cognition. Chunking occurs when the brain links together several bits of data it encounters frequently and treats them as a single item – groups of letters that frequently co-occur are chunked into words. Anderson’s paper is about the information processing involved in chunking. One of his examples is how the brain chunks the three lines that make up an upper case H. Although Anderson doesn’t make an explicit distinction between data and information, in his examples the three lines would be categorised as data in the DIKW framework, as would be the curves and lines that make up numerals. When the brain figures out the production rule for the configuration of the lines in the letter H, it’s extracting information from the data – spotting a pattern. Because the pattern is highly consistent – H is almost always written using this configuration of lines – the brain can chunk the configuration of lines into the single unit we call the letter H. The letters A and Z also consist of three lines, but have different production rules for their configurations. Anderson shows that chunking can also occur at a slightly higher level; letters (already chunked) can be chunked again into words that are processed as single units, and numerals (already chunked) can be chunked into numbers to which production rules can be applied to solve problems. Again, chunking can take place because the patterns of letters in the words, and the patterns of numerals in Anderson’s mathematical problems are highly consistent. Anderson calls these chunked units and production rules ‘units of knowledge’. He doesn’t use the same nomenclature as the DIKW model, but it’s clear from his model that initial chunking occurs at the data level and further chunking can occur at the information level.
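The idea that frequently co-occurring units get treated as a single item can be sketched in a few lines of Python. To be clear about the assumptions: this is not Anderson’s ACT model or his production rules. It is a simple pair-merging procedure in the style of byte-pair encoding, and the names `most_frequent_pair`, `merge_pair` and `chunk`, along with the `min_count` threshold, are hypothetical choices made for the illustration.

```python
from collections import Counter

def most_frequent_pair(sequences):
    """Return (pair, count) for the adjacent pair of units seen most often, or (None, 0)."""
    counts = Counter()
    for seq in sequences:
        counts.update(zip(seq, seq[1:]))
    return counts.most_common(1)[0] if counts else (None, 0)

def merge_pair(seq, pair):
    """Replace each non-overlapping occurrence of `pair` in `seq` with one combined unit."""
    merged, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            merged.append(seq[i] + seq[i + 1])
            i += 2
        else:
            merged.append(seq[i])
            i += 1
    return merged

def chunk(sequences, min_count=2, max_rounds=10):
    """Repeatedly merge the most frequent adjacent pair until none occurs at least `min_count` times."""
    sequences = [list(s) for s in sequences]
    for _ in range(max_rounds):
        pair, count = most_frequent_pair(sequences)
        if count < min_count:
            break
        sequences = [merge_pair(s, pair) for s in sequences]
    return sequences

# 'a' followed by 'b' occurs three times across the input, so it becomes a single unit 'ab'.
chunked = chunk(["abab", "abc"])
```

Because merged units are fed back into the next round, chunks can themselves be chunked, which loosely mirrors the step from lines to letters and from letters to words described above. Pairs that turn up only once are left alone, reflecting the point that chunking depends on the pattern being consistent.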

The brain chunks data and low-level units of information automatically; evidence for this comes from research showing that babies begin to identify and categorise objects using visual features and categorise speech sounds using auditory features by about the age of 9 months (Younger, 2003). Chunking also occurs pre-consciously (e.g. Lamme 2003); we know that people are often aware of changes to a chunked unit like a face, a landscape or a piece of music, but don’t know what has changed – someone has shaved off their moustache, a tree has been felled, the song is a cover version with different instrumentation. In addition, research into visual and auditory processing shows that sensory information initially feeds forward in the brain; a lot of processing occurs before the information reaches the location of working memory in the frontal lobes. So at this level, what we are talking about is an automatic, usually pre-conscious process that we use by default.

knowledge – the organisation of information

Anderson’s paper was written in 1995 – twenty years ago – at about the time the DIKW framework was first proposed, which explains why he doesn’t use the same terminology. He calls the chunked units and production rules ‘units of knowledge’ rather than ‘units of information’ because they are the fundamental low-level units from which higher-level knowledge is formed.

Although Anderson’s model of information processing for low-level units still holds true, what has puzzled researchers in the intervening couple of decades is why that process doesn’t scale up. The way people process low-level ‘units of knowledge’ is logical and rational enough to be accurately modelled using computer software, but when handling large amounts of information, such as the concepts involved in day-to-day life, or trying to comprehend, apply, analyse, synthesise or evaluate it, the human brain goes a bit haywire. People (including experts) exhibit a number of errors and biases in their thinking. These aren’t just occasional idiosyncrasies – everybody shows the same errors and biases to varying extents. Since complex information isn’t inherently different to simple information – there’s just more of it – researchers suspected that the errors and biases were due to the wiring of the brain. Work on judgement and decision-making and on the biological mechanisms involved in processing information at higher levels has demonstrated that brains are indeed wired up differently to computers. The reason is that what has shaped the evolution of the human brain isn’t the need to produce logical, rational solutions to problems, but the need to survive, and overall quick-and-dirty information processing tends to result in higher survival rates than slow, precise processing.

What this means is that Anderson’s information processing principle can be applied directly to low-level units of information, but might not be directly applicable to the way people process information at a higher-level, the way they process facts, for example. Facts are the subject of the next post.

References
Anderson, J (1996) ACT: A simple theory of complex cognition, American Psychologist, 51, 355-365.
Lamme, VAF (2003) Why visual attention and awareness are different, TRENDS in Cognitive Sciences, 7, 12-18.
Lenroot, RK, Gogtay, N, Greenstein, DK, Molloy, E, Wallace, GL, Clasen, LS, Blumenthal, JD, Lerch, J, Zijdenbos, AP, Evans, AC, Thompson, PM & Giedd, JN (2007). Sexual dimorphism of brain developmental trajectories during childhood and adolescence. NeuroImage, 36, 1065–1073.
Younger, B (2003). Parsing objects into categories: Infants’ perception and use of correlated attributes. In Rakison & Oakes (eds.) Early Category and Concept development: Making sense of the blooming, buzzing confusion, Oxford University Press.


the venomous data bore

Robert Peal has posted a series of responses to critics of his book Progressively Worse here. The second is on ‘data and dichotomies’. In this post I want to comment on some of the things he says about data and evidence.

when ‘evidence doesn’t work’

Robert* refers back to a previous post entitled ‘When evidence doesn’t work’ summarising several sessions at the ResearchED conference held at Dulwich College last year. He rightly draws attention to the problem of hard-to-measure outcomes, and to which outcomes we decide to measure in the first place. But he appears to conclude that there are some things – ideology, morality, values – that are self-evidently good or bad and that are outside the remit of evidence.

In his response to critics, Robert claims that one reason ‘evidence doesn’t work’ is because “some of the key debates in education are based on value judgements, not efficacy.” This is certainly true – and those key debates have resulted in a massive waste of resources in education over the past 140 years. There’s been little consensus on what long-term outcomes people want from the education system, what short-term outcomes they want, what pedagogies are effective and how effectiveness can be assessed. If a decision as to whether Shakespeare ‘should’ be studied at GCSE is based on value judgements it’s hardly surprising it’s been the subject of heated debate for decades. Robert’s conclusion appears to be that heated debate about value judgements is inevitable because values aren’t things that lend themselves to being treated as evidence. I disagree.

data

I think he draws this conclusion because his view of data is rather limited. Data don’t just consist of ‘things we can easily measure’ like exam results (Robert’s second reason why ‘evidence doesn’t work’). They don’t have to involve measuring things at all; qualitative data can be very informative. Let’s take the benefits of studying Shakespeare in school. Robert asks “Can an RTC tell us, for example, whether secondary school pupils benefit from studying Shakespeare?” If it was carefully controlled it could, though we would have to tackle the question of what outcomes to measure. But randomised controlled trials are only one of many methods for gathering data. Collecting qualitative data from a representative sample of the population about the impact studying Shakespeare had had on their lives could give some insights, not only into whether Shakespeare should be studied in school, but how his work should be studied. And whether people should have the opportunity to undertake some formal study of Shakespeare in later life if they wanted to. People might appreciate actually being asked.

venomous data bore*

venomous data bore Buprestis octoguttata§

opinion

I don’t know whether Robert sees me as what he refers to as a ‘data bore’, but if he does I accept the epithet as a badge of honour. For the record however, not only have I never let a skinny latte pass my lips, but the word ‘nuanced’ has never done so either (not in public, at least). Nor do I have a “lofty distain for anything so naïve as ‘having an opinion’”.

I’m more than happy for people to have opinions and to express them and for them to be taken into account when education policy is being devised. But not all opinions are equal. They can vary between professional, expert opinion derived from a thorough theoretical knowledge and familiarity with a particular research literature, through well-informed personal opinion, to someone simply liking or not liking something but not having a clue why. I would not want to receive medical treatment based on a vox pop carried out in my doctor’s waiting room, nor do I want a public sector service to be designed on a similar basis. If it is, then the people who voice their opinions most loudly are likely to get what they want, leaving the rest of us, ‘data bores’ included, to work on the damage limitation.

rationality and values

Robert appears to have a deep suspicion of rationality. He says “rational man believes that they can make their way in the world without recourse to the murky business of ideology and morality, or to use a more contemporary term, ‘values’.” He also says it was ‘terrific’ to hear Sam Freedman expound the findings of Jonathan Haidt and Daniel Kahneman “about the dominance of the subconscious, emotional part of our minds, over the logical, conscious part.” He could add Antonio Damasio to that list. There’s little doubt that our judgement and decision-making is dominated by the subconscious emotional part of our minds. That doesn’t mean it’s a good thing.

Ideology, morality and values can inspire people to do great things, and rationality can inflict appalling damage, but it’s not always like that. Every significant step that’s ever been taken towards reducing infant mortality, maternal mortality, disease, famine, poverty and conflict and every technological advance ever made has involved people using the ‘logical conscious part’ of their minds as well as, or instead of, the ‘subconscious emotional part’. Those steps have sometimes involved a lifetime’s painstaking work in the teeth of bitter opposition. In contrast, many of the victims of ideology, morality and values lie buried where they fell on the world’s battlefields.

Robert’s last point about data is that they are “simply not able to ‘speak for themselves’. Its voice is always mediated by human judgement.” That’s not quite the impression given on page 4 of his book when referring to a list of statistics he felt showed there was a fundamental problem in British education. In the case of these statistics, ‘the bare figures are hard to ignore’.

Robert is quite right that the voice of the data is always mediated by human judgement, but we have devised ways of interpreting the data that make them less susceptible to bias. The data are perfectly capable of speaking for themselves, if we know how to listen to them. Clearly the researcher, like the historian, suffers from selection bias, but some fields of discourse, unlike history it seems, have developed robust methodologies to address that. The biggest problem faced by the data is that they can’t get a word in edgeways because of all the opinion being voiced.

endnote

According to this tweet from Civitas…

civitas venom

Robert says he has responded to criticism in blogs by Tim Taylor, Guy Woolnough and myself. I’m doubtless biased, but the comment most closely resembling ‘venom’ that I could find was actually in a scurrilous tweet from Debra Kidd, shown in Robert’s third response to his critics. Debra, shockingly for a teacher, uses a four-letter-word to describe Robert’s description of state schools as ‘a persistent source of national embarrassment’. She calls it ‘tosh’. If Civitas thinks that’s venom, it clearly has little experience of academia, politics or the playground. Rather worrying on all counts, if it’s a think tank playing a significant role in education reform.

* I felt we should be on first name terms now we’ve had a one-to-one conversation about statistics.

§ Image courtesy Christian Fischer from Britannica Kids.

It’s not really a venomous data bore; it’s a metallic wood-boring beetle. It’s not really metallic either; it just looks like it. Nor does the beetle bore wood; its larvae do. Words can be so misleading.