biologically primary and secondary knowledge?

David Geary is an evolutionary psychologist who developed the concept of biologically primary and biologically secondary knowledge, popular with some teachers. I’ve previously critiqued Geary’s ideas as he set them out in a chapter entitled Educating the Evolved Mind. One teacher responded by suggesting I read Geary’s The Origin of Mind because it explained his ideas in more detail. So I did.

Geary’s theory

If I’ve understood correctly, Geary’s argument goes like this:

The human body and brain have evolved over time in response to environmental pressures ranging from climate and diet through to social interaction. For Geary, social interaction is a key driver of evolved brain structures because social interactions can increase the resources available to individuals.

Environmental pressures have resulted in the evolution of brain ‘modules’ specialising in processing certain types of information, such as language or facial features. Information is processed by the modules rapidly, automatically and implicitly, resulting in heuristics (rules of thumb) characteristic of the ‘folk’ psychology, biology and physics that form the default patterns for the way we think. But we are also capable of flexible thought that overrides those default patterns. The flexibility is due to the highly plastic frontal areas of our brain responsible for intelligence. Geary refers to the thinking using the evolved modules as biologically primary, and that involving the plastic frontal areas as biologically secondary.

Chapters 2 & 3 of The Origin of Mind offer a clear, coherent account of Darwinian and hominid evolution respectively. They’d make a great resource for teachers. But when Geary moves on to cognition his model begins to get a little shaky – because it rests on several assumptions.

Theories about evolution of the brain are inevitably speculative because brain tissue decomposes and the fossil record is incomplete. Theories about brain function also involve speculation because our knowledge about how brains work is incomplete. There’s broad agreement on the general principles, but some hypotheses have generated what Geary calls ‘hot debate’. Despite acknowledging the debates, Geary builds his model on assumptions about which side of each debate is correct. The assumptions involve the modularity of the brain, folk systems, intelligence, and motivation-to-control.

modularity

The general principle of modularity – that there are specific areas of the brain dedicated to processing specific types of information – is not in question. What is less clear is how specialised the modules are. For example, the fusiform face area (FFA) specialises in processing information about faces. But not just faces. It has also been shown to process information about cars, birds, butterflies, chess pieces, Digimon, and novel items called greebles. This raises the question of whether the FFA evolved to process information about faces as such (the Face Specific Hypothesis), or to process information about objects requiring fine-grained discrimination (the Expertise Hypothesis). Geary comes down on the Faces side of the debate on the grounds that the FFA does not “generally respond to other types of objects … that do not have facelike features, except in individuals with inherent sociocognitive deficits, such as autism” (p.141). Geary is entitled to his view, but that’s not the only hotly debated interpretation of the evidence.

folk systems

The general principle of ‘folk’ systems – evolved forms of thought that result from information being processed rapidly, automatically and implicitly – is also not in question. Geary admits it’s unclear whether the research is “best understood in terms of inherent modular constraints, or as the result of general learning mechanisms” but comes down on the side of children’s thinking being the result of “inherent modular systems”. I couldn’t find a reference to Eleanor Rosch’s prototype theory developed in the 1970s, which explains folk categories in terms of general learning mechanisms. And it’s regrettable that Rakison & Oakes’ 2008 review of research into how children form categories (that also lends weight to the general learning mechanisms hypothesis) wasn’t published until three years after The Origin of Mind. I don’t know whether either would have prompted Geary to amend his theory.

intelligence

In 1904 Charles Spearman published a review of attempts to measure intellectual ability. He concluded that the correlations between various specific abilities indicated “that there really exists a something that we may provisionally term ‘General Sensory Discrimination’ and similarly a ‘General Intelligence’” (Spearman p.272).

It’s worth looking at what the specific abilities included. Spearman ranks (p. 276) in order of their correlation with ‘General Intelligence’, performance in: Classics, Common Sense, Pitch Discrimination, French, Cleverness, English, Mathematics, Pitch Discrimination among the uncultured, Music, Light Discrimination and Weight Discrimination.

So, measures of school performance turned out to be good predictors of school performance. The measures of school performance correlated strongly with ‘General Intelligence’ – a construct derived from… the measures of school performance. This tautology wasn’t lost on other psychologists and Spearman’s conclusions received considerable criticism. As Edwin Boring pointed out in 1923, ‘intelligence’ is defined by the content of ‘intelligence’ tests. The correlations between specific abilities and the predictive power of intelligence tests are well-established. What’s contentious is whether they indicate the existence of an underlying ‘general mental ability’.
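To make the point about correlations concrete, here is a toy simulation. It is my own illustration under stated assumptions, not an analysis taken from Spearman or Geary: the function names, the numbers of people, processes and tests, and the idea that each test draws on a random subset of many small independent processes are all hypothetical. No single general ability is built into the simulated people, yet every pair of tests ends up positively correlated simply because the tests overlap in what they draw on.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate(n_people=500, n_processes=40, n_tests=5, per_test=20, seed=1):
    """Return the test-by-test correlation matrix for a simulated population."""
    rng = random.Random(seed)
    # Assumption: each test draws on its own random subset of small processes.
    tests = [rng.sample(range(n_processes), per_test) for _ in range(n_tests)]
    # Assumption: every person's level on every process is independent,
    # so there is no single common factor anywhere in the data.
    people = [[rng.gauss(0, 1) for _ in range(n_processes)]
              for _ in range(n_people)]
    # A person's score on a test is the sum of the processes that test uses.
    scores = [[sum(person[p] for p in subset) for person in people]
              for subset in tests]
    return [[pearson(scores[i], scores[j]) for j in range(n_tests)]
            for i in range(n_tests)]

corr = simulate()
# Correlations between each distinct pair of tests.
off_diagonal = [corr[i][j] for i in range(5) for j in range(5) if i < j]
for row in corr:
    print(" ".join(f"{r:5.2f}" for r in row))
```

This doesn’t show that g doesn’t exist. It only shows that positive correlations between tests can’t, on their own, settle whether there’s ‘a something’ behind them.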

Geary says the idea that children’s intellectual functioning can be improved is ‘hotly debated’ (p.295). But he appears to look right past the even hotter debate that’s raged since Spearman’s work was published, about whether the construct general intellectual ability (g) actually represents ‘a something’ that ‘really exists’. Geary assumes it does, and also accepts Cattell’s later constructs crystallised and fluid intelligence without question.

Clearly some people are more ‘intelligent’ than others, so the idea of g initially appears valid. But ‘intelligence’ is, ironically, a ‘folk’ construct. It’s a label we apply to a set of loosely defined characteristics – a useful shorthand descriptive term. It doesn’t follow that ‘intelligence’ is a biologically determined ‘something’ that ‘really exists’.

motivation-to-control

The motivation to control relationships, events and resources is a key part of Geary’s theory. He argues that motivation-to-control is an evolved disposition (inherent in the way people think) that manifests itself most clearly in the behaviour of despots – who seek to maximise their control of resources. Curiously, in referring to despots, Geary cites a paper by Herb Simon (Simon, 1990) on altruism (a notoriously knotty problem for evolution researchers). Geary describes an equally successful alternative strategy to despotism, not as altruism but as “adherence to [social] laws and mores”, even though the evidence suggests altruism is an evolved disposition, not merely a behaviour.

Altruism calls into question the control part of the motivation-to-control hypothesis. Many people have a tendency to behave in ways that increase their control of resources, but many tend to collaborate and co-operate instead, strategies that increase individual access to resources, despite reducing individual control over them. The altruism debate is another that’s been going on for decades, but you wouldn’t know that to read Geary.

Then there’s the motivation part. Like ‘intelligence’, ‘motivation’ is a label for a loosely defined bunch of factors that provide incentives for behaviour. ‘Motivation’ is a useful label. But again it doesn’t follow that ‘motivation’ is ‘a something’ that ‘really exists’. The biological mechanisms involved in the motivation to eat or drink are unlikely to be the same as those involved in wanting to marry the boss’s daughter or improve on our personal best for the half-marathon. The first two examples are likely to increase our access to resources; whether they increase our control over them will depend on the circumstances. Geary doesn’t explain the biological mechanism involved.

biologically primary and secondary knowledge

In The Origin of Mind, Geary touches on the idea of biologically primary and secondary competencies and abilities but doesn’t go into detail about their implications for education. Instead, he illustrates the principle by referring to the controlled problem solving used by Charles Darwin and Alfred Wallace in tackling the problem of how different species had arisen.

Geary says that such problem solving requires the inhibition of ‘heuristic-based folk systems’ (p.197), and repeatedly proposes (pp.188, 311, 331, 332) that the prior knowledge of scientific pioneers such as Linnaeus, Darwin and Wallace “arose from evolved folk biological systems…as elaborated by associated academic learning” (p.188). He cites as evidence the assumptions resulting from religious belief made by anatomist and palaeontologist Richard Owen (p.187), and Wallace’s reference to an ‘Overruling Intelligence’ being behind natural selection (p.83). But this proposal is problematic, for three reasons:

The first problem is that some ‘evolved’ folk knowledge is explicit, not implicit. Belief in a deity is undoubtedly folk knowledge; societies all over the world have come up with variations on the concept. But the folk knowledge about religious beliefs is usually culturally transmitted to children, rather than generated by them spontaneously.

Another difficulty is that thinkers such as Linnaeus, Darwin and Wallace had a tendency to be born into scholarly families, so their starting point, even as young children, would not have been merely ‘folk biological systems’. And each of the above had the advantage of previous researchers having already reduced the problem space.

A third challenge is that heuristics aren’t exclusively biologically primary; they can be learned, as Geary points out, via biologically secondary knowledge (p.185).

So if biologically primary knowledge sometimes involves explicit instruction, and biologically secondary knowledge can result in the development of fast, automatic, implicit heuristics, how can we tell which type of knowledge is which?

use of evidence

Geary accepts contentious constructs such as motivation, intelligence and personality (p.319) without question. And he appears to have a rather idiosyncratic take on concepts such as bounded rationality (p.172), satisficing (p.173) and schemata (p.186).

In addition, it’s not always Geary’s evidence that is contentious; sometimes it’s his conclusions that are tenuous. For example, he predicts that if social competition were a driving force during evolution, “a burning desire to master algebra or Newtonian physics will not be universal or even common. Surveys of the attitudes and preferences of American schoolchildren support this prediction and indicate that they value achievement in sports … much more than achievement in any academic area” (pp.334-5), citing a 1993 paper by Eccles et al. The ‘surveys’ were two studies, the ‘American schoolchildren’ 865 elementary school students, the ‘attitudes and preferences’ competence beliefs and task values, and the ‘academic areas’ math, reading and music. Responses show some statistically significant differences. Geary appears to overegg the evidential pudding somewhat, and to completely look past the possibility that there might be culturally transmitted factors involved.


I find Geary’s model perplexing. Most of the key links in it – brain evolution, brain modularity, the heuristics and biases that result in ‘folk’ thinking, motivation and intelligence – involve highly contentious hypotheses. Geary mentions the ‘hot debates’ but doesn’t go into detail. He simply comes down on one side of the debate and builds his model on the assumption that that side is correct.

He appears to have developed an overarching model of cognition and learning and squeezed the evidence into it, rather than building the model according to the evidence. The problem with the second approach, of course, is that if the evidence is inconclusive, you can’t develop an overarching model of cognition and learning without it being highly speculative.

What also perplexes me about Geary’s model is its purpose. Teachers have been aware of the difference between implicit and explicit learning (even if they didn’t call it that) for centuries. It’s useful for them to know about brain evolution and modularity and the heuristics and biases that result in ‘folk’ thinking etc. But teachers can usually spot whether children are learning something apparently effortlessly (implicitly) or whether they need step-by-step (explicit) instruction. That’s essentially why teachers exist. Why do they need yet another speculative educational model?


Eccles, J., Wigfield, A., Harold, R.D. & Blumenfeld, P. (1993). Age and gender differences in children’s self- and task perceptions during elementary school. Child Development, 64, 830-847.

Gauthier, I., Tarr, M.J., Anderson, A.W., Skudlarski, P. & Gore, J.C. (1999). Activation of the middle fusiform ‘face area’ increases with expertise in recognizing novel objects. Nature Neuroscience, 2, 568-573.

Rakison, D.H. & Oakes, L.M. (eds) (2008). Early Category and Concept Development. Oxford University Press.

Simon, H.A. (1990). A mechanism for social selection and successful altruism. Science, 250, 1665-1668.

Spearman, C. (1904). ‘General Intelligence’ objectively determined and measured. The American Journal of Psychology, 15, 201-292.




memories are made of this

Education theory appears to be dominated by polarised debates. I’ve just come across another: minimal guidance vs direct instruction. Harry Webb has helpfully brought together what he calls the Kirschner, Sweller & Clark cycle of papers that seem to encapsulate it. The cycle consists of papers by these authors and responses to them, mostly published in Educational Psychologist during 2006-7.

Kirschner, Sweller & Clark are opposed to minimal guidance approaches in education and base their case on the structure of human cognitive architecture. As they rightly observe, “Any instructional procedure that ignores the structures that constitute human cognitive architecture is not likely to be effective” (p.76). I agree completely, so let’s have a look at the structures of human cognitive architecture they’re referring to.

Older models

Kirschner, Sweller & Clark claim that “Most modern treatments of human cognitive architecture use the Atkinson and Shiffrin (1968) sensory memory–working memory–long-term memory model as their base” (p.76).

That depends on how you define ‘using a model as a base’. Atkinson and Shiffrin’s model is 45 years old. 45 years is a long time in the fast-developing field of brain research, so claiming that modern treatments use it as their base is a bit like claiming that modern treatments of blood circulation are based on William Harvey’s work (1628) or that modern biological classification is based on Carl Linnaeus’ system (1735). It would be true to say that modern treatments are derived from those models, but our understanding of circulation and biological classification has changed significantly since then, so the early models are almost invariably referred to only in an historical context. A modern treatment of cognitive architecture might mention Atkinson & Shiffrin if describing the history of memory research, but I couldn’t see why anyone would use it as a base for an educational theory – because the reality has turned out to be a lot more complicated than Atkinson and Shiffrin could have known at the time.

Atkinson and Shiffrin’s model was influential because it provided a coherent account of some apparently contradictory research findings about the characteristics of human memory. It was also based on the idea that features of information processing systems could be universally applied; that computers worked according to the same principles as did the nervous systems of sea slugs or the human brain. That idea wasn’t wrong, but the features of information processing systems have turned out to be a bit more complex than was first imagined.

The ups and downs of analogies

Theoretical models are rather like analogies; they are useful in explaining a concept that might otherwise be difficult for people to grasp. Atkinson and Shiffrin’s model essentially made the point that human memory wasn’t a single thing that behaved in puzzlingly different ways in different circumstances, but that it could have three components, each of which behaved consistently but differently.

But there’s a downside to analogies (and theoretical models); sometimes people forget that analogies are for illustrative purposes only, and that models show what hypotheses need to be tested. So they remember the analogy/model and forget what it’s illustrating, or they assume the analogy/model is an exact parallel of the reality, or, as I think has happened in this case, the analogy/model takes on a life of its own.

You can read most of Atkinson & Shiffrin’s chapter about their model online (the full reference is below). There’s a diagram on p.113. Atkinson and Shiffrin’s model is depicted as consisting of three boxes. One box is the ‘sensory register’ – sensory memory that persists for a very short time and then fades away. The second box is a short-term store with a very limited capacity (5-9 items of information) that can retain that information for a few seconds. The third box is a long-term store, where information is retained indefinitely. The short-term and long-term stores are connected to each other and information can be transferred between them in both directions. The model is based on what was known in 1968 about how memory behaved, but Atkinson and Shiffrin are quite explicit that there was a lot that wasn’t known.

Memories are made of this

Anyone looking at Atkinson & Shiffrin’s model for the first time could be forgiven for thinking that the long-term memory ‘store’ is like a library where memories are kept. That was certainly how many people thought about memory at the time. One of the problems with that way of thinking about memory is that the capacity required to store all the memories that people clearly do store would exceed the number of cells in the brain, and that accessing the memories by systematically searching through them would take a very long time – which it often doesn’t.

This puzzle was solved by the gradual realisation that the brain didn’t store individual memories in one place as if they were photographs in a huge album, but that ‘memories’ were activated via a vast network of interconnected neurons. A particular stimulus would activate a particular part of the neural network and that activation is the ‘memory’.

For example, if I see an apple, the pattern of light falling on my retina will trigger a chain of electrical impulses that activates all the neurons that have previously been activated in response to my seeing an apple. Or hearing about or reading about or eating apples. I will recall other apples I’ve seen, how they smell and taste, recipes that use apples, what the word ‘apple’ sounds like, how it’s spelled and written, ‘apple’ in other languages etc. That’s why memories can (usually) be retrieved so quickly. You don’t have to search through all memories to find the one you want. As Antonio Damasio puts it:

“Images are not stored as facsimile pictures of things, or events or words, or sentences…In brief, there seem to be no permanently held pictures of anything, even miniaturized, no microfiches or microfilms, no hard copies… as the British psychologist Frederic Bartlett noted several decades ago, when he first proposed that memory is essentially reconstructive.” (p.100)
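The contrast between a library and a network can be sketched in a few lines of code. This is a deliberately crude toy, not a model of how neurons work: the ‘experiences’, the function names and the rule that co-occurrence strengthens a link are all my own assumptions for illustration. The ‘library’ version has to look at every stored record in turn; the ‘network’ version keeps no records at all, only the strengths of connections between features, and a cue simply activates whatever it’s connected to.

```python
from collections import defaultdict

# Hypothetical toy "experiences": each is a set of features met together.
EXPERIENCES = [
    {"apple", "red", "sweet", "crunchy"},
    {"apple", "red", "tree"},
    {"apple", "pie", "cinnamon", "recipe"},
    {"apple", "pomme", "French lesson"},
    {"bus", "red", "London"},
]

def library_search(records, cue):
    """'Library' view: inspect every stored record, one after another."""
    checked = 0
    hits = []
    for record in records:
        checked += 1
        if cue in record:
            hits.append(record)
    return hits, checked

def build_network(records):
    """'Network' view: keep only connection strengths between features."""
    links = defaultdict(lambda: defaultdict(int))
    for record in records:
        for a in record:
            for b in record:
                if a != b:
                    links[a][b] += 1  # each co-occurrence strengthens the link
    return links

def activate(links, cue):
    """Features connected to the cue, strongest link first (ties by name)."""
    neighbours = links.get(cue, {})
    return sorted(neighbours, key=lambda f: (-neighbours[f], f))

links = build_network(EXPERIENCES)
hits, checked = library_search(EXPERIENCES, "apple")
print("library search checked", checked, "records")
print("network activation for 'apple':", activate(links, "apple"))
```

In the network version nothing resembling an individual ‘memory’ is kept once the links have been built; what comes back for ‘apple’ is reconstructed from the connections, which is the sense of ‘reconstructive’ in the quotation above.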

But Atkinson and Shiffrin don’t appear to have thought of memory in this way when they developed their model. Their references to ‘store’ and ‘search’ suggest they saw memory as more of a library than a network. That’s also how Kirschner, Sweller & Clark seem to view it. Although they say “our understanding of the role of long-term memory in human cognition has altered dramatically over the last few decades” (p.76), they repeatedly refer to long-term memory as a ‘store’ ‘containing huge amounts of information’. I think that description is misleading. Long-term memory is a property of neural networks – if any information is ‘stored’ it’s stored in the pattern and strength of the connections between neurons.

This is especially noticeable in the article the authors published in 2012 in American Educator, from which it’s difficult not to draw the conclusion that long-term memory is a store that contains many thousands of schemas, rather than a highly flexible network of connections that can be linked in an almost infinite number of ways.

Where did I put my memory?

In the first paper I mentioned, Kirschner, Sweller & Clark also refer to long-term memory and working memory as ‘structures’. Although they could mean ‘configurations’, the use of ‘structures’ does give the impression that there’s a bit of the brain dedicated to storing information long-term and another where it’s just passing through. Although some parts of the brain do have dedicated functions, those localities should be thought of as localities within a network of neurons. Information isn’t stored in particular locations in the brain; it’s distributed across it, although particular connections are located in particular places in the brain.

Theories having a life of their own

Atkinson and Shiffrin’s model isn’t exactly wrong; human memory does encompass short-lived sensory traces, short-term buffering and information that’s retained indefinitely. But implicit in their model are some assumptions about the way memory functions that have been superseded by later research.

At first I couldn’t figure out why anyone would base an educational theory on an out-dated conceptual model. Then it occurred to me that that’s exactly what’s happened in respect of theories about child development and autism. In both cases, someone has come up with a theory based on Freud’s ideas about children. Freud’s ideas in turn were based on his understanding of genetics and how the brain worked. Freud died in 1939, over a decade before the structure of DNA was discovered, and two decades before we began to get a detailed understanding of how brains process information. But what happened to the theories of child development and autism based on Freud’s understanding of genetics and brain function, is that they developed an independent existence and carried on regardless, instead of constantly being revised in the light of new understandings of genetics and brain function. Theories dominating autism research are finally being presented with a serious challenge from geneticists, but child development theories still have some way to go. Freud did a superb job with the knowledge available to him, but that doesn’t mean it’s a good idea to base a theory on his ideas as if new understandings of genetics and brain function haven’t happened.

Again I completely agree with Kirschner, Sweller & Clark that “any instructional procedure that ignores the structures that constitute human cognitive architecture is not likely to be effective”, but basing an educational theory on one aspect of human cognitive architecture – memory – and on an outdated concept of memory at that, is likely to be counterproductive.

A Twitter discussion of the Kirschner, Sweller & Clark model centred around the role of working memory, which is what I plan to tackle in my next post.


Atkinson, R & Shiffrin, R (1968). Human memory: A proposed system and its control processes. In K. Spence & J. Spence (Eds.), The psychology of learning and motivation (Vol. 2, pp. 89–195). New York: Academic Press.
Clark, RE, Kirschner, PA & Sweller, J (2012). Putting students on the path to learning: The case for fully guided instruction. American Educator, Spring.
Damasio, A (1994). Descartes’ Error. Vintage Books.
Kirschner, PA, Sweller, J & Clark, RE (2006). Why Minimal Guidance During Instruction Does Not Work: An Analysis of the Failure of Constructivist, Discovery, Problem-Based, Experiential, and Inquiry-Based Teaching. Educational Psychologist, 41, 75-86.