“waiter’s memory”

At the ResearchED conference last Saturday, when I queried the usefulness of the diagram of working memory that was being used, I was asked two questions. Here’s the first:

What’s wrong with Willingham’s model of working memory?

Nothing’s wrong with Willingham’s model. As far as I can tell, the diagram of working memory that was being used by teachers at the ResearchED conference had been simplified to illustrate two key points: that working memory has limited capacity, and that information can be transferred from working memory to long-term memory and vice versa.

My reservation about it is that if it’s the only model of working memory you’ve seen, you won’t know what Willingham has left out, nor how working memory fits into the way the brain processes information. And over-simplified models of things, if unconstrained by reality, tend to take on a life of their own which doesn’t help anyone. The left-brain right-brain mythology is a case in point. An oversimplified understanding of the differences between right and left hemispheres followed by a process of Chinese whispers ended up producing some bizarre educational practices.

The second question was this:

What difference would it make if we knew more about how information is processed in the brain?

It’s a good question. The short answer is that if you rely on Willingham’s diagram for your understanding of working memory, you could conclude, as some people have done, that direct instruction is the only way students should be taught. As I hope I showed in my previous post, the way information is processed is more complex than the diagram suggests. I think there are three key points that are worth bearing in mind.

Long-term memory is constantly being updated by incoming sensory information

Children are learning all the time. They learn implicitly, informally and incidentally from their environment as well as explicitly when being taught. It’s well worth utilising that ability to learn from ‘background’ information. Posters, displays, playground activities, informal conversations, and DVDs and books used primarily for entertainment, can all exploit implicit, informal and incidental learning that will support, extend and reinforce explicit learning.

We’re not always aware that we are learning

I only need two or three exposures to an unfamiliar place, or face or song before I can recognise it again, and I don’t need to actively pay attention to, or put any effort into recalling, the place, face or song in order to do so. I would have reliably learned new things, but my learning would be implicit. I wouldn’t be able to give accurate directions, describe the face so that someone else would recognise it, or hum the tune. (Daniel Willingham suggests that implicit memory doesn’t exist, but he’s talking about the classification rather than the phenomenon.)

Peter Blenkinsop and I found that we were using different definitions of learning. My definition was: long-term changes to the brain as a result of incoming information. His was: being able to explicitly recall information from long-term memory. Both definitions are valid, but they are different.

Working memory is complex

George Miller’s paper ‘The magical number seven, plus or minus two’ is well worth reading. What’s become clear since Miller wrote it is that his finding that working memory can handle only 7±2 bits of information at once applies to the loops/sketchpads/buffers in working memory. At first, it was assumed there was only one loop/sketchpad/buffer. Since then more have been discovered. In addition, due to information being chunked, the amount of information in the loops/sketchpads/buffers can actually be quite large. On top of that, the central executive is simultaneously monitoring information from the environment, the body and long-term memory. That’s quite a lot of information flowing through working memory all the time. We don’t actively pay attention to all of it, but it doesn’t follow that anything we don’t pay attention to disappears forever. In addition to working memory capacity there are several other things the brain does that make it easier, or harder, for people to learn.

Things that make learning easier (and harder)

1. Pre-existing information

People learn by extending their existing mental schemata. This involves extending neural networks – literally. If information is totally novel to us, it won’t mean anything to us and we’re unlikely to remember it. Because each human being has had a unique set of life experiences, each of us has a unique set of neural networks and the way we structure our knowledge is also unique. It doesn’t follow that everybody’s knowledge framework is equally valid. The way the world is structured and the way it functions are pretty reliable and we know quite a lot about both. Students do need to acquire core knowledge about the world and it is possible to teach it. Having said that, there are often fundamental disagreements within knowledge domains about the nature of that core knowledge, so students also need to know how to look at knowledge from different perspectives and how to test its reliability and validity.

Tapping into children’s existing schemata, not just those relating to what they are supposed to be learning in school but what they know about the world in general, can provide hooks on which to hang tricky concepts. Schemata from football, pop culture or Dr Who can be exploited, not in order to make learning ‘fun’, but to make sense of it. That doesn’t mean that teachers have to refer to pop culture, or that they should do so if it’s likely to prove a distraction.

2. Multi-sensory input

Because learning is about the real world and takes place in the real world, it usually involves more than one sensory modality – human beings rely most heavily on the visual, auditory and tactile senses. Neural connections linking information from several sensory modalities make things we’ve learned more secure because they can be accessed via several different sensory routes. It also makes sense to map the way information is presented as accurately as possible onto what it relates to in the real world. Visits, audio-visuals, high quality illustrations and physical activities can convey information that chalk-and-talk and a focus on abstract information can’t. Again, the job of multi-sensory vehicles for learning isn’t to make the learning ‘fun’ (although they might do that) or to distract the learner, but to increase the amount of information available.

3. Trial-and-error

The brain relies on trial-and-error feedback to fine-tune skills and ensure that knowledge is fit for purpose. We call trial-and-error learning in young children ‘play’. Older children and adults also use play to learn – if they get the opportunity. In more formal educational settings, formative assessment that gives feedback to individual students is a form of trial-and-error learning. It’s important to note that human beings tend to attach greater weight to the risk of failure and sanctions than they do to opportunities for success and reward. This means that tasks need to be challenging but not too challenging. Too many failures – or too many successes – can reduce interest and motivation.

4. Rehearsal

Willingham emphasises the importance of rehearsal in learning. The more times neural networks are activated, the stronger the connections become within them, and the more easily information will be recalled. Rehearsal at intervals is more effective than ‘cramming’. That’s because the connections between neurons have to be formed, physically, and there’s no opportunity for that to happen if the network is being constantly activated by incoming information. There’s a reason why human beings need rest and relaxation.

5. Problem-solving

Willingham is often quoted as saying ‘the brain is not designed for thinking’. That’s true in the sense that our brains default to quick-and-dirty solutions to problems rather than using logical, rational thought. What’s also true is what Willingham goes on to say: ‘people like to solve problems, but not to work on unsolvable problems’ (p.3). The point he’s making is that our problem-solving capacity is limited. Nonetheless, human technology bears witness to the fact that human beings are problem-solvers extraordinaire, and the attempts to resolve problems have resulted in a vast body of knowledge about how the world works. It’s futile to expect children to do all their learning by problem-solving, but because problem-solving involves researching, re-iterating, testing and reconfiguring knowledge it can be an effective way of acquiring new information and making it very memorable.

6. Writing things down

Advocates of direct instruction place a lot of emphasis on the importance of long-term memory; the impression one gets is that if factual information is memorised it can be recalled whenever it’s needed. Unfortunately, long-term memory doesn’t work like that. Over time information fades if it’s not used very often and memories can become distorted (assuming they were accurate in the first place). If we’ve acquired a great deal of factual information, we won’t have time to keep rehearsing all of it to keep it all easily accessible. Memorising factual information we currently need makes sense, but what we need long-term is factual information to hand when required, and that’s why we invented writing. And books. And the internet, although that has some of the properties of long-term memory. Recording information enormously increases the capacity and reliability of long-term memory.


In a classic Sesame Street sketch, Mr Johnson the restaurant customer suggests that Grover the waiter write down his order. Grover is affronted: “Sir! I am a trained professional! I do not need to write things down. Instead, I use my ‘waiter’s memory’.” Waiters are faced with an interesting memory challenge: they need to remember a customer’s order for longer than is usually possible in working memory, but don’t need to remember the order long-term. So they tend to use technical support in the form of a written note. The sketch is worth watching, because it’s a beautiful illustration of how a great deal of information can be packed into a small timeframe, without any obvious working memory overload. (First time round most children would miss some of it, but Sesame Street repeats sketches for that reason.)


It won’t have escaped the attention of some readers that I have offered evidence from cognitive science to support educational methods lumped together as ‘minimal guidance’ and described as ‘failing’ by Kirschner, Sweller and Clark: constructivist, discovery, problem-based, experiential, and inquiry-based teaching. A couple of points are worth noting in relation to these approaches.

The first is that they didn’t appear suddenly out of the blue. Each of them has emerged at different points in time from 150 years of research into how human beings learn. We do learn by experiencing, inquiring, discovering, problem-solving and constructing our knowledge in different ways. There is no doubt about that. There’s also no doubt that we can learn by direct instruction.

The second point is that these approaches have demonstrably failed to ensure that all children have a good knowledge of how the world works because they have been extended beyond what George Kelly called their range of convenience.

In other words they’ve been applied inappropriately. You can’t just construct your own understanding of the world and expect the world to conform to it. Trying to learn everything by experience, discovery, inquiry or problem-solving is a waste of effort if someone’s already experienced, discovered or inquired about it, or if a problem’s already been solved. Advocates of direct instruction are quite right to point out that you usually need prior knowledge before you can solve a problem, and a good understanding of a knowledge domain before you know what you need to inquire about, and that many failures in education have come about because novices have been expected to mimic the surface features of experts’ behavior without having the knowledge of experts.

Having said that, relying on an oversimplified model of working memory introduces the risk of exactly the same thing happening with direct instruction. The way the brain processes information is complex, but not so complex it can’t be summarised in a few key principles. Human beings acquire information in multiple ways, but not in so many ways we can’t keep track of them. Figuring out what teaching approaches are best used for what knowledge might take a bit of time, but it’s a worthwhile investment, and should help to avoid the one-size-fits-all approach that has bedevilled the education system for too long.


Image of Grover from Muppet Wiki http://muppet.wikia.com/wiki/Grover

6 thoughts on “waiter’s memory”

  1. Most of what you mention here seems irrelevant in that it seems not to challenge anything in the simple model of working memory that you complain about, or it repeats a point Willingham has already discussed in his book and presumably is already known to anybody who has acquired their model of working memory from Willingham’s book.

    However, you seem to be making three points that are direct challenges (although even then, I am perhaps having to interpret, perhaps incorrectly, in order to identify those challenges).

    1) We can learn a lot, in an academically relevant way, unconsciously, i.e. without paying attention to it.

    2) The claim that using more senses to take in information increases the capacity of working memory to such a degree that cognitive overload is no longer an issue.

    3) The claim that problem-solving learning is effective.

    Unfortunately, you don’t seem to have provided any information about the empirical evidence for any of these claims. I fear that point 1 is simply based on unreliable extrapolation from those things we do learn well implicitly. I don’t have a clue where you get 2) or 3) from.

    Please let me know if I have interpreted your claims accurately, or if I have missed any evidence that you have presented.

    • I’m commenting on the model of working memory that was being used in the sessions at the ResearchED conference and the conclusions that were being drawn from it.

      1) I got the impression that ‘chunking’ was seen as taking place in long-term memory; that it’s something experts do with their vast domain knowledge, so chunked information is retrieved from LTM into WM. But if I’ve understood it correctly, sensory processing research has found that chunking takes place pre-consciously (as distinct from unconscious; pre-conscious information can be accessed albeit with difficulty – unconscious information can’t be accessed at all).

      For example visual chunking of faces, places and objects takes place in the V1 area of visual cortex before it reaches the visuo-spatial sketchpad; images and scenarios arrive in working memory whole and entire. We usually know if something about a familiar face, place or object has changed but often find it difficult to say exactly what has changed. Artists have to train themselves to be aware of the components of visual images. A similar thing happens with auditory information, which is why people find it difficult to hear the component sounds in speech, even though it’s by chunking those component sounds they learned to understand speech in the first place; and why SP training is so effective.

      In essence, we are all ‘experts’ in processing the visual and auditory information about the world around us. The chunking that domain knowledge experts do is an extension of the chunking we all do. Experts chunk related bits of information because they are exposed to many, many examples of that information being related – the chicken-sexing experts being a good example. It’s possible to make that pre-conscious information accessible and to teach it explicitly, which is a kind of short-cut to expertise.

      However, the point I want to make is that chunked information (i.e. low cognitive load) is flowing through WM into LTM all the time. We are constantly learning from our environment. The only way we can stop that happening is if we keep our eyes and ears closed, and then we’d get tactile information whether we liked it or not. Schools could tap into that constant informal, incidental learning to support the focussed, explicit learning in the classroom, by thinking about what information is available in the school environment, not just what’s in lesson plans.

      2) Multi-sensory input is how we normally acquire information, so it’s represented by multi-sensory links in neural networks. That increased redundancy increases the likelihood of recall because we can access the same information via different sensory routes. It also increases the likelihood, in the event of brain damage, of someone still being able to access the information they need on a daily basis to survive.

      If information is learned via more than one sensory modality there is an increased likelihood of recall. But that doesn’t mean superfluous multi-sensory information (fireworks, bells and whistles) is a good thing; it’s not, it just distracts.

      3) Obviously, if students don’t have the information they need to solve problems, then problem-solving tasks aren’t going to help them learn. However, once they do have that information, problem-solving using the information is a good way to reinforce learning because when you are solving problems you have to go over and over your data and reconfigure it to find a way of solving the problem. Also, it’s very important for even young students to know how to find the information they need to solve problems, rather than assuming the only source is a nearby adult.

      You’re quite right that I haven’t provided any support for my claims. I should have put up a reading list. Will do that.

      1) comes largely from recent research on visual and auditory processing. Michael Gazzaniga’s “Cognitive Neurosciences III” was my main starting point. Stanislas Dehaene’s book ‘Reading in the Brain’ provides an excellent summary of chunking.

      2) The bit about learning usually involving more than one sensory domain is self-evident; we see, hear, touch, smell and taste things all the time. The fact that multi-sensory representations make recall more secure is based on the principle of redundancy in information theory. It’s also something that emerges from research involving brain-damaged patients.

      3) Again, this is self-evident. If you are trying to solve a problem, whether it’s how to plan your new kitchen or why your research data tell you your current theory is wrong, you have to go over the relevant information again and again and look at it in different configurations. All that repetition means a) you’re not likely to forget it in a hurry and b) you can generalise it much better.

      • You appear to be doing that thing where you write loads, but it is hard to identify any clear points.

        As far as I can tell, and I admit I may have missed something amid all the red herrings and digressions, you have ignored point 1 (addressing a straw man about chunking instead) and on points 2 & 3 you have admitted that there is no empirical evidence for what appeared to be your key challenges. Therefore, your entire criticism of the simple model of working memory seems to be completely baseless.

      • 1) If visual and auditory information are chunked before they get to WM, that opens up scope for incidental background learning with little effort.

        On top of that, the 3-5 bits of novel information limit applies to the loops/sketchpads/buffers that form only part of WM. There are at least three of them, possibly more, in different sensory modalities, suggesting that what WM can process depends on the qualitative properties of the novel information, not just its quantitative ones. You don’t want to overload students, but you don’t want to spoon-feed them either.

        2) Why would you want ‘empirical evidence’ for something that’s self-evident? Are you saying we learn in single sensory modalities only? Or that we can’t access the same memory via different sensory modalities? Much of the stroke literature is about this; people can access memories via one route but not via another.

        3) The same principle applies to problem-solving. Maybe you’ve never solved a problem or followed your thought-processes while you did so.

  2. I have found this posting very interesting – and the opening paragraphs in particular resonate with my approach to teaching and learning in the field of phonics and basic literacy. In effect, I have formalised the notion of ‘incidental teaching’ by providing ways (rationale, resources and guidance) of giving an overview of the complex alphabetic code that we are endeavouring to teach – and therefore specifically running incidental teaching alongside the direct instruction and over-learning which form the basis of ‘systematic’ synthetic phonics programmes. Thus, I call this the ‘two-pronged systematic and incidental phonics teaching’ approach and suggest that it is better still than a systematic way forwards alone. This two-pronged approach acknowledges the learning that can take place through a number of routes – not just direct and systematic instruction, it is a great reinforcer and also is a great provider for differentiation and catering for individual capacity to learn. Many learners can easily self-teach especially when they understand what it is that they are being taught and when patter and resources are designed to support plenty of incidental teaching and learning (both in school and at home).

    Thank you for your thoughts.

    • Thank you for your comment Debbie.

      I think people teaching young children and basic skills tend to be more aware of incidental learning because they can’t rely on students having a body of abstract knowledge – at all. They have to build that body of abstract knowledge from scratch and start with how learners have learned to date – which is often via incidental learning.

      My PGCE course tutor had trained in the 1930s. She was ruthless about making sure that nothing in the classroom environment would incidentally give children the wrong idea, and that every opportunity was taken to convey the right ones. She saw the classroom and the school building as a learning environment.
