The distinction between fast and slow thinking, or System 1 and System 2 thinking, made popular by Daniel Kahneman’s book *Thinking, Fast and Slow*, might be a helpful lens for viewing LLMs.
System 1 is fast, automatic, frequent, emotional, stereotypic, and subconscious. Examples of System 1 from Kahneman’s book:
- determine that an object is at a greater distance than another
- complete the phrase "war and ..."
- think of a good chess move (if you're a chess master)
- understand simple sentences
System 2 is slow, effortful, infrequent, logical, calculating, and conscious. Examples of System 2:
- prepare yourself for the start of a sprint
- count the number of A's in a certain text
- solve 17 × 24
- direct your attention towards someone at a loud party
LLMs have mainly been used to augment human tasks — perhaps as a cognitive prosthetic, but not a replacement. Yet we're seeing the first glimpses of how LLMs will be used for System 1 tasks. They can already do most of Kahneman’s examples (GPT-3.5 Instruct plays chess at roughly an 1800 Elo level), and “completing the phrase” is essentially what an autoregressive transformer does by construction.
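To make the “completing the phrase” point concrete, here is a minimal sketch of next-token prediction. The choice of the small GPT-2 model is my own stand-in for illustration, not something from the post; any causal language model would behave the same way:

```python
# A minimal sketch: "complete the phrase" is just the model's next-token distribution.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("War and", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

# The distribution over the next token is the System-1-style reflex.
next_token_logits = logits[0, -1]
top = torch.topk(next_token_logits, k=5)
for token_id, score in zip(top.indices, top.values):
    print(repr(tokenizer.decode([int(token_id)])), float(score))
```

If the model has seen the phrase often enough in training, “Peace” should rank near the top of that list — no deliberation involved, just pattern completion.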
System 2 thinking is still reserved for humans. We might use LLMs to get a first draft, but we don't have the tools to do analytical thinking with LLMs (yet). Asking one to solve a complex equation will fail. Asking ChatGPT to spell “mayonnaise” backward or to count the letters in a long text might also fail.
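Part of the irony is that these stumbling blocks are one-liners in ordinary code, which is a big part of the argument for handing them off to tools. A quick sketch in plain Python, no LLM involved:

```python
# Tasks that trip up a chat model are trivial for deterministic code.
word = "mayonnaise"
print(word[::-1])  # "esiannoyam" — spell it backward

passage = "A certain text with several A's and a few more a's."
print(sum(1 for ch in passage if ch.lower() == "a"))  # count the A's
```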
We’re in the process of building out the scaffolding for System 2 thinking with LLMs (a rough sketch of the first two follows the list):
- Chain-of-thought prompting (“think step by step”)
- Tool usage
- LLMOps
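As a rough illustration of how chain-of-thought and tool usage fit together, here is a sketch in Python. The `complete()` helper is hypothetical — a stand-in for whichever LLM API you call — and the dispatch loop is a simplified version of the general pattern, not any specific library's implementation:

```python
import re

def complete(prompt: str) -> str:
    """Hypothetical helper: send `prompt` to your LLM of choice and return its text."""
    raise NotImplementedError("wire this up to whichever LLM API you use")

# Chain-of-thought: ask the model to externalize intermediate steps
# instead of answering in one System-1-style jump.
cot_prompt = "What is 17 x 24? Think step by step, then give the final answer."

# Tool usage: let deterministic code handle the parts models are bad at.
TOOLS = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),  # toy example only
    "reverse": lambda word: word[::-1],
}

def run_with_tools(question: str, max_turns: int = 5) -> str:
    """Minimal dispatch loop: the model either answers or requests a tool call
    in the form TOOL[name](input); tool results are appended to the prompt."""
    prompt = (
        "Answer the question. If you need computation, reply with "
        "TOOL[name](input) using one of: " + ", ".join(TOOLS) + ".\n"
        f"Question: {question}\n"
    )
    for _ in range(max_turns):
        reply = complete(prompt)
        match = re.search(r"TOOL\[(\w+)\]\((.*?)\)", reply)
        if not match:
            return reply  # model answered directly
        name, arg = match.groups()
        prompt += f"{reply}\nTool result: {TOOLS[name](arg)}\n"
    return reply
```

Frameworks like LangChain and features like OpenAI's function calling implement more robust versions of this same loop; the point is that the deliberate, multi-step behavior lives in the scaffolding around the model, not in the model itself.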
Both modes are important. And we might accomplish both with LLMs one day.
Do LLMs exhibit some of the same cognitive fallacies / biases / failure modes that occur in humans during S1 thinking?
I’m not sure how far the parallels to human thinking will go; we’re already seeing some failure modes come up much more frequently in these models than they do in humans (hallucinations, etc.). It’ll be interesting to see whether a whole new class of biases arises from purely stochastic models.