In Psychology
In psychology, metacognition generally refers to reflective beliefs and reasoning about one’s own ability to reason. Fleming (2024) singles out one class as being of particular interest: “propositional confidence in one's own (hypothetical) decisions or actions”. In LLMs, then, metacognition would be propositional confidence in the model’s own decisions or actions.
Various definitions (Flavell, 1979) include:
- Cognitive psychology:
- “cognition about cognition”; generally operationalized in terms of online ratings, e.g., trial-by-trial confidence in accuracy on perceptual or memory tasks (Fleming and Lau, 2014); “the degree of association between accuracy and confidence can be taken as a quantitative measure of metacognition”.
- Developmental: distinguish between
- metacognitive knowledge (understanding of one’s own cognitive processes and those of others) (Flavell, 1979)
- metacognitive experience (Flavell, 1979), e.g.,
- feelings of knowing
- tip-of-the-tongue states
- judgements of learning, e.g., the feeling of being close to solving a problem
- the feeling of “grokking” something; the “a-ha moment”
- Metacognitive strategies (Flavell, 1979): deliberate actions individuals take to regulate cognitive processes: planning, monitoring, and evaluating one’s own learning or problem-solving approaches
- personality and social
- Processing and control of one’s own mental states and processes (Norman et al., 2019), including attitudes, self-identity, interpersonal relationships, and mentalizing
- Ethology
- Metacognition in animals can be assessed through tasks involving uncertainty expression, memory judgments, wagering, and information gathering.
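The confidence–accuracy association that Fleming and Lau (2014) treat as a quantitative measure of metacognition can be sketched in code. One common way to quantify it (an assumption here; the notes above don't commit to a specific statistic) is a type-2 AUROC: the probability that a randomly chosen correct trial received higher confidence than a randomly chosen incorrect one.

```python
import itertools

def type2_auroc(confidence, correct):
    """Association between trial-by-trial confidence and accuracy:
    probability that a randomly chosen correct trial got higher
    confidence than a randomly chosen incorrect trial (ties count 0.5).
    0.5 = no metacognitive sensitivity; 1.0 = perfect."""
    hits = [c for c, ok in zip(confidence, correct) if ok]
    misses = [c for c, ok in zip(confidence, correct) if not ok]
    if not hits or not misses:
        raise ValueError("need both correct and incorrect trials")
    wins = sum(1.0 if h > m else 0.5 if h == m else 0.0
               for h, m in itertools.product(hits, misses))
    return wins / (len(hits) * len(misses))

# Toy data: higher confidence mostly accompanies correct answers.
conf = [0.9, 0.8, 0.6, 0.4, 0.3]
acc = [True, True, False, True, False]
print(type2_auroc(conf, acc))  # well above the 0.5 chance level
```

The same quantity underlies wagering and uncertainty-expression paradigms in the animal work above: an animal whose bets track its accuracy has an above-chance association.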
Metacognition is intrinsically linked to theory of mind.
Heyes et al. (2020) say that metacognition is the “discrimination, interpretation, and broadcasting of subtle cues indicating the rightness of ongoing thought and behavior”.
In LLMs
In an expansive or basic sense, LLMs come with metacognition built in: in most applications, including chatbots, each generated token becomes part of the input for generating the next token.
Renze and Guven (2024) demonstrated that self-reflection can improve problem-solving performance. I’m not entirely convinced: the answers they provide to the LLM (after an incorrect attempt, before it produces the correct one) could leak into the “reflections”, and even though the answer is subsequently redacted, information about it could be preserved.
Research on “confidence” or “calibration”