On 6 May 2026, at Code with Claude 2026, Anthropic shipped a new Claude Managed Agents feature called Dreaming: a scheduled process in which an AI agent reviews its past sessions between jobs, extracts patterns, and writes them into a new memory file. Alongside it came Outcomes, a self-grading loop that scores agent output against a written rubric. This article unpacks how dreaming actually works, what the AI is really "dreaming" about, how machine dreams differ from human ones, and, most importantly, whether this feature will make AI hallucinate more.
Quick summary — what Claude Dreaming is / isn't
✓ It is: a scheduled batch job that replays past session logs, extracts patterns, and writes them to a memory file the next session can read.
✗ It isn't: a subconscious, an AGI awakening, or a human-style dream with emotion and imagination that invents new things.
How Dreaming Actually Works — The Mechanism, Not the Metaphor
Anthropic compares dreaming to hippocampal memory consolidation — the process the human brain runs during sleep, replaying the day's events and deciding what to keep as long-term memory. In engineering terms it is a scheduled job that runs in four stages:
| Stage | Process | Output |
|---|---|---|
| 1. Scan | Read the full transcript of past sessions | Raw material |
| 2. Extract | Pull out patterns: recurring mistakes, recurring workflows, preferences | List of patterns |
| 3. Write | Persist patterns to a memory file (markdown format) | Memory usable in the next session |
| 4. Apply | The next session starts with the consolidated memory pre-loaded | Agent appears to "get smarter" over time |
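Anthropic hasn't published the internals, but the loop is small enough to sketch. Here is a minimal, hypothetical version of stages 1-3 in Python, using the official anthropic SDK; the prompt, file layout, function names, and model name are all assumptions, not Anthropic's actual implementation.

```python
# dream_pass.py: a hypothetical sketch of the Scan -> Extract -> Write stages.
# Assumed layout (not Anthropic's): session logs as .txt files in ./sessions/,
# consolidated memory as a single markdown file the next session pre-loads.
from pathlib import Path

import anthropic  # official SDK: pip install anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

EXTRACT_PROMPT = """Below are transcripts of past agent sessions.
Extract only patterns that recur across sessions, in three groups:
recurring mistakes, workflow convergence, and shared preferences.
Return markdown bullet points, one rule per line."""

def scan(session_dir: str = "sessions") -> str:
    """Stage 1: read the full transcript of past sessions."""
    return "\n\n".join(p.read_text() for p in sorted(Path(session_dir).glob("*.txt")))

def extract(transcripts: str) -> str:
    """Stage 2: one LLM call that pulls recurring patterns out of the logs."""
    response = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder; pick your model
        max_tokens=1024,
        messages=[{"role": "user", "content": f"{EXTRACT_PROMPT}\n\n{transcripts}"}],
    )
    return response.content[0].text

def write(patterns: str, memory_file: str = "memory.md") -> None:
    """Stage 3: persist patterns so the next session can read them at startup."""
    Path(memory_file).write_text(f"# Consolidated memory\n\n{patterns}\n")

if __name__ == "__main__":
    # Stage 4 happens implicitly: the next session loads memory.md into its context.
    write(extract(scan()))
```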
The numbers Anthropic disclosed from the pilot with Harvey (a legal-AI startup): task completion rates climbed roughly 6x, .docx file quality rose 8.4%, and .pptx quality rose 10.1% when dreaming was paired with an outcomes rubric.
Dreaming vs Context Window vs Fine-tuning — what's the difference?
| Method | Where memory lives | Update frequency | Cost model |
|---|---|---|---|
| Context Window | Inside the prompt of every call | Every call | Per token, every call |
| Fine-tuning | Inside model weights | Per retraining run | High training cost |
| Dreaming | In an external memory file | After every session (batch) | Per token of batch job |
What Is the AI Actually Dreaming About?
The word "dream" sounds like imagination inventing new things — but in reality AI is not creating anything new. It is doing statistical pattern extraction over an existing transcript. Anthropic states clearly that dreaming captures three main pattern types:
- Recurring mistakes — error patterns that recur across past sessions, e.g. the agent forgetting to check authentication before calling an API multiple times
- Workflow convergence — step sequences the agent keeps repeating for similar tasks, e.g. "open file → grep keyword → edit → run test"
- Shared preferences — things a user or team keeps signaling they like or dislike, e.g. "user dislikes output with emoji"
Worked example
Suppose you correct an agent three times in a row with "no emoji please" — the next dream pass will scan the transcript, notice the user rejected every emoji output, and write a rule into the memory file like preference: no_emoji. The following session, the agent reads that memory at startup, and you never have to mention it again.
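Anthropic says the memory file is plain markdown, but the exact schema isn't published, so the entry from that dream pass might look something like this hypothetical snippet:

```markdown
## Shared preferences
- preference: no_emoji
  evidence: user rejected emoji output in 3 consecutive sessions
```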
This is why an AI agent "seems to remember" after a while of use — not because the model became smarter, but because its memory file grew thicker with each dream pass. The same principle is discussed in AI vs Humans — Who's Better at Knowledge Work: most of an AI's "competence" lives outside the model, not inside it.
Human Dreams vs AI Dreams — Where They Diverge
Most people hear "AI dreaming" and picture an AI asleep, dreaming the way humans do. The two differ in five important dimensions:
| Dimension | Human dreams | "AI dreams" |
|---|---|---|
| Source material | Memory + emotion + imagination | Transcript only |
| Invents things that aren't real? | Yes (new places, monsters, strangers) | No — samples from the log only |
| Emotion | Yes (fear, joy, sadness) | None |
| Outcome | Memory consolidation + creativity | Rule extraction into a markdown file |
| Auditable? | No (recallable, not provable) | Yes — open the memory file line by line |
The dangerous thing they share: both simplify reality into a narrative. The human brain stores summaries of experience, not raw data. AI extracts patterns from transcripts instead of keeping the full transcript. That simplification is exactly where distortion enters.
⚠️ Will Dreaming Make AI Hallucinate More? — 4 Risks to Know
This is the most important question in this piece. The short answer: quite possibly, without mitigation — and Anthropic acknowledges it themselves, which is why Dreaming has to be paired with an Outcomes rubric to catch drift on every run.
1. Pattern over-generalization
The AI sees a pattern in a small sample and writes it as a general rule. Example: a user corrects the agent three times to use formal language in a legal-document context; the dream pass may decide "the user always prefers formal language" and apply it to casual chat the next day.
2. Sample bias amplification
If past sessions are skewed (e.g. the user only used the agent for writing Thai blog posts), the dream pass will extract "the main job is writing Thai blog posts", and the next session will lean that way even when the user asks for something else.
3. Memory drift
Memory that passes through many dream cycles gets re-summarized on top of itself, like a game of telephone where the last whispered word has drifted far from the original. This is exactly why Harvey told Anthropic that dreaming worked best when paired with a tight outcomes rubric, so the grader catches drift on the very next run.
4. Compounding hallucination (the vicious cycle)
This is the scariest risk; a toy simulation of the loop follows the list:
- Dream extracts the wrong pattern → writes the wrong memory
- The next session uses that wrong memory → works incorrectly
- The next dream pass sees a transcript where "the whole session was wrong" → confirms the wrong pattern
- The wrong memory becomes more deeply embedded → a positive feedback loop of hallucination
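To make the loop concrete, here is a toy simulation of the dynamic. No real model is involved; the rule, the session output, and the confidence counter are all invented for illustration.

```python
# toy_loop.py: a toy simulation of the compounding-hallucination loop.
# Everything here is illustrative; no real model or API is involved.
memory = {"preferred_language": "formal"}  # a wrong rule from an earlier dream pass

def run_session(memory: dict) -> str:
    # The agent obeys whatever memory says, even in a casual-chat context.
    return f"session output written in {memory['preferred_language']} language"

def dream_pass(transcript: str, memory: dict) -> dict:
    # The dream pass sees only its own biased output, so it re-confirms the rule.
    if "formal" in transcript:
        memory["confidence"] = memory.get("confidence", 1) + 1
    return memory

for _ in range(3):
    memory = dream_pass(run_session(memory), memory)

print(memory)  # {'preferred_language': 'formal', 'confidence': 4}: more entrenched each cycle
```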
| Risk | Mitigation | Who must act |
|---|---|---|
| Over-generalization | Write rubrics that specify context clearly | User / Dev |
| Sample bias | Use diverse sessions, limit memory scope | Dev / Admin |
| Memory drift | Run the outcomes rubric on every pass to catch drift; set memory expiry | Anthropic + User |
| Compounding hallucination | Audit memory file periodically, delete incorrect entries | User / Admin |
This compounding-error pattern isn't new. Report Gap — Why AI Reports Look Good but You Can't Act on Them argues that an AI that is confidently wrong is harder to detect than an AI that says it doesn't know.
Lessons for Humans — Your Input Matters More Than You Think
When the AI you use has dreaming, every prompt and every correction you give will be recorded and become future behavior of that AI (and possibly your team's agent too, if memory scope is team-wide). This changes how we should use AI:
3 rules for using AI with memory
- When you correct, explain why. Don't just say "wrong" or silently delete the output: if you only say "wrong," the dream pass learns the surface pattern ("don't answer like this"), whereas "wrong because X" teaches a rule it can generalize correctly.
- Don't rush your instructions. Careless phrases like "be quick" or "keep it short" can become the pattern "the user prefers short answers", and the next time the agent will drop important context you never wanted cut.
- Review memory periodically. Consolidated memory can go stale or be misinterpreted from the start. Claude Code, for instance, keeps memory under `~/.claude/memory/`, which you can open and edit yourself.
A simple comparison: using an AI that has memory is like teaching a child, in that input quality determines output quality. Talk to the AI like a search engine (terse commands, no explanation) and its memory degrades fast. Talk to it like a new teammate, and its memory will gradually start helping you.
How to Use This in Practice — 5 Approaches for Different Roles
You don't have to be a Claude Managed Agents customer to benefit from this idea — the "Dreaming + Outcomes" pattern is a design pattern that works with any AI.
A. For general AI users (executives / staff)
Use Claude Projects or ChatGPT Custom GPTs for "manual dreaming": summarize context from past sessions into the project instructions. At the end of each day, ask the AI: "What did we do today? Give me 3 patterns that recurred." Paste that output into the next day's instructions; this meaningfully reduces the time you spend re-briefing the AI in new sessions.
B. For developers building agents
You can implement this yourself without Managed Agents (a minimal sketch of the grader follows this list):
- A cron job that summarizes transcripts into a memory file after each session
- A separate LLM call acting as a grader against a written rubric
- Ready-to-use tooling: Claude Agent SDK, LangChain memory primitives, Mem0
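For illustration, here is a hedged sketch of the grader piece, pairing with the dream pass sketched earlier: a second LLM call scores each session against a written rubric, and only passing sessions feed the next dream pass. The rubric text, threshold, and model name are illustrative assumptions, not a prescribed design.

```python
# grader.py: a sketch of the "Outcomes" half of the pattern. A second LLM
# call scores agent output against a written rubric before the session is
# allowed to feed the next dream pass. Rubric and threshold are assumptions.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

RUBRIC = """Score the output from 0.0 to 1.0 against these criteria:
- Follows the user's stated format preferences
- Contains no invented facts or unsupported claims
- Completes every step the task required
Reply with the numeric score only, e.g. 0.85"""

def grade(task: str, output: str) -> float:
    """One LLM call acting as a grader against the written rubric."""
    response = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder; pick your model
        max_tokens=16,
        messages=[{
            "role": "user",
            "content": f"{RUBRIC}\n\nTask:\n{task}\n\nOutput:\n{output}",
        }],
    )
    return float(response.content[0].text.strip())  # assumes the model replies with a bare number

def should_consolidate(task: str, output: str, threshold: float = 0.7) -> bool:
    """Gate the dream pass: only sessions that pass the rubric feed memory.
    This is one way to break the compounding-hallucination loop described above."""
    return grade(task, output) >= threshold
```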
C. For team leads deploying AI to a team
You need to decide on memory scope — who can see whose memory:
| Level | Who sees it | Best for | Risk |
|---|---|---|---|
| Per-user | One person | Personal work, research | Low |
| Per-team | One team | Shared workflow | Medium (groupthink) |
| Per-org | Everyone | Brand voice, policy | High (drift spreads) |
D. For knowledge workers — write a rubric before instructing AI
Instead of an open instruction like "write an article about X", specify a rubric: length, tone, must-haves, must-not-haves. The result: AI can grade itself → one correction round instead of five. The same principle is used in AI in Accounting: you need a rubric covering compliance correctness before any output is accepted.
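As a hypothetical illustration (the criteria below are examples, not a prescribed format), such a rubric might look like:

```markdown
## Rubric: product announcement article
- Length: 800-1,000 words
- Tone: plain language, no marketing superlatives
- Must have: one worked example and one concrete number
- Must not have: emoji, unverified statistics
```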
E. For organizations buying AI — a 5-point checklist before purchase
- Is there an audit log of memory? Can you inspect what the AI has "learned"?
- Can you delete the memory of a single user? Important for data-protection compliance
- What is the memory scope — user / team / org?
- Is there rubric-based evaluation built in, or do you have to build it?
- Can memory be exported? Insurance against future vendor lock-in
How This Connects to ERP — 3 Questions to Ask Your Vendor
Modern ERP systems that ship an AI assistant, including Saeree ERP, which is currently training its AI Assistant, will face the same question Anthropic is facing: once the AI in an ERP starts "learning" from users, who owns that knowledge?
Three questions to ask a vendor before turning on AI features in an ERP:
1. At what level does memory consolidate?
If memory consolidates at the org level, one user "teaching wrong" can become a rule for everyone. This is dangerous in an ERP: a chart-of-accounts mistake or document-numbering quirk from one department could become default behavior every department then has to deal with.
2. Is there an audit trail of the "dreams"?
Compare it to the audit log every ERP has always had — who changed what and when. AI memory deserves the same: which memory entry came from which session, who triggered it, and crucially can you delete just that entry without nuking the whole memory? This is the same kind of foundational design principle as two-factor authentication in an ERP — auditability must be baseline.
3. Who has the right to delete a "dream"?
When a user leaves, when a business process changes, when a regulator audits — who has the authority to delete the AI's memory in the ERP? Admin? Vendor? The data owner (per data-protection law)? Write this into the contract before signing.
Saeree ERP's design stance on its AI Assistant: built with auditability, per-user memory scope by default, and deletion rights vested in the org admin — so AI inside the ERP doesn't become a black box that makes decisions on people's behalf with no one able to inspect it.
Summary — Pros vs Cons of AI Dreaming
| ✓ Pros | ✗ Cons / Risks |
|---|---|
| Agent improves over time without retraining the model | Sample bias amplification — skewed past sessions produce skewed future work |
| Reduces re-briefing time in new sessions | Memory drift from repeated re-summarization |
| Memory is inspectable (lives in a markdown file, not model weights) | Compounding hallucination — wrong memory becomes more deeply embedded |
| Reduces context-window token cost over long-running sessions | Privacy + ownership — who owns the AI's "dreams"? |
"AI doesn't dream — it iterates on the input we feed it. Iterating on careless input is hallucination with a paper trail."
Questions worth sitting with
If your AI agent "dreams" every night, do you know what it is dreaming about you? Have you ever actually opened its memory file? And most importantly, if that dream is wrong, do you know where to fix it?
Because the same questions are the ones Thai organizations need to answer before turning on AI features in an ERP — not after they've turned them on and run into trouble. If you're considering AI inside your business's core systems, book a consultation with the Saeree ERP team to assess readiness and design governance that is actually auditable.
References
- Anthropic — New in Claude Managed Agents: dreaming, outcomes, and multiagent orchestration (6 May 2026)
- VentureBeat — Anthropic introduces "dreaming," a system that lets AI agents learn from their own mistakes
- The New Stack — Anthropic will let its managed agents dream
- SiliconANGLE — Anthropic is letting Claude agents 'dream' so they don't sleep on the job
- Digital Trends — Anthropic just taught Claude to dream between tasks
