On 6 May 2026, at Code with Claude 2026, Anthropic shipped a new Claude Managed Agents feature called Dreaming: a scheduled process in which an AI agent reviews its past sessions between jobs, extracts patterns, and writes them into a new memory file. Alongside it came Outcomes, a self-grading loop that scores agent output against a written rubric. This article unpacks how dreaming actually works, what the AI is really "dreaming" about, how machine dreams differ from human ones, and, most importantly, whether this feature will make AI hallucinate more.
Quick summary — what Claude Dreaming is / isn't
✓ It is: a scheduled batch job that replays past session logs, extracts patterns, and writes them to a memory file the next session can read.
✗ It isn't: a subconscious, an AGI awakening, or a human-style dream with emotion and imagination that invents new things.
How Dreaming Actually Works — The Mechanism, Not the Metaphor
Anthropic compares dreaming to hippocampal memory consolidation — the process the human brain runs during sleep, replaying the day's events and deciding what to keep as long-term memory. In engineering terms it is a scheduled job that runs in four stages:
| Stage | Process | Output |
|---|---|---|
| 1. Scan | Read the full transcript of past sessions | Raw material |
| 2. Extract | Pull out patterns: recurring mistakes, recurring workflows, preferences | List of patterns |
| 3. Write | Persist patterns to a memory file (markdown format) | Memory usable in the next session |
| 4. Apply | The next session starts with the consolidated memory pre-loaded | Agent appears to "get smarter" over time |
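Anthropic hasn't published the internals, but the loop is small enough to sketch. Here is a minimal, hypothetical version of stages 1-3 in Python, using the official anthropic SDK; the prompt, file layout, function names, and model name are all assumptions, not Anthropic's actual implementation.

```python
# dream_pass.py: a hypothetical sketch of the Scan -> Extract -> Write stages.
# Assumed layout (not Anthropic's): session logs as .txt files in ./sessions/,
# consolidated memory as a single markdown file the next session pre-loads.
from pathlib import Path

import anthropic  # official SDK: pip install anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

EXTRACT_PROMPT = """Below are transcripts of past agent sessions.
Extract only patterns that recur across sessions, in three groups:
recurring mistakes, workflow convergence, and shared preferences.
Return markdown bullet points, one rule per line."""

def scan(session_dir: str = "sessions") -> str:
    """Stage 1: read the full transcript of past sessions."""
    return "\n\n".join(p.read_text() for p in sorted(Path(session_dir).glob("*.txt")))

def extract(transcripts: str) -> str:
    """Stage 2: one LLM call that pulls recurring patterns out of the logs."""
    response = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder; pick your model
        max_tokens=1024,
        messages=[{"role": "user", "content": f"{EXTRACT_PROMPT}\n\n{transcripts}"}],
    )
    return response.content[0].text

def write(patterns: str, memory_file: str = "memory.md") -> None:
    """Stage 3: persist patterns so the next session can read them at startup."""
    Path(memory_file).write_text(f"# Consolidated memory\n\n{patterns}\n")

if __name__ == "__main__":
    # Stage 4 happens implicitly: the next session loads memory.md into its context.
    write(extract(scan()))
```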
The numbers Anthropic disclosed from the pilot with Harvey (a legal-AI startup): task completion rates climbed roughly 6x, .docx file quality rose 8.4%, and .pptx quality rose 10.1% when dreaming was paired with an outcomes rubric.
Dreaming vs Context Window vs Fine-tuning — what's the difference?
| Method | Where memory lives | Update frequency | Cost model |
|---|---|---|---|
| Context Window | Inside the prompt of every call | Every call | Per token, every call |
| Fine-tuning | Inside model weights | Per retraining run | High training cost |
| Dreaming | In an external memory file | After every session (batch) | Per token of batch job |
What Is the AI Actually Dreaming About?
The word "dream" sounds like imagination inventing new things — but in reality AI is not creating anything new. It is doing statistical pattern extraction over an existing transcript. Anthropic states clearly that dreaming captures three main pattern types:
- Recurring mistakes — error patterns that recur across past sessions, e.g. the agent forgetting to check authentication before calling an API multiple times
- Workflow convergence — step sequences the agent keeps repeating for similar tasks, e.g. "open file → grep keyword → edit → run test"
- Shared preferences — things a user or team keeps signaling they like or dislike, e.g. "user dislikes output with emoji"
Worked example
Suppose you correct an agent three times in a row with "no emoji please" — the next dream pass will scan the transcript, notice the user rejected every emoji output, and write a rule into the memory file like preference: no_emoji. The following session, the agent reads that memory at startup, and you never have to mention it again.
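Anthropic says the memory file is plain markdown, but the exact schema isn't published, so the entry from that dream pass might look something like this hypothetical snippet:

```markdown
## Shared preferences
- preference: no_emoji
  evidence: user rejected emoji output in 3 consecutive sessions
```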
This is why an AI agent "seems to remember" after a while of use — not because the model became smarter, but because its memory file grew thicker with each dream pass. The same principle is discussed in AI vs Humans — Who's Better at Knowledge Work: most of an AI's "competence" lives outside the model, not inside it.
Human Dreams vs AI Dreams — Where They Diverge
Most people hear "AI dreaming" and picture an AI asleep, dreaming the way humans do. The two differ in five important dimensions:
| Dimension | Human dreams | "AI dreams" |
|---|---|---|
| Source material | Memory + emotion + imagination | Transcript only |
| Invents things that aren't real? | Yes (new places, monsters, strangers) | No — samples from the log only |
| Emotion | Yes (fear, joy, sadness) | None |
| Outcome | Memory consolidation + creativity | Rule extraction into a markdown file |
| Auditable? | No (recallable, not provable) | Yes — open the memory file line by line |
The dangerous thing they share: both simplify reality into a narrative. The human brain stores summaries of experience, not raw data. AI extracts patterns from transcripts instead of keeping the full transcript. That simplification is exactly where distortion enters.
⚠️ Will Dreaming Make AI Hallucinate More? — 4 Risks to Know
This is the most important question in this piece. The short answer: quite possibly, without mitigation — and Anthropic acknowledges it themselves, which is why Dreaming has to be paired with an Outcomes rubric to catch drift on every run.
1. Pattern over-generalization
The AI sees a pattern in a small sample and writes it as a general rule. Example: a user corrects the agent three times to use formal language in a legal-document context; the dream pass may decide "the user always prefers formal language" and apply it to casual chat the next day.
2. Sample bias amplification
If past sessions are skewed (e.g. the user only used the agent for writing Thai blog posts), the dream pass will extract "the main job is writing Thai blog posts", and the next session will lean that way even when the user asks for something else.
3. Memory drift
Memory that passes through many dream cycles gets re-summarized on top of itself, like a game of telephone where the last whispered word has drifted far from the original. This is exactly why Harvey told Anthropic that dreaming worked best when paired with a tight outcomes rubric, so the grader catches drift on the very next run.
4. Compounding hallucination (the vicious cycle)
This is the scariest risk; a toy simulation of the loop follows the list:
- Dream extracts the wrong pattern → writes the wrong memory
- The next session uses that wrong memory → works incorrectly
- The next dream pass sees a transcript where "the whole session was wrong" → confirms the wrong pattern
- The wrong memory becomes more deeply embedded → a positive feedback loop of hallucination
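To make the loop concrete, here is a toy simulation of the dynamic. No real model is involved; the rule, the session output, and the confidence counter are all invented for illustration.

```python
# toy_loop.py: a toy simulation of the compounding-hallucination loop.
# Everything here is illustrative; no real model or API is involved.
memory = {"preferred_language": "formal"}  # a wrong rule from an earlier dream pass

def run_session(memory: dict) -> str:
    # The agent obeys whatever memory says, even in a casual-chat context.
    return f"session output written in {memory['preferred_language']} language"

def dream_pass(transcript: str, memory: dict) -> dict:
    # The dream pass sees only its own biased output, so it re-confirms the rule.
    if "formal" in transcript:
        memory["confidence"] = memory.get("confidence", 1) + 1
    return memory

for _ in range(3):
    memory = dream_pass(run_session(memory), memory)

print(memory)  # {'preferred_language': 'formal', 'confidence': 4}: more entrenched each cycle
```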
| Risk | Mitigation | Who must act |
|---|---|---|
| Over-generalization | Write rubrics that specify context clearly | User / Dev |
| Sample bias | Use diverse sessions, limit memory scope | Dev / Admin |
| Memory drift | Run the outcomes rubric on every pass to catch drift; set memory expiry | Anthropic + User |
| Compounding hallucination | Audit memory file periodically, delete incorrect entries | User / Admin |
This compounding-error pattern isn't new. Report Gap — Why AI Reports Look Good but You Can't Act on Them argues that an AI that is confidently wrong is harder to detect than an AI that says it doesn't know.
Lessons for Humans — Your Input Matters More Than You Think
When the AI you use has dreaming, every prompt and every correction you give will be recorded and become future behavior of that AI (and possibly your team's agent too, if memory scope is team-wide). This changes how we should use AI:
3 rules for using AI with memory
- When you correct, explain why. Don't just say "wrong" or silently delete the output: if you only say "wrong," the dream pass learns the surface pattern ("don't answer like this"), whereas "wrong because X" teaches a rule it can generalize correctly.
- Don't rush your instructions. Careless phrases like "be quick" or "keep it short" can become the pattern "the user prefers short answers", and the next time the agent will drop important context you never wanted cut.
- Review memory periodically. Consolidated memory can go stale or be misinterpreted from the start. Claude Code, for instance, keeps memory under `~/.claude/memory/`, which you can open and edit yourself.
A simple comparison: using an AI that has memory is like teaching a child, in that input quality determines output quality. Talk to the AI like a search engine (terse commands, no explanation) and its memory degrades fast. Talk to it like a new teammate, and its memory will gradually start helping you.
How to Use This in Practice — 5 Approaches for Different Roles
You don't have to be a Claude Managed Agents customer to benefit from this idea — the "Dreaming + Outcomes" pattern is a design pattern that works with any AI.
A. For general AI users (executives / staff)
Use Claude Projects or ChatGPT Custom GPTs for "manual dreaming": summarize context from past sessions into the project instructions. At the end of each day, ask the AI: "What did we do today? Give me 3 patterns that recurred." Paste that output into the next day's instructions; this meaningfully reduces the time you spend re-briefing the AI in new sessions.
B. For developers building agents
You can implement this yourself without Managed Agents (a minimal sketch of the grader follows this list):
- A cron job that summarizes transcripts into a memory file after each session
- A separate LLM call acting as a grader against a written rubric
- Ready-to-use tooling: Claude Agent SDK, LangChain memory primitives, Mem0
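For illustration, here is a hedged sketch of the grader piece, pairing with the dream pass sketched earlier: a second LLM call scores each session against a written rubric, and only passing sessions feed the next dream pass. The rubric text, threshold, and model name are illustrative assumptions, not a prescribed design.

```python
# grader.py: a sketch of the "Outcomes" half of the pattern. A second LLM
# call scores agent output against a written rubric before the session is
# allowed to feed the next dream pass. Rubric and threshold are assumptions.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

RUBRIC = """Score the output from 0.0 to 1.0 against these criteria:
- Follows the user's stated format preferences
- Contains no invented facts or unsupported claims
- Completes every step the task required
Reply with the numeric score only, e.g. 0.85"""

def grade(task: str, output: str) -> float:
    """One LLM call acting as a grader against the written rubric."""
    response = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder; pick your model
        max_tokens=16,
        messages=[{
            "role": "user",
            "content": f"{RUBRIC}\n\nTask:\n{task}\n\nOutput:\n{output}",
        }],
    )
    return float(response.content[0].text.strip())  # assumes the model replies with a bare number

def should_consolidate(task: str, output: str, threshold: float = 0.7) -> bool:
    """Gate the dream pass: only sessions that pass the rubric feed memory.
    This is one way to break the compounding-hallucination loop described above."""
    return grade(task, output) >= threshold
```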
C. For team leads deploying AI to a team
You need to decide on memory scope — who can see whose memory:
| Level | Who sees it | Best for | Risk |
|---|---|---|---|
| Per-user | One person | Personal work, research | Low |
| Per-team | One team | Shared workflow | Medium (groupthink) |
| Per-org | Everyone | Brand voice, policy | High (drift spreads) |
D. For knowledge workers — write a rubric before instructing AI
Instead of an open instruction like "write an article about X", specify a rubric: length, tone, must-haves, must-not-haves. The result: AI can grade itself → one correction round instead of five. The same principle is used in AI in Accounting: you need a rubric covering compliance correctness before any output is accepted.
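As a hypothetical illustration (the criteria below are examples, not a prescribed format), such a rubric might look like:

```markdown
## Rubric: product announcement article
- Length: 800-1,000 words
- Tone: plain language, no marketing superlatives
- Must have: one worked example and one concrete number
- Must not have: emoji, unverified statistics
```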
E. For organizations buying AI — a 5-point checklist before purchase
- Is there an audit log of memory? Can you inspect what the AI has "learned"?
- Can you delete the memory of a single user? Important for data-protection compliance
- What is the memory scope — user / team / org?
- Is there rubric-based evaluation built in, or do you have to build it?
- Can memory be exported? Insurance against future vendor lock-in
How This Connects to ERP — 3 Questions to Ask Your Vendor
Modern ERP systems that ship an AI assistant, including Saeree ERP, which is currently training its AI Assistant, will face the same question Anthropic is facing: once the AI in an ERP starts "learning" from users, who owns that knowledge?
Three questions to ask a vendor before turning on AI features in an ERP:
1. At what level does memory consolidate?
If memory consolidates at the org level, one user "teaching wrong" can become a rule for everyone. This is dangerous in an ERP: a chart-of-accounts mistake or document-numbering quirk from one department could become default behavior every department then has to deal with.
2. Is there an audit trail of the "dreams"?
Compare it to the audit log every ERP has always had — who changed what and when. AI memory deserves the same: which memory entry came from which session, who triggered it, and crucially can you delete just that entry without nuking the whole memory? This is the same kind of foundational design principle as two-factor authentication in an ERP — auditability must be baseline.
3. Who has the right to delete a "dream"?
When a user leaves, when a business process changes, when a regulator audits — who has the authority to delete the AI's memory in the ERP? Admin? Vendor? The data owner (per data-protection law)? Write this into the contract before signing.
Saeree ERP's design stance on its AI Assistant: built with auditability, per-user memory scope by default, and deletion rights vested in the org admin — so AI inside the ERP doesn't become a black box that makes decisions on people's behalf with no one able to inspect it.
Summary — Pros vs Cons of AI Dreaming
| ✓ Pros | ✗ Cons / Risks |
|---|---|
| Agent improves over time without retraining the model | Sample bias amplification — skewed past sessions produce skewed future work |
| Reduces re-briefing time in new sessions | Memory drift from repeated re-summarization |
| Memory is inspectable (lives in a markdown file, not model weights) | Compounding hallucination — wrong memory becomes more deeply embedded |
| Reduces context-window token cost over long-running sessions | Privacy + ownership — who owns the AI's "dreams"? |
"AI doesn't dream — it iterates on the input we feed it. Iterating on careless input is hallucination with a paper trail."
Questions worth sitting with
If your AI agent "dreams" every night, do you know what it is dreaming about you? Have you ever actually opened its memory file? And most importantly, if that dream is wrong, do you know where to fix it?
Because the same questions are the ones Thai organizations need to answer before turning on AI features in an ERP — not after they've turned them on and run into trouble. If you're considering AI inside your business's core systems, book a consultation with the Saeree ERP team to assess readiness and design governance that is actually auditable.
References
- Anthropic — New in Claude Managed Agents: dreaming, outcomes, and multiagent orchestration (6 May 2026)
- VentureBeat — Anthropic introduces "dreaming," a system that lets AI agents learn from their own mistakes
- The New Stack — Anthropic will let its managed agents dream
- SiliconANGLE — Anthropic is letting Claude agents 'dream' so they don't sleep on the job
- Digital Trends — Anthropic just taught Claude to dream between tasks
