
Hooks & Platform · how reflect captures and recalls across Claude + Codex

The short version — Two harnesses (Claude Code, Codex CLI) wire the same hook scripts into different config files (~/.claude/settings.json vs ~/.codex/hooks.json) and share one on-disk knowledge base (~/.reflect/ queue + ~/.learnings/ documents + GraphRAG index). SessionStart fires the baseline recall + the bg-drainer; UserPromptSubmit fires the intent-sharp recall with per-session dedupe; PreCompact, Stop, and PostToolUse capture learnings into the shared store. A codex session can enqueue a reflection that a later Claude session drains — and vice versa.


Architecture at a glance

Two harnesses, the same hook scripts, one shared knowledge base. Solid arrows are control flow; dashed clay arrows are recall (read into context); dashed olive arrows are capture (write to disk).

Harness:
  • Claude Code session: reads hooks from ~/.claude/settings.json; wired by plugin.json autowire (CLAUDE_PLUGIN_ROOT) or by a manual claude_adapter.py install. Fires SessionStart and PreCompact.
  • Codex CLI session: reads hooks from ~/.codex/hooks.json; wired by codex_adapter.py install (no plugin runtime, so the adapter autowires itself). Fires SessionStart and PreCompact.

Hook scripts:
  • session_start_recall.py (fired by SessionStart): "what learnings apply here?" Builds its query from cwd, branch, and git log; returns the top-3 via additionalContext.
  • precompact_reflect.py (fired by PreCompact): "save this transcript for later". Never blocks compaction; append-only enqueue; reflection happens later.
  • reflect-drain-bg.sh (fired by SessionStart): "process any queued reflections". Detached (nohup &), PID-locked, daily-capped; shells out to claude -p.

Shared state:
  • Pending queue, ~/.reflect/pending_reflections.jsonl: one line per queued transcript, {transcript_path, session_id, trigger, harness, queued_at}. Harness-agnostic: any harness writes, any harness drains.
  • Learnings store, ~/.learnings/documents/: <slug>.md (the learning) plus <slug>.entities.yaml (sidecar for GraphRAG ingest); written by the headless /reflect run.
  • GraphRAG + vector index, ~/.learnings/graphrag/: communities, entities, relations; nano-vector store (hnswlib); queried by recall; refreshed by `reflect reindex` after each successful drain.

Headless:
  • claude -p "/reflect <transcript>": spawned by reflect-drain-bg.sh; "extract the learnings from this transcript"; runs with --output-format json, --max-turns 25, --permission-mode bypassPermissions. Always claude, even when a codex session triggered the drain.
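The SessionStart recall step can be pictured as two small functions: one that turns coarse session signals into a query, and one that hands the top hits back to the harness. This is a minimal sketch, not the reflect implementation. The names `build_baseline_query` and `session_start_output` are hypothetical, and the `hookSpecificOutput` envelope is an assumption about how the harness expects `additionalContext` to be returned.

```python
import json
from pathlib import Path


def build_baseline_query(cwd: str, branch: str, recent_commits: list[str]) -> str:
    """Combine coarse session signals into one recall query string."""
    parts = [Path(cwd).name, branch, *recent_commits]
    # Drop empty signals (for example a repo with no commits yet).
    return " ".join(p.strip() for p in parts if p and p.strip())


def session_start_output(learnings: list[str], limit: int = 3) -> str:
    """Render the top-N learnings as a JSON payload for the harness.

    The envelope keys below are an assumed shape, not taken from the source.
    """
    context = "\n".join(f"- {text}" for text in learnings[:limit])
    return json.dumps(
        {
            "hookSpecificOutput": {
                "hookEventName": "SessionStart",
                "additionalContext": context,
            }
        }
    )
```

The query is deliberately coarse: at SessionStart there is no prompt text yet, so directory name, branch, and commit subjects are the only signals available.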

Session timeline · one coding session, end to end

A horizontal view of what fires when. Hook events show above the spine; data I/O below.

  1. Install: the adapter or plugin writes hooks to the config file.
  2. Session starts: the harness boots.
  3. SessionStart fires: recall + drain (bg). Reads graphrag/ and injects the top-3 via additionalContext.
  4. User prompts: conversation proceeds with model context + learnings.
  5. Tool uses: edits and runs; the transcript grows (~/.claude/ JSONL).
  6. Context fills: approaching the limit; the harness detects the token threshold.
  7. PreCompact fires: precompact_reflect.py appends to pending_reflections.jsonl (queue).
  8. Compaction: the harness compresses; the transcript is archived and the context window reset.
  9. Session ends: the process exits; the queue entry waits and is drained at the next session start.

Storage layers · three tiers, one knowledge base

The pending queue, the learnings store, and the GraphRAG index. Append-only, grep-able, portable.

  • Queue: pending queue at ~/.reflect/pending_reflections.jsonl. One line per queued transcript, {transcript_path, session_id, trigger, harness, queued_at}. IN: precompact_reflect.py (appends on PreCompact). OUT: reflect-drain-bg.sh reads + dequeues. Append-only, roughly 1 line per compaction, harness-agnostic (any harness writes or drains).
  • Docs: learnings store at ~/.learnings/documents/, holding <slug>.md + <slug>.entities.yaml. A markdown document plus a YAML entity sidecar per learning, grep-able on disk. IN: claude -p /reflect (headless, spawned by drain); consumed by reflect reindex. Human-readable and git-able.
  • Index: GraphRAG + vector index at ~/.learnings/graphrag/, holding communities, entities, relations, and hnswlib vectors. Built on nano-graphrag; hybrid search over dense vectors and the entity graph. IN: reflect reindex (after each successful drain). OUT: session_start_recall.py queries the top-3 at SessionStart.
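The queue tier is just a JSONL file with the five fields listed above, so an enqueue is a single append. A minimal sketch under that assumption follows; `enqueue_reflection` and `read_queue` are hypothetical names, and the example values for `trigger` and `harness` are guesses rather than values confirmed by the source.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def enqueue_reflection(queue_path: Path, transcript_path: str, session_id: str,
                       trigger: str, harness: str) -> dict:
    """Append one queue entry as a single JSON line and return it."""
    entry = {
        "transcript_path": transcript_path,
        "session_id": session_id,
        "trigger": trigger,   # e.g. "precompact" or "stop" (assumed values)
        "harness": harness,   # e.g. "claude" or "codex" (assumed values)
        "queued_at": datetime.now(timezone.utc).isoformat(),
    }
    queue_path.parent.mkdir(parents=True, exist_ok=True)
    # Append mode leaves existing entries untouched: one entry per line.
    with queue_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return entry


def read_queue(queue_path: Path) -> list[dict]:
    """Read every queued entry, skipping blank lines."""
    if not queue_path.exists():
        return []
    with queue_path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Because each entry records its own `harness`, a drainer started from either tool can process entries written by the other.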

Platform · adapter interface and shared knowledge base

Each harness has its own adapter that wires hooks into its own config file. Both adapters point at the same hook scripts and the same shared knowledge base.

Harness adapters:
  • Claude Code (plugin runtime): .claude-plugin/plugin.json → ~/.claude/settings.json; installed with /plugin install reflect@agents-in-a-box.
  • Codex CLI (adapter direct): adapters/codex/codex_adapter.py → ~/.codex/hooks.json; installed with python codex_adapter.py install.
  • Copilot CLI (planned): adapters/copilot/copilot_adapter.py → TBD; python copilot_adapter.py install.

Reflect core:
  • Hook scripts: session_start_recall.py, precompact_reflect.py, reflect-drain-bg.sh.
  • CLI: reflect, recall, reindex, search (python -m reflect_kb).
  • Headless worker: claude -p /reflect <transcript> with --output-format json and --max-turns 25.

Shared knowledge base (~/.reflect/ + ~/.learnings/, the same directories for all harnesses):
  • Pending queue: ~/.reflect/pending_reflections.jsonl
  • Learnings store: ~/.learnings/documents/
  • GraphRAG + vector index: ~/.learnings/graphrag/

Flows: each adapter wires hooks into its harness config file; hooks enqueue transcripts; the headless worker writes .md learnings and triggers a reindex; recall injects the top-3 learnings into session context. All adapters point at the same core and the same knowledge base.

Install paths · Claude Code vs Codex CLI

Claude has a plugin runtime (/plugin install) so installation is two slash commands. Codex has no plugin runtime, so the adapter does the autowire itself with one python command.

Claude Code:
  1. Marketplace add: /plugin marketplace add stevengonsalvez/agents-in-a-box
  2. Plugin install: /plugin install reflect@agents-in-a-box
  3. Plugin runtime (automated): extracts plugin.json, expands CLAUDE_PLUGIN_ROOT, and writes ~/.claude/settings.json.
User effort: 2 commands; the runtime handles the rest. Writes: ~/.claude/settings.json (hooks block merged) + skills/ extracted to ~/.claude/plugins/reflect/.

Codex CLI:
  1. Clone the repo, or use an existing agents-in-a-box checkout.
  2. Run the adapter directly: python plugins/reflect/adapters/codex/codex_adapter.py install
  3. The adapter copies skills (plugins/ → ~/.codex/skills/) and merges ~/.codex/hooks.json.
  4. Done: hooks.json is active.
User effort: 1 python command; no plugin runtime required. Writes: ~/.codex/hooks.json (SessionStart + PreCompact entries merged) + skills/ copied to ~/.codex/skills/reflect/.

Both config files point at the same hook scripts and the same shared knowledge base.

# Claude Code — managed install via plugin runtime
/plugin marketplace add stevengonsalvez/agents-in-a-box
/plugin install reflect@agents-in-a-box
# Codex CLI — adapter does the autowire
python plugins/reflect/adapters/codex/codex_adapter.py install
# or skip the bg drain on codex-only machines without claude on PATH:
python plugins/reflect/adapters/codex/codex_adapter.py install --no-bg-drain
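
Both install paths end in a "hooks block merged" step, which should be safe to run more than once. The sketch below illustrates an idempotent merge. It assumes a simplified, hypothetical config shape of `{"hooks": {event: [{"command": ...}]}}`; the real settings.json and hooks.json schemas may differ, and `merge_hooks` is not the adapter's actual function.

```python
import json
from pathlib import Path


def merge_hooks(config_path: Path, new_hooks: dict[str, list[str]]) -> dict:
    """Merge hook commands into a JSON config without duplicating entries.

    Assumed shape (hypothetical): {"hooks": {"<Event>": [{"command": "..."}]}}
    """
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    hooks = config.setdefault("hooks", {})
    for event, commands in new_hooks.items():
        entries = hooks.setdefault(event, [])
        existing = {e.get("command") for e in entries}
        for command in commands:
            if command not in existing:  # re-running install adds nothing new
                entries.append({"command": command})
                existing.add(command)
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

Merging rather than overwriting matters because the config file may already hold unrelated settings or hooks from other tools.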

Recall · UserPromptSubmit primary, SessionStart baseline, per-session dedupe

SessionStart fires before the user has typed anything — its recall query has to be inferred from cwd, branch, and recent commits. UserPromptSubmit has the actual user prompt to query against, which gives much sharper hits. Both fire; UserPromptSubmit dedupes against learnings already injected this session so the same memory doesn’t re-inject on every prompt.

  1. SessionStart (baseline recall): session_start_recall queries on cwd + branch and injects the top-3 baseline → L₁ L₂ L₃. Dedupe state: {L₁, L₂, L₃}.
  2. UserPromptSubmit ① "fix OAuth bug": user_prompt_recall queries on the prompt text and dedupes against {L₁, L₂, L₃} → L₄ (L₁ skipped). Dedupe state: {L₁, L₂, L₃, L₄}.
  3. UserPromptSubmit ② "add refresh token": user_prompt_recall queries on the prompt text and dedupes against {L₁..L₄} → L₅ (L₂, L₄ skipped). Dedupe state: {L₁..L₅}.
  4. UserPromptSubmit ③ "now the rate-limiter": topic pivot, so the query returns new hits → L₆ L₇ L₈. Dedupe state: {L₁..L₈}.

Why two layers? SessionStart fires before the user types, so its query is coarse. UserPromptSubmit has the actual intent, so its query is sharp. Dedupe stops the same learning being re-injected on every prompt.

Dedupe state lives at ~/.reflect/session-injected/<session_id>.json, a per-session set of learning IDs already injected. UserPromptSubmit recall queries the KB, drops any hit that is already in the dedupe set, and injects only the new hits as additionalContext.
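
The dedupe step is a set difference followed by a state update. A minimal sketch, assuming the state file is a JSON object mapping learning ID to injection timestamp; `load_injected` and `select_new_hits` are hypothetical names, and this version does not model any exception for top-ranked hits.

```python
import json
import time
from pathlib import Path


def load_injected(state_path: Path) -> dict[str, float]:
    """Load the per-session {learning_id: timestamp} map, or an empty one."""
    if not state_path.exists():
        return {}
    return json.loads(state_path.read_text(encoding="utf-8"))


def select_new_hits(hits: list[str], state_path: Path) -> list[str]:
    """Return hits not yet injected this session, and record them as injected."""
    injected = load_injected(state_path)
    # dict.fromkeys removes duplicates within one result list, keeping rank order.
    fresh = [h for h in dict.fromkeys(hits) if h not in injected]
    now = time.time()
    for learning_id in fresh:
        injected[learning_id] = now
    state_path.parent.mkdir(parents=True, exist_ok=True)
    state_path.write_text(json.dumps(injected), encoding="utf-8")
    return fresh
```

Replaying the timeline above, a baseline of L1 to L3 followed by prompt hits of L1 and L4 yields only L4 for injection.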


Capture · PostToolUse mini-learnings + Stop reflection enqueue

PreCompact queues the high-cost full reflection, which runs later via claude -p /reflect. Two more hooks cover the gaps:

  • PostToolUse captures cheap mini-learnings inline. On tool failure it arms a watcher for the next user prompt; if that prompt looks like a correction ("try X instead"), it writes a low-confidence learning directly to disk. No LLM run needed.
  • Stop catches short sessions that end before PreCompact ever fires. Enqueues the transcript on agent finish; dedupes against any PreCompact entry for the same session.

PostToolUse mini-learning:
  1. A tool call fails (Bash exit≠0, Edit error).
  2. The PostToolUse hook fires; posttooluse_minilearning.py arms the next-prompt watcher by writing ~/.reflect/armed.json.
  3. The user replies "try X instead"; UserPromptSubmit fires and detects the arming.
  4. A learning is written to ~/.learnings/documents/<slug>.md with conf=low. Low-cost, no /reflect run.

Stop reflection enqueue (short sessions):
  1. The agent finishes and the Stop hook fires (the context never filled).
  2. stop_reflect.py enqueues the transcript if it is not already queued (dedupe by session_id).
  3. The entry lands in ~/.reflect/pending_reflections.jsonl (shared queue) and is drained at the next SessionStart.
  4. claude -p /reflect (headless) writes learnings and reindexes: the same drain path as PreCompact.

★ Mini-learning is cheap (no LLM run); Stop-enqueue is a full /reflect (same as PreCompact). Both write to the same learnings store.
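
The mini-learning path needs only a cheap check on the next prompt. The sketch below shows one way such a check could look. The regex cues, the `armed` dictionary shape, and the function names are all hypothetical; the heuristics used by posttooluse_minilearning.py are not described here.

```python
import re
from typing import Optional

# Hypothetical correction cues, chosen only to match the examples in this page.
CORRECTION_PATTERNS = [
    r"\btry\b.+\binstead\b",          # "try X instead"
    r"\buse\b.+\b(instead|for)\b",    # "use --insecure for local dev"
]


def looks_like_correction(prompt: str) -> bool:
    """Return True when the prompt matches any correction cue."""
    text = prompt.lower()
    return any(re.search(pattern, text) for pattern in CORRECTION_PATTERNS)


def maybe_mini_learning(armed: Optional[dict], prompt: str) -> Optional[dict]:
    """Build a low-confidence learning if a watcher is armed and the prompt corrects it."""
    if armed is None or not looks_like_correction(prompt):
        return None
    return {
        "failed_command": armed["command"],  # assumed field of the armed state
        "correction": prompt.strip(),
        "confidence": "low",
    }
```

Keeping confidence low reflects that a pattern match is weak evidence compared with a full /reflect pass over the transcript.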

Status line · making recall + capture activity visible

Both harnesses give visual feedback, but through different mechanisms.

Claude Code (custom shell status line):
  1. Any reflect hook fires (recall, enqueue, drain) and writes ~/.reflect/last-event.json, a small atomic file of the form {event, ts, detail}.
  2. ~/.claude/statusline.sh reads last-event.json and renders the reflect fragment.
  3. Example status line: claude · main 🧠 recalled 3 · queued 1 8% ctx · 2h limit

Codex CLI (hook statusMessage field, per-call only):
  1. Any reflect hook fires (recall, enqueue, drain); its hook entry has a statusMessage.
  2. Codex reads the hooks.json entry, for example "statusMessage":"🧠 recalling…".
  3. The codex TUI shows the message during the hook only; there is no persistent fragment yet. Example: codex · gpt-5.5 🧠 recalling… main · ./repo · 4%

★ Claude has a custom shell statusline, so persistent counters (recalled N · queued M) are possible. Codex's status line is a fixed token list, so reflect uses the per-hook statusMessage field, which shows only during hook execution. Full parity is blocked on a codex custom-token API.
  • Claude Code — hooks write ~/.reflect/last-event.json; the user’s ~/.claude/statusline.sh reads it and renders a persistent reflect fragment (🧠 3 recalled · 1 queued).
  • Codex CLI — hooks declare a statusMessage field in hooks.json. Codex shows it ephemerally during hook execution (🧠 recalling...). The static [tui] status_line config can’t carry a custom token yet, so persistent codex-side counters wait on a codex API extension.
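
The Claude Code path depends on the event file never being read half-written. One common way to get that property is to write a temp file and rename it over the target. The sketch below assumes that approach; the `detail` keys `recalled` and `queued` and both function names are hypothetical, and the real renderer is a shell script rather than Python.

```python
import json
import os
import tempfile
import time
from pathlib import Path


def write_last_event(path: Path, event: str, detail: dict) -> None:
    """Write the event file atomically: temp file in the same dir, then rename."""
    path.parent.mkdir(parents=True, exist_ok=True)
    payload = {"event": event, "ts": time.time(), "detail": detail}
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(payload, fh)
    os.replace(tmp_name, path)  # readers see the old file or the new one, never a partial


def render_fragment(path: Path) -> str:
    """Return a status-line fragment, or an empty string if there is nothing to show."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return ""
    detail = data.get("detail", {})
    return f"🧠 {detail.get('recalled', 0)} recalled · {detail.get('queued', 0)} queued"
```

Returning an empty string on a missing or unreadable file keeps the status line usable on machines where no hook has fired yet.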

Worked example · “fix the OAuth redirect bug”

A concrete walkthrough showing every hook firing in order across a morning Claude session and an afternoon Codex session on the same repo.

Scenario: a morning Claude Code session in ./auth-service (09:14 to 11:05), then an afternoon Codex session on the same repo (14:30 to 14:31). Outcome: L₄ "OAuth state mismatch", extracted from the morning Claude transcript, is in the index by the time the Codex session recalls, and the Codex TUI reports 🧠 3 recalled with an empty queue. Step by step:
  1. 09:14 · Claude SessionStart — recall on cwd=auth-service returns L₁ (“OAuth state handling”). Injected as baseline.
  2. 09:14 · UserPromptSubmit — user types “fix the OAuth redirect bug”. Sharp query pulls L₂, L₃. L₁ skipped (already injected). Dedupe set: {L₁, L₂, L₃}.
  3. 09:18 · PostToolUse — curl returns 500 (Bash exit=1). Mini-learning watcher arms.
  4. 09:18 · UserPromptSubmit — user types “use --insecure for local dev”. Watcher sees correction pattern, writes mini-learning directly to disk. No LLM run.
  5. 10:42 · PreCompact — context 90% full. Transcript path enqueued to ~/.reflect/pending_reflections.jsonl.
  6. 11:05 · Stop — agent finishes. stop_reflect.py checks queue, sees PreCompact already enqueued this session_id → skips.
  7. 14:30 · Codex SessionStart — different harness, same repo. reflect-drain-bg.sh starts in background, finds the morning queue entry, spawns claude -p /reflect headless. Writes L₄ (“OAuth state mismatch on redirect”).
  8. 14:31 · Codex UserPromptSubmit — user types “add OAuth refresh tokens”. Recall pulls L₂, L₃, L₄, including the learning the drain just extracted from this morning’s Claude transcript.

The codex session benefits from the morning’s Claude work without anyone moving files around. The queue and the learnings store are the only handoff.


Files involved

| File | Role |
| --- | --- |
| plugins/reflect/skills/recall/hooks/session_start_recall.py | SessionStart recall (baseline, cwd-based query) |
| plugins/reflect/skills/recall/hooks/user_prompt_submit_recall.py | UserPromptSubmit recall (intent-sharp, with dedupe) |
| plugins/reflect/hooks/precompact_reflect.py | PreCompact enqueue (full reflection deferred) |
| plugins/reflect/hooks/stop_reflect.py | Stop enqueue (short-session fallback) |
| plugins/reflect/hooks/posttooluse_minilearning.py | PostToolUse mini-learning capture |
| plugins/reflect/hooks/reflect-drain-bg.sh | SessionStart bg-drainer (shells out to claude -p) |
| plugins/reflect/.claude-plugin/plugin.json | Claude plugin autowire (/plugin install) |
| plugins/reflect/adapters/codex/codex_adapter.py | Codex installer (writes ~/.codex/hooks.json) |
| ~/.reflect/pending_reflections.jsonl | Shared queue (any harness writes, any drains) |
| ~/.reflect/session-injected/<session_id>.json | Per-session dedupe state |
| ~/.reflect/last-event.json | Status line fragment source |
| ~/.learnings/documents/ | Markdown learnings + entity sidecars |
| ~/.learnings/graphrag/ | GraphRAG + vector index |

FAQ

Why doesn’t SessionStart recall use the user’s first prompt? SessionStart fires before the user has typed anything. Its query has to be inferred from cwd, branch, and recent commits — coarse but immediate. UserPromptSubmit fills the prompt-aware recall slot.

What stops UserPromptSubmit recall from re-injecting the same learning every prompt? The per-session dedupe set at ~/.reflect/session-injected/<session_id>.json. Each hit becomes a {learning_id: ts} entry; future prompts skip already-injected learnings unless they’d be the top hit anyway.

Why does the drainer always shell out to claude instead of codex? The /reflect skill is a Claude skill. Codex is the trigger (any SessionStart fires the bg drainer), Claude is the worker. On codex-only machines without claude on PATH, pass --no-bg-drain to the codex adapter to skip the drain hook.
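
For illustration, the headless invocation and the PATH check behind --no-bg-drain can be sketched as follows. The flags are the ones listed in this page; `build_reflect_command` and `can_drain` are hypothetical helper names, and the real drainer is a shell script, so this is only a Python rendering of the same idea.

```python
import shutil


def build_reflect_command(transcript_path: str) -> list[str]:
    """Assemble the argv for one headless /reflect run over a queued transcript."""
    return [
        "claude", "-p", f"/reflect {transcript_path}",
        "--output-format", "json",
        "--max-turns", "25",
        "--permission-mode", "bypassPermissions",
    ]


def can_drain() -> bool:
    """True when a claude executable is on PATH, so a drain could run at all."""
    return shutil.which("claude") is not None
```

Passing the command as a list (for example to `subprocess.run`) avoids shell quoting issues when a transcript path contains spaces.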

Does Stop also fire on long sessions that hit PreCompact? Yes, but stop_reflect.py dedupes against the queue by session_id. PreCompact gets in first; Stop is a fallback for sessions that never compact.
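
The Stop-side dedupe amounts to scanning the queue for the current session_id before appending. A minimal sketch, assuming the JSONL queue format described on this page; `already_queued` and `stop_enqueue` are hypothetical names, and `"stop"` as the trigger value is a guess.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def already_queued(queue_path: Path, session_id: str) -> bool:
    """Return True if any queue line belongs to this session."""
    if not queue_path.exists():
        return False
    with queue_path.open(encoding="utf-8") as fh:
        for line in fh:
            if line.strip() and json.loads(line).get("session_id") == session_id:
                return True
    return False


def stop_enqueue(queue_path: Path, transcript_path: str, session_id: str,
                 harness: str) -> bool:
    """Enqueue on Stop unless the session already has an entry. True if written."""
    if already_queued(queue_path, session_id):
        return False  # PreCompact (or an earlier Stop) got there first
    entry = {
        "transcript_path": transcript_path,
        "session_id": session_id,
        "trigger": "stop",  # assumed trigger label
        "harness": harness,
        "queued_at": datetime.now(timezone.utc).isoformat(),
    }
    queue_path.parent.mkdir(parents=True, exist_ok=True)
    with queue_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return True
```

A linear scan is adequate here because the queue grows by about one line per compaction or session end.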

Where does ~/.reflect/ live, and is it portable? Under $HOME/.reflect/ by default; overridable via REFLECT_STATE_DIR. Contents are JSONL/Markdown/YAML — fully grep-able, version-control friendly, and portable across machines via filesystem sync.
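
Resolving the state directory with the REFLECT_STATE_DIR override could look like the sketch below. The helper name `reflect_state_dir` is hypothetical, and treating an empty variable as unset is an assumption about the intended behaviour.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def reflect_state_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the state directory: REFLECT_STATE_DIR if set, else ~/.reflect."""
    env = os.environ if env is None else env
    override = env.get("REFLECT_STATE_DIR")
    if override:  # empty string is treated as unset (assumption)
        return Path(override).expanduser()
    return Path.home() / ".reflect"
```

Accepting the environment as a parameter keeps the function easy to test without mutating the real process environment.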


Try it — see the standalone visual posters with the same diagrams: