Cache key + memoization
Every node's run is memoized to a per-node cache under .openexpertise/cache/. If the same node, with the same inputs, has been seen before, the cached output is returned and no work is done. This makes oe resume instant and saves real money on LLM-backed nodes.
What's cached
The cache stores NodeOutput (the { state_delta, edge_output?, metrics? } triple) keyed by a content-addressed hash of:
- The node's id
- The node's kind
- The node's full spec (everything in the YAML node definition)
- The RUNTIME_VERSION constant (bumped to invalidate caches on breaking changes)
- The bundle.state_view (every field the node reads)
- The bundle.edge_inputs (every upstream edge_output)
- The bundle.args (resolved per-node args)
- For for_each iterations, the $item value too
See packages/core/src/cache/key.ts for the exact construction. It uses object-hash for stable hashing.
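As a rough illustration of content-addressed keying, the sketch below hashes the listed fields after serializing them with sorted object keys. It is not the code in key.ts: it swaps in node:crypto plus a hand-rolled stable serializer in place of object-hash, and the KeyInput shape, stableStringify, and sketchCacheKey are hypothetical names chosen for this example.

```typescript
import { createHash } from "node:crypto";

// Hypothetical input shape; the real types live in packages/core.
interface KeyInput {
  id: string;
  kind: string;
  spec: unknown;
  runtimeVersion: string;
  stateView: unknown;
  edgeInputs: unknown;
  args: unknown;
  item?: unknown; // only set inside for_each iterations
}

// Serialize with object keys sorted, so property order cannot change the hash.
function stableStringify(value: unknown): string {
  if (value === undefined) return "null";
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    return "[" + value.map(stableStringify).join(",") + "]";
  }
  const obj = value as Record<string, unknown>;
  const entries = Object.keys(obj)
    .sort()
    .map((k) => JSON.stringify(k) + ":" + stableStringify(obj[k]));
  return "{" + entries.join(",") + "}";
}

// Assumption: SHA-256 over the stable serialization stands in for object-hash.
function sketchCacheKey(input: KeyInput): string {
  return createHash("sha256").update(stableStringify(input)).digest("hex");
}
```

With this sketch, reordering keys inside spec leaves the key unchanged, while changing any value (for example one character of a prompt) produces a different key.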
How it's used
The scheduler checks the cache before invoking the dispatcher:
const cacheKey = computeCacheKey({ node, bundle, ... })
const hit = cache.get(cacheKey)
if (hit) {
// emit node.started, apply state_delta from cache, emit node.finished
// — no dispatcher call, no LLM call, no subprocess spawn
return
}
// otherwise: dispatcher.run(...) and cache.set(cacheKey, output)
For deterministic tool nodes, this is harmless and useful. For LLM-backed nodes, it's a real cost saving: the same prompt + state combination won't call the LLM twice.
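The check-then-dispatch flow above can be sketched in a self-contained form. This is a simplified, synchronous stand-in that assumes an in-memory Map as the cache; runMemoized and the Dispatch type are hypothetical names, and event emission is left out.

```typescript
// Hypothetical minimal types, for illustration only.
interface NodeOutput {
  state_delta: Record<string, unknown>;
  edge_output?: unknown;
  metrics?: Record<string, number>;
}

type Dispatch = () => NodeOutput;

// Run one node through the cache: the dispatcher is only called on a miss.
function runMemoized(
  cache: Map<string, NodeOutput>,
  cacheKey: string,
  state: Record<string, unknown>,
  dispatch: Dispatch,
): "hit" | "miss" {
  const cached = cache.get(cacheKey);
  if (cached !== undefined) {
    // Replay the stored state write; no dispatcher call happens here.
    Object.assign(state, cached.state_delta);
    return "hit";
  }
  const output = dispatch();
  cache.set(cacheKey, output);
  Object.assign(state, output.state_delta);
  return "miss";
}
```

Calling runMemoized twice with the same key invokes dispatch once; the second call only replays the stored state_delta.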
When the cache is a hit
The cache hits when every byte of the input bundle hashes the same. Concretely:
- The node spec hasn't changed (no YAML edits).
- The reads-list values in state are byte-identical.
- The args resolve to the same values.
- The upstream edge outputs are identical.
- The RUNTIME_VERSION constant hasn't been bumped.
If you tweak the prompt by one character — that's a different node spec — cache miss. If an upstream tool produces a slightly different output — cache miss for the downstream agent.
oe resume uses this aggressively
oe resume r-abc123
Re-runs the experience pretending the prior run never failed at node X. Every successful node before X has a cache entry; resume just replays their state writes and proceeds to X. Only X (and any downstream nodes) actually execute.
This is most useful for long-running flows where you fix a bug in a late-stage node and don't want to re-pay the LLM bill for the early stages.
Disabling the cache
Per-run:
await runExperience({ ..., cache: false })
There's no CLI flag yet (V1 ships cache-on; opt-out via API). To force a re-run of everything from the CLI today, the simplest thing is:
rm -rf .openexpertise/cache
oe run .
What's NOT cached
- cli-agent nodes: the subprocess spawn is treated as non-deterministic. Caching here would require the parent flow to assume the CLI returns the same output for identical args, which it explicitly doesn't (Claude Code searches your codebase live, etc.).
- Nodes inside for_each iterations: these ARE cached, but per iteration. Each $item gets its own cache entry.
- Random-by-design tools: if your tool reads Math.random() or Date.now() and outputs based on it, the cache will hit on the SECOND run because the input bundle is identical, even though your tool would have produced a different value. Use cache: false if you need true non-determinism, or make the non-determinism an explicit input (e.g., pass a seed arg).
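To illustrate the seed suggestion, here is a minimal sketch of a tool whose randomness comes only from its args. The pickSample tool and its args shape are hypothetical; mulberry32 is a small, widely used public-domain PRNG standing in for whatever generator you prefer. Because the seed is part of the args, it is part of the input bundle, so a cache hit returns exactly what a re-run would have produced.

```typescript
// mulberry32: a small deterministic PRNG. Same seed, same sequence.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // value in [0, 1)
  };
}

// Hypothetical tool: picks one item using only its args, never Math.random().
function pickSample(args: { seed: number; items: string[] }): string {
  const rand = mulberry32(args.seed);
  return args.items[Math.floor(rand() * args.items.length)];
}
```

Changing the seed changes the args, which changes the cache key, so a new seed forces a fresh execution without disabling the cache.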
Cache layout on disk
.openexpertise/cache/
├── <hash-1>.json
├── <hash-2>.json
└── ...
Each file is a single NodeOutput serialization. The cache is bounded only by disk space; it has no expiry policy in V1. If your experience grows large, periodically rm -rf .openexpertise/cache to reset.
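A one-file-per-hash layout like this can be read and written with a few lines of Node code. The sketch below assumes each file holds plain JSON; cacheGet and cacheSet are hypothetical helper names, not the actual OpenExpertise API, and the example writes into a temporary directory rather than a real project.

```typescript
import { existsSync, mkdirSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical minimal shape of a cached entry.
interface NodeOutput {
  state_delta: Record<string, unknown>;
  edge_output?: unknown;
  metrics?: Record<string, number>;
}

// Look up <dir>/<hash>.json; a missing file is a cache miss.
function cacheGet(dir: string, hash: string): NodeOutput | undefined {
  const file = join(dir, `${hash}.json`);
  if (!existsSync(file)) return undefined;
  return JSON.parse(readFileSync(file, "utf8")) as NodeOutput;
}

// Write one NodeOutput per file, creating the cache directory if needed.
function cacheSet(dir: string, hash: string, output: NodeOutput): void {
  mkdirSync(dir, { recursive: true });
  writeFileSync(join(dir, `${hash}.json`), JSON.stringify(output));
}

// Usage against a throwaway directory that mirrors the layout above.
const demoDir = join(mkdtempSync(join(tmpdir(), "oe-cache-")), ".openexpertise", "cache");
cacheSet(demoDir, "abc123", { state_delta: { summary: "ok" } });
const roundTrip = cacheGet(demoDir, "abc123");
```

Since nothing in this layout records when an entry was written, deleting the directory is the only reset, which matches the no-expiry behavior described above.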
When caches go stale
The cache invalidates automatically on:
- Node spec changes: different hash key.
- State changes: different state_view hash.
- Upstream edge changes: different edge_inputs hash.
- RUNTIME_VERSION bumps: when OpenExpertise itself ships a breaking runtime change, the constant in packages/core/src/graph/scheduler.ts is bumped, invalidating everything.
The cache does NOT invalidate on:
- Time. A cached LLM response from a month ago is still served today if the inputs match.
- Model changes. If you change model: claude-sonnet-4-6 → gpt-4o-2024-11-20 in the node spec, that IS a spec change and invalidates. But environment-level changes (different ANTHROPIC_API_KEY, different system prompt at the API level, model upgrades on the provider side) do not invalidate.
Why this design
The cache exists because:
- LLM calls are expensive. A 10-minute review-branch run costs cents per agent node. Across 20 reviews a day that's real money.
- Reproducibility. Replaying a run yields the same intermediate state. Useful for debugging: you can mutate a prompt, and only that node and the nodes downstream of it rerun.
- Resume on failure. When a late-stage node fails, fixing it shouldn't cost the upstream nodes again.
The trade-off is silent staleness. The cache won't know your model improved last week, or your fixture file rotated overnight, or your prompt is "almost the same" but a comma was added. The mitigation is rm -rf .openexpertise/cache when in doubt.
→ Continue with Evolution loop.