FAQ + Troubleshooting
The questions and errors that come up first. If you don't find it here, search the site (/) — every page is indexed.
Concept questions
Is this an agent framework?
No. OpenExpertise is the orchestration layer above agent frameworks. The graph is deterministic YAML; the LLM only fills in the gaps inside nodes (a single agent's structured response, or a single skill's reply). The graph never changes at runtime.
If you want an autonomous agent that explores freely, use Claude Code, Codex, or Gemini directly. If you want to wrap one of those agents in a reproducible workflow, use OE. See vs Claude Code directly for the side-by-side.
How is this different from LangGraph / CrewAI / Mastra?
The full matrix is at Compare. Short version:
- vs LangGraph — they're Python and code-as-graph; we're YAML-first and dispatcher-modular. You define the same DAG, but ours lives in source-control as a single artifact that humans can review.
- vs CrewAI — they're agent-conversation-centric; we're graph-execution-centric. Less "agents talking", more "DAG executes with LLM inside specific nodes".
- vs Mastra — closest spiritual neighbor. Different design: we treat the graph + state + events as the durable substrate; agent-internals are pluggable dispatchers.
- vs Inngest / Temporal — they're durable execution backends without LLM batteries-included; we ship LLM dispatchers + 429 retry + structured output + evolution loop on top of a similar primitive.
Why YAML and not Python / TypeScript / JSX?
Three reasons:
- Git-reviewable. A YAML graph is a flat artifact. Five reviewers can look at the same experience.yaml and see the whole shape.
- Tool-friendly. Schema-validated, LLM-authorable (oe ultra writes them), diffable, type-narrowed in editors with the published JSON schema.
- Single source of truth. The same file feeds the validator, the runtime, the TUI, and the evolution advisor. Nothing is hidden behind a Python decorator.
If you genuinely want a code-first DAG, see the programmatic runExperience API — but most production users prefer YAML once they've tried both.
What's the difference between a tool and a cli-agent?
| tool | cli-agent |
|---|---|
| Pure JS function. Deterministic. You write the .mjs. | Subprocess that spawns claude / codex / gemini. Non-deterministic. |
| Synchronous-ish (returns when the function returns). | Async — takes seconds to minutes per invocation. |
| No tokens, no metrics, no LLM dependency. | Tokens spent inside the CLI's session. Not always exposed. |
| Use for: HTTP fetch, file IO, parsing, deterministic logic, classification. | Use for: anything where you want a different vendor's CLI to do agentic work. |
What's the difference between an agent and a skill?
Both are LLM calls. Difference is where the prompt lives:
- agent: inline prompt path + inline JSON schema in experience.yaml. Optimized for one specific structured output you control fully.
- skill: points at a SKILL.md package (frontmatter + body, à la Anthropic skills). Optimized for reusable, redistributable capabilities. The SKILL.md travels independently of the experience.
If you'd reuse the same prompt + schema across multiple experiences, package it as a skill. Otherwise inline as an agent.
Does it work without an LLM?
Yes for tool and dataset nodes. You can build a deterministic-only workflow (the hello-tool and dataset-aggregate examples don't touch an LLM).
For agent / skill / cli-agent nodes you need a configured provider (ANTHROPIC_API_KEY, or OPENAI_API_KEY + OPENAI_BASE_URL, or the CLI installed locally). The 12 examples ship with mocked-LLM e2e tests that exercise the structure without burning real tokens.
Can I run experiences offline?
Yes — point OPENAI_BASE_URL at a local vLLM / Ollama / LM Studio server. See Self-hosted LLMs. tool / dataset nodes don't need any model.
Will my workflow change between runs?
No. The graph is fixed by your YAML. The LLM only fills in nodes' content (findings, summary, etc.). The shape is stable, the path is stable, the state schema is stable.
If you want to change the graph, that's oe evolve — but it writes a markdown file with a unified diff, and you git apply it consciously. Nothing auto-applies.
Setup questions
What Node version?
20.x, 22.x, or 24.x. We CI on all three. Node 18 is expected to work but isn't tested.
Why pnpm specifically?
@openexpertise/* ships as a pnpm-workspace monorepo with internal workspace:* links. pnpm install resolves them transparently. Downstream consumers installing from npm can use npm / yarn, but for developing inside the repo, pnpm is required.
If pnpm install fails with ERR_PNPM_UNSUPPORTED_ENGINE, you need pnpm 9 or later. Install it with npm i -g pnpm@9.
Why does better-sqlite3 install say it's compiling native code?
better-sqlite3 is a native module — first install on a new machine rebuilds it for your platform. If you see a build error here, you're missing build tools:
- macOS: xcode-select --install
- Ubuntu: sudo apt install build-essential python3
- Windows: install Visual Studio Build Tools (the official node-gyp README has the recipe)
Subsequent installs are cached.
Where does state live?
<cwd>/.openexpertise/state.sqlite (one DB per workspace). Events live at <cwd>/.openexpertise/runs/<run-id>.jsonl. Per-run cache at <cwd>/.openexpertise/cache/<key>.json.
All three are git-ignorable by default. If you want a different location, see the OE_DATA_DIR env var documented in Architecture.
How do I run only one node from a graph?
You don't directly — but oe resume <run-id> will skip every cached step and only re-execute the ones that are dirty or downstream of dirty. Combine with oe reset-state <field> to mark a field stale and force re-execution.
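The selection rule described above (re-run what is dirty plus everything downstream of it) can be sketched as a simple graph walk. This is an illustration only, not OE's implementation: the edge-list shape, the node names, and the function name nodesToRerun are all hypothetical.

```javascript
// Hypothetical sketch: given DAG edges and a list of dirty nodes, return
// every node that must re-execute (dirty nodes plus all their descendants).
function nodesToRerun(edges, dirty) {
  // Build adjacency: node -> list of direct downstream nodes.
  const downstream = new Map();
  for (const [from, to] of edges) {
    if (!downstream.has(from)) downstream.set(from, []);
    downstream.get(from).push(to);
  }
  // Walk forward from every dirty node, collecting what we reach.
  const rerun = new Set(dirty);
  const stack = [...dirty];
  while (stack.length > 0) {
    const node = stack.pop();
    for (const next of downstream.get(node) ?? []) {
      if (!rerun.has(next)) {
        rerun.add(next);
        stack.push(next);
      }
    }
  }
  return rerun;
}

const edges = [["fetch", "review"], ["review", "score"], ["fetch", "summary"]];
const rerun = nodesToRerun(edges, ["review"]);
// review and score re-run; fetch and summary stay cached.
console.log([...rerun].sort());
```

In this toy graph, marking review dirty leaves fetch and summary untouched, which is the behavior the answer above describes for cached steps.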
Common errors
Error: Node "X" has unknown kind "xyz"
Your YAML has kind: xyz, which isn't one of tool / agent / skill / dataset / experience / cli-agent. Check spelling. The 6 valid kinds are documented at Node kinds.
Error: cycle detected in graph: A → B → A
oe validate will tell you which edges form the cycle. Common cause: copy-paste mistake creating a back-edge. Use oe inspect <run-id> after a clean run to see the actual ordering.
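For intuition about what a cycle check looks for, here is a minimal depth-first search that reports the first cycle it finds. It is a generic sketch, not the code behind oe validate; the edge-list input and the name findCycle are assumptions made for illustration.

```javascript
// Hypothetical sketch: depth-first search that returns the first cycle found
// as a list of node IDs (e.g. ["A", "B", "A"]), or null if the graph is acyclic.
function findCycle(edges) {
  const next = new Map();
  for (const [from, to] of edges) {
    if (!next.has(from)) next.set(from, []);
    next.get(from).push(to);
  }
  const done = new Set(); // nodes whose descendants are fully explored
  const path = [];        // nodes on the current DFS path

  function visit(node) {
    const seenAt = path.indexOf(node);
    // Reaching a node already on the current path means we followed a back-edge.
    if (seenAt !== -1) return [...path.slice(seenAt), node];
    if (done.has(node)) return null;
    path.push(node);
    for (const child of next.get(node) ?? []) {
      const cycle = visit(child);
      if (cycle) return cycle;
    }
    path.pop();
    done.add(node);
    return null;
  }

  for (const node of next.keys()) {
    const cycle = visit(node);
    if (cycle) return cycle;
  }
  return null;
}

const cycle = findCycle([["A", "B"], ["B", "A"], ["B", "C"]]);
console.log(cycle ? cycle.join(" → ") : "no cycle"); // prints "A → B → A"
```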
Error: schema validation failed for node "X"
The agent node's structured output didn't match its declared schema. Look at the error path — AJV will say exactly which property failed. Most common:
- Missing required property → either tighten the prompt or make the field optional in schema.
- additionalProperties: false rejected an unexpected key → the model added a field; either allow it or refine the prompt.
- Type mismatch → e.g., model returned "5" (string) where you wanted 5 (number).
The agent dispatcher retries once when the output is shape-invalid. If the retry also fails, the node errors and its on_error policy kicks in.
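The three failure modes listed above are easier to recognize with a concrete reply in hand. The sketch below uses a tiny hand-rolled checker, which is not AJV and handles only a small subset of JSON Schema; the schema, the sample reply, and the function name checkShape are hypothetical.

```javascript
// Illustrative only: a tiny checker for a flat object schema. This is NOT AJV.
// It covers `required`, `additionalProperties: false`, and primitive `type`
// values whose JSON Schema name matches typeof (string, number, boolean).
function checkShape(schema, value) {
  const errors = [];
  for (const key of schema.required ?? []) {
    if (!(key in value)) errors.push(`/${key}: missing required property`);
  }
  for (const [key, actual] of Object.entries(value)) {
    const prop = schema.properties?.[key];
    if (!prop) {
      if (schema.additionalProperties === false) {
        errors.push(`/${key}: unexpected additional property`);
      }
      continue;
    }
    if (typeof actual !== prop.type) {
      errors.push(`/${key}: expected ${prop.type}, got ${typeof actual}`);
    }
  }
  return errors;
}

const schema = {
  type: "object",
  required: ["summary", "score"],
  additionalProperties: false,
  properties: { summary: { type: "string" }, score: { type: "number" } },
};

// A reply that hits all three cases: no summary, "5" as a string, a stray key.
console.log(checkShape(schema, { score: "5", notes: "extra" }));
```

Each reported line names the offending property path, which is the same information to look for in the real validation error.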
Error: provider "X" not configured
Your YAML references a provider (anthropic, openai, claude-code, codex, gemini) that isn't set up in environment or in runtime.providers. For LLM SDKs:
export ANTHROPIC_API_KEY=sk-ant-...
# or
export OPENAI_API_KEY=sk-...
export OPENAI_BASE_URL=https://api.openai.com/v1 # default OK
For cli-agent providers, the CLI must be installed: see cli-agent usage.
Error: Tool impl ./tools/X.mjs not found
Path is relative to the experience directory (the dir containing experience.yaml), not the cwd. If you ran oe run examples/review-branch from the repo root, paths like ./tools/foo.mjs resolve to examples/review-branch/tools/foo.mjs.
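The resolution rule above can be mimicked with the standard URL class, which resolves a relative reference against a base directory. This is a sketch of the rule only; OE's actual resolver is not shown in this FAQ, and the helper name resolveToolImpl is hypothetical.

```javascript
// Hypothetical helper: resolve a tool path against the experience directory
// (the directory containing experience.yaml), not against the cwd.
function resolveToolImpl(experienceDir, implPath) {
  // Trim stray slashes and add a trailing one so the base is treated as a directory.
  const dir = experienceDir.replace(/^\/+|\/+$/g, "");
  const base = new URL(`file:///${dir}/`);
  // Drop the leading "/" so the result is relative to the repo root again.
  return decodeURIComponent(new URL(implPath, base).pathname).slice(1);
}

console.log(resolveToolImpl("examples/review-branch", "./tools/foo.mjs"));
// prints "examples/review-branch/tools/foo.mjs"
```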
Error: Tool returned non-object: undefined
Your tool's default export didn't return a { state_delta } object. Even if you do nothing, return { state_delta: {} }. See Tool stubs.
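A minimal sketch of the return contract follows. Only the { state_delta } return shape comes from this page; the input parameter, the field names, and both function names are assumptions, and in a real tool file the function would be the module's default export.

```javascript
// Hypothetical tool bodies. In a real tool file (e.g. ./tools/noop.mjs) the
// function would be the module's default export. The `input` parameter is an
// assumption, since this FAQ does not show the tool signature.
function noopTool(input) {
  // Nothing to change: still return an object with an empty state_delta.
  return { state_delta: {} };
}

function wordCountTool(input) {
  // Count whitespace-separated words in a hypothetical `text` field.
  const words = String(input?.text ?? "").split(/\s+/).filter(Boolean);
  return { state_delta: { word_count: words.length } };
}

console.log(noopTool({}));                      // { state_delta: {} }
console.log(wordCountTool({ text: "a b  c" })); // { state_delta: { word_count: 3 } }
```

A function that falls off the end without a return statement yields undefined, which is exactly the value named in the error message.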
429 Too Many Requests
The LLM provider is rate-limiting you. The Anthropic and OpenAI clients retry with exponential backoff (250ms, 500ms, 1s, 2s, ...) up to 5 attempts. If you keep hitting 429:
- Lower --concurrency (or the YAML's runtime.concurrency)
- Lower for_each.concurrency on the busy node
- Get a higher-tier API key
See Concurrency + 429 retry for the details.
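To get a feel for how long the built-in retries can take, the doubling schedule described above can be written out. This sketch assumes that "5 attempts" includes the initial request (so there are 4 waits) and ignores jitter and Retry-After headers, neither of which is described here; the function name backoffDelays is hypothetical.

```javascript
// Sketch of a doubling backoff schedule: waits between attempts start at
// 250 ms and double each time. Assumption: maxAttempts counts the first
// request, so 5 attempts means 4 waits. Jitter is deliberately omitted.
function backoffDelays(maxAttempts = 5, baseMs = 250) {
  const delays = [];
  for (let retry = 0; retry < maxAttempts - 1; retry++) {
    delays.push(baseMs * 2 ** retry);
  }
  return delays;
}

const delays = backoffDelays();
console.log(delays);                            // [ 250, 500, 1000, 2000 ]
console.log(delays.reduce((a, b) => a + b, 0)); // 3750 (ms of waiting before giving up)
```

Under these assumptions a node gives up after under four seconds of waiting, so sustained rate limiting needs the concurrency changes listed above rather than more patience.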
TUI shows ? for a node that did run
The TUI tracks nodes by ID. If you edited the YAML between the original run and oe resume, old run files may contain IDs that no longer exist. Easiest fix: start a fresh run with oe run.
oe evolve produced a diff that won't git apply
The evolution advisor's diff isn't pre-validated. If the diff line numbers drifted between when the run happened and when you applied (because you edited the source in between), git apply will reject it. Options:
- Edit by hand using the proposal as guidance (it tells you the operation + rationale)
- Re-run + re-evolve from a clean state to get a fresh diff
- Use git apply --reject to apply hunks that still match and merge the rejected .rej files manually
This is documented as a known V1 limitation at Evolution loop.
Performance & cost
How much does a typical run cost?
Depends entirely on the graph. As reference points from our test suite:
| Example | Tokens (in/out) | API cost (USD, Claude 3.5 Sonnet) |
|---|---|---|
| hello-tool | 0 / 0 | $0.00 |
| agent-echo | ~600 / ~80 | $0.003 |
| review-branch (3 dimensions + verifier + score) | ~12K / ~2K | $0.07 |
| tri-cli-orchestration (Claude + Codex + Gemini chain) | Varies by tier | $0.05 – $0.15 |
| deep-research (multi-source synthesis) | ~40K / ~6K | $0.20 |
Use oe inspect <run-id> to see exact tokens per node. Use node.tokens events to integrate with your own cost tracker. See Observability.
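As a rough sketch of a home-grown cost tracker, the snippet below sums node.tokens events from JSONL text and prices them. The event field names (type, node, input_tokens, output_tokens) are hypothetical, since the event schema is not shown on this page, and the prices are assumed list prices for Claude 3.5 Sonnet ($3 input / $15 output per million tokens) that should be checked against current pricing.

```javascript
// Assumed list prices in USD per million tokens; verify before relying on them.
const PRICE_PER_MTOK = { input: 3, output: 15 };

// Hypothetical event shape: { type, node, input_tokens, output_tokens }.
// Returns per-node token totals and an estimated cost in USD.
function summarizeCost(jsonl) {
  const perNode = {};
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    const event = JSON.parse(line);
    if (event.type !== "node.tokens") continue; // ignore other event types
    const entry = (perNode[event.node] ??= { input: 0, output: 0, usd: 0 });
    entry.input += event.input_tokens;
    entry.output += event.output_tokens;
    entry.usd =
      (entry.input * PRICE_PER_MTOK.input + entry.output * PRICE_PER_MTOK.output) / 1e6;
  }
  return perNode;
}

const sample = [
  '{"type":"node.tokens","node":"review","input_tokens":12000,"output_tokens":2000}',
  '{"type":"node.activity","node":"review"}',
].join("\n");
console.log(summarizeCost(sample)); // review: 12000 in, 2000 out, about $0.066
```

With these assumed prices, 12K input and 2K output tokens come to roughly $0.066, in line with the review-branch row in the table above.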
Why is my run slow?
Three common causes:
- No concurrency. Set runtime.concurrency: 4 (or pass --concurrency 4) so independent nodes can run in parallel.
- Sequential for_each. Default concurrency in for_each is 1. Set concurrency: N to fan out properly.
- Long LLM responses. If you're asking for a 5-page-long summary, that's the bottleneck. Tighten the schema's max sizes or split into multiple agents.
oe inspect <run-id> shows per-node duration. Look for the long tail.
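Why a concurrency of 1 hurts fan-out is easiest to see with a generic bounded-concurrency map. This is not OE's scheduler, just a sketch of the general technique; the function name mapWithConcurrency and the 50 ms sleep standing in for an LLM call are illustrative.

```javascript
// Generic sketch of a bounded-concurrency map: at most `limit` calls to `fn`
// are in flight at once, and results keep the order of `items`.
async function mapWithConcurrency(items, limit, fn) {
  const results = new Array(items.length);
  let nextIndex = 0;
  // Each worker repeatedly claims the next unprocessed index until none remain.
  async function worker() {
    while (nextIndex < items.length) {
      const i = nextIndex++;
      results[i] = await fn(items[i], i);
    }
  }
  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Four 50 ms tasks: roughly 200 ms with limit 1, roughly 50 ms with limit 4.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
mapWithConcurrency([1, 2, 3, 4], 4, async (n) => {
  await sleep(50);
  return n * 2;
}).then((doubled) => console.log(doubled)); // [ 2, 4, 6, 8 ]
```

With a limit of 1 the same four tasks run back to back, which is the sequential for_each case described above.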
Production questions
Can I run this as a service?
Yes. Use the programmatic API (runExperience) inside your service, or oe run as a subprocess. See Deployment for cron / queue / containerized patterns.
How do I monitor production runs?
Subscribe to the EventBus and pipe events to Prometheus / Datadog / your APM of choice. The shape is stable — node.tokens for spend, node.failed for alerts, node.activity for live tracing. See Observability for the recipe.
Is there a hosted version?
No. OpenExpertise is open-source MIT and runs entirely on your infrastructure. Your state, your events, your tokens — never touches our servers.
What about secrets management?
OE has no secrets store of its own; it only reads env vars (ANTHROPIC_API_KEY etc.) at runtime. Manage them however you'd manage any env var: 1Password CLI, AWS Secrets Manager, Doppler, etc.
tool nodes that need their own credentials (a tool calling GitHub API, for example) read those from env too. See Deployment for patterns.
→ Still stuck? Open an issue at github.com/xingchengxu/OpenExpertise/issues — please include oe doctor output and the relevant oe inspect excerpt.