Code-as-Law
YAML schema validates the graph structure before runtime. LLMs only fill the gaps inside nodes; they can't rewrite the flow at runtime. No drift, no surprises, the same DAG every run.
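To make "validated before runtime" concrete, here is a minimal sketch of the kind of structural checks such a step could perform on an already-parsed graph. It is not OpenExpertise's actual validator: the function name `validate_graph` and the specific checks are assumptions, and the dict shape simply mirrors the `nodes` / `edges` layout used in this README's graph example.

```python
from graphlib import CycleError, TopologicalSorter


def validate_graph(graph: dict) -> list[str]:
    """Return structural errors; an empty list means the graph looks valid.

    Hypothetical helper for illustration only, not the OpenExpertise API.
    """
    errors = []
    ids = [node.get("id") for node in graph.get("nodes", [])]

    # Every node id must be unique.
    for dup in sorted({i for i in ids if ids.count(i) > 1}, key=str):
        errors.append(f"duplicate node id: {dup}")

    # Every edge must connect two declared nodes.
    known = set(ids)
    deps = {node_id: set() for node_id in known}
    for edge in graph.get("edges", []):
        src, dst = edge.get("from"), edge.get("to")
        for end in (src, dst):
            if end not in known:
                errors.append(f"edge references unknown node: {end}")
        if src in known and dst in known:
            deps[dst].add(src)

    # The flow must be a DAG: prepare() raises CycleError on any cycle.
    try:
        TopologicalSorter(deps).prepare()
    except CycleError:
        errors.append("graph contains a cycle")
    return errors
```

Because the checks run on plain data before any node executes, a malformed flow can be rejected without spending a single LLM call.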
Codify expert workflows as version-controlled YAML graphs. Run them with deterministic flow + LLM-powered nodes. Let the LLM evolve the graph after each run.
OpenExpertise is the orchestration layer above Claude Code, Codex, and Gemini.
| Claude Code / Codex / Gemini ("AI bash") | OpenExpertise ("AI Makefile") |
|---|---|
| improvised each run | same DAG every run |
| opaque trajectory | JSONL event log + SQLite state |
| one-shot, no memory | evolves itself across runs |
| general-purpose | codifies a specific SOP |
| autonomous worker | workflow conductor (can call the workers) |

It is not an autonomous agent. It is the orchestration layer that lets you wire deterministic code, LLM agents, and CLI agents into reproducible, persistent, self-improving pipelines.
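The "JSONL event log + SQLite state" pairing can be sketched with the Python standard library alone. This is an illustration under assumptions, not the project's real storage layer: the event fields, the `state` table schema, and the helper names `record_event`, `open_state`, `save_state`, and `load_state` are all hypothetical.

```python
import json
import sqlite3
import time
from pathlib import Path


def record_event(log_path: Path, run_id: str, node: str, status: str) -> None:
    """Append one event as a single JSON line (append-only, easy to replay)."""
    event = {"ts": time.time(), "run_id": run_id, "node": node, "status": status}
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")


def open_state(path: str) -> sqlite3.Connection:
    """Open (or create) the state database; the schema is an assumption."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS state ("
        "run_id TEXT, key TEXT, value TEXT, PRIMARY KEY (run_id, key))"
    )
    return db


def save_state(db: sqlite3.Connection, run_id: str, key: str, value) -> None:
    """Upsert one JSON-encoded value for a run."""
    db.execute(
        "INSERT OR REPLACE INTO state VALUES (?, ?, ?)",
        (run_id, key, json.dumps(value)),
    )
    db.commit()


def load_state(db: sqlite3.Connection, run_id: str) -> dict:
    """Read back every key written during the given run."""
    rows = db.execute(
        "SELECT key, value FROM state WHERE run_id = ?", (run_id,)
    ).fetchall()
    return {key: json.loads(value) for key, value in rows}
```

The split is deliberate in this sketch: the log answers "what happened, in what order", while the database answers "what does the run know right now".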
If your team has a SOP that someone has to follow every Monday morning (code review, incident triage, release gates, compliance checks, customer onboarding), and you want it to run the same way every time, leave a trail, and get better with each run, this is for you.
    export ANTHROPIC_API_KEY=sk-...
    oe run examples/review-branch --tui

Three reviewers (bugs / perf / tests) fan out over a Python diff. They find a missing null-check, a missing test, and an unclosed cursor, but they miss the SQL injection.

    ✓ run-2026-05-26-a1b2c3 finished
    findings: 3 issues
    risk_score: 0.30

Now ask the evolution advisor what's missing:

    oe evolve run-2026-05-26-a1b2c3
    # → wrote .openexpertise/evolution/run-2026-05-26-a1b2c3.md
    # proposal: "Add `security` dimension: default reviewers focus on
    # logic/tests; injection bugs need a dedicated reviewer."

Apply the one-line YAML patch from the proposal and re-run:

    ✓ run-2026-05-26-d4e5f6 finished
    findings: 4 issues (+ SQL injection in /users/<id>)
    risk_score: 0.85

The experience improved itself. Author → run → evolve, all driven by the same LLM provider.

→ Full walkthrough: examples/review-branch.
No other workflow framework does this today.
    graph:
      nodes:
        - {
            id: summarize,
            kind: cli-agent,
            provider: claude-code,
            prompt: 'Summarize this topic in one sentence: {{topic}}',
            writes: [summary],
          }
        - {
            id: critique,
            kind: cli-agent,
            provider: codex,
            prompt: 'What does this summary miss? {{summary}}',
            reads: [summary],
            writes: [critique],
          }
        - {
            id: verdict,
            kind: cli-agent,
            provider: gemini,
            prompt: 'Verdict on production-readiness given: {{summary}} + {{critique}}',
            reads: [summary, critique],
            writes: [verdict],
          }
      edges:
        - { from: summarize, to: critique }
        - { from: critique, to: verdict }

One DAG, three vendors, shared SQLite state, replayable event log. 37s real wall time, three CLIs, one trace.

→ See it in action: examples/tri-cli-orchestration.
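How a graph like this could execute is easiest to see in a few lines of code. The sketch below is an assumption-laden illustration, not the real engine: `run_graph`, `render`, and `fake_provider` are hypothetical names, the shared state is a plain dict rather than SQLite, `{{name}}` placeholders resolve against the whole state (the `reads` lists are not enforced), and the provider call is a stub instead of a real CLI invocation.

```python
import re
from graphlib import TopologicalSorter

# A trimmed copy of the three-node graph, already parsed into Python data.
GRAPH = {
    "nodes": [
        {"id": "summarize", "provider": "claude-code",
         "prompt": "Summarize: {{topic}}", "writes": ["summary"]},
        {"id": "critique", "provider": "codex",
         "prompt": "Misses? {{summary}}", "writes": ["critique"]},
        {"id": "verdict", "provider": "gemini",
         "prompt": "Verdict: {{summary}} + {{critique}}", "writes": ["verdict"]},
    ],
    "edges": [
        {"from": "summarize", "to": "critique"},
        {"from": "critique", "to": "verdict"},
    ],
}


def render(prompt: str, state: dict) -> str:
    """Replace each {{name}} placeholder with the matching state value."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(state[m.group(1)]), prompt)


def run_graph(graph: dict, state: dict, call_provider) -> dict:
    """Run nodes in dependency order; each node writes its output to state."""
    nodes = {node["id"]: node for node in graph["nodes"]}
    deps = {node_id: set() for node_id in nodes}
    for edge in graph["edges"]:
        deps[edge["to"]].add(edge["from"])
    # The order comes from the edges alone, never from a model's output.
    for node_id in TopologicalSorter(deps).static_order():
        node = nodes[node_id]
        output = call_provider(node["provider"], render(node["prompt"], state))
        for key in node.get("writes", []):
            state[key] = output
    return state


def fake_provider(provider: str, prompt: str) -> str:
    """Stand-in for a vendor CLI call; echoes which provider was asked what."""
    return f"[{provider}] {prompt}"
```

Calling `run_graph(GRAPH, {"topic": "DAGs"}, fake_provider)` walks the nodes in edge order and returns the final state with `summary`, `critique`, and `verdict` filled in, which is the property the "same DAG every run" claim rests on.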
| Example | What it shows | Featuring |
|---|---|---|
| hello-tool | Smallest possible flow | tool |
| agent-echo | Single LLM agent with structured output | agent |
| dataset-aggregate | Load CSV → aggregate | dataset + tool |
| review-branch ★ | The hero demo: multi-dim review + verifier + score + evolution | tool + agent ×3 |
| oncall-runbook | Fan out an investigation across 3 dimensions | for_each |
| issue-triage | Classify → search dupes → conditional dedup → route | when: edges |
| release-gates | License + changelog + coverage + Claude-Code security scan → gate | tool + cli-agent + agent |
| cli-orchestration | Claude Code summarizes; Codex critiques | cli-agent ×2 |
| tri-cli-orchestration ★ | Claude → Codex → Gemini in one DAG | cli-agent ×3 |
| deep-research | Multi-source research with cross-referencing | agent fan-in |
| systematic-debugging | Hypothesize → localize → fix → verify loop | tool + agent |
| brainstorming | Diverge → cluster → critique → synthesize top 3 | cli-agent fan-out + agent |
All 12 examples ship with mocked-LLM e2e tests so the structure is verifiable without API keys.
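A rough idea of what "verifiable without API keys" can look like: the sketch below fans a review out across dimensions and swaps the model for a canned stub. Everything here is hypothetical and for illustration only: `fan_out`, `mock_llm`, the prompt format, and the `CANNED` data (including which reviewer reports which finding) are assumptions, not the project's test harness.

```python
from concurrent.futures import ThreadPoolExecutor

# Canned answers standing in for model output; the mapping is illustrative.
CANNED = {
    "bugs": ["missing null-check"],
    "perf": ["unclosed cursor"],
    "tests": ["missing test"],
}


def mock_llm(prompt: str) -> list[str]:
    """Return canned findings based on the dimension named in the prompt."""
    first_line = prompt.splitlines()[0]
    dimension = first_line.removeprefix("Review for ").removesuffix(":")
    return CANNED.get(dimension, [])


def fan_out(dimensions: list[str], diff: str, llm) -> dict:
    """Run one reviewer per dimension in parallel, keyed by dimension."""
    with ThreadPoolExecutor() as pool:
        # pool.map yields results in input order, so zip pairs them correctly.
        results = pool.map(lambda d: llm(f"Review for {d}:\n{diff}"), dimensions)
        return dict(zip(dimensions, results))
```

With the stub in place, a test can assert on the shape of the result (one entry per dimension, findings merged as expected) while making no network calls at all.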
Automate a recurring, multi-step process that mixes deterministic logic + LLM judgment?

- NO → Use Claude Code or Codex directly.
- YES:
  - One-shot exploration? → Use Claude Code.
  - Need it durable, reproducible, evolvable? → Use OpenExpertise.
If you want a chat-based assistant or one-off task automation, use the underlying CLI directly (Claude Code, Codex, Gemini). OpenExpertise sits above those tools, not beside them.
→ Compare in detail: vs the alternatives.
Build expert workflows once. Run them forever. Watch them get better at it.
Start building → · Copy a recipe → · Got a question?