`LLMClient`

The provider-neutral abstraction that LLM-backed dispatchers (agent, skill) use to request completions. Real implementations live in @openexpertise/node-kinds-agent and @openexpertise/llm-openai; tests inject fake clients with canned responses.

Import

import type { LLMClient } from '@openexpertise/core'

Signature

export interface LLMMessage {
  role: 'user' | 'assistant'
  content: string
}

export interface LLMTool {
  name: string
  description: string
  input_schema: Record<string, unknown>
}

export interface LLMCompleteOpts {
  model: string
  system?: string
  messages: LLMMessage[]
  tools?: LLMTool[]
  max_tokens?: number
}

export interface LLMToolCall {
  name: string
  input: unknown
}

export interface LLMUsage {
  input_tokens: number
  output_tokens: number
}

export interface LLMCompleteResult {
  text: string
  tool_calls?: LLMToolCall[]
  usage?: LLMUsage
  stop_reason?: string
}

export interface LLMClient {
  complete(opts: LLMCompleteOpts): Promise<LLMCompleteResult>
}

`complete` parameters

Name	Type	Required	Description
`model`	`string`	✓	Model identifier string, passed verbatim to the provider (e.g. `"claude-sonnet-4-6"`, `"gpt-4o"`).
`messages`	`LLMMessage[]`	✓	Conversation turns. Each turn has a `role` (`'user'` or `'assistant'`) and a `content` string.
`system`	`string`	—	System prompt sent before the message list.
`tools`	`LLMTool[]`	—	Tool definitions to expose. Each tool has a `name`, `description`, and a JSON Schema `input_schema`.
`max_tokens`	`number`	—	Maximum tokens in the completion. Defaults to `4096` in most implementations.

`complete` return type

Field	Type	Description
`text`	`string`	Concatenated text content of all `text`-type blocks in the response. May be empty when the model only returns tool calls.
`tool_calls`	`LLMToolCall[]`	Tool invocations requested by the model. Present only when the model returned at least one tool call.
`usage`	`LLMUsage`	Token counts for the request. Present when the provider reports usage.
`stop_reason`	`string`	Provider-specific stop reason string (e.g. `"end_turn"`, `"tool_use"`, `"stop"`).
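
A minimal sketch of how a caller might read the optional fields in the table above. The interfaces are re-declared locally so the snippet stands alone, and `summarize` is a hypothetical helper for illustration, not part of `@openexpertise/core`.

```typescript
// Local copies of the documented shapes so the snippet is self-contained.
interface LLMToolCall { name: string; input: unknown }
interface LLMUsage { input_tokens: number; output_tokens: number }
interface LLMCompleteResult {
  text: string
  tool_calls?: LLMToolCall[]
  usage?: LLMUsage
  stop_reason?: string
}

// Hypothetical helper: reduce a result to a one-line summary.
function summarize(result: LLMCompleteResult): string {
  // tool_calls is omitted when the model requested no tools, so default to [].
  const calls = result.tool_calls ?? []
  // usage is optional; count zero tokens when the provider does not report it.
  const total = (result.usage?.input_tokens ?? 0) + (result.usage?.output_tokens ?? 0)
  const body =
    calls.length > 0
      ? `tools: ${calls.map((c) => c.name).join(', ')}`
      : `text: ${result.text}`
  return `${body} (${total} tokens)`
}
```

Checking `tool_calls` before `text` reflects the note above that `text` may be empty when the model only returns tool calls.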

Provided implementations

`AnthropicLLMClient` (`@openexpertise/node-kinds-agent`)

import { AnthropicLLMClient } from '@openexpertise/node-kinds-agent'

const client = new AnthropicLLMClient({
  apiKey: process.env.ANTHROPIC_API_KEY, // falls back to env var if omitted
  retry: { max_attempts: 4, base_ms: 1000 }, // exponential back-off on 429
})

Reads ANTHROPIC_API_KEY from the environment when apiKey is not supplied. Retries up to max_attempts times with exponential back-off on HTTP 429 / RateLimitError. Uses the @anthropic-ai/sdk under the hood.

`OpenAILLMClient` (`@openexpertise/llm-openai`)

import { OpenAILLMClient } from '@openexpertise/llm-openai'

const client = new OpenAILLMClient({
  apiKey: process.env.OPENAI_API_KEY,
  retry: { max_attempts: 4, base_ms: 1000 },
})

Compatible with any OpenAI-format endpoint (OpenAI, Azure OpenAI, vLLM, Ollama). Strips <think>…</think> prefixes from tool-call argument strings before JSON parsing — a quirk of some reasoning-mode servers. See Self-hosted LLMs for vLLM / Ollama setup.
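
The stripping step can be pictured roughly as follows. This is an illustrative sketch, not the package's actual code: `parseToolArguments` is a hypothetical name, and the pattern assumes a single reasoning block at the very start of the argument string.

```typescript
// Hypothetical helper: drop a leading <think>…</think> block, then parse the JSON.
function parseToolArguments(raw: string): unknown {
  // [\s\S] matches across newlines; the lazy quantifier stops at the first </think>.
  const cleaned = raw.replace(/^\s*<think>[\s\S]*?<\/think>\s*/, '')
  return JSON.parse(cleaned)
}
```

Strings without the prefix pass through unchanged, so the same path can serve servers that never emit reasoning text.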

Writing your own implementation

Any object that satisfies the LLMClient interface is acceptable:

import type { LLMClient, LLMCompleteOpts, LLMCompleteResult } from '@openexpertise/core'

class MyClient implements LLMClient {
  async complete(opts: LLMCompleteOpts): Promise<LLMCompleteResult> {
    // call your provider's API
    return { text: 'hello from my provider' }
  }
}

Inject it wherever a dispatcher that needs an LLMClient is constructed:

import { AgentDispatcher } from '@openexpertise/node-kinds-agent'
import { SkillDispatcher } from '@openexpertise/node-kinds-skill'

dispatchers.register(new AgentDispatcher({ client: new MyClient() }))
dispatchers.register(new SkillDispatcher({ client: new MyClient() }))
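
For tests, the same pattern gives a fake client with canned responses. The sketch below is one possible shape, with the interfaces re-declared locally (and trimmed) so it stands alone; `FakeLLMClient` is a hypothetical name, not an export of any package mentioned here.

```typescript
// Trimmed local copies of the documented interfaces.
interface LLMMessage { role: 'user' | 'assistant'; content: string }
interface LLMCompleteOpts { model: string; system?: string; messages: LLMMessage[]; max_tokens?: number }
interface LLMCompleteResult { text: string; stop_reason?: string }
interface LLMClient { complete(opts: LLMCompleteOpts): Promise<LLMCompleteResult> }

// Hypothetical test double: replays canned results in order and records every call.
class FakeLLMClient implements LLMClient {
  readonly calls: LLMCompleteOpts[] = []
  private readonly responses: LLMCompleteResult[]

  constructor(responses: LLMCompleteResult[]) {
    this.responses = responses
  }

  async complete(opts: LLMCompleteOpts): Promise<LLMCompleteResult> {
    this.calls.push(opts)
    // The n-th call receives the n-th canned response.
    const response = this.responses[this.calls.length - 1]
    if (response === undefined) {
      throw new Error('FakeLLMClient: no canned response left')
    }
    return response
  }
}
```

Recording `calls` lets a test assert on the model, system prompt, and messages a dispatcher actually sent, without any network access.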

Behavior notes

Tool routing. AnthropicLLMClient and OpenAILLMClient both force tool use when exactly one tool is provided (Anthropic: tool_choice: { type: 'tool', name }; OpenAI: tool_choice: { type: 'function', function: { name } }). This is how EvolutionAdvisor and UltraExpertise guarantee structured output.
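
The single-tool rule might look like this for the Anthropic request shape, where the API expects the tool name alongside `type: 'tool'`. `forcedToolChoice` is a hypothetical helper written for illustration; the built-in clients may structure this differently, and returning `undefined` here simply stands for "leave the provider default in place".

```typescript
interface LLMTool {
  name: string
  description: string
  input_schema: Record<string, unknown>
}

// Hypothetical helper: force a tool only when exactly one is offered.
function forcedToolChoice(tools?: LLMTool[]): { type: 'tool'; name: string } | undefined {
  // Zero tools or several tools: do not force anything.
  const only = tools?.length === 1 ? tools[0] : undefined
  return only ? { type: 'tool', name: only.name } : undefined
}
```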

Retry. Both built-in clients retry on 429 with exponential back-off: delay = base_ms * 2^(attempt-1). The attempt counter starts at 1. If the attempt numbered max_attempts also fails, the error is re-thrown.
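
As a worked example of the formula, reading it as the wait that follows a failed attempt (`backoffDelayMs` is a hypothetical name used only for this sketch):

```typescript
// Delay before the retry that follows a failed attempt (attempt is 1-based).
function backoffDelayMs(attempt: number, base_ms: number): number {
  return base_ms * 2 ** (attempt - 1)
}

// With { max_attempts: 4, base_ms: 1000 }, attempts 1 to 3 are each followed by a wait;
// a failure on attempt 4 is re-thrown without waiting.
const delays = [1, 2, 3].map((attempt) => backoffDelayMs(attempt, 1000))
console.log(delays) // [ 1000, 2000, 4000 ]
```

Under this reading, a request that keeps hitting 429 with the settings above spends about 7 seconds waiting before the error surfaces.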

Error transparency. Errors from the underlying SDK (network failure, 500, auth failure) are not wrapped — they propagate as-is so that the node's on_error policy can handle them.

Source

packages/core/src/llm/client.ts
