Testing Your Actions

Ageniti ships a small, dependency-free testing toolkit at @ageniti/core/test-utils. It works with any runner that has plain assertions — node:test, Vitest, or Jest — because the helpers throw ordinary Errors rather than depending on a specific framework.

The core idea: you test an action once, against the shared runtime, and that behavior holds across every surface (CLI, HTTP, MCP, tool calls, React), because every surface runs through the same runtime.

Quick start

import test from "node:test";
import assert from "node:assert/strict";
import { createTestRuntime, expectOk, expectError } from "@ageniti/core/test-utils";
import { createTask } from "./actions/create-task.js";

test("creates a task", async () => {
  // fakeTaskService() stands in for your own in-memory stub of the task service.
  const t = createTestRuntime([createTask], {
    services: { tasks: fakeTaskService() },
  });

  // expectOk throws on a failure envelope; otherwise it returns envelope.data.
  const data = expectOk(await t.invoke("create_task", { title: "Ship it" }));
  assert.equal(data.title, "Ship it");
});

test("rejects an empty title", async () => {
  const t = createTestRuntime([createTask]);
  expectError(await t.invoke("create_task", { title: "" }), "VALIDATION_ERROR");
});
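
The quick start calls fakeTaskService() without defining it. The following is a minimal sketch of such a stub, assuming the action only needs an in-memory store reachable through ctx.services.tasks. The method names (create, get, all) and the id format are hypothetical; mirror whatever your real service exposes.

```javascript
// Hypothetical in-memory stand-in for a task service. The method names
// (create, get, all) are assumptions; match whatever your action calls.
function fakeTaskService() {
  const tasks = new Map();
  let nextId = 1;

  return {
    // Stores the task and returns it with a generated id.
    async create({ title }) {
      const task = { id: `t_${nextId}`, title };
      nextId += 1;
      tasks.set(task.id, task);
      return task;
    },
    // Returns the stored task, or null when the id is unknown.
    async get(id) {
      return tasks.get(id) ?? null;
    },
    // Test-only helper: everything created so far, in insertion order.
    all() {
      return [...tasks.values()];
    },
  };
}
```

Because each call to fakeTaskService() builds a fresh Map, every test starts from an empty store and tests cannot leak state into one another.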

createTestRuntime(actions, options?)

Spins up a runtime preconfigured for tests:

  • all actions are auto-registered
  • default surface is json, so no UI is assumed
  • confirmation is bypassed by default, so destructive actions run without a confirm flag in tests

Options:

| Option | Purpose |
| --- | --- |
| services | Inject dependency stubs, available as ctx.services inside run. |
| allow | Simulate permission outcomes. { allow: false } denies everything; pass a function or string to control the permissionChecker. |
| middleware | Provide middleware to exercise cross-cutting logic. |
| hooks | Provide lifecycle hooks. |
| redact | Custom redaction keys. |
| idempotencyCache | Provide a cache to test idempotent replays. |

The returned object exposes:

  • runtime — the underlying runtime, if you need direct access
  • invoke(name, input?, options?) — invoke an action, returns the result envelope
  • stream(name, input?, options?) — invoke and get the live event stream

Assertion helpers

import { expectOk, expectError, expectLog, collectStream } from "@ageniti/core/test-utils";

  • expectOk(envelope) — asserts success and returns envelope.data.
  • expectError(envelope, code?) — asserts failure; if code is given, also asserts the error code (e.g. "VALIDATION_ERROR", "PERMISSION_DENIED").
  • expectLog(envelope, matcher) — asserts a log entry exists. matcher can be a substring, a RegExp, or a predicate function.
  • collectStream(stream) — drains an async event stream into an array so you can assert the full sequence of log / progress / artifact / result events.
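
To make the "plain Errors, no framework" point concrete, here is an illustrative re-implementation of three of the helpers. It is a sketch, not the library's source. It assumes an envelope shaped like { ok, data, error: { code }, logs: [{ message }] }; only ok and data are confirmed by this page, so the error.code and logs paths are assumptions.

```javascript
// Sketch only: NOT the library's source. Assumed envelope shape:
//   success: { ok: true, data, logs }
//   failure: { ok: false, error: { code, message }, logs }
function expectOk(envelope) {
  if (!envelope.ok) {
    throw new Error(`expected ok, got error ${envelope.error?.code}`);
  }
  return envelope.data;
}

function expectError(envelope, code) {
  if (envelope.ok) throw new Error("expected an error, got ok");
  // The code check only runs when a code was passed in.
  if (code !== undefined && envelope.error?.code !== code) {
    throw new Error(`expected error code ${code}, got ${envelope.error?.code}`);
  }
  return envelope.error;
}

// Normalizes the three matcher forms (function, RegExp, substring)
// into a single predicate over a log entry.
function toPredicate(matcher) {
  if (typeof matcher === "function") return matcher;
  if (matcher instanceof RegExp) {
    // search() ignores lastIndex, so global regexes behave predictably.
    return (entry) => entry.message.search(matcher) !== -1;
  }
  return (entry) => entry.message.includes(matcher);
}

function expectLog(envelope, matcher) {
  const entry = (envelope.logs ?? []).find(toPredicate(matcher));
  if (!entry) throw new Error("no matching log entry");
  return entry;
}
```

Since a failed expectation is just a thrown Error, any runner that treats a thrown exception as a test failure can use helpers of this kind unchanged.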

Testing streaming behavior

import test from "node:test";
import assert from "node:assert/strict";
import { createTestRuntime, collectStream } from "@ageniti/core/test-utils";
// longRunningAction is assumed to be imported from your own actions module.

test("emits progress then a result", async () => {
  const t = createTestRuntime([longRunningAction]);
  const events = await collectStream(t.stream("reindex", { full: true }));

  const types = events.map((e) => e.type);
  assert.ok(types.includes("progress"));
  // The final event carries the result envelope.
  assert.equal(types.at(-1), "result");
  assert.equal(events.at(-1).envelope.ok, true);
});
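
To show what draining a stream amounts to, here is a sketch of a collectStream-style helper together with a fake event source. It assumes the stream is an async iterable, which this page implies but does not state outright. fakeReindexStream and the percent and reindexed fields are hypothetical, and this is not the library's implementation.

```javascript
// Sketch only: drains any async iterable into an array, preserving order.
async function collectStream(stream) {
  const events = [];
  for await (const event of stream) {
    events.push(event);
  }
  return events;
}

// Hypothetical stand-in for t.stream(...): two progress events, then a result.
async function* fakeReindexStream() {
  yield { type: "progress", percent: 50 };
  yield { type: "progress", percent: 100 };
  yield { type: "result", envelope: { ok: true, data: { reindexed: 2 } } };
}
```

Calling await collectStream(fakeReindexStream()) yields an array of three events whose last element has type "result", which is the shape the test above asserts against.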

Testing permissions

test("denies without the right permission", async () => {
  const t = createTestRuntime([deleteTask], { allow: false });
  expectError(await t.invoke("delete_task", { taskId: "t_1" }), "PERMISSION_DENIED");
});

Stubbing dependencies

Use stubAction(name, options) to build a controllable fake action — handy when testing middleware or composition without wiring real implementations:

import { stubAction, createTestRuntime, expectOk } from "@ageniti/core/test-utils";
 
const stub = stubAction("charge_card", { returns: { receiptId: "r_1" } });
const t = createTestRuntime([stub]);
expectOk(await t.invoke("charge_card", {}));

Why this is enough

Because every surface (CLI, HTTP, MCP, OpenAI / AI SDK tool calls, React) is a thin adapter over the same runtime, a passing action test means the capability is correct everywhere it is exposed. You do not need to write separate tests per surface.