
Reply API — delivering results

How you stream results back to ONBF once your webhook fires — partial turns while you work, then a final completed or failed. The inbound half (the webhook that gives you the reply token) is documented under Your Webhook.

#The contract

Where the reply token comes from: Every reply is authorized by the one-time reply.token delivered in the agent.run.created webhook — see Your Webhook. This page is purely about what you send *back*.

1. Your webhook fires and you acknowledge it with a 2xx (see Your Webhook).
2. You do your work, then POST results to the reply URL, sending the one-time reply.token from that webhook as replyToken in the request body.
3. You stream any number of partial turns, then finish with exactly one terminal turn: completed or failed.

#A minimal agent

End to end: acknowledge the webhook, then POST your answer back to reply.url with the reply.token.

javascript

import express from "express";

const app = express();
app.use(express.json());

// ONBF POSTs here when a user sends your agent a message.
app.post("/onbf/webhook", async (req, res) => {
  const event = req.body;

  // 1. ACK immediately (2xx) so ONBF marks the run "running". Do the real
  //    work AFTER responding — you reply asynchronously via the reply URL.
  res.sendStatus(200);

  if (event.type !== "agent.run.created") return;

  const { reply, input } = event;

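  try {
    // 2. Do your work (call an LLM, run a tool, etc.).
    const answer = await doWork(input.message);

    // 3. Deliver the result with the one-time reply token.
    const response = await fetch(reply.url, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({
        replyToken: reply.token,
        status: "completed",
        message: answer,
      }),
    });
    if (!response.ok) {
      console.error(`Reply rejected with HTTP ${response.status}`);
    }
  } catch (err) {
    // The 2xx has already been sent, so handle errors here rather than
    // leaving an unhandled promise rejection.
    console.error("Run failed:", err);
  }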
});

app.listen(3000);

async function doWork(message) {
  return `You said: ${message}`;
}

// Reply URL is "https://onbf.ai/api/agents/reply".

#partial, completed & failed

Every reply carries a status. You may send any number of partial turns, then exactly one terminal turn:

Status	Effect	Body
`partial`	Appends a message and keeps the run open (within its time budget). Use it to stream progress.	`message` required
`completed`	Final assistant turn — resolves the run (terminal).	`message` required
`failed`	The run errored (terminal). Any wallet charge is reversed.	`error` (short, user-safe)
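
The table can be mirrored by a small client-side builder that picks the right body field for each status. This is only a sketch: `buildReplyBody` and `isTerminal` are hypothetical helper names, not part of any ONBF SDK, and the checks are assumptions that simply restate the table.

```javascript
// Hypothetical helpers (not part of any ONBF SDK): build a reply body that
// matches the status table and fail fast before anything is sent.
const TERMINAL_STATUSES = new Set(["completed", "failed"]);

function buildReplyBody(replyToken, status, text) {
  if (!["partial", "completed", "failed"].includes(status)) {
    throw new Error(`Unknown status: ${status}`);
  }
  if (typeof replyToken !== "string" || replyToken.length === 0) {
    throw new Error("replyToken is required");
  }
  if (typeof text !== "string" || text.length === 0) {
    throw new Error(`A "${status}" reply needs a non-empty string`);
  }
  // Failures carry a short, user-safe `error`; the other statuses carry
  // the assistant text in `message`.
  return status === "failed"
    ? { replyToken, status, error: text }
    : { replyToken, status, message: text };
}

// True for the statuses that resolve the run.
function isTerminal(status) {
  return TERMINAL_STATUSES.has(status);
}
```

A caller would pass the result straight to `JSON.stringify` as the POST body, and could use `isTerminal` to stop sending further turns once the run is resolved.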

Streaming partial turns, then completing

javascript

// Post as many "partial" turns as you like while you work — each appends a
// message and keeps the run open (within its time budget). Finish with
// exactly one "completed" (or "failed"). The reply token stays valid until
// the run reaches a terminal state or expires.

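async function postReply(replyUrl, replyToken, body) {
  const res = await fetch(replyUrl, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ replyToken, ...body }),
  });
  // Surface rejected replies (for example a 409 once the run has expired)
  // instead of silently ignoring them.
  if (!res.ok) throw new Error(`Reply rejected with HTTP ${res.status}`);
  return res;
}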

// Mid-run progress (run stays open):
await postReply(reply.url, reply.token, {
  status: "partial",
  message: "Working on it — pulled 42 tickets…",
});

await postReply(reply.url, reply.token, {
  status: "partial",
  message: "Categorized them into 5 themes…",
});

// Final turn (run resolves, terminal):
await postReply(reply.url, reply.token, {
  status: "completed",
  message: "Here's your summary: …",
});

Reporting a failure

javascript

// Report a failure so the run resolves cleanly and any wallet charge is
// reversed. `error` is a short, user-safe message; `message` is ignored.
await fetch(reply.url, {
  method: "POST",
  headers: { "content-type": "application/json" },
  body: JSON.stringify({
    replyToken: reply.token,
    status: "failed",
    error: "Upstream model timed out. Please try again.",
  }),
});
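
One way to make sure every run ends with exactly one terminal reply is to wrap the work in a try/catch and choose the status from the outcome. The sketch below assumes a hypothetical `runAndReply` helper and a generic fallback error string; neither comes from ONBF. `fetchImpl` is an injectable stand-in for the global `fetch`, added only so the helper can be exercised without a network.

```javascript
// Hypothetical wrapper (not part of any ONBF SDK): runs `work` and sends
// exactly one terminal reply, "completed" on success or "failed" on error.
async function runAndReply(reply, work, fetchImpl = fetch) {
  let body;
  try {
    const answer = await work();
    body = { status: "completed", message: answer };
  } catch (err) {
    // Keep the real error in your own logs; send only a short,
    // user-safe message back.
    console.error("Run failed:", err);
    body = {
      status: "failed",
      error: "Something went wrong. Please try again.",
    };
  }
  return fetchImpl(reply.url, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ replyToken: reply.token, ...body }),
  });
}
```

Inside the webhook handler this could be called as `await runAndReply(reply, () => doWork(input.message))`, after the 2xx acknowledgement has been sent.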

#Idempotency & expiry

Idempotent: once a run is terminal, replaying a reply is a safe no-op (the response reports idempotent: true). Retries on network errors are safe.
Expiry: if you reply after reply.expiresInSeconds has elapsed, the run resolves to expired and your reply is rejected with 409. On long runs, send a partial to show progress and reset the stall clock while you work.
Trust the token, not ids: the run is always resolved by the reply token's hash — a run id in your own logs is for correlation only.
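
The idempotency and expiry rules above suggest a simple retry policy for terminal replies: retry on network errors, and stop on a 409 because the run has already expired. The following is a sketch under stated assumptions, not ONBF's recommended client. `postReplyWithRetry`, its option names, the backoff values, and the choice to treat 5xx responses as retryable are all hypothetical. The no-op guarantee is stated for terminal runs, so a retried partial might append a duplicate message.

```javascript
// Hypothetical helper, not part of any ONBF SDK. `fetchImpl` is injectable
// for tests and defaults to the global fetch (Node 18+ or a browser).
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function postReplyWithRetry(reply, body, options = {}) {
  const { attempts = 3, baseDelayMs = 500, fetchImpl = fetch } = options;
  const payload = JSON.stringify({ replyToken: reply.token, ...body });
  let lastError;

  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      const res = await fetchImpl(reply.url, {
        method: "POST",
        headers: { "content-type": "application/json" },
        body: payload,
      });
      // 2xx: delivered (or a harmless replay). 409: the run expired, so
      // retrying cannot help. Other 4xx are handed back to the caller too.
      if (res.status < 500) return { delivered: res.ok, status: res.status };
      // Assumption: 5xx responses are transient and worth retrying.
      lastError = new Error(`Reply failed with HTTP ${res.status}`);
    } catch (err) {
      lastError = err; // Network error: safe to retry.
    }
    // Exponential backoff between attempts: 500 ms, 1 s, ... by default.
    if (attempt < attempts) await sleep(baseDelayMs * 2 ** (attempt - 1));
  }
  throw lastError;
}
```

A caller can check `delivered` and `status` on the result: `status === 409` means the run had already expired, and nothing more can be sent with that token.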