Agent Definition and Usage
Agents encapsulate models, prompting, tools, and other configuration. They can be defined as globals or created at runtime.
Agents use threads to contain the series of messages generated along the way, whether those messages come from a user, another Agent / LLM, or elsewhere. A thread can have multiple Agents responding, or be used by a single Agent.
Agentic workflows are built up by combining contextual prompting (threads, messages, tool responses, RAG, etc.) with dynamic routing via LLM tool calls, structured LLM outputs, or myriad other techniques in custom code.
Basic Agent definition
```ts
import { components } from "./_generated/api";
import { Agent } from "@convex-dev/agent";
import { openai } from "@ai-sdk/openai";

const agent = new Agent(components.agent, {
  name: "Basic Agent",
  languageModel: openai.chat("gpt-4o-mini"),
});
```
See below for more configuration options.
Everything except the name can be overridden at the call site when calling the LLM, and many of the component's features can be used without an Agent at all, if this way of organizing the work doesn't suit your use case.
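For example, here's a sketch of a per-call override; the `model` and `system` values shown are illustrative AI SDK options passed through to the LLM call:

```ts
const result = await agent.generateText(
  ctx,
  { threadId },
  {
    prompt,
    // Overrides the agent's default languageModel for this call only.
    model: openai.chat("gpt-4o"),
    // Overrides the agent's default instructions for this call only.
    system: "Answer as tersely as possible.",
  },
);
```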
Dynamic Agent definition
You can define an Agent at runtime, which is useful if you want to create an Agent for a specific context. Because the tools can close over that context, the LLM can call them without having to pass the full context into each tool call. It also allows dynamically choosing a model or other options for the Agent.
```ts
import { Agent } from "@convex-dev/agent";
import { type LanguageModel } from "ai";
import type { ActionCtx } from "./_generated/server";
import type { Id } from "./_generated/dataModel";
import { components } from "./_generated/api";

function createAuthorAgent(
  ctx: ActionCtx,
  bookId: Id<"books">,
  model: LanguageModel,
) {
  return new Agent(components.agent, {
    name: "Author",
    languageModel: model,
    tools: {
      // Tool factories defined elsewhere that close over ctx and bookId.
      // See https://docs.convex.dev/agents/tools
      getChapter: getChapterTool(ctx, bookId),
      researchCharacter: researchCharacterTool(ctx, bookId),
      writeChapter: writeChapterTool(ctx, bookId),
    },
    maxSteps: 10, // Alternative to stopWhen: stepCountIs(10)
  });
}
```
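You might then create and use the agent per-request inside an action. This `writeNextChapter` action is a hypothetical usage sketch, assuming the imports from the earlier examples (`internalAction`, `v`, `openai`):

```ts
export const writeNextChapter = internalAction({
  args: { bookId: v.id("books"), threadId: v.string() },
  handler: async (ctx, { bookId, threadId }) => {
    // Create the agent for this specific book, then use it like any other.
    const author = createAuthorAgent(ctx, bookId, openai.chat("gpt-4o-mini"));
    await author.generateText(
      ctx,
      { threadId },
      { prompt: "Write the next chapter." },
    );
  },
});
```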
Generating text with an Agent
To generate a message, you provide a prompt (as a string or a list of messages) to be used as context to generate one or more messages via an LLM, using calls like `agent.generateText`, `agent.streamText`, or `agent.generateObject`.
The arguments to `generateText` and the others are the same as in the AI SDK, except you don't have to provide a model: by default, the agent's language model is used. There are also extra arguments specific to the Agent component, such as `promptMessageId`, which we'll see below. See the AI SDK documentation for the full list of arguments.
The message history from the given thread will be provided as context by default. See LLM Context for details on how to configure the context provided.
Note: `authorizeThreadAccess`, referenced below, is a function you would write to authenticate and authorize the user to access the thread. You can see an example implementation in threads.ts.
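As a rough sketch, such a function might look like the following, assuming threads were created with a `userId` and using the component's `threads.getThread` query; the ownership check and error messages are illustrative:

```ts
import type { ActionCtx, MutationCtx, QueryCtx } from "./_generated/server";
import { components } from "./_generated/api";

// A hypothetical sketch, not the canonical implementation.
async function authorizeThreadAccess(
  ctx: QueryCtx | MutationCtx | ActionCtx,
  threadId: string,
) {
  const identity = await ctx.auth.getUserIdentity();
  if (!identity) throw new Error("Not authenticated");
  const thread = await ctx.runQuery(components.agent.threads.getThread, {
    threadId,
  });
  // Assumes threads were created with userId set to the identity's subject.
  if (!thread || thread.userId !== identity.subject) {
    throw new Error("Not authorized to access this thread");
  }
}
```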
See chat/basic.ts or chat/streaming.ts for live code examples.
Streaming text
Streaming text follows the same pattern as the approach below, with a few differences depending on the type of streaming you're doing. See streaming for more details.
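As a preview, here's a sketch of a streaming variant of the basic action below; the `saveStreamDeltas` option persists chunks as they're generated so subscribed clients can read them via queries, and the action name is illustrative:

```ts
// A sketch of streaming with delta persistence; assumes `agent`,
// `internalAction`, and `v` are imported as in the other examples.
export const streamReplyToPrompt = internalAction({
  args: { threadId: v.string(), prompt: v.string() },
  handler: async (ctx, { threadId, prompt }) => {
    const result = await agent.streamText(
      ctx,
      { threadId },
      { prompt },
      // Persist deltas so clients see the text as it streams in.
      { saveStreamDeltas: true },
    );
    // Ensure the stream runs to completion and is fully saved.
    await result.consumeStream();
  },
});
```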
Basic approach (synchronous)
```ts
import { action } from "./_generated/server";
import { v } from "convex/values";

// Assumes `agent` and `authorizeThreadAccess` are defined as described above.
export const generateReplyToPrompt = action({
  args: { prompt: v.string(), threadId: v.string() },
  handler: async (ctx, { prompt, threadId }) => {
    await authorizeThreadAccess(ctx, threadId);
    const result = await agent.generateText(ctx, { threadId }, { prompt });
    return result.text;
  },
});
```
Note: best practice is to not rely on returning data from the action. Instead, query for the thread messages via the `useThreadMessages` hook and receive the new message automatically. See below.
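For instance, a React client might subscribe with a sketch like this, assuming a paginated `listThreadMessages` query like the one in the chat examples:

```tsx
import { useThreadMessages } from "@convex-dev/agent/react";
import { api } from "../convex/_generated/api";

// A client-side sketch; api.chat.basic.listThreadMessages is an assumed
// paginated query over the thread's messages.
function ThreadMessages({ threadId }: { threadId: string }) {
  const messages = useThreadMessages(
    api.chat.basic.listThreadMessages,
    { threadId },
    { initialNumItems: 10 },
  );
  return (
    <ul>
      {messages.results?.map((m) => (
        <li key={m._id}>{m.text}</li>
      ))}
    </ul>
  );
}
```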
Saving the prompt then generating response(s) asynchronously
While the above approach is simple, generating responses asynchronously provides a few benefits:
- You can set up optimistic UI updates on mutations that are transactional, so the message will be shown optimistically on the client until the message is saved and present in your message query (see the sketch after this list).
- You can save the message in the same mutation (transaction) as other writes to the database. This message can then be used and re-used in an action with retries, without duplicating the prompt message in the history. If the `promptMessageId` is used for multiple generations, any previous responses will automatically be included as context, so the LLM can continue where it left off. See workflows for more details.
- Thanks to the idempotent guarantees of mutations, the client can safely retry mutations for days until they run exactly once. Actions can transiently fail.
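Here's a sketch of the optimistic update mentioned in the first point, using `optimisticallySendMessage` from the React helpers; the `api.example.*` references assume the `sendMessage` mutation below and a paginated `listThreadMessages` query:

```tsx
import { useMutation } from "convex/react";
import { optimisticallySendMessage } from "@convex-dev/agent/react";
import { api } from "../convex/_generated/api";

// Inside a React component: the prompt shows up in the message list
// immediately, until the server-saved message replaces it.
const sendMessage = useMutation(api.example.sendMessage).withOptimisticUpdate(
  optimisticallySendMessage(api.example.listThreadMessages),
);
// Later: void sendMessage({ threadId, prompt });
```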
Any clients listing the messages will automatically get the new messages as they are created asynchronously.
To generate responses asynchronously, you need to first save the message, then pass its `messageId` as `promptMessageId` to generate / stream text.
```ts
import { components, internal } from "./_generated/api";
import { saveMessage } from "@convex-dev/agent";
import { internalAction, mutation } from "./_generated/server";
import { v } from "convex/values";

// Step 1: Save a user message, and kick off an async response.
export const sendMessage = mutation({
  args: { threadId: v.string(), prompt: v.string() },
  handler: async (ctx, { threadId, prompt }) => {
    const { messageId } = await saveMessage(ctx, components.agent, {
      threadId,
      prompt,
    });
    await ctx.scheduler.runAfter(0, internal.example.generateResponseAsync, {
      threadId,
      promptMessageId: messageId,
    });
  },
});

// Step 2: Generate a response to a user message.
// Assumes `agent` is defined as above.
export const generateResponseAsync = internalAction({
  args: { threadId: v.string(), promptMessageId: v.string() },
  handler: async (ctx, { threadId, promptMessageId }) => {
    await agent.generateText(ctx, { threadId }, { promptMessageId });
  },
});
```
Note that the action doesn't need to return anything. All messages are saved by default, so any client subscribed to the thread messages will receive the new message as it is generated asynchronously.
The Step 2 code is common enough that there's a utility to save you some typing. It takes in some parameters to control streaming, etc. For more details, see the code.
```ts
// Equivalent to Step 2 above.
export const generateResponseAsync = agent.asTextAction();
```
Generating an object
Similar to the AI SDK, you can generate or stream an object. The same arguments apply, except you don't have to provide a model. It will use the agent's default language model.
```ts
import { z } from "zod/v3";

const result = await agent.generateObject(
  ctx,
  { threadId },
  {
    prompt: "Generate a plan based on the conversation so far",
    schema: z.object({...}),
  },
);
```
Unfortunately, object generation doesn't support using tools. One workaround, however, is to structure your object as the arguments to a tool call that returns the object. You can use a custom `stopWhen` to stop the generation when the tool call produces the result, and use `toolChoice: "required"` to prevent the LLM from returning a text response.
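Here's a sketch of that workaround; the `submitPlan` tool and its schema are hypothetical, and `hasToolCall` is the AI SDK stop condition matching a tool call by name:

```ts
import { tool, hasToolCall } from "ai";
import { z } from "zod/v3";

// A sketch: capture the "object" as the arguments of a hypothetical tool call.
const result = await agent.generateText(ctx, { threadId }, {
  prompt: "Generate a plan based on the conversation so far",
  tools: {
    submitPlan: tool({
      description: "Submit the final plan",
      inputSchema: z.object({ steps: z.array(z.string()) }),
      // No execute handler: we only want the arguments.
    }),
  },
  // Force a tool call rather than a text response.
  toolChoice: "required",
  // Stop as soon as the plan is submitted.
  stopWhen: hasToolCall("submitPlan"),
});
const plan = result.toolCalls.find((c) => c.toolName === "submitPlan")?.input;
```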
Customizing the agent
By default, the agent only needs a chat (language) model to be configured. However, for vector search, you'll also need a `textEmbeddingModel`. A `name` is helpful to attribute each message to a specific agent. The other options are defaults that can be overridden at each LLM call site.
```ts
import { tool, stepCountIs } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod/v3";
import { Agent, createTool, type Config } from "@convex-dev/agent";
import { components } from "./_generated/api";

// Note: contextOptions, storageOptions, and customMessages are assumed to be
// defined elsewhere in this file.
const sharedDefaults = {
  // The language model to use for the agent.
  languageModel: openai.chat("gpt-4o-mini"),
  // Embedding model to power vector search of message history (RAG).
  textEmbeddingModel: openai.embedding("text-embedding-3-small"),
  // Used for fetching context messages. See https://docs.convex.dev/agents/context
  contextOptions,
  // Used for storing messages. See https://docs.convex.dev/agents/messages
  storageOptions,
  // Used for tracking token usage. See https://docs.convex.dev/agents/usage-tracking
  usageHandler: async (ctx, args) => {
    const { usage, model, provider, agentName, threadId, userId } = args;
    // ... log, save usage to your database, etc.
  },
  // Used for filtering, modifying, or enriching the context messages.
  // See https://docs.convex.dev/agents/context
  contextHandler: async (ctx, args) => {
    // Spread args.allMessages so the result stays a flat array of messages.
    return [...customMessages, ...args.allMessages];
  },
  // Useful if you want to log or record every request and response.
  rawResponseHandler: async (ctx, args) => {
    const { request, response, agentName, threadId, userId } = args;
    // ... log, save request/response to your database, etc.
  },
  // Default LLM call settings, e.g. retrying failed LLM requests up to
  // maxRetries times (default: 3) and the sampling temperature.
  callSettings: { maxRetries: 3, temperature: 1.0 },
} satisfies Config;
```
```ts
const supportAgent = new Agent(components.agent, {
  // Attributes each message to this agent.
  name: "Support Agent",
  // The default system prompt if not overridden.
  instructions: "You are a helpful assistant.",
  tools: {
    // Convex tool. See https://docs.convex.dev/agents/tools
    myConvexTool: createTool({
      description: "My Convex tool",
      args: z.object({...}),
      // Note: annotate the return type of the handler to avoid type cycles.
      handler: async (ctx, args): Promise<string> => {
        return "Hello, world!";
      },
    }),
    // Standard AI SDK tool
    myTool: tool({ description, inputSchema, execute: async () => {} }),
  },
  // Used for limiting the number of steps when tool calls are involved.
  // NOTE: if you want tool calls to happen automatically with a single call,
  // you need to set this to something greater than 1 (the default).
  stopWhen: stepCountIs(5),
  ...sharedDefaults,
});
```