Does Truto automatically retry Vapi API calls when rate limits are hit?

No. Truto does not retry, throttle, or apply backoff on rate limit errors. It directly passes the HTTP 429 status code from Vapi to the caller and standardizes the upstream rate limit information into standard HTTP headers (ratelimit-limit, ratelimit-remaining, ratelimit-reset). The calling agent framework must handle the retry logic.

How does Truto handle Vapi recording downloads that return 302 redirects?

Vapi's recording endpoints typically return a 302 redirect to a short-lived presigned URL rather than structured JSON. Truto normalizes this behavior, allowing the AI agent to receive the URL cleanly without failing due to unexpected transport-layer HTTP redirects.

Can I filter the tools fetched from Truto based on read or write operations?

Yes. The Truto /tools endpoint accepts query parameters like `methods[0]=read` to filter the returned schemas, allowing you to give an AI agent strictly read-only access to specific Vapi resources.

Is this tool integration limited to LangChain or MCP?

No. The Truto /tools endpoint generates standard JSON schemas that map natively to any agent framework that supports OpenAI-compatible function calling, including CrewAI, LangGraph, and the Vercel AI SDK.

Connect Vapi to AI Agents: Automate Campaigns and Phone Numbers

You want to connect Vapi to an AI agent so your system can autonomously spin up outbound campaigns, provision phone numbers, fetch call recordings, and analyze transcripts based on real-time triggers. Here is exactly how to do it using Truto's /tools endpoint and developer SDKs, bypassing the need to hand-code integrations for Voice AI infrastructure.

Giving a Large Language Model (LLM) read and write access to a complex voice orchestration platform is an engineering headache. You either spend weeks building, hosting, and maintaining custom API wrappers, or you use a managed infrastructure layer that handles the boilerplate tool schemas for you. If your team uses ChatGPT, check out our guide on connecting Vapi to ChatGPT, or if you prefer Anthropic's ecosystem, read our guide on connecting Vapi to Claude. For developers building custom autonomous workflows, you need a programmatic way to fetch these schemas and bind them to your agent framework.

This guide breaks down exactly how to fetch AI-ready tools for Vapi, bind them natively to an LLM using frameworks like LangChain, LangGraph, CrewAI, or the Vercel AI SDK, and execute complex revenue and support operations. For a deeper look at the core architecture behind this approach, refer to our research on architecting AI agents and the SaaS integration bottleneck.

The Engineering Reality of Custom Vapi Connectors

Building AI agents is trivial. Connecting them to external SaaS APIs safely is difficult. Giving an LLM access to external voice infrastructure sounds straightforward in a local prototype. You write a Node.js function that makes a fetch request, wrap it in a tool definition for your agent framework, and move on. In production, this approach collapses entirely.

If you decide to build a custom integration for Vapi, you own the entire API lifecycle. Vapi's architecture introduces several highly specific integration challenges that break standard LLM assumptions.

The 302 Redirect Recording Quirk

When an AI agent needs to analyze a past interaction, it logically queries a recording endpoint. For most REST APIs, developers expect a JSON payload containing a download link or base64 encoded string.

Vapi's recording endpoints (such as those for mono, stereo, and video recordings) do not return structured JSON bodies. Instead, the endpoint issues an HTTP 302 redirect pointing to a short-lived presigned URL where the binary file resides. Hand-coded agents and naive API wrappers consistently fail here. The LLM expects a JSON response, parses the 302 redirect as an empty object, and hallucinates a failure. Your custom tool layer must explicitly intercept the redirect, capture the Location header, and return that URL as a structured schema to the agent. Truto's proxy architecture handles these transport layer anomalies out of the box, converting complex HTTP behaviors into predictable JSON schemas the LLM can safely ingest.
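To make the interception step concrete, here is a minimal sketch of what a hand-rolled tool layer would need to do with the raw response. The RawResponse and RecordingResult types and the toRecordingResult name are hypothetical and exist only for illustration; this is not Truto's internal implementation.

```typescript
// Minimal response shape so the sketch does not depend on a specific HTTP client.
interface RawResponse {
  status: number;
  headers: { get(name: string): string | null };
}

// Structured result a tool layer could hand back to the LLM (hypothetical shape).
type RecordingResult =
  | { ok: true; recordingUrl: string }
  | { ok: false; error: string };

// Convert a raw, non-followed HTTP response into JSON the agent can read.
function toRecordingResult(res: RawResponse): RecordingResult {
  const isRedirect = res.status >= 300 && res.status < 400;
  if (!isRedirect) {
    return { ok: false, error: `Expected a redirect, got HTTP ${res.status}` };
  }
  const location = res.headers.get("location");
  if (!location) {
    return { ok: false, error: "Redirect response had no Location header" };
  }
  return { ok: true, recordingUrl: location };
}
```

In Node.js, calling fetch with `redirect: "manual"` keeps the client from following the redirect, so the 302 and its Location header reach a function like this one. Browsers return an opaque response in that mode, so this pattern belongs on the server side.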

Polymorphic Chat Response Events

Vapi supports OpenAI-compatible Responses APIs for chat interactions. Depending on how you configure the request, the response shape drastically changes. A non-streaming request returns a standard ResponseObject. A streaming request returns a stream of highly specific event types (ResponseTextDeltaEvent, ResponseTextDoneEvent, ResponseCompletedEvent).

LLMs struggle with polymorphic return types if the tool description does not explicitly state how to parse the result. If you write your own tools, you must maintain massive, union-heavy JSON schemas for every variant of the Vapi API. Truto dynamically generates these schemas based on the underlying resource definition, injecting the exact query and response parameters required into the tool definition so the LLM knows exactly what to expect.
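For a sense of what "union-heavy" means in practice, the sketch below models the three streaming events as a discriminated union and folds them into one predictable result. Only the three event names come from the paragraph above; the `type` strings and payload fields are assumptions made for illustration and may not match Vapi's actual wire format.

```typescript
// Hypothetical event shapes: the discriminator strings and fields are assumed.
type ChatStreamEvent =
  | { type: "text.delta"; delta: string } // stands in for ResponseTextDeltaEvent
  | { type: "text.done"; text: string }   // stands in for ResponseTextDoneEvent
  | { type: "completed" };                // stands in for ResponseCompletedEvent

interface ChatResult {
  text: string;
  completed: boolean;
}

// Fold a stream of events into a single shape the agent can rely on.
function collectStream(events: ChatStreamEvent[]): ChatResult {
  let text = "";
  let completed = false;
  for (const event of events) {
    switch (event.type) {
      case "text.delta":
        text += event.delta; // append incremental text
        break;
      case "text.done":
        text = event.text; // the final text replaces the accumulated deltas
        break;
      case "completed":
        completed = true;
        break;
    }
  }
  return { text, completed };
}
```

Collapsing the variants into one result type before the LLM sees them means the tool description only has to document a single return shape.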

Handling Concurrency and Strict Rate Limits

Vapi rate limits are aggressive, which makes sense given the computational overhead of real-time voice synthesis. One architectural fact matters here: Truto does not retry, throttle, or apply backoff on rate limit errors.

When Vapi returns an HTTP 429 Too Many Requests error, Truto passes that error directly to the caller. Truto normalizes the upstream rate limit information into standardized HTTP headers modeled on the IETF RateLimit header fields draft: ratelimit-limit, ratelimit-remaining, and ratelimit-reset. It is entirely the responsibility of the calling agent framework to catch the 429, read the reset header, and explicitly wait. If your hand-coded LLM loop lacks robust error handling, the agent can spin through failed retries without pausing, burning your OpenAI credits in seconds.
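The three headers can also be read proactively, before a 429 ever occurs. The following is a minimal sketch, assuming your HTTP client exposes response headers as a plain object with lower-cased keys; the RateLimitInfo type and both function names are hypothetical.

```typescript
// Parsed view of the standardized headers; null means the header was absent or not numeric.
interface RateLimitInfo {
  limit: number | null;
  remaining: number | null;
  reset: number | null; // raw header value; its unit is not interpreted here
}

// Assumes header names are already lower-cased by the HTTP client.
function parseRateLimitHeaders(headers: Record<string, string | undefined>): RateLimitInfo {
  const readNumber = (name: string): number | null => {
    const raw = headers[name];
    if (raw === undefined) return null;
    const parsed = Number.parseInt(raw, 10);
    return Number.isNaN(parsed) ? null : parsed;
  };
  return {
    limit: readNumber("ratelimit-limit"),
    remaining: readNumber("ratelimit-remaining"),
    reset: readNumber("ratelimit-reset"),
  };
}

// True when the remaining quota is known and exhausted.
function shouldPause(info: RateLimitInfo): boolean {
  return info.remaining !== null && info.remaining <= 0;
}
```

Checking shouldPause after every successful tool call lets the agent loop slow down before it hits the limit instead of reacting only after a failed request.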

Architecture: How Truto Exposes Vapi as Tools

Truto maps every Vapi API endpoint into a REST-based CRUD abstraction. Every integration on Truto consists of Resources (e.g., calls, campaigns, assistants) and Methods (e.g., List, Create, Get).

These Methods serve as Proxy APIs. Truto handles the authentication lifecycle, pagination normalization, and query parameter processing, returning data in a predefined format. For building agents, Proxy APIs are the ideal level of abstraction because they allow the LLM to access the raw data from the underlying product without forcing it through an opinionated unified schema.

Truto provides all the resources defined on the Vapi integration as executable tools for your LLM framework. By calling the GET /integrated-account/<id>/tools endpoint, Truto returns all available Vapi Proxy APIs with rich descriptions and JSON schemas optimized for LLM function calling.
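If you want to call the endpoint directly rather than through an SDK, the request URL can be assembled as in the sketch below. The buildToolsUrl helper is hypothetical, and the base URL is passed in rather than hard-coded because the host is not stated here. The optional `methods[n]` query parameters follow the `methods[0]=read` form this article uses for read-only filtering.

```typescript
// Build the URL for GET /integrated-account/<id>/tools with optional method filters.
// baseUrl is an assumption supplied by the caller, for example your Truto API host.
function buildToolsUrl(
  baseUrl: string,
  integratedAccountId: string,
  methods: string[] = []
): string {
  const trimmedBase = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const path = `${trimmedBase}/integrated-account/${encodeURIComponent(integratedAccountId)}/tools`;
  // Produces methods[0]=read&methods[1]=write style filters.
  const query = methods
    .map((method, index) => `methods[${index}]=${encodeURIComponent(method)}`)
    .join("&");
  return query ? `${path}?${query}` : path;
}
```

Passing `["read"]` as the third argument yields a URL that requests only read tools, which is a simple way to give an agent read-only access.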

flowchart TD
    A["Agent Loop<br>(LangChain, CrewAI)"] -->|"Request schemas"| B["Truto /tools API"]
    B -->|"Returns JSON Schemas"| A
    A -->|"LLM generates args"| C["Truto Proxy API"]
    C -->|"Injects Auth / Normalizes"| D["Vapi Upstream API"]
    D -->|"HTTP 429 (Rate Limit)"| C
    C -->|"Passes 429 + Headers"| A
    A -->|"Agent executes backoff"| A

Fetching and Binding Vapi Tools

To connect Vapi to your AI Agents, you do not need to deploy a standalone server. You simply use Truto's SDK to fetch the tools dynamically and bind them to your language model.

Here is an example using the TrutoToolManager from the @trutohq/truto-langchainjs-toolset package. This approach natively supports LangChain, but the raw schemas can be easily mapped to Vercel AI SDK or any other agent orchestrator.

import { ChatOpenAI } from "@langchain/openai";
import { TrutoToolManager } from "@trutohq/truto-langchainjs-toolset";
 
async function initializeVapiAgent(integratedAccountId: string) {
  // 1. Initialize the LLM
  const llm = new ChatOpenAI({
    modelName: "gpt-4o",
    temperature: 0,
  });
 
  // 2. Fetch tools from Truto for the connected Vapi account
  const toolManager = new TrutoToolManager({
    apiKey: process.env.TRUTO_API_KEY,
    integratedAccountId: integratedAccountId,
  });
 
  // Fetch every available tool; to restrict the agent to read or write methods, filter the tools at this step
  const tools = await toolManager.getTools();
 
  // 3. Bind tools to the model natively
  const agentWithTools = llm.bindTools(tools);
 
  return agentWithTools;
}

By fetching tools at runtime, your agent automatically inherits any updates, new endpoints, or custom descriptions you add via the Truto dashboard. You never have to manually update a schema file when Vapi ships a new API version.
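For frameworks without a dedicated toolset package, the raw schemas can be reshaped by hand. Here is a minimal sketch that maps a tool definition into the OpenAI function-calling format. The TrutoToolSchema interface is an assumed shape for one entry of the /tools response, so check the actual field names before relying on it.

```typescript
// Assumed shape of one tool entry (hypothetical field names).
interface TrutoToolSchema {
  name: string;
  description: string;
  parameters: Record<string, unknown>; // JSON Schema describing the arguments
}

// The tool format used by OpenAI-compatible function calling.
interface OpenAIFunctionTool {
  type: "function";
  function: {
    name: string;
    description: string;
    parameters: Record<string, unknown>;
  };
}

// Wrap each schema in the { type: "function", function: {...} } envelope.
function toOpenAITools(tools: TrutoToolSchema[]): OpenAIFunctionTool[] {
  return tools.map(
    (tool): OpenAIFunctionTool => ({
      type: "function",
      function: {
        name: tool.name,
        description: tool.description,
        parameters: tool.parameters,
      },
    })
  );
}
```

The resulting array can be passed as the tools parameter of any client that accepts OpenAI-style function definitions.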

High-Leverage Vapi Tools for AI Agents

When building autonomous voice workflows, you do not need every single CRUD endpoint. You need high-leverage operations that orchestrate campaigns and manage call states. Here are the core hero tools you should prioritize exposing to your agent.

list_all_vapi_assistants

This tool retrieves all available voice assistants configured in the Vapi workspace. It returns core metadata like id, createdAt, and updatedAt. The agent relies on this to map human-readable assistant names to the specific IDs required for launching campaigns or making outbound calls.

"Fetch all of our available voice assistants. Find the ID for the assistant named 'Inbound Support Bot' and tell me when it was last updated."

create_a_vapi_campaign

Deploying outbound voice campaigns programmatically is Vapi's superpower. This tool creates a new campaign and returns its id, status, duration, and cost. It requires the agent to format specific configuration payloads, linking predefined assistants and phone number lists to the campaign logic.

"Create a new outbound campaign using the 'Lead Reactivation' assistant. Set the daily budget to $50 and associate it with our primary Twilio phone number block."

list_all_vapi_phone_numbers

Before an agent can spin up a campaign or initiate a call, it must know which telephony assets are available. This tool lists all Vapi phone numbers across supported telephony providers, returning the id, provider, name, and number. It supports timestamp filtering, allowing agents to audit recently provisioned assets.

"Audit our current telephony inventory. List all phone numbers we have provisioned via Twilio that were created in the last 30 days."

create_a_vapi_call

This is the core execution tool. It initiates a new Vapi call and returns the object including id, type, assistantId, and phoneNumberId. The agent must orchestrate several steps prior to calling this tool, ensuring it has valid IDs from the assistant and phone number listing endpoints.
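The ID resolution that has to happen before this tool runs can be sketched as a small pure function. The NamedResource and CreateCallArgs shapes are hypothetical: assistantId and phoneNumberId are named above, while customerNumber is an illustrative field name, not a confirmed Vapi parameter.

```typescript
// Minimal view of an item returned by the assistant or phone number listing tools.
interface NamedResource {
  id: string;
  name: string;
}

// Hypothetical argument shape for create_a_vapi_call.
interface CreateCallArgs {
  assistantId: string;
  phoneNumberId: string;
  customerNumber: string; // illustrative field name, an assumption
}

// Resolve human-readable names to IDs, failing loudly when a name is unknown.
function buildCreateCallArgs(
  assistants: NamedResource[],
  phoneNumbers: NamedResource[],
  assistantName: string,
  phoneNumberName: string,
  customerNumber: string
): CreateCallArgs {
  const assistant = assistants.find((a) => a.name === assistantName);
  if (!assistant) throw new Error(`No assistant named "${assistantName}"`);
  const phoneNumber = phoneNumbers.find((p) => p.name === phoneNumberName);
  if (!phoneNumber) throw new Error(`No phone number named "${phoneNumberName}"`);
  return { assistantId: assistant.id, phoneNumberId: phoneNumber.id, customerNumber };
}
```

Throwing on an unknown name gives the agent an explicit error to reason about instead of letting it guess an ID.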

"Initiate an outbound call to +15550198372. Use the 'Billing Recovery' assistant and route it through our standard outbound provider."

get_single_vapi_call_by_id

Because voice calls are heavily asynchronous, an agent cannot simply fire a call and immediately know the outcome. It must use this tool to fetch the real-time state of the call. This returns the status, duration, and associated metadata. It is highly useful in loop mechanisms where the agent waits for a call to terminate before proceeding to the next workflow step.
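A polling loop of this kind can be sketched as follows. Everything here is an assumption for illustration: the getCall callback stands in for invoking get_single_vapi_call_by_id, the terminal status values are supplied by the caller rather than taken from Vapi's documentation, and sleep is injected so the loop can be tested without real delays.

```typescript
// Minimal view of a call as returned by the status tool (hypothetical shape).
interface CallSnapshot {
  id: string;
  status: string;
}

interface PollOptions {
  terminalStatuses: string[]; // statuses that mean the call is over
  maxPolls: number;           // hard cap so the loop cannot run forever
  intervalMs: number;         // pause between polls
  sleep: (ms: number) => Promise<void>;
}

// Poll until the call reaches a terminal status or the poll budget runs out.
async function waitForCallToFinish(
  getCall: (callId: string) => Promise<CallSnapshot>,
  callId: string,
  options: PollOptions
): Promise<CallSnapshot> {
  for (let poll = 0; poll < options.maxPolls; poll++) {
    const call = await getCall(callId);
    if (options.terminalStatuses.indexOf(call.status) !== -1) {
      return call;
    }
    await options.sleep(options.intervalMs);
  }
  throw new Error(`Call ${callId} did not finish within ${options.maxPolls} polls`);
}
```

The maxPolls cap matters as much as the interval: without it, a call stuck in a non-terminal state would keep the agent polling indefinitely.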

"Check the status of call ID call_12345. If the call is completed, summarize the final duration and cost. If it is still in-progress, wait."

list_all_vapi_stereo_recordings

This tool retrieves the dual-channel recording for a specific call. As noted in the engineering reality section, this endpoint returns a redirect to a short-lived presigned URL. The agent retrieves this URL to pass the raw audio file to secondary transcription or evaluation tools.

"Get the stereo recording for the call we just completed. Once you have the presigned URL, pass it to our transcription model to extract the customer's primary complaint."

For a complete list of endpoints, including analytics, evaluators, and reporting insights, view the Vapi integration page.

Workflows in Action

Exposing tools to an LLM is only the baseline. The real value is in combining these tools into autonomous, multi-step workflows tailored to specific business operations.

1. Automated Lead Reactivation Campaign Deployment

Persona: RevOps Engineer / Growth Manager

"We have a new list of churned customers. Provision a new Vapi squad using our Twilio provider, check our available phone numbers, and launch an outbound 'Winback' campaign utilizing those assets."

The agent calls list_all_vapi_phone_numbers filtering for the specific telephony provider to ensure available capacity.
The agent executes create_a_vapi_squad to group multiple specialized assistants together for the winback flow.
Finally, it uses create_a_vapi_campaign, passing the squad ID and phone numbers as parameters to officially launch the dialing queue.

The RevOps engineer receives a confirmation detailing the campaign ID, the total number of assigned numbers, and the initial status of the campaign, completely bypassing the Vapi UI.
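The three tool calls in this workflow can be sketched as one orchestration function. The VapiTools interface is a hypothetical wrapper in which each method stands in for the matching tool (list_all_vapi_phone_numbers, create_a_vapi_squad, create_a_vapi_campaign). The argument and result fields are assumptions, not confirmed Vapi payloads.

```typescript
interface PhoneNumber {
  id: string;
  provider: string;
}

// Hypothetical wrapper: each method invokes the corresponding tool.
interface VapiTools {
  listPhoneNumbers(): Promise<PhoneNumber[]>;
  createSquad(args: { name: string; assistantIds: string[] }): Promise<{ id: string }>;
  createCampaign(args: {
    name: string;
    squadId: string;
    phoneNumberIds: string[];
  }): Promise<{ id: string; status: string }>;
}

// Run the winback flow: check capacity, create the squad, launch the campaign.
async function launchWinbackCampaign(
  tools: VapiTools,
  provider: string,
  assistantIds: string[]
): Promise<{ campaignId: string; status: string; assignedNumbers: number }> {
  const numbers = (await tools.listPhoneNumbers()).filter((n) => n.provider === provider);
  if (numbers.length === 0) {
    throw new Error(`No phone numbers available for provider "${provider}"`);
  }
  const squad = await tools.createSquad({ name: "Winback", assistantIds });
  const campaign = await tools.createCampaign({
    name: "Winback",
    squadId: squad.id,
    phoneNumberIds: numbers.map((n) => n.id),
  });
  return { campaignId: campaign.id, status: campaign.status, assignedNumbers: numbers.length };
}
```

Stopping before any write when no numbers match the provider keeps the agent from creating an orphaned squad it would later have to clean up.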

2. Post-Call Quality Assurance and Escalation

Persona: Customer Success Operations

"Find the most recent outbound call that lasted longer than 10 minutes. Fetch its stereo recording URL, and check the observability scorecard to see if the agent adhered to the compliance script."

The agent uses list_all_vapi_calls, applying filters for type (outbound) and iterating over results to find a duration exceeding 600 seconds.
Once identified, it calls list_all_vapi_stereo_recordings with the target call ID to secure the presigned URL.
The agent then calls list_all_vapi_observability_scorecards, filtering by the call ID, to retrieve the automated QA metrics generated by Vapi.

The Customer Success team gets a structured report containing the direct recording link and the compliance score, immediately highlighting whether the call requires human intervention.
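The filtering in the first step of this workflow is plain data processing and can be done in code rather than left to the model. Below is a minimal sketch, assuming each call exposes a type, a duration in seconds, and an end timestamp; those field names and the "outbound" value are hypothetical stand-ins for whatever list_all_vapi_calls returns.

```typescript
// Hypothetical view of a call record returned by the listing tool.
interface CallRecord {
  id: string;
  type: string;
  durationSeconds: number;
  endedAt: string; // ISO 8601 timestamp
}

// Return the most recently ended outbound call longer than minSeconds, if any.
function findLatestLongOutboundCall(
  calls: CallRecord[],
  minSeconds: number = 600
): CallRecord | undefined {
  return calls
    .filter((call) => call.type === "outbound" && call.durationSeconds > minSeconds)
    .sort((a, b) => Date.parse(b.endedAt) - Date.parse(a.endedAt))[0];
}
```

Doing the selection deterministically means the LLM only has to pass the resulting call ID to the recording and scorecard tools.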

Building Multi-Step Workflows

When writing agent loops that interact with production infrastructure, you must handle state, pagination, and strict upstream rate limits. Because Truto directly surfaces Vapi's 429 status codes and standardized HTTP headers, your application code must inspect the response and execute backoffs.

Below is a sequence diagram of a custom execution loop that explicitly handles Truto's standardized ratelimit-reset header. The pattern is framework-agnostic and adapts to any standard execution pipeline.

sequenceDiagram
    participant App as AI Agent Framework
    participant Truto as Truto Proxy Layer
    participant Vapi as Upstream API
    
    App->>Truto: Execute Tool (Create Campaign)
    Truto->>Vapi: POST /campaigns
    Vapi-->>Truto: HTTP 429 Too Many Requests
    Truto-->>App: HTTP 429 + ratelimit-reset: 1718293048
    Note over App: Catch 429.<br>Calculate wait time.<br>Pause execution thread.
    App->>App: Wait (Sleep)
    App->>Truto: Execute Tool (Retry)
    Truto->>Vapi: POST /campaigns
    Vapi-->>Truto: HTTP 200 OK (Campaign Data)
    Truto-->>App: Return JSON response (Campaign Data)

Here is what that looks like conceptually in TypeScript, wrapping your agent invocation to ensure rate limits do not crash the runtime:

// Minimal shape of the agent being wrapped; adjust to your framework's invoke signature.
interface InvokableAgent {
    invoke(input: { input: string }): Promise<unknown>;
}

async function executeWithRateLimitHandling(agent: InvokableAgent, userPrompt: string) {
    const maxAttempts = 3;

    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            // Invoke the agent (LangChain, Vercel AI, etc.)
            return await agent.invoke({ input: userPrompt });
        } catch (error: any) {
            // Assumes an axios-style error that exposes the HTTP response;
            // adapt this check to however your HTTP client surfaces errors.
            const isRateLimited = error?.response?.status === 429;

            // Rethrow non-429 errors immediately
            if (!isRateLimited) throw error;

            // Extract the standardized rate limit header passed through by Truto
            const resetValue = parseInt(error.response.headers?.['ratelimit-reset'], 10);

            // Without a usable reset header there is nothing to wait on, so rethrow
            if (Number.isNaN(resetValue)) throw error;

            // The header may carry a UNIX timestamp (seconds) or a delta in seconds.
            // Values above 1e9 are treated as timestamps, smaller values as deltas.
            const baseWait = resetValue > 1e9
                ? Math.max(0, resetValue * 1000 - Date.now())
                : resetValue * 1000;
            const waitTime = baseWait + 1000; // Add 1s buffer

            console.warn(`Rate limit hit (attempt ${attempt}/${maxAttempts}). Sleeping for ${waitTime}ms...`);
            await new Promise(resolve => setTimeout(resolve, waitTime));
        }
    }
    throw new Error("Agent execution failed: Rate limit retry exhausted.");
}

By pushing the retry logic to the caller, Truto ensures that your agent framework retains total control over execution timeouts, budget limits, and context window management. It prevents runaway background queues that drain API credits.

Moving Forward

Integrating Vapi into AI agent workflows unlocks massive scale for voice operations, inbound support, and outbound revenue generation. However, exposing complex telephony APIs directly to LLMs through hand-coded integrations introduces severe stability risks, schema maintenance debt, and failure states caused by transport quirks like 302 redirects.

Using Truto's /tools endpoint to dynamically fetch and bind Vapi methods allows your engineering team to focus entirely on agent orchestration and prompt logic. Truto handles the authentication, normalizes the pagination, maps the endpoints into LLM-friendly schemas, and exposes the exact rate limit headers you need to build resilient systems. If you are looking for more structured ways to expose your internal tools to agents, check out our guide on building MCP servers.