How do I give Claude access to Vapi call recordings?

You can generate a managed Model Context Protocol (MCP) server URL using Truto. This exposes Vapi endpoints as tools, allowing Claude to securely fetch call metadata, analytics, and presigned URLs for audio recordings.

Does Truto automatically handle Vapi API rate limits?

No. Truto passes 429 Too Many Requests errors directly to the caller. However, it normalizes upstream rate limit information into standard IETF headers (ratelimit-limit, ratelimit-remaining, ratelimit-reset) so your agent can implement accurate retry and backoff logic.

Can I restrict which Vapi endpoints Claude can access?

Yes. When creating the MCP server in Truto, you can use method filtering (e.g., allowing only 'read' operations) and tag filtering to strictly limit the LLM's access to specific Vapi resources like calls and analytics.

How do I download actual Vapi audio recordings via MCP?

Vapi's recording endpoints return a 302 redirect to a short-lived presigned URL. The Truto MCP tool executes the request and returns this URL, which your agent or local script can use to download the binary audio file.

Connect Vapi to Claude: Analyze Recordings, Evals, and Analytics

If your team needs to connect Vapi to Claude to automate call analytics, build audio transcription pipelines, or run quality assurance evaluations on AI voice agents, you need a Model Context Protocol (MCP) server. This server translates Claude's natural language tool calls into structured requests against Vapi's REST endpoints. You can spend weeks building and hosting this infrastructure yourself, or use a managed integration layer like Truto to dynamically generate a secure MCP endpoint. If your team uses ChatGPT, check out our guide on connecting Vapi to ChatGPT or explore our broader architectural overview on connecting Vapi to AI Agents.

Vapi provides a powerful infrastructure layer for building voice AI, but interacting with its API programmatically introduces significant engineering friction. Managing large schema dictionaries for assistants, handling complex nested payloads for structured outputs, and parsing multi-step observability flows requires extensive boilerplate. Giving a Large Language Model (LLM) raw access to this ecosystem is dangerous without strict schema validation and access controls.

This guide breaks down exactly how to use Truto to generate a managed MCP server for Vapi, connect it securely to Claude, and execute complex analytics and evaluation workflows using natural language.

The Engineering Reality of the Vapi API

A custom MCP server is essentially a self-hosted API gateway that maps natural language intents to strict JSON schemas. While Anthropic's MCP specification provides a predictable discovery mechanism for LLMs, the reality of implementing it against Vapi's specific API design presents several distinct challenges.

If you decide to build a custom MCP server for Vapi, you take on the entire integration lifecycle. Here is exactly what makes the Vapi API uniquely difficult to wrap for AI agents.

302 Redirects for Media Payloads

Most modern REST APIs return structured JSON, but audio is a different matter. When an agent needs an actual recording, such as a customer-side track for sentiment analysis, the LLM cannot natively process raw binary WAV or MP3 files. When you call Vapi's recording endpoints (such as the stereo or mono recordings), the API does not return a JSON payload with a download link. Instead, it issues a 302 Found redirect to a short-lived presigned cloud storage URL. If your custom MCP server does not intercept this redirect, the LLM receives an opaque network error. Truto normalizes this behavior, exposing the target presigned URL to the LLM so it can hand the URL off to downstream transcription or storage tools.
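
To make the redirect handling concrete, here is a minimal Python sketch of what intercepting a 302 involves: issue one request, do not follow the redirect, and read the Location header. The function names are hypothetical and this is not Truto's implementation; it only illustrates the technique with the standard library.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def redirect_target(status, headers, request_url):
    """Return the absolute redirect target, or None if this is not a redirect."""
    if status not in REDIRECT_STATUSES:
        return None
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    location = lowered.get("location")
    if not location:
        raise ValueError(f"HTTP {status} without a Location header")
    # Location may be relative; resolve it against the URL that was requested.
    return urljoin(request_url, location)


def fetch_redirect_target(url, headers=None, timeout=10.0):
    """Issue one GET without following redirects and return the target URL."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = (parts.path or "/") + (f"?{parts.query}" if parts.query else "")
        conn.request("GET", path, headers=headers or {})
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection closes cleanly
        return redirect_target(resp.status, dict(resp.getheaders()), url)
    finally:
        conn.close()
```

Using http.client rather than a higher-level client is deliberate here: it never follows redirects on its own, so the presigned URL stays a string the agent can pass along instead of a binary body it cannot read.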

Discriminator-Heavy Data Models

Vapi relies extensively on discriminated unions for its configurations. A tool configuration in Vapi might be an apiRequest, a function, a transferCall, or an sms type. Each variant requires a completely different nested schema. If you expose a generic JSON object to Claude, the model will frequently mix up the required fields for different discriminators, attempting to pass a webhook URL to a function tool. Truto derives precise JSON Schema definitions from the integration's documentation records, forcing the LLM to adhere to the exact required fields for the specific discriminator it selects.
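
The mix-up described above can be sketched as a small validator keyed on the discriminator. The four type names come from the paragraph above, but the required fields listed for each are hypothetical placeholders, not Vapi's actual schemas, and the function name is an illustration only.

```python
# Hypothetical required fields per tool type. Vapi's real schemas differ
# and should be taken from its API reference.
REQUIRED_FIELDS = {
    "apiRequest": {"url", "method"},
    "function": {"name", "parameters"},
    "transferCall": {"destinations"},
    "sms": {"from"},
}


def validate_tool_config(config):
    """Return a list of problems; an empty list means the config passed."""
    tool_type = config.get("type")
    if tool_type not in REQUIRED_FIELDS:
        return [f"unknown type: {tool_type!r}"]
    required = REQUIRED_FIELDS[tool_type]
    # Fields that only belong to other variants are the classic mix-up.
    other_fields = set().union(
        *(fields for name, fields in REQUIRED_FIELDS.items() if name != tool_type)
    )
    foreign = other_fields - required
    present = set(config) - {"type"}
    problems = [f"missing field: {name}" for name in sorted(required - present)]
    problems += [
        f"field {name!r} belongs to another variant"
        for name in sorted(present & foreign)
    ]
    return problems
```

A per-variant JSON Schema achieves the same effect declaratively; the point of the sketch is only that validation has to branch on the discriminator before checking any other field.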

Rate Limit Transparency

Vapi enforces strict concurrency and rate limits based on your account tier. When scraping historical call logs or running bulk evaluations, your agent will inevitably hit these ceilings. A common mistake developers make is expecting their integration proxy to silently queue and retry these requests. Truto does not retry, throttle, or apply backoff on rate limit errors.

When Vapi returns an HTTP 429 Too Many Requests, Truto passes that error directly to the caller. However, Truto normalizes the upstream rate limit information into standardized headers (ratelimit-limit, ratelimit-remaining, ratelimit-reset), following the IETF draft for RateLimit header fields. This explicit pass-through is an architectural feature, not a bug. It ensures your orchestrator or LangGraph agent is fully aware of the upstream backpressure and can execute an intelligent exponential backoff strategy rather than failing silently inside a black-box proxy.
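
Since retries are the caller's job, a minimal Python sketch of client-side backoff might look like the following. It assumes ratelimit-reset carries the number of seconds until the window resets, and it takes a hypothetical send callable returning (status, headers, body) so the logic stays independent of any particular HTTP client.

```python
import random
import time


def backoff_delay(headers, attempt, base=1.0, cap=60.0):
    """Pick a wait time in seconds after a 429 response."""
    lowered = {key.lower(): value for key, value in headers.items()}
    reset = lowered.get("ratelimit-reset")
    if reset is not None:
        try:
            # Assumed to be the number of seconds until the window resets.
            return min(cap, max(0.0, float(reset)))
        except ValueError:
            pass  # malformed value: fall through to exponential backoff
    # Exponential backoff with full jitter when no usable header is present.
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def call_with_backoff(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it stops returning 429 or attempts run out.

    send is any zero-argument callable returning (status, headers, body).
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(backoff_delay(headers, attempt))
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

Preferring the server-provided reset value over a blind exponential schedule avoids both hammering the API too early and waiting longer than necessary. The jittered fallback only applies when the header is missing or unparseable.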

How to Generate a Vapi MCP Server with Truto

Truto dynamically generates MCP servers based on the resources and documentation available for an integration. These servers are stateless, secure, and hosted at the edge. They map Vapi's endpoints directly to JSON-RPC 2.0 tools.
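
For readers unfamiliar with the wire format, the sketch below builds the JSON-RPC 2.0 message an MCP client sends to invoke a tool. The tools/call method and the name/arguments parameters come from the MCP specification; the argument names assistantId and limit are illustrative and depend on the schema the server actually advertises.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Illustrative arguments; the real ones come from the tool's input schema.
message = build_tool_call(1, "list_all_vapi_calls", {"assistantId": "ast_555", "limit": 1})
print(json.dumps(message, indent=2))
```

Claude constructs these messages itself, so you rarely write them by hand. Seeing the shape is mainly useful when debugging a server with a raw HTTP client.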

You can generate your Vapi MCP server using either the Truto Dashboard or the API.

Method 1: Via the Truto UI

For administrators and non-technical operators, generating a server takes seconds via the dashboard.

1. Log into your Truto environment and navigate to the integrated account page for your connected Vapi instance.
2. Click the MCP Servers tab.
3. Click Create MCP Server.
4. Configure your desired access parameters. You can name the server (e.g., "Vapi Analytics Agent"), filter allowed methods to only include read operations, or restrict access to specific resource tags.
5. Copy the generated MCP server URL (e.g., https://api.truto.one/mcp/abc123def456...).

Method 2: Via the Truto API

For platform engineers building multi-tenant AI products, you can provision MCP servers programmatically. This endpoint issues a secure, hashed token stored in a distributed KV store for low-latency authentication.

Make a POST request to /integrated-account/:id/mcp:

curl -X POST https://api.truto.one/api/integrated-account/<vapi_account_id>/mcp \
  -H "Authorization: Bearer <YOUR_TRUTO_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Vapi Call Evaluator",
    "config": {
      "methods": ["read", "list"],
      "tags": ["analytics", "observability"]
    },
    "expires_at": "2026-12-31T23:59:59Z"
  }'

The API returns the provisioned server details, including the single URL you need to configure Claude.

{
  "id": "mcp_abc123",
  "name": "Vapi Call Evaluator",
  "config": {
    "methods": ["read", "list"],
    "tags": ["analytics", "observability"]
  },
  "expires_at": "2026-12-31T23:59:59.000Z",
  "url": "https://api.truto.one/mcp/deadbeef1234567890abcdef"
}
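
For multi-tenant provisioning from application code, the same call can be sketched in Python with the standard library. The endpoint, headers, and body fields mirror the curl example above; the function names are hypothetical, and error handling is left out for brevity.

```python
import json
import urllib.request

TRUTO_API_BASE = "https://api.truto.one/api"


def build_create_mcp_request(account_id, api_key, name, methods, tags, expires_at=None):
    """Build the POST request that provisions an MCP server for one account."""
    body = {"name": name, "config": {"methods": methods, "tags": tags}}
    if expires_at is not None:
        body["expires_at"] = expires_at
    return urllib.request.Request(
        url=f"{TRUTO_API_BASE}/integrated-account/{account_id}/mcp",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_mcp_server(request, timeout=30.0):
    """Send the request and return the generated MCP URL from the response."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)["url"]
```

Splitting request construction from sending keeps the payload easy to unit test without touching the network. Treat the returned URL as a secret, since it embeds the access token.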

How to Connect the MCP Server to Claude

With your secure Truto MCP URL in hand, you can immediately attach it to Claude. Because Truto's servers are fully self-contained and encode the integrated account context within the tokenized URL, there is zero setup required on the client side beyond providing the link.

Method 1: Via the Claude UI

If you are using Claude's enterprise workspaces or custom agents via the web interface:

1. In Claude, navigate to Settings > Integrations (or Custom Connectors).
2. Click Add MCP Server.
3. Provide a friendly name for the integration (e.g., "Vapi Production Data").
4. Paste the Truto MCP URL into the Server URL field.
5. Click Add.

Claude will immediately ping the endpoint, execute the handshake, and ingest the dynamically generated JSON schemas for the Vapi API.

Method 2: Via Configuration File (Claude Desktop)

If you are running Claude Desktop locally or orchestrating a local agent environment, you can connect the server using your claude_desktop_config.json file. Truto's endpoints operate over Server-Sent Events (SSE).

Open your configuration file (typically located at ~/Library/Application Support/Claude/claude_desktop_config.json on macOS) and append the following block:

{
  "mcpServers": {
    "vapi-analytics": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-sse",
        "https://api.truto.one/mcp/deadbeef1234567890abcdef"
      ]
    }
  }
}

Restart Claude Desktop. The application will initialize the server and make the Vapi tools instantly available in your chat interface.
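
If you manage this file across several machines, the edit can be scripted. The Python sketch below merges one server entry into an existing configuration without disturbing other entries. The function names are hypothetical, and the command and package name are copied from the block above rather than verified independently.

```python
import json
from pathlib import Path


def add_mcp_server(config, name, url):
    """Return a copy of the config with one more entry under mcpServers."""
    servers = dict(config.get("mcpServers", {}))
    servers[name] = {
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-sse", url],
    }
    return {**config, "mcpServers": servers}


def update_config_file(path, name, url):
    """Merge the entry into the file at path, creating the file if needed."""
    path = Path(path)
    current = json.loads(path.read_text()) if path.exists() else {}
    merged = add_mcp_server(current, name, url)
    path.write_text(json.dumps(merged, indent=2) + "\n")
```

Merging instead of overwriting matters because claude_desktop_config.json often already lists other servers and unrelated settings.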

High-Leverage Vapi MCP Tools for Claude

Truto automatically generates descriptive, snake_case tools from the underlying Vapi integration documentation. The LLM understands exactly what arguments are required and how to structure the payload. Here are six high-leverage hero tools for analyzing calls and running evaluations.

1. list_all_vapi_calls

This is the core retrieval tool for pulling historical call metadata. It returns essential data like assistantId, type, and duration metrics. Agents use this to fetch a batch of recent interactions for post-call analysis.

"Claude, list all Vapi calls made in the last 24 hours for assistant ID 'ast_98765'. I need to see if we had an abnormal volume of disconnected calls."

2. list_all_vapi_stereo_recordings

This tool handles the complex task of retrieving audio files. Instead of crashing on binary data, it captures the Vapi 302 redirect and returns a short-lived presigned URL. The agent can provide this URL to you or pass it to an audio analysis tool.
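
Because the presigned URL is short-lived, a local script should fetch the audio soon after the tool returns it. Here is a minimal Python sketch of that download step; the function name is hypothetical, and it assumes the URL can be fetched with a plain GET and no extra headers.

```python
import shutil
import urllib.request
from pathlib import Path


def download_recording(presigned_url, dest_path, timeout=60.0):
    """Stream the audio at presigned_url to dest_path and return bytes written."""
    dest = Path(dest_path)
    # Write to a temporary name first so a failed transfer never leaves
    # a truncated file at the final path.
    partial = dest.with_name(dest.name + ".part")
    with urllib.request.urlopen(presigned_url, timeout=timeout) as resp, \
            open(partial, "wb") as out:
        shutil.copyfileobj(resp, out)  # copies in chunks, not all at once
    partial.replace(dest)
    return dest.stat().st_size
```

Streaming keeps memory flat for long stereo recordings, which can be sizeable.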

"Retrieve the stereo recording URL for call ID 'call_123abc'. I need to download the raw audio to analyze the latency between the user speaking and the assistant responding."

3. create_a_vapi_structured_output_run

When you need to extract highly specific schema-driven insights from a call after the fact, this tool allows Claude to trigger a structured output job. The LLM constructs the complex JSON payload mapping out the target extraction schema.

"Trigger a structured output run for call 'call_456def'. I need you to extract the customer's budget, timeline, and primary objection into a JSON object based on the call transcript."

4. list_all_vapi_evals

Agents use this to audit the automated Quality Assurance configurations within Vapi. It returns the active criteria being used to grade assistant performance.

"List all configured Vapi evals in the system. I want to see how we are currently scoring calls for compliance and politeness."

5. get_single_vapi_eval_run_by_id

Once an eval run is completed, this tool pulls the specific grading results. Claude can use this to summarize why a particular call failed a compliance check.

"Get the evaluation run results for eval run ID 'evr_999'. Summarize the areas where the assistant failed to follow the script."

6. create_a_vapi_analytics

This tool allows the LLM to submit aggregate queries against Vapi's analytics engine. It accepts an array of query objects specifying the target table and operations, returning high-level metrics like total cost or average duration.
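
As an illustration of the payload shape, the Python sketch below builds a query that sums cost and averages duration over a trailing window. The field names (queries, table, name, timeRange, operations, operation, column) are assumptions about the request body and should be checked against the tool's generated schema before use.

```python
from datetime import datetime, timedelta, timezone


def build_cost_query(days, now=None):
    """Build an analytics payload covering the last `days` days."""
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(days=days)
    return {
        "queries": [
            {
                # All field names in this object are assumed, not confirmed.
                "name": "cost_and_duration",
                "table": "call",
                "timeRange": {"start": start.isoformat(), "end": now.isoformat()},
                "operations": [
                    {"operation": "sum", "column": "cost"},
                    {"operation": "avg", "column": "duration"},
                ],
            }
        ]
    }
```

In practice Claude assembles this object from the tool schema. The sketch is mostly useful for reproducing a query outside the chat when you want to verify a number.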

"Create an analytics query against the calls table. I need to aggregate the total cost and average duration for all calls made by assistant 'ast_111' over the past 7 days."

Workflows in Action

Providing an LLM with individual tools is useful, but the true power of MCP lies in multi-step orchestration. Claude can chain these operations together to replace hours of manual dashboard navigation.

Scenario 1: Post-Call Quality Assurance & Extraction

Persona: QA Engineer / Support Ops
Goal: Identify a specific flagged call, retrieve its recording for manual review, and run a structured extraction to update an external CRM.

"Claude, find the most recent call for assistant ID 'ast_555'. Retrieve its stereo recording URL so I can download it. Then, trigger a structured output run to extract the user's callback number and specific technical issue."

Execution Flow:

sequenceDiagram
    participant User
    participant Claude as Claude Desktop
    participant Truto as Truto MCP Server
    participant Vapi as Vapi API

    User->>Claude: Find recent call, get recording, run extraction
    Claude->>Truto: Call list_all_vapi_calls (assistantId: ast_555, limit: 1)
    Truto->>Vapi: GET /call?assistantId=ast_555&limit=1
    Vapi-->>Truto: Return Call ID (call_789)
    Truto-->>Claude: JSON array of calls
    Claude->>Truto: Call list_all_vapi_stereo_recordings (id: call_789)
    Truto->>Vapi: GET /call/call_789/recording/stereo
    Vapi-->>Truto: 302 Redirect URL (Presigned S3 link)
    Truto-->>Claude: Presigned URL string
    Claude->>Truto: Call create_a_vapi_structured_output_run
    Truto->>Vapi: POST /structured-output/run (with Call ID and schema payload)
    Vapi-->>Truto: Run initiated object
    Truto-->>Claude: Extraction confirmation
    Claude-->>User: "Here is the recording link. The structured extraction job has been queued."
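
If you want to run the same three-step flow from a script instead of a chat, it can be sketched in Python as follows. The tool names come from the diagram above; call_tool is a hypothetical callable supplied by your MCP client, and the argument names (assistantId, limit, id, callId, schema) and response shapes are assumptions.

```python
def review_latest_call(call_tool, assistant_id, extraction_schema):
    """Run the three-step flow; call_tool(name, arguments) executes one MCP tool."""
    # Step 1: find the most recent call for the assistant.
    calls = call_tool("list_all_vapi_calls", {"assistantId": assistant_id, "limit": 1})
    if not calls:
        return None
    call_id = calls[0]["id"]

    # Step 2: resolve the presigned URL for the stereo recording.
    recording_url = call_tool("list_all_vapi_stereo_recordings", {"id": call_id})

    # Step 3: queue the structured extraction for the same call.
    run = call_tool(
        "create_a_vapi_structured_output_run",
        {"callId": call_id, "schema": extraction_schema},
    )
    return {"call_id": call_id, "recording_url": recording_url, "run": run}
```

Injecting call_tool keeps the orchestration testable with a stub and leaves transport, authentication, and retries to whichever MCP client you use.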

Scenario 2: Investigating Eval Failures and Cost Analytics

Persona: Voice AI Product Manager
Goal: Audit system performance to determine why eval scores dropped and how much it cost.

"Claude, check the latest eval runs. If any have a failing score, get the specific run details to see why. Finally, create an analytics query to calculate the total cost of all calls for the last week."

Execution Flow:

sequenceDiagram
    participant User
    participant Claude as Claude Desktop
    participant Truto as Truto MCP Server
    participant Vapi as Vapi API

    User->>Claude: Check failed evals and get total cost
    Claude->>Truto: Call list_all_vapi_eval_runs
    Truto->>Vapi: GET /eval-run
    Vapi-->>Truto: List of eval runs
    Truto-->>Claude: JSON array of eval runs
    Claude->>Truto: Call get_single_vapi_eval_run_by_id (id: failing_run_123)
    Truto->>Vapi: GET /eval-run/failing_run_123
    Vapi-->>Truto: Detailed run criteria and failure reasons
    Truto-->>Claude: Eval run details
    Claude->>Truto: Call create_a_vapi_analytics
    Truto->>Vapi: POST /analytics (with aggregation query payload)
    Vapi-->>Truto: Aggregated cost metrics
    Truto-->>Claude: Total cost data
    Claude-->>User: "Found 2 failing evals due to script deviation. Total call cost last week was $342.15."

Security and Access Control

Giving an AI agent read/write access to production telephony and analytics data introduces significant risk. If an agent hallucinates a destructive action, it could delete assistant configurations or purge historical logs. Truto provides four distinct layers of security to lock down your Vapi MCP servers:

Method Filtering: Restrict servers at the protocol level. Configure methods: ["read"] to ensure the server only derives get and list operations from the Vapi integration. Write endpoints (create, update, delete) will simply not exist in the LLM's tool list.
Tag Filtering: Limit horizontal access by resource domain. By assigning config.tags: ["observability"], the server will only generate tools related to evals, analytics, and scorecards, blocking access to core assistant configurations or billing.
API Token Authentication: By setting require_api_token_auth: true, possession of the MCP URL is no longer sufficient. The client must also inject a valid Truto API token via a Bearer header, ensuring only authenticated internal systems can execute tools.
Time-to-Live (TTL): Use the expires_at parameter to grant temporary access. When the timestamp is reached, the distributed KV cache automatically evicts the token, and background alarms instantly destroy the server metadata. This is ideal for short-lived audit or contractor access.
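
The four layers can be combined in one provisioning payload. The Python sketch below builds a read-only, observability-only server definition that expires after a given number of hours. The function name is hypothetical, and placing require_api_token_auth inside config is an assumption to confirm against Truto's API reference.

```python
from datetime import datetime, timedelta, timezone


def build_audit_server_payload(name, ttl_hours, now=None):
    """Build a locked-down MCP server definition that expires after ttl_hours."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    expires_at = (now + timedelta(hours=ttl_hours)).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "name": name,
        "config": {
            "methods": ["read"],        # no create, update, or delete tools
            "tags": ["observability"],  # evals and analytics only
            # Assumed location of this flag; verify before relying on it.
            "require_api_token_auth": True,
        },
        "expires_at": expires_at,
    }
```

Computing the expiry from a relative duration, rather than hard-coding a date, makes it harder to leave a contractor-scoped server alive by accident.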

Architecting for Production

Integrating voice AI data into an LLM workflow requires moving beyond manual script execution and building resilient, automated pipelines. Writing custom API wrappers to handle Vapi's discriminator schemas, binary recording redirects, and strict rate limits drains engineering velocity.

By utilizing Truto's dynamic MCP generation, you shift the burden of maintaining integration boilerplate to an edge-optimized infrastructure layer. Your agents get immediate, strictly-typed access to Vapi's analytics, evals, and call logs, allowing your engineering team to focus on building better AI interactions rather than debugging JSON mapping errors.
