---
title: "Connect AssemblyAI to Claude: Process Speech and Generate Subtitles"
slug: connect-assemblyai-to-claude-process-speech-and-generate-subtitles
date: 2026-06-16
author: Uday Gajavalli
categories: ["AI & Agents"]
excerpt: "Learn how to connect AssemblyAI to Claude using a managed MCP server. Automate asynchronous transcription, subtitle generation, and speech analysis."
tldr: "Step-by-step guide to integrating AssemblyAI with Claude via Truto's managed MCP server. Covers handling async polling, generating subtitles, and securely connecting Claude to speech-to-text workflows without custom API code."
canonical: https://truto.one/blog/connect-assemblyai-to-claude-process-speech-and-generate-subtitles/
---

# Connect AssemblyAI to Claude: Process Speech and Generate Subtitles


If you need to connect AssemblyAI to Claude to process raw audio, generate subtitles, or run speech intelligence tasks, you need a Model Context Protocol (MCP) server. This server acts as the translation layer between Claude's tool calling capabilities and AssemblyAI's REST APIs. You can either [build, host, and maintain this infrastructure yourself](https://truto.one/the-hands-on-guide-to-building-mcp-servers-for-ai-agents-2026/), or use a managed integration platform like Truto to dynamically generate a secure, authenticated MCP server URL. If your team uses ChatGPT, check out our guide on [connecting AssemblyAI to ChatGPT](https://truto.one/connect-assemblyai-to-chatgpt-transcribe-and-analyze-audio-content/) or explore our broader architectural overview on [connecting AssemblyAI to AI Agents](https://truto.one/connect-assemblyai-to-ai-agents-search-and-understand-voice-data/).

Giving a Large Language Model (LLM) the ability to orchestrate complex speech-to-text pipelines is an engineering challenge. You are not just dealing with standard JSON payloads: you have to handle binary file uploads, long-running asynchronous polling loops, and strict content schema requirements. Every time AssemblyAI updates a model version or introduces a new intelligence endpoint, you have to update your server code, redeploy, and test the integration.

This guide breaks down exactly how to use Truto to generate a secure, [managed MCP server](https://truto.one/managed-mcp-for-claude-full-saas-api-access-without-security-headaches/) for AssemblyAI, connect it natively to Claude, and execute sophisticated speech processing workflows using natural language.

## The Engineering Reality of the AssemblyAI API

A custom MCP server is a self-hosted integration layer. While the open MCP standard provides a predictable way for models to discover tools over JSON-RPC, the reality of implementing it against AssemblyAI's specific infrastructure requires handling several unique architectural constraints.

If you decide to build a custom MCP server for AssemblyAI, you own the entire API lifecycle. Here are the specific challenges you will face:

**Asynchronous Processing and LLM Timeouts**
Claude treats each tool call as synchronous: it calls a tool and expects the result in the response. AssemblyAI's transcription architecture, however, is inherently asynchronous. Submitting an audio file returns an immediate `200 OK` with `status: "queued"` and a job ID, while the actual processing might take seconds or minutes. If you expose the raw endpoints to Claude without guidance, the model may conclude that the transcription failed when the text does not come back immediately. You must expose separate tools for initiation and polling, and explicitly prompt the model to wait and then retrieve the final payload.
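
One way to keep a caller from giving up early is to wrap the polling step in a loop with a growing wait and an overall deadline. The sketch below assumes a hypothetical `fetch` callable that returns the transcript object as a dict; `wait_for_transcript` and its parameters are illustrative names, not part of AssemblyAI's or Truto's API. The `queued`, `processing`, `completed`, and `error` values are the statuses AssemblyAI reports for transcription jobs.

```python
import time
from typing import Any, Callable, Dict


def wait_for_transcript(
    fetch: Callable[[str], Dict[str, Any]],  # hypothetical: returns the transcript object
    transcript_id: str,
    first_wait_s: float = 2.0,
    max_wait_s: float = 30.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the job is completed, fails, or the time budget is spent."""
    waited, wait = 0.0, first_wait_s
    while True:
        job = fetch(transcript_id)
        status = job.get("status")
        if status == "completed":
            return job
        if status == "error":
            raise RuntimeError(f"Transcription failed: {job.get('error', 'unknown error')}")
        if waited >= timeout_s:
            raise TimeoutError(f"Transcript {transcript_id} still {status!r} after {waited:.0f}s")
        sleep(wait)
        waited += wait
        # Double the wait after each poll, up to a ceiling, to avoid hammering the API.
        wait = min(wait * 2, max_wait_s)
```

Injecting `sleep` keeps the loop testable without real delays; in production the default `time.sleep` applies.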

**Multi-Step Binary Uploads**
AssemblyAI allows transcription via public URLs, but if you have local or protected files, you must first use the upload endpoint to post raw binary data. Sending raw binary through an LLM tool call is not natively supported. A managed MCP server abstracts this, allowing the LLM to trigger a file pipeline where the server handles the multipart or binary transfer stream, returning a secure media URL to the LLM for the subsequent transcription request.

**Rate Limits and Concurrency Ceilings**
AssemblyAI enforces strict concurrency limits on transcription jobs. If an over-eager AI agent decides to batch-process 50 historical audio files at once, AssemblyAI can respond with `HTTP 429 Too Many Requests`. Truto does not retry, throttle, or absorb these rate limits. Instead, Truto passes the `429` error directly back to Claude while normalizing the upstream rate limit information into standardized IETF headers (`ratelimit-limit`, `ratelimit-remaining`, `ratelimit-reset`). The caller (your agent or Claude Desktop) is responsible for reading these headers and implementing its own retry and backoff logic.
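
A caller-side retry might look like the following. This is a minimal sketch, assuming `ratelimit-reset` carries a number of seconds and that a hypothetical `send` callable returns a `(status, headers, body)` tuple; neither helper is a Truto or AssemblyAI function.

```python
import time
from typing import Callable, Mapping, Tuple


def seconds_until_reset(
    headers: Mapping[str, str], default_s: float = 5.0, max_wait_s: float = 60.0
) -> float:
    """Read `ratelimit-reset` case-insensitively; fall back if it is absent or malformed."""
    lowered = {key.lower(): value for key, value in headers.items()}
    try:
        return min(max_wait_s, max(0.0, float(lowered.get("ratelimit-reset"))))
    except (TypeError, ValueError):
        return default_s


def call_with_backoff(
    send: Callable[[], Tuple[int, Mapping[str, str], object]],  # hypothetical transport
    max_attempts: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, object]:
    """Retry only on 429, waiting for the advertised reset between attempts."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts:
            sleep(seconds_until_reset(headers))
    raise RuntimeError(f"Still rate limited after {max_attempts} attempts")
```

Capping the wait with `max_wait_s` guards against an unexpectedly large reset value stalling the agent.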

Instead of building this infrastructure from scratch, you can use Truto. Truto derives tool definitions directly from AssemblyAI's documentation records, exposing normalized endpoints as ready-to-use MCP tools.

## How to Generate an AssemblyAI MCP Server with Truto

Truto's MCP architecture does not rely on hand-coded tool definitions. Instead, it [dynamically generates tools](https://truto.one/auto-generated-mcp-tools-for-ai-agents-a-2026-architecture-guide/) based on the API endpoints defined in the integration's configuration and documentation schema. If an endpoint is documented, it becomes an AI tool.

Each MCP server is scoped to a single integrated account (a connected AssemblyAI instance for a specific tenant). The resulting server URL contains a cryptographically hashed token that encodes the account, environment, and filtering rules. You can create this server through the Truto UI or via the API.

### Method 1: Via the Truto UI

For internal workflows or one-off agent deployments, generating the server via the UI is the fastest path.

1. Navigate to the integrated account page for your AssemblyAI connection.
2. Click the **MCP Servers** tab.
3. Click **Create MCP Server**.
4. Select your desired configuration (name, allowed methods like `read` or `write`, tags, and expiration).
5. Click Save and **copy the generated MCP server URL**.

### Method 2: Via the Truto API

For production applications provisioning AI agents for end-users, you will generate these servers programmatically. Make a `POST` request to the integrated account endpoint with your configuration payload.

```bash
curl -X POST https://api.truto.one/integrated-account/{integrated_account_id}/mcp \
  -H "Authorization: Bearer YOUR_TRUTO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "AssemblyAI Transcript Processor",
    "config": {
      "methods": ["read", "write", "custom"],
      "tags": ["transcription", "intelligence"]
    },
    "expires_at": null
  }'
```

The API returns a fully qualified JSON-RPC 2.0 endpoint:

```json
{
  "id": "mcp_token_abc123",
  "name": "AssemblyAI Transcript Processor",
  "url": "https://api.truto.one/mcp/a1b2c3d4e5f6..."
}
```

This URL is self-contained: the server behind it handles authentication to AssemblyAI, dynamic schema generation, and pagination normalization.

## Connecting the MCP Server to Claude

Once you have the Truto MCP server URL, you must register it with your Claude client. Anthropic supports remote server-sent events (SSE) connections for custom endpoints.

### Method A: Via the Claude UI

If you are using Claude Desktop or the Claude Enterprise web interface:

1. Open **Settings**.
2. Navigate to **Integrations** (or **Connectors** depending on your tier).
3. Click **Add MCP Server** or **Add custom connector**.
4. Paste the Truto MCP server URL and provide a name (e.g., "AssemblyAI").
5. Click **Add**.

Claude sends an `initialize` request to the Truto server, negotiates the protocol version, and then requests the tool list so the available AssemblyAI tools appear in the model's context.
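
For reference, the opening exchange can be sketched as plain JSON-RPC 2.0 messages. The method names (`initialize`, `notifications/initialized`, `tools/list`) come from the MCP specification; the protocol version string and client name below are placeholder assumptions, and a real client builds these messages for you.

```python
import json
from typing import Any, Dict, List, Optional


def rpc(method: str, params: Optional[Dict[str, Any]] = None,
        msg_id: Optional[int] = None) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 message; omitting `msg_id` makes it a notification."""
    message: Dict[str, Any] = {"jsonrpc": "2.0", "method": method}
    if msg_id is not None:
        message["id"] = msg_id
    if params is not None:
        message["params"] = params
    return message


def handshake_messages(protocol_version: str = "2025-03-26") -> List[Dict[str, Any]]:
    """Messages a client sends, in order, before it can call any tool."""
    return [
        rpc("initialize", {
            "protocolVersion": protocol_version,  # assumed version; negotiated at runtime
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.1.0"},  # placeholder
        }, msg_id=1),
        rpc("notifications/initialized"),
        rpc("tools/list", msg_id=2),
    ]


def tool_names(tools_list_response: Dict[str, Any]) -> List[str]:
    """Extract tool names from a `tools/list` result."""
    return [tool["name"] for tool in tools_list_response["result"]["tools"]]


if __name__ == "__main__":
    for message in handshake_messages():
        print(json.dumps(message))
```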

### Method B: Via Manual Config File

For developers running custom environments or local Claude Desktop instances, you can wire the server directly into the configuration JSON. Since Truto provides an HTTP endpoint, you use the standard SSE transport wrapper.

Locate your `claude_desktop_config.json` file:
- Mac: `~/Library/Application Support/Claude/claude_desktop_config.json`
- Windows: `%APPDATA%\Claude\claude_desktop_config.json`

Update the configuration to include the Truto endpoint:

```json
{
  "mcpServers": {
    "assemblyai-truto": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "https://api.truto.one/mcp/a1b2c3d4e5f6..."
      ]
    }
  }
}
```

Restart Claude. The integration is now live.

## Hero Tools for AssemblyAI

Truto automatically translates AssemblyAI's endpoints into descriptive, snake_case tools with JSON schemas attached. Here are the highest-leverage tools your agents will use to orchestrate speech workflows.

### create_a_assembly_ai_upload

Use this tool to upload a local media file as raw binary data to AssemblyAI's secure servers. This is the required first step if your audio file is not already hosted on a publicly accessible URL.

**Usage note:** The tool returns an uploaded file object containing an `upload_url`. This URL is temporary and strictly scoped for use in the transcription request.

> "I have a local file named 'interview_raw.mp3'. Upload this binary payload to AssemblyAI and capture the upload URL for processing."

### create_a_assembly_ai_transcript

This tool initiates the asynchronous transcription process. You provide an `audio_url` (either public or generated via the upload tool) and optional parameters for intelligence models like speaker diarization, profanity filtering, or sentiment analysis.

**Usage note:** This tool does not return the text. It returns a transcript object with an `id` and a `status` (typically "queued" or "processing"). The LLM must capture this ID for the polling step.

> "Start a transcription job for the audio at this URL: https://example.com/audio.mp3. Enable speaker diarization so we know who is talking. Give me the transcript ID when the job is queued."

### get_single_assembly_ai_transcript_by_id

This is the polling tool. The agent uses this to check the status of a specific transcription job. Once the `status` string changes to "completed", the response payload will include the full `text`, speaker labels, and any requested intelligence data.

**Usage note:** If you hit rate limits while polling, Truto passes the `429` back with `ratelimit-reset` headers. Prompt your agent to wait the specified number of seconds before trying again.

> "Check the status of transcript ID 550e8400. If it is still processing, wait 10 seconds and check again. Once completed, summarize the main topics discussed."

### list_all_assembly_ai_sentences

Instead of a massive block of raw text, this tool returns the transcript split into sentences, each with its own timestamps. That structure is easier for an LLM to work with when it needs to extract quotes or build structured summaries.

**Usage note:** You must provide a valid `transcript_id` of a completed job.

> "Get the sentence breakdown for transcript ID 550e8400. Find the specific sentence where the speaker mentions 'quarterly revenue' and tell me the exact timestamp."
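
If the agent hands the sentence list to your own code, locating a quote is a simple filter. The sketch assumes each sentence is a dict with `text` and `start` (milliseconds), which mirrors AssemblyAI's sentences response; the shape Truto returns may differ, and both helpers are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def find_sentences(sentences: List[Dict[str, Any]], phrase: str) -> List[Tuple[int, str]]:
    """Return (start_ms, text) for every sentence containing the phrase, ignoring case."""
    needle = phrase.casefold()
    return [
        (sentence["start"], sentence["text"])
        for sentence in sentences
        if needle in sentence.get("text", "").casefold()
    ]


def clock(ms: int) -> str:
    """Format milliseconds as minutes:seconds for display."""
    minutes, seconds = divmod(ms // 1000, 60)
    return f"{minutes:02d}:{seconds:02d}"
```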

### get_single_assembly_ai_subtitle_by_id

This tool converts a completed transcription into subtitle text. You pass the transcript ID and the `subtitle_format` (either `srt` or `vtt`).

**Usage note:** The returned payload is the raw text of the subtitle file, including all sequence numbers and timestamp boundaries. Claude can easily write this to a local file or code block.

> "Generate VTT subtitles for transcript ID 550e8400. Output the VTT format directly into a code block so I can copy it into my video editor."
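
Before saving the returned text to disk, it can be worth checking that the timing lines parse. This is a minimal sketch, assuming the payload is plain SRT or WebVTT text; `parse_cues` is a hypothetical helper that accepts both `,` (SRT) and `.` (WebVTT) millisecond separators and treats the hours field as optional, as WebVTT allows.

```python
import re
from typing import List, Optional, Tuple

_TS = r"(?:(\d+):)?(\d{2}):(\d{2})[.,](\d{3})"
_TIMING = re.compile(_TS + r"\s+-->\s+" + _TS)


def _to_ms(hours: Optional[str], minutes: str, seconds: str, millis: str) -> int:
    """Convert timestamp fields to milliseconds; a missing hours field counts as zero."""
    return ((int(hours or 0) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_cues(subtitle_text: str) -> List[Tuple[int, int, str]]:
    """Return (start_ms, end_ms, text) for each cue in an SRT or WebVTT document."""
    cues = []
    # Cues are separated by blank lines; headers and sequence numbers have no timing line.
    for block in re.split(r"\n\s*\n", subtitle_text.strip()):
        lines = block.splitlines()
        for index, line in enumerate(lines):
            match = _TIMING.search(line)
            if match:
                fields = match.groups()
                text = " ".join(part.strip() for part in lines[index + 1:])
                cues.append((_to_ms(*fields[:4]), _to_ms(*fields[4:]), text))
                break
    return cues
```

An empty result for a non-empty payload is a cheap signal that the tool returned an error message rather than subtitles.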

### list_all_assembly_ai_word_searches

This tool performs server-side keyword searching across the entire transcript. You provide the `transcript_id` and a comma-separated list of `words`.

**Usage note:** This is vastly more efficient than asking the LLM to read a 40,000-word transcript just to find specific terms. Let AssemblyAI's search endpoint do the heavy lifting.

> "Search transcript ID 550e8400 for occurrences of 'lawsuit', 'litigation', and 'settlement'. Return the exact matches and their timestamps."

To view the complete inventory of AssemblyAI endpoints, including speech understanding, streaming tokens, and redaction features, visit the [AssemblyAI integration page](https://truto.one/integrations/detail/assemblyai).

## Workflows in Action

By connecting these tools through an MCP server, Claude goes from working only with text you paste in to orchestrating speech processing jobs end to end. Here are two examples of how different users might apply this setup.

### Scenario 1: The Podcaster Subtitle Pipeline

A content creator needs to turn an hour-long podcast interview into accurate, timestamped VTT subtitles for YouTube.

> "I have an audio file hosted at https://audio.com/ep42.mp3. I need you to transcribe it, wait for the job to finish, and then generate the VTT subtitles for the entire episode."

**How Claude executes this:**
1. Calls `create_a_assembly_ai_transcript` with the provided `audio_url`.
2. Receives a queued job response and extracts the `id`.
3. Enters a loop, calling `get_single_assembly_ai_transcript_by_id` periodically.
4. Once the status reads `completed`, it calls `get_single_assembly_ai_subtitle_by_id` with `subtitle_format: "vtt"`.
5. Returns the fully formatted VTT content in a code block for the user.
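
The same sequence can be sketched in code. `call_tool` is a hypothetical stand-in for whatever invokes an MCP tool and returns its result; the `id` argument name and the `status` and `error` fields are assumptions, while `audio_url` and `subtitle_format` come from the tool descriptions earlier in this guide.

```python
import time
from typing import Any, Callable, Dict

ToolCaller = Callable[[str, Dict[str, Any]], Any]  # hypothetical: (tool name, arguments) -> result


def generate_subtitles(
    audio_url: str,
    call_tool: ToolCaller,
    subtitle_format: str = "vtt",
    interval_s: float = 10.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Start a transcription, poll until it completes, then fetch the subtitles."""
    job = call_tool("create_a_assembly_ai_transcript", {"audio_url": audio_url})
    transcript_id = job["id"]

    for _ in range(max_polls):
        job = call_tool("get_single_assembly_ai_transcript_by_id", {"id": transcript_id})
        if job["status"] == "completed":
            break
        if job["status"] == "error":
            raise RuntimeError(f"Transcription failed: {job.get('error', 'unknown error')}")
        sleep(interval_s)
    else:
        # The loop ran out of polls without seeing a terminal status.
        raise TimeoutError(f"Transcript {transcript_id} did not finish in {max_polls} polls")

    return call_tool(
        "get_single_assembly_ai_subtitle_by_id",
        {"id": transcript_id, "subtitle_format": subtitle_format},
    )
```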

### Scenario 2: Compliance Officer Keyword Auditing

A financial compliance officer needs to audit recorded sales calls for prohibited guarantees or high-risk language.

> "Take the call recording hosted at https://internal.company.com/call-104.wav. Transcribe it with speaker labels. Once done, search the audio for the phrases 'guaranteed return', 'risk free', and 'off the record'. Flag any occurrences."

**How Claude executes this:**
1. Calls `create_a_assembly_ai_transcript` passing the URL and enabling speaker diarization.
2. Polls `get_single_assembly_ai_transcript_by_id` until completion.
3. Calls `list_all_assembly_ai_word_searches` passing the transcript ID and the requested keyword phrases.
4. If matches are found, it queries `list_all_assembly_ai_sentences` to pull the exact context and speaker label for the flagged timestamps.
5. Presents a formatted compliance risk report to the user.
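
Step 4 amounts to joining keyword hits to the sentences that contain them. The sketch below assumes word search matches shaped like `{"text": ..., "timestamps": [[start_ms, end_ms], ...]}` and sentences with `start`, `end`, `text`, and an optional `speaker`, mirroring AssemblyAI's documented responses; the payloads Truto returns may be shaped differently, and `flag_sentences` is a hypothetical helper.

```python
from typing import Any, Dict, List


def flag_sentences(
    matches: List[Dict[str, Any]], sentences: List[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Attach each keyword hit to the sentence whose time span contains it."""
    findings = []
    for match in matches:
        for start_ms, _end_ms in match.get("timestamps", []):
            for sentence in sentences:
                if sentence["start"] <= start_ms <= sentence["end"]:
                    findings.append({
                        "phrase": match["text"],
                        "at_ms": start_ms,
                        "speaker": sentence.get("speaker"),
                        "sentence": sentence["text"],
                    })
                    break  # one sentence per hit is enough for the report
    return sorted(findings, key=lambda finding: finding["at_ms"])
```

Doing the join in code rather than in the prompt keeps the report deterministic and avoids re-reading the whole transcript through the model.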

## Security and Access Control

When connecting an enterprise workspace to an AI model, security cannot be an afterthought. Exposing administrative API endpoints to an autonomous agent is dangerous without proper guardrails. Truto MCP servers include built-in authorization primitives at the token level:

*   **Method Filtering:** Limit the server to specific operation types. By passing `methods: ["read"]` during server creation, you ensure Claude can only poll and fetch transcripts, preventing it from deleting historical data.
*   **Tag Filtering:** Group specific resources using integration-level tags. You can restrict the server to only expose tools tagged with `intelligence`, hiding billing or account-level endpoints.
*   **Mandatory API Auth (`require_api_token_auth`):** For shared environments, possession of the MCP URL isn't enough. Enabling this flag forces the client to also pass a valid Truto session or API token, authenticating both the server and the individual user executing the prompt.
*   **Automatic Expiration (`expires_at`):** Generate short-lived MCP servers for contractors or temporary AI workflows. Setting an ISO datetime ensures the token is purged from distributed edge storage the moment it expires, neutralizing the URL permanently.
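
These options combine into a single creation payload. The sketch below builds a request body for a read-only, expiring server; `methods`, `tags`, and `expires_at` follow the request shown earlier in this guide, while placing `require_api_token_auth` inside `config` is an assumption to verify against Truto's API reference. `build_mcp_server_payload` is a hypothetical helper.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List, Optional


def build_mcp_server_payload(
    name: str,
    methods: List[str],
    tags: Optional[List[str]] = None,
    ttl_hours: Optional[float] = None,
    require_api_token_auth: bool = False,
    now: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Assemble the JSON body for creating a scoped MCP server."""
    now = now or datetime.now(timezone.utc)
    config: Dict[str, Any] = {"methods": list(methods)}
    if tags:
        config["tags"] = list(tags)
    if require_api_token_auth:
        config["require_api_token_auth"] = True  # assumed placement inside `config`
    expires_at = None
    if ttl_hours is not None:
        # ISO 8601 timestamp in UTC, e.g. 2026-06-17T12:00:00+00:00
        expires_at = (now + timedelta(hours=ttl_hours)).isoformat()
    return {"name": name, "config": config, "expires_at": expires_at}


if __name__ == "__main__":
    payload = build_mcp_server_payload(
        "Contractor transcript reader", ["read"], tags=["transcription"], ttl_hours=24
    )
    print(json.dumps(payload, indent=2))
```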

## Automate Speech Data the Right Way

Connecting AssemblyAI to Claude gives your LLMs the power to hear. But hosting a custom server, mapping complex JSON schemas, and keeping tool definitions in sync with the API is a massive drain on engineering resources. A managed MCP layer removes that boilerplate by serving dynamic, documentation-driven tools directly to your models. Asynchronous polling and `429` backoff still belong to the caller, but they shrink to a prompt instruction or a small retry loop rather than infrastructure you maintain.

By leveraging Truto, you can build autonomous workflows that upload media, generate transcriptions, create subtitles, and search speech data without writing a single line of integration code.

> Stop wrestling with asynchronous API polling and custom connectors. Partner with Truto to instantly generate secure, production-ready MCP servers for AssemblyAI and 100+ other SaaS platforms.
>
> [Talk to us](https://cal.com/truto/partner-with-truto)
