Conversational Intelligence

AssemblyAI
API integration

Ship Conversational Intelligence features without building the integration. Full AssemblyAI API access via Proxy and 10+ MCP-ready tools for AI agents — extend models and mappings to fit your product.

Talk to us

Use Cases

Why integrate with AssemblyAI

Common scenarios for SaaS companies building AssemblyAI integrations for their customers.

01

Add AI-powered call intelligence to sales platforms

Sales enablement and CRM SaaS companies can let their users connect AssemblyAI to automatically transcribe and analyze customer calls. Truto handles the auth and async transcription flow so teams ship Gong-style features in days, not quarters.

02

Embed compliant transcription in healthcare and legal SaaS

Telehealth, EHR, and LegalTech platforms can offer their customers PII-redacted transcripts and audio for HIPAA-sensitive workflows. End users bring their own AssemblyAI account, and the SaaS app delivers automated SOAP notes or deposition transcripts without owning the compliance burden.

03

Power interactive video and podcast experiences for media platforms

Video editing suites, podcast hosts, and EdTech platforms can give creators auto-generated subtitles, paragraph-level transcripts, and word-level search inside their player. Truto exposes the segmentation and subtitle endpoints behind one consistent interface.

04

Ship live agent assist for contact center SaaS

CCaaS and UCaaS platforms can offer real-time transcription and AI co-pilot features by issuing short-lived streaming tokens to the agent's browser. The SaaS owns the UX while AssemblyAI handles the live STT.

05

Layer generative AI workflows on top of recorded conversations

Any SaaS sitting on a library of recorded audio can let customers run custom LLM prompts against their transcripts — extracting action items, scoring calls, or generating summaries — using AssemblyAI's chat completion endpoint instead of routing through a separate LLM provider.

What You Can Build

Ship these features with Truto + AssemblyAI

Concrete product features your team can ship faster by leveraging Truto’s AssemblyAI integration instead of building from scratch.

01

End-to-end async transcription pipeline

Upload media files to AssemblyAI, kick off a transcript job, and fetch the completed result through a single integration layer.

02

Interactive transcript viewer with timestamps

Render readable transcripts segmented by sentences and paragraphs with millisecond timestamps for clickable, jump-to-moment playback.

03

Auto-generated SRT/VTT subtitles for video

Pull standard subtitle formats directly from a transcript ID so customers can publish captioned video without a separate captioning tool.

04

In-video word search and navigation

Let end users search for keywords inside long recordings and jump to the exact timestamp where each match occurs.

05

AI call scorecards and summaries

Send custom prompts against a transcript to extract objections, grade reps, or generate structured summaries — all without a separate LLM integration.

06

PII-redacted audio storage for compliance

Offer customers a redacted audio file with sensitive information beeped out, ready to attach to a patient record, case file, or call log.

SuperAI

AssemblyAI AI agent tools

Comprehensive AI agent toolset with fine-grained control. Integrates with MCP clients like Cursor and Claude, or frameworks like LangChain.

create_a_assembly_ai_upload

Upload a media file to AssemblyAI's servers as raw binary data. Returns the uploaded file object upon success, which can be referenced in subsequent transcription requests.

list_all_assembly_ai_transcripts

List AssemblyAI transcripts sorted from newest to oldest. Returns: id, status, audio_url. Transcripts are available for the last 90 days of usage.

create_a_assembly_ai_transcript

Create an AssemblyAI transcript from a media file accessible via URL. Returns the transcript object including id and status; processing is asynchronous—poll via get until status is "completed". Required: audio_url.

get_single_assembly_ai_transcript_by_id

Get an AssemblyAI transcript by id. Returns the transcript resource including id, status, text, and audio_url; the transcript is ready when status is "completed". Required: id.

delete_a_assembly_ai_transcript_by_id

Delete an AssemblyAI transcript by id, removing all associated data and marking it as deleted. Returns the deleted transcript object. Required: id.

list_all_assembly_ai_sentences

Get the sentences for an AssemblyAI transcript, semantically segmented for reader-friendly output. Returns an array of sentence objects. Required: transcript_id.

list_all_assembly_ai_paragraphs

Get the paragraphs of an AssemblyAI transcript, semantically segmented for reader-friendly output. Returns an array of paragraph objects. Required: transcript_id.

list_all_assembly_ai_word_searches

Search an AssemblyAI transcript for keywords or phrases and return all matching occurrences. Returns: matches. Required: transcript_id, words. Each search term can be an individual word, number, or phrase of up to five words.

list_all_assembly_ai_redacted_audios

Get the redacted audio in AssemblyAI for a specific transcript using transcript_id. Returns status indicating redacted audio readiness and redacted_audio_url containing the downloadable file link. Redacted audio is available for 24 hours only.

get_single_assembly_ai_subtitle_by_id

Get subtitles for an AssemblyAI transcript in the specified format. Returns the subtitle file content for the given transcript. Required: id, subtitle_format.

create_a_assembly_ai_chat_completion

Create a chat completion in AssemblyAI. Generates model responses based on provided messages or prompt. Returns choices array with message content, finish_reason, and usage details for token counts.

create_a_assembly_ai_speech_understanding

Create a speech understanding task in AssemblyAI for a given transcript_id. Performs translation, speaker_identification, or custom_formatting on the transcript. Returns response objects with task status, mappings, formatted_text, and translated_texts.

list_all_assembly_ai_streaming_token

Generate a temporary streaming_token in AssemblyAI. Requires expires_in_seconds parameter. Returns token and expires_in_seconds fields, which indicate the generated temporary authentication token and its redemption window.

list_all_assembly_ai_voice_agent_token

Generate a temporary Voice Agent token in AssemblyAI. Requires expires_in_seconds. Returns token and expires_in_seconds fields used to authenticate a single session.

Why Truto

Why use Truto’s MCP server for AssemblyAI

Other MCP servers give you a static tool list for one app. Truto gives you a managed, multi-tenant MCP infrastructure across 650+ integrations.

01

Auto-generated, always up to date

Tools are dynamically generated from curated documentation — not hand-coded. As integrations evolve, tools stay current without manual maintenance.

02

Fine-grained access control

Scope each MCP server to read-only, write-only, specific methods, or tagged tool groups. Expose only what your AI agent needs — nothing more.

03

Multi-tenant by design

Each MCP server is scoped to a single connected account with its own credentials. The URL itself is the auth token — no shared secrets, no credential leaking across tenants.

04

Works with every MCP client

Standard JSON-RPC 2.0 protocol. Paste the URL into Claude, ChatGPT, Cursor, or any MCP-compatible agent framework — tools are discovered automatically.

05

Built-in auth, rate limits, and error handling

Tool calls execute through Truto’s proxy layer with automatic OAuth refresh, rate-limit handling, and normalized error responses. No raw API plumbing in your agent.

06

Expiring and auditable servers

Create time-limited MCP servers for contractors or automated workflows. Optional dual-auth requires both the URL and a Truto API token for high-security environments.

How It Works

From zero to integrated

Go live with AssemblyAI in under an hour. No boilerplate, no maintenance burden.

01

Link your customer’s AssemblyAI account

Use Truto’s frontend SDK to connect your customer’s AssemblyAI account. We handle all OAuth and API key flows — you don’t need to create the OAuth app.

02

We handle authentication

Don’t spend time refreshing access tokens or figuring out secure storage. We handle it and inject credentials into every API request.

03

Call our API, we call AssemblyAI

Truto’s Proxy API is a 1-to-1 mapping of the AssemblyAI API. You call us, we call AssemblyAI, and pass the response back in the same cycle.

04

Unified response format

Every response follows a single format across all integrations. We translate AssemblyAI’s pagination into unified cursor-based pagination. Data is always in the result attribute.

FAQs

Common questions about AssemblyAI on Truto

Authentication, rate limits, data freshness, and everything else you need to know before you integrate.

How does end-user authentication to AssemblyAI work?

AssemblyAI uses API key authentication. Through Truto's connected account flow, your end users provide their AssemblyAI API key once, and Truto securely stores and injects it on every API call your product makes on their behalf.

Is transcription synchronous or asynchronous?

AssemblyAI's transcription API is asynchronous. You call create_a_assembly_ai_transcript to start a job, then either poll get_single_assembly_ai_transcript_by_id until the status is 'completed' or rely on AssemblyAI's webhook to notify your service.

How do I upload media files for transcription?

Use create_a_assembly_ai_upload to send the raw audio or video binary to AssemblyAI's storage. The response returns an upload URL that you then pass to create_a_assembly_ai_transcript to begin processing.

Can I support real-time streaming transcription from the browser?

Yes. Use list_all_assembly_ai_streaming_token or list_all_assembly_ai_voice_agent_token to mint short-lived tokens server-side, then hand them to your frontend so the client can stream audio directly to AssemblyAI without exposing the user's long-lived API key.

How do I get readable transcripts instead of one large block of text?

Once a transcript is completed, call list_all_assembly_ai_sentences or list_all_assembly_ai_paragraphs to retrieve semantically segmented chunks with timestamps. For video captions, use get_single_assembly_ai_subtitle_by_id to fetch SRT or VTT output.

How long are redacted audio files available?

Redacted audio files generated via list_all_assembly_ai_redacted_audios are available for 24 hours from AssemblyAI. If your product needs long-term retention, download the file within that window and persist it in your own storage.

From the Blog

AssemblyAI integration guides

Deep dives, architecture guides, and practical tutorials for building AssemblyAI integrations.

AssemblyAIAPI integration