---
title: Sarvam API Integration on Truto
slug: sarvam
category: Default
canonical: "https://truto.one/integrations/detail/sarvam/"
---

# Sarvam API Integration on Truto



**Category:** Default  
**Status:** Generally available

## MCP-ready AI tools

Truto exposes 28 tools for Sarvam that AI agents can call directly.

- **create_a_sarvam_chat_completion** — Create a Sarvam AI chat completion by sending a model name and a list of messages. Returns: id, model, choices (each containing a message with role and content, plus finish_reason), and usage. Required: model, messages.
- **create_a_sarvam_text_translation** — Translate text from one Indic language to another using Sarvam AI's translation service. Returns: translated_text. Required: input, source_language_code, target_language_code.
- **create_a_sarvam_text_transliteration** — Transliterate text from one script to another using the Sarvam AI transliterate API. Returns: transliterated_text (the converted text in the target script). Required: input, source_language_code, target_language_code.
- **create_a_sarvam_text_language_identification** — Identify the language of a text input using Sarvam AI's language identification (LID) endpoint. Returns: language_code. Required: input.
- **create_a_sarvam_speech_to_text** — Transcribe audio using Sarvam's Saaras v3 speech recognition model. Submits an audio file via multipart form-data and returns the transcribed output. Supports output modes: transcribe (original language, default), translate (to English), verbatim (word-for-word), translit (romanization), and codemix (mixed script). Returns: transcript. Required: file, model.
- **create_a_sarvam_speech_to_text_translate** — Translate speech from an uploaded audio file into English text using Sarvam AI's Saaras model. Returns: transcript. Required: file.
- **create_a_sarvam_text_to_speech** — Convert text to speech using Sarvam AI's TTS engine, synthesizing audio in the specified target language. Returns an opaque synthesized audio payload. Required: text, target_language_code.
- **create_a_sarvam_text_to_speech_stream** — Stream text-to-speech audio conversion using Sarvam AI's TTS stream endpoint. Converts input text into spoken audio and returns a streamed binary audio response. Required: text, target_language_code.
- **list_all_sarvam_pronunciation_dictionaries** — List all pronunciation dictionaries in Sarvam. Returns: id.
- **get_single_sarvam_pronunciation_dictionary_by_id** — Get a single Sarvam pronunciation dictionary by id. Returns: id. Required: id.
- **create_a_sarvam_pronunciation_dictionary** — Create a new pronunciation dictionary in Sarvam. Returns: id.
- **update_a_sarvam_pronunciation_dictionary_by_id** — Update an existing Sarvam pronunciation dictionary by id. Returns: id. Required: id.
- **delete_a_sarvam_pronunciation_dictionary_by_id** — Delete a Sarvam pronunciation dictionary by id. Returns an empty 204 response on success. Required: id.
- **create_a_sarvam_speech_to_text_job** — Initiate an async speech-to-text transcription job in Sarvam using the Saaras model. Returns the job_id to poll for results via the status endpoint. Required: model.
- **get_single_sarvam_speech_to_text_job_by_id** — Get the current status of a Sarvam speech-to-text transcription job by id. Returns: job_id, status, and transcript (populated when the job completes). Required: id.
- **create_a_sarvam_speech_to_text_job_upload** — Upload audio files for asynchronous batch speech-to-text processing in Sarvam via multipart form. Returns: request_id. Required: file.
- **create_a_sarvam_speech_to_text_job_start** — Start a pending speech-to-text batch job in Sarvam AI by its job ID, initiating asynchronous audio processing. Returns: request_id, status. Required: job_id.
- **create_a_sarvam_speech_to_text_job_download** — Download output files for a completed Sarvam speech-to-text batch job by requesting the transcribed output files of a previously submitted job. The request body fields and response structure are defined by the Sarvam job download API; consult the upstream docs for the field-level breakdown.
- **create_a_sarvam_speech_to_text_translate_job** — Initiate a Sarvam speech-to-text translation batch job for asynchronous audio processing. Returns: job_id, status. Use the returned job_id to poll for results via the status endpoint. Implement a minimum 5ms delay between consecutive status polling requests.
- **get_single_sarvam_speech_to_text_translate_job_by_id** — Get the status of a Sarvam speech-to-text translation batch job by id. Returns: job_id, status. Required: id. Implement a minimum 5ms delay between consecutive status polling requests to avoid hitting rate limits.
- **create_a_sarvam_speech_to_text_translate_job_upload** — Upload audio files to create an asynchronous batch speech-to-text translation job in Sarvam. Returns the job_id for polling the job's processing status. Required: files.
- **create_a_sarvam_speech_to_text_translate_job_start** — Start a Sarvam speech-to-text translate batch job by its job ID, triggering audio processing for the specified job. Returns an empty 204 response on success. Required: job_id.
- **create_a_sarvam_speech_to_text_translate_job_download** — Create a download request to retrieve output files from a completed Sarvam speech-to-text translation batch job. Returns file download data whose structure varies by job type and output format; consult the Sarvam batch job documentation for field-level details.
- **create_a_sarvam_document_intelligence_job** — Initialize a Sarvam Document Intelligence job to begin async document digitization processing. Returns: request_id, status.
- **get_single_sarvam_document_intelligence_job_by_id** — Get the status of a Sarvam document digitization job by id. Returns: request_id, status. Required: id.
- **create_a_sarvam_document_intelligence_job_upload** — Create a file upload request for a Sarvam document digitization job to receive pre-signed upload links for document processing. See https://docs.sarvam.ai/api-reference-docs/document-intelligence/get-upload-links for the full request body and response fields.
- **create_a_sarvam_document_intelligence_job_start** — Start a Sarvam document digitization job, triggering asynchronous processing for the specified job. Returns an empty 204 response on success. Required: job_id.
- **create_a_sarvam_document_intelligence_job_download** — Create download links for the processed output files of a Sarvam document intelligence job. Posts a request to generate pre-signed download URLs for the specified job's output files. Returns id and attributes containing the download link data; the exact field structure depends on the job's output configuration in Sarvam. Required: job_id.

## How it works

1. **Link your customer's Sarvam account.** Use Truto's frontend SDK; Truto handles the credential flow (API keys, in Sarvam's case), so you don't need to build it yourself.
2. **Authentication is automatic.** Truto refreshes tokens, stores credentials securely, and injects them into every API request.
3. **Call Truto's API to reach Sarvam.** The Proxy API is a 1-to-1 mapping of the Sarvam API.
4. **Get a unified response format.** Every response uses a single shape, with cursor-based pagination and data in the `result` field.
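
As a rough illustration of step 4, the sketch below walks a cursor-paginated response and yields each item from the `result` field. It is a minimal sketch under stated assumptions: `fetch_page` is a hypothetical callable you would implement with your own HTTP client, and the `next_cursor` field name is an assumption not confirmed on this page; check Truto's API reference for the exact name.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def iter_results(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[Any]:
    """Yield every item across pages of a cursor-paginated response.

    fetch_page is a hypothetical callable: it takes the current cursor
    (None for the first page) and returns the decoded JSON body.
    """
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        # Data lives in the `result` field, as described above.
        yield from page.get("result", [])
        # ASSUMPTION: the cursor for the next page is named `next_cursor`.
        cursor = page.get("next_cursor")
        if not cursor:
            break
```

Passing the page fetcher in as an argument keeps the loop independent of any particular HTTP library and makes it easy to exercise with canned pages.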

## Use cases

- **Embed vernacular voice AI into your SaaS** — Offer your customers native Indic speech-to-text, text-to-speech, and chat completions so they can build voice bots that actually understand Hindi, Tamil, Bengali, and code-mixed 'Hinglish' conversations without stitching together multiple providers.
- **Power batch call transcription for contact center platforms** — Let CX and helpdesk tools ingest hours of regional-language call recordings, transcribe and translate them to English via Sarvam's async job APIs, and surface searchable transcripts inside their existing supervisor dashboards.
- **Automate regional document digitization in fintech workflows** — Lending, KYC, and insurance SaaS can offer one-click extraction of handwritten Indic-language documents using Sarvam's Document Intelligence jobs, eliminating manual data entry for Tier 2/3 customer onboarding.
- **Localize marketing and support content at scale** — CRMs and marketing automation tools can auto-translate, transliterate, and voice-synthesize English campaigns into 10+ Indian languages, helping their customers reach the 800M+ vernacular-first user base.
- **Route inbound messages by detected language** — Helpdesk and conversational platforms can use Sarvam's language identification to automatically tag tickets and route them to the right regional agent or AI pipeline, replacing manual triage.

## What you can build

- **Code-mixed voice bot pipeline** — Chain Sarvam's speech-to-text (with codemix mode), chat completion, and streaming text-to-speech to ship a Hinglish-fluent voice agent over WhatsApp or IVR.
- **Async call recording transcription with translation** — Trigger speech-to-text translate jobs on uploaded call recordings, poll for completion, and download English transcripts of vernacular conversations directly into your ticketing UI.
- **Handwritten document OCR for KYC** — Use the document intelligence job workflow to upload scanned Indic-language documents, start processing, and pull back structured JSON to pre-fill onboarding forms.
- **Brand-aware text-to-speech with pronunciation dictionaries** — Let your customers manage custom pronunciation dictionaries via Truto so generated audio correctly says their product names, acronyms, and regional terms.
- **Auto-language detection and translation for inbound tickets** — Run text language identification on every incoming message, then call text translation to normalize it to English for your AI workflows or human agents.
- **Multi-language campaign generator** — Translate one English marketing message into 10+ Indian languages and optionally generate matching TTS audio clips for WhatsApp broadcasts in a single workflow.
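
The auto-language detection idea above could be sketched roughly as follows. The request and response field names (`input`, `language_code`, `source_language_code`, `target_language_code`, `translated_text`) come from the tool descriptions on this page; the `identify` and `translate` callables are hypothetical wrappers you would write around the corresponding tools, and the `en-IN` code for English is an assumption.

```python
from typing import Any, Callable, Dict

Payload = Dict[str, Any]


def normalize_to_english(
    text: str,
    identify: Callable[[Payload], Payload],
    translate: Callable[[Payload], Payload],
    english_code: str = "en-IN",  # ASSUMPTION: code used for English
) -> Dict[str, str]:
    """Detect the language of `text` and translate it to English if needed.

    identify and translate are hypothetical wrappers around the language
    identification and text translation tools listed on this page.
    """
    detected = identify({"input": text})["language_code"]
    if detected == english_code:
        # Already English: skip the translation call entirely.
        return {"language_code": detected, "text": text}
    translated = translate(
        {
            "input": text,
            "source_language_code": detected,
            "target_language_code": english_code,
        }
    )
    return {"language_code": detected, "text": translated["translated_text"]}
```

Returning the detected code alongside the normalized text lets the caller tag or route the ticket without a second identification call.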

## FAQs

### How does authentication to Sarvam work through Truto?

Truto handles Sarvam's API key authentication on behalf of your end users. Your users connect their Sarvam account once via Truto's connected account flow, and Truto securely stores and injects credentials on every API call—you never touch the keys.

### Does Truto support both sync and async Sarvam endpoints?

Yes. You can call the synchronous endpoints (chat completion, text translation, transliteration, language identification, real-time speech-to-text, and text-to-speech) as well as the async job workflows for speech-to-text, speech-to-text translate, and document intelligence (create job → upload → start → poll status → download results).

### Can my users manage custom pronunciation dictionaries?

Yes. Truto exposes the full CRUD set for pronunciation dictionaries—list, get by ID, create, update, and delete—so your product can let end users maintain brand-specific pronunciations that apply to their Sarvam TTS calls.

### How do I handle long audio files or large documents?

Use Sarvam's job-based endpoints exposed through Truto. You create a job, upload the file, start processing, then poll the get-job endpoint until it completes, and finally call the download endpoint to retrieve transcripts or extracted data.
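
The sequence above (create, upload, start, poll, download) might be orchestrated along these lines. This is a sketch, not Truto's or Sarvam's client: the `api` dictionary of callables is hypothetical, the `"Completed"` and `"Failed"` status strings are assumptions, and the `job_id` key applies to the speech-to-text job tools (the document intelligence tools return `request_id` instead).

```python
import time
from typing import Any, Callable, Dict

Job = Dict[str, Any]


def wait_for_job(
    get_job: Callable[[str], Job],
    job_id: str,
    *,
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Job:
    """Poll get_job until the job finishes, fails, or the timeout passes."""
    deadline = clock() + timeout_s
    while True:
        job = get_job(job_id)
        status = job.get("status")
        # ASSUMPTION: terminal states are spelled "Completed" and "Failed".
        if status == "Completed":
            return job
        if status == "Failed":
            raise RuntimeError(f"job {job_id} failed")
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still {status!r} after {timeout_s}s")
        sleep(interval_s)  # pause between polls to stay clear of rate limits


def run_job(api: Dict[str, Callable[..., Any]], file_path: str, **wait_kwargs: Any) -> Any:
    """Run create -> upload -> start -> poll -> download with hypothetical callables."""
    job_id = api["create"]()["job_id"]
    api["upload"](job_id, file_path)
    api["start"](job_id)
    wait_for_job(api["get"], job_id, **wait_kwargs)
    return api["download"](job_id)
```

The `sleep` and `clock` parameters exist only so the loop can be tested without real waiting; in production the defaults are fine.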

### Does Truto support Sarvam's streaming text-to-speech?

Yes, the streaming text-to-speech endpoint is available alongside the standard TTS endpoint, so you can build low-latency voice agents that stream audio chunks back to your application.

### What about rate limits and quotas?

Rate limits are governed by the Sarvam plan tied to each end user's API key. Truto passes through Sarvam's rate-limit responses so you can handle backoff in your application, and Truto itself does not impose additional per-call limits beyond your Truto plan.
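
One common way to handle backoff in your application is exponential backoff with jitter, sketched below. It assumes a rate-limited call surfaces as HTTP status 429, which is the conventional code but is not confirmed on this page; `send` is a hypothetical callable that performs one request and returns `(status_code, body)`.

```python
import random
import time
from typing import Any, Callable, Tuple

Response = Tuple[int, Any]


def call_with_backoff(
    send: Callable[[], Response],
    *,
    max_attempts: int = 5,
    base_delay_s: float = 0.5,
    max_delay_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> Response:
    """Retry send() on rate-limit responses with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        status, body = send()
        # ASSUMPTION: rate limiting is signalled by HTTP 429.
        if status != 429:
            return status, body
        if attempt == max_attempts - 1:
            break
        # Double the delay each attempt, capped at max_delay_s.
        delay = min(max_delay_s, base_delay_s * 2 ** attempt)
        # Jitter: sleep between 50% and 100% of the computed delay.
        sleep(delay * (0.5 + rng() / 2))
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

If the upstream response carries a `Retry-After` header, honoring it instead of the computed delay is usually the better choice.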

## Related reading

- [Connect Sarvam to Claude: Indic Translation and Audio Transcription](https://truto.one/blog/connect-sarvam-to-claude-indic-translation-and-audio-transcription/) — Learn how to connect Sarvam to Claude using a managed MCP server. Give your AI agents the tools to run Indic text translation, async audio transcription, and document intelligence.
- [Connect Sarvam to ChatGPT: Multilingual Speech and Text Processing](https://truto.one/blog/connect-sarvam-to-chatgpt-multilingual-speech-and-text-processing/) — Learn how to dynamically generate a managed MCP server for Sarvam, connect it to ChatGPT, and automate multilingual text, speech, and async job workflows.
- [Connect Sarvam to AI Agents: Build Indic Voice and Document Workflows](https://truto.one/blog/connect-sarvam-to-ai-agents-build-indic-voice-and-document-workflows/) — Learn how to connect Sarvam to AI agents using Truto. Step-by-step guide to tool calling, API quirks, and autonomous workflows.
