---
title: Z.ai API Integration on Truto
slug: zai
category: Artificial Intelligence
canonical: "https://truto.one/integrations/detail/zai/"
---

# Z.ai API Integration on Truto


**Category:** Artificial Intelligence  
**Status:** Generally available

## MCP-ready AI tools

Truto exposes 14 tools for Z.ai that AI agents can call directly.

- **create_a_z_ai_chat_completion** — Create a chat completion in zai that generates AI replies for a given conversation history. Supports multimodal inputs (text, image, video, file), function calling via tools, and both streaming and non-streaming output modes. Returns the completion object including id, model, choices, and usage. Required: model, messages.
- **create_a_z_ai_videos_generation** — Create a video generation task in Z.AI using CogVideoX or Vidu models from a text prompt, image URL, or pair of first/last frame images. Supports text-to-video, image-to-video, and first/last-frame-to-video workflows. Returns the created generation task. Returns: model, id, request_id, task_status. Required: model.
- **get_single_z_ai_paas_async_result_by_id** — Get the result of an asynchronous request in zai by task id. Returns the async result object with task-type-specific fields — either an AsyncVideoGenerationResponse or AsyncImageGenerationResponse depending on the originating task; consult the zai upstream docs for field-level details. Required: id.
- **create_a_z_ai_images_generation** — Generate a high-quality image from a text prompt using GLM-Image series models in zai. Returns: url, id, created_at. Required: model, prompt.
- **create_a_z_ai_audio_transcription** — Transcribe an audio file into text in zai using the GLM-ASR-2512 model, supporting multiple languages and optional real-time streaming output. Returns: id, created, request_id, model, text. Required: model.
- **create_a_z_ai_paas_tokenizer** — Tokenize text input using a specified model in zai to calculate token counts, suitable for text length evaluation, model input estimation, dialogue context truncation, and cost calculation. Returns: created, id, request_id, usage. Required: model, messages.
- **create_a_z_ai_paas_layout_parsing** — Submit a layout parsing request in zai using the GLM-OCR model to extract text content and layout information from a document image or PDF. Returns: id, created, model, md_results, layout_details, layout_visualization, data_info, usage, request_id, attributes. Required: model, file.
- **create_a_z_ai_paas_web_search** — Perform a web search using the zai LLM-optimized Web Search API, which enhances intent recognition to return results better suited for large language model processing, including webpage titles, URLs, summaries, site names, and favicons. Returns: id, created, search_result.
- **create_a_z_ai_paas_reader** — Create a web reader request in zai to read and parse the content of a specified URL. Returns: id, created, request_id, model, reader_result, url, content. Required: url.
- **create_a_z_ai_paas_file** — Upload an auxiliary file (such as a glossary or terminology list) to zai to enhance translation accuracy and consistency. Returns the uploaded file record including id, object, bytes, filename, purpose, and created_at. Required: purpose, file. File size limit is 100 MB; accepted formats are pdf, doc, xlsx, ppt, txt, jpg, and png.
- **create_a_z_ai_agent** — Create a zai agent task for one of three agent types: General Translation (multilingual text translation with auto-detection and glossary support), Popular Special Effects Videos (AI video generation from an image and prompt via a template), or GLM Slide/Poster (slide or poster generation from natural language instructions). Returns: id, agent_id, choices, usage, status, async_id.
- **create_a_z_ai_agents_async_result** — Query the result of an asynchronous zai agent request. Returns: status, agent_id, async_id, choices, id, usage. Required: async_id.
- **create_a_z_ai_agents_conversation** — Query conversation history for a zai slides_glm_agent (the only supported agent type). Returns id, agent_id, choices (with messages containing role and content, and finish_reason), and usage token statistics for synchronous responses, or status and async_id for asynchronous responses.
- **create_a_z_ai_async_images_generation** — Create an async image generation job in zai using the GLM-Image model. Returns: model, id, request_id, task_status. Required: model, prompt.

## How it works

1. **Link your customer's Z.ai account.** Use Truto's frontend SDK; we handle every OAuth and API key flow so you don't need to create the OAuth app.
2. **Authentication is automatic.** Truto refreshes tokens, stores credentials securely, and injects them into every API request.
3. **Call Truto's API to reach Z.ai.** The Proxy API is a 1-to-1 mapping of the Z.ai API.
4. **Get a unified response format.** Every response uses a single shape, with cursor-based pagination and data in the `result` field.

## Use cases

- **Embed multimodal AI generation into your product** — Offer your users a single connection to Z.ai that powers text, image, video, and audio generation natively inside your app. Instead of integrating four separate AI vendors, you ship one Z.ai integration via Truto and expose any modality your roadmap demands.
- **Power document intelligence workflows** — SaaS platforms handling contracts, invoices, or regulatory PDFs can use Z.ai's layout parsing to extract structured markdown and bounding boxes, then chain into chat completions for extraction and classification. Truto handles the auth and request plumbing so your team focuses on the parsing pipeline.
- **Ship brand-compliant AI translation** — Localization and content platforms can let customers upload glossary files to Z.ai and trigger translation agents that respect proprietary terminology. This becomes an upsell tier above generic LLM translation, with Truto managing the file upload and agent invocation.
- **Add web-grounded AI features without scrapers** — Sales enablement, research, and content tools can ground AI outputs in real-time data using Z.ai's web search and URL reader endpoints. Your users get fresh, cited responses instead of stale model knowledge — no scraping infrastructure required.
- **Offer conversational asset creation** — Marketing and presentation tools can plug into Z.ai's agent endpoints to generate slides, posters, and special-effects videos, then let end users iterate conversationally. Truto's connection layer lets your customers bring their own Z.ai account and consumption budget.

## What you can build

- **Async video generation with polling** — Trigger video generation jobs via Z.ai and poll the async result endpoint to surface a 'rendering' state in your UI until the final asset is ready.
- **PDF-to-structured-markdown pipeline** — Run uploaded documents through Z.ai's layout parsing to extract markdown, tables, and layout coordinates, then feed the output into chat completions for downstream extraction.
- **Glossary-anchored translation agent** — Upload customer terminology files via the Z.ai file endpoint and invoke the translation agent so brand names and technical terms render consistently across languages.
- **First-and-last-frame video interpolation** — Let users supply two reference frames and a prompt to generate smooth branded transitions or product showcase videos without manual editing.
- **Web-augmented chat answers** — Combine Z.ai's web search and URL reader endpoints with chat completions to deliver grounded, real-time answers inside your assistant or copilot feature.
- **Audio-to-action-items workflow** — Transcribe meeting or call audio through Z.ai's transcription endpoint and pipe the transcript into chat completions to extract action items, summaries, and CRM updates.

## FAQs

### How does authentication work for end users connecting their Z.ai account?

Z.ai uses API key authentication. Through Truto, your end users provide their Z.ai API key during the connection flow, and Truto securely stores and injects it on every request — your application never has to handle or persist the raw credential.

### Which Z.ai operations are available through Truto today?

Truto exposes the full set of Z.ai endpoints needed for production workflows: chat completions, image generation (sync and async), video generation, audio transcription, layout parsing, tokenizer, web search, URL reader, file upload, agent invocation, agent conversations, and async result polling.

### How do long-running jobs like video and image generation work?

Z.ai's video and async image endpoints return a task ID immediately. Your app polls the async result endpoint via Truto until the job completes. Truto passes through the task IDs and response payloads without modification, so you can implement polling or backoff logic on your side.

### Are there unified APIs available for Z.ai?

Not yet. Z.ai is currently available as a direct passthrough integration, meaning you call Z.ai's native endpoints through Truto's unified connection and auth layer. If you need a unified AI API across multiple providers, contact Truto — these are built on request.

### Can we upload files like glossaries or documents through the integration?

Yes. The Z.ai file endpoint accepts files (including formats like .xlsx and .txt up to Z.ai's documented size limits) and Truto proxies the upload. The returned file reference can then be bound to translation or other agents in subsequent calls.

### How are rate limits and errors handled?

Truto passes Z.ai's native rate limit headers and error responses back to your application unchanged, so you can implement retry and backoff logic based on Z.ai's published limits. Per-end-user quotas are governed by each user's own Z.ai account and plan.