
Announcing Truto Docs MCP: Stop AI Hallucinations in API Integrations

Stop AI coding assistants from hallucinating API endpoints. The Truto Docs MCP server feeds accurate, structured API documentation directly into Cursor, Claude, and ChatGPT.

Uday Gajavalli · 14 min read

Engineering leaders and senior developers are rapidly adopting AI coding assistants like Cursor and Claude. The speed gains are undeniable. But when it comes to building third-party API integrations, the experience often hits a brick wall. You ask your AI to build a HubSpot or Salesforce integration. It confidently hallucinates a /v4/contacts endpoint, invents an undocumented pagination parameter, and leaves you debugging an invalid_grant error for 45 minutes.

If your AI coding assistant keeps inventing API endpoints, fabricating query parameters, or guessing at OAuth flows, the fix is structured context, not a better prompt.

Today, we are releasing the Truto Docs MCP Server—a public, unauthenticated endpoint that injects our complete, up-to-date documentation and API reference directly into your AI agent's context window. Point your assistant at https://docs-mcp.truto.one/mcp, and it stops guessing.

This post covers what the Docs MCP server does, the architecture behind it, how it pairs with the Truto CLI and Truto Skills, and how to wire it up in your editor in under two minutes.

The Hallucination Problem in API Integrations

AI coding assistants are prediction engines, not knowledge bases. When they hit a gap in their training data, they fill it with something plausible. For integration code, that means phantom endpoints, made-up authentication headers, and pagination cursors that look right but aren't.

AI integration hallucination usually looks like:

  • Calling a route that sounds right but does not exist.
  • Mixing up unified API and proxy API paths.
  • Sending OAuth credentials from the browser because the model missed the server-side boundary.
  • Decoding or modifying an opaque pagination cursor.
  • Inventing fields for a CRM, HRIS, ATS, ticketing, or accounting API because the vendor docs were missing from context.
  • Retrying every error blindly, including 400s and 403s that should never be retried.

Developers are not imagining this. Stack Overflow's 2025 Developer Survey found that 84% of respondents use or plan to use AI tools, but 46% actively distrust AI tool accuracy. The top frustration, cited by 66% of developers, is dealing with AI answers that are close enough to look credible but still wrong. Industry research from Nurix AI shows that these integration failures are the top reason AI agents fail to scale in production environments, while SparkCo reveals that over 40% of AI models exhibit some degree of hallucination in domain-specific tasks. Furthermore, AI-coauthored pull requests show roughly 1.7x more issues than human-only PRs when the model lacks domain context.

That pattern is brutal for third-party integrations. Vendor API docs are inconsistent. Some endpoints use cursor pagination, some use offsets, some use weird link headers, some return has_more, and some lie in the docs. Even if the model has seen an older version of the docs during training, that does not mean it knows the current contract. Gartner has even warned that more than 40% of agentic AI projects may be canceled by the end of 2027 because of inadequate risk controls—exactly the kind of drag bad integration context creates.

The naive fix is to download the vendor's entire OpenAPI spec and dump it into the prompt. That doesn't work either. A fully resolved OpenAPI spec for a modern SaaS platform can easily exceed 10MB of JSON. Stuff it all into context and you blow past the model's working window. The well-known "Lost in the Middle" research from Stanford found that language models perform best when relevant information appears at the start or end of the context, with performance dropping sharply when the answer is buried in the middle of a massive input.

There's a third failure mode that catches teams off guard: stale knowledge. Your AI assistant's training cutoff was months ago. Truto ships new integrations and new unified models continuously. Anything the model "knows" about Truto is out of date the moment we deploy.

So the fix is not bigger prompts. The fix is just-in-time retrieval—the ability for the assistant to fetch the exact paragraph or endpoint schema it needs, when it needs it, from a live source.

Enter the Model Context Protocol (MCP)

The Model Context Protocol (MCP) is the new industry standard for connecting AI to external data. Introduced by Anthropic in November 2024, MCP provides a universal, open standard for connecting AI systems with data sources, replacing fragmented integrations with a standardized JSON-RPC architecture.

Instead of pushing gigabytes of context into a prompt upfront, MCP allows the AI to pull context dynamically. The AI uses predefined "tools" to query an external server, retrieve specific documentation chunks, and read exact API schemas. This shifts the paradigm from hardcoded Retrieval-Augmented Generation (RAG) pipelines to autonomous, agent-driven discovery.
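Under the hood, every MCP tool invocation is a JSON-RPC 2.0 request. The sketch below shows only the envelope shape defined by the MCP spec; the tool name and arguments are illustrative, and real clients also perform an initialize handshake before calling tools.

```typescript
// Sketch of the JSON-RPC 2.0 envelope an MCP client sends for a tool call.
// Real clients negotiate capabilities via an `initialize` handshake first;
// this only illustrates the tools/call shape from the MCP spec.
let nextRequestId = 0;

export function buildToolCall(name: string, args: Record<string, unknown>) {
  return {
    jsonrpc: "2.0" as const,
    id: ++nextRequestId,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Example: ask a docs server to search for pagination guidance.
const request = buildToolCall("search_docs", { query: "pagination cursor" });
```

The point is that the client decides, mid-conversation, which tool to call and with what arguments; nothing is stuffed into the prompt ahead of time.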

Adoption is no longer a question. The protocol was adopted by major AI providers, including OpenAI and Google DeepMind, and Anthropic recently donated MCP to the Linux Foundation's new Agentic AI Foundation, ensuring these foundational technologies remain neutral, open, and community-driven. If you ship a public MCP server today, every major AI client can speak to it tomorrow.

Introducing the Truto Docs MCP Server

The Truto Docs MCP server is a static knowledge base designed specifically for LLM consumption: a public, unauthenticated MCP server that exposes Truto's full documentation and API reference as auto-generated tools. No API token, no OAuth dance, no rate limit on the docs themselves.

Drop the URL into any MCP-compatible client and your assistant can answer Truto-specific questions with citations to real docs.

Info

Endpoint URL: https://docs-mcp.truto.one/mcp
Protocol: Streamable HTTP (SSE)

It exposes four core tools and two resources to your AI assistant:

  • search_docs: Semantic vector search across all Truto guides and API references. Returns ranked hits with slug, title, section, score, and snippet. Use it when the agent knows the topic but not the exact page.
  • get_doc_page: Retrieves the full, raw Markdown content of a specific documentation page by its slug, with smart fuzzy fallback. Use it when the agent needs exact prose, examples, or setup steps.
  • list_api_endpoints: Returns a directory of all available API endpoints, optionally filtered by group (e.g., unified-crm-api). Use it when the agent needs to discover available methods.
  • get_api_endpoint: Fetches the complete, granular details of a specific API endpoint, including path parameters, query schemas, request body fields, and supported integrations. Use it before the agent writes code that calls Truto.

The two resources (docs://pages and docs://api-reference) give an LLM a cheap way to enumerate the entire documentation tree before it decides what to fetch. The agent can browse like a developer reading a sidebar, then drill into exactly the page or endpoint it needs.

What the agent sees when it calls get_api_endpoint

A single call returns everything the model needs to write a correct integration call: endpoint summary, HTTP method, path parameters, query parameters with descriptions and enum values, request body schema with oneOf/anyOf variants resolved, response shape, the full list of integrations that support that endpoint, and deprecation status. No more inventing parameters. No more guessing whether pipedrive is supported on a given unified CRM endpoint.
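As a mental model, the payload resembles the TypeScript shape below. The field names are illustrative assumptions for explanation only, not Truto's actual wire format; have your agent call the tool to see the real contract.

```typescript
// Illustrative shape only -- field names are assumptions for explanation,
// not Truto's actual response contract.
interface EndpointDetails {
  summary: string;
  method: "GET" | "POST" | "PATCH" | "DELETE";
  path: string;
  pathParameters: Array<{ name: string; description: string }>;
  queryParameters: Array<{ name: string; description: string; enum?: string[] }>;
  requestBody?: Record<string, unknown>; // oneOf/anyOf variants pre-resolved
  supportedIntegrations: string[];       // e.g. ["hubspot", "pipedrive"]
  deprecated: boolean;
}

// A hypothetical instance for a unified CRM "list contacts" call.
const example: EndpointDetails = {
  summary: "List contacts",
  method: "GET",
  path: "/contacts",
  pathParameters: [],
  queryParameters: [
    { name: "next_cursor", description: "Opaque pagination cursor" },
  ],
  supportedIntegrations: ["hubspot", "pipedrive"],
  deprecated: false,
};
```

Because supported integrations and deprecation status ride along with the schema, the agent can check both before emitting a single line of code.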

Separate from Live Execution

This is intentionally a separate system from Truto's per-account integration MCP servers. Those servers expose live tool calls against a customer's connected Salesforce, HubSpot, or Jira account. The Docs MCP server is read-only documentation. Different jobs, different security posture, different scope.

Think of the Docs MCP server as the "man pages" for Truto. The per-account MCP servers are the actual command execution. You want both: the agent reads the manual, then runs the right command.

The Architecture

The architecture is boring in the best way. Building an MCP server that responds fast enough for real-time coding assistants requires moving away from traditional, heavy backend architectures. We built the Truto Docs MCP server using a globally distributed edge runtime paired with a high-performance vector index.

flowchart LR
    A[Markdown docs<br>+ OpenAPI spec] --> B[Chunk by H1-H3<br>headings]
    B --> C[Embed chunks<br>768-dim model]
    C --> D[Vector index<br>cosine similarity]
    A --> E[mcp-index.json<br>full content map]
    F[MCP Client<br>Cursor / Claude] -->|POST /mcp| G[Edge runtime]
    G -->|embed query| C
    G -->|vector search| D
    G -->|page lookup| E
    G -->|JSON-RPC response| F

The Build-Time Pipeline

Naive chunking—splitting text arbitrarily by character count—destroys the semantic meaning of API documentation. Our build pipeline takes a structural approach:

  1. Heading-Based Chunking: The pipeline parses our Markdown documentation and splits the content strictly by H1-H3 heading boundaries. Tiny tail sections (under 100 characters) are merged into their predecessor to avoid polluting the index with useless fragments.
  2. Schema Flattening: The pipeline fetches our fully resolved OpenAPI spec. It flattens deeply nested API methods into single, highly descriptive text chunks containing the HTTP method, path, summary, query parameters, body fields, and supported integrations.
  3. Vector Embedding: Every chunk is embedded using a 768-dimensional embedding model and upserted into a vector index alongside its metadata (slug, title, section, and a text snippet). Build is incremental: a content-hash cache means only changed chunks get re-embedded on each docs deploy.
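The chunking step is easy to picture in code. This is a simplified sketch, not our production pipeline; the 100-character merge threshold mirrors the rule described above, and preamble before the first heading is ignored for brevity.

```typescript
interface Chunk {
  heading: string;
  content: string;
}

// Split Markdown on H1-H3 heading boundaries, then merge undersized
// sections into their predecessor so tiny fragments don't pollute the index.
export function chunkByHeadings(markdown: string, minChars = 100): Chunk[] {
  const chunks: Chunk[] = [];
  let current: Chunk | null = null;

  for (const line of markdown.split("\n")) {
    const match = /^#{1,3}\s+(.*)$/.exec(line); // H4+ stays inside its parent
    if (match) {
      if (current) chunks.push(current);
      current = { heading: match[1], content: "" };
    } else if (current) {
      current.content += line + "\n";
    }
  }
  if (current) chunks.push(current);

  // Merge pass: fold sections under the threshold into the previous chunk.
  const merged: Chunk[] = [];
  for (const chunk of chunks) {
    const prev = merged[merged.length - 1];
    if (prev && chunk.content.trim().length < minChars) {
      prev.content += `\n${chunk.heading}\n${chunk.content}`;
    } else {
      merged.push(chunk);
    }
  }
  return merged;
}
```

Structural chunking like this keeps each embedded vector aligned with one coherent topic, which is what makes the search results usable downstream.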

The Runtime Architecture

When your AI assistant sends a request to the MCP server, the request hits a stateless edge runtime. The transport layer uses the MCP HTTP-Streamable protocol. Each /mcp request creates a fresh MCP server instance. There is no authentication required, no session tracking, and no rate limiting on the MCP endpoint itself. The edge runtime handles the query embedding on the fly and returns the vector search results in milliseconds.

The key design choice is that semantic search and exact lookup are separate. Search is good for discovery. Exact lookup is good for correctness. If an agent asks, "How do I create a CRM contact?", search can find the relevant endpoint. Before code is written, get_api_endpoint can fetch the exact path, method, query parameters, request body, and response schema.

LLM Discovery Files and Markdown Twins

Beyond the MCP protocol, we also generate static discovery files following the emerging llms.txt convention.

  • /docs/llms.txt: A site-level index file listing every page with a one-line description and link.
  • /docs/llms-full.txt: A concatenation of every docs page as Markdown. We deliberately exclude API method pages from this full dump. A typical method page expands to over 200 lines of parameter tables. Including all of them would push the file past the context windows of smaller frontier models.

Every HTML page on our docs site also has a hidden .md twin. If you navigate to /docs/guides/foo.md, you get a clean Markdown version of the page with a > Source: {canonical URL} line injected at the top. This ensures that if you manually copy-paste documentation into ChatGPT, the AI retains the source attribution.

Handling Edge Cases: Fuzzy Matching and Rate Limits

Building developer tools requires acknowledging how systems actually fail in the real world.

Fuzzy Slug Matching

LLMs are notoriously bad at remembering exact URL slugs. They will ask for unified-apis or crm contacts instead of guides/unified-apis/what-are-unified-apis.

Our edge runtime implements smart fallback behavior. If an AI calls get_doc_page with an imprecise slug, the server searches all available slugs for partial substring matches. If it finds a single partial match, it returns that page. If it finds multiple candidates, it returns the list of options, allowing the AI to correct itself. That sounds small until you watch an agent loop for five minutes because it guessed a slug with one missing path segment.
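A simplified sketch of that fallback logic follows. Our actual implementation differs; this only illustrates the exact-then-fuzzy strategy, using a token-containment heuristic as the fuzzy pass.

```typescript
// Resolve a possibly-imprecise slug: exact match first, then a fuzzy pass
// where every whitespace- or slash-separated token must appear in the slug.
export function resolveSlug(
  requested: string,
  allSlugs: string[]
): { slug?: string; candidates?: string[] } {
  if (allSlugs.includes(requested)) return { slug: requested };

  const tokens = requested.toLowerCase().split(/[\s/]+/).filter(Boolean);
  const matches = allSlugs.filter((slug) => {
    const haystack = slug.toLowerCase();
    return tokens.every((token) => haystack.includes(token));
  });

  if (matches.length === 1) return { slug: matches[0] };  // auto-correct
  if (matches.length > 1) return { candidates: matches }; // let the agent pick
  return {};                                              // genuine miss
}
```

Returning the candidate list on ambiguity matters: it turns a dead-end error into a one-step self-correction for the agent.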

The Reality of Rate Limits

When building integrations, rate limits are the most common failure point. It is critical to understand how Truto handles them so your AI writes the correct application logic.

Truto does not magically absorb, automatically retry, or apply backoff when an upstream vendor API returns HTTP 429.

When a vendor API (like Salesforce or Zendesk) returns an HTTP 429 Too Many Requests error, Truto passes that error directly back to the caller. We normalize the upstream rate limit information into standard IETF headers:

  • ratelimit-limit
  • ratelimit-remaining
  • ratelimit-reset

This is intentional. Silent retries inside an integration platform hide latency, blow your own rate budget, and make debugging miserable. By exposing this exact behavior via the Docs MCP server, your AI agent knows that it cannot rely on Truto to retry failed requests. The AI will correctly implement exponential backoff and retry logic in your application code, reading the normalized headers to determine exactly how long to wait.

A minimal retry wrapper generated by an informed agent should treat 429 differently from validation and auth errors:

const sleep = (ms: number) => new Promise(resolve => setTimeout(resolve, ms));
 
export async function trutoFetchWithBackoff(
  url: string,
  init: RequestInit,
  maxAttempts = 4
) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const res = await fetch(url, init);
 
    // Retry only rate limits (429) and transient server errors (5xx).
    if (res.status === 429 || res.status >= 500) {
      if (attempt === maxAttempts) return res;
 
      // retry-after may be delta-seconds or an HTTP date; Number() turns a
      // date into NaN, which fails the > 0 check and falls back to backoff.
      const retryAfter = res.headers.get('retry-after');
      const rateLimitReset = res.headers.get('ratelimit-reset');
      const waitSeconds = Number(retryAfter || rateLimitReset || 0);
 
      const jitter = Math.floor(Math.random() * 250);
      const fallback = Math.min(30_000, 500 * 2 ** (attempt - 1));
      const delayMs = waitSeconds > 0 ? waitSeconds * 1000 : fallback + jitter;
 
      await sleep(delayMs);
      continue;
    }
 
    // Do not retry malformed requests, missing scopes, expired tokens, or validation errors.
    return res;
  }
 
  throw new Error('unreachable');
}

That is exactly the kind of pattern you want your assistant to generate after consulting docs—not a blind while true loop that turns a provider rate limit into an outage.

Pairing Docs MCP with Truto Skills and the CLI

The best setup uses three layers of context, each with a different job. The Docs MCP server gets you accurate retrieval, but it is one piece of the puzzle. To keep hallucinations out of the whole development loop, combine it with the rest of our AI tooling.

  • Truto Docs MCP: Fetches current docs and endpoint schemas. Example: "What fields are required to create a CRM contact?"
  • Truto Skills: Teaches your coding agent Truto conventions. Example: "Use server-side API tokens, walk next_cursor, handle Link SDK setup."
  • Truto CLI: Validates against a real environment from the terminal. Example: "List accounts, inspect tools, export contacts, debug logs."

1. Truto Skills

Truto Skills is a SKILL.md-compatible repository that injects Truto's conventions, JSONata mapping patterns, CLI usage, Link SDK behavior, and API idioms directly into your agent's context as a skill. Skills tell the model how to think about Truto. The setup flow is one command:

npx skills add trutohq/truto-skills

2. Truto CLI

The Truto CLI closes the loop by letting the agent verify what it just wrote. Instead of writing code and hoping it works, the AI can shell out to the CLI to run commands against a real integrated account, inspect the actual JSON payload, and write deterministic code based on real data.

curl -fsSL https://cli.truto.one/install.sh | bash
truto login --token "$TRUTO_API_TOKEN"
truto export crm/contacts -a <account-id> -o ndjson

The Ideal Agent Loop

A realistic agent loop looks like this:

sequenceDiagram
    participant Dev as Developer
    participant Agent as AI Agent<br>(Cursor / Claude)
    participant Skill as Truto Skill<br>(SKILL.md)
    participant Docs as Docs MCP<br>(docs-mcp.truto.one)
    participant CLI as Truto CLI

    Dev->>Agent: "Sync HubSpot contacts to my DB"
    Agent->>Skill: Load Truto conventions
    Agent->>Docs: search_docs("unified CRM contacts list")
    Docs-->>Agent: Endpoint hit + snippet
    Agent->>Docs: get_api_endpoint("unified-crm-api", "contacts", "list")
    Docs-->>Agent: Full schema + supported integrations
    Agent->>Agent: Generate code with correct fields
    Agent->>CLI: truto export crm/contacts -a <id> -o ndjson
    CLI-->>Agent: Real records, schema confirmed
    Agent-->>Dev: Working code + sample output

No invented endpoints. No phantom query parameters. The agent reasons against live truth at every step.

Getting Started in Cursor, Claude Desktop, and ChatGPT

Install time is under two minutes. The Docs MCP server is public, so there is no API token step for documentation access.

Cursor MCP Integration

Open ~/.cursor/mcp.json (or your workspace .cursor/mcp.json) and add:

{
  "mcpServers": {
    "truto_docs": {
      "url": "https://docs-mcp.truto.one/mcp"
    }
  }
}

Restart Cursor. The four tools (search_docs, get_doc_page, list_api_endpoints, get_api_endpoint) will appear in the MCP panel. You can also configure this via the UI by navigating to Features > MCP > + Add New MCP Server, setting type to sse, and pasting the URL.

Claude Desktop MCP Setup

Claude Desktop currently connects to remote HTTP-Streamable MCP servers through an npx bridge. Add this to claude_desktop_config.json from Settings > Developer > Edit Config:

{
  "mcpServers": {
    "truto_docs": {
      "command": "npx",
      "args":["-y", "mcp-remote", "https://docs-mcp.truto.one/mcp"]
    }
  }
}

Fully quit and reopen Claude Desktop after saving.

Claude Code MCP Setup

For Claude Code, use the HTTP transport directly in your terminal:

claude mcp add --transport http truto_docs https://docs-mcp.truto.one/mcp

Add --scope user if you want it available across projects.

ChatGPT (Developer Mode)

In Settings > Apps > Advanced settings, enable Developer mode, then add a custom MCP server with the URL https://docs-mcp.truto.one/mcp. ChatGPT will list the available tools after connecting.

Prompts That Work Well

Once installed, be explicit. Good agents respond well to narrow instructions:

"Use the Truto Docs MCP server before writing code. Find the exact endpoint and schema for listing CRM contacts, then implement a server-side helper that walks pagination with next_cursor."

"Use Truto Skills and the Truto Docs MCP server. Add a POST /api/truto/link-token route, keep TRUTO_API_TOKEN server-side, and include error handling for expired or invalid tokens."

"Before changing code, use the Truto Docs MCP server to confirm how Truto returns provider rate limit errors. Then update our retry wrapper with bounded exponential backoff."

What This Will Not Fix

A Docs MCP server is not a magic correctness shield.

It makes your assistant correct about Truto. It does not make it correct about HubSpot's quirks, Salesforce's custom field semantics, or Jira's permission model. It will not know your internal tenant model. It will not decide whether a write operation needs human approval. It will not stop a developer from accepting a bad diff without review.

It changes the failure mode. Instead of the model guessing, it can ask the docs. Instead of stuffing a huge API reference into context, it can retrieve the exact page or endpoint schema. Instead of relying on old training data, it can use the current contract.

That is a meaningful improvement for API integration work, but you still need tests, logs, retry limits, idempotency keys for writes, and code review. For live data execution, you still need the per-integration managed MCP servers backed by real connected accounts, plus pragmatic testing against sandbox data.

The Practical Next Step

Documentation MCP servers are becoming table stakes for any developer-facing platform. Stale training data plus bad retrieval is the single largest source of AI-generated integration bugs we see.

Install the Docs MCP server in your coding assistant today, install Truto Skills, authenticate the CLI, and try one integration task that normally takes you a few rounds of doc checking. Ask the agent to prove every Truto endpoint by calling get_api_endpoint before it writes code.

Tip

The cleanest agent setup is: Skills for conventions, Docs MCP for current docs, CLI for verification, and scoped MCP servers for live customer data. Do not collapse all four jobs into one giant prompt.

FAQ

What is the Truto Docs MCP server?
A public, unauthenticated Model Context Protocol endpoint that allows AI assistants like Cursor, Claude, and ChatGPT to semantically search and read Truto's documentation and API reference on demand.
How does the Docs MCP server prevent AI hallucinations when writing integration code?
Instead of relying on stale training data or overflowing the context window with massive OpenAPI specs, the assistant calls tools like search_docs and get_api_endpoint to fetch exact, current parameter schemas and pagination semantics. The model writes code grounded in live documentation rather than guessing.
How is the Docs MCP server different from Truto's per-integration MCP servers?
The Docs MCP server is read-only and serves Truto's documentation and API reference. Per-integration MCP servers are scoped to a connected account (Salesforce, HubSpot, Jira, etc.) and execute live API actions. You typically use both: docs MCP for understanding the platform, per-account MCP for executing real work.
Does Truto automatically retry upstream 429 rate limit errors?
No. When an upstream provider returns 429, Truto passes the 429 directly to the caller and normalizes rate limit information into standard IETF headers (ratelimit-limit, ratelimit-remaining, ratelimit-reset). Your application or agent workflow must implement its own retry, bounded backoff, and idempotency logic.
