How to Publish an API Technical Appendix That Closes Enterprise Deals: Rate Limits & Examples
Learn how to publish an API technical appendix with rate limits, retry semantics, runnable examples, and MCP tool generation to win enterprise AI agent procurement reviews.
When enterprise procurement teams evaluate your B2B SaaS product, the decision to buy rarely comes down to your marketing site's feature matrix. The actual buyer is a lead architect or staff engineer who opens your documentation, searches for an API technical appendix, and looks for the exact ways your system will fail under load. If you sell a B2B SaaS product into the enterprise, your API technical appendix is the document that closes—or kills—your procurement review.
An API technical appendix is a structured reference that documents the unglamorous but contractually critical parts of your API: rate limits, retry semantics, pagination rules, authentication edge cases, and runnable code examples. It is not the auto-generated Swagger dump. It is the document a staff engineer reads at 11 PM to decide whether your platform can survive their production traffic. They are evaluating your API developer experience (DX) as a proxy for your platform's overall engineering quality.
If your documentation is just an auto-generated Swagger dump with zero context on edge cases, you fail the technical evaluation. Enterprise buyers need to know what happens when they hit an HTTP 429 Too Many Requests error. They need to know how you handle pagination cursors, OAuth token expiry, and exponential backoff. Most importantly, they need runnable code snippets that allow them to validate your API in under five minutes.
This guide provides a structured framework for senior product managers, developer advocates, and technical writers to build an enterprise-grade API technical appendix. We will cover what to include, how to explicitly document rate limits without hand-waving, how to provide runnable examples that optimize Time to First Call (TTFC), and how teams shipping across dozens of third-party integrations keep this consistent using a unified API architecture without writing 50 different appendices.
Why Your API Needs a Technical Appendix (Not Just a Swagger Dump)
Short answer: Because the developer evaluating your API will not read your reference page top to bottom. They will skip directly to the sections where APIs usually break—rate limits, retries, pagination, and webhook signatures—and decide in five minutes whether your platform is worth integrating against.
An OpenAPI (Swagger) specification is a contract for machines, not a guide for humans. While it defines paths, methods, and schemas, it completely ignores the operational realities of software engineering—the network failures, the undocumented edge cases, and the concurrency limits.
API-first companies treat their documentation as a core marketing asset. A well-structured API technical appendix signals enterprise readiness to technical evaluators. It proves that your engineering team has anticipated the friction points of integration and solved them upstream.
The Time to First Call (TTFC) research from Postman is direct: developers were 1.7 times faster making their first call when using a collection provided by the API publisher, and across the sample developers made a successful call 1.7 to 56 times faster when using a forked collection. Those improvements cannot be attributed to the mere presence of documentation; in fact, all of these APIs have API documentation, and some of them have excellent documentation. The differentiator is whether the snippet runs.
When a staff engineer at a Fortune 500 company reviews your API, they are looking for answers to specific, operational questions:
- If I send 10,000 requests in a minute, do you drop the requests, queue them, or return a 429?
- If you return a 429, do you include a Retry-After header, or do I have to guess the backoff window?
- How do I paginate through 50,000 records without timing out?
- Are your webhook deliveries guaranteed at-least-once, and how do I verify their cryptographic signatures?
If your documentation does not answer these questions explicitly, the evaluating engineer assumes your API is fragile. They will recommend a competitor who provides the operational clarity they need. To pass this kind of rigorous technical review, your technical appendix must move beyond happy-path tutorials and address the mechanics of failure and recovery.
A well-built appendix is also a marketing surface. It indexes for long-tail technical queries ("how to handle 429 from X API," "pagination cursor format X") and signals platform depth to the architects doing the buying. Cheap to produce. Hard to fake.
The Anatomy of a High-Converting API Technical Appendix
A high-converting API technical appendix is organized entirely around developer friction. It should be separated from your standard endpoint reference and focus exclusively on systemic behaviors.
A technical appendix that actually helps procurement reviewers—and the engineers behind them—has six core sections. Treat this as the non-negotiable checklist:
- Authentication and authorization edge cases. Do not just say "Pass a Bearer token." Explain the lifecycle. OAuth scopes per endpoint, token TTL, refresh behavior, what your platform does when refresh fails, and how multi-tenant credentials are isolated. If you support multiple auth types (OAuth 2.0 Authorization Code, Client Credentials, API Keys), provide explicit examples of each flow.
- Pagination contracts. Define your pagination strategy clearly. Cursor vs. offset, page size limits, behavior on result-set drift, and whether the cursor is opaque or schema-coupled. Provide a code example of a while loop that correctly extracts the next_cursor from a response and appends it to the subsequent request.
- Rate limits and retry semantics. This is the most critical section for enterprise buyers. You must define your exact limit values, time windows, per-endpoint exceptions, headers returned, and what the caller is expected to do on 429.
- Error taxonomy and idempotency. A flat list of every HTTP status code and custom error code your API can return, mapped to a recommended caller action (retry, surface to user, escalate, dead-letter). Differentiate between a 401 (Unauthorized) and a 403 (Forbidden). Explain how to use Idempotency-Key headers for POST requests so developers can safely retry network failures without creating duplicate records.
- Webhook delivery guarantees. Signing scheme, signature verification snippet in at least two languages, replay protection, retry schedule, and what the body looks like on retry.
- Runnable code examples. Copy-paste curl, plus one snippet per officially supported SDK, with placeholders for credentials and a clearly marked sandbox endpoint.
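The pagination contract in the checklist above is easiest to judge from a loop a reader can run. The following is a minimal Python sketch, not a definitive client: the `fetch_page` callable, the `results` and `next_cursor` field names, and the in-memory fake pages are all assumptions for illustration, so substitute your own response shape.

```python
from typing import Callable, Iterator, Optional


def iterate_records(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every record, following next_cursor until the server stops returning one."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page.get("results", [])
        # Pass the cursor back exactly as received; treat it as opaque
        cursor = page.get("next_cursor")
        if not cursor:  # missing, None, or empty string all mean "last page"
            break


# Hypothetical stand-in for an HTTP call, so the loop can be run offline.
_FAKE_PAGES = {
    None: {"results": [{"id": 1}, {"id": 2}], "next_cursor": "abc"},
    "abc": {"results": [{"id": 3}], "next_cursor": None},
}


def fake_fetch_page(cursor: Optional[str]) -> dict:
    return _FAKE_PAGES[cursor]


all_ids = [record["id"] for record in iterate_records(fake_fetch_page)]
```

In a real appendix, `fake_fetch_page` would be replaced by an authenticated request against the sandbox, with the cursor appended as a query parameter.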
flowchart LR
A[Developer lands<br>on docs] --> B[Reads appendix:<br>auth + rate limits]
B --> C{Snippet runs<br>on first try?}
C -->|Yes| D[Successful first call<br>TTFC < 5 min]
C -->|No| E[Bounce / Slack<br>support / churn]
D --> F[Production integration]
F --> G[Procurement review:<br>'show me your appendix']
G --> H[Deal closes]
The order matters. Authentication comes first because it is the most common point of failure in a TTFC measurement. Pagination and rate limits come next because they are the two areas where vendors most consistently lie by omission. Webhooks come last because most developers will not touch them until production, but the absence of a verification snippet here is what a security reviewer will flag.
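A verification snippet of the kind a security reviewer looks for might resemble the sketch below. It assumes a common scheme (a hex-encoded HMAC-SHA256 of the raw request body, sent in a header); your signing scheme may differ, and the secret and payload shown are hypothetical.

```python
import hashlib
import hmac


def verify_webhook_signature(raw_body: bytes, received_signature: str, secret: bytes) -> bool:
    """Return True when the hex HMAC-SHA256 of the raw body matches the received signature."""
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids leaking where the strings differ
    return hmac.compare_digest(expected, received_signature)


secret = b"example-signing-secret"  # hypothetical; load from your secret store
body = b'{"event":"contact.created","id":"123"}'
good_signature = hmac.new(secret, body, hashlib.sha256).hexdigest()

is_valid = verify_webhook_signature(body, good_signature, secret)
is_tampered_valid = verify_webhook_signature(body + b" ", good_signature, secret)
```

The important design choice is to sign and verify the raw bytes of the body, before any JSON parsing, because re-serialized JSON rarely matches byte for byte.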
If you only ship one new section this quarter, ship the rate limit section. It is the single highest-leverage piece of content for passing an enterprise security review and the most commonly missing from auto-generated docs.
Documenting API Rate Limits: Beyond "HTTP 429"
Short answer: Telling developers "we return 429 when you exceed our rate limit" is not rate limit documentation. It is a footnote. Real rate limit documentation states the limit values, the time window, the per-endpoint exceptions, the headers you return, and the exact backoff strategy you expect the caller to use.
Returning an HTTP 429 Too Many Requests status code is the bare minimum. How you document and expose the metadata around that 429 determines whether a developer can build a resilient integration or whether they will accidentally DDoS your servers with infinite retry loops. According to API management platform Zuplo, the developer experience (DX) of rate limits requires more than just blocking traffic; APIs must document retry headers and provide detailed problem payloads. Similarly, gateway provider Tyk.io stresses that API documentation must explicitly state rate limit values, time windows, and endpoint-specific limits to prevent user frustration.
The IETF has been standardizing this for years. The current draft of its RateLimit header fields for HTTP specification defines RateLimit-Policy (a quota policy defined by the server that client HTTP requests will consume) and RateLimit (the currently remaining quota available for a specific policy). Earlier iterations of the draft defined RateLimit-Limit (the requests quota in the time window), RateLimit-Remaining (the remaining requests quota in the current window), and RateLimit-Reset (the time remaining in the current window, specified in seconds). Pick a convention and document it. The worst outcome is a developer guessing at your header names.
A good rate limit section answers these questions explicitly:
| Question | Why it matters |
|---|---|
| What is the limit, in requests and time window? | Lets developers size their workers |
| Is the limit per-API-key, per-tenant, or global? | Determines blast radius of a bad job |
| Which endpoints have stricter limits? | Write-heavy endpoints often differ from reads |
| What headers do you return on 200? On 429? | Lets clients proactively throttle |
| What is the expected backoff strategy? | Prevents thundering herds on reset |
| Do you ever return 429 without a Retry-After? | If yes, document the fallback |
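To make the "proactively throttle" row concrete, here is one way a client could pace itself from the headers on a 200 response. This is a sketch under assumptions: lowercase `ratelimit-remaining` and `ratelimit-reset` header names, with the reset expressed in seconds until the window ends.

```python
def pacing_delay_seconds(headers: dict) -> float:
    """Spread the remaining quota evenly across the rest of the window."""
    remaining = int(headers.get("ratelimit-remaining", "1"))
    reset_in = float(headers.get("ratelimit-reset", "0"))
    if remaining <= 0:
        return reset_in  # quota exhausted: wait for the window to reset
    return reset_in / remaining


steady = pacing_delay_seconds({"ratelimit-remaining": "847", "ratelimit-reset": "42"})
exhausted = pacing_delay_seconds({"ratelimit-remaining": "0", "ratelimit-reset": "60"})
```

A worker that sleeps for this delay between calls uses up its quota gradually instead of hitting the limit early and stalling until the reset.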
Rate limits are also evolving rapidly. Atlassian recently overhauled their API rate limits specifically to handle the massive, unpredictable load generated by AI agents and real-time orchestration workflows. As their ecosystem grows with partners building AI-powered experiences, real-time orchestrations, and integrations that leverage their APIs at unprecedented scale, they are evolving their approach to rate limiting. They moved from naive request counts to a points-based model. Jira Cloud uses a points-based model to measure API usage; instead of simply counting requests, each API call consumes points based on the work it performs, such as the amount of data returned or the complexity of the operation.
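The difference between counting requests and counting points is easier to see in code. The sketch below is purely illustrative: the class, the base cost, and the per-record cost are invented values, not Atlassian's actual weights.

```python
from dataclasses import dataclass


@dataclass
class PointsBudget:
    """Tracks a per-window points quota. All costs here are made-up illustration values."""
    limit: int
    used: int = 0

    def cost(self, base: int, records_returned: int, per_record: int = 1) -> int:
        # A call costs more when it returns more data
        return base + records_returned * per_record

    def try_consume(self, points: int) -> bool:
        if self.used + points > self.limit:
            return False  # the server would answer 429 here
        self.used += points
        return True


budget = PointsBudget(limit=100)
small_call = budget.cost(base=1, records_returned=10)
large_call = budget.cost(base=1, records_returned=95)
small_ok = budget.try_consume(small_call)
large_ok = budget.try_consume(large_call)
```

Under a request-count model both calls would cost the same; under a points model the second call is rejected because it would exceed the window's budget.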
The lesson is not that everyone should adopt points—it is that your appendix needs to explain whatever model you actually use, with concrete examples. Atlassian's own 429 example is a good template:
HTTP/1.1 429 Too Many Requests
Retry-After: 1847
X-RateLimit-Limit: 100000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 2025-10-08T15:00:00Z
RateLimit-Reason: jira-quota-global-based
Notice what they ship in the docs: when any limit is exceeded, Jira returns an HTTP 429 Too Many Requests response, and the app should handle this gracefully by respecting the Retry-After header and implementing appropriate backoff strategies. That sentence belongs in your appendix verbatim, customized to your platform. For a deeper architectural walk-through, see our guide on handling API rate limits and retries across multiple third-party APIs.
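Respecting Retry-After means parsing it correctly. HTTP allows the value to be either a number of seconds or an HTTP-date, so a caller-side helper could look like the following sketch. The function name is hypothetical, and returning `None` for unparseable values is a design assumption that lets the caller fall back to its own backoff.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(value: Optional[str], now: Optional[datetime] = None) -> Optional[float]:
    """Parse a Retry-After header given as delta-seconds or an HTTP-date."""
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # unparseable: caller falls back to its own backoff
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means "retry now", never a negative wait
    return max(0.0, (when - now).total_seconds())


from_seconds = retry_after_seconds("1847")
from_date = retry_after_seconds(
    "Wed, 08 Oct 2025 15:00:00 GMT",
    now=datetime(2025, 10, 8, 14, 59, 0, tzinfo=timezone.utc),
)
from_garbage = retry_after_seconds("soon")
```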
A non-negotiable: be honest about what your platform does and does not do. If you proxy upstream APIs (CRMs, HRIS, ticketing), state explicitly whether you retry on the caller's behalf or pass the 429 through. Both are valid designs—hidden behavior is what burns enterprise customers.
The 429 Retry Trap: If you do not explicitly document your Retry-After header behavior, developers will implement aggressive, uncoordinated retries. This leads to the "thundering herd" problem, where a sudden release of a rate limit causes all queued requests to hit your API simultaneously, immediately exhausting the limit again.
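One common mitigation for the thundering herd is to add randomness (jitter) to every wait. The sketch below shows capped exponential backoff with full jitter; the function name, the one-second base, and the 60-second cap are assumptions to adapt, not recommended values for any particular API.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, retry_after: Optional[float] = None,
                  base: float = 1.0, cap: float = 60.0,
                  rng: Optional[random.Random] = None) -> float:
    """Honor Retry-After when present; otherwise use capped exponential backoff with full jitter."""
    rng = rng or random.Random()
    if retry_after is not None:
        # Add a small random spread so clients released together do not retry together
        return retry_after + rng.uniform(0, base)
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0, ceiling)


seeded = random.Random(0)
third_retry = backoff_delay(3, rng=seeded)
after_header = backoff_delay(0, retry_after=60, rng=seeded)
late_retry = backoff_delay(10, rng=seeded)
```

Documenting the jitter expectation alongside the Retry-After behavior tells integrators that synchronized retries at the reset boundary are the failure mode to avoid.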
Providing Runnable API Examples to Optimize TTFC
Time to First Call (TTFC) is the elapsed time from a developer signing up for your service to executing their first successful, authenticated API request that returns a non-error response.
Static code blocks are not runnable examples. A runnable example is a snippet that a developer can paste into their terminal or editor, swap in a sandbox credential, and execute in under 60 seconds with a 200 response.
The payoff for getting this right is enormous. In a Postman case study, Swapnil Sapar, a Principal Engineer at PayPal, mentioned how Postman helped PayPal reduce their time to first call from 60 minutes to one minute. Autodesk reported similar results: the TTFC was reduced from less than an hour to 3 minutes, and partners are scaling reliable APIs with teams of the same size. This is the kind of metric that converts an evaluation into a signed contract.
Your technical appendix must include copy-pasteable, runnable code. A generic cURL request with <INSERT_TOKEN_HERE> is no longer sufficient. You need to provide recipe-style snippets in Python and Node.js, plus Go where your buyers use it, that handle the full lifecycle. What makes an example actually runnable:
- One credential, one resource, one verb. The first example must be a single GET that returns real-looking data. Save POSTs and complex flows for later in the appendix.
- Inline auth, not a reference link. Show the Authorization header in the snippet itself. Do not make developers click through to find out where the token goes.
- A sandbox that does not require manual approval. If a developer has to email sales for a key, your TTFC is measured in days, not minutes.
- Copy buttons that copy the entire command. Including the trailing newline. This sounds petty until you watch a developer's eyes glaze over after their fourth syntax error.
- At least three languages. curl, Node.js, and Python is the floor. Add Go or Ruby if your buyer base demands it.
# Minimal runnable example - the first thing in your appendix
curl -X GET 'https://api.example.com/v1/contacts?limit=10' \
-H 'Authorization: Bearer YOUR_SANDBOX_TOKEN' \
-H 'Content-Type: application/json'
The deeper trap: be careful of artificially hacking a TTFC, perhaps by hiding away the tricky parts or ignoring the gotchas, as you may be shifting the friction to the implementation stage; productivity shortcuts like SDKs, libraries, and code generation can help ease this transition. The appendix should not lie about complexity. If pagination requires a cursor, show the cursor. If the API has a 30-second timeout on certain endpoints, say so in the example comments.
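A Python counterpart to the curl command, using only the standard library, could look like this sketch. The URL and token are the same placeholders as in the curl example, and the 30-second timeout is an assumed value standing in for whatever limit your API actually enforces.

```python
import json
import urllib.request

SANDBOX_URL = "https://api.example.com/v1/contacts?limit=10"  # placeholder from the curl example


def build_request(token: str) -> urllib.request.Request:
    """Build the same GET as the curl example, with auth inline."""
    return urllib.request.Request(
        SANDBOX_URL,
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
        method="GET",
    )


def list_contacts(token: str, timeout: float = 30.0) -> dict:
    """Send the request. The 30-second timeout is an assumed limit; document yours."""
    with urllib.request.urlopen(build_request(token), timeout=timeout) as response:
        return json.load(response)


request = build_request("YOUR_SANDBOX_TOKEN")
# To run against a real sandbox: print(list_contacts("YOUR_SANDBOX_TOKEN"))
```

Keeping the request construction separate from the network call lets readers inspect the headers before they send anything.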
Here is an example of what a runnable Node.js snippet should look like in your appendix to handle the full lifecycle, including exponential backoff:
import fetch from 'node-fetch'; // optional on Node 18+, where fetch is global
async function fetchWithBackoff(url, options, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);
    if (response.status === 429) {
      // Prefer Retry-After, then the IETF-draft ratelimit-reset header (both in seconds)
      const header = response.headers.get('retry-after') ?? response.headers.get('ratelimit-reset');
      const seconds = Number.parseInt(header ?? '', 10);
      // Fall back to exponential backoff when the header is missing or not a number
      const waitTime = Number.isFinite(seconds) ? seconds * 1000 : 2 ** attempt * 1000;
      console.warn(`Rate limited. Retrying in ${waitTime}ms...`);
      await new Promise(resolve => setTimeout(resolve, waitTime));
      continue;
    }
    if (!response.ok) {
      throw new Error(`HTTP Error: ${response.status}`);
    }
    return response.json();
  }
  throw new Error('Max retries exceeded');
}
// Usage example
const apiKey = process.env.API_KEY;
const url = 'https://api.example.com/v1/resources';
fetchWithBackoff(url, {
  headers: { 'Authorization': `Bearer ${apiKey}` }
})
  .then(data => console.log(data))
  .catch(error => console.error(error.message));
This snippet is valuable because it acknowledges reality. It shows the developer exactly how to consume your API safely in a production environment. When you publish end-to-end, step-by-step developer tutorials featuring code like this, you drastically reduce integration friction.
Standardizing Documentation Across 50+ SaaS Integrations
This is where the math breaks for most B2B SaaS teams. Writing one excellent appendix is hard. Writing 50 of them—one per third-party integration your product supports—is a full-time job for a documentation team you do not have.
If your SaaS product offers native integrations with Salesforce, NetSuite, Workday, and Zendesk, your technical appendix cannot simply link out to those vendors' documentation. Their docs are often outdated, their rate limits change, and their error payloads are wildly inconsistent. The usual outcomes:
- Half the integrations have decent docs, half have stubs.
- Rate limit sections are inconsistent because each upstream vendor returns different headers (X-RateLimit-Remaining, X-Rate-Limit-Remaining, RateLimit-Remaining, sometimes nothing).
- Webhook signature docs drift as upstream vendors silently change their schemes.
- Engineers stop updating the appendices because the maintenance cost dwarfs the perceived value.
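The header inconsistency in the list above is the kind of thing every integration ends up papering over by hand. A minimal sketch of that per-vendor glue follows; the alias tuple covers only the spellings named above, and the function name is hypothetical.

```python
from typing import Mapping, Optional

# Spellings seen across vendors, taken from the list above; extend as needed.
_REMAINING_ALIASES = ("ratelimit-remaining", "x-ratelimit-remaining", "x-rate-limit-remaining")


def normalized_remaining(headers: Mapping[str, str]) -> Optional[int]:
    """Return the remaining quota from whichever header spelling the vendor used."""
    lowered = {name.lower(): value for name, value in headers.items()}
    for alias in _REMAINING_ALIASES:
        if alias in lowered:
            try:
                return int(lowered[alias])
            except ValueError:
                return None
    return None  # vendor sent nothing: caller must fall back to reactive handling


from_x_header = normalized_remaining({"X-RateLimit-Remaining": "12"})
from_dashed_header = normalized_remaining({"X-Rate-Limit-Remaining": "7"})
from_nothing = normalized_remaining({})
```

Multiply this by reset times, limits, and proprietary error payloads, and the maintenance cost described above becomes clear.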
A unified API layer changes the economics. When the platform normalizes upstream behavior into a single contract, you write the appendix once and it stays correct across providers. The two pieces that benefit most are rate limit headers and webhook signatures.
At Truto, the platform normalizes upstream rate limit information into IETF-style headers—ratelimit-limit, ratelimit-remaining, and ratelimit-reset—regardless of which scheme the upstream vendor uses. Truto uses JSONata expressions in its integration configurations to detect rate limits from upstream providers (whether they return a standard 429 or a proprietary error payload). It extracts the reset times and normalizes them. Those same standardized headers appear on both successful 2xx responses and 429 responses, so a caller can monitor remaining quota proactively rather than reactively.
Second, and this is a critical architectural decision, Truto does not automatically retry, throttle, or apply backoff on rate limit errors. When an upstream API returns a rate limit error, the platform passes that HTTP 429 directly to the caller without hidden retries or backoff. We do this because senior engineers demand control. If an integration platform silently absorbs 429s and holds connections open while applying exponential backoff, it can cause distributed deadlocks and memory leaks in the client's architecture. By passing the 429 and the normalized headers back to the caller, developers can implement their own queueing and backoff logic exactly as documented.
That property is what makes a multi-integration appendix tractable. The rate limit section becomes provider-agnostic:
# Standard response from any underlying integration
HTTP/1.1 200 OK
ratelimit-limit: 1000
ratelimit-remaining: 847
ratelimit-reset: 42
# On quota exhaustion - passed through, headers preserved
HTTP/1.1 429 Too Many Requests
ratelimit-limit: 1000
ratelimit-remaining: 0
ratelimit-reset: 60
retry-after: 60
One snippet. Works across HubSpot, Salesforce, Pipedrive, and dozens of others. The same flattening applies to pagination cursors and webhook signatures, all things you would otherwise have to document N times. For a deeper architectural treatment, see our breakdown of how unified APIs normalize pagination and error handling across 50+ APIs.
A unified API does not eliminate the need for an appendix. It eliminates the need for 50 appendices. You still owe your buyers a clear document on auth flows, retry semantics, and webhook signatures—just one document instead of fifty.
The Rise of AI Agents as API Consumers
Your technical appendix is no longer just being read by human engineers. It is being ingested by Large Language Models (LLMs) and AI agents that write integration code on the fly.
If your docs are locked behind single-page React applications that render dynamically, LLMs cannot read them. To solve this, Truto built a dedicated Docs MCP (Model Context Protocol) server. Every single documentation page and API reference page in Truto automatically generates a .md Markdown twin.
These Markdown twins strip out the UI components and present pure, structured context: full parameter tables, auto-generated code examples, and supported integration lists. When an AI agent needs to know how to handle a Truto rate limit, it ingests the Markdown twin, reads the standardized header definitions, and writes the correct backoff logic automatically.
How Truto Auto-Generates MCP Tools from Integration Configuration
When teams evaluate MCP server platforms for AI agent integrations, one of the first questions is: how do tools get created? In most platforms, every tool is hand-coded. Truto takes a different approach - tools are generated dynamically from two data sources that already exist: the integration's resource configuration and its documentation records.
Here is a concrete example. Suppose a HubSpot integration has this resource definition in its configuration:
{
  "resources": {
    "contacts": {
      "list": {
        "method": "get",
        "path": "/crm/v3/objects/contacts",
        "response_path": "results"
      },
      "get": {
        "method": "get",
        "path": "/crm/v3/objects/contacts/{{id}}"
      },
      "create": {
        "method": "post",
        "path": "/crm/v3/objects/contacts"
      }
    }
  },
  "tool_tags": {
    "contacts": ["crm", "sales"]
  }
}
And a documentation record exists for the list method:
{
  "type": "description",
  "resource": "contacts",
  "method": "list",
  "content": "List all contacts in the HubSpot CRM. Returns contact records with properties like email, first name, last name, and company."
}
With a corresponding query schema documentation record:
{
  "type": "query_schema",
  "resource": "contacts",
  "method": "list",
  "content": "properties:\n type: object\n properties:\n email:\n type: string\n description: Filter contacts by email address"
}
Truto's tool generation pipeline reads both records, injects standard pagination fields (limit, next_cursor), and produces this MCP tool definition:
{
  "name": "list_all_hub_spot_contacts",
  "description": "List all contacts in the HubSpot CRM. Returns contact records with properties like email, first name, last name, and company.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "email": {
        "type": "string",
        "description": "Filter contacts by email address"
      },
      "limit": {
        "type": "string",
        "description": "The number of records to fetch"
      },
      "next_cursor": {
        "type": "string",
        "description": "The cursor to fetch the next set of records. Always send back exactly the cursor value you received (nextCursor) without decoding, modifying, or parsing it."
      }
    }
  }
}
The key design decision: a resource method only becomes a tool if it has a description-type documentation record. No documentation, no tool. This acts as a quality gate - only well-described, curated endpoints get exposed to LLMs. The tool name, the parameter schema, and the pagination fields are all derived automatically. No per-integration code is written.
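The quality gate can be illustrated with a small sketch. This is not Truto's implementation: the `generate_tools` function, the naming template (inferred from the single example name above), and the omission of query-schema parsing are all simplifying assumptions.

```python
from typing import Dict, List

# Pagination fields injected into every list tool (descriptions shortened for the sketch)
PAGINATION_FIELDS = {
    "limit": {"type": "string", "description": "The number of records to fetch"},
    "next_cursor": {"type": "string", "description": "The cursor to fetch the next set of records."},
}


def generate_tools(integration: str, resources: Dict[str, dict], docs: List[dict]) -> List[dict]:
    """Emit one tool per resource method that has a description record; skip the rest."""
    descriptions = {
        (doc["resource"], doc["method"]): doc["content"]
        for doc in docs
        if doc["type"] == "description"
    }
    tools = []
    for resource, methods in resources.items():
        for method in methods:
            description = descriptions.get((resource, method))
            if description is None:
                continue  # the quality gate: no documentation, no tool
            properties = dict(PAGINATION_FIELDS) if method == "list" else {}
            # Hypothetical naming template; only the list form appears in the example above
            name = (f"list_all_{integration}_{resource}" if method == "list"
                    else f"{method}_{integration}_{resource}")
            tools.append({
                "name": name,
                "description": description,
                "inputSchema": {"type": "object", "properties": properties},
            })
    return tools


resources = {"contacts": {"list": {"method": "get"}, "get": {"method": "get"}, "create": {"method": "post"}}}
docs = [{"type": "description", "resource": "contacts", "method": "list",
         "content": "List all contacts in the HubSpot CRM."}]
tools = generate_tools("hub_spot", resources, docs)
```

With three methods configured but only one described, exactly one tool is produced, which is the behavior the quality gate is meant to guarantee.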
This is what makes Truto's approach to MCP server generation different from platforms where each tool is a hand-coded function. The same generic pipeline that generates HubSpot tools also generates Salesforce, Jira, Zendesk, and every other integration's tools - all from configuration data.
Agent Code: Calling Truto MCP Tools and Handling 429 Passthrough
Here is what a minimal agent loop looks like when calling Truto-generated MCP tools. The snippet connects to a Truto MCP server, invokes a tool, and handles the 429 passthrough with the normalized IETF headers described earlier in this article.
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StreamableHTTPClientTransport } from '@modelcontextprotocol/sdk/client/streamableHttp.js';
const transport = new StreamableHTTPClientTransport(
new URL('https://api.truto.one/mcp/YOUR_MCP_TOKEN')
);
const client = new Client({ name: 'my-agent', version: '1.0.0' });
await client.connect(transport);
// List available tools (generated from integration config + docs)
const { tools } = await client.listTools();
console.log('Available tools:', tools.map(t => t.name));
// => ['list_all_hub_spot_contacts', 'get_single_hub_spot_contact_by_id', ...]
// Call a tool with retry logic for 429 passthrough
async function callToolWithRetry(toolName, args, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const result = await client.callTool({ name: toolName, arguments: args });
    const parsed = JSON.parse(result.content[0].text);
    // Truto passes upstream 429s through - check the result
    if (parsed.error && parsed.status === 429) {
      // Normalized IETF headers are in the response; header values arrive as strings
      const retryAfter = Number(parsed.headers?.['ratelimit-reset']) || 2 ** attempt;
      console.warn(`Rate limited by upstream. Retrying in ${retryAfter}s...`);
      await new Promise(r => setTimeout(r, retryAfter * 1000));
      continue;
    }
    return parsed;
  }
  throw new Error(`Tool ${toolName} failed after ${maxRetries} retries`);
}
const contacts = await callToolWithRetry('list_all_hub_spot_contacts', {
  limit: '50'
});
console.log(`Fetched ${contacts.result.length} contacts`);
Notice what this snippet does not contain: there is no HubSpot-specific logic, no per-provider retry configuration, and no manual OAuth token management. The MCP server URL is the only credential. Truto handles authentication, token refresh, and response parsing behind that URL. The agent only needs to handle the 429 passthrough, and because the headers are normalized to IETF format, the retry logic works identically whether the upstream is HubSpot, Salesforce, or Zendesk.
Arcade.dev's Approach: User-Delegated OAuth via URL Elicitation
Arcade.dev solves a different problem than Truto, and understanding the distinction matters when choosing MCP infrastructure for your AI agents.
Truto's MCP servers use pre-established credentials. When your team connects an integration (say, a customer's Salesforce instance), the OAuth flow happens once during account setup. Every subsequent MCP tool call reuses those stored credentials, with Truto refreshing tokens proactively before they expire. This is the right model for backend agents, autonomous workflows, and server-to-server integrations where a human is not in the loop.
Arcade focuses on a different scenario: agents that act on behalf of individual end users who need to grant consent in real time. Arcade's URL elicitation capability, co-developed alongside Anthropic, enables an MCP server to provide the user with a secure login page in their browser. Instead of passing credentials through an untrusted client, the agent triggers a secure browser flow where the user authenticates directly. Tokens never touch the model or client, and security boundaries stay intact.
In Arcade's model, each tool declares its required auth scopes via a decorator. You declare requires_auth=GitHub(scopes=["repo"]) and Arcade handles OAuth, token refresh, and per-call scoping. The client and the LLM never see the token. When the tool is invoked and the user has not yet authorized, the user is presented with a URL to complete the OAuth challenge in their browser. On success, the token is injected into context for that call. Subsequent calls reuse and refresh the token automatically.
Here is what the Arcade tool definition pattern looks like in Python:
from arcade_mcp_server import MCPApp, Context
from arcade_mcp_server.auth import Slack
app = MCPApp(name="my-agent-tools", version="1.0.0")
@app.tool(requires_auth=Slack(scopes=["chat:write"]))
async def send_slack_message(
    context: Context,
    channel: str,
    message: str
) -> dict:
    """Send a message to a Slack channel on behalf of the user."""
    token = context.get_auth_token_or_empty()
    # token is injected by Arcade after user completes OAuth
    # ... call Slack API with token
If the user has not yet authorized Slack, the Arcade runtime returns an elicitation response containing a URL. The MCP client (Claude, Cursor, or your custom agent) presents this URL to the user. The user clicks, authenticates in their browser, and the tool call can be retried with valid credentials.
Arcade also structures its error handling around tool-level categories. Error adapters automatically translate common exceptions (from httpx, requests, SDKs, etc.) into appropriate Arcade errors. This means zero boilerplate error handling code for you. Rate limit errors from upstream services surface as UpstreamRateLimitError, which includes a retry_after_ms field the orchestrator can use to schedule retries.
The trade-off is real: Arcade gives you per-user, just-in-time OAuth - but the agent must handle interactive consent flows, and the tool catalog for each integration is hand-authored rather than auto-generated from configuration. Truto gives you auto-generated tools with normalized rate limit passthrough - but the credentials are established at account setup time, not at runtime per-user.
Combined Architecture: Arcade for OAuth + Truto for Normalization
For teams building production AI agents, Arcade and Truto are not mutually exclusive. They occupy different layers of the stack, and combining them can give you the best of both worlds: Arcade handles user-delegated authentication and consent management, while Truto provides data normalization, auto-generated tool schemas, and transparent rate limit handling across dozens of providers.
flowchart TD
User[End User] -->|Interacts with| Agent[AI Agent / Orchestrator]
Agent -->|User-facing tools<br>requiring live OAuth| Arcade[Arcade MCP Runtime]
Agent -->|Backend tools<br>normalized data + rate limits| Truto[Truto MCP Server]
Arcade -->|URL Elicitation<br>per-user OAuth| OAuth[User's Browser<br>OAuth Consent]
OAuth -->|Token granted| Arcade
Arcade -->|Authenticated calls| UserApps[User's Gmail,<br>Calendar, Slack]
Truto -->|Pre-established credentials<br>auto-refreshed tokens| SaaS[CRM, Ticketing,<br>HRIS, Accounting]
SaaS -->|Raw responses + rate limits| Truto
Truto -->|Normalized IETF headers<br>+ unified schema| Agent
Arcade -->|Tool results| Agent
Here is when each layer applies:
| Scenario | Use Arcade | Use Truto |
|---|---|---|
| Agent sends email as the logged-in user | Yes - needs live user OAuth consent | No |
| Agent reads CRM contacts for a customer tenant | No | Yes - pre-established credentials, normalized pagination |
| Agent posts to Slack on behalf of a specific user | Yes - per-user scoped token | No |
| Agent syncs ticketing data across Jira, Zendesk, ServiceNow | No | Yes - unified schema, rate limit passthrough |
| Agent creates a calendar event for the user | Yes - needs user's calendar OAuth | No |
| Agent queries accounting data for month-end reports | No | Yes - auto-generated tools, cursor-based pagination |
The orchestration layer (LangChain, LangGraph, OpenAI Agents SDK, or a custom loop) decides which MCP server to call based on whether the action requires user-delegated auth or backend data access. The key insight is that these are not competing tools - they are complementary infrastructure for different parts of the same agent workflow.
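The routing decision described above can be reduced to a single predicate. The sketch below uses hypothetical names (`Action`, `choose_mcp_server`) and returns plain string labels; a real orchestrator would return a configured MCP client instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    requires_user_consent: bool  # True when the agent acts as a specific end user


def choose_mcp_server(action: Action) -> str:
    """Route user-delegated actions to Arcade and backend data access to Truto."""
    return "arcade" if action.requires_user_consent else "truto"


send_email_target = choose_mcp_server(Action("send_email_as_user", requires_user_consent=True))
read_crm_target = choose_mcp_server(Action("list_crm_contacts", requires_user_consent=False))
```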
For teams evaluating Truto vs Arcade.dev for MCP server AI agent integrations, the honest answer is that the right choice depends on your agent's primary interaction pattern. If your agents mostly operate autonomously on backend SaaS data across many providers, Truto's auto-generated tools and normalized rate limits will save you from writing 50 integration-specific appendices. If your agents primarily act on behalf of individual users who need to grant consent interactively, Arcade's URL elicitation and per-user token management are purpose-built for that. Many production deployments will use both.
What to Ship This Quarter (Next Steps)
Stop treating your API documentation as an afterthought. It is a critical sales asset that determines whether enterprise procurement teams approve or reject your software. Don't try to write the full appendix in one sprint. Ship the highest-leverage sections first and let usage data tell you what to expand:
- Week 1: Audit your current docs. Count how many pages explicitly state rate limit values, time windows, and the exact 429 response shape. If the answer is fewer than half, that is your starting point.
- Week 2: Write the rate limit section. Use the IETF header convention. Include both 200 and 429 example responses, with Retry-After and concrete numbers. State your retry policy honestly: whether your platform retries on the caller's behalf or passes errors through.
- Week 3: Ship three runnable examples per major resource: curl, one server-side SDK, one async pattern. Measure TTFC for new signups before and after.
- Week 4: Add the webhook signature verification snippet. This is the section enterprise security reviewers ask about most consistently and the one most commonly missing.
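The Week 1 audit in the plan above can be partly automated. The following sketch scans page text for three signals; the regular expressions, the `audit_pages` name, and the sample pages are assumptions, and a pattern match is only a rough proxy for a page actually documenting its limits well.

```python
import re
from typing import Dict

# Rough textual signals that a page documents its rate limits explicitly
CHECKS = {
    "limit_value": re.compile(r"\b\d[\d,]*\s+requests?\s+per\s+(second|minute|hour|day)\b", re.I),
    "status_429": re.compile(r"\b429\b"),
    "retry_after": re.compile(r"retry-after", re.I),
}


def audit_pages(pages: Dict[str, str]) -> Dict[str, bool]:
    """Map each page name to True when it passes every check."""
    return {
        name: all(pattern.search(text) for pattern in CHECKS.values())
        for name, text in pages.items()
    }


sample_pages = {
    "contacts": "Limit: 100 requests per minute per API key. On 429 we return Retry-After in seconds.",
    "deals": "We return 429 when you exceed our rate limit.",
}
results = audit_pages(sample_pages)
coverage = sum(results.values()) / len(results)
```

If `coverage` comes out below 0.5 on your real docs, the rate limit section is the place to start.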
The appendix is not glamorous content. It is the surface a senior engineer touches when they are deciding whether to recommend your platform to procurement. Treat it as the technical artifact closest to revenue.
If you're shipping integrations across many SaaS APIs and the cost of writing 50 appendices is what's blocking you, that's exactly the problem we built Truto to solve—one normalized contract for rate limits, pagination, webhooks, and auth across hundreds of providers.
FAQ
- What is an API technical appendix?
- An API technical appendix is a structured reference document covering operational edge cases: rate limits, retry semantics, pagination contracts, authentication flows, webhook signatures, and runnable code examples. It complements auto-generated Swagger reference docs.
- How should I document HTTP 429 Too Many Requests responses?
- State exact limit values and time windows, list headers returned (ideally following the IETF RateLimit-Limit, RateLimit-Remaining, RateLimit-Reset convention), show a sample 429 response with Retry-After, and specify the expected backoff strategy. Be explicit about whether your platform retries or passes 429s through.
- What is Time to First Call (TTFC) and why does it matter?
- TTFC measures the elapsed time from a developer signing up to their first successful authenticated API call. Providing runnable code snippets can drop TTFC from 60 minutes to under 5 minutes, which directly correlates with higher activation and enterprise conversion.
- How do I document rate limits across multiple third-party SaaS integrations?
- Use a unified API layer that normalizes upstream rate limit information into a single header convention (like the IETF draft RateLimit headers) regardless of the upstream vendor's scheme. This allows you to write one provider-agnostic rate limit appendix instead of 50 separate ones.