How to Generate MCP Servers for Your SaaS Users (2026 Architecture Guide)
Learn how to generate secure, per-user MCP servers for your SaaS platform, complete with dynamic tool curation, scoped OAuth, TTL expiry, and pass-through rate limits.
If your B2B SaaS platform cannot securely expose its API to an AI agent, you are effectively locking your product out of the next generation of enterprise workflows. Enterprise buyers no longer treat AI agent compatibility as a future-state roadmap item; they expect it in the RFP. A REST API or a Zapier connector is no longer enough: buyers want to connect their accounts directly to AI assistants like Claude, ChatGPT, custom LangChain setups, or autonomous agentic frameworks.
The answer to this demand is the Model Context Protocol (MCP). But building and hosting a multi-tenant MCP server from scratch requires managing complex JSON-RPC state, handling dynamic tool generation, and maintaining OAuth token lifecycles at scale.
This architecture guide explains what it actually means to generate MCP servers for your SaaS users dynamically, the architectural pitfalls of relying on static OpenAPI generators, and how Truto's infrastructure provides a secure, production-ready path to making your platform AI-ready.
What Does It Mean to Generate an MCP Server for Your SaaS?
Generating an MCP server for your SaaS means providing your end-users with a secure, authenticated JSON-RPC 2.0 endpoint—tied specifically to that user's connected account—that automatically translates your product's API into a standardized set of tools that any AI agent can discover and execute.
A usable, production-ready end-user MCP server must answer five questions at runtime:
- Whose account is this? The endpoint must bind to exactly one tenant or connected account. Cross-tenant leakage is unacceptable.
- Which tools are available? The MCP client should discover a curated, current snapshot of operations the user can perform.
- What input is valid? Each tool needs a strict JSON Schema input contract.
- How is auth handled? OAuth and API credentials must be managed server-side and never leak into the AI client.
- When should access end? Short-lived URLs and strict TTLs are safer than forever tokens.
In practice, when a user wants to connect your SaaS platform to an AI agent, the ideal workflow looks like this:
- Authentication: The user logs into your application, connects their third-party account (or your own API), and clicks "Connect to AI Agent."
- Generation: Your system generates a unique, cryptographically secure URL scoped specifically to that user's tenant and permissions (e.g., `https://your-platform.example/mcp/<token>`).
- Integration: The user pastes this single URL into an MCP-compatible client (like Claude Desktop, ChatGPT Developer Mode, or a custom LangGraph orchestrator).
- Discovery: The AI agent connects to the URL via standard transport, reads the available tools via `tools/list`, and begins interacting with your API on the user's behalf via `tools/call`.
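The discovery step above boils down to two JSON-RPC 2.0 envelopes. Here is a minimal sketch; the tool name and arguments are illustrative, and the envelope fields follow the MCP specification:

```typescript
// The client first lists tools, then calls one it discovered.
const toolsListRequest = {
  jsonrpc: '2.0',
  id: 1,
  method: 'tools/list',
}

const toolsCallRequest = {
  jsonrpc: '2.0',
  id: 2,
  method: 'tools/call',
  params: {
    name: 'list_all_crm_contacts', // a tool discovered via tools/list
    arguments: { limit: '25' },
  },
}

console.log(toolsListRequest.method, toolsCallRequest.params.name)
```

Everything else (auth, tenant routing, upstream execution) happens behind the URL, which is exactly why the URL itself must be scoped and short-lived.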
```mermaid
sequenceDiagram
    participant U as End User
    participant S as Your SaaS
    participant T as Truto Integration Layer
    participant V as Target API
    participant C as MCP Client (Claude/ChatGPT)
    U->>S: Connect account + pick scopes
    S->>T: Create integrated account + MCP token
    T-->>S: Returns MCP server URL
    S-->>U: Display URL / install button
    U->>C: Paste URL into MCP client
    C->>T: tools/list (JSON-RPC)
    T-->>C: Filtered tool definitions
    C->>T: tools/call
    T->>V: Authenticated API call<br>(refreshes token if needed)
    V-->>T: Response
    T-->>C: MCP-wrapped result
```

The Problem with Generating MCP Servers from OpenAPI Specs
As the demand for MCP compatibility spikes, a common initial approach is to use static generators to convert an existing OpenAPI specification directly into an MCP server.
While scaffolding tools from OpenAPI specs looks great in a local development environment, it is insufficient for production SaaS. Static generators are having a moment, but they collapse the moment a paying customer depends on them.
Generating an MCP server from a static spec fails to address the execution context required for reliable AI agent interactions. Here is why the static generation approach breaks down in production:
Summary of Production Requirements vs Spec-Only Limitations
| Production requirement | Why spec-only MCP breaks |
|---|---|
| OAuth refresh | The agent should not own refresh tokens, expired access tokens, or reauth state. |
| Multi-tenant isolation | One MCP URL must not expose another customer's data by accident. |
| Tool curation | Large OpenAPI specs can dump hundreds of noisy operations into the model. |
| Vendor quirks | Pagination cursors, custom fields, undocumented filters, and odd error formats rarely fit cleanly into generated handlers. |
| Permission design | Your support agent may need list and get, not delete and batch_update. |
| Runtime rate limits | 429 handling needs deterministic middleware, not a hopeful prompt. |
How Truto Generates Dynamic MCP Servers for Your Users
Truto takes a fundamentally different approach to exposing SaaS platforms to AI agents. Instead of generating static code from a schema, Truto operates as a dynamic proxy layer. Every connected integration can become a scoped MCP server because tools are generated dynamically from integration resources and documentation, not hand-coded per connector.
Truto derives MCP tools from two existing data sources: the API's resource definitions (what endpoints exist) and documentation records (human-readable descriptions and JSON Schemas).
The contract is simple: A resource method appears as an MCP tool only if it has a documentation record.
This acts as a strict quality gate, ensuring only well-described endpoints are exposed to LLMs. You document what you want the LLM to use, and nothing else. No undocumented endpoint sneaks into the agent's tool list.
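The gate itself is a simple set-membership filter. Here is an illustrative sketch; the types and function names are assumptions for this example, not Truto's internal API:

```typescript
// A resource method surfaces as an MCP tool only if a matching
// documentation record exists (hypothetical types for illustration).
interface ResourceMethod {
  resource: string
  method: string
}

interface DocRecord {
  resource: string
  method: string
  description: string
}

function gateTools(methods: ResourceMethod[], docs: DocRecord[]): ResourceMethod[] {
  const documented = new Set(docs.map((d) => `${d.resource}.${d.method}`))
  return methods.filter((m) => documented.has(`${m.resource}.${m.method}`))
}

// Only contacts.list is documented, so contacts.delete never becomes a tool.
const exposed = gateTools(
  [
    { resource: 'contacts', method: 'list' },
    { resource: 'contacts', method: 'delete' },
  ],
  [{ resource: 'contacts', method: 'list', description: 'List CRM contacts' }],
)
console.log(exposed) // → [ { resource: 'contacts', method: 'list' } ]
```

The design choice matters: the allowlist is the documentation itself, so the safe state is the default state.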
The Dynamic Tool Assembly Process
Tool generation happens dynamically on every tools/list or tools/call request. Tools are never cached or pre-built, meaning any updates to your API documentation are instantly reflected in the AI agent's context window.
Here is how Truto assembles these tools:
- Method Filtering: Truto checks the configuration of the specific MCP token. If the server is restricted to read-only operations, it filters out all `create`, `update`, and `delete` methods.
- Documentation Gating: Truto looks for a documentation record for the specific resource and method. If no documentation exists, the tool is skipped entirely. You can explore the mechanics of this in our deep dive on auto-generated MCP tools.
- Schema Enhancement: Raw query and body schemas are parsed into JSON Schema format. Truto automatically injects LLM-specific instructions. For example, on `list` methods, Truto injects a `next_cursor` property with explicit instructions telling the LLM to pass the cursor value back unchanged without attempting to decode or modify it.
- Flat Input Namespace Resolution: When an MCP client calls a tool, all arguments arrive as a single flat object. Truto's router splits these arguments into query parameters and request body parameters based on the property keys defined in the schemas, ensuring the underlying API receives exactly what it expects.
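The flat-input resolution step can be sketched in a few lines. This is a hedged illustration under assumed names, not the production router:

```typescript
// Each argument key is routed to query or body by matching it against
// the JSON Schema that declares it; unmatched keys are dropped so the
// upstream API receives only parameters it expects.
type Schema = { properties: Record<string, unknown> }

function splitArgs(
  args: Record<string, unknown>,
  querySchema: Schema,
  bodySchema: Schema,
) {
  const query: Record<string, unknown> = {}
  const body: Record<string, unknown> = {}
  for (const [key, value] of Object.entries(args)) {
    if (key in querySchema.properties) query[key] = value
    else if (key in bodySchema.properties) body[key] = value
  }
  return { query, body }
}

const { query, body } = splitArgs(
  { limit: '25', note: 'VIP customer' },
  { properties: { limit: {} } }, // query params declared here
  { properties: { note: {} } },  // body params declared here
)
```

The flat namespace keeps the tool contract simple for the LLM while the server retains full control over where each value lands.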
Below is an example of the JSON Schema payload Truto generates and sends to the LLM during the initialization phase:
```json
{
  "name": "list_all_crm_contacts",
  "description": "Retrieve a paginated list of contacts from the CRM. Use the limit parameter to control batch size.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "limit": {
        "type": "string",
        "description": "The number of records to fetch"
      },
      "next_cursor": {
        "type": "string",
        "description": "The cursor to fetch the next set of records. Always send back exactly the cursor value you received without decoding, modifying, or parsing it."
      },
      "email_domain": {
        "type": "string",
        "description": "Filter contacts by their email domain (e.g., example.com)"
      }
    },
    "required": []
  }
}
```

Per-Customer Tool Customization
Standardized tools only get you so far. In B2B SaaS, enterprise customers inevitably use custom fields or require specific parameter formats that deviate from the base API.
Truto handles this through an environment-level override hierarchy. You can update a tool's description, inject custom instructions, or modify the JSON Schema parameters for a single customer without affecting the base integration or any other tenant. If a customer needs their AI agent to understand a unique external_id format, you override the documentation record for their specific environment. The generated MCP server instantly reflects the customized schema.
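The override hierarchy amounts to a tenant-scoped merge over the base documentation record. A minimal sketch, with field names assumed for illustration:

```typescript
// The tenant override shallow-merges over the base record, so only that
// environment's generated schema changes (hypothetical shapes).
interface ToolDoc {
  description: string
  inputSchema: {
    properties: Record<string, { type: string; description: string }>
  }
}

function applyOverride(base: ToolDoc, override: Partial<ToolDoc>): ToolDoc {
  return {
    description: override.description ?? base.description,
    inputSchema: {
      properties: {
        ...base.inputSchema.properties,
        ...(override.inputSchema?.properties ?? {}),
      },
    },
  }
}

const base: ToolDoc = {
  description: 'Retrieve a paginated list of contacts.',
  inputSchema: {
    properties: { limit: { type: 'string', description: 'Batch size' } },
  },
}

// One tenant's agent must understand a unique external_id format.
const custom = applyOverride(base, {
  inputSchema: {
    properties: {
      external_id: { type: 'string', description: 'Format: ACME-XXXX' },
    },
  },
})
```

Because the merge happens at generation time, the override takes effect on the very next `tools/list` call for that environment.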
Generating the Server via API
When you create an MCP server for an integrated account, Truto returns a URL that any MCP client can use. Here is the shape of the MCP server creation call:
```http
POST /integrated-account/{integrated_account_id}/mcp
Authorization: Bearer <TRUTO_API_TOKEN>
Content-Type: application/json

{
  "name": "Read-only HubSpot MCP",
  "config": {
    "methods": ["read"],
    "tags": ["crm", "sales"],
    "require_api_token_auth": true
  },
  "expires_at": "2026-06-30T23:59:59.000Z"
}
```

And the response gives your product a shareable MCP URL:
```json
{
  "id": "mcp_abc123",
  "name": "Read-only HubSpot MCP",
  "config": {
    "methods": ["read"],
    "tags": ["crm", "sales"],
    "require_api_token_auth": true
  },
  "expires_at": "2026-06-30T23:59:59.000Z",
  "url": "https://api.truto.one/mcp/a1b2c3d4e5f6..."
}
```

Visualizing the Execution Architecture
When an AI agent interacts with the generated MCP server, Truto handles the protocol translation, authentication, and execution entirely in the background.
```mermaid
sequenceDiagram
    participant Agent as AI Agent (Claude/Custom)
    participant MCP as Truto MCP Endpoint
    participant Auth as Token Validation & Auth
    participant Proxy as Truto Proxy API
    participant SaaS as Your SaaS API
    Agent->>MCP: POST /mcp/:token (tools/call)
    MCP->>Auth: Validate Token & Check Expiry
    Auth-->>MCP: Context & Credentials Loaded
    MCP->>MCP: Split Flat Args into Query & Body
    MCP->>Proxy: Execute Tool (e.g., handleCreateProxyApi)
    Proxy->>SaaS: HTTP Request with Bearer Token
    SaaS-->>Proxy: 200 OK (JSON Response)
    Proxy-->>MCP: Format Response & Pagination Cursors
    MCP-->>Agent: JSON-RPC 2.0 Result Payload
```

Securing AI Agent Access: Authentication, Filtering, and Expiry
Truto mitigates identity drift by enforcing strict, configurable boundaries on every generated MCP server. Secure MCP generation means every MCP URL should be scoped, revocable, observable, and short-lived by default.
Granular Method Filtering
You can restrict an MCP server to specific operation types. Setting `methods: ["read"]` ensures the server only exposes `get` and `list` operations. The LLM physically cannot execute a `create`, `update`, or `delete` request because those tools are filtered out during the generation phase. You can expose only reads, only writes, exact methods, or custom operations like `search` or `download`.
Tag-Based Tool Grouping
Tags provide a way to organize and filter tools by functional area. When creating an MCP server, you can pass a configuration object that restricts the server to specific tags.
For example, if your SaaS includes both a support ticketing system and a billing portal, you can tag the ticketing resources with `["support"]` and the billing resources with `["finance"]`. By generating an MCP server with `config: { tags: ["support"] }`, the AI agent will only see tools related to tickets and comments. The billing endpoints simply do not exist in the agent's context window.
Automatic TTL Expiration
MCP servers can be created with a time-to-live via an `expires_at` field. This is highly effective for temporary access, such as granting an AI agent access to a specific dataset for a 24-hour analysis task.
Truto enforces this expiration at multiple layers. The underlying managed key-value store utilizes built-in expiration timestamps, ensuring token lookups fail immediately at the exact second of expiration. Simultaneously, scheduled alarms trigger background cleanup routines to purge the database records, leaving no stale access tokens behind.
Secondary API Token Authentication
By default, an MCP server's URL contains a cryptographic token that acts as the sole authentication mechanism. For enterprise deployments requiring higher security, Truto supports a `require_api_token_auth` flag.
When enabled, possession of the MCP URL is no longer sufficient. The connecting client must also provide a valid API token in the `Authorization` header. This ensures that even if an MCP URL is accidentally leaked in a log file, it cannot be used by an unauthenticated actor.
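The resulting two-factor check is easy to reason about. A minimal sketch of the decision logic, with names and structure assumed for illustration:

```typescript
// A leaked URL token alone is rejected when require_api_token_auth is
// enabled; a bearer API token must accompany the request.
function authorize(
  urlTokenValid: boolean,
  requireApiToken: boolean,
  authHeader?: string,
): boolean {
  if (!urlTokenValid) return false
  if (!requireApiToken) return true
  return typeof authHeader === 'string' && authHeader.startsWith('Bearer ')
}

const leakedUrlOnly = authorize(true, true) // → false: URL alone is not enough
const withApiToken = authorize(true, true, 'Bearer example-token') // → true
```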
Handling Rate Limits and the Pass-Through Architecture
A critical design decision when generating MCP servers is how to handle API rate limits. When an AI agent executes a complex loop (like iterating through thousands of records to summarize data), it will inevitably hit upstream API rate limits.
Truto utilizes a pass-through architecture for rate limits. Truto does not retry, throttle, or apply exponential backoff on rate limit errors. Instead, when your SaaS API returns an HTTP 429, Truto passes that error directly back to the calling AI agent.
More importantly, Truto normalizes upstream rate limit information into standardized headers per the IETF specification:
- `ratelimit-limit`
- `ratelimit-remaining`
- `ratelimit-reset`
By passing these standardized headers back in the JSON-RPC response, the AI agent (or the orchestrating framework like LangGraph or CrewAI) has the exact context it needs. The agent can read the ratelimit-reset value, intentionally pause its own execution loop, and resume exactly when the API allows it.
A minimal TypeScript wrapper in your orchestrator looks like this:
```typescript
const sleep = (ms: number) => new Promise(resolve => setTimeout(resolve, ms))

async function fetchWithRateLimitBackoff(input: RequestInfo, init?: RequestInit) {
  for (let attempt = 0; attempt < 4; attempt++) {
    const res = await fetch(input, init)
    if (res.status !== 429) {
      return res
    }
    // Prefer retry-after when present, falling back to the normalized
    // ratelimit-reset header Truto passes through.
    const resetSeconds = Number(res.headers.get('ratelimit-reset') ?? '1')
    const retryAfterSeconds = Number(res.headers.get('retry-after') ?? resetSeconds)
    // Jitter prevents parallel agent loops from retrying in lockstep.
    const jitterMs = Math.floor(Math.random() * 250)
    await sleep(Math.max(retryAfterSeconds, 1) * 1000 + jitterMs)
  }
  throw new Error('Rate limit still active after retries')
}
```

Do not bury this in the prompt. Put it in code. Prompts are not schedulers, circuit breakers, or distributed quota managers. For a deeper treatment, see how to handle third-party API rate limits when an AI agent is scraping data.
Building Your Own MCP Infrastructure vs. Using Truto
If you decide to build this infrastructure internally, you are committing to maintaining a distributed state machine for OAuth and a multi-tenant JSON-RPC server. Here is how the engineering effort compares:
| Capability | Building it yourself | Using Truto |
|---|---|---|
| Protocol handling | You must build and maintain a JSON-RPC 2.0 server that correctly handles `initialize`, `tools/list`, and `tools/call` for thousands of concurrent sessions. | Truto provides a fully managed JSON-RPC 2.0 endpoint. |
| Tool generation | You have to write custom logic to parse your API specs, filter out internal endpoints, and format them into LLM-friendly JSON Schemas. | Tools are generated dynamically from documentation records. Undocumented endpoints are automatically excluded. |
| Per-customer customization | You must build an override database and merge logic to handle tenant-specific custom fields, parameter changes, or prompt instructions. | Truto's environment-level overrides let you update tool descriptions and JSON Schema parameters for individual customers without altering the base integration. |
| Authentication state | You need a secure token vault, background workers for proactive token refresh, and circuit breakers for when third-party auth servers go down. | Truto manages the entire OAuth lifecycle, refreshing tokens shortly before they expire and scheduling work ahead of token expiry. |
| Multi-tenant isolation | You must build a routing layer that maps every incoming MCP request to the correct tenant's credentials. | Every generated MCP URL is cryptographically bound to a specific integrated account. |
| Rate limiting | You have to write middleware to catch 429s and translate them into standardized `ratelimit-*` headers for the LLM. | Truto uses a pass-through architecture that automatically normalizes upstream rate limit headers. |
For a single internal tool, building a basic MCP server takes a few days. For a multi-tenant B2B SaaS product, building a secure, scalable MCP infrastructure takes months of dedicated engineering time.
How to Embed Truto's MCP Capabilities in Your SaaS Today
Building out the infrastructure to handle OAuth lifecycles, JSON-RPC 2.0 protocol translation, and dynamic tool generation is a massive engineering undertaking. Truto abstracts this entire layer, allowing you to offer MCP servers to your users immediately.
Most SaaS teams don't need to build all of this. The economics flip when you can ship MCP across dozens of providers in days. Truto's catalog covers 120+ AI-ready integrations spanning CRM, HRIS, ATS, accounting, ticketing, calendar, file storage, and more.
Here is the practical onboarding path to embed this in your product:
Step 1: Connect Truto to your existing auth flow
Use Truto's connect flow (hosted or embedded) to capture your users' third-party credentials. Each successful connection produces an integrated_account record with credentials stored encrypted. OAuth refresh runs automatically: tokens are refreshed shortly before they expire, and the platform schedules refresh work ahead of token expiry so long-running agent calls don't fail mid-stream.
Step 2: Curate the tool surface per use case
For each integration, decide which resources should appear as tools and tag them by use case. A sales-ops agent needs contacts, deals, opportunities. A support agent needs tickets, comments, users. Documentation entries (description + JSON Schema for query and body) are the gate—undocumented methods never become tools.
Step 3: Wire the create-MCP-server call into your UI
Add a "Connect to Claude / ChatGPT" button on the integration detail page. When clicked, call `POST /integrated-account/:id/mcp` with the appropriate `methods`, `tags`, and `expires_at`. Display the returned URL to the user.
```typescript
// Pseudocode for your backend
const { url } = await truto.post(
  `/integrated-account/${integratedAccountId}/mcp`,
  {
    name: `${user.email} - read-only`,
    config: {
      methods: ['read'],
      tags: ['support'],
      require_api_token_auth: false,
    },
    expires_at: addDays(new Date(), 30).toISOString(),
  }
)

return { mcpUrl: url }
```

Step 4: Document the client setup for your users
For Claude, users go to Settings > Connectors > Add custom connector and paste the URL. For ChatGPT, they enable Developer mode under Settings > Apps > Advanced settings and add the URL as a custom connector. Both flows take under two minutes.
Step 5: Monitor, rotate, and revoke
MCP tokens are first-class records. List them, patch their config or expiry, or delete them outright. When a customer churns or rotates access, your code revokes the token; the platform cleans up both the lookup record and any scheduled expiration work.
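Revocation from your backend can be as small as one call. This is a hedged sketch: the DELETE route shown here mirrors the creation endpoint from Step 3 but is an assumption to verify against Truto's API reference, and `client` stands in for your thin HTTP wrapper:

```typescript
// Deleting the MCP server record invalidates the URL token; per the
// lifecycle described above, the platform also cleans up the lookup
// record and any scheduled expiration work.
interface HttpClient {
  delete(path: string): Promise<void>
}

async function revokeMcpServer(
  client: HttpClient,
  integratedAccountId: string,
  mcpServerId: string,
): Promise<void> {
  // Hypothetical route shape, mirroring POST /integrated-account/:id/mcp
  await client.delete(`/integrated-account/${integratedAccountId}/mcp/${mcpServerId}`)
}
```

Wire this into your churn and offboarding hooks so agent access dies with the account, not weeks later.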
What to Ship Next
Generating MCP servers for your users does not have to mean building a JSON-RPC handler, an OAuth refresh service, a token vault, an alarm scheduler, and 200 hand-curated tool definitions per integration. It can mean exposing the integrations you already need—through a platform that already speaks MCP, manages credentials, and enforces scoped access—and pointing your users at the result.
If you want to allow your customers to connect their AI agents to your SaaS product, the path is equally straightforward. You can request the Truto team to build an integration for your specific APIs. We map your endpoints, configure the OAuth flows, and establish the documentation records. Within 24 hours, you will have a customizable, production-ready integration.
Stop forcing your customers to build custom scripts just to talk to your API. Give them a secure, auto-updating MCP server and let their AI agents do the work.
FAQ
- What does it mean to generate an MCP server for a SaaS user?
- It means producing a unique, authenticated JSON-RPC 2.0 endpoint URL scoped specifically to that user's connected account. This exposes your product's API operations as callable tools that any MCP-compliant client (Claude, ChatGPT, Cursor) can discover and invoke.
- Can I generate an MCP server directly from my OpenAPI spec?
- You can scaffold tool schemas from an OpenAPI spec, but it won't handle runtime execution context like OAuth token refreshes, multi-tenant isolation, per-user scoping, or filtering out undocumented endpoints. Static generators work for local demos, but collapse in production SaaS environments.
- How do I prevent an AI agent from accessing too much data via MCP?
- You can mitigate identity drift by generating MCP servers with strict constraints: method filtering (e.g., read-only access), tag-based tool grouping to scope by functional area, automatic TTL expiration, and optionally requiring a secondary API token for authentication.
- How does Truto handle API rate limits for AI agents?
- Truto uses a pass-through architecture. It normalizes rate limit data into standard IETF headers (ratelimit-limit, ratelimit-remaining, ratelimit-reset) and passes 429 errors directly to the AI agent. This allows the agent's orchestrator to manage backoff logic naturally instead of middleware silently swallowing errors.
- How fast can I add a new integration if Truto doesn't have it?
- Truto supports 120+ AI-ready integrations out of the box. If you need a custom integration for your own SaaS API, it is typically built within 24 hours of receiving the API documentation. Once mapped, generating MCP servers for your users is a single API call per account.