What is an MCP Server? The 2026 Architecture Guide for SaaS PMs

An MCP server is a standardized bridge between AI models and your SaaS APIs. Learn how the architecture works, why remote deployment wins, and whether to build or buy.

Sidharth Verma · 14 min read

An MCP server is a lightweight service that exposes your application's capabilities — reading data, creating records, triggering workflows — to AI models like Claude, ChatGPT, or Gemini through a standardized protocol called the Model Context Protocol. If your engineering team is debating how to make your product "AI-ready" without building custom connectors for every AI platform, this is the concept you need to understand.

Before MCP existed, exposing your SaaS product to an AI model meant building custom, point-to-point connectors for OpenAI, Anthropic, Google, and every custom LangChain wrapper your customers decided to use. Ten AI models talking to ten data sources required a hundred custom integrations. MCP collapses that N×M problem into N+M: each AI application implements the client protocol once, each tool implements the server protocol once, and everything interoperates.

The protocol has moved fast. MCP is an open standard introduced by Anthropic in November 2024 to standardize the way AI systems integrate with external tools, systems, and data sources. In December 2025, Anthropic donated MCP to the Agentic AI Foundation (AAIF) under the Linux Foundation, co-founded by Anthropic, Block, and OpenAI with support from Google, Microsoft, and others. That governance shift matters for your roadmap: MCP is no longer a single vendor's experiment. It is a vendor-neutral standard backed by every major AI platform.

The adoption numbers confirm this isn't theoretical. MCP has become one of the fastest-growing open-source projects in AI, with over 97 million monthly SDK downloads, more than 10,000 active servers, and first-class client support across ChatGPT, Claude, Cursor, Gemini, Microsoft Copilot, Visual Studio Code, and more. As of early 2026, 28% of Fortune 500 companies have deployed MCP servers for production AI workflows.

This guide breaks down how the architecture works, why B2B SaaS companies are moving to remote servers, the real infrastructure costs of building in-house, and how to decide whether to build, buy, or wait.

What an MCP Server Actually Does

An MCP server translates between what an AI model wants to do and what your application's API can actually do. It advertises a list of available "tools" (typed actions with JSON Schema inputs and outputs), accepts requests from AI clients over JSON-RPC 2.0, executes those actions against your backend, and returns structured results.
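To make the tool advertisement concrete, here is a sketch of what a server's response to a `tools/list` request looks like on the wire. The tool name `search_tickets` and its schema are illustrative, not taken from any real server:

```python
import json

# Illustrative JSON-RPC 2.0 response to a `tools/list` request.
# The `search_tickets` tool and its schema are hypothetical examples.
tools_list_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {
                "name": "search_tickets",
                "description": "Search support tickets by priority and status.",
                "inputSchema": {
                    "type": "object",
                    "properties": {
                        "priority": {"type": "string", "enum": ["low", "normal", "high"]},
                        "status": {"type": "string"},
                    },
                    "required": ["priority"],
                },
            }
        ]
    },
}

print(json.dumps(tools_list_response, indent=2))
```

The `inputSchema` is plain JSON Schema, which is what lets any MCP client validate arguments before the call ever reaches your backend.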

The analogy that stuck across the industry is "USB-C for AI." Before USB-C, every device needed a different cable. Before MCP, every AI model needed a different connector to talk to external software. Build one MCP server for your product, and it works with Claude, ChatGPT, Gemini, Cursor, GitHub Copilot, and every other MCP-compatible client — now and in the future. You stop building bespoke integrations for each AI vendor and start building once.

For a PM, the practical implication is straightforward: you make a single investment in an MCP server, and your product becomes accessible to the entire AI ecosystem without additional connector work.

How the Model Context Protocol Architecture Works

MCP uses a client-server architecture with four distinct components. Getting this right matters because it determines where your engineering investment goes.

1. The MCP Host

The host is the application the end user sees — Claude Desktop, ChatGPT, Cursor, or your own custom agent. It owns the conversation, sends prompts to the model, receives tool requests back, executes them, and displays results.

2. The MCP Client

The client is a protocol handler embedded inside the host. It discovers what capabilities the server offers, validates the LLM's requested inputs against a strict JSON Schema, and manages the JSON-RPC 2.0 communication over HTTP or standard input/output (stdio).
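The validation step is worth seeing in miniature. The sketch below hand-rolls the two checks that matter most (required fields and primitive types); real clients use a full JSON Schema validator:

```python
def validate_args(args: dict, schema: dict) -> list[str]:
    """Minimal illustration of the validation an MCP client performs on an
    LLM's requested tool inputs. Real clients use a complete JSON Schema
    validator; this sketch only checks required fields and basic types."""
    errors = []
    props = schema.get("properties", {})
    for field in schema.get("required", []):
        if field not in args:
            errors.append(f"missing required field: {field}")
    type_map = {"string": str, "number": (int, float), "integer": int, "boolean": bool}
    for key, value in args.items():
        expected = props.get(key, {}).get("type")
        if expected in type_map and not isinstance(value, type_map[expected]):
            errors.append(f"{key}: expected {expected}")
    return errors

schema = {"required": ["priority"], "properties": {"priority": {"type": "string"}}}
print(validate_args({"priority": "high"}, schema))  # []
print(validate_args({}, schema))                    # ['missing required field: priority']
```

Catching a malformed request here, before it crosses the network, is what keeps a hallucinated argument from ever reaching your API.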

3. The MCP Server

This is the execution environment — your service. It advertises tools with typed parameters, handles authentication to the underlying system, executes requests against your REST APIs or databases, and returns structured JSON responses.

4. Tools, Resources, and Prompts

Each MCP server exposes up to three types of capabilities:

| Capability | What it does | Example |
| --- | --- | --- |
| Tools | Named actions with typed inputs/outputs | `create_a_salesforce_contact`, `list_all_zendesk_tickets` |
| Resources | Read-only data the model can pull into context | A knowledge base article, a config file |
| Prompts | Pre-built prompt templates for common workflows | "Summarize open support tickets from the last 7 days" |

Tools are the primary surface area for B2B SaaS. When a PM at Atlassian ships an MCP server for Jira, they are primarily exposing tools like create_issue, search_issues, and transition_issue. Resources and prompts are useful but secondary for most SaaS integration use cases.

Here is how these components interact during a standard execution flow:

```mermaid
sequenceDiagram
    participant User
    participant Host as MCP Host (Claude/ChatGPT)
    participant Client as MCP Client
    participant Server as MCP Server
    participant API as SaaS REST API

    User->>Host: "Find recent high-priority tickets"
    Host->>Client: Request available tools
    Client->>Server: tools/list (JSON-RPC)
    Server-->>Client: Returns [search_tickets, get_ticket]
    Host->>Host: LLM decides to use search_tickets
    Host->>Client: Execute search_tickets(priority="high")
    Client->>Server: tools/call (JSON-RPC)
    Server->>API: GET /api/v2/tickets?priority=high
    API-->>Server: 200 OK (JSON response)
    Server-->>Client: Formatted tool result
    Client-->>Host: Pass data to LLM
    Host-->>User: "Here are your high-priority tickets..."
```

When the client first connects to the server, it performs an initialization handshake. The client announces its capabilities, and the server responds with its protocol version and supported features. From there, all communication relies on standard JSON-RPC payloads.
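The handshake itself is two JSON-RPC messages. The sketch below shows their shape; the protocol version string and capability fields are illustrative, so check the MCP specification for the exact fields your target version requires:

```python
# Illustrative MCP initialization handshake as two JSON-RPC 2.0 messages.
# Field shapes and the version string are examples, not a spec reference.
initialize_request = {
    "jsonrpc": "2.0",
    "id": 0,
    "method": "initialize",
    "params": {
        "protocolVersion": "2025-06-18",
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.1.0"},
    },
}

initialize_response = {
    "jsonrpc": "2.0",
    "id": 0,
    "result": {
        "protocolVersion": "2025-06-18",
        "capabilities": {"tools": {"listChanged": True}},
        "serverInfo": {"name": "example-server", "version": "0.1.0"},
    },
}

# Per JSON-RPC 2.0, the response id must echo the request id.
print(initialize_response["id"] == initialize_request["id"])  # True
```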

For a deeper technical walkthrough of how MCP servers work, including the JSON-RPC message format and tool schemas, see our engineering deep-dive.

Local vs. Remote MCP Servers: Why B2B SaaS is Going Remote

This is the architectural fork that trips up most teams.

Local MCP servers run on the user's machine, communicate via stdio, and have direct access to local files and processes. A developer clones a GitHub repository, adds their personal API keys to an environment file, and runs a local Node.js or Python script. They are great for development, testing, and scenarios where data should never leave the workstation.

Remote MCP servers run on cloud infrastructure, are accessible via HTTPS, and serve multiple users from a central deployment. They handle authentication via OAuth 2.1, support multi-tenancy, and can be managed centrally by IT.

Local servers are fantastic for individual developers tinkering on their own machines. They are entirely useless for B2B SaaS companies trying to deploy production-grade AI features to thousands of enterprise customers. Most official MCP servers use stdio transport, meaning they run as local processes — which presents immediate issues for enterprise architecture: a single-user auth model, and the constraint that the MCP server must run on the same machine as the AI client.

Since May 2025, remote MCP server deployments are up nearly 4x. An analysis of the 20 most searched-for MCP servers showed that 80% provide remote server deployments. The reasons are entirely pragmatic:

  • Centralized Authentication: You cannot ask a VP of Sales to open a terminal, clone a repository, and paste a HubSpot API key into a .env file. Remote servers allow you to use standard OAuth 2.1 flows. The user clicks "Connect," authenticates in their browser, and the remote server handles the rest.
  • Multi-Tenant Security: Remote servers enforce strict tenant boundaries. A single cloud-hosted server infrastructure can securely isolate data for thousands of different companies.
  • Version Control and Updates: If an underlying third-party API changes, a local server breaks until the user manually pulls the latest code. With a remote server, you patch the integration once centrally, and all AI agents immediately use the updated logic.
  • Audit Logging: Enterprise procurement requires centralized logging of what actions AI agents take and when. Local servers have no built-in support for this.

| Factor | Local MCP Server | Remote MCP Server |
| --- | --- | --- |
| Deployment | User's laptop | Cloud infrastructure |
| Auth | API keys on disk | OAuth 2.1, SSO |
| Multi-user | No | Yes |
| Audit logging | No built-in support | Centralized |
| Best for | Dev/testing, personal use | Production B2B, team access |

The pattern is clear: organizations are moving from experimental local servers to production remote deployments that integrate with their existing identity and infrastructure systems. You can see this playing out when evaluating options like the best MCP server for Attio — the top choices for production use are all managed remote servers.

The Hidden Infrastructure Costs of Building Custom MCP Servers

Here is where PMs consistently underestimate scope. Standing up a proof-of-concept MCP server that wraps a single API takes a weekend. Shipping a production-grade, multi-tenant MCP server that passes an enterprise security review takes months.

A McKinsey study found that over 70% of companies cite difficulty integrating AI with existing systems as a top challenge, slowing time-to-value and increasing technical debt. MCP simplifies the protocol layer, but it does not eliminate the hard engineering underneath.

OAuth Token Management at Scale

AI agents do not sleep, and they do not respect human working hours. If an agent tries to execute a background task at 3:00 AM and the OAuth access token has expired, your system must seamlessly refresh that token before executing the tool call.

Managing OAuth lifecycles at scale is notoriously difficult. Refresh tokens expire if they go unused for too long. Every SaaS vendor implements OAuth slightly differently — scopes, token lifetimes, refresh behavior — and each one requires testing. Worse, concurrent tool calls from an aggressive LLM can trigger race conditions where multiple threads attempt to refresh the same token simultaneously, invalidating the entire token chain and forcing the user to re-authenticate manually. You need distributed locks, durable state storage, and proactive token refresh scheduling just to keep connections alive.
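A minimal sketch of the race-safe refresh pattern looks like this. A single in-process lock stands in for the distributed lock a multi-node deployment would actually need, and `refresh_fn` stands in for the call to the provider's token endpoint:

```python
import threading
import time

class TokenStore:
    """Sketch of race-safe OAuth token refresh. One in-process lock stands
    in for the distributed lock a multi-node deployment would need."""

    def __init__(self, refresh_fn, skew_seconds=60):
        self._refresh_fn = refresh_fn      # stand-in for the provider's token endpoint
        self._lock = threading.Lock()
        self._token = None
        self._expires_at = 0.0
        self._skew = skew_seconds          # refresh proactively, before hard expiry

    def get_token(self) -> str:
        with self._lock:                   # serialize concurrent refresh attempts
            if self._token is None or time.time() >= self._expires_at - self._skew:
                self._token, ttl = self._refresh_fn()
                self._expires_at = time.time() + ttl
            return self._token

calls = []
def fake_refresh():
    calls.append(1)
    return f"token-{len(calls)}", 3600     # (access_token, ttl_seconds)

store = TokenStore(fake_refresh)
threads = [threading.Thread(target=store.get_token) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(calls))  # 1 -- a single refresh despite 10 concurrent callers
```

Without the lock, ten concurrent tool calls could each present the refresh token to the provider, and providers that rotate refresh tokens on use would invalidate the chain.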

Rate Limiting and Idempotency

LLMs are highly capable, but they are also prone to loops and retries. If an AI agent misunderstands an error response, it might attempt to call a create_record tool fifty times in three seconds.

If your MCP server simply proxies these requests directly to the destination API, you will instantly trigger third-party rate limits and burn through a customer's API quota in minutes. Worse, if the endpoint is not idempotent, the LLM might create fifty duplicate records in your customer's CRM. A production MCP server requires circuit breakers, exponential backoff logic, and request deduplication to protect both your infrastructure and the third-party API.
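The two guardrails compose naturally: deduplicate identical writes by hashing the request, and retry transient failures with exponential backoff. This is a sketch under simplified assumptions (an in-memory cache, a `TimeoutError` standing in for upstream failures), not a production circuit breaker:

```python
import hashlib
import json
import time

seen_requests: dict[str, dict] = {}   # in-memory stand-in for a shared dedup store

def call_with_guardrails(tool_name, args, execute, max_retries=3):
    """Sketch: deduplicate identical write requests and retry transient
    failures with exponential backoff. `execute` stands in for the real
    upstream API call."""
    key = hashlib.sha256(json.dumps([tool_name, args], sort_keys=True).encode()).hexdigest()
    if key in seen_requests:           # idempotency: replay the cached result
        return seen_requests[key]
    for attempt in range(max_retries):
        try:
            result = execute(tool_name, args)
            seen_requests[key] = result
            return result
        except TimeoutError:
            time.sleep(min(2 ** attempt * 0.01, 1.0))  # exponential backoff (shortened for demo)
    raise RuntimeError(f"{tool_name} failed after {max_retries} attempts")

created = []
def fake_execute(name, args):
    created.append(args)
    return {"id": len(created)}

# The LLM retries the same create three times; only one record is made.
for _ in range(3):
    call_with_guardrails("create_record", {"name": "Acme"}, fake_execute)
print(len(created))  # 1
```

A real deployment would scope the dedup key per tenant and give cached results a TTL, but the principle is the same: the LLM's retry storm never reaches the destination API twice.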

Flat Input Namespaces and Schema Mapping

When an MCP client calls a tool, all arguments arrive as a single, flat JSON object. Most REST APIs require parameters to be split between query strings, URL paths, and JSON bodies.

Your MCP server has to intelligently parse the flat input from the LLM and map it to the correct destination. If a query schema and a body schema both define a property with the same name (like id), your server needs deterministic logic to decide where that parameter belongs. Writing and maintaining these mapping schemas for hundreds of API endpoints is a massive drain on engineering resources.
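The mapping problem is easy to underestimate until you write it down. The sketch below routes a flat argument object into path, query, and body buckets; the precedence rule (path > query > body) is an illustrative choice, and the point is that a real server needs one documented, deterministic rule:

```python
def split_flat_args(args, path_params, query_schema, body_schema):
    """Sketch: route a flat MCP argument object into URL path, query string,
    and JSON body buckets. The precedence (path > query > body) is an
    illustrative choice; production servers need a documented rule."""
    path, query, body = {}, {}, {}
    for key, value in args.items():
        if key in path_params:
            path[key] = value
        elif key in query_schema:
            query[key] = value
        elif key in body_schema:
            body[key] = value
    return path, query, body

path, query, body = split_flat_args(
    {"id": "T-42", "priority": "high", "comment": "escalate"},
    path_params={"id"},
    query_schema={"priority"},
    body_schema={"comment", "id"},   # `id` collides with the path parameter
)
print(path, query, body)
# {'id': 'T-42'} {'priority': 'high'} {'comment': 'escalate'}
```

Here the colliding `id` is claimed by the path bucket; whether that is the right answer depends on the endpoint, which is exactly why hand-maintaining these rules across hundreds of endpoints becomes a drain.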

The Cost Reality

Simple MCP servers are relatively affordable — basic database or API connections take 2–3 weeks with experienced developers. Enterprise integrations require serious investment: complex multi-system work runs 8–12 weeks plus security reviews and compliance documentation. Maintenance runs 20–30% annually.

MCP server development costs typically range from $25,000 to $50,000 for SMB MVP implementations and $60,000 to $120,000 for production-grade, multi-tenant SaaS deployments. Those numbers cover a single integration. If your product needs to connect to Salesforce, HubSpot, Jira, Zendesk, and a handful of HRIS platforms, you are multiplying that cost per integration — each with its own API quirks, auth flows, and pagination schemes.

Real case studies tell a consistent story. An e-commerce company connecting Shopify plus inventory: 4 weeks, 2 developers, with total first-year costs running 2–3x the initial estimate. A healthcare startup working with Epic and claims processing: 12 weeks, 3 developers, with first-year totals running 3–4x the initial quote once compliance documentation and security audits were factored in. The pattern is consistent: the protocol itself is simple; the production infrastructure around it is not.

For a deeper comparison, see our analysis of building versus buying MCP infrastructure.

How to Deploy AI-Ready Integrations Without Custom Code

Enterprises report a 70% reduction in AI operational costs and 50–75% savings in development time by using managed MCP platforms instead of building custom point-to-point integrations. This is where the build-vs-buy decision gets interesting. If integrations are a means to an end rather than your core product, the engineering hours poured into custom MCP servers are hours not spent on your actual product.

Modern integration platforms solve the MCP architecture problem by abstracting away the server infrastructure entirely. Instead of writing custom code for every tool, platforms dynamically generate MCP tools from existing API documentation and resource schemas.

Dynamic Tool Generation

Rather than hand-coding tool definitions, a managed platform derives them dynamically. The system reads the integration's resource definitions and OpenAPI documentation, extracting human-readable descriptions, query schemas, and body schemas.

```mermaid
flowchart LR
    A[Integration Config<br>base URL, auth, resources] --> C[Tool Generator]
    B[API Documentation<br>descriptions, JSON schemas] --> C
    C --> D[MCP Server Endpoint<br>JSON-RPC 2.0]
    D --> E[Claude / ChatGPT /<br>Custom Agent]
```

A tool only appears in the MCP server if it has a corresponding documentation entry. This acts as an automated quality gate: only curated, well-described endpoints get exposed to AI agents. The platform automatically injects helpful instructions into schemas — for instance, instructing the LLM to pass pagination cursors back exactly as received without modifying them. Adding a new tool is a configuration change, not a code deployment.
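A minimal sketch of that quality gate, with hypothetical field names (no specific platform's schema is implied): tools are derived by joining resources against their documentation entries, and any undocumented method simply never becomes a tool.

```python
def generate_tools(resources, docs):
    """Sketch of dynamic tool generation: derive MCP tool definitions by
    joining resource methods against documentation entries. Undocumented
    endpoints are skipped -- the automated quality gate described above.
    Field names here are illustrative."""
    tools = []
    for resource, methods in resources.items():
        for method in methods:
            entry = docs.get((resource, method))
            if entry is None:
                continue                  # no docs entry -> no tool exposed
            tools.append({
                "name": f"{method}_{resource}",
                "description": entry["description"],
                "inputSchema": entry.get("schema", {"type": "object"}),
            })
    return tools

docs = {
    ("tickets", "list"): {"description": "List tickets, newest first."},
    # ("tickets", "delete") has no docs entry, so no delete tool appears
}
tools = generate_tools({"tickets": ["list", "delete"]}, docs)
print([t["name"] for t in tools])  # ['list_tickets']
```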

Scoping AI Access with Method Filters and Tool Tags

One of the biggest concerns PMs have about exposing APIs to AI agents is over-permissioning. An agent that can read contacts is useful; an agent that can accidentally delete your customer's entire CRM database is a liability.

Method filtering restricts what operations are allowed:

| Filter | Allows |
| --- | --- |
| read | get, list only |
| write | create, update, delete |
| custom | Non-CRUD operations (e.g., search, export) |
| list | Exact match — only list |

You can combine these: ["read", "custom"] exposes get, list, and any custom methods while blocking create, update, and delete.
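The combination logic can be sketched in a few lines. The filter names mirror the table above, but the exact expansion semantics vary by platform, so treat this as an assumption-laden illustration:

```python
def allowed_methods(filters, custom_methods=()):
    """Sketch of method-filter expansion. Filter names mirror the table
    above; exact semantics vary by platform."""
    expansion = {
        "read": {"get", "list"},
        "write": {"create", "update", "delete"},
        "custom": set(custom_methods),
    }
    allowed = set()
    for f in filters:
        allowed |= expansion.get(f, {f})  # unrecognized names treated as exact matches
    return allowed

print(sorted(allowed_methods(["read", "custom"], custom_methods=["search"])))
# ['get', 'list', 'search']
```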

Tool tags restrict which resources are exposed. For example, a Zendesk integration might tag tickets and ticket_comments with support and users with directory. Creating an MCP server scoped to the support tag exposes only ticket-related tools, completely hiding administrative or billing endpoints from the LLM.

This granularity matters for enterprise security reviews. You can hand a customer's AI agent a read-only MCP server scoped to support tickets, and their security team can verify that the agent has no path to modifying CRM data or accessing employee records.

Tip

Architectural Best Practice: Always apply the principle of least privilege to MCP servers. If your AI agent only needs to summarize CRM data, configure the MCP server with a strict read-only method filter. Do not rely on the LLM's system prompt to enforce security boundaries.

Tenant-Scoped Security and Managed Auth

Every MCP server should be scoped to a single connected account — one customer's Salesforce instance, one customer's HubSpot workspace. The server URL contains a cryptographic token that encodes which account to use and what tools to expose, meaning the URL itself authenticates and authorizes every request.

The platform handles the entire OAuth lifecycle behind the scenes. When a tool is called, the platform retrieves the correct, unexpired access token, injects it into the request, and proxies the call to the destination API. The AI agent never sees or interacts with raw API keys.

For higher-security environments, you can layer additional authentication on top — requiring the MCP client to also present a valid API token. This two-layer approach means possession of the URL alone is not enough; the caller must also be authenticated.
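The URL-as-credential pattern can be sketched with an HMAC-signed token. This is an illustrative construction, not any platform's actual scheme, and a real deployment would keep the signing secret in a vault and add expiry:

```python
import base64
import hashlib
import hmac
import json

SECRET = b"server-side-secret"   # illustrative; keep real secrets in a vault

def mint_server_token(account_id, tool_tags):
    """Sketch: encode tenant scope into a signed token embedded in the MCP
    server URL, so the URL itself authorizes every request."""
    payload = json.dumps({"account": account_id, "tags": tool_tags}, sort_keys=True).encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + sig

def verify_server_token(token):
    encoded, sig = token.rsplit(".", 1)
    payload = base64.urlsafe_b64decode(encoded)
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):   # constant-time comparison
        raise PermissionError("invalid token")
    return json.loads(payload)

token = mint_server_token("acct_123", ["support"])
print(verify_server_token(token))
# {'account': 'acct_123', 'tags': ['support']}
```

Because the scope is signed into the token, a tampered URL fails verification outright, and the optional second layer (an API token presented by the MCP client) sits on top of this check.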

For a practical example of this architecture in action, see how managed MCP for Claude handles auth and scoping.

The Enterprise MCP Roadmap: What's Coming in 2026

The MCP spec is not standing still, and the 2026 roadmap directly addresses the gaps that stall enterprise deployments.

Enterprise authentication (Q2 2026) adds OAuth 2.1 flows with PKCE for browser-based agents and SAML/OIDC integration for enterprise identity providers. This unlocks deployments in regulated industries that require enterprise-grade auth.

Agent-to-agent coordination (Q3 2026) enables one agent to call another through MCP, as if the second agent were a tool server. This creates hierarchical agent architectures where orchestrator agents delegate to specialized sub-agents.

MCP Registry (Q4 2026) provides a curated, verified server directory with security audits, usage statistics, and SLA commitments. Enterprise teams will evaluate servers against security requirements before deployment.

The 2026 roadmap validates MCP as a protocol growing from a developer tool into enterprise infrastructure. But the enterprise-readiness items are still pre-RFC. If you are deploying MCP in production today, the pragmatic approach is to architect for these gaps now and swap in the standardized versions as they land.

How to Decide: Build, Buy, or Wait?

The right answer depends on where integrations sit in your product strategy.

| Scenario | Recommendation |
| --- | --- |
| Integrations are your core product | Build custom MCP servers. You need full control over tool definitions, schemas, and behavior. |
| Integrations support your core product (you need 10+ SaaS connectors) | Use a managed platform. The cost of building and maintaining dozens of custom servers will outpace the platform cost within a quarter. |
| You're exploring AI features, single integration | Start with an open-source MCP server for that specific vendor. Move to managed when you hit integration #3. |
| Enterprise customers are asking but you're pre-PMF | Wait. Focus on your core product. Revisit when integration requests become a pattern. |

The economics tilt heavily toward managed platforms once you cross the 3–5 integration threshold. Without MCP, a single custom integration between an AI model and an enterprise system typically costs €5,000–€15,000 with two to six weeks of development time. Across ten systems, that is €50,000–€150,000. With MCP you pay for server development once, and every additional AI application reuses the same servers. Most organizations hit break-even at the third or fourth integration.
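The break-even arithmetic is simple enough to sketch. The per-integration figure uses the midpoint of the range quoted above; the one-time MCP build figure is an assumption for illustration, not a quoted price:

```python
# Back-of-envelope break-even. EUR 10,000 is the midpoint of the
# EUR 5,000-15,000 per-integration range quoted above; the EUR 30,000
# one-time MCP server build is an assumed figure for illustration.
CUSTOM_PER_INTEGRATION = 10_000
MCP_ONE_TIME_BUILD = 30_000

for n in range(1, 6):
    custom = n * CUSTOM_PER_INTEGRATION
    print(f"{n} integrations: custom EUR {custom:,} vs MCP EUR {MCP_ONE_TIME_BUILD:,}")
# Under these assumptions, custom costs overtake the one-time build
# at the fourth integration -- consistent with the break-even range above.
```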

With a managed integration platform like Truto, the math shifts further: you get dynamically generated MCP tools across 200+ SaaS platforms, managed OAuth and token refreshes, tenant-scoped security, and method/tag-based access control — without writing integration-specific code.

The honest caveat: managed platforms add a dependency. You trade engineering time for vendor risk. If the platform goes down, your AI integrations go down. That is a real trade-off, and one you should evaluate against your team's capacity and your product's integration ambitions.

What This Means for Your 2026 Roadmap

MCP is not a trend that reverses. Within a year, it moved from a niche experiment to Linux Foundation governance, with adoption across every major AI platform and toolchain. Here is what to do next:

  1. Audit your API surface. Identify which resources and methods in your product are most valuable to AI agents. Start with read-only operations on your most-requested data objects.
  2. Define your scoping model. Decide which tools should be exposed by default and which require explicit opt-in. Method filters and tag-based grouping give you fine-grained control.
  3. Pick your build path. If you have a single, core integration, build it. If you are connecting to multiple SaaS platforms on behalf of your customers, evaluate managed MCP platforms against the true cost of custom development.
  4. Ship read-only first. Give your customers' AI agents read access before write access. This limits the blast radius of mistakes while you learn how agents actually use your tools.
  5. Layer enterprise security controls. Scope each MCP server to a single tenant's connected account. Add optional API token authentication for high-security environments and set expiration policies for temporary access scenarios.

The companies that expose their product to AI agents in 2026 will be the ones that close enterprise deals in 2027. The ones that wait will be explaining to their board why their competitor's product shows up in every Claude and ChatGPT workflow and theirs does not.

Frequently Asked Questions

What is an MCP server in simple terms?
An MCP server is a service that exposes your application's capabilities (reading data, creating records, triggering actions) to AI models like Claude or ChatGPT through the Model Context Protocol. Think of it as a universal adapter between your API and any AI agent — build one server, and it works with every MCP-compatible client.
What is the difference between a local and remote MCP server?
A local MCP server runs on the user's machine and communicates via stdio, making it ideal for development and personal use. A remote MCP server runs on cloud infrastructure, is accessible via HTTPS, supports multi-user access and centralized auth (OAuth 2.1), and is the standard choice for production B2B SaaS deployments.
How much does it cost to build a custom MCP server?
Simple API connectors take 2-3 weeks and cost $25,000-$50,000. Enterprise-grade, multi-tenant MCP servers with compliance documentation typically run $60,000-$120,000 and take 8-12 weeks. Annual maintenance adds 20-30% of the initial build cost, and real-world first-year totals consistently run 2-4x the initial estimate.
Is MCP the same as LLM function calling?
No. LLM function calling is a vendor-specific mechanism (e.g., OpenAI's function calling API) for defining tool schemas within a single platform. MCP is a vendor-neutral, open protocol that standardizes tool discovery and execution across all compatible AI platforms, eliminating the need to build separate integrations for each AI vendor.
What is the difference between an MCP client and an MCP server?
The MCP client lives inside the AI application (like Claude Desktop or ChatGPT) and manages the connection — discovering tools, validating inputs, and routing requests. The MCP server is the service that hosts the actual tools and executes API calls against external systems like Salesforce, Jira, or HubSpot.
