Skip to content

Why Schema Normalization is the Hardest Problem in SaaS Integrations

API schema normalization is the hardest problem in SaaS integrations. Learn why standard 1:1 key-value mapping fails and how programmable JSONata mapping fixes it.

Roopendra Talekar Roopendra Talekar · · 6 min read
Why Schema Normalization is the Hardest Problem in SaaS Integrations

API schema normalization is the process of translating disparate data models from different third-party APIs into a single, canonical JSON format.

It is the hardest problem in B2B product integrations because software vendors fundamentally disagree on how to model reality. Building a custom integration is rarely just mapping first_name to firstName. It involves dealing with pagination quirks, undocumented rate limits, and conflicting entity relationships. Industry data shows that maintaining a single custom integration costs engineering teams between $50,000 and $150,000 annually.

Why API Data Mapping is a Nightmare

To understand why forcing competing data structures into a common shape is difficult, look at how different platforms define a "Contact."

In HubSpot, a Contact is a relatively flat object. In Salesforce, the schema is a sprawling web of standard and custom objects. Salesforce allows external IDs for integration matching, while HubSpot requires unique property constraints. Salesforce tracks modifications via LastModifiedDate, while HubSpot uses properties.hs_lastmodifieddate.

Consider ticketing systems. Zendesk views the world through a customer support lens, dividing users into Agents and Requesters, and even splitting its API into a Tickets API and a Requests API. Jira Service Management views the world through an agile engineering lens, tracking Issues with highly customizable workflows.

When you build direct integrations, your codebase absorbs this domain complexity. You end up with brittle conditional logic scattered across your application.

The Standard Unified API Flaw: Dumb Key-Value Mapping

Most Unified API providers claim they solve this complexity. They don't.

When competitors talk about "custom field mapping," they are usually referring to a basic UI that lets you map vendor_field_A to unified_field_B. It is a rigid, 1:1 key-value mapping layer.

This works for exactly five minutes until you hit the real world. What happens when a customer's CRM stores a full name, but your app requires first_name and last_name? What if a record's status is represented as an integer (1, 2, 3) in the vendor API, but your system expects a string ("active", "pending", "closed")?

A simple key-value mapper completely falls over here. It cannot handle data transformations. You are forced to pull the raw data and write custom transformation scripts on your end anyway, defeating the entire purpose of paying for a unified API.

A Better Architecture: Interface, Proxy, and Mapping

At Truto, we handle this through a Unified API architecture that cleanly separates three concerns:

  1. The Unified Interface: The canonical JSON Schema developers interact with.
  2. The Proxy Layer: The underlying transport engine that handles raw HTTP requests, authentication, and pagination.
  3. The Mapping Layer: The translation engine between the two.
Info

What is an Integration Mapping? An integration mapping is a configuration object that tells the routing engine how to translate between the canonical format and a specific integration's native format. It covers request queries, bodies, headers, paths, and responses.

When a request hits GET /unified/crm/contacts, our router does not immediately call the vendor. Middleware resolves the integrated account credentials, finds the unified model, loads the integration mapping, and extracts the resource schema. Only then does the core orchestration engine take over.

Programmable Response Mapping with JSONata

At Truto, we don't do dumb key-value mapping. Instead of writing custom Node.js or Python scripts for every endpoint, we use JSONata expressions evaluated at runtime.

This gives you a highly programmable transformation layer. You can execute actual logic: conditionals, string concatenation, array transforms, and date formatting directly in the mapping layer.

For example, mapping a HubSpot contact isn't just a 1:1 field link. The response mapping looks like this:

response.{
  "id": $string(id),
  "first_name": properties.firstname,
  "last_name": properties.lastname,
  "name": properties.firstname & ' ' & properties.lastname,
  "status": properties.is_active ? "active" : "inactive",
  "created_at": $fromMillis(properties.createdate * 1000)
}

Notice the logic. We are concatenating strings to build a full name, running a ternary operator to normalize the status, and doing math to format the Unix timestamp.

For Salesforce, the same unified "contacts" resource uses a different mapping:

response.{
  "id": $string(Id),
  "first_name": FirstName,
  "last_name": LastName,
  "name": Name,
  "status": IsDeleted ? "inactive" : "active",
  "created_at": CreatedDate
}

Both produce the exact same unified output shape, despite the completely different field names and data types in the raw APIs.

The Superpower: Overriding the Mapping Logic

Because our mapping layer is programmable, you can modify the mapping logic yourself.

Truto includes an Override System that lets you customize mappings at the environment level or the individual integrated account level. If your enterprise customer has a heavily customized Salesforce setup with complex conditional data, you don't have to wait for us to support it.

You simply inject an override JSONata expression that handles their specific logic and maps it directly to a first-class property in your unified schema. The override is deep-merged with the base mapping at runtime.

Request Mapping: Translating the Query and Body

Normalization goes both ways. When you send a unified GET request with pagination and sorting, the request mapper translates your unified query parameters into the vendor's format.

A JSONata query mapping for pagination and sorting looks like this:

{
  "page_size": query.limit ? $number(query.limit) : 100,
  "after": query.next_cursor,
  "sort": query.sort_by ? query.sort_by : "created_at",
  "order": query.sort_order ? query.sort_order : "desc"
}

For POST and PATCH requests, body mapping uses similar JSONata expressions. The mapped body is deep-merged with the default body from the integration mapping config, ensuring default values are preserved unless explicitly overridden.

Handling Structural Mismatches

Simple field mapping is the easy part. Structural mismatches are where custom integrations usually break down. Not all APIs use a single endpoint for a given resource.

Dynamic Resource Resolution

Sometimes, an integration requires calling different endpoints depending on the query parameters or request body contents. Our routing engine handles this using conditional arrays.

For example, an HRIS integration might route a request to list employees to different endpoints based on employment type:

{
  "resource": [
    { "resource": "employees/full-time", "query_param": "type", "query_param_value": "full_time" },
    { "resource": "employees/contractors", "query_param": "type", "query_param_value": "contractor" },
    { "resource": "employees" }
  ]
}

Multi-Step Workflows

Some operations cannot be achieved with a single API call. We use a step runner to execute before and after step sequences. These are chains of proxy or unified API calls with conditional logic (run_if). Steps can reference data from previous steps, making it possible to build complex, multi-call workflows without writing custom code.

Fetching a contact often means you need their associated company. Truto uses a related_resources configuration to make subsequent API calls and join the results. The engine builds query parameters from the primary results, fetches the related data via the proxy layer, and merges them using equality (eq) or contains (in) operators.

The Reality Check: Preserving the Escape Hatch

Abstractions leak. No unified model can predict every edge case. If an API provider tells you their schema covers 100% of scenarios, they are lying. The honest truth is that sometimes you just need the raw data.

This is why we built an escape hatch directly into the response object. Every mapped result in Truto includes a remote_data field containing the raw, unmodified third-party response. If you need a hyper-specific field that you haven't mapped yet, it is right there.

{
  "result": [
    {
      "id": "123", 
      "name": "John Doe", 
      "status": "active",
      "remote_data": {
        "Id": "123",
        "Name": "John Doe",
        "IsDeleted": false,
        "Custom_Industry_Field__c": "Manufacturing"
      }
    }
  ],
  "next_cursor": "abc987",
  "prev_cursor": null,
  "result_count": 1
}

Beyond that, the Unified API sits on top of our Proxy API (/proxy/*). If you need to bypass the unified schema entirely and make a raw, authenticated call to the vendor's native endpoint, the underlying transport layer allows it.

Strategic Wrap-up

Building integrations is a choice between control and velocity. Hardcoding vendor-specific logic into your app gives you control but destroys velocity.

Stop rebuilding the same mapping logic for every new customer request, and stop settling for unified APIs that rely on rigid 1:1 key-value mappers. Use a unified architecture that lets you execute real logic and override mappings yourself, while preserving the raw data escape hatches you inevitably need.

FAQ

What is the problem with standard custom field mapping in unified APIs?
Most competitors offer a basic 1:1 key-value mapping UI. This falls apart when you need to transform data—like combining a first and last name, formatting a timestamp, or converting an integer status to a string. It forces you to write custom transformation code anyway.
How does Truto's mapping logic differ from competitors?
Truto uses JSONata expressions instead of simple key-value pairs. This allows you to execute actual logic directly in the mapping layer, including conditionals, string concatenation, array transforms, and date formatting.
Can I modify the unified API mapping logic for specific customers?
Yes. Truto's Override System allows you to inject custom JSONata expressions at the environment or individual account level, letting you handle heavily customized vendor instances without waiting for platform updates.

More from our Blog

What is a Unified API?
Educational

What is a Unified API?

Discover what a unified API is and how it normalizes data across SaaS platforms to accelerate your integration roadmap and reduce engineering overhead.

Uday Gajavalli Uday Gajavalli · · 8 min read