---
title: "Bidirectional HubSpot Sync Tutorial: Rate Limits, Loops, & Reconciliation"
slug: bidirectional-hubspot-sync-tutorial-rate-limits-reconciliation
date: 2026-05-27
author: Yuvraj Muley
categories: [Engineering, Guides, By Example]
excerpt: "Learn how to architect a reliable bidirectional HubSpot sync. This technical guide covers handling 429 rate limits, preventing infinite webhook loops, and schema normalization."
tldr: "Bidirectional HubSpot integrations fail due to strict 429 rate limits, complex schemas, and infinite webhook loops. Prevent this by separating Search queues, using compare-before-write fingerprinting, and reconciling drift on a schedule."
canonical: https://truto.one/blog/bidirectional-hubspot-sync-tutorial-rate-limits-reconciliation/
---

# Bidirectional HubSpot Sync Tutorial: Rate Limits, Loops, & Reconciliation


You are reading this because your bidirectional HubSpot integration is failing in production. You are hitting HTTP 429 Too Many Requests errors, your webhooks are creating infinite update loops, and your engineering team is spending cycles debugging schema mapping instead of building core product features.

One-way data pushes are simple. True bidirectional sync—where both your application and HubSpot can create, update, and delete records simultaneously without data loss, loops, or duplicates—is an entirely different architectural challenge. Shipping one is mostly an exercise in handling the parts HubSpot doesn't make easy: the tiered rate limits that surface mid-sync, the webhook echoes you create the moment you wire both directions, and the dynamic `properties` schema that turns field mapping into a translation project.

Stale CRM data is a revenue problem, not just an engineering annoyance. B2B contact data decays at approximately 2.1% per month, which compounds to roughly 22.5% annually, meaning nearly a quarter of your database could be outdated within a year. For a tech-heavy customer base, it is even worse. Gartner research indicates that B2B contact data can decay by as much as 70.3% per year under certain conditions, reflecting varying industry dynamics, job market volatility, and organizational restructuring. If your product usage data, lead scores, and engagement signals do not sync reliably with the CRM in near real-time, sales reps operate on stale data, directly impacting revenue.

This guide is a runnable walkthrough. We break down the hardest problems in bidirectional CRM syncs, complete with a real rate limiter that reads standardized headers, a reconciliation worker that breaks infinite loops with fingerprinting, and an honest read on which parts a unified API actually removes versus which parts you still own.

## The Architecture of a Bidirectional HubSpot Sync

Before writing any code, we must define what a bidirectional sync actually requires at the system level. A naive implementation simply listens for webhooks from HubSpot and fires HTTP requests back whenever local data changes. This approach will fail within the first week of production traffic.

A production-grade bidirectional sync requires four distinct architectural components:

1. **An Ingestion Layer:** Webhooks will arrive late, out of order, or duplicated. Your system must reconcile incoming state against local state before applying updates.
2. **A Rate-Aware Outbound Queue:** You cannot fire API requests synchronously in response to user actions. You must buffer outbound updates and drain them according to the provider's specific burst limits.
3. **A State Store with Content Fingerprints:** You must have a deterministic way to recognize if an incoming webhook was triggered by your own system's previous API call.
4. **A Reconciliation Worker:** A background process that periodically diffs local state against remote state to catch the inevitable dropped webhooks and network partitions.

```mermaid
flowchart LR
  A[Your App<br>Write Event] --> Q1[Outbound Queue]
  Q1 --> RL[Rate Limiter<br>token bucket]
  RL --> HS[HubSpot API]
  HS -- 429 + ratelimit-reset --> RL
  HS -- webhook --> IN[Inbound Handler]
  IN --> FP{Fingerprint<br>match?}
  FP -- yes --> DROP[Drop: self-loop]
  FP -- no --> APP[Apply to Your App]
```

If you lack any of these components, you will experience data corruption, API throttling, or total system lockup. For a deeper dive into these core concepts, review our guide on [how to architect a bidirectional HubSpot sync](https://truto.one/how-to-sync-customer-data-bidirectionally-between-your-app-and-hubspot/).

## Handling HubSpot's 429 Rate Limits Without Losing Data

A HubSpot 429 Too Many Requests error occurs when your application exceeds one of the API's limits, most commonly the burst limit. HubSpot measures burst usage over a 10-second rolling window. The exact ceiling depends on the app type and the customer's plan:

| App / Plan | Burst limit | Source |
|---|---|---|
| Public / Marketplace OAuth app | 110 req / 10s | HubSpot usage guidelines |
| Private app, Free/Starter | 100 req / 10s | HubSpot community confirmation |
| Private app, Pro/Enterprise | 190 req / 10s | HubSpot changelog |
| Private app with API Limit Increase pack | 250 req / 10s | HubSpot changelog |
| CRM Search API (account-wide) | 5 req / sec | HubSpot changelog |

For apps using OAuth authentication distributed via the HubSpot marketplace, each HubSpot account that installs your app is limited to 110 requests every 10 seconds. Private apps on larger plans get more headroom, with burst limits up to 190 requests per 10 seconds, and the API Limit Increase capacity pack adds 1 million requests per day on top of that.

### The Search API Bottleneck

Many engineering teams build their sync logic assuming the general burst limit applies universally. It does not. The HubSpot Search API—which you must use if you are looking up records by email or custom properties before updating them—is aggressively throttled.

The burst limit for the search API is capped at 5 requests per second, with a maximum supported object count of 200 records per response. That 5 rps cap is account-level: if 10 users hit the Search API at the same second, the account is still capped at 5 rps, so some calls will 429. If you deduplicate contacts by email before inserting them, that is a Search call. At scale, you must maintain **separate queue concurrency limits** for search operations versus standard CRUD operations.
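
One way to enforce that split is to give every HubSpot account two independent limiters. The sketch below is a minimal in-memory sliding-window limiter, not a production component: the `SlidingWindowLimiter` class, its `tryAcquire` method, and the injectable clock are illustrative names, and the ceilings are set slightly under the documented limits on the assumption that you want headroom. A multi-process deployment would need a shared store instead of process memory.

```typescript
// Minimal in-memory sliding-window limiter (illustrative; single process only).
class SlidingWindowLimiter {
  private timestamps: number[] = [];

  constructor(
    private readonly maxRequests: number,
    private readonly windowMs: number,
    private readonly now: () => number = Date.now,
  ) {}

  // Returns 0 if a request may go out now (and records it),
  // otherwise the number of milliseconds to wait before trying again.
  tryAcquire(): number {
    const t = this.now();
    // Forget requests that have left the rolling window
    this.timestamps = this.timestamps.filter(ts => t - ts < this.windowMs);
    if (this.timestamps.length < this.maxRequests) {
      this.timestamps.push(t);
      return 0;
    }
    // The oldest request in the window decides when the next slot opens
    return this.timestamps[0] + this.windowMs - t;
  }
}

// One pair of limiters per HubSpot account, so Search traffic
// can never starve (or be starved by) ordinary CRUD traffic.
function createAccountLimiters(now: () => number = Date.now) {
  return {
    general: new SlidingWindowLimiter(100, 10_000, now), // under 110 req / 10s
    search: new SlidingWindowLimiter(4, 1_000, now),     // under 5 req / sec
  };
}
```

A worker would call `limiters.search.tryAcquire()` before each Search request and sleep for the returned number of milliseconds when it is non-zero, while CRUD requests go through `limiters.general` untouched.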

### Standardizing Rate Limit Headers

Handling backoff logic across multiple integrations usually requires writing custom interceptors for every provider. HubSpot returns different rate limit headers than Salesforce, which returns different headers than Zendesk.

> [!WARNING]
> **Truto does not magically retry or absorb HTTP 429 errors for you.** When HubSpot returns a rate limit error, Truto surfaces it directly to your caller. However, it normalizes the upstream rate limit headers into the IETF-standard `ratelimit-limit`, `ratelimit-remaining`, and `ratelimit-reset` headers across every provider. Your worker reads those headers; your worker does the backoff. This is the right architectural split: only your code knows your priority order and SLA.

Here is a Node.js retry wrapper that reads the standardized headers and respects the reset window. It works exactly the same whether the upstream is HubSpot, Salesforce, or Pipedrive.

```typescript
import pRetry, { AbortError } from 'p-retry';

async function callWithBackoff(
  fn: () => Promise<Response>,
  opts: { maxAttempts?: number } = {}
): Promise<Response> {
  return pRetry(async () => {
    const res = await fn();

    // Read the normalized IETF headers from Truto
    const remaining = Number(res.headers.get('ratelimit-remaining') ?? '1');
    const resetSec = Number(res.headers.get('ratelimit-reset') ?? '1');

    if (res.status === 429) {
      // Wait until the bucket resets, plus jitter to avoid thundering herd
      const waitMs = (resetSec * 1000) + Math.floor(Math.random() * 250);
      console.warn(`[429 Throttled] Retrying in ${waitMs}ms...`);
      await new Promise(r => setTimeout(r, waitMs));
      throw new Error('rate_limited'); // triggers pRetry to attempt again
    }

    if (res.status >= 500) throw new Error(`upstream_${res.status}`);
    if (res.status >= 400) throw new AbortError(`client_${res.status}`);

    // Pre-emptive slowdown if we're about to empty the bucket
    if (remaining <= 2) {
      await new Promise(r => setTimeout(r, (resetSec * 1000) / 2));
    }
    
    return res;
  // p-retry counts retries after the first attempt, so subtract one from maxAttempts
  }, { retries: (opts.maxAttempts ?? 6) - 1, factor: 2, minTimeout: 250 });
}
```

Two design notes engineers regularly get wrong:
1. **Run two queues, not one.** If you process search calls in the same queue as general writes at 8 rps, the search calls will 429 while your general calls look fine.
2. **Read the bucket before it empties.** Once `ratelimit-remaining` drops to 1 or 2, slow down voluntarily. Waiting for the 429 wastes a round trip and pollutes your error logs.

For more context on managing these limits across different platforms, see our [best practices for handling API rate limits](https://truto.one/best-practices-for-handling-api-rate-limits-and-retries-across-multiple-third-party-apis/).

## Preventing 'Vampire Records' and Infinite Webhook Loops

**A [vampire record](https://truto.one/the-architects-guide-to-bi-directional-api-sync-without-infinite-loops/) is a CRM record that bounces infinitely between two systems in a bidirectional sync.** Your app updates a contact. You push the update to HubSpot. HubSpot fires a `contact.propertyChange` webhook. Your webhook handler treats it as a remote update and writes it back to your database. Your database mutation fires your internal event bus. Your sync worker pushes the same update back to HubSpot. This loop continues forever.

```mermaid
sequenceDiagram
    participant App as Your App
    participant HS as HubSpot
    
    Note over App,HS: The Infinite Loop (Vampire Record)
    App->>HS: PATCH /contacts/123 (Update Title)
    HS-->>App: Webhook: contact.propertyChange
    Note over App: App receives webhook,<br>updates local DB
    App->>HS: PATCH /contacts/123 (Triggered by DB update)
    HS-->>App: Webhook: contact.propertyChange
    Note over App: Loop continues infinitely...
```

If left unchecked, an infinite loop will rapidly consume your API quota, trigger massive 429 errors, and flood your database with meaningless audit logs. Three architectural patterns actually break this loop in production:

### Strategy 1: Content Fingerprinting (Compare-Before-Write)

Before writing an update to your database or sending an update to HubSpot, you hash the normalized payload. You store this hash in a fast key-value store like Redis, or directly on the database record. When a webhook arrives, you hash the incoming data and compare it to the last known outbound hash. If they match, the data has not actually changed semantically, and you drop the event.

```typescript
import { createHash } from 'crypto';

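// Sort keys recursively so { a:1, b:2 } and { b:2, a:1 } match. A replacer array
// passed to JSON.stringify would also filter nested keys, so sort explicitly.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value === null || typeof value !== 'object' || value instanceof Date) return value;
  const source = value as Record<string, unknown>;
  return Object.fromEntries(Object.keys(source).sort().map(k => [k, canonicalize(source[k])]));
}

function generateRecordFingerprint(record: Record<string, unknown>): string {
  return createHash('sha256').update(JSON.stringify(canonicalize(record))).digest('hex');
}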
```

*Crucial detail:* Fingerprint only the fields you actually sync. Server-side fields like `hs_lastmodifieddate` or computed lead scores will always diverge and produce false negatives, breaking your deduplication logic.
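
The later steps call a `pickSyncedFields` helper for exactly this purpose. A minimal sketch of it might look like the following; the allow-list contents are hypothetical (only `first_name`, `last_name`, and `email_addresses` appear in this guide's unified model, and `title` is an assumed extra), and the empty-value handling is one reasonable normalization choice rather than a requirement.

```typescript
// Hypothetical allow-list: only the fields that both systems are allowed to write.
const SYNCED_FIELDS = ['first_name', 'last_name', 'email_addresses', 'title'] as const;

function pickSyncedFields(record: Record<string, unknown>): Record<string, unknown> {
  const picked: Record<string, unknown> = {};
  for (const field of SYNCED_FIELDS) {
    const value = record[field];
    // Treat missing, null, and empty string as the same "no value" state, so a
    // provider that returns "" where we sent null does not change the fingerprint.
    if (value === undefined || value === null || value === '') continue;
    // Trim strings so incidental whitespace does not register as a change
    picked[field] = typeof value === 'string' ? value.trim() : value;
  }
  return picked;
}
```

Anything outside the allow-list, including timestamps and computed scores, never reaches the hash.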

### Strategy 2: Dedicated Integration User Filtering

The cleanest mechanical way to prevent loops is to use a dedicated integration user (e.g., `sync-worker@yourcompany.com`). Ensure all API writes happen via this specific user identity. On inbound webhooks, inspect the payload. If the author field (for example `modified_by` or `updatedByUserId`, depending on the payload) matches your integration user, drop the webhook immediately.

However, if you cannot guarantee a dedicated integration user (which is common in multi-tenant OAuth setups where end-users authorize the app via their personal accounts), you cannot rely on this strategy alone.
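
As a rough sketch, the filter itself is a few lines. The `InboundChange` shape below is an assumption: which attribution field is present depends on the provider and on how the webhook is normalized, so both candidates are optional, and `isWrittenByIntegrationUser` is an illustrative name.

```typescript
// Hypothetical event shape: either attribution field may be absent.
interface InboundChange {
  modified_by?: string;
  updatedByUserId?: string;
}

function isWrittenByIntegrationUser(
  change: InboundChange,
  integrationUserId: string,
): boolean {
  const author = change.modified_by ?? change.updatedByUserId;
  // No attribution means we cannot prove this is our own echo, so keep the event
  if (author === undefined) return false;
  return author === integrationUserId;
}
```

Failing open when attribution is missing is deliberate: dropping a genuine human edit is worse than processing one extra echo, which the fingerprint check can still catch.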

### Strategy 3: Origin Tokens in Custom Properties

Write a short `sync_origin` property on every outbound update. On inbound, if the property matches your worker's identifier and the timestamp is within a few seconds of your write, drop it. This handles edge cases where the integration user attribution is lost by HubSpot workflows.
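
A minimal sketch of that check, assuming the token is encoded as `<workerId>:<epoch milliseconds>` and that "a few seconds" means five; the encoding, the function names, and the window length are all illustrative choices rather than anything HubSpot prescribes.

```typescript
const ECHO_WINDOW_MS = 5_000; // assumed value for "a few seconds"; tune to your webhook latency

// Value written to the sync_origin property on every outbound update
function buildOriginToken(workerId: string, writtenAtMs: number): string {
  return `${workerId}:${writtenAtMs}`;
}

// True only when the token names this worker AND the write was recent.
function isOriginEcho(
  token: string | undefined,
  workerId: string,
  receivedAtMs: number,
): boolean {
  if (!token) return false;
  const separator = token.lastIndexOf(':');
  if (separator === -1) return false;
  const origin = token.slice(0, separator);
  const writtenAtMs = Number(token.slice(separator + 1));
  if (origin !== workerId || !Number.isFinite(writtenAtMs)) return false;
  const ageMs = receivedAtMs - writtenAtMs;
  return ageMs >= 0 && ageMs <= ECHO_WINDOW_MS;
}
```

The timestamp matters because the property keeps its last value: a human edit made minutes after your write still carries your worker's identifier, and only the age check stops it from being dropped as an echo.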

> [!TIP]
> Use fingerprinting **and** integration user filtering together. Fingerprinting catches semantic echoes ("the data is the same"), and user filtering catches mechanical echoes ("we just wrote this"). Neither is sufficient alone. Our cookbook on [preventing infinite loops in bidirectional API syncs](https://truto.one/how-to-prevent-infinite-loops-in-bidirectional-api-syncs-a-developers-cookbook/) goes deeper into edge cases like burst writes within the same property-change debounce window.

## Normalizing HubSpot's Schema and filterGroups

If you are building the integration from scratch, rate limits and loops are only half the battle. The other half is schema mapping.

HubSpot's API returns data in a nested `properties` object. Instead of flat fields like `{ "first_name": "John" }`, you receive `{ "id": "123", "properties": { "firstname": "John", "hs_additional_emails": "john@example.com", "arr_estimate": "50000" } }`. Custom properties such as `arr_estimate` carry no marker; they are mixed into the same blob as standard fields.

Search is worse. To search for a contact, you must construct a `filterGroups` array using HubSpot-specific operators. A token-match filter on first name, the closest thing HubSpot offers to a string-contains search, looks like this:

```json
{
  "filterGroups": [{
    "filters": [
      { "propertyName": "firstname", "operator": "CONTAINS_TOKEN", "value": "John" }
    ]
  }],
  "properties": ["firstname", "lastname", "email"],
  "limit": 100
}
```

If you hardcode this logic into your application, you now own integration-specific code. When you inevitably need to add Salesforce support, you will have to write entirely new logic to handle Salesforce's PascalCase flat fields and SOQL query syntax.
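
To make that cost concrete, here is roughly what the hand-written translation from a flat query to a HubSpot search body looks like. It is a sketch under assumptions: the `FIELD_TO_PROPERTY` map, the `toHubSpotSearch` name, and the rule "exact match for email, token match for everything else" are illustrative, and only the `EQ` and `CONTAINS_TOKEN` operators are covered.

```typescript
interface HubSpotFilter {
  propertyName: string;
  operator: 'EQ' | 'CONTAINS_TOKEN';
  value: string;
}

interface HubSpotSearchBody {
  filterGroups: { filters: HubSpotFilter[] }[];
  properties: string[];
  limit: number;
}

// Hypothetical mapping from flat field names to HubSpot property names
const FIELD_TO_PROPERTY: Record<string, string> = {
  first_name: 'firstname',
  last_name: 'lastname',
  email: 'email',
};

function toHubSpotSearch(query: Record<string, string>, limit = 100): HubSpotSearchBody {
  const filters: HubSpotFilter[] = [];
  for (const field of Object.keys(query)) {
    const propertyName = FIELD_TO_PROPERTY[field];
    if (!propertyName) throw new Error(`Unmapped field: ${field}`);
    filters.push({
      propertyName,
      // Assumed rule: exact match for email, token match for names
      operator: field === 'email' ? 'EQ' : 'CONTAINS_TOKEN',
      value: query[field],
    });
  }
  // Filters inside a single group are ANDed together by HubSpot
  return {
    filterGroups: [{ filters }],
    properties: ['firstname', 'lastname', 'email'],
    limit,
  };
}
```

Calling `toHubSpotSearch({ first_name: 'John' })` produces the JSON body shown earlier. Every line of it is HubSpot-specific, which is the maintenance burden in question.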

### The Truto Approach: JSONata Abstraction

Truto pushes this translation into configuration. Every integration is a set of declarative mappings (JSONata expressions stored as data) that translate between a flat, unified model and the provider's shape. 

Your application simply sends a standard REST request:

```http
GET /unified/crm/contacts?first_name=John&limit=10
```

Behind the scenes, Truto's internal JSONata configuration translates this into HubSpot's `filterGroups` payload automatically. Likewise, when HubSpot responds, Truto normalizes the deeply nested properties. A simplified slice of the HubSpot contact response mapping looks like this:

```yaml
response_mapping: >-
  (
    $defaultProperties := ["firstname", "lastname", "email",
      "phone", "mobilephone", "jobtitle"];
    $custom := $difference($keys(response.properties), $defaultProperties);
    {
      "id": response.id,
      "first_name": response.properties.firstname,
      "last_name": response.properties.lastname,
      "email_addresses": [
        response.properties.email ? { "email": response.properties.email, "is_primary": true },
        response.properties.hs_additional_emails
          ? response.properties.hs_additional_emails.$split(";").{ "email": $ }
      ],
      "custom_fields": response.properties.$sift(function($v, $k) { $k in $custom })
    }
  )
```

Your application never sees `hs_additional_emails`; it only sees the standardized `email_addresses` array. For a detailed breakdown of how this mapping layer works, review our [2026 architecture guide to building HubSpot integrations](https://truto.one/how-do-i-build-a-hubspot-integration-2026-architecture-guide/).
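
For readers who do not write JSONata, the same normalization can be sketched in plain TypeScript. This mirrors the simplified mapping slice above and is not Truto's implementation; the type names and `normalizeContact` are illustrative, and one deliberate assumption is added: `hs_additional_emails` is excluded from `custom_fields` because it is already represented in `email_addresses`.

```typescript
interface HubSpotContact {
  id: string;
  properties: Record<string, string | null>;
}

interface UnifiedContact {
  id: string;
  first_name: string | null;
  last_name: string | null;
  email_addresses: { email: string; is_primary?: boolean }[];
  custom_fields: Record<string, string | null>;
}

// Standard properties from the mapping, plus hs_additional_emails (assumption, see above)
const NON_CUSTOM_PROPERTIES = new Set([
  'firstname', 'lastname', 'email', 'phone', 'mobilephone', 'jobtitle',
  'hs_additional_emails',
]);

function normalizeContact(response: HubSpotContact): UnifiedContact {
  const p = response.properties;
  const email_addresses: UnifiedContact['email_addresses'] = [];
  if (p.email) email_addresses.push({ email: p.email, is_primary: true });
  // hs_additional_emails is a single semicolon-delimited string
  for (const email of (p.hs_additional_emails ?? '').split(';')) {
    if (email) email_addresses.push({ email });
  }
  // Everything that is not a known standard property is treated as custom
  const custom_fields: Record<string, string | null> = {};
  for (const key of Object.keys(p)) {
    if (!NON_CUSTOM_PROPERTIES.has(key)) custom_fields[key] = p[key];
  }
  return {
    id: response.id,
    first_name: p.firstname ?? null,
    last_name: p.lastname ?? null,
    email_addresses,
    custom_fields,
  };
}
```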

## Step-by-Step: Building the Reconciliation Engine

Let us tie these concepts together into a runnable webhook reconciliation engine. This Node.js worker listens for Truto unified webhooks, verifies the security signature, checks the fingerprint to prevent loops, queues the update, and runs periodic drift reconciliation.

### Step 1: Webhook Ingestion and Signature Verification

Truto delivers outbound webhooks via a queue and signs them using HMAC SHA-256 over the raw body. You must verify the `X-Truto-Signature` header before parsing the JSON to ensure the payload is authentic.

```typescript
import { createHmac, timingSafeEqual } from 'crypto';
import express from 'express';

const app = express();
// Use raw body parser to preserve the exact payload for signature verification
app.use(express.raw({ type: 'application/json' }));

app.post('/webhooks/truto', (req, res) => {
  const sig = req.header('x-truto-signature') ?? '';
  
  const expected = createHmac('sha256', process.env.TRUTO_WEBHOOK_SECRET!)
    .update(req.body) // Pass the raw Buffer, not the parsed object
    .digest('hex');

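  // Use timingSafeEqual to prevent timing attacks. Compare byte lengths first,
  // because timingSafeEqual throws when the two Buffers differ in length.
  const sigBuf = Buffer.from(sig);
  const expectedBuf = Buffer.from(expected);
  const ok = sigBuf.length === expectedBuf.length &&
    timingSafeEqual(sigBuf, expectedBuf);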
    
  if (!ok) {
    console.error('Invalid webhook signature');
    return res.status(401).end();
  }

  const event = JSON.parse(req.body.toString('utf8'));
  
  // Acknowledge receipt immediately so Truto does not retry
  res.status(202).end(); 
  
  // Process asynchronously
  enqueueForProcessing(event);
});
```

The 202-then-process pattern matters. If your handler does the database write inline, the HTTP response waits on your database and on anything that write triggers downstream. A slow response looks like a timeout to Truto's outbound delivery layer, which will retry the webhook and double your work.
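
Because redeliveries can still happen, `enqueueForProcessing` is a natural place to deduplicate. The sketch below assumes each event carries a unique `id` (an assumption about the payload, not something this guide has shown) and uses in-memory stand-ins; production code would use a durable queue and a shared store with a TTL.

```typescript
interface QueuedEvent {
  id: string; // assumed unique per delivery
  [key: string]: unknown;
}

// Hypothetical in-memory stand-ins for a durable queue and a dedupe store
const seenEventIds = new Map<string, number>(); // event id -> first-seen timestamp
const pendingEvents: QueuedEvent[] = [];
const DEDUPE_TTL_MS = 24 * 60 * 60 * 1000;

// Returns true if the event was queued, false if it was a duplicate delivery.
function enqueueForProcessing(event: QueuedEvent, nowMs: number = Date.now()): boolean {
  // Evict expired ids so the map does not grow without bound
  seenEventIds.forEach((seenAt, id) => {
    if (nowMs - seenAt > DEDUPE_TTL_MS) seenEventIds.delete(id);
  });
  if (seenEventIds.has(event.id)) return false;
  seenEventIds.set(event.id, nowMs);
  pendingEvents.push(event);
  return true;
}
```

With this in place a retried delivery costs one map lookup instead of a second database write.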

### Step 2: Drop Self-Echoes (The Reconciliation Worker)

Once the webhook is queued, the worker extracts the normalized data and checks for infinite loops using our fingerprinting function.

```typescript
async function processEvent(event: any) {
  // Truto normalizes the event type
  if (event.event !== 'record:updated' || event.resource !== 'contacts') {
    return; 
  }

  const contactData = event.data;
  const incomingFp = generateRecordFingerprint(pickSyncedFields(contactData));
  
  // Fetch local state
  const local = await db.contacts.findByRemoteId(contactData.id);

  if (local?.last_outbound_fingerprint === incomingFp) {
    console.log(`[Sync] Dropping self-echo for ${contactData.id}`);
    return; // Drop the event, it's a vampire record echo
  }

  try {
    await applyToLocalDatabase(local, contactData);
    console.log(`Successfully reconciled contact ${contactData.id} from HubSpot.`);
  } catch (error) {
    console.error(`Failed to update local DB:`, error);
    throw error; // Rethrow so the queue can retry instead of silently losing the update
  }
}
```

### Step 3: Write Back Through the Rate-Limited Client

When your application updates a contact, fingerprint the payload **before** you write to HubSpot, and store the hash atomically with the local update.

```typescript
async function pushContactToHubSpot(contactId: string) {
  const contact = await db.contacts.findById(contactId);
  const payload = toUnifiedShape(contact);
  const fp = generateRecordFingerprint(pickSyncedFields(payload));

  const url = `https://api.truto.one/unified/crm/contacts/${contact.remote_id}?integrated_account_id=${contact.integrated_account_id}`;

  const apiCall = () => fetch(url, {
    method: 'PATCH',
    headers: {
      'Authorization': `Bearer ${process.env.TRUTO_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(payload),
  });

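  // Store the fingerprint BEFORE the request: the webhook echo can arrive
  // before this function sees the response
  const previousFp = contact.last_outbound_fingerprint;
  await db.contacts.update(contactId, { last_outbound_fingerprint: fp });

  try {
    // Execute using our rate-aware backoff function from earlier.
    // It throws on 4xx/5xx, so reaching the next line means the write landed.
    await callWithBackoff(apiCall);
  } catch (error) {
    // The write never landed, so restore the previous fingerprint
    await db.contacts.update(contactId, { last_outbound_fingerprint: previousFp });
    console.error('Failed to push update to HubSpot:', error);
  }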
}
```

### Step 4: Periodic Reconciliation

Webhooks drop. Networks partition. HubSpot occasionally experiences regional outages. You must run a low-frequency reconciliation job that pulls modified contacts since the last cursor and diffs them against local state. This is your safety net for everything the event stream missed.

```typescript
async function reconcileDrift(accountId: string, sinceIsoString: string) {
  let cursor: string | undefined;
  
  do {
    const url = new URL('https://api.truto.one/unified/crm/contacts');
    url.searchParams.set('integrated_account_id', accountId);
    url.searchParams.set('updated_after', sinceIsoString);
    url.searchParams.set('limit', '100');
    if (cursor) url.searchParams.set('next_cursor', cursor);

    const res = await callWithBackoff(() => fetch(url, {
      headers: { 'Authorization': `Bearer ${process.env.TRUTO_API_KEY}` },
    }));
    
    const { result, next_cursor } = await res.json();

    for (const remote of result) {
      const local = await db.contacts.findByRemoteId(remote.id);
      const remoteFp = generateRecordFingerprint(pickSyncedFields(remote));
      
      if (local?.last_outbound_fingerprint !== remoteFp) {
        await applyToLocalDatabase(local, remote);
      }
    }
    cursor = next_cursor;
  } while (cursor);
}
```

Run this hourly per tenant during business hours and daily for cold accounts. The cost is one cursor-paginated pull per active account, usually a handful of requests unless a lot has changed; the benefit is silent drift detection and eventual consistency even when webhooks are lost.
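
That cadence can be expressed as a small pure function that a scheduler calls per tenant. Everything specific here is an assumption for illustration: the `TenantActivity` shape, the 7-day threshold for "cold", the 09:00 to 18:00 business window, and the choice to fall back to daily runs outside business hours.

```typescript
const HOUR_MS = 60 * 60 * 1000;
const COLD_AFTER_MS = 7 * 24 * HOUR_MS; // assumed: no activity for 7 days means "cold"

interface TenantActivity {
  lastActivityMs: number; // hypothetical: last time this tenant changed a synced record
}

// Returns how long to wait before the next reconcileDrift run for this tenant.
function reconcileIntervalMs(
  tenant: TenantActivity,
  nowMs: number,
  tenantLocalHour: number, // 0-23 in the tenant's own time zone
): number {
  const isCold = nowMs - tenant.lastActivityMs > COLD_AFTER_MS;
  const inBusinessHours = tenantLocalHour >= 9 && tenantLocalHour < 18; // assumed window
  // Hourly only for active tenants during their working day; daily otherwise
  return !isCold && inBusinessHours ? HOUR_MS : 24 * HOUR_MS;
}
```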

## Why You Shouldn't Build This From Scratch

Building a bidirectional sync is an exercise in distributed systems engineering. Handling HTTP requests is trivial. Building durable queues, managing exponential backoff, normalizing deeply nested proprietary JSON schemas, and preventing infinite webhook loops requires months of dedicated engineering time.

The code above is real, but it is only the smallest version of the problem. The full surface area adds OAuth token refresh logic that handles HubSpot's 30-minute access token expiry, association graph traversal when a contact change should trigger company updates, and incident response when HubSpot ships an undocumented API change.

The pragmatic split: own your sync logic—the parts that encode your product's meaning of "a contact updated." Outsource the transport, the parts that are the same for every customer.

By leveraging a unified API architecture like Truto, you remove the transport layer: schema normalization via JSONata, OAuth lifecycle management, standardized rate limit headers across all providers, and signed unified webhooks. Truto does not remove your responsibility for retry/backoff (you read the headers and decide), loop prevention (you own the fingerprint store), or your product semantics. That is the correct boundary.

## Where to Take This Next

If you are about to ship a HubSpot sync, prioritize these three things this week:

1. **Decide your queue topology.** Set up a general API queue and a dedicated 5 rps Search queue per HubSpot account. Anything less and you will 429 in week two.
2. **Pick a loop-breaker.** Content fingerprinting is the most robust strategy; pair it with an integration user filter for cheap mechanical deduplication.
3. **Add reconciliation before you need it.** Hourly cursor-based pulls cost almost nothing and save you from every webhook outage that you will eventually have.

Stop burning engineering cycles on API quirks and rate limits. The work that pays off long-term is choosing where your team's effort goes: into the integration logic that is unique to your product, or into the boilerplate that everyone solves the same way.

> Want to see how Truto's unified CRM API, standardized rate limit headers, and signed unified webhooks fit into your existing architecture? Book 30 minutes with our team and walk through your sync design.
>
> [Talk to us](https://cal.com/truto/partner-with-truto)
