
Creating a RapidBridge Sync Job

A RapidBridge Sync Job first needs a webhook endpoint to send data to.

Creating a webhook

Use the following request to create a webhook endpoint,

bash
curl --location 'https://api.truto.one/webhook' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <api_token>' \
--data '{
    "target_url": "https://webhook.site/6a7cc86e-9286-4be8-ba79-bcd8dfdbeee1",
    "is_active": true,
    "event_types": 
      [
        "sync_job_run:created", 
        "sync_job_run:updated", 
        "sync_job_run:started",
        "sync_job_run:completed",
        "sync_job_run:failed",
        "sync_job_run:deleted",
        "sync_job_run:record",
        "sync_job_run:record_error",
        "sync_job_run:rate_limited",
        "integrated_account:created",
        "integrated_account:active",
        "integrated_account:post_connect_form_submitted"
    ]
}'

You can also follow our more detailed guide on creating webhooks.

Creating a Sync Job

In this guide, we'll be syncing users, contacts, tickets and comments for each ticket from Zendesk using the Unified API for Ticketing.

Use the following request to create a Sync Job,

bash
curl --location 'https://api.truto.one/sync-job' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <api_token>' \
--data '{
    "integration_name": "zendesk",
    "resources": [
        {
            "resource": "ticketing/users",
            "method": "list"
        },
        {
            "resource": "ticketing/contacts",
            "method": "list"
        },
        {
            "resource": "ticketing/tickets",
            "method": "list"
        },
        {
            "resource": "ticketing/comments",
            "method": "list",
            "depends_on": "ticketing/tickets",
            "query": {
                "ticket_id": "{{resources.ticketing.tickets.id}}"
            }
        }
    ]
}'

In the request above,

  • integration_name is the identifier of the integration that the Sync Job is for. In this case it's Zendesk, so its value is set to zendesk.
  • resources is the list of resources to fetch from Zendesk. Each item in the list has the following schema,
    • resource (required) is the name of the Unified API resource or the Proxy API resource. For Unified APIs, use the format unified_api_name/resource_name, and for Proxy APIs, just use resource_name (see the sketch after this list).
    • method (required) can be list or get for Unified APIs. For Proxy APIs, it can be list, get or any other read-like custom method.
    • depends_on (optional) creates a dependency between this resource and another resource in the resources list. In the example above, the ticketing/comments resource needs a ticket_id query parameter to fetch comments for, so we first fetch the list of tickets and then, for each ticket, fetch the comments.
    • query (optional) holds the query parameters to be passed to each request. Placeholders can be used to populate the query parameters dynamically; in the example above, ticketing/comments uses the {{resources.ticketing.tickets.id}} placeholder to refer to the id property of a Unified Ticket Resource and set it as the ticket_id. Refer to the placeholder reference section for more details on the placeholders that can be used.
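
For contrast with the Unified API resources used above, a Proxy API resource definition omits the unified_api_name prefix. A minimal sketch (the resource name here is illustrative and depends on the underlying API),

json
{
    "resource": "tickets",
    "method": "list"
}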

Running a Sync Job

Now that we have created a Webhook and a Sync Job, we can execute the Sync Job by creating a Sync Job Run.

Make sure you have a Zendesk Integrated Account already created. Check out our guide on connecting an account.

To create a Sync Job Run, execute the following request,

bash
curl --location 'https://api.truto.one/sync-job-run' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <api_token>' \
--data '{
    "sync_job_id": "7279a917-b447-4629-9e46-a1eeb791ad6b",
    "integrated_account_id": "7ae7b0ab-c6a7-4f29-aec1-1f123517af5d",
    "webhook_id": "a5b21886-3b4d-4fd0-9956-ffc0714d701c"
}'

This should start executing the Sync Job and the Webhook endpoint should start receiving the events.

In the request above,

  • sync_job_id is the id of the Sync Job created in the previous step.
  • integrated_account_id is the id of the Integrated Account connected to a Zendesk account.
  • webhook_id is the id of the Webhook created in the first step.
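
Each event delivered to your endpoint follows the standard webhook envelope shown later in this guide. As a rough illustration (the IDs are the sample ones used above and the payload fields are approximate; the exact payload varies by event type), a sync_job_run:completed event might look like,

json
{
  "id": "f0b9c9c1-6d2e-4c59-9a58-2f9a0f4c1d2e",
  "event": "sync_job_run:completed",
  "payload": {
    "id": "c1a2b3d4-5566-4777-8888-99990000aaaa",
    "sync_job_id": "7279a917-b447-4629-9e46-a1eeb791ad6b",
    "integrated_account_id": "7ae7b0ab-c6a7-4f29-aec1-1f123517af5d"
    // more data
  },
  "environment_id": "ac15abdc-b38e-47d0-97a2-6f494017c177",
  "created_at": "2023-08-31T18:08:27.879Z",
  "webhook_id": "a5b21886-3b4d-4fd0-9956-ffc0714d701c"
}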

Error handling

Truto by default ignores any errors that occur during a Sync Job Run and continues with the next resource, sending you sync_job_run:record_error webhook events for each error encountered. This can be changed by setting the error_handling attribute in the Sync Job Run request to fail_fast. This will cause the Sync Job Run to fail as soon as an error occurs. The default value of error_handling is ignore.

bash
curl --location 'https://api.truto.one/sync-job-run' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <api_token>' \
--data '{
    "sync_job_id": "7279a917-b447-4629-9e46-a1eeb791ad6b",
    "integrated_account_id": "7ae7b0ab-c6a7-4f29-aec1-1f123517af5d",
    "webhook_id": "a5b21886-3b4d-4fd0-9956-ffc0714d701c",
    "error_handling": "fail_fast"
}'

Incremental syncing of data

By default, the Sync Job above will fetch all the objects for a resource in every Sync Job Run, i.e. all the tickets will be synced on every Sync Job Run. In most cases, an incremental sync is preferred, where only the tickets that have changed since the previous Sync Job Run are synced.

To do this, the updated_at query parameter of the ticketing/tickets Unified API resource can be bound to previous_run_date. The binding can be created like so,

json
{
    "resource": "ticketing/tickets",
    "method": "list",
    "query": {
      "updated_at": {
        "gt": "{{previous_run_date}}"
      }
    }
}

previous_run_date is a special attribute tracked by Truto that holds the last date on which the Sync Job ran and completed successfully for a particular Integrated Account. It's set to '1970-01-01T00:00:00.000Z' on the very first Sync Job Run.

The previous Sync Job can be updated with the new resource binding,

bash
curl --location 'https://api.truto.one/sync-job' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <api_token>' \
--data '{
    "integration_name": "zendesk",
    "resources": [
        {
            "resource": "ticketing/users",
            "method": "list"
        },
        {
            "resource": "ticketing/contacts",
            "method": "list"
        },
        {
            "resource": "ticketing/tickets",
            "method": "list",
            "query": {
              "updated_at": {
                "gt": "{{previous_run_date}}"
              }
            }
        },
        {
            "resource": "ticketing/comments",
            "method": "list",
            "depends_on": "ticketing/tickets",
            "query": {
                "ticket_id": "{{resources.ticketing.tickets.id}}"
            }
        }
    ]
}'

Now, every time the Sync Job executes, it will fetch only the tickets that have changed since the last Sync Job Run.
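
To make this concrete, suppose the previous successful run for this Integrated Account completed on 2024-01-15T00:00:00.000Z (an illustrative date). The placeholder then resolves so that the ticketing/tickets request effectively carries the query,

json
{
  "updated_at": {
    "gt": "2024-01-15T00:00:00.000Z"
  }
}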

Doing a full sync on demand

Sometimes you want to do a full sync and ignore previous_run_date altogether. You can do this by setting the ignore_previous_run attribute to true in the Sync Job Run request.

bash
curl --location 'https://api.truto.one/sync-job-run' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <api_token>' \
--data '{
    "sync_job_id": "7279a917-b447-4629-9e46-a1eeb791ad6b",
    "integrated_account_id": "7ae7b0ab-c6a7-4f29-aec1-1f123517af5d",
    "webhook_id": "a5b21886-3b4d-4fd0-9956-ffc0714d701c",
    "ignore_previous_run": true
}'

Running a Sync Job on schedule

TIP

The cron expression is evaluated in the UTC timezone.

To run a Sync Job on a recurring schedule, a Sync Job Cron Trigger can be created.

Use the following request to create one,

bash
curl --location 'https://api.truto.one/sync-job-cron-trigger' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <api_token>' \
--data '{
    "sync_job_id": "d7fd45d6-136a-4244-aeb9-b6439bfa8b71",
    "integrated_account_id": "6680c7ff-9f0e-45be-9915-a7334dc37f23",
    "webhook_id": "a5b21886-3b4d-4fd0-9956-ffc0714d701c",
    "cron_expression": "0 */6 * * *"
}'

The request schema is similar to Sync Job Run with one additional attribute - cron_expression. The cron expression above will run the Sync Job every 6 hours.
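
For instance, to run the same Sync Job once a day at midnight UTC instead, only the cron_expression changes (the IDs below are the illustrative ones from the request above),

json
{
    "sync_job_id": "d7fd45d6-136a-4244-aeb9-b6439bfa8b71",
    "integrated_account_id": "6680c7ff-9f0e-45be-9915-a7334dc37f23",
    "webhook_id": "a5b21886-3b4d-4fd0-9956-ffc0714d701c",
    "cron_expression": "0 0 * * *"
}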

Passing arguments to a Sync Job

Arguments can be passed to a Sync Job Run to fetch data dynamically. Imagine that, in the example above, we needed to fetch tickets incrementally based on the previous Sync Job Run date, but use a date of our choosing for the initial sync.

To achieve this, the schema of the arguments to be passed in the Sync Job Run first needs to be added to the Sync Job. It's specified using the args_schema attribute in the request body.

json
{
  "args_schema": {
    "ticket_sync_start_date": {
      "type": "string",
      "format": "date-time"
    }
  }
}

Next, we use this argument in the ticketing/tickets resource. Instead of the normal variable binding, a JSONata expression is used for query,

jsonata
{
    'updated_at': {
        'gt': args.ticket_sync_start_date ? args.ticket_sync_start_date : previous_run_date
    }
}

The expression above uses ticket_sync_start_date from the arguments if it's passed or falls back to previous_run_date.

INFO

Update: We recently introduced conditional placeholders that can achieve the same result without the need for JSONata expressions. You'd still need JSONata expressions for more complex scenarios.

json
{
  "updated_at": "{{args.ticket_sync_start_date|previous_run_date}}"
}

The updated resource definition looks like so,

json
{
    "resource": "ticketing/tickets",
    "method": "list",
    "query": "{ 'updated_at': { 'gt': args.ticket_sync_start_date ? args.ticket_sync_start_date : previous_run_date } }"
}

The final request to create a Sync Job with arguments,

bash
curl --location 'https://api.truto.one/sync-job' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <api_token>' \
--data '{
    "integration_name": "zendesk",
    "args_schema": {
        "ticket_sync_start_date": {
            "type": "string",
            "format": "date-time"
        }
    },
    "resources": [
        {
            "resource": "ticketing/users",
            "method": "list"
        },
        {
            "resource": "ticketing/contacts",
            "method": "list"
        },
        {
            "resource": "ticketing/tickets",
            "method": "list",
            "query": "{ '\''updated_at'\'': { '\''gt'\'': args.ticket_sync_start_date ? args.ticket_sync_start_date : previous_run_date } }"
        },
        {
            "resource": "ticketing/comments",
            "method": "list",
            "depends_on": "ticketing/tickets",
            "query": {
                "ticket_id": "{{resources.ticketing.tickets.id}}"
            }
        }
    ]
}'

To run a Sync Job with arguments, the args attribute needs to be added to the request body,

json
{
  "args": {
    "ticket_sync_start_date": "2023-07-23T18:10:56.072Z"
  }
}

The request would be,

bash
curl --location 'https://api.truto.one/sync-job-run' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <api_token>' \
--data '{
    "args": {
        "ticket_sync_start_date": "2023-07-23T18:10:56.072Z"
    },
    "sync_job_id": "7279a917-b447-4629-9e46-a1eeb791ad6b",
    "integrated_account_id": "7ae7b0ab-c6a7-4f29-aec1-1f123517af5d",
    "webhook_id": "a5b21886-3b4d-4fd0-9956-ffc0714d701c"
}'

Looping a request over an array

Taking the arguments example forward, imagine a hypothetical scenario where we need to fetch a specific set of tickets based on a list of ticket ids. This can be achieved using the loop_on attribute in the Sync Job resource definition.

To create such a Sync Job, we first need to define the args_schema like so,

json
{
  "args_schema": {
    "ticket_ids": {
      "type": "array",
      "items": {
        "type": "string"
      }
    }
  }
}

Then define the ticketing/tickets resource like so,

json
{
    "resource": "ticketing/tickets",
    "method": "get",
    "loop_on": "args.ticket_ids",
    "id": "{{args.ticket_ids}}"
}

Here, the loop_on attribute specifies that for each element in the ticket_ids array, a request should be made. The id attribute specifies the placeholder to be used for the id of the ticket.

Complete request to create a Sync Job with looping,

bash
curl --location 'https://api.truto.one/sync-job' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <api_token>' \
--data '{
    "integration_name": "zendesk",
    "args_schema": {
        "ticket_ids": {
          "type": "array",
          "items": {
            "type": "string"
          }
        }
    },
    "resources": [
        {
            "resource": "ticketing/tickets",
            "method": "get",
            "loop_on": "args.ticket_ids",
            "id": "{{args.ticket_ids}}"
        }
    ]
}'
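
To run this Sync Job, pass the list of ticket ids as args in the Sync Job Run request, just like in the arguments example earlier (the ids below are illustrative),

json
{
    "args": {
        "ticket_ids": ["101", "102", "103"]
    },
    "sync_job_id": "7279a917-b447-4629-9e46-a1eeb791ad6b",
    "integrated_account_id": "7ae7b0ab-c6a7-4f29-aec1-1f123517af5d",
    "webhook_id": "a5b21886-3b4d-4fd0-9956-ffc0714d701c"
}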

Recursively fetching data for the same resource

There are cases where you might need to recursively fetch data for the same resource, where the records have a parent-child relationship. For example, in drive-items for the Unified File Storage API, there could be drive items of the type folder which have child drive items within them. To fetch such resources, you can use the recurse attribute in the Sync Job resource,

json
{
  "resource": "file-storage/drive-items",
  "method": "list",
  "recurse": {
    "if": "{{resources.file-storage.drive-items.has_children:bool}}",
    "config": {
      "query": {
        "parent": {
          "id": "{{resources.file-storage.drive-items.id}}"
        }
      }
    }
  }
}

In the above example, we set the recurse condition in the if attribute. Most of the resources which follow this parent-child relation have a has_children attribute which is a boolean (you'll need to check the documentation to make sure). If the has_children attribute is true, then the config attribute will be used to fetch the child resources. The query attribute in the config specifies the query parameters to be passed to the request. The parent.id is set to the id of the parent resource. You can use placeholders to refer to the values of other fields.
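
To make the recursion concrete, if a drive item with id folder-123 (an illustrative id) is fetched with has_children set to true, the recursive request for its children would effectively carry the query,

json
{
  "parent": {
    "id": "folder-123"
  }
}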

Transforming the data fetched by the resources

Sometimes you might need to transform the data fetched by the resources before sending it to the webhook. This can be achieved using transform nodes in the Sync Job.

Transform nodes accept a JSONata expression which should return the final result to be sent to the webhook. These nodes are always dependent on a resource node (using depends_on), so you can't have an independent transform node. Both the resource node that a transform node depends on and the transform node itself must have a name attribute. You can also have transform nodes dependent on another transform node, and resource nodes dependent on transform nodes.

Some use cases where the transform nodes are especially helpful --

  1. Filtering out the data fetched by the resource -- When the underlying API doesn't support the filters you need, like updated_at, tags, etc., you can use the transform nodes to filter out the data.
  2. Modifying the output of the resource without modifying the Unified mappings -- You might need to modify the output of the resource before sending it to the webhook, like adding a new field, modifying the existing fields, etc.

This is the context object available for the JSONata expressions in the transform nodes,

  • args - The arguments passed to the Sync Job Run.
  • resources.<resource_name> - Contains the data fetched by the resource. The resource_name is the name of the resource.
  • previous_run_date - Refers to the last date on which a Sync Job ran and completed successfully for the Sync Job and an Integrated Account. It's set to '1970-01-01T00:00:00.000Z' on the very first Sync Job Run.
  • resource - Refers to the parent resource's attributes defined in the Sync Job with placeholders resolved.
  • <all_context_variables> - Refers to the context variables set in the Integrated Account.

Continuing with the Zendesk example, this Sync Job filters out the contacts NOT updated since the last time the Sync Job ran,

json
{
    "integration_name": "zendesk",
    "args_schema": {},
    "resources": [
        {
            // needs a name
            "name": "all-contacts",
            "resource": "ticketing/contacts",
            "method": "list",
            "persist": false
        },
        {
            // needs a name
            "name": "filtered-contacts",
            "type": "transform",
            "config": {
              "expression": "resources.ticketing.contacts[updated_at >= %.%.%.previous_run_date]"
            },
            // refer to the name
            "depends_on": "all-contacts",
            "persist": true
        }
    ]
}

To make sure that you only get the filtered contacts on the webhook in the example above, persist has been set to true for the transform node and false for the all-contacts resource node.

Transform nodes need to have the persist attribute set to true if you need their output to be sent to the webhook. By default, it's set to false.

TIP

If the resource name contains special characters apart from underscore _, like -, then in JSONata you need to refer to them using backticks. For example, if the resource name is knowledge-base/page-content, then you need to refer to it as

resources.`knowledge-base`.`page-content`

in JSONata.

Spooling data into a single webhook event

Spool nodes allow you to paginate and fetch the complete resource and then send it in a single webhook event. One of the places where this is useful is in the Knowledge Base APIs where the page content might be split into multiple blocks and is provided through a paginated API. You can use the spool nodes to fetch all the blocks and then send them in a single webhook event. You can also have transform nodes dependent on spool nodes, which can transform the data fetched by the spool nodes.

As with transform nodes, spool nodes can't be independent and should be dependent on a resource node (or a transform node, as in the example below). The node which the spool node depends on should have a name attribute, and the spool node itself should have a name attribute. You can't have a spool node dependent on another spool node.

Taking the example of a Notion integration where we need the content of a Notion page as markdown,

json
{
    "integration_name": "notion",
    "args_schema": {
      "page_id": {
        "type": "string",
        "required": true
      }
    },
    "resources": [
      {
          "name": "page-content",
          "resource": "knowledge-base/page-content",
          "method": "list",
          "query": {
              "page": {
                  "id": "{{args.page_id}}"
              },
              "truto_ignore_remote_data": true
          },
          "recurse": {
              "if": "{{resources.knowledge-base.page-content.has_children:bool}}",
              "config": {
                  "query": {
                      "page_content_id": "{{resources.knowledge-base.page-content.id}}"
                  }
              }
          },
          "persist": false
      },
      {
          "name": "remove-remote-data",
          "type": "transform",
          "config": {
              "expression": "[resources.`knowledge-base`.`page-content`.$sift(function($v, $k) {$k != 'remote_data'})]"
          },
          "depends_on": "page-content"
      },
      {
          "name": "all-page-content",
          "type": "spool",
          "depends_on": "remove-remote-data"
      },
      {
          "name": "combine-page-content",
          "type": "transform",
          "config": {
              "expression": "$blob($reduce($sortNodes(resources.`knowledge-base`.`page-content`, 'id', 'parent.id'), function($acc, $v) { $acc & $v.body.content }, ''), { \"type\": \"text/markdown\" })"
          },
          "depends_on": "all-page-content",
          "persist": true
      }
    ]
}

In the example above,

  1. We first fetch the page-content blocks for the page provided in the argument.
  2. It then goes through the remove-remote-data transform node to remove the remote_data attribute from the fetched blocks.
  3. The all-page-content spool node fetches all the blocks of the page content.
  4. Then the recurse logic kicks in and fetches the child blocks for each block if any.
  5. Finally, the combine-page-content transform node combines all the blocks fetched by the all-page-content spool node and sends it as a single webhook event. The data is converted into a Blob with the type text/markdown.

In spool nodes, the data is stored temporarily on Truto's servers. Once the Sync Job Run completes or fails, the data is deleted.

Limitations

The amount of data you can store in spool nodes is limited to 128KB, so make sure that the data you are fetching in each page falls within that limit. That is the reason why in the example above, we have a remove-remote-data transform node to remove the remote_data attribute from the fetched blocks, as it might contain a lot of data.

Running Sync Job after an Integrated Account is connected

To run a Sync Job after an Integrated Account is connected, listen to the integrated_account:active event. This event is sent when an Integrated Account is created and is ready to be used. If you have a RapidForm configured for the integration, then you'll need to listen to the integrated_account:post_connect_form_submitted event.

An example webhook event is shown below; the structure is the same for both event types,

json
{
  "id": "bed2145c-46d7-41fa-ad69-3e70cb5bb74d",
  "event": "integrated_account:active",
  "payload": {
    "id": "937fed13-712d-4647-dfe3-7613948b0348",
    "tenant_id": "acme-1",
    "environment_integration_id": "f7d82f6d-20f0-4bed-231c-d9c31e023710",
    // more data
  },
  "environment_id": "ac15abdc-b38e-47d0-97a2-6f494017c177",
  "created_at": "2023-08-31T18:08:27.879Z",
  "webhook_id": "a5b21886-3b4d-4fd0-9956-ffc0714d701c"
}

The Sync Job Run can be scheduled or executed immediately after receiving the webhook event by using the payload.id attribute as the integrated_account_id in the Sync Job Run request.

bash
curl --location 'https://api.truto.one/sync-job-run' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <api_token>' \
--data '{
    "sync_job_id": "7279a917-b447-4629-9e46-a1eeb791ad6b",
    "integrated_account_id": "937fed13-712d-4647-dfe3-7613948b0348", # from the webhook event
    "webhook_id": "a5b21886-3b4d-4fd0-9956-ffc0714d701c"
}'

Placeholder Reference

Placeholders can be used in the Sync Job resources to refer to the values of other fields in the Sync Job. The placeholders are enclosed in double curly braces {{}}.

The placeholders available are listed below; a combined example follows the list,

  • {{args.<arg_name>}} - Refers to the value of the argument passed to the Sync Job Run.
  • {{resources.<resource_name>.<field_name>}} - Refers to the value of a field in a resource fetched in the Sync Job. The field_name can be any field in the resource. This is only available when the resource is dependent on another resource using the depends_on attribute.
  • {{previous_run_date}} - Refers to the last date on which a Sync Job ran and completed successfully for the Sync Job and an Integrated Account. It's set to '1970-01-01T00:00:00.000Z' on the very first Sync Job Run.
  • {{truto_parent_resource.<attribute>}} - If using depends_on, then the parent resource's attributes can be accessed using this placeholder. For example, you can access the query parameters used in the parent resource via {{truto_parent_resource.query.<query_name>}}. This can be useful for recurse use cases.
    • {{truto_parent_resource.query.<parameter_name>}}
    • {{truto_parent_resource.method}}
    • {{truto_parent_resource.resource}}
    • {{truto_parent_resource.id}}
    • {{truto_parent_resource.body.<attribute_name>}}
  • {{<integrated_account_context_variable>}} - Refers to the value of a context variable set in the Integrated Account. For example, if you have a variable with name foo in the integrated account, you can refer to it using {{foo}}.
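
As a combined illustration, a dependent resource definition could mix several of these placeholders. The team_id query parameter and the foo context variable below are hypothetical,

json
{
    "resource": "ticketing/comments",
    "method": "list",
    "depends_on": "ticketing/tickets",
    "query": {
        "ticket_id": "{{resources.ticketing.tickets.id}}",
        "updated_at": {
            "gt": "{{previous_run_date}}"
        },
        "team_id": "{{foo}}"
    }
}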

Sync Job API Reference

Refer to the Sync Job API Reference for more details about the requests.