---
title: "Connect Lobstr to Claude: Manage Crawlers and Automated Results"
slug: connect-lobstr-to-claude-manage-crawlers-and-automated-results
date: 2026-06-19
author: Uday Gajavalli
categories: ["AI & Agents"]
excerpt: "Learn how to connect Lobstr to Claude using a managed MCP server to orchestrate web scraping squids, tasks, and data exports directly from your AI agent."
tldr: "Connect Lobstr to Claude using Truto's managed MCP server. This guide details how to work with Lobstr's asynchronous execution hierarchy, respond to API rate limits on the client side, and automate complete web scraping workflows using Claude."
canonical: https://truto.one/blog/connect-lobstr-to-claude-manage-crawlers-and-automated-results/
---

# Connect Lobstr to Claude: Manage Crawlers and Automated Results


If you need to connect Lobstr to Claude to automate web data extraction, manage scraping crawlers, and orchestrate automated result exports, you need a [Model Context Protocol (MCP) server](https://truto.one/what-is-mcp-and-mcp-servers-and-how-do-they-work/). This server acts as the translation layer between Claude's tool calls and Lobstr's REST APIs. You can either spend weeks building and maintaining this infrastructure yourself, or use a managed integration platform like Truto to dynamically generate a secure, authenticated MCP server URL. 

If your team uses ChatGPT, check out our guide on [connecting Lobstr to ChatGPT](https://truto.one/connect-lobstr-to-chatgpt-automate-scraper-squids-and-data-runs/) or explore our broader architectural overview on [connecting Lobstr to AI Agents](https://truto.one/connect-lobstr-to-ai-agents-orchestrate-tasks-and-scraped-exports/).

Giving a Large Language Model (LLM) read and write access to an asynchronous, credit-based execution platform like Lobstr is an engineering challenge. You have to map highly variable crawler input schemas to MCP tool definitions, deal with asynchronous polling logic, and safely handle strict usage limits. Every time Lobstr adds a new crawler or updates an execution state, you have to update your server code, redeploy, and test the integration. This guide breaks down exactly how to use Truto to generate a secure, [managed MCP server](https://truto.one/managed-mcp-for-claude-full-saas-api-access-without-security-headaches/) for Lobstr, connect it natively to Claude, and execute complex scraping workflows using natural language.

::cta{buttonText="Talk to us" buttonUrl="https://cal.com/truto/partner-with-truto"}
Stop building custom API wrappers. Generate secure MCP servers for 100+ B2B apps in seconds.
:::

## The Engineering Reality of the Lobstr API

A custom MCP server is a [self-hosted integration layer](https://truto.one/the-hands-on-guide-to-building-mcp-servers-for-ai-agents-2026/). While the open MCP standard provides a predictable way for models to discover tools, the reality of implementing it against Lobstr's APIs is complex. Lobstr is not a standard REST CRUD app - it is a job orchestration and execution platform. 

If you decide to build a custom MCP server for Lobstr, you own the entire API lifecycle. Here are the specific challenges you will face:

**The Asynchronous Execution Hierarchy**
Lobstr relies on a strict operational hierarchy: `Crawler` -> `Squid` -> `Task` -> `Run` -> `Result`. An LLM cannot simply ask Lobstr to "scrape this URL." It must first identify the right Crawler, instantiate a Squid (the job container), queue Tasks (the target URLs), initiate a Run, and then poll the Run until it completes. Exposing this raw hierarchy to an LLM usually results in the model trying to skip steps - like requesting results before a run is finished. Your MCP server must explicitly define schemas that guide the LLM through this multi-step asynchronous dance.
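
One way to keep a model from skipping steps is to track which stage of the hierarchy has been reached and reject calls that jump ahead. The following is a minimal sketch of that idea, not part of Truto or Lobstr: `Stage` and `LifecycleGuard` are hypothetical names, and the guard only enforces ordering (it does not check that a run has actually finished).

```python
from enum import IntEnum


class Stage(IntEnum):
    """Lobstr lifecycle stages, in the order described above."""
    CRAWLER = 1
    SQUID = 2
    TASK = 3
    RUN = 4
    RESULT = 5


class LifecycleGuard:
    """Hypothetical helper that rejects calls which skip a stage."""

    def __init__(self) -> None:
        self.completed = 0  # highest stage reached so far

    def advance(self, stage: Stage) -> None:
        # A stage may be repeated or be the very next one, never further ahead.
        if stage > self.completed + 1:
            missing = Stage(self.completed + 1).name
            raise RuntimeError(f"cannot reach {stage.name} before {missing}")
        self.completed = max(self.completed, int(stage))
```

A wrapper around the tool calls could invoke `advance` before each request and return the error text to the model, which gives it a concrete hint about the step it missed.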

**Highly Variable Parameter Schemas**
Every Lobstr crawler has a unique configuration schema. A LinkedIn profile scraper requires completely different inputs than an Amazon product scraper. The `params` object in Lobstr API requests is entirely dynamic. Hardcoding an OpenAPI spec for Lobstr is virtually impossible because the schema drifts depending on the specific crawler you use. A resilient MCP server needs to allow the LLM to query `get_single_lobstr_crawler_param_by_id` dynamically to figure out the required inputs before it attempts to build a task.
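
To illustrate why fetching the schema first matters, here is a small sketch that checks a task's `params` against a crawler's parameter schema before the task is submitted. The schema shape used here (a mapping of field name to a spec with a `required` flag) is an assumption made for illustration; the real response from `get_single_lobstr_crawler_param_by_id` may be structured differently.

```python
def missing_required(schema: dict, params: dict) -> list[str]:
    """Return the required field names from `schema` that `params` lacks.

    `schema` is assumed to look like {"url": {"required": True}, ...}.
    """
    return sorted(
        name
        for name, spec in schema.items()
        if spec.get("required") and name not in params
    )
```

An empty list means the task payload covers every required field; anything else can be surfaced to the model so it can correct the payload before spending credits.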

**Strict Rate Limits and Polling**
Because Lobstr runs are asynchronous, clients must poll the API to check execution status, and aggressive polling triggers Lobstr's rate limits. Truto does not retry, throttle, or apply backoff when that happens. When the upstream Lobstr API returns an HTTP `429 Too Many Requests`, Truto passes the error directly to the caller and normalizes the upstream rate limit information into standardized headers (`ratelimit-limit`, `ratelimit-remaining`, `ratelimit-reset`), following the IETF RateLimit headers specification. The Claude client or calling agent is responsible for intercepting the `429`, reading the `ratelimit-reset` header, and implementing its own backoff and retry logic.
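
For a calling agent that issues requests from code, the client-side backoff can be sketched roughly as follows. This assumes `ratelimit-reset` carries the number of seconds until the window resets and that header names arrive in lowercase; `send`, `reset_delay`, and `call_with_backoff` are hypothetical names, and `send` stands in for whatever function performs the actual request.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], object]  # (status, headers, body)


def reset_delay(headers: Mapping[str, str], default: float = 5.0, cap: float = 60.0) -> float:
    """Seconds to wait, read from ratelimit-reset (assumed to be delta-seconds)."""
    try:
        delay = float(headers.get("ratelimit-reset", default))
    except ValueError:
        delay = default  # header present but not numeric
    return max(0.0, min(delay, cap))


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send`, waiting out 429 responses up to `max_attempts` times."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt < max_attempts - 1:
            sleep(reset_delay(headers))
    return status, headers, body  # still rate limited after the final attempt
```

Capping the delay keeps a malformed or very large header value from stalling the agent indefinitely, and injecting `sleep` makes the loop easy to test without real waiting.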

Instead of building this orchestration logic from scratch, you can use Truto. Truto exposes Lobstr's endpoints as meticulously documented, ready-to-use [MCP tools](https://truto.one/what-is-mcp-and-mcp-servers-and-how-do-they-work/), handling all the underlying HTTP boilerplate so Claude can focus on reasoning through the scraping lifecycle.

## How to Generate a Lobstr MCP Server

Truto dynamically generates MCP tools based on the active API documentation for your Lobstr integration. Tools are generated on the fly each time the client sends a `tools/list` JSON-RPC request, so they are never cached or stale.

You can generate an MCP server for a connected Lobstr account using either the Truto UI or the API.

### Method 1: Via the Truto UI

If you are setting this up for internal team use, the Truto dashboard is the fastest route.

1. Navigate to the **Integrated Accounts** page in your Truto dashboard and select your connected Lobstr account.
2. Click the **MCP Servers** tab.
3. Click **Create MCP Server**.
4. Configure your server filters (e.g., restrict to `read` methods or specific tags like `runs` and `squids`).
5. Copy the generated MCP server URL (e.g., `https://api.truto.one/mcp/a1b2c3d4...`).

### Method 2: Via the Truto API

If you are building an AI agent product and need to programmatically provision Lobstr MCP servers for your end-users, you use the Truto REST API. The endpoint validates the integration, provisions a secure token backed by a distributed KV store, and schedules any necessary expiration alarms.

Execute a POST request to `/integrated-account/:id/mcp`:

```bash
curl -X POST https://api.truto.one/integrated-account/<lobstr_account_id>/mcp \
  -H "Authorization: Bearer <YOUR_TRUTO_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Lobstr Web Scraping Agent",
    "config": {
      "methods": ["read", "write", "custom"]
    }
  }'
```

The API returns a fully qualified, authenticated MCP server URL:

```json
{
  "id": "mcp_token_987654",
  "name": "Lobstr Web Scraping Agent",
  "config": { "methods": ["read", "write", "custom"] },
  "expires_at": null,
  "url": "https://api.truto.one/mcp/xyz789..."
}
```
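
If you provision servers from application code rather than the shell, the same call can be sketched in Python using only the standard library. The endpoint, headers, and body mirror the `curl` example above; the helper names `build_mcp_request` and `extract_server_url` are hypothetical, and error handling is left out for brevity.

```python
import json
import urllib.request

TRUTO_API = "https://api.truto.one"


def build_mcp_request(account_id: str, api_key: str, name: str, methods: list[str]) -> urllib.request.Request:
    """Build (but do not send) the POST that creates an MCP server."""
    body = json.dumps({"name": name, "config": {"methods": methods}}).encode()
    return urllib.request.Request(
        f"{TRUTO_API}/integrated-account/{account_id}/mcp",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def extract_server_url(response_text: str) -> str:
    """Pull the MCP server URL out of the JSON response shown above."""
    return json.loads(response_text)["url"]
```

Sending the request is then a matter of passing it to `urllib.request.urlopen` and handing the decoded response body to `extract_server_url`.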

## Connecting the MCP Server to Claude

Once you have your Truto MCP URL, you need to register it with Claude. Claude connects to remote MCP servers over HTTP-based transports such as Server-Sent Events (SSE), while stdio is used for servers launched locally on your machine.

### Method A: Via the Claude UI

If you are using the Claude desktop app or web interface (for Enterprise/Team plans), you can add the connector directly in the UI.

1. Open Claude and navigate to **Settings -> Integrations**.
2. Click **Add MCP Server** (or **Add Custom Connector**).
3. Paste the Truto MCP URL you generated.
4. Click **Add**.

Claude will immediately execute an `initialize` handshake, send a `tools/list` request, and populate its context window with the available Lobstr capabilities.

### Method B: Via Manual Config File

If you are running Claude Desktop locally and prefer file-based configuration, you can update your `claude_desktop_config.json` file. Other MCP clients such as Cursor accept a similar `mcpServers` block in their own config files.

Because Truto provides a hosted SSE endpoint, you use the official MCP SSE transport module to connect:

```json
{
  "mcpServers": {
    "lobstr_truto": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-sse",
        "https://api.truto.one/mcp/xyz789..."
      ]
    }
  }
}
```

Restart Claude Desktop. The agent is now wired directly into your Lobstr environment.

## Hero Tools for Lobstr Automation

Truto exposes the entirety of the Lobstr API, but providing the LLM with the right context is key. Below are the highest-leverage "hero tools" generated by Truto that enable Claude to orchestrate the full execution lifecycle.

### `list_all_lobstr_crawlers`
Before Claude can scrape anything, it needs to know which crawlers are available on the platform. This tool lists all available Lobstr crawlers, returning their IDs, names, and credit costs per row.

> "I need to scrape some LinkedIn profiles. Can you list the available crawlers in my Lobstr account and find one that handles LinkedIn, and tell me how much it costs per row?"

### `get_single_lobstr_crawler_param_by_id`
Because crawler inputs are highly dynamic, Claude must call this tool to fetch the exact schema required for a specific crawler before building a Squid. It returns the configurable input parameters separated into task and squid configuration objects.

> "I found the LinkedIn profile scraper crawler. Fetch its parameter schema so we know exactly what input fields and JSON structure it requires before we build the task."

### `create_a_lobstr_squid`
A Squid is the execution container for a scraping job. This tool instantiates a new Squid in Lobstr for a specific crawler ID, preparing it to receive tasks.

> "Create a new Lobstr squid named 'Q3 Competitor Tracking' using the LinkedIn crawler ID we just looked up."

### `create_a_lobstr_task`
Tasks represent the actual work - usually target URLs or search queries. This tool allows Claude to batch-upload URLs into the Squid. It returns an array of queued tasks and indicates if any duplicates were skipped.

> "Add these five target LinkedIn URLs as tasks to the 'Q3 Competitor Tracking' squid we just created. Make sure they are formatted according to the crawler's parameter schema."

### `create_a_lobstr_run`
Once a Squid is loaded with tasks, this tool triggers the actual scraping engine. It returns a run ID (the run hash) and the initial execution status. Claude needs to hold onto this run ID for polling.

> "Start a run for the 'Q3 Competitor Tracking' squid. Give me the run ID so we can monitor its progress."

### `get_single_lobstr_run_stat_by_id`
Because scraping takes time, Claude uses this tool to check real-time statistics for an active run. It returns the percentage done, total tasks processed, duration, ETA, and a boolean `is_done` flag. 

> "Check the status of the run we just started. If it isn't finished, tell me the ETA and how many tasks have successfully processed so far."

### `list_all_lobstr_results`
Once a run is complete (`is_done: true`), Claude calls this tool to retrieve the actual scraped data. It returns an array of result rows containing the data payload extracted by the crawler.

> "The run is finished. Fetch all the results from the squid and summarize the key findings from the scraped LinkedIn profiles."

For the complete tool inventory, including delivery configurations, webhooks, and account credential management, see the [Truto Lobstr Integration Page](https://truto.one/integrations/detail/lobstr).

## Workflows in Action

When Claude has access to these MCP tools, it stops being a mere chat interface and becomes an autonomous data operations engineer. Here is how Claude handles complex Lobstr workflows in practice.

### 1. The End-to-End Autonomous Scrape

Marketing and growth teams frequently need to run ad-hoc data enrichment. Instead of logging into a UI, clicking through menus, and manually downloading CSVs, a user can instruct Claude to handle the entire asynchronous pipeline.

> "I need to scrape data for these 10 company URLs using the standard domain enrichment crawler. Set up the job in Lobstr, execute it, wait for it to finish, and then give me a table of the output data."

Here is how Claude executes this multi-step orchestration:

1. Claude calls `list_all_lobstr_crawlers` to find the ID for the domain enrichment crawler.
2. Claude calls `get_single_lobstr_crawler_param_by_id` to understand the exact JSON structure required for the URLs.
3. Claude calls `create_a_lobstr_squid` to spin up the execution container.
4. Claude calls `create_a_lobstr_task` passing the 10 URLs mapped to the required schema.
5. Claude calls `create_a_lobstr_run` to initiate the scrape and extracts the `run_hash`.
6. Claude enters a polling loop, calling `get_single_lobstr_run_stat_by_id`. If the API returns a `429` rate limit error due to aggressive polling, Claude reads the `ratelimit-reset` header and backs off.
7. Once `is_done` is true, Claude calls `list_all_lobstr_results` to retrieve the payload.
8. Claude formats the raw JSON response into a clean Markdown table for the user.

```mermaid
sequenceDiagram
    participant User as User
    participant Claude as Claude Desktop
    participant MCP as Truto MCP Server
    participant Lobstr as Lobstr API
    
    User->>Claude: "Scrape these 10 URLs..."
    Claude->>MCP: Call list_all_lobstr_crawlers
    MCP->>Lobstr: GET /crawlers
    Lobstr-->>MCP: [Crawler List]
    MCP-->>Claude: Return Crawler ID
    
    Claude->>MCP: Call create_a_lobstr_squid
    MCP->>Lobstr: POST /squids
    Lobstr-->>MCP: {squid_id}
    MCP-->>Claude: Return Squid ID
    
    Claude->>MCP: Call create_a_lobstr_task
    MCP->>Lobstr: POST /squids/{id}/tasks
    Lobstr-->>MCP: {queued_tasks}
    MCP-->>Claude: Confirm tasks added
    
    Claude->>MCP: Call create_a_lobstr_run
    MCP->>Lobstr: POST /runs
    Lobstr-->>MCP: {run_hash, status: "running"}
    MCP-->>Claude: Return Run ID
    
    loop Polling
        Claude->>MCP: Call get_single_lobstr_run_stat_by_id
        MCP->>Lobstr: GET /runs/{hash}/stats
        Lobstr-->>MCP: {is_done: false, percent: 50}
        MCP-->>Claude: Status update
    end
    
    Claude->>MCP: Call list_all_lobstr_results
    MCP->>Lobstr: GET /squids/{id}/results
    Lobstr-->>MCP: [Scraped Data]
    MCP-->>Claude: Return JSON results
    Claude-->>User: Markdown table of data
```
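
If you drive the same workflow from your own agent code instead of the Claude UI, the polling step in the diagram can be sketched as below. `call_tool` is a hypothetical stand-in for whatever function your MCP client exposes to invoke a tool, and the `{"id": run_id}` argument shape is an assumption; only the tool name and the `is_done` flag come from the tool descriptions above.

```python
import time
from typing import Any, Callable, Dict

ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def wait_for_run(
    call_tool: ToolCaller,
    run_id: str,
    interval: float = 10.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll run stats until is_done is true, or give up after max_polls."""
    for _ in range(max_polls):
        stats = call_tool("get_single_lobstr_run_stat_by_id", {"id": run_id})
        if stats.get("is_done"):
            return stats
        sleep(interval)  # a fixed pause keeps polling well below aggressive rates
    raise TimeoutError(f"run {run_id} not finished after {max_polls} polls")
```

A fixed interval with an upper bound on polls is the simplest design; a longer interval trades latency for fewer requests against the rate limit.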

### 2. Credit Monitoring and Run Abort

Data Ops teams need to ensure that scraping jobs do not quietly drain account credits. Claude can audit active runs, check resource consumption, and kill runaway jobs.

> "Check all my active Lobstr squids. If any run has consumed more than 500 credits but is less than 20% done, abort it immediately and tell me the run ID."

Claude processes this operational rule by bridging multiple endpoints:

1. Claude calls `list_all_lobstr_squids` to get the user's active configurations.
2. For each squid, Claude calls `list_all_lobstr_runs` to find active runs.
3. Claude calls `get_single_lobstr_run_stat_by_id` to check the `percent_done`.
4. Claude calls `get_single_lobstr_run_credit_by_id` to check the `total_credits` consumed.
5. If Claude finds a run matching the criteria (e.g., 600 credits used, 15% done), it calls `create_a_lobstr_run_abort` passing the `run_hash`.
6. Claude reports back to the user with the aborted run details.
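
The decision in step 5 is a simple threshold rule, and expressing it as code makes the behavior explicit. This is a sketch only: `RunSnapshot` and `runs_to_abort` are hypothetical names, and the snapshot is assumed to combine the `percent_done` and `total_credits` values gathered in steps 3 and 4.

```python
from dataclasses import dataclass


@dataclass
class RunSnapshot:
    """Stats for one active run, merged from the stat and credit tools."""
    run_hash: str
    percent_done: float
    total_credits: float


def runs_to_abort(
    runs: list[RunSnapshot],
    max_credits: float = 500,
    min_percent: float = 20,
) -> list[str]:
    """Return run hashes that spent more than max_credits while under min_percent done."""
    return [
        run.run_hash
        for run in runs
        if run.total_credits > max_credits and run.percent_done < min_percent
    ]
```

With the example from step 5, a run at 600 credits and 15% done is selected, while a run at 900 credits and 80% done is left alone.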

## Security and Access Control

Giving an AI agent access to an execution platform that burns financial credits requires strict guardrails. Truto's MCP architecture provides native access controls at the server level, ensuring the model cannot perform unauthorized actions regardless of the user's prompt.

*   **Method Filtering:** You can enforce read-only access. By passing `config: { methods: ["read"] }` during MCP server creation, Truto will strip out all `create`, `update`, and `delete` tools. Claude can check run statuses and list results, but it cannot start new squids or spend credits because those tools are never exposed to it.
*   **Tag Filtering:** You can restrict the MCP server to specific functional domains. If you only want the agent to audit account health, you can filter tools to only include those tagged with `accounts` or `billing`, hiding the crawler and execution tools completely.
*   **Double Authentication:** By enabling `require_api_token_auth: true`, the MCP server URL itself is no longer enough to execute tools. The connecting client (Claude) must pass a valid Truto API token in the header. This prevents unauthorized execution if the MCP URL is leaked in a config file.
*   **Auto-Expiring Servers:** If you are provisioning an agent for a temporary scraping project, you can set an `expires_at` timestamp. Truto's underlying durable scheduling system will automatically purge the credentials and invalidate the server at the exact expiration time, leaving zero zombie access points.

## Moving from Chat to Automation

Connecting Lobstr to Claude via an MCP server transitions your workflows from manual UI operations to intelligent, conversational automation. By utilizing Truto, you bypass the massive engineering overhead of translating Lobstr's asynchronous execution hierarchy and dynamic schemas into reliable LLM tools. 

Instead of hand-writing API wrappers, tracking cursor pagination, and maintaining crawler schemas, your engineering team can focus on what matters: building superior [AI agents](https://truto.one/the-hands-on-guide-to-building-mcp-servers-for-ai-agents-2026/) that extract value from the web. Truto handles the API normalization; Claude handles the logic.

::cta{buttonText="Talk to us" buttonUrl="https://cal.com/truto/partner-with-truto"}
Stop wrestling with API schemas. Generate a production-ready Lobstr MCP server in seconds.
:::