
The Unholy Mess of File Picker APIs (and How We Tamed It)

Implementing native cloud file pickers for Google Drive, OneDrive, and Box is an architectural nightmare. Here is how we unified them into a single API.

Uday Gajavalli · 8 min read

If you are building an AI agent or a RAG pipeline, you eventually hit a wall: how do you let users select which files to ingest from their cloud storage?

We see this constantly with customers building retrieval engines. They have the backend architecture figured out—retrieving files, parsing text, chunking data, generating embeddings, and storing them in a vector database (if you want the architectural deep dive on that pipeline, read RAG simplified with Truto).

The bottleneck isn't the vector math. It's the frontend.

Users don't want to sync an entire corporate Google Drive or SharePoint site into a vector database. That is a fast track to polluted context windows, hallucinating models, and astronomical embedding costs. They want to selectively pick specific files or folders. Sales decks? Yes. The 2019 holiday party planning folder? Absolutely not.

Building one native file picker is annoying. Building five—Google Drive, OneDrive, SharePoint, Box, and Dropbox—while managing their respective authentication lifecycles, script loaders, and frontend quirks, is a complete roadmap blocker.

Here is a look at the architectural reality of native file picker APIs, the hidden edge cases that break them, and how we abstracted the entire mess into a single function.

The Unholy Mess of Native File Picker Integration

You cannot skip these integrations. Google Workspace has over 3 billion users, and Microsoft Office 365 commands over 300 million active users. If you sell B2B software, supporting both ecosystems is mandatory.

But the developer experience for these native APIs is deeply fragmented. Every vendor has a completely different architectural philosophy for how a frontend component should be invoked.

Google Drive: Three Scripts, Loaded Sequentially

The Google Picker API is the most initialization-heavy of the bunch. To render it, you have to execute a fragile, multi-step script loading sequence:

  1. Load apis.google.com/js/api.js
  2. Call gapi.load('client:picker') and wait for the callback
  3. Load accounts.google.com/gsi/client (Google Identity Services)
  4. Fetch the Drive v3 REST discovery document
  5. Then instantiate the google.picker.PickerBuilder

Miss a step or mess up the async sequence, and you get a blank overlay. No error. No feedback.
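
One way to make that sequence less fragile is to push every step through a small runner that names the step that failed, so a broken load surfaces as an error instead of a blank overlay. The sketch below is only an illustration, not Google's or Truto's code: `LoadStep` and `runLoadSteps` are hypothetical names, and the empty `run` callbacks stand in for whatever actually injects the script tags and calls `gapi.load`.

```typescript
// Hypothetical helper: runs async load steps strictly in order and
// reports which step failed instead of failing silently.
type LoadStep = { name: string; run: () => Promise<void> };

async function runLoadSteps(steps: LoadStep[]): Promise<string[]> {
  const completed: string[] = [];
  for (const step of steps) {
    try {
      await step.run(); // the next step starts only after this one settles
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      throw new Error(`Picker load failed at "${step.name}": ${reason}`);
    }
    completed.push(step.name);
  }
  return completed;
}

// The five steps from the list above, with placeholder bodies.
// In a browser, each `run` would inject a <script> tag or call gapi.
const googlePickerSteps: LoadStep[] = [
  { name: "load api.js", run: async () => {} },
  { name: "gapi.load('client:picker')", run: async () => {} },
  { name: "load gsi/client", run: async () => {} },
  { name: "fetch Drive v3 discovery document", run: async () => {} },
  { name: "build PickerBuilder", run: async () => {} },
];
```

Awaiting each step inside the loop is what enforces the ordering; starting all five with `Promise.all` would reintroduce the race the list is meant to avoid.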

The frontend initialization is only half the problem. If you actually want to download the file after the user picks it, Google requires your OAuth application to pass a Cloud Application Security Assessment (CASA) Tier 2 certification. This mandatory security audit is notoriously expensive and typically takes months to complete (read our CASA certification update to see how Truto handles this for you).

Microsoft OneDrive & SharePoint: MessageChannel Protocols and URL Forks

Microsoft's File Picker v8 takes a completely different approach. There is no SDK to install. Instead, you open a popup window, POST a form to /_layouts/15/FilePicker.aspx, and communicate with the picker via the MessageChannel API—a browser-level messaging protocol most frontend developers never touch directly.

The picker sends an initialize message with a MessagePort. You listen on that port for pick, close, and notification commands. It gets worse: OneDrive personal accounts use a different URL (/picker) than business accounts, meaning you have to detect the account type and fork your URL construction.
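
The URL fork and the port handling can both be reduced to small pure functions, which keeps the popup plumbing testable. This is a rough sketch under assumptions: the two paths come from the description above, but the `PortMessage` shapes, `PickerAction`, and both function names are hypothetical and simplified, not Microsoft's documented message schema.

```typescript
type AccountType = "personal" | "business";

// Path fork described above: business tenants use the SharePoint layouts
// page, personal accounts use /picker.
function pickerPath(accountType: AccountType): string {
  return accountType === "personal" ? "/picker" : "/_layouts/15/FilePicker.aspx";
}

// Assumed, simplified shapes for messages arriving on the MessagePort.
type PortMessage =
  | { type: "pick"; items: { id: string }[] }
  | { type: "close" }
  | { type: "notification"; detail?: string };

// What the host page should do in response to each message.
type PickerAction =
  | { kind: "resolve"; ids: string[] }
  | { kind: "dismiss" }
  | { kind: "ignore" };

function toAction(message: PortMessage): PickerAction {
  switch (message.type) {
    case "pick":
      return { kind: "resolve", ids: message.items.map((item) => item.id) };
    case "close":
      return { kind: "dismiss" };
    case "notification":
      return { kind: "ignore" }; // informational only in this sketch
  }
}
```

In a browser, the host would call `toAction(event.data)` from the port's `onmessage` handler and then close the popup or resolve its promise based on the returned `kind`.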

Oh, and there is a massive hidden trap: if you try to pass too many pre-selected items via the sourceItems configuration, the picker initialization fails entirely because it exceeds the URL length limits of the embedded iframe. We documented the sheer pain of this specific quirk in Challenges we face while building integrations: SharePoint.
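
A defensive guard against that failure is to cap the pre-selected list before it is serialized. The sketch below assumes the IDs are sent as URL-encoded JSON and that you pick your own character budget; the real limit is not stated here, and `capSourceItems` is a hypothetical helper rather than part of any SDK.

```typescript
// Hypothetical guard: keep pre-selected IDs only while the URL-encoded
// JSON payload stays within a caller-chosen budget (in characters).
function capSourceItems(ids: string[], maxEncodedLength: number): string[] {
  const kept: string[] = [];
  for (const id of ids) {
    const candidate = [...kept, id];
    const encoded = encodeURIComponent(JSON.stringify(candidate));
    if (encoded.length > maxEncodedLength) break; // adding this ID would overflow
    kept.push(id);
  }
  return kept;
}
```

Dropping the overflow silently means some files lose their checkmark in the picker, so a real implementation would probably log how many IDs were trimmed.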

Box: Inline Modals and CORS Requirements

Box renders its picker inline in your page using Box UI Elements. You load CSS and JS from Box's CDN, create a full-screen modal in the DOM, and instantiate a Box.FilePicker. Because it injects directly into the DOM, your CSS can clash with their elements.

The catch? Your domain must be in the CORS Allowed Domains list in the Box Developer Console. Miss this step, and you get cryptic cross-origin errors that frequently break ephemeral preview environments.

Dropbox: The App-Key Wildcard

Dropbox's Chooser ignores OAuth tokens entirely. Authentication is handled through an appKey that you pass when loading the Dropbox drop-in script. The Chooser manages its own auth flow, meaning your Dropbox integration doesn't share an auth model with the other four providers.

The Comparison at a Glance

| Provider | Auth Model | Rendering | Script Loading | Key Gotcha |
|---|---|---|---|---|
| Google Drive | OAuth token via Truto | Google's overlay | 3 sequential script loads | Heaviest initialization |
| SharePoint | Session-based (popup) | Popup (1080×680) | None (hosted page) | MessageChannel protocol |
| OneDrive | Session-based (popup) | Popup (1080×680) | None (hosted page) | Personal vs. business URL fork |
| Box | OAuth token via Truto | Inline DOM modal | CDN CSS + JS | CORS domain allowlist required |
| Dropbox | App key (no OAuth) | Dropbox-managed popup | Script injection | Domain allowlist required |

For a deeper look at these undocumented edge cases, read How to Integrate Google Drive, SharePoint, and Box: What It Really Takes.

Introducing Truto's showFilePicker: One Function, Five Providers

We looked at the market and didn't like the existing solutions.

Competitors like Merge.dev offer a file picker built into their Link component, but they rely on scheduled sync jobs and store customer data on their servers. If you are building a real-time AI agent, waiting for a cron job to sync a selected PDF is unacceptable. Apideck provides an open-source React component, but you still have to implement their specific UI and handle their unified pagination logic.

We took a different route. The showFilePicker method is implemented directly in our Truto Link SDK—the exact same SDK our customers already use to show the connect UI to their users. It exposes a single function that wraps all five native pickers:

import { showFilePicker } from '@truto/truto-link-sdk';

const pickedItems = await showFilePicker(
  'googledrive',            // or 'sharepoint', 'onedrive', 'box', 'dropbox'
  integratedAccountToken,   // Truto-managed auth token
  {
    // Provider-specific config (deep-merged with sensible defaults)
    views: [
      { viewId: 'DOCS', includeFolders: true, selectFolderEnabled: true },
      { enableDrives: true }  // include shared drives
    ]
  }
);

[Screen recording: in the Truto Link SDK Playground, the user clicks "Get started" on the File Picker card, the native Google Drive picker opens, the user selects a file and confirms, and the picker closes and returns the chosen file to the playground.]

Behind that single function, the SDK uses a switch statement to route to the correct provider implementation. It abstracts away the script loading, the popup/modal management, and the MessageChannel protocols.
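
The routing idea can be pictured as an exhaustive switch over the provider name. The following is a sketch of the pattern only, not Truto's source: `Provider`, `LaunchPlan`, and `planFor` are hypothetical names, and the values are copied from the comparison table earlier in this post.

```typescript
type Provider = "googledrive" | "sharepoint" | "onedrive" | "box" | "dropbox";

type LaunchPlan = {
  rendering: "overlay" | "popup" | "inline-modal";
  auth: "oauth-token" | "session" | "app-key";
};

// Maps each provider to how its native picker is rendered and authenticated.
function planFor(provider: Provider): LaunchPlan {
  switch (provider) {
    case "googledrive":
      return { rendering: "overlay", auth: "oauth-token" };
    case "sharepoint":
    case "onedrive":
      return { rendering: "popup", auth: "session" };
    case "box":
      return { rendering: "inline-modal", auth: "oauth-token" };
    case "dropbox":
      return { rendering: "popup", auth: "app-key" };
    default: {
      // Compile-time exhaustiveness check; also guards untyped callers at runtime.
      const unreachable: never = provider;
      throw new Error(`Unsupported provider: ${String(unreachable)}`);
    }
  }
}
```

The `never` assignment means that adding a sixth provider to the union fails type checking until the switch handles it.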

Crucially, we preserve the native, vendor-provided UI. Users see the exact Google Drive overlay or Microsoft popup they are used to. This instantly builds trust and prevents you from having to reverse-engineer search, folder navigation, and pagination logic.

More importantly, Truto handles the authentication lifecycle entirely. We manage the OAuth token refresh behind the scenes and inject the valid access_token directly into the native picker. You get zero-code auth management without forcing the user through a redundant login flow.

Tip

Config keys you don't recognize are passed through directly to the native provider SDK. This means you get full access to provider-specific options (like Google's selectableMimeTypes or Box's sortDirection) without Truto acting as a bottleneck.
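
To make the deep-merge and pass-through behavior concrete, here is a minimal sketch of how such a merge could work. It is an assumption about the general technique, not the SDK's actual code: `deepMerge` is a hypothetical helper, and the choice to replace arrays rather than concatenate them is ours.

```typescript
type Config = { [key: string]: unknown };

function isPlainObject(value: unknown): value is Config {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively merges `overrides` into `defaults` without mutating either.
// Assumption: arrays and primitives from `overrides` replace the default value.
function deepMerge(defaults: Config, overrides: Config): Config {
  const out: Config = { ...defaults };
  for (const key of Object.keys(overrides)) {
    const base = out[key];
    const value = overrides[key];
    out[key] =
      isPlainObject(base) && isPlainObject(value)
        ? deepMerge(base, value) // merge nested objects key by key
        : value;                 // unknown keys simply pass through
  }
  return out;
}
```

With this shape, a key such as `sortDirection` that has no default is carried into the final config untouched, which is the pass-through behavior the tip describes.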

The Post-Selection Pipeline: Deduplication and JSONata

Getting the user to select a file is only half the battle. Once the picker returns an array of files, you have to normalize that data before persisting it to your database or triggering a download job (see Tackling the Challenges of File Upload and Download Integrations).

Every provider in our SDK runs through a shared, four-step post-selection pipeline:

  1. Deduplication: All items are cast to an array and deduplicated by ID (handling edge cases where providers return duplicates from shared drives).
  2. Transformation: Data passes through an optional JSONata transformation layer via the trutoExpression configuration.
  3. Persistence: The transformed array is PATCHed directly to the integrated account context via the Truto API.
  4. Resolution: The promise resolves, returning the clean array to your application.
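
The four steps above can be sketched as a single async function. This is an illustration under assumptions, not the SDK's implementation: `PickedItem`, `dedupeById`, and `runPostSelection` are hypothetical names, the transform is a plain function standing in for the JSONata layer, and `persist` stands in for the PATCH call to the Truto API.

```typescript
type PickedItem = { id: string; [key: string]: unknown };

// Step 1: normalize to an array and keep the first occurrence of each ID.
function dedupeById(input: PickedItem | PickedItem[]): PickedItem[] {
  const items = Array.isArray(input) ? input : [input];
  const seen = new Map<string, PickedItem>();
  for (const item of items) {
    if (!seen.has(item.id)) seen.set(item.id, item);
  }
  return Array.from(seen.values());
}

async function runPostSelection(
  picked: PickedItem | PickedItem[],
  transform: (items: PickedItem[]) => PickedItem[],
  persist: (items: PickedItem[]) => Promise<void>
): Promise<PickedItem[]> {
  const unique = dedupeById(picked); // 1. deduplication
  const shaped = transform(unique);  // 2. transformation
  await persist(shaped);             // 3. persistence
  return shaped;                     // 4. resolution
}
```

Because persistence is awaited before the function returns, a caller that receives the array can assume the stored selection already matches it.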

The JSONata transformation step is particularly powerful for RAG use cases. Because the expression runs on the full array of picked items, you can filter out specific MIME types, reshape the metadata, or compute derived values on the client side before the payload ever hits your backend.

const pickedItems = await showFilePicker('googledrive', token, {
  // Add a timestamp and source to every selected item before storing
  trutoExpression: '$.($merge([{"selected_at": $now(), "source": "google_drive"}, $]))'
});
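
If JSONata is unfamiliar, the expression above is roughly equivalent to the following plain TypeScript. This is our own reading of the expression rather than code from the SDK; `stampItems` is a hypothetical name, and the timestamp is passed in because JSONata's `$now()` yields one value for the whole evaluation.

```typescript
type DriveItem = { [key: string]: unknown };

// Adds `selected_at` and `source` to every item. The item is spread last,
// mirroring $merge, where later objects override earlier ones.
function stampItems(items: DriveItem[], now: string): DriveItem[] {
  return items.map((item) => ({
    selected_at: now,
    source: "google_drive",
    ...item, // fields already on the item win over the two defaults
  }));
}
```

One consequence of that ordering: an item that already carries its own `source` field keeps it, so swap the order if the stamp should always win.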

Replace vs. Upsert: Managing Persistent File Selections

When users interact with file pickers across multiple sessions, you have to decide how to handle state. If a user picks three files on Monday and two files on Tuesday, do you overwrite the selection or merge them?

We handle this via the truto_upsert_drive_items configuration flag.

By default, the SDK operates in replace mode. Each picker invocation completely replaces the stored drive items. If you enable upsert mode, new items are merged with existing items and deduplicated by ID.

const pickedItems = await showFilePicker('sharepoint', token, {
  truto_upsert_drive_items: true // Merges new selections with historical ones
});

SharePoint and OneDrive actually support this natively on the frontend. We map the stored file IDs to their sourceItems configuration, meaning previously selected files are automatically pre-selected with checkmarks the next time the Microsoft popup opens. Google Drive and Box lack this native re-selection UI, but the upsert logic ensures your backend state remains consistent regardless of the provider.
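
The difference between the two modes fits in a few lines. The sketch below is hypothetical (`StoredItem` and `nextSelection` are our names) and assumes that when an ID appears in both lists, the newly picked copy replaces the stored one; the SDK's actual tie-breaking is not described here.

```typescript
type StoredItem = { id: string; [key: string]: unknown };

// Returns the selection that should be stored after a picker session.
function nextSelection(
  existing: StoredItem[],
  picked: StoredItem[],
  upsert: boolean
): StoredItem[] {
  if (!upsert) return picked; // replace mode: the new pick is the whole state

  const byId = new Map<string, StoredItem>();
  for (const item of existing) byId.set(item.id, item);
  for (const item of picked) byId.set(item.id, item); // assumed: newer copy wins
  return Array.from(byId.values());
}
```

For the Monday and Tuesday example, replace mode stores only Tuesday's two files, while upsert mode stores the union of both days with each ID appearing once.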

The Trade-offs

Using a unified file picker through Truto isn't a silver bullet. You are still subject to per-provider limitations. Box's CORS requirements don't go away. Dropbox still needs domain allowlisting. Google's Picker API still loads three scripts. Truto handles the complexity for you, but the underlying constraints exist.

Furthermore, native pickers mean native UX inconsistencies. If pixel-perfect UI consistency across providers is your top priority, a custom-built picker might be more appropriate—though you lose the inherent user trust of the native interfaces.

Why AI Agents Need Native Pickers

Selective file ingestion is the only scalable architecture for modern AI applications. You have to let users explicitly choose which documents matter.

By using a unified API that wraps native pickers, you solve three problems at once:

  • User Trust: Users authenticate and select files through the official Google or Microsoft interfaces, not a custom third-party replica.
  • Context Quality: You only index the specific files the user cares about, keeping your vector database lean and your LLM responses highly relevant.
  • Engineering Velocity: Your team writes one showFilePicker implementation instead of maintaining five separate, undocumented frontend integrations.

Stop burning engineering cycles reverse-engineering Microsoft's iframe limits and Google's script loaders. Abstract the mess and focus on building your actual product.

FAQ

Why should I use native file pickers instead of a custom UI?
Native pickers preserve the vendor's official UI, which builds user trust and avoids the need to reverse-engineer search, folder navigation, and pagination logic for every cloud provider.

How does Truto handle authentication for file pickers?
Truto manages the entire OAuth lifecycle on the backend. The SDK passes the active access_token directly to the native picker component, so the user does not go through any additional login flow.

Can I transform file metadata before storing it?
Yes. Truto includes a trutoExpression configuration that accepts JSONata expressions, allowing you to filter, reshape, or enrich file metadata on the client side before it hits your database.

What is the difference between OneDrive File Picker and the Google Picker API?
Microsoft's File Picker uses a popup window that communicates over a MessageChannel port and needs no client-side SDK. Google's Picker API requires loading multiple scripts sequentially and uses PickerBuilder to configure the UI.
