How to Automate User Access Reviews Across Unmanaged SaaS Using Unified APIs
Learn how to architect automated SOC 2 and SOX user access reviews across hundreds of unmanaged SaaS applications using a Unified Directory API.
If your engineering roadmap includes the requirement to "automate quarterly user access reviews across every SaaS app our customers use", you are staring down a massive architectural challenge. What your enterprise customers are really asking is how to automatically pull a list of users, roles, and access levels from every application their employees use, without relying on manual spreadsheets.
Your product manager drops this requirement in a sprint planning meeting. The goal is to automate SOC 2, SOX, and ISO 27001 user access reviews. Your customers want to stop exporting CSVs and chasing department heads via Slack to confirm who still needs access to Jira, HubSpot, or the 100+ other tools their teams use.
As we covered in our developer tutorial on pulling user lists, automating this workflow requires moving beyond one-off API scripts. You need a highly scalable, unified architecture capable of normalizing identity data across hundreds of disparate systems. Building this yourself, one integration at a time, is an engineering trap. You will spend the next three years maintaining point-to-point connectors while fighting terrible vendor API documentation, aggressive rate limits, and undocumented edge cases.
This guide breaks down the architectural reality of identity sprawl, why point-to-point connectors fail at scale, and exactly how to automate quarterly user access reviews across unmanaged SaaS applications using a declarative Unified User Directory API.
The Compliance Nightmare: Why Manual User Access Reviews Break at Scale
Security frameworks like SOC 2 and SOX mandate periodic user access reviews (UARs). Organizations must prove that access to their systems is restricted to authorized personnel, based on the principle of least privilege, and that dormant accounts are promptly deprovisioned.
UARs are a hard compliance control. If a service organization's policies say they conduct quarterly logical access reviews, that organization will need to provide quarterly evidence from the preceding year confirming those reviews were conducted. Auditors want evidence the control operated throughout the observation period. The most common failure is having a policy that says user access is reviewed quarterly, but only being able to produce one access review from six months ago. The policy exists, but the evidence doesn't support that it operates as described.
Historically, IT and security teams handled this manually. The traditional workflow looks like this:
- IT exports a CSV from every SaaS admin panel they have credentials to.
- The list is split by application owner and emailed to managers.
- Managers eyeball spreadsheets, cross-reference them against the HRIS, and rubber-stamp approvals.
- Someone consolidates the responses into a master audit folder.
- Three months later, repeat.
This manual process is a massive drain on resources and breaks immediately upon contact with reality. CSVs are inherently stale the moment they are exported. Apps without admin API access force IT to scrape UIs. Half the time, the right "manager" no longer works there.
If you are a GRC, SSPM, or identity governance product trying to automate this for your customers, you face an even harder problem: you don't have admin credentials at all. You have an OAuth connection from your end user, which means you need to programmatically pull the user list, role assignments, and last login data through whatever API the vendor offers. Multiply that by 100 SaaS apps per customer, and you have a severe engineering problem.
The Blind Spot: Unmanaged SaaS and Identity Sprawl
The standard engineering reflex when tasked with pulling user lists is to integrate with Okta, Microsoft Entra ID, or Google Workspace. The reasoning goes: if a customer wants user data, pull it from the Identity Provider (IdP) via SCIM (System for Cross-domain Identity Management), a standard we detail in our guide to directory integrations.
This approach fails because SCIM and IdPs only see a visible slice of the SaaS estate. Connecting to major IdPs is table stakes, but it leaves a massive blind spot where the real risk lives.
Product-led growth has decentralized software purchasing. Marketing teams buy their own SEO tools. Engineering teams spin up new monitoring dashboards. Sales teams expense new prospecting software. Gartner projects that by 2027, 75% of employees will use technology outside of IT's purview. Organizations officially recognize only about 10% of the cloud services in use; the real footprint is roughly ten times what IT has on record. A typical business tracks 108 known cloud services while another 975 run undetected.
Identity sprawl directly translates to breaches. Excessive permissions remain a leading cause of SaaS security incidents. Studies show that 85% of SaaS users have more privileges than their roles require, creating unnecessary attack surfaces. AppOmni's 2025 data reveals that 75% of organizations experienced a SaaS security incident in the past 12 months, with a significant number of these incidents tied to unauthorized applications.
The attack surface is growing fast. A 2026 Grip Security report found a year-over-year 490% spike in public SaaS attacks, with 80% of documented incidents involving PII and/or customer data. The textbook example is the Salesloft Drift incident, where attackers stole active OAuth tokens that customers used to connect the Drift chatbot to their Salesforce instances. Armed with legitimate tokens, attackers impersonated Drift and accessed Salesforce directly. One breach of a SaaS app cascaded into hundreds of compromises.
The takeaway: an access review program that only covers centralized IdPs misses the long tail of unmanaged SaaS. To provide genuine security value, your platform must connect directly to the underlying SaaS applications to audit local accounts, bypass shadow IT blind spots, and read the actual permissions granted within the app. For a deeper dive into this architectural gap, see our guide on the long tail of identity.
Why Building Point-to-Point Integrations for User Lists Fails
To audit the long tail of SaaS, you must integrate directly with the APIs of the applications your customers use. The instinct of every senior engineer staring at this requirement is to write a quick script. "It's just GET /users from each app, right?"
If you decide to build these point-to-point connectors in-house, the math quickly becomes terrifying. The average enterprise uses over 130 different SaaS applications. A realistic engineering team can ship two or three high-quality, production-grade integrations per quarter. At that pace, 100 integrations take 33 to 50 quarters, which is roughly 8 to 12 years of work before your access review feature has full coverage.
The difficulty is not just writing the HTTP requests. Three weeks in, you will find your codebase infected with integration-specific logic trying to handle the following:
- Authentication Chaos: Application A uses standard OAuth 2.0 authorization code. Application B requires a static API key passed in a custom header. Application C requires you to exchange a signed JWT for a short-lived session token every 15 minutes. Others use Basic Auth or session cookies refreshed via post-install hooks. Each one is a separate code path with its own failure modes.
- Pagination Hell: Pagination is a tax on every endpoint. Application A uses cursor-based pagination (?after=). Application B uses page or offset/limit parameters (?page=). Application C uses HTTP Link headers (RFC 8288, which obsoletes RFC 5988). Your sync job must handle every strategy correctly so that no users are skipped during an audit.
- Data Model Fragmentation: Field shapes rarely match a unified concept of a "user." HubSpot nests fields under a properties object (for example, properties.firstname). Salesforce uses flat PascalCase. Workday calls them workers. Each provider has its own enums for status (e.g., active, suspended, deleted, archived) and license type.
- Rate Limits: Aggressive and often undocumented. A single tenant scan of 5,000 users can burn the entire daily API quota for some applications.
- Missing Webhooks: Half of these applications do not emit user lifecycle events, forcing you to poll on a schedule.
- Silent Token Expiration: A nightly sync that worked for six months will quietly start failing when refresh tokens rotate or scopes change.
If you hardcode these differences, you end up with a massive, fragile codebase filled with if (provider === 'hubspot') statements. Every time a vendor deprecates an endpoint or changes a field name, your sync jobs break, your customers fail their compliance audits, and your engineering team drops feature work to fix technical debt.
The Maintenance Trap: The hidden cost is not the initial build - it is the ongoing maintenance. APIs deprecate endpoints, change pagination, rotate auth flows, and introduce breaking changes. Every change requires a code deploy, regression testing, and a release. Do that 100 times in parallel and your product engineering team will become an integration maintenance team.
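To make the pagination point concrete, here is a minimal sketch of the adapter logic that three pagination styles force on a sync job. The `Pagination` type, `parseNextLink`, and `nextRequest` are hypothetical names for illustration, not part of any vendor SDK, and the Link header parser is deliberately simplified (it assumes URLs do not contain commas).

```typescript
// Hypothetical shapes for illustration; real vendors name these fields differently.
type Pagination =
  | { kind: 'cursor'; nextCursor: string | null }
  | { kind: 'page'; page: number; pageSize: number; returned: number }
  | { kind: 'link'; linkHeader: string | null };

// Extract the rel="next" target from an HTTP Link header (RFC 8288).
// Simplification: splits on commas, so URLs containing commas are not handled.
function parseNextLink(header: string | null): string | null {
  if (!header) return null;
  for (const part of header.split(',')) {
    const match = part.match(/<([^>]+)>\s*;\s*(.*)/);
    if (!match) continue;
    const params = match[2] ?? '';
    if (/(?:^|;)\s*rel="?next"?\s*(?:;|$)/.test(params)) return match[1] ?? null;
  }
  return null;
}

// Return the query string or URL for the next request, or null when the list is exhausted.
function nextRequest(p: Pagination): string | null {
  switch (p.kind) {
    case 'cursor':
      return p.nextCursor ? `?after=${encodeURIComponent(p.nextCursor)}` : null;
    case 'page':
      // A short page is taken to mean the last page was reached.
      return p.returned < p.pageSize ? null : `?page=${p.page + 1}`;
    case 'link':
      return parseNextLink(p.linkHeader);
  }
}
```

Even this toy version has three termination rules to get right. Multiply that by authentication and schema differences and the maintenance burden described above becomes easier to picture.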
How to Automate Quarterly User Access Reviews Using Unified Directory APIs
To solve this problem at scale, you must abandon integration-specific code. As explained in our guide on pulling user lists with a Unified Directory API, a Unified User Directory API abstracts the per-vendor mess into a single, normalized schema. Instead of learning the idiosyncrasies of 100 individual platforms, you call one endpoint, and the unified platform handles the translation, authentication, and pagination behind the scenes.
The Unified Relational Schema
To automate access reviews, a Unified API normalizes data into an identity-centric relational model. The resources you need for an audit-ready program look like this:
| Unified Resource | What It Answers |
|---|---|
| Users | Who exists in the app and what is their status (active / suspended / deactivated)? |
| Groups | Which departments, teams, or distribution lists exist? |
| Roles | What permission tiers does the app define? |
| RoleAssignments | Which user has which role? (The heart of access reviews) |
| Licenses | What paid seats are allocated and to whom? |
| Activities | Login events and admin actions for SIEM ingestion |
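As a rough illustration of how these resources combine, the sketch below joins Users, Roles, and RoleAssignments into one row per user for a reviewer to approve or revoke. The interfaces and field names (`userId`, `roleId`, and so on) are assumptions made for this example, not a documented schema.

```typescript
// Hypothetical unified shapes; field names are assumptions for illustration.
interface User { id: string; email: string; status: 'active' | 'suspended' | 'deactivated'; }
interface Role { id: string; name: string; }
interface RoleAssignment { userId: string; roleId: string; }

interface ReviewRow { email: string; status: User['status']; roles: string[]; }

// Join the three resources into one row per user.
function buildReviewRows(users: User[], roles: Role[], assignments: RoleAssignment[]): ReviewRow[] {
  const roleNames = new Map(roles.map(r => [r.id, r.name] as const));
  const rolesByUser = new Map<string, string[]>();
  for (const a of assignments) {
    // Keep assignments that point at an unknown role visible instead of dropping them.
    const roleName = roleNames.get(a.roleId) ?? `unknown:${a.roleId}`;
    const list = rolesByUser.get(a.userId) ?? [];
    list.push(roleName);
    rolesByUser.set(a.userId, list);
  }
  return users.map(u => ({
    email: u.email,
    status: u.status,
    roles: (rolesByUser.get(u.id) ?? []).sort(),
  }));
}
```

Surfacing unknown role IDs rather than silently skipping them is a deliberate choice here: in an audit, a dangling assignment is itself a finding.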
graph TD
Org[Organization / Workspace] -->|Contains| User[User]
Org -->|Defines| Role[Role]
Org -->|Defines| Group[Group]
User -->|Belongs to| Group
User -->|Granted access via| RoleAssignment[RoleAssignment]
RoleAssignment -->|Links to| Role
The call to fetch users from any connected app collapses to a single standardized request:
curl -X GET "https://api.unified-platform.com/unified/user-directory/users?integrated_account_id=acc_123" \
  -H "x-api-key: $YOUR_API_KEY"
You receive a normalized response, regardless of whether it came from Okta, BambooHR, GitHub, Salesforce, or a niche industry app:
{
  "result": [
    {
      "id": "u_a91",
      "email": "jane@acme.com",
      "first_name": "Jane",
      "last_name": "Doe",
      "status": "active",
      "groups": ["engineering", "on-call"],
      "last_login_at": "2026-04-12T08:14:11Z",
      "remote_data": { "...the raw provider payload...": true }
    }
  ],
  "next_cursor": "eyJvZmZzZXQiOjEwMH0=",
  "result_count": 100
}
The remote_data field is critical for compliance - the platform never throws away the raw response, so when an auditor asks for the original provider record, you have the exact evidence.
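The `next_cursor` field in the response implies a loop on the caller's side. A minimal sketch of that loop follows, assuming the list ends when `next_cursor` is absent or null. The `collectAll` helper and the injected `fetchPage` callback are hypothetical; how the cursor is passed back (query parameter name, encoding) depends on the platform and is left to `fetchPage`.

```typescript
// Shape of one page, mirroring the `result` and `next_cursor` fields shown above.
interface Page<T> { result: T[]; next_cursor?: string | null; }

// fetchPage is injected so the loop stays independent of any HTTP client.
async function collectAll<T>(
  fetchPage: (cursor: string | null) => Promise<Page<T>>,
  maxPages = 1000, // safety valve against a cursor that never terminates
): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | null = null;
  for (let i = 0; i < maxPages; i++) {
    const page: Page<T> = await fetchPage(cursor);
    all.push(...page.result);
    // Assumption: a missing or null cursor marks the last page.
    if (!page.next_cursor) return all;
    cursor = page.next_cursor;
  }
  throw new Error(`Pagination did not terminate within ${maxPages} pages`);
}
```

For an audit, failing loudly on a runaway cursor is safer than returning a partial user list that looks complete.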
Zero Integration-Specific Code via JSONata
The architectural secret to a scalable unified API is moving integration logic out of the codebase and into configuration data.
Advanced platforms achieve this using JSONata - a functional query and transformation language for JSON. Every field mapping, query translation, and conditional logic rule is stored as a JSONata expression. When a request is made, a generic execution engine reads the configuration, fetches the data, and evaluates the expression.
For example, transforming a complex response from HubSpot into a clean Unified User object requires a single expression:
response.{
  "id": $string(id),
  "email": properties.email,
  "first_name": properties.firstname,
  "last_name": properties.lastname,
  "status": properties.archived = "true" ? "deactivated" : "active"
}
The Salesforce equivalent is a different expression against Id, FirstName, Email, and IsActive. Both produce the exact same unified output. This means adding support for the 101st SaaS application is a data operation, not a code deployment. Your code never branches on the provider name. For a deeper read on this pattern, see our zero integration-specific code writeup.
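The "mapping as data" idea can be sketched without JSONata at all. The example below swaps JSONata for plain dotted paths, which is far less expressive but shows the same property: the engine looks up a mapping and never branches on the provider. The `mappings` table, `getPath`, and `toUnified` are hypothetical names, and the Salesforce field names are taken from the paragraph above.

```typescript
// A mapping is plain data: unified field -> dotted path into the provider payload.
// This is a simplified stand-in for JSONata expressions, not JSONata itself.
type FieldMap = Record<string, string>;

const mappings: Record<string, FieldMap> = {
  hubspot: { id: 'id', email: 'properties.email', first_name: 'properties.firstname' },
  salesforce: { id: 'Id', email: 'Email', first_name: 'FirstName' },
};

// Walk a dotted path through nested objects; yields undefined when any hop is missing.
function getPath(obj: unknown, path: string): unknown {
  return path.split('.').reduce<unknown>(
    (cur, key) =>
      cur !== null && typeof cur === 'object' ? (cur as Record<string, unknown>)[key] : undefined,
    obj,
  );
}

// Generic engine: it looks up the provider's mapping but never branches on its name.
function toUnified(provider: string, raw: unknown): Record<string, unknown> {
  const map = mappings[provider];
  if (!map) throw new Error(`No mapping configured for ${provider}`);
  const out: Record<string, unknown> = {};
  for (const [field, path] of Object.entries(map)) out[field] = getPath(raw, path);
  return out;
}
```

Supporting another provider in this sketch means adding one entry to `mappings`. A real system would store those entries outside the codebase and use an expression language to handle conditionals like the status rule above.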
Architecting the Sync: Best Practices for Continuous Access Discovery
Wiring up a Unified API is only the first step. Running GET /users reliably across thousands of customer-connected accounts, every night, in a way auditors will accept, is the actual engineering challenge. Your backend architecture must handle the harsh realities of background synchronization at scale.
1. Proactive OAuth Token Management and Distributed Locks
Most background sync failures are not API failures - they are token failures. Access tokens expire frequently (often every 60 minutes). If a quarterly access review sync job running at 2 AM attempts to use an expired token, the request fails.
The naive approach is to catch the 401 Unauthorized error, refresh the token, and retry. In a high-concurrency environment, this causes severe race conditions. If five sync jobs wake up simultaneously and attempt to refresh the same token, a provider that rotates refresh tokens can treat the reused token as a replay attack, return an invalid_grant error, and revoke the connection.
The reliable pattern utilizes proactive refreshes and distributed mutex locks. The platform schedules background work to refresh the token 60 to 180 seconds before expiration. It acquires a durable, per-account mutex lock so only one refresh runs at a time. Concurrent requests await the lock resolution rather than firing duplicate requests. On unrecoverable failures (like a user revoking access), the account is marked as needs_reauth and a webhook fires so your application can prompt the customer to reconnect.
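The pattern above can be sketched in a few lines. This version uses an in-process map of pending promises as a stand-in for the durable, distributed lock the text describes, so it only serializes refreshes within a single worker. `getFreshToken`, `needsRefresh`, and the injected `refresh` callback are hypothetical names, and the 120 second default is simply a value inside the 60 to 180 second window mentioned above.

```typescript
interface Token { accessToken: string; expiresAt: number; } // expiresAt in epoch milliseconds

// Refresh proactively once the token is within `skewMs` of expiry.
function needsRefresh(token: Token, now: number, skewMs = 120000): boolean {
  return token.expiresAt - now <= skewMs;
}

// Pending refreshes keyed by account. In-process only: across multiple workers
// this map would have to be replaced by a shared lock service.
const inFlight = new Map<string, Promise<Token>>();

async function getFreshToken(
  accountId: string,
  current: Token,
  refresh: (accountId: string) => Promise<Token>, // hypothetical call to the token endpoint
  now: number = Date.now(),
): Promise<Token> {
  if (!needsRefresh(current, now)) return current;
  let pending = inFlight.get(accountId);
  if (!pending) {
    // The first caller starts the refresh; later callers await the same promise.
    pending = refresh(accountId).finally(() => inFlight.delete(accountId));
    inFlight.set(accountId, pending);
  }
  return pending;
}
```

If `refresh` rejects, every waiting caller sees the same error and the entry is cleared, which is the natural place to mark the account as needs_reauth.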
2. Standardized Rate Limit Handling at the Edge
Pulling thousands of users and their role assignments inevitably triggers upstream API rate limits.
Architectural Reality Check: No unified API platform can magically absorb rate limits for you. If the upstream SaaS application returns an HTTP 429 Too Many Requests error, that error must be passed back to your application. Your application owns the retry policy, because background syncs and interactive requests require different backoff behaviors.
The problem is that every vendor communicates rate limit resets differently (X-RateLimit-Reset, Retry-After, custom headers). To make retry logic manageable, a robust unified API intercepts varied upstream responses and normalizes them into standard IETF draft headers:
- ratelimit-limit: The maximum number of requests permitted.
- ratelimit-remaining: The number of requests remaining in the current window.
- ratelimit-reset: The time at which the rate limit window resets (in UTC epoch seconds).
Your engineering team can write a single, standardized exponential backoff wrapper that reads these headers, completely decoupled from upstream quirks:
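// Example: exponential backoff using normalized headers
async function fetchWithBackoff(url: string, attempt = 0): Promise<Response> {
  const res = await fetch(url, { headers: { 'x-api-key': process.env.API_KEY! } });
  if (res.status !== 429) return res;
  // Give up before sleeping again once the retry budget is spent.
  if (attempt >= 5) throw new Error('Rate limit retries exhausted');
  // ratelimit-reset is described above as UTC epoch seconds, while the IETF draft defines
  // it as seconds remaining. Treat large values as epoch timestamps, small ones as a delta.
  const raw = Number(res.headers.get('ratelimit-reset'));
  const reset = Number.isFinite(raw) && raw > 0 ? raw : 30; // fall back to 30 seconds
  const untilResetMs = reset > 1e9 ? Math.max(0, reset * 1000 - Date.now()) : reset * 1000;
  const backoffMs = 2 ** attempt * 1000;
  const jitter = Math.random() * 1000;
  // Wait until the window resets (or the backoff floor), capped at 60 seconds per attempt.
  const delay = Math.min(Math.max(untilResetMs, backoffMs), 60000) + jitter;
  console.log(`Rate limited. Sleeping for ${Math.round(delay)}ms...`);
  await new Promise(r => setTimeout(r, delay));
  return fetchWithBackoff(url, attempt + 1);
}
3. Build an Audit-Ready IT Graph, Not Snapshots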
A quarterly CSV export is a snapshot. An audit-ready system is a graph: every user, role assignment, group membership, and license seat must be timestamped with a history of changes.
For each connected account, schedule a recurring sync of Users, Groups, Roles, and RoleAssignments. Diff each run against the previous to capture additions, removals, and role changes. That diff is the immutable evidence package your auditor wants. Pair this with the unified Activities endpoint for login events - if a user has not logged in for 90 days, that is an access reviewer's first flag.
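The diff step and the 90-day flag can be sketched as two small pure functions. The `Assignment` shape and function names below are assumptions for illustration; a production system would also persist each diff with the timestamp of the sync run that produced it.

```typescript
interface Assignment { userId: string; roleId: string; }
interface AccessDiff { added: Assignment[]; removed: Assignment[]; }

const assignmentKey = (a: Assignment): string => `${a.userId}::${a.roleId}`;

// Compare two sync runs and report which role assignments appeared or disappeared.
// A role change shows up as one removal plus one addition for the same user.
function diffAssignments(previous: Assignment[], current: Assignment[]): AccessDiff {
  const prevKeys = new Set(previous.map(assignmentKey));
  const currKeys = new Set(current.map(assignmentKey));
  return {
    added: current.filter(a => !prevKeys.has(assignmentKey(a))),
    removed: previous.filter(a => !currKeys.has(assignmentKey(a))),
  };
}

// Flag accounts with no login in the last `days` days, or with no recorded login at all.
function isDormant(lastLoginAt: string | null, now: Date, days = 90): boolean {
  if (!lastLoginAt) return true;
  const elapsedMs = now.getTime() - new Date(lastLoginAt).getTime();
  return elapsedMs > days * 24 * 60 * 60 * 1000;
}
```

Treating a missing login timestamp as dormant is a conservative choice: some providers never expose last login data, and a reviewer should see those accounts rather than have them pass silently.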
4. Handling Custom Permission Models via Overrides
Enterprise customers rarely use default permission models. Salesforce orgs have custom profiles. GitHub Enterprise has custom org-level roles. NetSuite has subsidiary-scoped permissions. If your unified API relies on rigid, hardcoded data models, it will drop this custom data, lying to your customers during an audit.
The architectural fix is a multi-level override hierarchy. If a specific customer uses a bespoke security field to determine administrative access, you can apply an account-level JSONata override to map that custom field directly into the unified RoleAssignments array. This customization happens entirely in configuration, meaning you can support bespoke enterprise permission models without touching your core platform code.
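One plausible way to model such a hierarchy is a field-by-field merge in which more specific layers win. The layer names (`platform`, `environment`, `account`) and the example field names are assumptions for illustration; the source only states that overrides exist at multiple levels, including per account.

```typescript
// Unified field -> mapping expression (for example, JSONata source text).
type Mapping = Record<string, string>;

// Layers from lowest to highest precedence. Layer names are hypothetical.
interface MappingLayers {
  platform: Mapping;     // defaults shipped for every customer
  environment?: Mapping; // overrides for one customer environment
  account?: Mapping;     // overrides for one connected account
}

// Later layers win field by field; untouched fields fall through to the defaults.
function resolveMapping(layers: MappingLayers): Mapping {
  return {
    ...layers.platform,
    ...(layers.environment ?? {}),
    ...(layers.account ?? {}),
  };
}
```

For example, an account-level entry that points the `role` field at a custom security attribute replaces only that field, while `email` and every other default mapping stay intact.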
5. Be Honest About the Trade-Offs
A unified API is not a magic bullet. Be aware of the trade-offs:
- Coverage of esoteric features: If your access review needs a highly vendor-specific attribute, you will need to lean on a proxy/passthrough API or per-account overrides.
- Webhook fidelity varies: Some vendors emit excellent lifecycle webhooks; others emit none. For non-webhook providers, you are polling - plan your sync cadence accordingly.
- Roadmap coupling: Ensure your unified API provider supports data-only addition of new connectors so you are not blocked behind their sprint queue when a customer demands a niche integration.
Stop Chasing CSVs and Start Shipping
Automating user access reviews across unmanaged SaaS applications is a massive technical challenge, but it is also a massive competitive advantage. The explosive growth of SaaS, the surge in Shadow IT, and the rapid adoption of AI have created a tsunami of risks. Customers are desperate to replace manual CSV exports with continuous, automated compliance monitoring.
Attempting to build and maintain the necessary API integrations in-house will consume your engineering roadmap. By leveraging a declarative Unified Directory API, you abstract away the authentication chaos, pagination differences, and schema fragmentation of the SaaS ecosystem.
You stop maintaining auth flows and start shipping the actual product: review campaigns, automated reviewer assignment, evidence collection, and anomaly detection.
FAQ
- Why can't I just use SCIM or Okta for user access reviews?
- SCIM and major Identity Providers only cover applications formally managed by IT. They completely miss the "long tail" of shadow IT and unmanaged SaaS apps where the majority of security incidents occur.
- What is a unified user directory API?
- A unified user directory API is a single interface that abstracts identity data (users, groups, roles, role assignments, licenses, activities) across many SaaS applications into one normalized schema. Instead of integrating with each app individually, you call one endpoint and the platform handles auth, pagination, and field mapping per provider.
- How do unified APIs handle OAuth token expiration during background syncs?
- Production-grade unified APIs refresh OAuth tokens proactively before they expire and serialize concurrent refresh attempts per account using a mutex pattern, so multiple sync jobs don't race. On unrecoverable failures, a webhook fires so your application can prompt the user to reconnect.
- How do you handle API rate limits during large user syncs?
- A reliable unified API passes the HTTP 429 error back to the caller but normalizes the varied upstream rate limit headers into standardized IETF headers (like ratelimit-reset), allowing developers to write a single exponential backoff function for all integrations.
- What happens if a customer has custom roles or permissions in their SaaS app?
- Advanced unified APIs use a multi-level override hierarchy, allowing developers to write per-account JSONata mapping overrides to capture custom fields and bespoke permission models without changing core application code.