integrationengineer.dev

Reverse Engineering Vercel's Bot Protection: From Obfuscated JS to Bypass

March 2025 · 12 min read · reverse-engineering, web-security

Vercel ships a proof-of-work bot protection system for sites on their platform. Every visitor solves a cryptographic challenge before seeing any content. So I reverse engineered the whole thing: the obfuscated JavaScript, the Go WASM binary, all of it. I wanted to see how it works and whether it holds up.

The Challenge Page

Hit a Vercel-protected site without a valid session cookie and you get a 429: "We're verifying your browser." Behind that loading spinner, a lot is happening.

The page pulls in a minified script from /.well-known/vercel/security/static/challenge.v2.min.js. It's obfuscated. Single-letter variable names, an anti-debugging preamble, and a class that implements an entire Go WASM runtime. Yeah, in one file.

Deobfuscating the Worker Script

The script sets up a Web Worker that orchestrates the entire challenge flow. Once I renamed variables and added some structure, here's what's inside.

First, the anti-debugging preamble. It uses a classic trick: the catastrophic-backtracking regex (((.+)+)+)+$ is run against the function's own toString() output to detect whether someone has tampered with the code, for example by monkey-patching or reformatting functions at runtime:

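JavaScript
// oneShot is the name I gave to the helper that wraps this callback.
const antiDebug = oneShot(this, function () {
    // Run the backtracking-prone pattern against the function's own source text.
    return antiDebug
        .toString()
        .search('(((.+)+)+)+$')
        .toString()
        .constructor(antiDebug)
        .search('(((.+)+)+)+$');
});
antiDebug();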

There's also a DevTools detector that exploits the fact that console.log() only reads an Error's .stack property when DevTools is open:

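JavaScript
function isDevToolsOpen() {
    let opened = false;
    const err = new Error();
    // Replace .stack with a getter so that any read of it flips the flag.
    Object.defineProperty(err, 'stack', {
        get() { opened = true; return ''; }
    });
    // With DevTools open, logging the error reads .stack in order to render it.
    console.log(err);
    return opened;
}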

Both of these are exposed on self.setTimeout.d and self.setTimeout.e, available for the WASM binary to call via the Go-to-JS bridge.

Architecture: How the Challenge Works

The system has four components talking to each other:

Architecture
Vercel Edge Server
    |
    |  1. Serves 429 + challenge page (embeds token)
    v
Main Page (challenge.v2.min.js)
    |
    |  2. Spawns Web Worker via MessagePort
    v
Web Worker
    |
    |  3. Fetches & runs Go WASM binary
    v
challenge.v2.wasm (Go compiled to WASM)
    |
    |  4. Solves proof-of-work, probes browser fingerprint
    v
Worker posts solution back --> Main page POSTs to /request-challenge
    |
    |  5. Server validates --> Sets _vcrcs cookie --> Redirects to real content
    v
Actual page content (200 OK)

The Challenge Token

The token is generated server-side and embedded in the challenge page. It follows this structure:

Token Format
2.{timestamp}.{ttl}.{base64(payload)}.{hmac-md5}

Decoding the base64 payload reveals semicolon-delimited fields:

Field | Example | Purpose
Session ID | 084da3aa11e059c04d0753ebb769f590 | Unique per challenge (16 bytes)
Seed | c0a95598 | Starting nonce for proof-of-work (4 bytes)
Target hash | 3222e2bc72c65a56da45e03e3a48cddfc8c42d64 | SHA-1 sized target (20 bytes)
Difficulty | 3 | Proof-of-work difficulty level
Binary blob | 45-46 bytes of encrypted data | Server-side context (IP, TLS fingerprint, etc.)

The HMAC-MD5 signature at the end lets the server reject any token that was modified on the client.
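To make the layout concrete, here is a minimal parsing sketch. It rests on several assumptions that the analysis above does not confirm: that the payload uses the standard base64 alphabet, that the first four fields are ASCII text, and that the blob is raw bytes that may itself contain a semicolon (hence the bounded split). The names ChallengeToken and parse_challenge_token are mine, not Vercel's.

```python
import base64
from dataclasses import dataclass


@dataclass
class ChallengeToken:
    version: str
    timestamp: int
    ttl: int
    session_id: str
    seed: str
    target_hash: str
    difficulty: int
    blob: bytes
    signature: str


def parse_challenge_token(token: str) -> ChallengeToken:
    """Split a token of the form 2.{timestamp}.{ttl}.{base64(payload)}.{hmac-md5}."""
    version, timestamp, ttl, payload_b64, signature = token.split(".")
    # Re-add padding in case it was stripped (assumption: standard base64 alphabet).
    payload = base64.b64decode(payload_b64 + "=" * (-len(payload_b64) % 4))
    # Split at most 4 times so a ';' byte inside the blob stays intact.
    session_id, seed, target_hash, difficulty, blob = payload.split(b";", 4)
    return ChallengeToken(
        version=version,
        timestamp=int(timestamp),
        ttl=int(ttl),
        session_id=session_id.decode("ascii"),
        seed=seed.decode("ascii"),
        target_hash=target_hash.decode("ascii"),
        difficulty=int(difficulty),
        blob=blob,
        signature=signature,
    )
```

This only reads the token. The signature cannot be checked or regenerated without the server's key, which is the point of signing it.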

The Go WASM Runtime

The bulk of the deobfuscated script is a class I named GoWasmRuntime, a minimal reimplementation of Go's wasm_exec.js. It provides the syscall/js bridge that lets Go code interact with JavaScript.

Go passes JS values through 64-bit floats using NaN-boxing. Numbers travel as-is. Non-number values (objects, strings, functions) get stored in a _values table and referenced by index, with type tags packed into the upper 32 bits. It's a neat trick. Everything looks like a float to WASM, but the runtime knows better.
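A small sketch of the boxing scheme, written in Python so the bit layout is easy to inspect. The constants follow what I understand the upstream Go wasm_exec.js to use (a 0x7FF80000 NaN head and small integer type tags). Whether Vercel's reimplementation uses the same tag values is an assumption, and the function names are mine.

```python
import math
import struct

NAN_HEAD = 0x7FF80000  # upper 32 bits of a quiet NaN
# Type tags as in upstream wasm_exec.js (assumed to carry over here).
TYPE_FLAGS = {"object": 1, "string": 2, "symbol": 3, "function": 4}


def box_number(x: float) -> bytes:
    """Ordinary numbers travel as their plain little-endian float64 encoding."""
    return struct.pack("<d", x)


def box_ref(values: list, obj, kind: str) -> bytes:
    """Store obj in the values table and return 8 bytes that decode as NaN."""
    values.append(obj)
    ref_id = len(values) - 1
    # Low 32 bits: table index. High 32 bits: NaN head OR'd with the type tag.
    return struct.pack("<II", ref_id, NAN_HEAD | TYPE_FLAGS[kind])


def unbox(values: list, raw: bytes):
    """Return the float itself, or the table entry if the bits encode a NaN."""
    (as_float,) = struct.unpack("<d", raw)
    if not math.isnan(as_float):
        return as_float
    ref_id, _high = struct.unpack("<II", raw)
    return values[ref_id]
```

The sketch skips the special cases a real runtime needs, such as representing undefined, null, booleans, and NaN itself.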

Every Go-to-JS interaction goes through named syscalls: syscall/js.valueGet for property access, syscall/js.valueCall for method calls, syscall/js.valueNew for constructors. There's also a minimal WASI layer handling stdout (line-buffered to console.log), random number generation via crypto.getRandomValues, and process exit.

The WASM binary exports standard Go functions (_start, resume, go_scheduler) plus asyncify helpers that allow Go to yield across async JS calls.

The Solution

The WASM binary exposes a global Solve(token) function. When called, it returns a JSON object:

JSON
{
    "solution": "a88ee0d11389bff3;a5c630112da23bea;c83c7ae15fd731d2"
}

Three 8-byte nonces, the output of a hashcash-style proof-of-work. The WASM brute-forces these values so that when hashed with the challenge seed and target, they satisfy the difficulty level.
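For readers unfamiliar with hashcash, here is a generic sketch of the idea. It is not Vercel's algorithm: I did not recover the exact hash construction, so the use of SHA-1 over seed plus nonce and the "leading zero bits" difficulty rule are assumptions chosen purely for illustration.

```python
import hashlib


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(seed: bytes, difficulty_bits: int) -> bytes:
    """Brute-force an 8-byte nonce whose SHA-1(seed + nonce) meets the difficulty."""
    nonce = 0
    while True:
        candidate = nonce.to_bytes(8, "big")
        digest = hashlib.sha1(seed + candidate).digest()
        if leading_zero_bits(digest) >= difficulty_bits:
            return candidate
        nonce += 1


def verify(seed: bytes, nonce: bytes, difficulty_bits: int) -> bool:
    """Checking costs one hash, while finding the nonce costs many."""
    return leading_zero_bits(hashlib.sha1(seed + nonce).digest()) >= difficulty_bits
```

The asymmetry between solve and verify is what makes proof-of-work useful as a rate limiter: the client pays for the search, and the server pays for a single hash.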

The Session Cookie

On success, the server responds with 204 and sets:

Cookie
_vcrcs=1.{timestamp}.3600.{base64(session_id)}.{hmac}

This cookie is valid for 1 hour. All subsequent requests with this cookie bypass the challenge entirely.
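A short sketch of reading the expiry out of the cookie value. It assumes the timestamp field is the issue time in Unix seconds and the third field is the lifetime in seconds, which matches the one-hour validity but is not something I verified beyond that. SessionCookie and parse_vcrcs are hypothetical names.

```python
import time
from typing import NamedTuple, Optional


class SessionCookie(NamedTuple):
    issued_at: int  # assumed: Unix seconds
    ttl: int        # assumed: lifetime in seconds
    raw: str        # full cookie value, to be sent back unchanged

    def expires_at(self) -> int:
        return self.issued_at + self.ttl

    def is_fresh(self, now: Optional[float] = None, margin: int = 60) -> bool:
        """True while the cookie has more than `margin` seconds left."""
        current = time.time() if now is None else now
        return current < self.expires_at() - margin


def parse_vcrcs(value: str) -> SessionCookie:
    """Split 1.{timestamp}.{ttl}.{base64(session_id)}.{hmac} into its parts."""
    _version, timestamp, ttl, _session_b64, _signature = value.split(".")
    return SessionCookie(issued_at=int(timestamp), ttl=int(ttl), raw=value)
```

The safety margin is a design choice: refreshing slightly early avoids sending a request with a cookie that expires in flight.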

Where the Bot Detection Actually Lives

So here's where I was wrong. I assumed the challenge token, specifically that binary blob, contained browser fingerprint data. But think about it: the token is generated server-side before any JavaScript runs. It can only know what the server knows: IP address, TLS fingerprint, request headers.

The fingerprinting? It's in the WASM. The Go code has full access to globalThis through the syscall bridge, and the worker script exposes two functions that matter:

The WASM can call evalInMainThread() to probe anything in the main page context: navigator.webdriver, window.chrome, plugin arrays, screen dimensions, WebGL renderer strings. These fingerprint results get baked into the solution, not the token. The server verifies both the proof-of-work math and the embedded fingerprint data.

This is why I hit a wall earlier. I had an automated browser (Chrome driven over CDP, the Chrome DevTools Protocol) that computed a perfectly valid proof-of-work solution. Math checked out. Server still returned 708. The proof-of-work was fine, but the fingerprints baked into the solution screamed "automated." That rejection is the tell: Vercel doesn't just check your work, it checks who did it.

Running the WASM Solver Locally

Naturally, I tried pulling the Go WASM runtime into a standalone Node.js script to run the solver outside a browser:

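JavaScript
import fs from 'fs/promises';
// GoWasmRuntime: the class recovered from the deobfuscated worker script.
// The module path is hypothetical; point it at wherever you saved the class.
import { GoWasmRuntime } from './go-wasm-runtime.js';

// Challenge token copied from the 429 page, passed as the first CLI argument.
const token = process.argv[2];

const wasmBuffer = await fs.readFile('./challenge.v2.wasm');
const go = new GoWasmRuntime();
const { instance } = await WebAssembly.instantiate(wasmBuffer, go.importObject);

// Start the Go program; Solve shows up as a global once Go has initialized.
go.run(instance);

// Crude fixed delay to let Go finish initializing before calling Solve.
setTimeout(() => {
    const result = globalThis.Solve(token);
    console.log('Solution:', result);
}, 500);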

It works. The WASM loads, Go initializes, Solve() spits out nonces. But without a real browser behind those evalInMainThread() calls, the fingerprint data is garbage. You'd have to intercept every fingerprint probe and return convincing values. That's a cat-and-mouse game, and Vercel can change the questions any time by shipping a new WASM binary.

The Bypass: Stealth Browser + Wait for Challenge

So fighting the WASM is a dead end unless you want to reverse every fingerprint check and keep up with updates. The pragmatic move: let a real browser handle it. Use a stealth-patched browser, give the challenge time to complete, and grab the result. Most scraping tools fail here simply because they return the page before the WASM solver finishes.

Using Scrapling with its Patchright engine and stealth mode:

Python
from scrapling.fetchers import PlayWrightFetcher

def wait_for_vercel_challenge(page):
    page.wait_for_function(
        "() => !document.body.innerText.includes('verifying your browser')",
        timeout=30000,
    )
    return page

page = PlayWrightFetcher.fetch(
    url="https://target-site.com/",
    headless=True,
    stealth=True,
    network_idle=True,
    page_action=wait_for_vercel_challenge,
    timeout=60000,
)

print(page.status)       # 200
print(page.get_all_text())  # Actual page content

What makes this work:

  1. stealth=True: injects evasion scripts before page load that patch the exact signals the WASM probes:
    • navigator.webdriver overridden to return false with a native-looking getter
    • window.chrome fully mocked with chrome.app, chrome.csi(), chrome.loadTimes()
    • 5 fake browser plugins injected into navigator.plugins
    • Screen dimensions set to realistic values
    • __pwInitScripts deleted to remove Playwright's fingerprint
  2. network_idle=True: waits for all network activity to settle (the WASM fetch + solution POST)
  3. page_action: waits until the "verifying your browser" text disappears, meaning the challenge solved and the page reloaded with real content
  4. headless=True: the stealth patches make headless indistinguishable from headful for these checks

Result: 200 OK, full page content, challenge bypassed.

The Efficient Approach

Since the _vcrcs cookie lasts for 1 hour, you don't need a browser for every request. The optimal strategy:

  1. Launch a stealth browser once to solve the challenge and capture the _vcrcs cookie
  2. Use the cookie with plain HTTP requests (curl, requests, etc.) for the next 60 minutes
  3. When the cookie expires, repeat step 1

One browser launch per hour. Everything else is lightweight HTTP.
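The loop above can be sketched as a small cache. The browser step is injected as a callable (solve_in_browser, a hypothetical name) that returns a fresh _vcrcs value, so the sketch makes no claims about any particular browser library. The lifetime is the one hour stated above, and the refresh margin is my own choice.

```python
import time
import urllib.request
from typing import Callable, Optional

COOKIE_LIFETIME = 3600  # seconds, per the one-hour validity described above
REFRESH_MARGIN = 120    # refresh a little early (arbitrary choice)


class VcrcsCache:
    """Hands out a cached _vcrcs value and re-solves only when it is about to expire."""

    def __init__(self, solve_in_browser: Callable[[], str],
                 clock: Callable[[], float] = time.time):
        self._solve = solve_in_browser  # hypothetical: launches the stealth browser
        self._clock = clock
        self._value: Optional[str] = None
        self._captured_at = 0.0

    def cookie(self) -> str:
        age = self._clock() - self._captured_at
        if self._value is None or age > COOKIE_LIFETIME - REFRESH_MARGIN:
            self._value = self._solve()
            self._captured_at = self._clock()
        return self._value

    def request(self, url: str) -> urllib.request.Request:
        """Build a plain HTTP request that carries the cached cookie."""
        return urllib.request.Request(url, headers={"Cookie": f"_vcrcs={self.cookie()}"})
```

Pass the returned request to urllib.request.urlopen, or copy the header into whichever HTTP client you already use.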

Takeaways

Credit where it's due: this is well designed. WASM-based proof-of-work, a NaN-boxed Go-to-JS bridge, HMAC-signed tokens, and fingerprint data embedded in the solution rather than the token. That last part is the smartest decision in the whole system.

But the weakest link isn't the cryptography. It's the fingerprint checks. They boil down to JavaScript API responses like navigator.webdriver, window.chrome, plugin arrays, and all of those can be spoofed with init scripts that run before the page loads.

The evalInMainThread() bridge is clever. Letting WASM in a worker probe the main thread's DOM and navigator objects is a nice architectural choice. But it also means the checks are only as strong as the WASM's ability to tell real API responses from faked ones. And right now, that's not very strong.

Compiling the fingerprint logic into WASM does raise the bar compared to plain JS checks. You can't just read the source and see what's being checked. But the syscall bridge is observable, and the questions it asks have known answers.