Chrome replaces a GET request with a huge range request

September 16, 2022

Background

I use PDF.js to show large scanned documents in a project. The backend guys configured the web server to support HTTP Range Requests for better rendering performance.

PDF.js supports this feature with three related options:

disableRange
disableStream
disableAutoFetch

See the comments in the source code.

// options used in the project.
const options = {
  // ...
  disableRange: false,
  disableStream: false,
  disableAutoFetch: false, // Auto-fetch pages after the first view is displayed, when disableStream is enabled, for better performance.
  // ...
};
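
As a rough sketch of how these options might reach PDF.js, assuming they are passed to pdfjsLib.getDocument (the post does not say how the project wires them up). The helper name buildLoadParams and the example URL are hypothetical:

```javascript
// Hypothetical helper: merge the project's defaults with per-document overrides.
function buildLoadParams(url, overrides = {}) {
  return {
    url,
    disableRange: false,     // allow HTTP range requests
    disableStream: false,    // allow streaming of the initial response
    disableAutoFetch: false, // allow pre-fetching beyond the visible pages
    ...overrides,
  };
}

// Usage (assumes pdfjsLib is already loaded on the page):
// const loadingTask = pdfjsLib.getDocument(buildLoadParams("/docs/scan.pdf"));
// const pdf = await loadingTask.promise;
```

Keeping the three flags in one place makes it easy to flip a single option while debugging.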

Render Process

Issue a GET request to fetch the PDF document.
Once the response headers have arrived:
- Cancel the GET request as soon as possible (this is what disableStream means).
- Issue range requests to fetch the data the viewer needs to display the first pages.
As the user scrolls, send more range requests to fetch the necessary data.
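
The steps above can be approximated with plain fetch calls. This is only an illustration of the request pattern, not PDF.js's actual implementation; the function names and the 65536-byte chunk size are assumptions chosen for the example, and the server is assumed to send Content-Length and Accept-Ranges headers.

```javascript
// Hypothetical helper: split `length` bytes into inclusive byte ranges.
function byteRanges(length, chunkSize) {
  const ranges = [];
  for (let start = 0; start < length; start += chunkSize) {
    const end = Math.min(start + chunkSize, length) - 1; // Range ends are inclusive
    ranges.push({ start, end, header: `bytes=${start}-${end}` });
  }
  return ranges;
}

// Step 1 and 2a: start a normal GET, read the headers, then abort the body.
async function probeDocument(url) {
  const controller = new AbortController();
  const response = await fetch(url, { signal: controller.signal });
  const length = Number(response.headers.get("Content-Length"));
  const supportsRange = response.headers.get("Accept-Ranges") === "bytes";
  controller.abort(); // stop downloading the rest of the file
  return { length, supportsRange };
}

// Step 2b and 3: fetch one chunk; a server that honors the range answers 206.
async function fetchRange(url, range) {
  const response = await fetch(url, { headers: { Range: range.header } });
  if (response.status !== 206) {
    throw new Error(`Expected 206 Partial Content, got ${response.status}`);
  }
  return new Uint8Array(await response.arrayBuffer());
}
```

In the bug described below, the first request no longer looks like the plain GET in probeDocument: it shows up as a single range request that covers the whole file.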

The Bug

When the user reloads the page after viewing all pages, the first GET request becomes a huge range request that downloads the whole document, which hurts performance badly.

How to Resolve

First, I tried disabling the cache, and the bug disappeared. Interesting: it seemed to be related to the browser's cache policy.

I searched for range request caching issues; it seems that caching of range requests is not perfectly supported.

So I inspected the response headers of the PDF document. The first GET request seemed normal. The subsequent range requests came back with Cache-Control: public, max-age=345600. I guessed that the public cache policy might be causing the bug, so I shared my guess with the DevOps and backend guys, and they removed public to verify it. The bug disappeared. Problem solved!
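
For anyone who wants to repeat the header check outside DevTools, here is a small sketch. The helper names are hypothetical, the parsing is deliberately simple (it does not handle quoted values), and the real fix was made in the web server configuration, not in JavaScript.

```javascript
// Parse a Cache-Control value into a Map of directive -> argument (or true).
function parseCacheControl(value) {
  const directives = new Map();
  for (const part of value.split(",")) {
    const [name, arg] = part.trim().split("=");
    if (name) directives.set(name.toLowerCase(), arg === undefined ? true : arg);
  }
  return directives;
}

// Show what the header looks like once the `public` directive is removed.
function withoutPublic(value) {
  return value
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part && part.toLowerCase() !== "public")
    .join(", ");
}

// Issue one small range request and report its caching header.
// Assumes a same-origin URL so the Range header does not trigger a preflight.
async function inspectRangeCaching(url) {
  const response = await fetch(url, { headers: { Range: "bytes=0-1023" } });
  const cacheControl = response.headers.get("Cache-Control") ?? "";
  return { status: response.status, directives: parseCacheControl(cacheControl) };
}
```

With the header from this post, withoutPublic("public, max-age=345600") yields "max-age=345600", which matches the change the backend made.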

Conclusion

It seems Chrome makes some magic decisions when the cache policy is inappropriate. I tried to find some theory to support my guess, but found nothing.