Send Accept-Encoding: gzip with your request, or use a client that sends it automatically, and Context.dev returns compressed JSON when the response is large enough to benefit.
This is most useful for response-heavy endpoints like /web/scrape/html, /web/scrape/markdown, and /web/crawl, all covered in the Scrape Websites guide. In production benchmarks on scrape responses, gzip reduced bytes on the wire by about 82%. Large payloads, especially 500KB and above, saw a median latency improvement of about 70ms, while smaller responses were roughly neutral.
For brand, quality, extraction, and classification-style endpoints, gzip is still supported as an option. The benefit depends on how large the response is: small JSON payloads may not move much, while larger AI or crawl-derived responses can benefit from the same transfer-size reduction.
Enable gzip
Most HTTP clients already request gzip and decompress the response automatically. The main rule is: don’t override Accept-Encoding with identity unless you are deliberately measuring an uncompressed baseline.
With requests and the official SDKs, the decoded JSON is what your application sees. The compression and decompression happen at the HTTP layer.
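Python's standard-library urllib.request is one client that neither requests gzip nor decodes it for you, so it makes the HTTP-layer steps visible. The following is a minimal sketch, not an official integration: fetch_json and decode_json_body are hypothetical helper names, and the bearer-token Authorization header is an assumption, so check the API reference for the real authentication scheme.

```python
import gzip
import json
import urllib.request
from typing import Any, Optional


def decode_json_body(body: bytes, content_encoding: Optional[str]) -> Any:
    """Decode a JSON response body, undoing gzip if the server applied it."""
    if content_encoding and content_encoding.strip().lower() == "gzip":
        body = gzip.decompress(body)
    return json.loads(body.decode("utf-8"))


def fetch_json(url: str, api_key: str, timeout: float = 30.0) -> Any:
    """GET a JSON resource while advertising gzip support."""
    request = urllib.request.Request(
        url,
        headers={
            # Tell the server that a gzip-compressed response is acceptable.
            "Accept-Encoding": "gzip",
            # Assumption: bearer-token auth. Replace with the documented scheme.
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # urllib does not decompress automatically, so check Content-Encoding.
        encoding = response.headers.get("Content-Encoding")
        return decode_json_body(response.read(), encoding)
```

Clients that handle gzip on their own need none of this; the sketch only applies when you work at the raw HTTP level.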
Verify compression
To confirm gzip is being negotiated, inspect the response headers. If you do not see a content-encoding header, the response may be small enough that compression is not worth applying, or your client or proxy may have already decompressed it before exposing headers.
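The header check can be sketched as a small helper that takes whatever header mapping your HTTP client exposes. The name describe_compression and its return values are hypothetical, chosen only for illustration. Header names are compared case-insensitively, as HTTP requires.

```python
from typing import Mapping


def describe_compression(headers: Mapping[str, str]) -> str:
    """Summarize the Content-Encoding of a response from its headers.

    Returns "gzip", "other: <encoding>", or "absent".
    """
    # HTTP header names are case-insensitive, so normalize before looking up.
    normalized = {name.lower(): value.strip().lower() for name, value in headers.items()}
    encoding = normalized.get("content-encoding")
    if encoding == "gzip":
        return "gzip"
    if encoding:
        return f"other: {encoding}"
    # Absent header: the response was sent uncompressed, or a client or proxy
    # decompressed it and dropped the header before your code saw it.
    return "absent"
```

An "absent" result is therefore not proof that compression is disabled; compare against a large response before drawing conclusions.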
Where compression helps most
Gzip improves transfer size for any compressible JSON response, but the latency impact depends on payload size and network conditions.

| Response shape | Expected impact |
|---|---|
| Large rendered HTML | Highest benefit. HTML compresses well, and /web/scrape/html can return hundreds of KB or multiple MB. |
| Full-site crawl responses | High benefit. /web/crawl can return many Markdown documents in one JSON response. |
| Markdown scrape responses | Moderate benefit. Markdown is usually smaller than HTML, but large pages still benefit. |
| Brand and quality-style JSON responses | Optional. Use gzip when available, but expect smaller gains unless the response is large. |
| Very small responses | Usually neutral. The response is already small, so transfer savings are limited. |
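To get a feel for why large HTML benefits while very small responses are roughly neutral, you can compare raw and gzipped sizes locally. This sketch uses synthetic payloads, not real API responses, and gzip_savings is a hypothetical helper. The repeated markup here is far more repetitive than a real page, so treat the ratio as illustrative rather than as a benchmark.

```python
import gzip
import json
from typing import Any, Tuple


def gzip_savings(payload: Any) -> Tuple[int, int]:
    """Return (raw_bytes, gzipped_bytes) for a JSON-serializable payload."""
    raw = json.dumps(payload).encode("utf-8")
    return len(raw), len(gzip.compress(raw))


# Synthetic stand-ins for a large HTML scrape and a very small response.
large_html = {"html": '<div class="card"><a href="/item">Item</a></div>\n' * 5000}
tiny = {"ok": True}

for label, payload in (("large HTML", large_html), ("tiny JSON", tiny)):
    raw_size, gz_size = gzip_savings(payload)
    print(f"{label}: {raw_size} bytes raw, {gz_size} bytes gzipped")
```

For the tiny payload, the gzip header and trailer outweigh any savings, which is one reason a server may skip compression for small responses.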
Keep connections warm
Compression reduces bytes on the wire, but connection setup can still dominate latency. Reuse HTTP connections instead of creating a fresh client for every request.
Combine with payload controls
Compression works best alongside endpoint parameters that avoid returning bytes you do not need:
- Use useMainContentOnly=true when you only need the main article or page body.
- Prefer /web/scrape/markdown over /web/scrape/html when plain text is enough.
- Keep includeImages=false and shortenBase64Images=true for Markdown scrapes unless image references are required.
- Use includeSelectors and excludeSelectors to narrow scrape output to the relevant page regions.
- Use maxAgeMs for repeated scrapes so cached responses can be served quickly.
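Connection reuse, gzip, and payload controls can be combined in one small client. The following is a sketch under several assumptions that this page does not confirm: the host api.example.com is a placeholder, the endpoint is called with GET and query-string parameters, the target page is passed as a url parameter, and authentication uses a bearer token. WarmClient and build_scrape_path are hypothetical names. Only the control parameter names (useMainContentOnly, includeImages, shortenBase64Images, maxAgeMs) come from the list above.

```python
import gzip
import http.client
import json
from typing import Any, Dict, Optional
from urllib.parse import urlencode


def build_scrape_path(page_url: str, **controls: Any) -> str:
    """Build a /web/scrape/markdown request path with payload controls."""
    # Assumption: the page to scrape is passed as a `url` query parameter.
    params: Dict[str, str] = {"url": page_url}
    for name, value in controls.items():
        if isinstance(value, bool):
            value = "true" if value else "false"  # lowercase booleans
        params[name] = str(value)
    return "/web/scrape/markdown?" + urlencode(params)


class WarmClient:
    """Keeps one HTTP connection open and reuses it across requests."""

    def __init__(self, host: str, api_key: str, port: Optional[int] = None,
                 timeout: float = 30.0) -> None:
        # The socket is opened lazily on the first request, then reused.
        self._conn = http.client.HTTPSConnection(host, port, timeout=timeout)
        self._headers = {
            "Accept-Encoding": "gzip",
            # Assumption: bearer-token auth. Replace with the documented scheme.
            "Authorization": f"Bearer {api_key}",
        }

    def get_json(self, path: str) -> Any:
        self._conn.request("GET", path, headers=self._headers)
        response = self._conn.getresponse()
        body = response.read()  # read fully so the connection can be reused
        if (response.getheader("Content-Encoding") or "").lower() == "gzip":
            body = gzip.decompress(body)
        return json.loads(body.decode("utf-8"))

    def close(self) -> None:
        self._conn.close()


# Usage sketch (placeholder host and key, so it is left commented out):
# client = WarmClient("api.example.com", "YOUR_API_KEY")
# path = build_scrape_path("https://example.com/page", useMainContentOnly=True,
#                          includeImages=False, shortenBase64Images=True,
#                          maxAgeMs=60000)
# first = client.get_json(path)
# second = client.get_json(path)  # sent over the same connection
# client.close()
```

This sketch omits error handling. A production client would also check the response status and retry when an idle connection has been closed by the server.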
Related resources
Scrape websites
Use scrape and crawl endpoints that benefit most from response compression.
Integration best practices
Production patterns for caching, timeouts, retries, and background jobs.
Prefetching
Warm brand lookups before user-facing requests.
Troubleshooting
Diagnose timeouts, retries, and slow requests.