Crawl Sitemap
Crawl an entire website’s sitemap and return all discovered page URLs.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
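As a minimal sketch of the header format described above, the helper below builds the Authorization header from a token. The function name `bearer_headers` is hypothetical and is not part of the API.

```python
def bearer_headers(token: str) -> dict:
    """Return request headers carrying the token in 'Bearer <token>' form.

    `bearer_headers` is an illustrative helper, not part of the API.
    """
    if not token or token != token.strip():
        raise ValueError("token must be non-empty with no surrounding whitespace")
    return {"Authorization": f"Bearer {token}"}
```

The returned dictionary can be passed as the headers argument of whichever HTTP client you use.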
Query Parameters
Domain to build a sitemap for
Maximum number of links to return from the sitemap crawl. Defaults to 10,000. Minimum is 1, maximum is 100,000.
Range: 1 <= x <= 100000
Optional RE2-compatible regex pattern. Only URLs matching this pattern are returned and counted against maxLinks.
Maximum length: 256
Example: "^https?://[^/]+/blog/"
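To preview which URLs the example pattern would keep, it can be tried locally before sending a request. This sketch uses Python's `re` module, which is not RE2 but agrees with it for this simple pattern. It assumes the 256 above is a maximum pattern length and that matching is unanchored unless the pattern anchors itself. `filter_urls` is a hypothetical helper.

```python
import re

EXAMPLE_PATTERN = r"^https?://[^/]+/blog/"
MAX_PATTERN_LENGTH = 256  # assumption: the documented 256 is a length limit


def filter_urls(urls, pattern=EXAMPLE_PATTERN):
    """Return only the URLs that the pattern matches (local approximation)."""
    if len(pattern) > MAX_PATTERN_LENGTH:
        raise ValueError("pattern exceeds 256 characters")
    compiled = re.compile(pattern)
    # search() is enough here because the example pattern starts with '^'.
    return [url for url in urls if compiled.search(url)]
```

Avoid backreferences and lookarounds in patterns sent to the API, since RE2 does not support them even though Python's `re` does.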
Optional outbound HTTP headers forwarded only to the target URL, sent as deep-object query params such as headers[X-Custom]=value. When provided, caching is bypassed: the result is neither read from nor written to cache.
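The deep-object form can be produced with the standard library. The sketch below turns a dictionary of outbound headers into `headers[Name]=value` pairs. The helper name `encode_forwarded_headers` is hypothetical, and leaving the square brackets unescaped is a readability choice, not a documented requirement.

```python
from urllib.parse import urlencode


def encode_forwarded_headers(headers: dict) -> str:
    """Encode outbound headers as deep-object query parameters."""
    pairs = [(f"headers[{name}]", value) for name, value in headers.items()]
    # safe="[]" keeps the brackets literal, matching the form shown in the docs;
    # percent-encoded brackets (%5B, %5D) are the stricter alternative.
    return urlencode(pairs, safe="[]")
```

For example, `encode_forwarded_headers({"X-Custom": "value"})` yields `headers[X-Custom]=value`. Because supplying any such header bypasses the cache, send them only when the target needs them.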
Optional timeout in milliseconds for the request. If the request takes longer than this value, it will be aborted with a 408 status code. Maximum allowed value is 300000ms (5 minutes).
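A client can check the timeout against the documented range of 1000 to 300000 ms before sending it, and give its own HTTP client a slightly longer deadline so the server's 408 arrives before the local connection is dropped. Both helper names and the 5-second margin are assumptions made for illustration.

```python
MIN_TIMEOUT_MS = 1_000
MAX_TIMEOUT_MS = 300_000  # 5 minutes


def validate_timeout_ms(timeout_ms: int) -> int:
    """Return timeout_ms unchanged if it lies within the documented range."""
    if isinstance(timeout_ms, bool) or not isinstance(timeout_ms, int):
        raise TypeError("timeout must be an integer number of milliseconds")
    if not MIN_TIMEOUT_MS <= timeout_ms <= MAX_TIMEOUT_MS:
        raise ValueError(
            f"timeout must be between {MIN_TIMEOUT_MS} and {MAX_TIMEOUT_MS} ms"
        )
    return timeout_ms


def client_timeout_seconds(timeout_ms: int, margin_s: float = 5.0) -> float:
    """Local deadline in seconds: the server timeout plus a safety margin."""
    return validate_timeout_ms(timeout_ms) / 1000 + margin_s
```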
Range: 1000 <= x <= 300000
Response
Successful response
Indicates success
Example: true
The normalized domain that was crawled
Array of discovered page URLs from the sitemap (max 500)
Metadata about the sitemap crawl operation
Metadata about the API key used for the request. Included in every response whenever a valid API key is provided, even when the response status is not 200.