GET /web/scrape/sitemap
JavaScript
import ContextDev from 'context.dev';

const client = new ContextDev({
  apiKey: process.env['CONTEXT_DEV_API_KEY'], // This is the default and can be omitted
});

const response = await client.web.webScrapeSitemap({ domain: 'domain' });

console.log(response.domain);
{
  "domain": "<string>",
  "urls": [
    "<string>"
  ],
  "meta": {
    "sitemapsDiscovered": 123,
    "sitemapsFetched": 123,
    "sitemapsSkipped": 123,
    "errors": 123
  }
}
1 Credit

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
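
For callers who do not use the SDK, the header format above can be sketched as a small request builder. This is a minimal sketch, not the official client: `BASE_URL` is a hypothetical placeholder (the real API host is not given on this page), and `buildSitemapRequest` is an illustrative helper name.

```javascript
// Hypothetical base URL: replace with the real API host for your account.
const BASE_URL = 'https://api.example.com';

// Build the URL and headers for GET /web/scrape/sitemap without sending anything.
function buildSitemapRequest(token, domain) {
  const url = new URL('/web/scrape/sitemap', BASE_URL);
  url.searchParams.set('domain', domain);
  return {
    url: url.toString(),
    // Bearer authentication header of the form "Bearer <token>".
    headers: { Authorization: `Bearer ${token}` },
  };
}

const request = buildSitemapRequest('my-token', 'example.com');
console.log(request.url);
console.log(request.headers.Authorization);
```

The returned object can be passed to `fetch(request.url, { headers: request.headers })` once the placeholder host is replaced.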

Query Parameters

domain
string
required

Domain to build a sitemap for

maxLinks
integer

Maximum number of links to return from the sitemap crawl. Defaults to 10,000. Minimum is 1, maximum is 100,000.

Required range: 1 <= x <= 100000
urlRegex
string

Optional RE2-compatible regex pattern. Only URLs matching this pattern are returned and counted against maxLinks.

Maximum string length: 256
Example:

"^https?://[^/]+/blog/"

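To see which URLs the example pattern would keep, it can be tried locally before sending a request. This is a rough sketch: JavaScript's `RegExp` is not an RE2 engine, but this particular pattern uses only syntax the two share. The candidate URLs are made up for illustration.

```javascript
// The example pattern from the urlRegex parameter, as a JavaScript RegExp.
const blogOnly = new RegExp('^https?://[^/]+/blog/');

// Illustrative URLs only; they are not real crawl output.
const candidates = [
  'https://example.com/blog/launch-notes',
  'http://example.com/blog/',
  'https://example.com/pricing',
  'https://example.com/docs/blog/archive',
];

// Keep only URLs whose first path segment is /blog/.
const kept = candidates.filter((url) => blogOnly.test(url));
console.log(kept);
```

The last candidate is rejected because `[^/]+` cannot span a slash, so `/blog/` must come directly after the host.
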
headers
object

Optional outbound HTTP headers forwarded only to the target URL, sent as deep-object query params such as headers[X-Custom]=value. When provided, caching is bypassed: the result is neither read from nor written to cache.
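
The deep-object form described above can be sketched with `URLSearchParams`. `appendHeaderParams` is a hypothetical helper, not part of the SDK, and the sketch assumes the server decodes percent-encoded brackets, as is usual for query strings.

```javascript
// Add each outbound header as a headers[Name]=value query parameter.
function appendHeaderParams(searchParams, headers) {
  for (const [name, value] of Object.entries(headers)) {
    searchParams.append(`headers[${name}]`, value);
  }
  return searchParams;
}

const params = new URLSearchParams({ domain: 'example.com' });
appendHeaderParams(params, { 'X-Custom': 'value' });

// URLSearchParams percent-encodes the brackets when serializing.
console.log(params.toString());
```

Because supplying `headers` bypasses the cache, it may be worth sending them only when the target site actually needs them.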

timeoutMS
integer

Optional request timeout in milliseconds. A request that runs longer than this value is aborted with a 408 status code. Allowed values range from 1000 ms (1 second) to 300000 ms (5 minutes).

Required range: 1000 <= x <= 300000
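
A client-side check can catch an out-of-range `timeoutMS` before the request is sent. This is a minimal sketch using the documented range; `checkTimeoutMS` is a hypothetical helper name, and how the API itself responds to an invalid value is not stated on this page.

```javascript
const MIN_TIMEOUT_MS = 1000;
const MAX_TIMEOUT_MS = 300000;

// Return the value unchanged if it is an integer within the documented range.
function checkTimeoutMS(value) {
  if (!Number.isInteger(value)) {
    throw new TypeError('timeoutMS must be an integer');
  }
  if (value < MIN_TIMEOUT_MS || value > MAX_TIMEOUT_MS) {
    throw new RangeError(
      `timeoutMS must be between ${MIN_TIMEOUT_MS} and ${MAX_TIMEOUT_MS}`
    );
  }
  return value;
}

console.log(checkTimeoutMS(30000));
```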

Response

Successful response

success
enum<boolean>
required

Indicates success

Available options:
true
domain
string
required

The normalized domain that was crawled

urls
string[]
required

Array of discovered page URLs from the sitemap (max 500)

meta
object
required

Metadata about the sitemap crawl operation
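
The response fields can be condensed into a short summary for logging. This is a sketch under assumptions: `summarizeCrawl` is a hypothetical helper, the `meta` field names come from the example response on this page, and treating `errors` as a count of failures is an interpretation rather than something the page states. The sample object is made up.

```javascript
// Condense a sitemap response into a few fields suitable for logging.
function summarizeCrawl(response) {
  const { domain, urls, meta } = response;
  return {
    domain,
    urlCount: urls.length,
    sitemapsSkipped: meta.sitemapsSkipped,
    // Assumption: a non-zero errors value means some part of the crawl failed.
    hadErrors: meta.errors > 0,
  };
}

// Illustrative response shaped like the documented schema.
const sample = {
  success: true,
  domain: 'example.com',
  urls: ['https://example.com/', 'https://example.com/blog/'],
  meta: { sitemapsDiscovered: 3, sitemapsFetched: 2, sitemapsSkipped: 1, errors: 0 },
};

const summary = summarizeCrawl(sample);
console.log(summary);
```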