Crawl Sitemap - Context.dev

import ContextDev from 'context.dev';

const client = new ContextDev({
  apiKey: process.env['CONTEXT_DEV_API_KEY'], // This is the default and can be omitted
});

const response = await client.web.webScrapeSitemap({ domain: 'domain' });

console.log(response.domain);

{
  "success": true,
  "domain": "<string>",
  "urls": [
    "<string>"
  ],
  "meta": {
    "sitemapsDiscovered": 123,
    "sitemapsFetched": 123,
    "sitemapsSkipped": 123,
    "errors": 123
  },
  "key_metadata": {
    "credits_consumed": 123,
    "credits_remaining": 123
  }
}

Authorizations

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
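
For callers who make raw HTTP requests rather than using the SDK, the header format above can be sketched as follows. `buildAuthHeaders` is a hypothetical helper name, not part of the Context.dev SDK, and the token value is whatever your account provides.

```typescript
// Build the Authorization header in the documented form: "Bearer <token>".
// `buildAuthHeaders` is a hypothetical helper, not an SDK function.
function buildAuthHeaders(token: string): Record<string, string> {
  const trimmed = token.trim();
  if (trimmed.length === 0) {
    // Fail early instead of sending a malformed "Bearer " header.
    throw new Error("API token must not be empty");
  }
  return { Authorization: `Bearer ${trimmed}` };
}
```

The returned object can be passed as the `headers` option of `fetch` or a similar HTTP client.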

Query Parameters

domain

string

required

Domain to build a sitemap for

maxLinks

integer

default:10000

Maximum number of links to return from the sitemap crawl. Defaults to 10,000. Minimum is 1, maximum is 100,000.

Required range: 1 <= x <= 100000

urlRegex

string

Optional RE2-compatible regex pattern. Only URLs matching this pattern are returned and counted against maxLinks.

Maximum string length: 256

Example:

"^https?://[^/]+/blog/"
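
To preview what the example pattern would keep, it can be tried locally before sending the request. This is only a sketch: the server evaluates the pattern with RE2, while the code below uses JavaScript's `RegExp`. The two agree for this pattern because it uses only syntax common to both engines. The sample URLs are made up for illustration.

```typescript
// The example pattern from the urlRegex parameter description.
const blogPattern = new RegExp("^https?://[^/]+/blog/");

// Hypothetical sample URLs, for illustration only.
const candidates: string[] = [
  "https://example.com/blog/launch-notes",
  "http://example.com/blog/",
  "https://example.com/pricing",
  "https://example.com/docs/blog/archive",
];

// Keep only URLs whose first path segment is "blog".
const matching: string[] = candidates.filter((url) => blogPattern.test(url));

console.log(matching);
```

Only the first two URLs match. The last one is rejected because `[^/]+` cannot span a slash, so `/blog/` must directly follow the host.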

headers

object

Optional outbound HTTP headers forwarded only to the target URL, sent as deep-object query params such as headers[X-Custom]=value. When provided, caching is bypassed: the result is neither read from nor written to cache.
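
The deep-object form `headers[X-Custom]=value` can be produced with the standard `URLSearchParams` class. The sketch below builds only the query string; `appendForwardedHeaders` is a hypothetical helper name, and the header name and value are placeholders.

```typescript
// Serialize custom headers as deep-object query params: headers[Name]=value.
// `appendForwardedHeaders` is a hypothetical helper, not an SDK function.
function appendForwardedHeaders(
  params: URLSearchParams,
  headers: Record<string, string>,
): URLSearchParams {
  for (const [headerName, headerValue] of Object.entries(headers)) {
    params.append(`headers[${headerName}]`, headerValue);
  }
  return params;
}

const sitemapParams = appendForwardedHeaders(
  new URLSearchParams({ domain: "example.com" }),
  { "X-Custom": "value" },
);

// URLSearchParams percent-encodes the square brackets when serializing.
console.log(sitemapParams.toString());
```

Because supplying `headers` bypasses the cache, it is probably worth adding them only when the target site actually needs them.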


timeoutMS

integer

Optional timeout in milliseconds for the request. If the request takes longer than this value, it will be aborted with a 408 status code. Maximum allowed value is 300000ms (5 minutes).

Required range: 1000 <= x <= 300000
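
The documented ranges can be checked on the client before a request is sent, which avoids spending a round trip on a value the API would reject. The following is a minimal sketch: `SitemapQuery`, `assertIntegerInRange`, and `buildSitemapQuery` are hypothetical names, and the 256 limit on `urlRegex` is assumed here to count UTF-16 code units, which the reference does not specify.

```typescript
// Hypothetical shape mirroring the documented query parameters.
interface SitemapQuery {
  domain: string;
  maxLinks?: number; // documented range: 1 to 100000, default 10000
  urlRegex?: string; // documented maximum length: 256
  timeoutMS?: number; // documented range: 1000 to 300000
}

function assertIntegerInRange(label: string, value: number, min: number, max: number): void {
  if (!Number.isInteger(value) || value < min || value > max) {
    throw new RangeError(`${label} must be an integer between ${min} and ${max}, got ${value}`);
  }
}

// Validate optional parameters and collect them into a query string.
function buildSitemapQuery(query: SitemapQuery): URLSearchParams {
  const params = new URLSearchParams({ domain: query.domain });
  if (query.maxLinks !== undefined) {
    assertIntegerInRange("maxLinks", query.maxLinks, 1, 100_000);
    params.set("maxLinks", String(query.maxLinks));
  }
  if (query.urlRegex !== undefined) {
    // Assumption: the limit is measured in string length as JavaScript reports it.
    if (query.urlRegex.length > 256) {
      throw new RangeError("urlRegex must be at most 256 characters");
    }
    params.set("urlRegex", query.urlRegex);
  }
  if (query.timeoutMS !== undefined) {
    assertIntegerInRange("timeoutMS", query.timeoutMS, 1000, 300_000);
    params.set("timeoutMS", String(query.timeoutMS));
  }
  return params;
}
```

Omitted optional fields are left out of the query string so that the server-side defaults apply.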

Response

Successful response

success

enum<boolean>

required

Indicates success

Available options:

true

domain

string

required

The normalized domain that was crawled

urls

string[]

required

Array of discovered page URLs from the sitemap (max 500)
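
When the endpoint is called without the SDK, the response fields above can be modeled and checked at runtime. The sketch below is an assumption-laden outline: `SitemapResponse` and `isSitemapResponse` are hypothetical names, and `meta` and `key_metadata` are marked optional because they appear in the sample response but their requirements are not listed in this section.

```typescript
// Hypothetical type mirroring the documented response fields.
interface SitemapResponse {
  success: true;
  domain: string;
  urls: string[];
  // Seen in the sample response; treated as optional here by assumption.
  meta?: {
    sitemapsDiscovered: number;
    sitemapsFetched: number;
    sitemapsSkipped: number;
    errors: number;
  };
  key_metadata?: {
    credits_consumed: number;
    credits_remaining: number;
  };
}

// Runtime check for the three required fields only.
function isSitemapResponse(value: unknown): value is SitemapResponse {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const record = value as Record<string, unknown>;
  const urls = record["urls"];
  return (
    record["success"] === true &&
    typeof record["domain"] === "string" &&
    Array.isArray(urls) &&
    urls.every((entry) => typeof entry === "string")
  );
}

// Made-up payload for illustration.
const samplePayload: unknown = JSON.parse(
  '{"success":true,"domain":"example.com","urls":["https://example.com/"]}',
);

if (isSitemapResponse(samplePayload)) {
  console.log(`${samplePayload.domain}: ${samplePayload.urls.length} URL(s)`);
}
```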