import ContextDev from 'context.dev';

const client = new ContextDev({
  apiKey: process.env['CONTEXT_DEV_API_KEY'], // This is the default and can be omitted
});

const response = await client.web.webCrawlMd({ url: 'https://example.com' });
console.log(response.metadata);

{
  "results": [
    {
      "markdown": "<string>",
      "metadata": {
        "url": "<string>",
        "title": "<string>",
        "crawlDepth": 123,
        "statusCode": 123,
        "success": true
      }
    }
  ],
  "metadata": {
    "numUrls": 123,
    "maxCrawlDepth": 123,
    "numSucceeded": 123,
    "numFailed": 123,
    "numSkipped": 123
  }
}

Performs a crawl starting from a given URL, extracts page content as Markdown, and returns results for all crawled pages.
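The response shape can be written down as a few TypeScript types, which makes it easier to pick out the pages that did not scrape cleanly. This is a minimal sketch: the type and function names (`CrawlResponse`, `failedPages`) are illustrative and are not SDK exports, the field types are inferred from the placeholder values in the sample response, and it assumes that failed pages appear in `results` with `success: false`.

```typescript
// Types inferred from the sample response; names are illustrative, not SDK exports.
interface PageMetadata {
  url: string;
  title: string;
  crawlDepth: number;
  statusCode: number;
  success: boolean;
}

interface CrawlResult {
  markdown: string;
  metadata: PageMetadata;
}

interface CrawlResponse {
  results: CrawlResult[];
  metadata: {
    numUrls: number;
    maxCrawlDepth: number;
    numSucceeded: number;
    numFailed: number;
    numSkipped: number;
  };
}

// Return the URL and status code of every page whose per-page `success` flag is false.
function failedPages(response: CrawlResponse): Array<{ url: string; statusCode: number }> {
  return response.results
    .filter((r) => !r.metadata.success)
    .map((r) => ({ url: r.metadata.url, statusCode: r.metadata.statusCode }));
}

// Hand-written sample data in the documented shape (not real crawl output).
const sample: CrawlResponse = {
  results: [
    {
      markdown: "# Home",
      metadata: { url: "https://example.com/", title: "Home", crawlDepth: 0, statusCode: 200, success: true },
    },
    {
      markdown: "",
      metadata: { url: "https://example.com/missing", title: "", crawlDepth: 1, statusCode: 404, success: false },
    },
  ],
  metadata: { numUrls: 2, maxCrawlDepth: 1, numSucceeded: 1, numFailed: 1, numSkipped: 0 },
};

console.log(failedPages(sample));
```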
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
The starting URL for the crawl (must include http:// or https:// protocol)
Maximum number of pages to crawl. Hard cap: 500.
Allowed range for the page limit: 1 <= x <= 500
Maximum link depth from the starting URL (0 = only the starting page)
Allowed range for the link depth: x >= 0
Regex pattern. Only URLs matching this pattern will be followed and scraped.
"^https?://[^/]+/blog/"
Preserve hyperlinks in the Markdown output
Include image references in the Markdown output
Truncate base64-encoded image data in the Markdown output
Extract only the main content, stripping headers, footers, sidebars, and navigation
When true, follow links on subdomains of the starting URL's domain (e.g. docs.example.com when starting from example.com). www and apex are always treated as equivalent.
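The scoping rule described above can be sketched as a small host comparison. This is one possible reading, not the service's implementation: `isHostInScope` and `stripWww` are hypothetical helpers, and the sketch assumes the "domain" is simply the starting host with any leading `www.` removed.

```typescript
// Hypothetical helper: drop a leading "www." so that www and apex compare equal.
function stripWww(host: string): string {
  const lower = host.toLowerCase();
  return lower.indexOf("www.") === 0 ? lower.slice(4) : lower;
}

// Hypothetical helper: is a link's host in scope for a crawl that started at `startHost`?
function isHostInScope(startHost: string, linkHost: string, followSubdomains: boolean): boolean {
  const start = stripWww(startHost);
  const link = stripWww(linkHost);
  if (link === start) {
    return true; // same host, or the www/apex pair
  }
  // A subdomain ends with ".<start>"; the leading dot rules out e.g. "notexample.com".
  const suffix = "." + start;
  return followSubdomains && link.length > suffix.length && link.slice(-suffix.length) === suffix;
}

console.log(isHostInScope("example.com", "www.example.com", false)); // true
console.log(isHostInScope("example.com", "docs.example.com", false)); // false
console.log(isHostInScope("example.com", "docs.example.com", true)); // true
```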
PDF parsing controls. Use start/end to limit text extraction and OCR to an inclusive 1-based page range.
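An inclusive, 1-based range is easy to get off by one when mapping it to array indices, so a short sketch may help. `selectPages` is a hypothetical local helper that only illustrates the indexing. How the service treats omitted or out-of-range values is not stated here.

```typescript
// Select pages using an inclusive, 1-based range, as the PDF controls describe.
function selectPages<T>(pages: T[], start: number, end: number): T[] {
  if (start < 1 || end < start) {
    throw new RangeError("expected 1 <= start <= end");
  }
  // slice() is 0-based and end-exclusive, so only the start index needs shifting.
  return pages.slice(start - 1, end);
}

const pdfPages = ["page 1", "page 2", "page 3", "page 4", "page 5"];
const middle = selectPages(pdfPages, 2, 4);

console.log(middle); // pages 2, 3 and 4
```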
When true, the contents of iframes are rendered to Markdown for each crawled page.
Return a cached result if a prior scrape for the same parameters exists and is younger than this many milliseconds. Defaults to 1 day (86400000 ms) when omitted. Max is 30 days (2592000000 ms). Set to 0 to always scrape fresh.
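The cache rule amounts to a simple age comparison. The sketch below is an illustration under assumptions: `wouldUseCache` is a hypothetical helper, and "younger than" is read as a strict less-than, which is consistent with 0 always forcing a fresh scrape.

```typescript
const ONE_DAY_MS = 24 * 60 * 60 * 1000; // 86400000, the documented default
const THIRTY_DAYS_MS = 30 * ONE_DAY_MS; // 2592000000, the documented maximum

// Hypothetical helper: would a cached scrape taken at `scrapedAtMs` be reused at `nowMs`?
function wouldUseCache(scrapedAtMs: number, nowMs: number, maxAgeMs: number = ONE_DAY_MS): boolean {
  if (maxAgeMs < 0 || maxAgeMs > THIRTY_DAYS_MS) {
    throw new RangeError("max age must be between 0 and 2592000000 ms");
  }
  // Strict comparison: with maxAgeMs = 0 no cached result is ever young enough.
  return nowMs - scrapedAtMs < maxAgeMs;
}

const oneHour = 60 * 60 * 1000;
console.log(wouldUseCache(0, oneHour)); // true: one hour old, default max age of one day
console.log(wouldUseCache(0, oneHour, 0)); // false: 0 always scrapes fresh
```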
Allowed range for the cache age: 0 <= x <= 2592000000
Optional browser wait time in milliseconds after the initial page load for each crawled page. Min: 0. Max: 30000 (30 seconds).
Allowed range for the browser wait time: 0 <= x <= 30000
Soft time budget for the crawl in milliseconds. After each scrape, the crawler checks the elapsed time and, if exceeded, returns the pages collected so far instead of continuing. Min: 10000 (10s). Max: 240000 (4 min). Default: 120000 (2 min).
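The budget is "soft" because the check happens only after a scrape finishes, so a crawl can overrun it by up to one page. The loop below illustrates that behaviour with a fake clock. It is a simplified, synchronous sketch and not the service's implementation: `crawlWithBudget` and its parameters are hypothetical.

```typescript
// Illustrative only: a crawl loop with a soft time budget and an injectable clock.
function crawlWithBudget(
  queue: string[],
  scrape: (url: string) => string,
  budgetMs: number,
  now: () => number,
): string[] {
  const startedAt = now();
  const collected: string[] = [];
  for (const url of queue) {
    collected.push(scrape(url));
    // The check runs after each scrape, so the page in flight always completes.
    if (now() - startedAt > budgetMs) {
      break; // return what has been collected so far
    }
  }
  return collected;
}

// Fake clock: every scrape "takes" 50 seconds.
let clock = 0;
const pagesDone = crawlWithBudget(
  ["/a", "/b", "/c", "/d"],
  (url) => {
    clock += 50000;
    return "# " + url;
  },
  120000, // the documented default budget of 2 minutes
  () => clock,
);

console.log(pagesDone.length, clock);
```

In this run the third page finishes at 150000 ms, past the 120000 ms budget, so three pages are returned and the fourth is never scraped.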
Allowed range for the soft time budget: 10000 <= x <= 240000
Optional timeout in milliseconds for the request. If the request takes longer than this value, it will be aborted with a 408 status code. Maximum allowed value is 300000 ms (5 minutes).
Allowed range for the request timeout: 1000 <= x <= 300000
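The numeric limits listed for these parameters can be checked client-side before a request is sent. The sketch below collects them in one table. The option names (`pageLimit`, `linkDepth`, and so on) are hypothetical placeholders, because this section does not state the real parameter names. Only the ranges themselves come from the descriptions above.

```typescript
interface NumericRange {
  min: number;
  max: number;
}

// Documented ranges, keyed by hypothetical placeholder names (not the real parameter names).
const RANGES: { [name: string]: NumericRange } = {
  pageLimit: { min: 1, max: 500 },
  linkDepth: { min: 0, max: Infinity },
  cacheAgeMs: { min: 0, max: 2592000000 },
  waitMs: { min: 0, max: 30000 },
  softBudgetMs: { min: 10000, max: 240000 },
  requestTimeoutMs: { min: 1000, max: 300000 },
};

// Return one message per value that falls outside its documented range.
function checkRanges(options: { [name: string]: number }): string[] {
  const problems: string[] = [];
  for (const name of Object.keys(options)) {
    const range = RANGES[name];
    const value = options[name];
    if (range === undefined || value === undefined) {
      continue; // names without a documented range are ignored in this sketch
    }
    if (value < range.min || value > range.max) {
      problems.push(name + " = " + value + " is outside " + range.min + ".." + range.max);
    }
  }
  return problems;
}

const problems = checkRanges({ pageLimit: 501, waitMs: 30000, softBudgetMs: 5000 });
console.log(problems);
```

Checking locally only catches obvious mistakes early. The service remains the authority on what it accepts.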