GET /web/scrape/markdown
JavaScript
import ContextDev from 'context.dev';

const client = new ContextDev({
  apiKey: process.env['CONTEXT_DEV_API_KEY'], // This is the default and can be omitted
});

const response = await client.web.webScrapeMd({ url: 'https://example.com' });

console.log(response.markdown);
{
  "success": true,
  "markdown": "<string>",
  "url": "<string>"
}
1 Credit

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
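For callers that build requests by hand instead of using the SDK, the header can be assembled as below. This is a minimal sketch: `buildAuthHeaders` is a hypothetical helper, not part of the SDK, and the token value is a placeholder.

```javascript
// Hypothetical helper: builds the Authorization header in the
// "Bearer <token>" form described above.
function buildAuthHeaders(token) {
  if (typeof token !== 'string' || token.trim() === '') {
    throw new Error('An auth token is required');
  }
  return { Authorization: `Bearer ${token.trim()}` };
}

// Usage with a placeholder token
const authHeaders = buildAuthHeaders('my-token');
console.log(authHeaders.Authorization); // prints "Bearer my-token"
```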

Query Parameters

url
string<uri>
required

Full URL of the page to scrape and convert into LLM-usable Markdown. Must include the http:// or https:// protocol.

Preserve hyperlinks in Markdown output

includeImages
boolean
default:false

Include image references in Markdown output

shortenBase64Images
boolean
default:true

Shorten base64-encoded image data in the Markdown output

useMainContentOnly
boolean
default:false

Extract only the main content of the page, excluding headers, footers, sidebars, and navigation

pdf
object

PDF parsing controls. Use start/end to limit text extraction and OCR to an inclusive 1-based page range.

includeFrames
boolean
default:false

When true, the contents of iframes are rendered to Markdown.

maxAgeMs
integer
default:86400000

Return a cached result if a prior scrape for the same parameters exists and is younger than this many milliseconds. Defaults to 1 day (86400000 ms) when omitted. Max is 30 days (2592000000 ms). Set to 0 to always scrape fresh.

Required range: 0 <= x <= 2592000000
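
Because the cache window is expressed in milliseconds, a small conversion helper can make call sites easier to read. The sketch below assumes you think in hours; `maxAgeFromHours` is a hypothetical name, and clamping to the 30-day ceiling (rather than rejecting larger values) is a design choice of this example, not documented API behavior.

```javascript
const MAX_AGE_LIMIT_MS = 2592000000; // 30 days, the documented maximum

// Hypothetical helper: converts hours to a maxAgeMs value inside the
// documented range, clamping anything above 30 days.
function maxAgeFromHours(hours) {
  if (!Number.isFinite(hours) || hours < 0) {
    throw new RangeError('hours must be a non-negative number');
  }
  const ms = Math.round(hours * 60 * 60 * 1000);
  return Math.min(ms, MAX_AGE_LIMIT_MS);
}

console.log(maxAgeFromHours(24));       // prints 86400000 (the default, 1 day)
console.log(maxAgeFromHours(0));        // prints 0 (always scrape fresh)
console.log(maxAgeFromHours(24 * 365)); // prints 2592000000 (clamped to 30 days)
```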

waitForMs
integer

Optional browser wait time in milliseconds after initial page load before converting the page to Markdown. Min: 0. Max: 30000 (30 seconds).

Required range: 0 <= x <= 30000

headers
object

Optional outbound HTTP headers forwarded only to the target URL, sent as deep-object query params such as headers[X-Custom]=value. When provided, caching is bypassed: the result is neither read from nor written to cache.
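
The deep-object form can be produced with the standard `URLSearchParams` class. This is a sketch under assumptions: `buildScrapeQuery` is a hypothetical helper, and it assumes the server accepts percent-encoded brackets (`%5B`, `%5D`), which `URLSearchParams` emits and most query parsers decode.

```javascript
// Hypothetical helper: serializes the target URL plus outbound headers,
// expanding each header into a deep-object param (headers[Name]=value).
function buildScrapeQuery(url, headers = {}) {
  const params = new URLSearchParams({ url });
  for (const [name, value] of Object.entries(headers)) {
    params.append(`headers[${name}]`, value);
  }
  return params.toString();
}

const query = buildScrapeQuery('https://example.com', { 'X-Custom': 'value' });
console.log(query);
// prints "url=https%3A%2F%2Fexample.com&headers%5BX-Custom%5D=value"
```

Since any `headers` entry bypasses the cache, it may be worth sending headers only on requests that genuinely need them.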

timeoutMS
integer

Optional timeout in milliseconds for the request. If the request takes longer than this value, it is aborted with a 408 status code. Min: 1000 (1 second). Max: 300000 (5 minutes).

Required range: 1000 <= x <= 300000
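
The three integer parameters each have an inclusive range, so a client-side check can catch mistakes before a request is sent. The following is a minimal sketch: `checkTimingParams` is a hypothetical helper, and the ranges are copied from the parameter descriptions above.

```javascript
// Documented inclusive ranges for the optional integer parameters.
const RANGES = {
  maxAgeMs: [0, 2592000000],
  waitForMs: [0, 30000],
  timeoutMS: [1000, 300000],
};

// Hypothetical helper: returns a list of problems; an empty list
// means every supplied value is an integer within its range.
function checkTimingParams(params) {
  const problems = [];
  for (const [name, [min, max]] of Object.entries(RANGES)) {
    const value = params[name];
    if (value === undefined) continue; // all three parameters are optional
    if (!Number.isInteger(value) || value < min || value > max) {
      problems.push(`${name} must be an integer in [${min}, ${max}]`);
    }
  }
  return problems;
}

// waitForMs is valid; timeoutMS is below the 1000 ms minimum.
console.log(checkTimingParams({ waitForMs: 2000, timeoutMS: 500 }));
```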

Response

Successful response

success
enum<boolean>
required

Indicates success. Always true for a successful response.

Available options:
true

markdown
string
required

Page content converted to GitHub Flavored Markdown

url
string
required

The URL that was scraped
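
When consuming the raw JSON body rather than the SDK's typed response, a small guard can confirm the payload has the three required fields before use. This is a sketch only: `isScrapeSuccess` is a hypothetical helper, and the error shape for failed requests is not described here, so the guard simply returns false for anything that does not match.

```javascript
// Hypothetical guard: checks that a parsed body matches the documented
// success schema (success === true, markdown and url are strings).
function isScrapeSuccess(body) {
  return (
    body !== null &&
    typeof body === 'object' &&
    body.success === true &&
    typeof body.markdown === 'string' &&
    typeof body.url === 'string'
  );
}

const sample = { success: true, markdown: '# Example Domain', url: 'https://example.com' };
console.log(isScrapeSuccess(sample));             // prints true
console.log(isScrapeSuccess({ markdown: 'x' }));  // prints false
```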