GET /web/scrape/markdown
JavaScript
import ContextDev from 'context.dev';

const client = new ContextDev({
  apiKey: process.env['CONTEXT_DEV_API_KEY'], // This is the default and can be omitted
});

const response = await client.web.webScrapeMd({ url: 'https://example.com' });

console.log(response.markdown);
{
  "success": true,
  "markdown": "<string>",
  "url": "<string>"
}
1 Credit

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
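
As a rough illustration of the header format described above, the sketch below builds the Authorization header for a raw HTTP call. The helper name buildAuthHeaders is hypothetical and not part of the SDK; the SDK sample sets this header for you from apiKey.

```javascript
// Hypothetical helper (not part of the context.dev SDK): builds the
// Authorization header in the documented "Bearer <token>" form.
function buildAuthHeaders(token) {
  if (typeof token !== 'string' || token.trim() === '') {
    throw new Error('An auth token is required');
  }
  // Trim stray whitespace so the header value is exactly "Bearer <token>".
  return { Authorization: `Bearer ${token.trim()}` };
}

// Usage: pass the result as the headers of a fetch() call, for example.
const headers = buildAuthHeaders('my-token');
console.log(headers.Authorization);
```

Rejecting an empty token up front surfaces a missing environment variable before any request is sent.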

Query Parameters

url
string<uri>
required

Full URL to scrape into LLM-usable Markdown (must include the http:// or https:// protocol)
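
The protocol requirement can be checked client-side before a credit is spent. This is a minimal sketch using the standard URL class; assertScrapableUrl is a hypothetical name, and the API's own validation may be stricter or differ in detail.

```javascript
// Hypothetical pre-flight check (an assumption, not the API's own validator):
// accepts only absolute URLs whose protocol is http: or https:.
function assertScrapableUrl(value) {
  let parsed;
  try {
    parsed = new URL(value); // throws on relative or malformed input
  } catch {
    throw new TypeError(`Not an absolute URL: ${value}`);
  }
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    throw new TypeError(`Unsupported protocol: ${parsed.protocol}`);
  }
  return parsed.href; // normalized form of the input
}

console.log(assertScrapableUrl('https://example.com'));
```

A bare hostname such as example.com fails the first check because it has no protocol.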

Preserve hyperlinks in Markdown output

includeImages
boolean
default:false

Include image references in Markdown output

shortenBase64Images
boolean
default:true

Shorten base64-encoded image data in the Markdown output

useMainContentOnly
boolean
default:false

Extract only the main content of the page, excluding headers, footers, sidebars, and navigation

maxAgeMs
integer
default:86400000

Return a cached result if a prior scrape for the same parameters exists and is younger than this many milliseconds. Defaults to 1 day (86400000 ms) when omitted. Max is 30 days (2592000000 ms). Set to 0 to always scrape fresh.

Required range: 0 <= x <= 2592000000
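
The millisecond values above are easy to get wrong by hand. The sketch below derives them and enforces the documented range; normalizeMaxAgeMs and hours are hypothetical helper names, and returning the default locally when the value is omitted is an assumption that mirrors the documented server default.

```javascript
// Limits taken from the parameter description above.
const DEFAULT_MAX_AGE_MS = 24 * 60 * 60 * 1000;    // 1 day   = 86400000
const MAX_AGE_LIMIT_MS = 30 * 24 * 60 * 60 * 1000; // 30 days = 2592000000

// Hypothetical helper: converts whole hours to milliseconds.
const hours = (n) => n * 60 * 60 * 1000;

// Hypothetical helper: validates a maxAgeMs value against the documented range.
function normalizeMaxAgeMs(value) {
  if (value === undefined) return DEFAULT_MAX_AGE_MS;
  if (!Number.isInteger(value) || value < 0 || value > MAX_AGE_LIMIT_MS) {
    throw new RangeError(`maxAgeMs must be an integer in [0, ${MAX_AGE_LIMIT_MS}]`);
  }
  return value;
}

console.log(normalizeMaxAgeMs(hours(6))); // accept cached results up to 6 hours old
console.log(normalizeMaxAgeMs(0));        // 0 forces a fresh scrape
```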

Response

Successful response

success
enum<boolean>
required

Indicates success

Available options: true

markdown
string
required

Page content converted to GitHub Flavored Markdown

url
string
required

The URL that was scraped
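
When calling the endpoint without the SDK, the response fields listed above can be checked before use. This is a sketch only; isScrapeSuccess is a hypothetical name, and the shape of error responses is not documented here, so anything that does not match is simply treated as a failure.

```javascript
// Hypothetical guard: true only when the body matches the documented
// success shape (success === true, string markdown, string url).
function isScrapeSuccess(body) {
  return (
    body !== null &&
    typeof body === 'object' &&
    body.success === true &&
    typeof body.markdown === 'string' &&
    typeof body.url === 'string'
  );
}

// Sample body for illustration; the values are made up.
const sample = { success: true, markdown: '# Example Domain', url: 'https://example.com' };
if (isScrapeSuccess(sample)) {
  console.log(`${sample.url}: ${sample.markdown.length} characters of Markdown`);
}
```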