# Handle Rate Limits

> Stay under your plan's per-minute request cap with client caching, exponential backoff, prefetching, and graceful fallbacks.

Each plan has a per-minute request cap. If you exceed it, the API returns `429 Too Many Requests`.

To make your app production-ready, use these four patterns to avoid 429s or recover from them when they occur:

* Client-side caching for hot domains.
* Backoff on 429, honoring the `Retry-After` header.
* Prefetch to shift slow work ahead of bursts.
* Graceful fallbacks when the limit holds.

## Rate limits per plan

Rate limits apply per API key, are measured per minute, and are visible on your [dashboard](https://www.context.dev/dashboard). The current tiers:

| Plan       | Credits per month | Rate limit         | Overage              |
| ---------- | ----------------- | ------------------ | -------------------- |
| Free       | 500 one-time\*    | 10 requests/min    | None                 |
| Starter    | 30,000            | 120 requests/min   | \$19 per 10K credits |
| Pro        | 200,000           | 300 requests/min   | \$9 per 10K credits  |
| Scale      | 2,500,000         | 1,200 requests/min | \$6 per 10K credits  |
| Enterprise | Custom            | Custom             | Contact sales        |

\* Free plan credits are a one-time grant, not a monthly allowance.

<Info>
  [Logo Link](/guides/get-logo-from-url) and [Prefetch](/optimization/prefetching) endpoints do not have any rate limits.
</Info>

## What a 429 looks like

The API returns a JSON envelope:

```json theme={null}
{
  "status": "error",
  "message": "Rate limit exceeded",
  "code": 429,
  "key_metadata": {
    "credits_consumed": 0,
    "credits_remaining": 29940
  }
}
```

`credits_consumed` is always `0` on a 429 — throttled requests are never charged.

Every 429 response also includes a `Retry-After` header with the number of seconds (1–60) until your per-minute window resets:

```text theme={null}
Retry-After: 23
```

Through an SDK, the error surfaces as a typed exception whose status is 429 (`status` in TypeScript and Ruby, `status_code` in Python, `StatusCode` in Go). The SDK does not retry automatically, so you add the retry logic yourself.

## Pattern 1: Client-side cache for hot domains

The cheapest way to stay under the cap is to skip the call. Brand data changes on the order of months, so a 30-day client cache is safe for most products:

<CodeGroup>
  ```typescript TypeScript theme={null}
  import ContextDev from "context.dev";

  const client = new ContextDev({ apiKey: process.env.CONTEXT_DEV_API_KEY });

  const CACHE_TTL_MS = 30 * 24 * 60 * 60 * 1000;
  const cache = new Map<string, { data: unknown; at: number }>();

  async function getBrand(domain: string) {
    const hit = cache.get(domain);
    if (hit && Date.now() - hit.at < CACHE_TTL_MS) return hit.data;

    const { brand } = await client.brand.retrieve({ domain });
    cache.set(domain, { data: brand, at: Date.now() });
    return brand;
  }
  ```

  ```python Python theme={null}
  import os
  import time
  from context.dev import ContextDev

  client = ContextDev(api_key=os.environ["CONTEXT_DEV_API_KEY"])

  CACHE_TTL_S = 30 * 24 * 60 * 60
  _cache: dict[str, tuple[float, dict]] = {}

  def get_brand(domain: str):
      hit = _cache.get(domain)
      if hit and time.time() - hit[0] < CACHE_TTL_S:
          return hit[1]
      brand = client.brand.retrieve(domain=domain).brand
      _cache[domain] = (time.time(), brand)
      return brand
  ```

  ```ruby Ruby theme={null}
  require "context_dev"

  client = ContextDev::Client.new(api_key: ENV.fetch("CONTEXT_DEV_API_KEY"))

  CACHE_TTL_S = 30 * 24 * 60 * 60
  @cache = {}

  def get_brand(client, domain)
    hit = @cache[domain]
    return hit[:data] if hit && Time.now.to_i - hit[:at] < CACHE_TTL_S

    brand = client.brand.retrieve(domain: domain).brand
    @cache[domain] = { data: brand, at: Time.now.to_i }
    brand
  end

  get_brand(client, "acme.com")
  ```

  ```go Go theme={null}
  package main

  import (
      "context"
      "os"
      "sync"
      "time"

      contextdev "github.com/context-dot-dev/context-go-sdk"
      "github.com/context-dot-dev/context-go-sdk/option"
  )

  var (
      cacheMu sync.Mutex
      cache   = map[string]struct {
          data any
          at   time.Time
      }{}
      cacheTTL = 30 * 24 * time.Hour
      client   = contextdev.NewClient(option.WithAPIKey(os.Getenv("CONTEXT_DEV_API_KEY")))
  )

  func GetBrand(ctx context.Context, domain string) (any, error) {
      cacheMu.Lock()
      if hit, ok := cache[domain]; ok && time.Since(hit.at) < cacheTTL {
          cacheMu.Unlock()
          return hit.data, nil
      }
      cacheMu.Unlock()

      r, err := client.Brand.Get(ctx, contextdev.BrandGetParams{Domain: domain})
      if err != nil {
          return nil, err
      }
      cacheMu.Lock()
      cache[domain] = struct {
          data any
          at   time.Time
      }{r.Brand, time.Now()}
      cacheMu.Unlock()
      return r.Brand, nil
  }
  ```
</CodeGroup>

Reasonable TTL starting points: 30 days for brand responses, 7 days for product extractions, indefinite for industry codes (NAICS / SIC). Adjust per use case.
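
One way to apply those starting points is a single cache keyed by data type, with a TTL table. The sketch below is illustrative only: the kind labels (`"brand"`, `"product"`, `"industry"`), the `cached` helper, and the injected `fetch` callable are assumptions, not part of the SDK.

```python
import time
from typing import Any, Callable, Optional

DAY_S = 24 * 60 * 60

# TTLs taken from the starting points above; None means "never expires".
# The kind names are illustrative labels, not SDK identifiers.
TTL_BY_KIND: dict[str, Optional[int]] = {
    "brand": 30 * DAY_S,
    "product": 7 * DAY_S,
    "industry": None,
}

_cache: dict[tuple[str, str], tuple[float, Any]] = {}


def cached(
    kind: str,
    key: str,
    fetch: Callable[[], Any],
    now: Callable[[], float] = time.time,
) -> Any:
    """Return the cached value for (kind, key); call fetch() only on a miss or expiry."""
    ttl = TTL_BY_KIND[kind]
    hit = _cache.get((kind, key))
    if hit is not None and (ttl is None or now() - hit[0] < ttl):
        return hit[1]
    value = fetch()
    _cache[(kind, key)] = (now(), value)
    return value
```

With the Python client from the example above, a call might look like `cached("brand", "acme.com", lambda: client.brand.retrieve(domain="acme.com").brand)`. The `now` parameter exists only so the expiry logic can be tested with a fake clock.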

## Pattern 2: Backoff on 429 with `Retry-After`

When you exceed the rate limit, the response carries a 429 status code and an error body (`key_metadata` omitted here for brevity):

```json theme={null}
{
  "status": "error",
  "message": "Rate limit exceeded",
  "code": 429
}
```

The response's `Retry-After` header tells you exactly how many seconds until your window resets, so use it as the wait time when it's present. Fall back to exponential backoff (wait 1 second before the first retry and double the delay on each subsequent attempt) if you can't read the header.

Here's an example of a retry script that honors `Retry-After` and falls back to exponential delays:

<CodeGroup>
  ```typescript TypeScript theme={null}
  async function retrieveWithBackoff(domain: string, maxAttempts = 4) {
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        return await client.brand.retrieve({ domain });
      } catch (err: any) {
        if (err.status !== 429 || attempt === maxAttempts - 1) throw err;

        const retryAfter = Number(err.headers?.["retry-after"]);
        const delayMs = retryAfter > 0 ? retryAfter * 1000 : Math.pow(2, attempt) * 1000;
        await new Promise((r) => setTimeout(r, delayMs));
      }
    }
  }
  ```

  ```python Python theme={null}
  import time
  from context.dev import APIStatusError

  def retrieve_with_backoff(domain: str, max_attempts: int = 4):
      for attempt in range(max_attempts):
          try:
              return client.brand.retrieve(domain=domain)
          except APIStatusError as e:
              if e.status_code != 429 or attempt == max_attempts - 1:
                  raise
              retry_after = int(e.response.headers.get("Retry-After", 0))
              time.sleep(retry_after if retry_after > 0 else 2 ** attempt)
  ```

  ```ruby Ruby theme={null}
  def retrieve_with_backoff(client, domain, max_attempts: 4)
    attempt = 0
    begin
      client.brand.retrieve(domain: domain)
    rescue ContextDev::Errors::APIStatusError => e
      raise unless e.status == 429 && attempt < max_attempts - 1
      retry_after = e.headers["retry-after"].to_i
      sleep(retry_after.positive? ? retry_after : 2**attempt)
      attempt += 1
      retry
    end
  end
  ```

  ```go Go theme={null}
  import (
      "context"
      "errors"
      "fmt"
      "math"
      "strconv"
      "time"

      contextdev "github.com/context-dot-dev/context-go-sdk"
  )

  func RetrieveWithBackoff(ctx context.Context, domain string) (*contextdev.BrandGetResponse, error) {
      const maxAttempts = 4
      for attempt := 0; attempt < maxAttempts; attempt++ {
          r, err := client.Brand.Get(ctx, contextdev.BrandGetParams{Domain: domain})
          if err == nil {
              return r, nil
          }
          var apiErr *contextdev.Error
          if !errors.As(err, &apiErr) || apiErr.StatusCode != 429 || attempt == maxAttempts-1 {
              return nil, err
          }
          delay := time.Duration(math.Pow(2, float64(attempt))) * time.Second
          if secs, parseErr := strconv.Atoi(apiErr.Response.Header.Get("Retry-After")); parseErr == nil && secs > 0 {
              delay = time.Duration(secs) * time.Second
          }
          time.Sleep(delay)
      }
      return nil, fmt.Errorf("unreachable")
  }
  ```
</CodeGroup>
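
The samples above parse `Retry-After` inline. If you would rather isolate that decision, the delay choice can be pulled into a small pure function, sketched here in Python. The helper name `retry_delay_seconds` and the 60-second cap are assumptions; the cap simply mirrors the documented 1–60 range so a malformed header cannot stall a request indefinitely.

```python
from typing import Optional


def retry_delay_seconds(retry_after: Optional[str], attempt: int, cap: int = 60) -> int:
    """Pick a wait time in seconds for retry number `attempt` (0-based).

    Prefers the Retry-After header value; falls back to 1s, 2s, 4s, ...
    when the header is missing, zero, or not a plain integer.
    """
    try:
        secs = int(retry_after) if retry_after is not None else 0
    except ValueError:
        secs = 0  # header present but unparseable
    if secs > 0:
        return min(secs, cap)
    return min(2 ** attempt, cap)
```

Inside the Python retry loop above, this would replace the inline expression: `time.sleep(retry_delay_seconds(e.response.headers.get("Retry-After"), attempt))`.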

## Pattern 3: Prefetch to shift slow work ahead of bursts

Bursty traffic (like when a marketing email triggers 200 signups in 60 seconds) can get you rate limited. Prefetching doesn't reduce the number of Brand API calls that count against your limit; every user-facing `/brand/retrieve` still spends rate-limit budget. What it does is shift the slow crawl work earlier, so each call during the burst completes in under a second instead of stalling for up to a minute and piling up retries on top of an already-saturated window.

Here's how it works:

* During the burst, your application calls `/brand/prefetch` (if it has a domain) or `/brand/prefetch-by-email` (if it has an email) right when it first receives the target domain or email. These prefetch endpoints are rate-limit-free, so 200 calls in a minute is fine.
* A few seconds later, when the user actually submits and the user-facing client hits the Brand API, the request lands on a warm cache and returns in under a second. That call still counts toward your per-minute limit; it's just fast.
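
The two steps above can be sketched as a pair of handlers. This is a minimal illustration, not the SDK's API: `prefetch` and `retrieve` are hypothetical callables that you would implement as wrappers around `/brand/prefetch` and the rate-limited Brand API call, and the handler names are made up for the example.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable

# Hypothetical stand-ins: `prefetch` wraps a call to /brand/prefetch and
# `retrieve` wraps the rate-limited Brand API call. Neither is an SDK name.
Prefetch = Callable[[str], Any]
Retrieve = Callable[[str], Any]

_pool = ThreadPoolExecutor(max_workers=8)


def on_domain_seen(domain: str, prefetch: Prefetch) -> Future:
    """Warm the cache in the background as soon as the domain is known."""

    def _warm() -> None:
        try:
            prefetch(domain)
        except Exception:
            # Best-effort: if warming fails, the later retrieve just runs cold.
            pass

    return _pool.submit(_warm)


def on_submit(domain: str, retrieve: Retrieve) -> Any:
    """The user-facing call; it still counts toward the per-minute limit."""
    return retrieve(domain)
```

The returned `Future` can be ignored in production code. Running the prefetch off the request thread keeps the form responsive while the cache warms.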

See [Prefetch for Faster Response](/optimization/prefetching) for the full pattern.

## Pattern 4: Degrade gracefully when the limit holds

If exponential backoff has run out of retries and you are still seeing 429s, the user is better served by a missing-data fallback than an error screen.

Some examples:

* **Onboarding form.** Skip the prefilled fields. Let the user enter them by hand and do not block on the API.
* **Logo wall.** Render the customer's name in a styled box instead of the logo.
* **CRM enrichment.** Queue the contact for an offline enrichment job that runs overnight.

Build the fallback once and the end user never sees a rate-limit message.
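
A minimal sketch of that wrapper for the logo-wall case, assuming the fetch function raises an exception carrying `status_code == 429` once retries are exhausted. `RateLimited`, `brand_or_fallback`, and `name_tile` are hypothetical names for illustration, and deriving the label from the domain is an assumption about what your product has on hand.

```python
from typing import Any, Callable


class RateLimited(Exception):
    """Hypothetical stand-in for the SDK error raised after retries run out."""

    status_code = 429


def brand_or_fallback(
    domain: str,
    fetch: Callable[[str], Any],
    fallback: Callable[[str], Any],
) -> Any:
    """Return fetch(domain), or fallback(domain) if the call ends in a 429."""
    try:
        return fetch(domain)
    except Exception as e:
        if getattr(e, "status_code", None) != 429:
            raise  # only rate-limit errors degrade; everything else surfaces
        return fallback(domain)


def name_tile(domain: str) -> dict:
    """Logo-wall fallback: a text tile built from the domain instead of a logo."""
    return {"logo": None, "label": domain.split(".")[0].capitalize()}
```

Passing the retry helper from Pattern 2 as `fetch` chains the two patterns: back off first, then degrade only if the limit still holds.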

## Related resources

<CardGroup cols={2}>
  <Card title="Prefetch" icon="bolt" href="/optimization/prefetching">
    Warm the cache so burst-time calls return fast.
  </Card>

  <Card title="Best practices" icon="list-check" href="/optimization/best-practices">
    Cache, fallback, and proxy patterns end to end.
  </Card>

  <Card title="Troubleshooting" icon="triangle-exclamation" href="/optimization/troubleshooting">
    Other status codes, retry logic, and SDK gotchas.
  </Card>

  <Card title="Pricing" icon="receipt" href="https://www.context.dev/pricing">
    Per-plan credit, rate limit, and overage details.
  </Card>
</CardGroup>
