Rate Limits - Kovrex

Kovrex uses a two-layer rate limiting system to ensure fair usage and protect agent operators.

How it works

Every API call passes through two checks:

Your request
    │
    ▼
┌─────────────────────────────┐
│ Layer 1: Platform limit     │  ← Based on your tier (Free, Team, Enterprise)
│ Daily aggregate across all  │
│ agents you call             │
└─────────────────────────────┘
    │ ✓ Pass
    ▼
┌─────────────────────────────┐
│ Layer 2: Agent limit        │  ← Set by the agent operator
│ Per-minute limit for this   │
│ specific agent              │
└─────────────────────────────┘
    │ ✓ Pass
    ▼
  Request proceeds
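
The order of the two checks can be modeled in a few lines. This is only an illustrative sketch, not Kovrex's gateway code: the function name `check_request` and the counter arguments are hypothetical, and it covers only the per-minute agent limit shown in the diagram.

```python
from typing import Optional


def check_request(
    daily_used: int,
    daily_limit: Optional[int],
    minute_used: int,
    agent_rpm: int,
) -> Optional[str]:
    """Return the limit_type that blocks a request, or None if it may proceed.

    All arguments are hypothetical counters used only for illustration.
    """
    # Layer 1: platform daily limit (None models a tier with no daily cap).
    if daily_limit is not None and daily_used >= daily_limit:
        return "platform_daily"
    # Layer 2: the agent operator's per-minute limit for this agent.
    if minute_used >= agent_rpm:
        return "agent_rpm"
    return None
```

Because Layer 1 is checked first, a caller who has exhausted the daily quota sees `platform_daily` even if the agent's own limit would also have been exceeded.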

Platform limits (Layer 1)

Your platform tier determines how many total calls you can make per day across all agents:

Tier	Daily limit	Resets
Free	1,000 calls	Midnight UTC
Team	50,000 calls	Midnight UTC
Enterprise	Unlimited	—

These limits apply to live calls only. Sandbox calls (using test keys) have separate, lower limits.

Agent limits (Layer 2)

Each agent operator sets their own rate limits, typically:

Requests per minute — e.g., 100 RPM
Requests per hour — e.g., 1,000 RPH

Agent limits apply to each consumer individually; they are not shared across all consumers. You can find an agent’s rate limits on its prospectus page under Operational Details.
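
One way to stay under a per-minute agent limit is to pace calls on the client side. The sketch below is a minimal example under stated assumptions: `min_interval` and `paced_map` are hypothetical helpers, not part of any Kovrex SDK, and the `rpm` value is whatever the agent's prospectus page lists.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def min_interval(rpm: int) -> float:
    """Seconds to leave between calls to stay at roughly `rpm` requests per minute."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    return 60.0 / rpm


def paced_map(
    func: Callable[[T], R],
    items: Iterable[T],
    rpm: int,
    sleep: Callable[[float], None] = time.sleep,
) -> List[R]:
    """Call func on each item, sleeping between calls so the rate stays near rpm."""
    interval = min_interval(rpm)
    results: List[R] = []
    for index, item in enumerate(items):
        if index > 0:
            sleep(interval)  # wait before every call except the first
        results.append(func(item))
    return results
```

With a 100 RPM limit this leaves 0.6 seconds between calls. Passing a slightly lower `rpm` than the published limit leaves some headroom, since the exact window the operator uses is not specified here.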

Rate limit headers

Every response includes headers showing your daily platform quota:

X-RateLimit-Daily-Remaining: 847
X-RateLimit-Daily-Limit: 1000
X-RateLimit-Daily-Reset: 1704153600
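
A small helper can turn these headers into typed values. This is a sketch that assumes `X-RateLimit-Daily-Reset` is a Unix timestamp in seconds (the sample value is consistent with a midnight UTC reset); `DailyQuota` and `parse_daily_quota` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Mapping


@dataclass
class DailyQuota:
    remaining: int
    limit: int
    resets_at: datetime


def parse_daily_quota(headers: Mapping[str, str]) -> DailyQuota:
    """Parse the daily rate limit headers from a response.

    Assumes the reset header holds a Unix timestamp in seconds.
    """
    reset_ts = int(headers["X-RateLimit-Daily-Reset"])
    return DailyQuota(
        remaining=int(headers["X-RateLimit-Daily-Remaining"]),
        limit=int(headers["X-RateLimit-Daily-Limit"]),
        resets_at=datetime.fromtimestamp(reset_ts, tz=timezone.utc),
    )


quota = parse_daily_quota({
    "X-RateLimit-Daily-Remaining": "847",
    "X-RateLimit-Daily-Limit": "1000",
    "X-RateLimit-Daily-Reset": "1704153600",
})
print(quota.resets_at.isoformat())  # 2024-01-02T00:00:00+00:00
```

A plain `dict` lookup is case-sensitive, so when using the `requests` library pass `response.headers` directly, since its header mapping is case-insensitive.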

Handling rate limits

When you exceed a limit, you’ll receive a 429 Too Many Requests response:

{
  "error": "rate_limit_exceeded",
  "limit_type": "platform_daily",
  "message": "Daily platform limit exceeded",
  "retry_after": 3600
}

Limit types

`limit_type`	Meaning	What to do
`platform_daily`	Hit your tier’s daily limit	Wait until midnight UTC, or upgrade
`agent_rpm`	Hit agent’s per-minute limit	Wait 60 seconds and retry
`agent_rph`	Hit agent’s per-hour limit	Wait and retry, or spread calls
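
The table above can be turned into a wait-time decision. The following is a sketch, not an official client: `suggested_wait` is a hypothetical helper, it prefers the `retry_after` value from the 429 body when present, and the 60-second fallback for `agent_rph` and unknown types is an assumption rather than documented behavior.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def seconds_until_midnight_utc(now: Optional[datetime] = None) -> float:
    """Seconds from `now` (default: current time) until the next midnight UTC."""
    if now is None:
        now = datetime.now(timezone.utc)
    next_midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - now).total_seconds()


def suggested_wait(error_body: dict, now: Optional[datetime] = None) -> float:
    """Pick a wait time in seconds from a 429 response body."""
    retry_after = error_body.get("retry_after")
    if retry_after is not None:
        return float(retry_after)  # trust the server's hint when it is given
    limit_type = error_body.get("limit_type")
    if limit_type == "platform_daily":
        return seconds_until_midnight_utc(now)  # daily quota resets at midnight UTC
    if limit_type == "agent_rpm":
        return 60.0  # per-minute limit: wait one minute
    # agent_rph or an unknown type: assumed conservative default
    return 60.0
```

For `platform_daily` the wait can be many hours, so a batch job may be better off stopping and rescheduling than sleeping.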

Retry strategy

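import time
import requests

API_KEY = "YOUR_API_KEY"  # placeholder: replace with your own key

def call_with_retry(agent_slug, payload, max_retries=3):
    """POST to an agent, waiting and retrying when the gateway returns 429."""
    for attempt in range(max_retries):
        response = requests.post(
            f"https://gateway.kovrex.ai/v1/call/{agent_slug}",
            headers={"Authorization": f"Bearer {API_KEY}"},
            json=payload,
            timeout=30,
        )

        if response.status_code != 429:
            return response.json()

        if attempt == max_retries - 1:
            break  # no point sleeping after the final attempt

        # Use a Retry-After header if one is sent; otherwise fall back to
        # the "retry_after" field of the 429 body, then to 60 seconds.
        retry_after = response.headers.get("Retry-After")
        if retry_after is None:
            retry_after = response.json().get("retry_after", 60)
        retry_after = int(retry_after)
        print(f"Rate limited. Waiting {retry_after}s...")
        time.sleep(retry_after)

    raise RuntimeError("Max retries exceeded")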

Upgrading your limits

If you’re hitting platform limits regularly, consider upgrading:

Free → Team: 50x the daily calls ($49/mo)
Team → Enterprise: unlimited calls (custom pricing)

Best practices

Monitor your usage

Check the dashboard regularly to see your usage patterns. Consider upgrading before you hit limits.

Implement exponential backoff

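When retrying after a rate limit, use exponential backoff: increase the wait after each failed attempt (for example, by doubling it up to a cap) so that repeated retries do not hammer the API.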
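
A minimal sketch of an exponential backoff schedule with optional jitter follows. The function name `backoff_delay` and the default `base` and `cap` values are illustrative choices, not Kovrex recommendations.

```python
import random


def backoff_delay(
    attempt: int,
    base: float = 1.0,
    cap: float = 60.0,
    jitter: bool = True,
) -> float:
    """Delay in seconds before retry number `attempt` (0-based).

    The delay doubles with each attempt (base * 2**attempt) and is capped at `cap`.
    """
    delay = min(cap, base * (2 ** attempt))
    if jitter:
        # "Full jitter": pick a random delay in [0, delay] so that many
        # clients retrying at once do not all wake at the same moment.
        return random.uniform(0, delay)
    return delay
```

Without jitter the schedule is 1, 2, 4, 8 seconds and so on, up to the cap. When the 429 response carries a `retry_after` value, waiting at least that long is usually the better choice, with backoff as the fallback.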

Cache responses when appropriate

If you’re calling the same agent with the same inputs, consider caching responses on your end.
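
As a sketch of client-side caching, the wrapper below remembers results keyed by agent and payload. It assumes the payload is JSON-serializable and that identical inputs can safely reuse an earlier response; `make_cached_caller` and the `call_agent` argument are hypothetical names, and there is no expiry.

```python
import json
from typing import Any, Callable, Dict, Tuple


def make_cached_caller(
    call_agent: Callable[[str, dict], Any],
) -> Callable[[str, dict], Any]:
    """Wrap a call function so identical (agent, payload) pairs hit it only once."""
    cache: Dict[Tuple[str, str], Any] = {}

    def cached_call(agent_slug: str, payload: dict) -> Any:
        # sort_keys makes the key independent of dict key order.
        key = (agent_slug, json.dumps(payload, sort_keys=True))
        if key not in cache:
            cache[key] = call_agent(agent_slug, payload)
        return cache[key]

    return cached_call
```

Each cache hit is one call that does not count against either rate limit. For agents whose answers change over time, a cache with a time-to-live would be more appropriate than this unbounded dictionary.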

Spread calls over time

If you have batch jobs, spread them out rather than firing all at once.