Available on the Enterprise Self Hosting plan only. Requires Gateway version 1.17.0 or higher.

Policies

  • Policies allow organisations to apply flexible budget and rate limit controls based on dynamic conditions (API keys, metadata, workspace, etc.).
  • There are two types of policies:
    1. Usage Limits Policies: Control the total usage (cost or tokens) over a period
    2. Rate Limits Policies: Control the rate of requests or tokens per minute/hour/day

Policy Structure

Both policy types share common concepts:

Conditions

Conditions define which requests the policy applies to. Each condition has a key and a value.
Valid keys:
  • api_key - Apply to a specific API key
  • workspace_id - Apply to a workspace
  • metadata.* - Apply based on custom metadata fields (e.g., metadata._user, metadata.team)
Example:
[
  {
    "key": "api_key",
    "value": "*"
  },
  {
    "key": "metadata.team",
    "value": "engineering"
  }
]
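To make the matching idea concrete, here is a minimal Python sketch of how a list of conditions could be evaluated against a request. It rests on assumptions that the text above does not spell out: that all conditions must match (AND semantics), that "*" matches any value that is present, and that a request can be modelled as a plain dict with api_key, workspace_id, and metadata entries. The function names and the request shape are hypothetical and are not part of the Gateway.

```python
from typing import Any


def lookup(request: dict[str, Any], key: str) -> Any:
    """Resolve a condition key such as "api_key" or "metadata.team"."""
    if key.startswith("metadata."):
        # Everything after the "metadata." prefix names a custom metadata field.
        return request.get("metadata", {}).get(key[len("metadata."):])
    return request.get(key)


def matches(conditions: list[dict[str, str]], request: dict[str, Any]) -> bool:
    """Return True when every condition matches (assumed AND semantics)."""
    for condition in conditions:
        actual = lookup(request, condition["key"])
        if actual is None:
            return False  # the request does not carry this attribute at all
        if condition["value"] != "*" and actual != condition["value"]:
            return False  # assumed: "*" is a wildcard, anything else is exact
    return True


example_conditions = [
    {"key": "api_key", "value": "*"},
    {"key": "metadata.team", "value": "engineering"},
]
```

Under these assumptions, a request from any API key with metadata.team set to "engineering" matches, while the same request with a different team does not.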

Group By

Group by defines how usage is aggregated. Each group entry has a key.
Valid keys:
  • api_key - Group by API key
  • workspace_id - Group by workspace
  • metadata.* - Group by custom metadata fields
Example:
[
  {
    "key": "api_key"
  },
  {
    "key": "metadata._user"
  }
]
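One way to picture aggregation is that each distinct combination of group_by values gets its own usage counter. The Python sketch below illustrates that reading; the request shape, the bucket_key helper, and the in-memory counter are illustrative assumptions, not the Gateway's internal design.

```python
from collections import defaultdict
from typing import Any


def bucket_key(group_by: list[dict[str, str]], request: dict[str, Any]) -> tuple:
    """Build one aggregation key from the group_by entries (hypothetical helper)."""
    parts = []
    for group in group_by:
        key = group["key"]
        if key.startswith("metadata."):
            value = request.get("metadata", {}).get(key[len("metadata."):])
        else:
            value = request.get(key)
        parts.append((key, value))
    return tuple(parts)


group_by = [{"key": "api_key"}, {"key": "metadata._user"}]
usage = defaultdict(float)  # one counter per distinct bucket

for req, cost in [
    ({"api_key": "k1", "metadata": {"_user": "alice"}}, 0.25),
    ({"api_key": "k1", "metadata": {"_user": "bob"}}, 0.50),
    ({"api_key": "k1", "metadata": {"_user": "alice"}}, 0.25),
]:
    usage[bucket_key(group_by, req)] += cost
```

In this sketch the two users behind the same API key accumulate usage separately, which is the effect of grouping by both api_key and metadata._user.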

Authentication

All policy endpoints require authentication. You can authenticate using:
  • API Key: Include it in the x-portkey-api-key header

Permissions

Policies require the following RBAC permissions:
  • policies:create - Create policies
  • policies:read - Read policies
  • policies:update - Update policies
  • policies:delete - Delete policies
  • policies:list - List policies

Base URL

All policy endpoints are under:
/v1/policies
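As a rough illustration, the following Python sketch builds (but does not send) an authenticated request object for a path under /v1/policies. The host name gateway.example.com is a placeholder for your self-hosted Gateway, and the subpath argument is left empty because the exact endpoint paths are documented in the API reference rather than here.

```python
import urllib.request

GATEWAY_HOST = "https://gateway.example.com"  # placeholder: your Gateway's address
POLICIES_BASE = "/v1/policies"


def build_policy_request(
    api_key: str, subpath: str = "", method: str = "GET"
) -> urllib.request.Request:
    """Prepare a request under /v1/policies with the API key header set."""
    url = GATEWAY_HOST + POLICIES_BASE + subpath
    headers = {"x-portkey-api-key": api_key}
    # The request is only constructed here; nothing is sent over the network.
    return urllib.request.Request(url, headers=headers, method=method)


request = build_policy_request("YOUR_PORTKEY_API_KEY")
```

Passing the resulting object to urllib.request.urlopen would perform the call; the key used must belong to a user or role that holds the relevant policies permission.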

Usage Limits Policies

Usage limits policies allow you to set maximum usage (cost or tokens) that can be consumed over a period. When the limit is reached, requests will be blocked until the limit resets.

Policy Types

Usage limits policies support two types:
  • cost: Limit based on total cost (in dollars)
  • tokens: Limit based on total tokens consumed

Parameters

Usage limits policies support the following parameters:

credit_limit (required)

The maximum usage limit that can be consumed before requests are blocked.
  • Type: Number (integer or float)
  • Minimum value:
    • For cost type: 1 (represents $1.00)
    • For tokens type: 100 tokens
  • Units:
    • For cost type: Value is in USD (dollars)
    • For tokens type: Value is in tokens
  • Behavior: When the credit limit is reached, all matching requests will be blocked until the limit resets (if periodic_reset is configured) or the policy is updated

alert_threshold (optional)

An optional threshold that triggers notifications before the credit limit is reached.
  • Type: Number (integer or float)
  • Minimum value: 1
  • Units:
    • For cost type: Value is in USD (dollars)
    • For tokens type: Value is in tokens
  • Validation: Must be less than credit_limit if provided
  • Behavior:
    • When usage reaches this threshold, email notifications are sent to configured recipients
    • The API key continues to function normally until the full credit_limit is reached
    • Useful for proactive monitoring and budget management
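The relationship between alert_threshold and credit_limit can be summarised as a three-state check. The sketch below is a simplified model, assuming that the alert fires and the block applies once usage is greater than or equal to the respective value; the usage_status function and its return labels are hypothetical.

```python
from typing import Optional


def usage_status(
    used: float, credit_limit: float, alert_threshold: Optional[float] = None
) -> str:
    """Classify current usage as "ok", "alert", or "blocked" (illustrative labels)."""
    if used >= credit_limit:
        return "blocked"  # matching requests are blocked at the credit limit
    if alert_threshold is not None and used >= alert_threshold:
        return "alert"    # notifications are sent, requests still go through
    return "ok"
```

With credit_limit 1000 and alert_threshold 800, usage of 500 is "ok", usage of 800 is "alert", and usage of 1000 is "blocked".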

periodic_reset (optional)

Configures automatic reset of the usage limit at regular intervals.
  • Type: String (enum)
  • Valid values:
    • "weekly" - Budget limits automatically reset every week
    • "monthly" - Budget limits automatically reset every month
    • Omitted/not provided - No periodic reset (limit applies until exhausted)
  • Reset timing:
    • Weekly: Resets occur every Monday at 12:00 AM UTC
    • Monthly: Resets occur on the 1st calendar day of each month at 12:00 AM UTC, regardless of when the policy was created
  • Behavior: When a reset occurs, the usage counter resets to zero and the limit becomes available again
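The reset timing above can be worked out with standard date arithmetic. This Python sketch computes the next reset instant strictly after a given time, following the stated rules (Monday 12:00 AM UTC for weekly, the 1st of the month 12:00 AM UTC for monthly). The next_reset function is an illustration only and is not part of the Gateway.

```python
from datetime import datetime, timedelta, timezone


def next_reset(now: datetime, periodic_reset: str) -> datetime:
    """Return the next reset instant after `now` (pass a timezone-aware datetime)."""
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if periodic_reset == "weekly":
        # weekday() is 0 on Monday, so this always lands on the following Monday.
        return midnight + timedelta(days=7 - midnight.weekday())
    if periodic_reset == "monthly":
        if midnight.month == 12:
            return midnight.replace(year=midnight.year + 1, month=1, day=1)
        return midnight.replace(month=midnight.month + 1, day=1)
    raise ValueError("periodic_reset must be 'weekly' or 'monthly'")


wednesday = datetime(2024, 1, 10, 15, 30, tzinfo=timezone.utc)
weekly = next_reset(wednesday, "weekly")    # Monday 2024-01-15 00:00 UTC
monthly = next_reset(wednesday, "monthly")  # 2024-02-01 00:00 UTC
```

Because the monthly reset is tied to the calendar, a policy created on the 28th still resets on the 1st, only a few days later.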

Validation Rules

  1. Conditions: Must be a non-empty array. Each condition must have key and value fields.
  2. Group By: Must be a non-empty array. Each group must have a key field.
  3. Valid Keys: For both conditions and group_by, valid keys are:
    • api_key
    • workspace_id
    • Any key starting with metadata. (e.g., metadata._user)
  4. Alert Threshold: Must be less than credit_limit if provided.
  5. Workspace: Workspace ID is required (can be provided via API key or request body).
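It can be handy to check a policy body locally before sending it. The Python sketch below mirrors rules 1 to 4 together with the parameter minimums listed earlier; it is a client-side approximation, and the server's actual checks and error messages may differ. The workspace rule is left out because the workspace can come from the API key. The validate_usage_policy function is hypothetical.

```python
MIN_CREDIT_LIMIT = {"cost": 1, "tokens": 100}


def _valid_key(key) -> bool:
    return key in ("api_key", "workspace_id") or (
        isinstance(key, str) and key.startswith("metadata.")
    )


def _is_number(value) -> bool:
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_usage_policy(policy: dict) -> list[str]:
    """Return a list of problems; an empty list means the sketch found none."""
    errors = []
    for field, required in (("conditions", ("key", "value")), ("group_by", ("key",))):
        entries = policy.get(field)
        if not isinstance(entries, list) or not entries:
            errors.append(f"{field} must be a non-empty array")
            continue
        for entry in entries:
            if not isinstance(entry, dict) or any(n not in entry for n in required):
                errors.append(f"each {field} entry needs: {', '.join(required)}")
            elif not _valid_key(entry["key"]):
                errors.append(f"invalid {field} key: {entry['key']}")
    policy_type = policy.get("type")
    limit = policy.get("credit_limit")
    if policy_type not in MIN_CREDIT_LIMIT:
        errors.append("type must be 'cost' or 'tokens'")
    elif not _is_number(limit) or limit < MIN_CREDIT_LIMIT[policy_type]:
        errors.append(f"credit_limit must be at least {MIN_CREDIT_LIMIT[policy_type]}")
    threshold = policy.get("alert_threshold")
    if threshold is not None:
        if not _is_number(threshold) or threshold < 1:
            errors.append("alert_threshold must be a number of at least 1")
        elif _is_number(limit) and threshold >= limit:
            errors.append("alert_threshold must be less than credit_limit")
    if policy.get("periodic_reset") not in (None, "weekly", "monthly"):
        errors.append("periodic_reset must be 'weekly' or 'monthly'")
    return errors


policy = {
    "name": "Monthly Cost Limit per API Key",
    "conditions": [{"key": "api_key", "value": "*"}],
    "group_by": [{"key": "api_key"}],
    "type": "cost",
    "credit_limit": 1000.0,
    "alert_threshold": 800.0,
    "periodic_reset": "monthly",
}
problems = validate_usage_policy(policy)
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem in one pass.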

Examples

1: Monthly Cost Limit per API Key

Limit each API key to $1000 per month:
{
  "name": "Monthly Cost Limit per API Key",
  "conditions": [
    {
      "key": "api_key",
      "value": "*"
    }
  ],
  "group_by": [
    {
      "key": "api_key"
    }
  ],
  "type": "cost",
  "credit_limit": 1000.0,
  "alert_threshold": 800.0,
  "periodic_reset": "monthly"
}

2: Token Limit by User

Limit each user (identified by metadata._user) to 1,000,000 tokens per week:
{
  "name": "Token Limit per User",
  "conditions": [
    {
      "key": "api_key",
      "value": "*"
    }
  ],
  "group_by": [
    {
      "key": "metadata._user"
    }
  ],
  "type": "tokens",
  "credit_limit": 1000000,
  "periodic_reset": "weekly"
}

Rate Limits Policies

Rate limits policies allow you to control the rate of requests or tokens consumed per minute, hour, or day. When the rate limit is exceeded, requests will be throttled.

Policy Types

Rate limits policies support two types:
  • requests: Limit based on number of requests
  • tokens: Limit based on number of tokens

Rate Units

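Rate limits can be specified in three units. The unit names are the same for both policy types, so rpm with type tokens means tokens per minute:
  • rpm: Requests or tokens per minute
  • rph: Requests or tokens per hour
  • rpd: Requests or tokens per day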

Parameters

Rate limits policies support the following parameters:

type (required)

The type of rate limit to enforce.
  • Type: String (enum)
  • Valid values:
    • "requests" - Limit based on number of API requests
    • "tokens" - Limit based on number of tokens consumed
  • Behavior: Determines what metric is being rate-limited

unit (required)

The time interval unit for the rate limit.
  • Type: String (enum)
  • Valid values:
    • "rpm" - Requests/Tokens per minute
    • "rph" - Requests/Tokens per hour
    • "rpd" - Requests/Tokens per day
  • Behavior:
    • Defines the time window over which the rate limit is calculated
    • Limits reset automatically at the start of each time period
    • Per Minute: Limits reset every minute, ideal for fine-grained control
    • Per Hour: Limits reset hourly, providing balanced usage control
    • Per Day: Limits reset daily, suitable for broader usage patterns

value (required)

The maximum number of requests or tokens allowed within the specified time unit.
  • Type: Number (integer)
  • Minimum value: 1 (setting to 0 effectively disables the policy)
  • Units:
    • For requests type: Value represents the number of API requests
    • For tokens type: Value represents the number of tokens
  • Behavior:
    • When the rate limit is exceeded, subsequent requests are throttled/rejected until the time period resets
    • The limit applies immediately after the policy is created
    • Rate limits are enforced on a rolling window basis within the specified time unit
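To show how type, unit, and value interact, here is a small Python sketch of a fixed-window counter. The text above mentions both a reset at the start of each period and a rolling window; this sketch models only the first reading and should not be taken as the Gateway's actual algorithm. The FixedWindowLimiter class, the bucket argument, and the in-memory storage are all assumptions made for illustration.

```python
WINDOW_SECONDS = {"rpm": 60, "rph": 3600, "rpd": 86400}


class FixedWindowLimiter:
    """Count requests or tokens per bucket within fixed time windows."""

    def __init__(self, unit: str, value: int) -> None:
        self.window = WINDOW_SECONDS[unit]
        self.value = value
        # (bucket, window index) -> amount used; old windows are never purged here.
        self.counts: dict[tuple, int] = {}

    def allow(self, bucket: str, now: float, amount: int = 1) -> bool:
        """Record `amount` (1 per request, or a token count) if it fits the limit."""
        key = (bucket, int(now // self.window))
        used = self.counts.get(key, 0)
        if used + amount > self.value:
            return False  # over the limit for this window: reject
        self.counts[key] = used + amount
        return True


limiter = FixedWindowLimiter("rpm", 2)
results = [
    limiter.allow("key-1", now=0.0),   # first request in the window
    limiter.allow("key-1", now=10.0),  # second request, still within the limit
    limiter.allow("key-1", now=20.0),  # third request in the same minute
    limiter.allow("key-1", now=60.0),  # a new minute has started
]
```

For a tokens policy the same counter applies, with amount set to the number of tokens consumed by the request instead of 1.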

Validation Rules

  1. Conditions: Must be a non-empty array. Each condition must have key and value fields.
  2. Group By: Must be a non-empty array. Each group must have a key field.
  3. Valid Keys: For both conditions and group_by, valid keys are:
    • api_key
    • workspace_id
    • Any key starting with metadata. (e.g., metadata._user)
  4. Value: Must be a numeric value (an integer of at least 1, as described under the value parameter).
  5. Workspace: Workspace ID is required (can be provided via API key or request body).
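The rules above, combined with the parameter definitions, can be approximated with a local check before a policy is submitted. This Python sketch is illustrative only: the validate_rate_policy function is hypothetical, the server may apply additional or different checks, and the workspace rule is omitted because the workspace can be derived from the API key.

```python
VALID_TYPES = ("requests", "tokens")
VALID_UNITS = ("rpm", "rph", "rpd")


def _valid_key(key) -> bool:
    return key in ("api_key", "workspace_id") or (
        isinstance(key, str) and key.startswith("metadata.")
    )


def validate_rate_policy(policy: dict) -> list[str]:
    """Return a list of problems; an empty list means the sketch found none."""
    errors = []
    for field, required in (("conditions", ("key", "value")), ("group_by", ("key",))):
        entries = policy.get(field)
        if not isinstance(entries, list) or not entries:
            errors.append(f"{field} must be a non-empty array")
            continue
        for entry in entries:
            if not isinstance(entry, dict) or any(n not in entry for n in required):
                errors.append(f"each {field} entry needs: {', '.join(required)}")
            elif not _valid_key(entry["key"]):
                errors.append(f"invalid {field} key: {entry['key']}")
    if policy.get("type") not in VALID_TYPES:
        errors.append("type must be 'requests' or 'tokens'")
    if policy.get("unit") not in VALID_UNITS:
        errors.append("unit must be 'rpm', 'rph', or 'rpd'")
    value = policy.get("value")
    if not isinstance(value, int) or isinstance(value, bool) or value < 1:
        errors.append("value must be an integer of at least 1")
    return errors


rate_policy = {
    "name": "100 RPM per API Key",
    "conditions": [{"key": "api_key", "value": "*"}],
    "group_by": [{"key": "api_key"}],
    "type": "requests",
    "unit": "rpm",
    "value": 100,
}
rate_problems = validate_rate_policy(rate_policy)
```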

Examples

1: Requests per Minute per API Key

Limit each API key to 100 requests per minute:
{
  "name": "100 RPM per API Key",
  "conditions": [
    {
      "key": "api_key",
      "value": "*"
    }
  ],
  "group_by": [
    {
      "key": "api_key"
    }
  ],
  "type": "requests",
  "unit": "rpm",
  "value": 100
}

2: Tokens per Hour by User

Limit tokens per user (via metadata) to 10,000 per hour:
{
  "name": "10K Tokens per Hour per User",
  "conditions": [
    {
      "key": "api_key",
      "value": "*"
    }
  ],
  "group_by": [
    {
      "key": "metadata._user"
    }
  ],
  "type": "tokens",
  "unit": "rph",
  "value": 10000
}

3: Daily Request Limit

Limit each workspace to 10,000 requests per day:
{
  "name": "10K Requests per Day",
  "conditions": [
    {
      "key": "api_key",
      "value": "*"
    }
  ],
  "group_by": [
    {
      "key": "workspace_id"
    }
  ],
  "type": "requests",
  "unit": "rpd",
  "value": 10000
}

API Reference

For detailed API documentation, see the following endpoints:

Usage Limits Policies
  • Create Usage Limits Policy
  • List Usage Limits Policies
  • Retrieve Usage Limits Policy
  • Update Usage Limits Policy
  • Delete Usage Limits Policy

Rate Limits Policies
  • Create Rate Limits Policy
  • List Rate Limits Policies
  • Retrieve Rate Limits Policy
  • Update Rate Limits Policy
  • Delete Rate Limits Policy