Multi-Tenant LLM Access with Enterprise Control for Customer-Facing Apps
Modern SaaS platforms face a critical challenge when adding AI capabilities: how do you provide thousands of customers with access to powerful language models while maintaining predictable costs, ensuring fair resource allocation, and preventing abuse?

Consider a typical scenario: your B2B SaaS platform serves 5,000 users across 200 customer accounts. Each customer wants to leverage AI for different use cases - some need GPT-4 for complex analysis, others require Claude for writing tasks, and many are satisfied with GPT-3.5 for basic queries. Without proper infrastructure, you risk:
Uncapped Costs: A single customer consuming $10,000 in API credits overnight
Resource Contention: Power users monopolizing API capacity
Security Vulnerabilities: Shared API keys creating audit and compliance issues
Poor User Experience: No visibility into usage limits or remaining budgets
Operational Complexity: Managing multiple provider APIs without unified controls
This guide demonstrates how to build production-ready AI features that give your customers flexibility while maintaining enterprise-grade control and predictability.
Before diving into configuration, let’s understand how Portkey organizes access control. The platform uses a three-tier hierarchy that maps naturally to business structures:
```
Organization (Your Entire SaaS Platform)
├── Workspaces (Logical Groups - Teams, Tiers, or Regions)
│   ├── Enterprise Customers Workspace
│   ├── Professional Customers Workspace
│   └── Starter Customers Workspace
└── API Keys (Individual Access Points)
    ├── Each customer gets their own API key
    ├── Keys inherit workspace settings
    └── Keys can have individual budget and rate limits
```
Why this structure?
Organization: Sets platform-wide policies and defaults
Workspaces: Groups customers with similar needs (e.g., all Enterprise customers)
API Keys: Individual access points with customer-specific limits
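To make the inheritance concrete, here is a minimal, hypothetical sketch in plain Python (not the Portkey SDK) of how an API key can fall back to its workspace defaults while still allowing per-key overrides. All class and field names here are illustrative:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Workspace:
    name: str
    default_budget_usd: float  # monthly budget used when a key sets none
    default_rpm: int           # requests-per-minute default


@dataclass
class ApiKey:
    customer: str
    workspace: Workspace
    budget_usd: Optional[float] = None  # per-key override, if any
    rpm: Optional[int] = None

    def effective_limits(self) -> dict:
        """A key inherits workspace defaults unless it overrides them."""
        return {
            "budget_usd": self.budget_usd if self.budget_usd is not None
                          else self.workspace.default_budget_usd,
            "rpm": self.rpm if self.rpm is not None
                   else self.workspace.default_rpm,
        }


enterprise = Workspace("Enterprise Customers", default_budget_usd=1000.0, default_rpm=300)
acme_key = ApiKey("acme-corp", enterprise)                   # pure inheritance
globex_key = ApiKey("globex", enterprise, budget_usd=250.0)  # per-key budget override

print(acme_key.effective_limits())    # {'budget_usd': 1000.0, 'rpm': 300}
print(globex_key.effective_limits())  # {'budget_usd': 250.0, 'rpm': 300}
```

The same resolution order (key override first, then workspace default, then organization policy) is what keeps per-customer configuration small: most keys carry no overrides at all.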
First, let's set up some organization-level properties in Portkey. Navigate to your organization settings to establish platform-wide rules.
What is metadata? Metadata in Portkey is a set of custom key-value pairs attached to every AI request. Think of them as tags that help you track who's using AI, how they're using it, and what it's costing you. This becomes crucial when you have thousands of customers making millions of requests. For example, metadata helps you answer questions like which customer drove last month's spend, which users hit their limits, and which product feature generates the most requests.
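As a quick illustration of the shape of this data, the sketch below builds a per-request metadata dictionary and serializes it to JSON, which is the general form in which metadata travels with a request (Portkey's SDKs can attach it for you; the specific keys here are your own convention, not required names):

```python
import json

# Illustrative metadata for one customer request; key names are up to you.
metadata = {
    "customer_id": "acme-corp",
    "user_email": "jane@acme.com",
    "feature": "report-summarizer",
    "plan": "enterprise",
}

# Metadata is sent alongside the request as JSON; this shows the payload
# shape only, not an actual API call.
header_value = json.dumps(metadata)
print(header_value)
```

Consistent key names across your application (always `customer_id`, always `user_email`) are what make later filtering and cost attribution possible, as the limit-lookup example later in this guide relies on.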
Workspaces are containers that group customers with similar characteristics. Let's create workspaces for different customer tiers. To create a new workspace:
Click on your current workspace name in the sidebar
Integrations securely store your provider credentials while enabling controlled access. Think of them as secure vaults that your workspaces can access without ever exposing the actual API keys.
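The practical effect for your application code is that customers only ever hold a Portkey API key; which provider credentials and models that key can reach is decided server-side by the workspace and integration configuration. A hypothetical local sketch of that kind of tier-based model gating (the mapping and function are illustrative, not Portkey's API):

```python
# Hypothetical tier -> allowed-model mapping, checked before a request
# is forwarded to a provider. Model names here are examples only.
TIER_MODELS = {
    "enterprise": {"gpt-4", "claude-3-opus", "gpt-3.5-turbo"},
    "professional": {"claude-3-opus", "gpt-3.5-turbo"},
    "starter": {"gpt-3.5-turbo"},
}


def is_model_allowed(tier: str, model: str) -> bool:
    """Return True if the customer's tier permits the requested model."""
    return model in TIER_MODELS.get(tier, set())


print(is_model_allowed("starter", "gpt-4"))     # False
print(is_model_allowed("enterprise", "gpt-4"))  # True
```

Keeping this decision out of client code means you can change a tier's model list in one place without redeploying customer-facing applications.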
Provide customers with real-time access to their available models:
```python
from portkey_ai import Portkey

client = Portkey(api_key="your-customer-api-key")

models = client.models.list()

# Extract the data array from the response
model_data = models.data if hasattr(models, 'data') else models

# Create an organized structure with just the model IDs
organized = {"ids": []}
for model in model_data:
    organized["ids"].append(model.id)

print(organized)
```
```python
from portkey_ai import Portkey


def get_user_limits(workspace_slug, portkey_api_key, user_email):
    """Get rate and usage limits for a user by email."""
    portkey = Portkey(api_key=portkey_api_key)
    api_keys = portkey.api_keys.list(workspace_id=workspace_slug)

    # Filter by user email in metadata
    for key in api_keys.get('data', []):
        metadata = key.get('defaults', {}).get('metadata') or key.get('metadata')
        # Check whether the metadata contains a matching user_email
        if metadata and isinstance(metadata, dict) and metadata.get('user_email') == user_email:
            print(f"User: {user_email}")
            print(f"API Key: {key.get('name')}")
            # Rate limits
            for limit in key.get('rate_limits', []):
                print(f"Rate Limit: {limit['value']} {limit['unit']}")
            # Usage limits
            usage = key.get('usage_limits') or {}
            print(f"Usage Limit: ${usage.get('credit_limit')} {usage.get('periodic_reset')}")
            print(f"Alert Threshold: {usage.get('alert_threshold')}%")
            return

    # If no metadata match, show the first available key's limits
    print(f"No metadata match for {user_email}. Showing available limits:")
    if api_keys.get('data'):
        key = api_keys['data'][0]
        for limit in key.get('rate_limits', []):
            print(f"Rate Limit: {limit['value']} {limit['unit']}")
        usage = key.get('usage_limits') or {}
        print(f"Usage Limit: ${usage.get('credit_limit')} {usage.get('periodic_reset')}")
        print(f"Alert Threshold: {usage.get('alert_threshold')}%")


# Usage
if __name__ == "__main__":
    get_user_limits(
        workspace_slug="your-workspace-slug",
        portkey_api_key="your-portkey-admin-api-key",
        # This example assumes your user API keys carry a user_email metadata value
        user_email="your-customer-email-metadata-value",
    )

# Expected output for the example data:
# Rate Limit: 100 rpm
# Usage Limit: $100 monthly
# Alert Threshold: 80%
```
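Once you can read a key's limits, you can surface them in your own product UI. For instance, a small hypothetical helper that turns current spend and a credit limit into a status your frontend can render (the function name and threshold are illustrative):

```python
def budget_status(spend_usd: float, credit_limit_usd: float,
                  alert_threshold_pct: float = 80.0) -> str:
    """Classify a customer's spend relative to their budget."""
    if credit_limit_usd <= 0:
        return "no-budget-set"
    used_pct = 100.0 * spend_usd / credit_limit_usd
    if used_pct >= 100.0:
        return "exceeded"
    if used_pct >= alert_threshold_pct:
        return "warning"
    return "ok"


print(budget_status(35.0, 100.0))   # ok
print(budget_status(85.0, 100.0))   # warning
print(budget_status(120.0, 100.0))  # exceeded
```

Matching your in-app warning threshold to the alert threshold configured on the key (80% in the example above) keeps what customers see consistent with the emails Portkey sends.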
You’ve successfully built a multi-tenant AI infrastructure that provides:
Individual customer control with per-user API keys and budgets
Tiered access to models based on subscription levels
Automatic enforcement of spending and rate limits
Complete visibility into usage patterns and costs
Enterprise security with encrypted keys and audit trails
Your customers get powerful AI capabilities with transparent limits. Your business gets predictable costs and complete control. Your engineering team gets a simple, maintainable solution.