PluginCompass Provider Management & Fallback System
Overview
This document describes the architecture and functionality of the PluginCompass AI provider management system, including the admin panel configuration, provider limits, usage tracking, and fallback mechanisms.
1. Admin Panel Structure
1.1 Admin Panel Sections
The PluginCompass admin panel is accessible at /admin and provides the following management areas:
Main Admin Areas:
- Build Models (/admin/build) - Configure AI models available to users
- Plan Models (/admin/plan) - Configure planning provider chain
- Plans (/admin/plans) - Manage subscription plans and pricing
- Accounts (/admin/accounts) - User account management
- Affiliates (/admin/affiliates) - Affiliate program management
- Withdrawals (/admin/withdrawals) - Affiliate payout management
- Tracking (/admin/tracking) - Analytics and usage statistics
- Resources (/admin/resources) - System resource monitoring
- External Testing (/admin/external-testing) - WordPress testing configuration
- Contact Messages (/admin/contact-messages) - Customer inquiries
1.2 Admin Authentication
The admin panel uses session-based authentication with the following security measures:
- Credentials: Configured via environment variables ADMIN_USER and ADMIN_PASSWORD
- Session Duration: 24 hours (configurable via ADMIN_SESSION_TTL_MS)
- Rate Limiting: Maximum 5 login attempts per minute per IP
- Account Lockout: 15 minutes after failed attempts
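The login limits above can be sketched as a per-IP sliding window. This is a minimal illustration, not the PluginCompass implementation: the `LoginLimiter` class and its method names are hypothetical, and it assumes the lockout starts once five failures land inside one minute, which the text does not state explicitly.

```python
import time
from collections import defaultdict, deque

MAX_ATTEMPTS = 5            # attempts allowed per window (from the text)
WINDOW_SECONDS = 60         # one-minute window (from the text)
LOCKOUT_SECONDS = 15 * 60   # 15-minute lockout (from the text)


class LoginLimiter:
    """Hypothetical per-IP login limiter; names are illustrative only."""

    def __init__(self):
        self._failures = defaultdict(deque)  # ip -> timestamps of failures
        self._locked_until = {}              # ip -> time the lockout ends

    def _trim(self, ip, now):
        # Drop failures that have left the sliding window.
        window = self._failures[ip]
        while window and now - window[0] >= WINDOW_SECONDS:
            window.popleft()
        return window

    def allow(self, ip, now=None):
        """Return True if this IP may attempt a login at time `now`."""
        now = time.time() if now is None else now
        if self._locked_until.get(ip, 0) > now:
            return False
        return len(self._trim(ip, now)) < MAX_ATTEMPTS

    def record_failure(self, ip, now=None):
        """Record a failed login; lock the IP once the window is full."""
        now = time.time() if now is None else now
        window = self._trim(ip, now)
        window.append(now)
        if len(window) >= MAX_ATTEMPTS:
            # Assumed trigger: the fifth failure within the window.
            self._locked_until[ip] = now + LOCKOUT_SECONDS
            window.clear()
```

Passing `now` explicitly keeps the class easy to test; in a server it would default to the wall clock.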
API Authentication: All admin API endpoints require authentication via session cookies. The endpoints include:
- Login: POST /api/admin/login
- Logout: POST /api/admin/logout
- Session check: GET /api/admin/me
2. Provider Management
2.1 Supported Providers
PluginCompass supports multiple AI providers for both planning and building:
Build Providers:
- OpenRouter (primary aggregator)
- Mistral
- Google (Gemini)
- Groq
- NVIDIA (NIM)
- Chutes AI
- OpenCode (internal/self-hosted)
- Ollama (self-hosted planning)
Planning Providers:
- OpenRouter
- Mistral
- Google (Gemini)
- Groq
- NVIDIA (NIM)
- Ollama (self-hosted)
2.2 Provider Configuration
Each provider requires API credentials configured via environment variables:
| Provider | Environment Variable | Default API URL |
|---|---|---|
| OpenRouter | OPENROUTER_API_KEY | https://openrouter.ai/api/v1 |
| Mistral | MISTRAL_API_KEY | https://api.mistral.ai/v1 |
| Google | GOOGLE_API_KEY | https://generativelanguage.googleapis.com/v1beta2 |
| Groq | GROQ_API_KEY | https://api.groq.com/openai/v1/chat/completions |
| NVIDIA | NVIDIA_API_KEY | https://api.nvidia.com/v1 |
| Chutes AI | CHUTES_API_KEY or PLUGIN_COMPASS_CHUTES_API_KEY | https://api.chutes.ai/v1 |
| Ollama | OLLAMA_API_URL | Configurable self-hosted URL |
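As a rough illustration of how the table could translate into startup code, the sketch below reads the listed environment variables and returns only the providers that have credentials. The `PROVIDERS` layout and the `configured_providers` function are assumptions made for this example; so is the rule that `CHUTES_API_KEY` wins when both Chutes variables are set.

```python
import os

# Variable names and default URLs are copied from the table above.
# The dictionary layout itself is hypothetical.
PROVIDERS = {
    "openrouter": (["OPENROUTER_API_KEY"], "https://openrouter.ai/api/v1"),
    "mistral": (["MISTRAL_API_KEY"], "https://api.mistral.ai/v1"),
    "google": (["GOOGLE_API_KEY"],
               "https://generativelanguage.googleapis.com/v1beta2"),
    "groq": (["GROQ_API_KEY"],
             "https://api.groq.com/openai/v1/chat/completions"),
    "nvidia": (["NVIDIA_API_KEY"], "https://api.nvidia.com/v1"),
    "chutes": (["CHUTES_API_KEY", "PLUGIN_COMPASS_CHUTES_API_KEY"],
               "https://api.chutes.ai/v1"),
}


def configured_providers(env=None):
    """Return {provider: {"api_key", "base_url"}} for providers with credentials."""
    env = os.environ if env is None else env
    result = {}
    for name, (key_vars, url) in PROVIDERS.items():
        # Assumption: the first non-empty variable in the list takes precedence.
        key = next((env[v] for v in key_vars if env.get(v)), None)
        if key:
            result[name] = {"api_key": key, "base_url": url}
    # Ollama is configured by URL rather than by API key.
    if env.get("OLLAMA_API_URL"):
        result["ollama"] = {"api_key": None, "base_url": env["OLLAMA_API_URL"]}
    return result
```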
2.3 Model Discovery
The system automatically discovers available models from each provider:
- CLI-based Discovery: Queries OpenCode CLI for available models
- Provider API Discovery: Fetches model lists directly from provider APIs
- Manual Configuration: Admin can manually add model configurations
Model Configuration per Provider: Each configured model includes:
- Model identifier (provider/model format)
- Display label for users
- Tier classification (free, plus, pro)
- Icon association
- Provider priority order
- Media support flag (image uploads)
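The fields listed above map naturally onto a small record type. The following is a sketch only: the `ModelConfig` class and its field names are hypothetical, chosen to mirror the list, and the validation rules are assumptions.

```python
from dataclasses import dataclass, field

TIERS = ("free", "plus", "pro")  # tier names from the text


@dataclass
class ModelConfig:
    """Hypothetical shape of one configured model."""
    model_id: str                       # "provider/model" format
    label: str                          # display label shown to users
    tier: str = "free"
    icon: str = ""
    provider_priority: list = field(default_factory=list)
    supports_media: bool = False        # image uploads allowed

    def __post_init__(self):
        if self.tier not in TIERS:
            raise ValueError(f"unknown tier: {self.tier!r}")
        if "/" not in self.model_id:
            raise ValueError("model_id must use provider/model format")

    @property
    def provider(self):
        # Everything before the first slash names the provider.
        return self.model_id.split("/", 1)[0]
```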
3. Provider Limits & Usage Tracking
3.1 Rate Limiting System
PluginCompass implements a flexible rate limiting system with the following components:
Limit Types:
- Tokens per Minute (TPM): Token consumption rate limit
- Tokens per Day (TPD): Daily token consumption limit
- Requests per Minute (RPM): API call rate limit
- Requests per Day (RPD): Daily API call limit
Scope Levels:
- Per Provider: Limits apply to all models from that provider
- Per Model: Limits apply to specific models only
Default Behavior:
- All limits default to 0 (unlimited)
- Limits are configurable per provider or per model
- Usage is tracked independently for each provider
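One way to picture the limit types and scope levels together is shown below. The class names, the `scope` field, and the choice to treat a model without its own entry as unlimited are all assumptions for illustration; only the four limit types and the "0 means unlimited" rule come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Limits:
    tpm: int = 0  # tokens per minute, 0 = unlimited
    tpd: int = 0  # tokens per day
    rpm: int = 0  # requests per minute
    rpd: int = 0  # requests per day


@dataclass
class ProviderLimits:
    """Hypothetical limit record for one provider."""
    scope: str = "provider"                       # "provider" or "model"
    provider: Limits = field(default_factory=Limits)
    per_model: dict = field(default_factory=dict)  # model name -> Limits

    def limits_for(self, model):
        """Pick the limits that apply to `model` under the current scope."""
        if self.scope == "model":
            # Assumption: a model with no entry falls back to unlimited.
            return self.per_model.get(model, Limits())
        return self.provider


def exceeds(limit, used):
    """A limit of 0 means unlimited, so it can never be exceeded."""
    return limit > 0 and used >= limit
```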
3.2 Usage Tracking
The system tracks usage in real-time with the following characteristics:
Tracked Metrics:
- Tokens consumed per request
- Number of API requests
- Timestamps for rate window calculation
- Per-model breakdown when scoped
Data Retention:
- Usage data retained for 48 hours for rate limiting
- Aggregated statistics persisted for reporting
- Separate tracking for planning vs building
State Files:
- provider-limits.json: Saved limit configurations
- provider-usage.json: Recent usage data
- token-usage.json: User token consumption
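The 48-hour retention and the planning/building split might look like the sketch below. The entry shape (`ts`, `tokens`, `kind`) is hypothetical; the document does not describe the actual layout of provider-usage.json.

```python
RETENTION_SECONDS = 48 * 60 * 60  # 48 hours, from the text


def prune_usage(entries, now):
    """Keep only entries younger than the retention period."""
    return [e for e in entries if now - e["ts"] < RETENTION_SECONDS]


def tokens_by_kind(entries):
    """Sum tokens separately for planning and building requests."""
    totals = {"plan": 0, "build": 0}
    for entry in entries:
        totals[entry["kind"]] += entry["tokens"]
    return totals
```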
3.3 Limit Enforcement
When a request is made, the system:
- Identifies the provider and model
- Checks applicable limits (provider-level or model-level)
- Compares current usage against limits
- Returns rate limit error if exceeded
- Records usage after successful requests
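The enforcement steps above can be sketched with a simple in-memory tracker. `UsageTracker`, `check_limits`, and the dictionary of limit names are hypothetical; the sketch assumes sliding windows of 60 seconds and 24 hours, while the real system may use fixed windows instead.

```python
import time
from collections import deque

MINUTE = 60
DAY = 24 * 60 * 60


class UsageTracker:
    """Hypothetical in-memory usage log for one provider or model."""

    def __init__(self):
        self._events = deque()  # (timestamp, tokens), oldest first

    def record(self, tokens, now=None):
        """Record usage after a successful request."""
        now = time.time() if now is None else now
        self._events.append((now, tokens))

    def totals(self, window, now):
        """Return (request_count, token_count) within the last `window` seconds."""
        recent = [(t, n) for t, n in self._events if now - t < window]
        return len(recent), sum(n for _, n in recent)


def check_limits(tracker, limits, now=None):
    """Return the name of the first exceeded limit, or None if the request may proceed."""
    now = time.time() if now is None else now
    req_min, tok_min = tracker.totals(MINUTE, now)
    req_day, tok_day = tracker.totals(DAY, now)
    usage = {"rpm": req_min, "tpm": tok_min, "rpd": req_day, "tpd": tok_day}
    for name, used in usage.items():
        limit = limits.get(name, 0)
        if limit > 0 and used >= limit:  # 0 means unlimited
            return name
    return None
```

A caller would run `check_limits` before dispatching the request, return a rate limit error when it yields a name, and call `record` only after the provider responds successfully.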
4. Fallback System
4.1 Fallback Architecture
PluginCompass implements a multi-level fallback system for reliability:
Fallback Levels:
- Model-level Fallback: Alternative models within same provider
- Provider-level Fallback: Switch to alternative providers
- Ultimate Backup: Final fallback model when all providers fail
4.2 Model Fallback Chain
Each provider has a configured fallback chain:
OpenRouter:
Primary Model → Backup 1 → Backup 2 → Backup 3 → Environment Fallbacks → Static Fallbacks
Mistral:
Primary Model → Backup 1 → Backup 2 → Backup 3 → Default (mistral-large-latest)
Groq:
llama-3.3-70b-versatile → mixtral-8x7b-32768 → llama-3.1-70b-versatile
Google (Gemini):
gemini-1.5-flash → gemini-1.5-pro → gemini-pro
NVIDIA (NIM):
meta/llama-3.1-70b-instruct → meta/llama-3.1-8b-instruct
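A chain like these can be expressed as an ordered list that is walked without repeats. In the sketch below the model names come from the chains above, while the `STATIC_CHAINS` table, the `candidates` function, and the idea of appending the static chain after the admin-configured primary and backups are assumptions.

```python
# Static chains as listed above; the data structure is hypothetical.
STATIC_CHAINS = {
    "mistral": ["mistral-large-latest"],
    "groq": ["llama-3.3-70b-versatile", "mixtral-8x7b-32768",
             "llama-3.1-70b-versatile"],
    "google": ["gemini-1.5-flash", "gemini-1.5-pro", "gemini-pro"],
    "nvidia": ["meta/llama-3.1-70b-instruct", "meta/llama-3.1-8b-instruct"],
}


def candidates(provider, primary=None, backups=()):
    """Yield models to try, in order, skipping empty slots and duplicates."""
    seen = set()
    for model in [primary, *backups, *STATIC_CHAINS.get(provider, [])]:
        if model and model not in seen:
            seen.add(model)
            yield model
```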
4.3 Planning Chain Fallback
The planning system follows a configured priority chain:
- Attempts first provider in chain
- If rate limited or error occurs, moves to next provider
- Continues through all configured providers
- Returns error if all providers fail
Planning Chain Configuration:
- Configurable via admin panel
- Each entry: { provider, model }
- Priority determines fallback order
- Supports provider prefix in model names (e.g., "groq/compound-mini")
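The planning chain behavior can be sketched as a loop over `{ provider, model }` entries. Everything named here (`ProviderError`, `split_model`, `run_planning_chain`, and the `call` callback) is hypothetical, and the sketch assumes a provider prefix is simply stripped when it matches the entry's provider.

```python
class ProviderError(Exception):
    """Hypothetical error raised when a provider call fails or is rate limited."""


def split_model(entry):
    """Strip a leading "provider/" prefix from the model name if present."""
    provider, model = entry["provider"], entry["model"]
    prefix, sep, rest = model.partition("/")
    if sep and prefix == provider:
        model = rest
    return provider, model


def run_planning_chain(chain, call, prompt):
    """Try each chain entry in priority order; raise if every provider fails."""
    errors = []
    for entry in chain:
        provider, model = split_model(entry)
        try:
            return call(provider, model, prompt)
        except ProviderError as exc:
            # Rate limit or provider error: note it and move to the next entry.
            errors.append(f"{provider}/{model}: {exc}")
    raise RuntimeError("all planning providers failed: " + "; ".join(errors))
```

Because the prefix is only stripped when it equals the provider name, a model such as meta/llama-3.1-70b-instruct under NVIDIA keeps its full identifier.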
4.4 Build Fallback Chain
For building operations, the system follows this sequence:
- Primary Model: User-selected or auto-assigned model
- Provider Fallback: Alternative providers with same model
- OpenCode Fallback: Internal OpenCode processing
- Ultimate Backup: Configured backup model (last resort)
4.5 Error Classification & Fallback Decision
Errors are classified to determine fallback behavior:
| Error Type | Example | Fallback Action |
|---|---|---|
| Rate Limit (429) | "Too many requests" | Wait 30s, switch provider |
| Server Error (5xx) | "Internal error" | Wait 30s, switch provider |
| Auth Error (401) | "Invalid API key" | Switch immediately |
| Billing Error (402) | "Insufficient credits" | Switch immediately |
| Model Not Found (404) | "Unknown model" | Wait 30s, switch model |
| User Error (400) | "Invalid request" | Return error, no fallback |
| Token Limit | "Context length exceeded" | Return error, no fallback |
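The table above amounts to a mapping from error to action, which might be coded as follows. The `Decision` type and `classify` function are hypothetical. Detecting token-limit errors by message text, treating "switch immediately" as a provider switch, and returning unlisted status codes without fallback are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    fallback: bool      # whether to try something else
    wait_seconds: int   # delay before the next attempt
    target: str         # "provider", "model", or "none"


NO_FALLBACK = Decision(False, 0, "none")


def classify(status, message=""):
    """Map an HTTP status and error message to a fallback decision."""
    # Assumption: token-limit errors are recognized by their message text.
    if "context length" in message.lower():
        return NO_FALLBACK
    if status == 429 or 500 <= status <= 599:
        return Decision(True, 30, "provider")
    if status in (401, 402):
        return Decision(True, 0, "provider")
    if status == 404:
        return Decision(True, 30, "model")
    # 400 and anything unlisted: surface the error to the caller.
    return NO_FALLBACK
```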
Continue Mechanism:
- For early terminations, system sends "continue" message
- Retries up to 3 times with same model
- After 3 failures, switches to fallback model
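A possible reading of the continue mechanism is sketched below. The `send` callback, its `(text, finished)` return shape, and the function name are hypothetical. The sketch assumes one initial attempt plus up to three "continue" retries per model, and that the fallback model restarts from the original prompt; the text leaves both points open.

```python
MAX_CONTINUES = 3  # retries with the same model, from the text


def run_with_continue(send, model, fallback_model, prompt):
    """Return (model_used, full_text), switching models after repeated early stops.

    `send(model, messages)` is a hypothetical callback returning
    (text, finished), where finished is False on early termination.
    """
    for current in (model, fallback_model):
        messages = [prompt]
        parts = []
        for _ in range(1 + MAX_CONTINUES):
            text, finished = send(current, messages)
            parts.append(text)
            if finished:
                return current, "".join(parts)
            # Ask the same model to pick up where it stopped.
            messages = messages + [text, "continue"]
    raise RuntimeError("response never completed on primary or fallback model")
```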
5. Admin Configuration Interface
5.1 Model Management
The admin panel allows configuration of:
- Add/Update Models: Select from discovered models or add custom
- Provider Priority: Drag-and-drop reordering of provider fallback order
- Tier Assignment: free (1x), plus (2x), pro (3x) multipliers
- Icon Selection: Associate icons with models
- Media Support: Enable/disable image upload capability
5.2 Provider Limits Configuration
The limits interface provides:
- Provider Selection: Dropdown for each configured provider
- Scope Selection: Per-provider or per-model limits
- Limit Inputs: Numeric fields for TPM, TPD, RPM, RPD
- Live Usage Display: Current usage statistics per provider
- Save/Reset: Persist or revert limit changes
5.3 Ultimate Backup Configuration
The admin can configure a final fallback model that is used when all other providers fail. This keeps the system available during widespread outages, provided the backup model itself remains reachable.
6. Configuration Files
6.1 Environment Variables
Key configuration is done via environment variables:
# Provider API Keys
OPENROUTER_API_KEY=
MISTRAL_API_KEY=
GOOGLE_API_KEY=
GROQ_API_KEY=
NVIDIA_API_KEY=
CHUTES_API_KEY=
# Admin Authentication
ADMIN_USER=
ADMIN_PASSWORD=
# Rate Limiting
ADMIN_LOGIN_RATE_LIMIT=5
USER_LOGIN_RATE_LIMIT=10
API_RATE_LIMIT=100
6.2 Runtime State
The system maintains runtime state in:
- .data/.opencode-chat/provider-limits.json - Persisted limits
- .data/.opencode-chat/provider-usage.json - Recent usage
- In-memory state for active sessions and rate tracking
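Reading and writing these JSON state files could follow the pattern below. The directory path comes from the text; the helper names and the write-then-rename strategy are assumptions, shown because an atomic replace avoids leaving a half-written file if the process stops mid-save.

```python
import json
import os
import tempfile
from pathlib import Path

STATE_DIR = Path(".data/.opencode-chat")  # directory named in the text


def load_state(path, default):
    """Load a JSON state file, returning `default` if it is missing or corrupt."""
    try:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    except (FileNotFoundError, json.JSONDecodeError):
        return default


def save_state(path, data):
    """Write JSON to a temporary file, then atomically replace the target."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(data, fh)
    os.replace(tmp, path)
```

For example, `save_state(STATE_DIR / "provider-limits.json", limits)` would persist a limits dictionary, and `load_state(STATE_DIR / "provider-limits.json", {})` would read it back on startup.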
7. Security Considerations
7.1 Rate Limiting
- Login Protection: 5 attempts/minute, 15-minute lockout
- API Protection: 100 requests/minute per user
- Provider Protection: Configurable limits prevent abuse
7.2 Authentication
- Session-based auth with secure cookies
- OAuth support for Google and GitHub
- Rate-limited login attempts
- Session timeout enforcement
7.3 Data Protection
- Provider API keys stored securely
- Usage data retained only as needed
- No sensitive data in logs
- Encrypted session storage
8. Monitoring & Analytics
8.1 Tracking Metrics
The system tracks:
- User Analytics: Session duration, feature usage, model preferences
- Business Metrics: MRR, LTV, churn rate, CAC
- Technical Metrics: AI response times, error rates, queue wait times
- Provider Metrics: Per-provider usage and error rates
8.2 Admin Dashboard
The tracking page provides:
- Daily/weekly/monthly active users
- Revenue analytics
- Conversion funnels
- Error rate monitoring
- Resource utilization
9. Summary
PluginCompass provides a robust, multi-provider AI infrastructure with:
- Flexible Provider Management: Support for 6+ AI providers with automatic model discovery
- Granular Rate Limiting: Per-provider and per-model limits with configurable thresholds
- Intelligent Fallback: Multi-level fallback chains ensure high availability
- Comprehensive Admin Control: Full configuration through web-based admin panel
- Usage Tracking: Real-time monitoring of token consumption and API usage
- Security Measures: Rate limiting, authentication, and session management
This architecture ensures reliable AI-powered development while maintaining control over costs and system availability.