# PluginCompass Provider Management & Fallback System

## Overview

This document describes the architecture and functionality of the PluginCompass AI provider management system, including the admin panel configuration, provider limits, usage tracking, and fallback mechanisms.

---

## 1. Admin Panel Structure

### 1.1 Admin Panel Sections

The PluginCompass admin panel is accessible at `/admin` and provides the following management areas:

**Main Admin Areas:**

- **Build Models** (`/admin/build`) - Configure AI models available to users
- **Plan Models** (`/admin/plan`) - Configure planning provider chain
- **Plans** (`/admin/plans`) - Manage subscription plans and pricing
- **Accounts** (`/admin/accounts`) - User account management
- **Affiliates** (`/admin/affiliates`) - Affiliate program management
- **Withdrawals** (`/admin/withdrawals`) - Affiliate payout management
- **Tracking** (`/admin/tracking`) - Analytics and usage statistics
- **Resources** (`/admin/resources`) - System resource monitoring
- **External Testing** (`/admin/external-testing`) - WordPress testing configuration
- **Contact Messages** (`/admin/contact-messages`) - Customer inquiries

### 1.2 Admin Authentication

The admin panel uses session-based authentication with the following security measures:

- **Credentials**: Configured via environment variables `ADMIN_USER` and `ADMIN_PASSWORD`
- **Session Duration**: 24 hours (configurable via `ADMIN_SESSION_TTL_MS`)
- **Rate Limiting**: Maximum 5 login attempts per minute per IP
- **Account Lockout**: 15 minutes after failed attempts

**API Authentication:**

All admin API endpoints require authentication via session cookies. The endpoints include:

- Login: `POST /api/admin/login`
- Logout: `POST /api/admin/logout`
- Session check: `GET /api/admin/me`

---

## 2. Provider Management

### 2.1 Supported Providers

PluginCompass supports multiple AI providers for both planning and building:

**Build Providers:**

- OpenRouter (primary aggregator)
- Mistral
- Google (Gemini)
- Groq
- NVIDIA (NIM)
- Chutes AI
- OpenCode (internal/self-hosted)
- Ollama (self-hosted planning)

**Planning Providers:**

- OpenRouter
- Mistral
- Google (Gemini)
- Groq
- NVIDIA (NIM)
- Ollama (self-hosted)

### 2.2 Provider Configuration

Each provider requires API credentials configured via environment variables:

| Provider | Environment Variable | Default API URL |
|---------|---------------------|----------------|
| OpenRouter | `OPENROUTER_API_KEY` | `https://openrouter.ai/api/v1` |
| Mistral | `MISTRAL_API_KEY` | `https://api.mistral.ai/v1` |
| Google | `GOOGLE_API_KEY` | `https://generativelanguage.googleapis.com/v1beta2` |
| Groq | `GROQ_API_KEY` | `https://api.groq.com/openai/v1/chat/completions` |
| NVIDIA | `NVIDIA_API_KEY` | `https://api.nvidia.com/v1` |
| Chutes AI | `CHUTES_API_KEY` or `PLUGIN_COMPASS_CHUTES_API_KEY` | `https://api.chutes.ai/v1` |
| Ollama | `OLLAMA_API_URL` | Configurable self-hosted URL |

### 2.3 Model Discovery

The system automatically discovers available models from each provider:

1. **CLI-based Discovery**: Queries OpenCode CLI for available models
2. **Provider API Discovery**: Fetches model lists directly from provider APIs
3. **Manual Configuration**: Admin can manually add model configurations

**Model Configuration per Provider:**

Each configured model includes:

- Model identifier (provider/model format)
- Display label for users
- Tier classification (free, plus, pro)
- Icon association
- Provider priority order
- Media support flag (image uploads)

---

## 3. Provider Limits & Usage Tracking

### 3.1 Rate Limiting System

PluginCompass implements a flexible rate limiting system with the following components:

**Limit Types:**

- **Tokens per Minute (TPM)**: Token consumption rate limit
- **Tokens per Day (TPD)**: Daily token consumption limit
- **Requests per Minute (RPM)**: API call rate limit
- **Requests per Day (RPD)**: Daily API call limit

**Scope Levels:**

- **Per Provider**: Limits apply to all models from that provider
- **Per Model**: Limits apply to specific models only

**Default Behavior:**

- All limits default to 0 (unlimited)
- Limits are configurable per provider or per model
- Usage is tracked independently for each provider

### 3.2 Usage Tracking

The system tracks usage in real-time with the following characteristics:

**Tracked Metrics:**

- Tokens consumed per request
- Number of API requests
- Timestamps for rate window calculation
- Per-model breakdown when scoped

**Data Retention:**

- Usage data retained for 48 hours for rate limiting
- Aggregated statistics persisted for reporting
- Separate tracking for planning vs building

**State Files:**

- `provider-limits.json`: Saved limit configurations
- `provider-usage.json`: Recent usage data
- `token-usage.json`: User token consumption

### 3.3 Limit Enforcement

When a request is made, the system:

1. Identifies the provider and model
2. Checks applicable limits (provider-level or model-level)
3. Compares current usage against limits
4. Returns rate limit error if exceeded
5. Records usage after successful requests

---

## 4. Fallback System

### 4.1 Fallback Architecture

PluginCompass implements a multi-level fallback system for reliability:

**Fallback Levels:**

1. **Model-level Fallback**: Alternative models within same provider
2. **Provider-level Fallback**: Switch to alternative providers
3. **Ultimate Backup**: Final fallback model when all providers fail

### 4.2 Model Fallback Chain

**For Each Provider:**

Each provider has a configured fallback chain:

**OpenRouter:**

```
Primary Model → Backup 1 → Backup 2 → Backup 3 → Environment Fallbacks → Static Fallbacks
```

**Mistral:**

```
Primary Model → Backup 1 → Backup 2 → Backup 3 → Default (mistral-large-latest)
```

**Groq:**

```
llama-3.3-70b-versatile → mixtral-8x7b-32768 → llama-3.1-70b-versatile
```

**Google (Gemini):**

```
gemini-1.5-flash → gemini-1.5-pro → gemini-pro
```

**NVIDIA (NIM):**

```
meta/llama-3.1-70b-instruct → meta/llama-3.1-8b-instruct
```

### 4.3 Planning Chain Fallback

The planning system follows a configured priority chain:

1. Attempts first provider in chain
2. If rate limited or error occurs, moves to next provider
3. Continues through all configured providers
4. Returns error if all providers fail

**Planning Chain Configuration:**

- Configurable via admin panel
- Each entry: `{ provider, model }`
- Priority determines fallback order
- Supports provider prefix in model names (e.g., "groq/compound-mini")

### 4.4 Build Fallback Chain

For building operations, the system follows this sequence:

1. **Primary Model**: User-selected or auto-assigned model
2. **Provider Fallback**: Alternative providers with same model
3. **OpenCode Fallback**: Internal OpenCode processing
4. **Ultimate Backup**: Configured backup model (last resort)

### 4.5 Error Classification & Fallback Decision

Errors are classified to determine fallback behavior:

| Error Type | Example | Fallback Action |
|-----------|---------|-----------------|
| Rate Limit (429) | "Too many requests" | Wait 30s, switch provider |
| Server Error (5xx) | "Internal error" | Wait 30s, switch provider |
| Auth Error (401) | "Invalid API key" | Switch immediately |
| Billing Error (402) | "Insufficient credits" | Switch immediately |
| Model Not Found (404) | "Unknown model" | Wait 30s, switch model |
| User Error (400) | "Invalid request" | Return error, no fallback |
| Token Limit | "Context length exceeded" | Return error, no fallback |

**Continue Mechanism:**

- For early terminations, system sends "continue" message
- Retries up to 3 times with same model
- After 3 failures, switches to fallback model

---

## 5. Admin Configuration Interface

### 5.1 Model Management

The admin panel allows configuration of:

- **Add/Update Models**: Select from discovered models or add custom
- **Provider Priority**: Drag-and-drop reordering of provider fallback order
- **Tier Assignment**: free (1x), plus (2x), pro (3x) multipliers
- **Icon Selection**: Associate icons with models
- **Media Support**: Enable/disable image upload capability

### 5.2 Provider Limits Configuration

The limits interface provides:

- **Provider Selection**: Dropdown for each configured provider
- **Scope Selection**: Per-provider or per-model limits
- **Limit Inputs**: Numeric fields for TPM, TPD, RPM, RPD
- **Live Usage Display**: Current usage statistics per provider
- **Save/Reset**: Persist or revert limit changes

### 5.3 Ultimate Backup Configuration

The admin can configure a final fallback model that will be used when all other providers fail. This ensures system availability even during widespread outages.

---

## 6. Configuration Files

### 6.1 Environment Variables

Key configuration is done via environment variables:

```bash
# Provider API Keys
OPENROUTER_API_KEY=
MISTRAL_API_KEY=
GOOGLE_API_KEY=
GROQ_API_KEY=
NVIDIA_API_KEY=
CHUTES_API_KEY=

# Admin Authentication
ADMIN_USER=
ADMIN_PASSWORD=

# Rate Limiting
ADMIN_LOGIN_RATE_LIMIT=5
USER_LOGIN_RATE_LIMIT=10
API_RATE_LIMIT=100
```

### 6.2 Runtime State

The system maintains runtime state in:

- `.data/.opencode-chat/provider-limits.json` - Persisted limits
- `.data/.opencode-chat/provider-usage.json` - Recent usage
- In-memory state for active sessions and rate tracking

---

## 7. Security Considerations

### 7.1 Rate Limiting

- **Login Protection**: 5 attempts/minute, 15-minute lockout
- **API Protection**: 100 requests/minute per user
- **Provider Protection**: Configurable limits prevent abuse

### 7.2 Authentication

- Session-based auth with secure cookies
- OAuth support for Google and GitHub
- Rate-limited login attempts
- Session timeout enforcement

### 7.3 Data Protection

- Provider API keys stored securely
- Usage data retained only as needed
- No sensitive data in logs
- Encrypted session storage

---

## 8. Monitoring & Analytics

### 8.1 Tracking Metrics

The system tracks:

- **User Analytics**: Session duration, feature usage, model preferences
- **Business Metrics**: MRR, LTV, churn rate, CAC
- **Technical Metrics**: AI response times, error rates, queue wait times
- **Provider Metrics**: Per-provider usage and error rates

### 8.2 Admin Dashboard

The tracking page provides:

- Daily/weekly/monthly active users
- Revenue analytics
- Conversion funnels
- Error rate monitoring
- Resource utilization

---

## 9. Summary

PluginCompass provides a robust, multi-provider AI infrastructure with:

1. **Flexible Provider Management**: Support for 6+ AI providers with automatic model discovery
2. **Granular Rate Limiting**: Per-provider and per-model limits with configurable thresholds
3. **Intelligent Fallback**: Multi-level fallback chains ensure high availability
4. **Comprehensive Admin Control**: Full configuration through web-based admin panel
5. **Usage Tracking**: Real-time monitoring of token consumption and API usage
6. **Security Measures**: Rate limiting, authentication, and session management

This architecture ensures reliable AI-powered development while maintaining control over costs and system availability.
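
As an illustration of the limit enforcement steps in Section 3.3, the following is a minimal Python sketch, not the actual PluginCompass implementation. The names `Limits`, `UsageLog`, and `exceeded_limit` are hypothetical. It assumes sliding time windows and treats a limit of 0 as unlimited, as stated in Section 3.1. Resolving whether provider-level or model-level limits apply is left out for brevity.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional, Tuple

MINUTE = 60.0
DAY = 24 * 60 * 60.0
RETENTION = 48 * 60 * 60.0  # usage is retained for 48 hours (Section 3.2)


@dataclass
class Limits:
    """Hypothetical limit record; 0 means unlimited."""
    tpm: int = 0  # tokens per minute
    tpd: int = 0  # tokens per day
    rpm: int = 0  # requests per minute
    rpd: int = 0  # requests per day


@dataclass
class UsageLog:
    """Recent (timestamp, tokens) records for one provider or model scope."""
    events: Deque[Tuple[float, int]] = field(default_factory=deque)

    def totals(self, now: float, window: float) -> Tuple[int, int]:
        """Return (requests, tokens) recorded within the last `window` seconds."""
        recent = [tokens for ts, tokens in self.events if now - ts < window]
        return len(recent), sum(recent)

    def record(self, now: float, tokens: int) -> None:
        """Record a successful request and drop entries past the retention period."""
        self.events.append((now, tokens))
        while self.events and now - self.events[0][0] >= RETENTION:
            self.events.popleft()


def exceeded_limit(limits: Limits, usage: UsageLog, now: float) -> Optional[str]:
    """Return the name of the first exceeded limit, or None if the request may proceed."""
    req_min, tok_min = usage.totals(now, MINUTE)
    req_day, tok_day = usage.totals(now, DAY)
    checks = [
        ("rpm", limits.rpm, req_min),
        ("tpm", limits.tpm, tok_min),
        ("rpd", limits.rpd, req_day),
        ("tpd", limits.tpd, tok_day),
    ]
    for name, limit, used in checks:
        if limit > 0 and used >= limit:  # a limit of 0 is never enforced
            return name
    return None
```

A caller would check `exceeded_limit(...)` before dispatching a request, return a rate limit error if it names a limit, and call `usage.record(...)` only after the request succeeds.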
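
The planning chain behavior from Section 4.3 can be sketched in Python as follows. This is an illustrative sketch under assumptions, not the project's code: `ProviderError`, `resolve_entry`, `run_planning_chain`, and the `call` callback are hypothetical names. It also assumes that a model-name prefix selects the provider only when it matches one of the planning providers listed in Section 2.1.

```python
from typing import Callable, Dict, List, Tuple

# Assumption: lowercase identifiers for the planning providers in Section 2.1.
KNOWN_PROVIDERS = {"openrouter", "mistral", "google", "groq", "nvidia", "ollama"}


class ProviderError(Exception):
    """Hypothetical error raised when a provider call fails or is rate limited."""


def resolve_entry(entry: Dict[str, str]) -> Tuple[str, str]:
    """Return (provider, model), honouring a "provider/model" prefix if present."""
    provider, model = entry.get("provider", ""), entry["model"]
    prefix, sep, rest = model.partition("/")
    if sep and prefix in KNOWN_PROVIDERS:
        return prefix, rest
    # Other slashes (e.g. "meta/llama-3.1-70b-instruct") are part of the model name.
    return provider, model


def run_planning_chain(
    chain: List[Dict[str, str]],
    call: Callable[[str, str], str],
) -> str:
    """Try each chain entry in priority order; raise if every provider fails."""
    errors = []
    for entry in chain:
        provider, model = resolve_entry(entry)
        try:
            return call(provider, model)
        except ProviderError as exc:
            errors.append(f"{provider}/{model}: {exc}")
    raise ProviderError("all planning providers failed: " + "; ".join(errors))
```

Collecting the per-provider errors before raising keeps the final failure message useful for diagnosing why the whole chain was exhausted.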
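
The error classification table in Section 4.5 maps naturally onto a small lookup function. The Python sketch below is one possible encoding. `FallbackDecision` and `classify_error` are hypothetical names, the action strings are invented for illustration, and "Switch immediately" is assumed to mean switching provider with no delay. Errors not covered by the table are assumed to return without fallback.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FallbackDecision:
    action: str        # "switch_provider", "switch_model", or "return_error"
    wait_seconds: int  # delay before switching; 0 means immediately


def classify_error(status: Optional[int], message: str = "") -> FallbackDecision:
    """Map an HTTP status and error message to a fallback decision."""
    text = message.lower()
    # Token limit errors carry no status code in the table, so match on the message.
    if "context length" in text:
        return FallbackDecision("return_error", 0)
    if status == 429 or (status is not None and 500 <= status <= 599):
        return FallbackDecision("switch_provider", 30)
    if status in (401, 402):
        # Assumption: "switch immediately" means switch provider without waiting.
        return FallbackDecision("switch_provider", 0)
    if status == 404:
        return FallbackDecision("switch_model", 30)
    # 400 and anything not listed in the table: surface the error to the caller.
    return FallbackDecision("return_error", 0)
```

For example, `classify_error(429, "Too many requests")` yields a 30-second wait followed by a provider switch, while `classify_error(400, "Invalid request")` returns the error without any fallback.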
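
The continue mechanism described at the end of Section 4.5 could look roughly like the Python sketch below. The names `complete_with_continue` and `send` are hypothetical. It assumes that "retries up to 3 times" means three "continue" messages after the initial prompt, and that partial output from a model is discarded when switching to the next fallback model. The source does not specify either detail.

```python
from typing import Callable, List, Tuple

MAX_CONTINUES = 3  # retries with the same model before falling back


def complete_with_continue(
    prompt: str,
    models: List[str],
    send: Callable[[str, str], Tuple[str, bool]],
) -> Tuple[str, str]:
    """Return (model, text) from the first model that finishes its response.

    `send(model, message)` is a hypothetical callback returning the generated
    text and a flag that is False when the response terminated early.
    """
    for model in models:
        parts = []
        message = prompt
        # One initial attempt plus up to MAX_CONTINUES "continue" messages.
        for _ in range(1 + MAX_CONTINUES):
            text, finished = send(model, message)
            parts.append(text)
            if finished:
                return model, "".join(parts)
            message = "continue"
        # Assumption: partial output is dropped before trying the next model.
    raise RuntimeError("no model produced a complete response")
```

Here `models` would be the primary model followed by its fallbacks, in the order given by the configured chain.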