From 58bab1c5d8cdbe43f32cb7a614f4eb67cb015dac Mon Sep 17 00:00:00 2001
From: southseact-3d
Date: Mon, 9 Feb 2026 18:23:55 +0000
Subject: [PATCH] Add PluginCompass Provider System documentation

This document describes the architecture and functionality of the PluginCompass AI provider management system, including:
- Admin panel structure and authentication
- Provider management with supported providers (OpenRouter, Mistral, Google, Groq, NVIDIA, Chutes, Ollama)
- Rate limiting system with per-provider and per-model limits
- Fallback system architecture with multi-level fallback chains
- Usage tracking and monitoring capabilities

The documentation covers both technical implementation details and operational guidance for managing the provider infrastructure.
---
 PLUGIN_COMPASS_PROVIDER_SYSTEM.md | 361 ++++++++++++++++++++++++++++++
 1 file changed, 361 insertions(+)
 create mode 100644 PLUGIN_COMPASS_PROVIDER_SYSTEM.md

diff --git a/PLUGIN_COMPASS_PROVIDER_SYSTEM.md b/PLUGIN_COMPASS_PROVIDER_SYSTEM.md
new file mode 100644
index 0000000..d0ff358
--- /dev/null
+++ b/PLUGIN_COMPASS_PROVIDER_SYSTEM.md
@@ -0,0 +1,361 @@
+# PluginCompass Provider Management & Fallback System
+
+## Overview
+
+This document describes the architecture and functionality of the PluginCompass AI provider management system, including the admin panel configuration, provider limits, usage tracking, and fallback mechanisms.
+
+---
+
+## 1. Admin Panel Structure
+
+### 1.1 Admin Panel Sections
+
+The PluginCompass admin panel is accessible at `/admin` and provides the following management areas:
+
+**Main Admin Areas:**
+- **Build Models** (`/admin/build`) - Configure AI models available to users
+- **Plan Models** (`/admin/plan`) - Configure planning provider chain
+- **Plans** (`/admin/plans`) - Manage subscription plans and pricing
+- **Accounts** (`/admin/accounts`) - User account management
+- **Affiliates** (`/admin/affiliates`) - Affiliate program management
+- **Withdrawals** (`/admin/withdrawals`) - Affiliate payout management
+- **Tracking** (`/admin/tracking`) - Analytics and usage statistics
+- **Resources** (`/admin/resources`) - System resource monitoring
+- **External Testing** (`/admin/external-testing`) - WordPress testing configuration
+- **Contact Messages** (`/admin/contact-messages`) - Customer inquiries
+
+### 1.2 Admin Authentication
+
+The admin panel uses session-based authentication with the following security measures:
+
+- **Credentials**: Configured via environment variables `ADMIN_USER` and `ADMIN_PASSWORD`
+- **Session Duration**: 24 hours (configurable via `ADMIN_SESSION_TTL_MS`)
+- **Rate Limiting**: Maximum 5 login attempts per minute per IP
+- **Account Lockout**: 15 minutes after failed attempts
+
+**API Authentication:**
+All admin API endpoints require authentication via session cookies. The endpoints include:
+- Login: `POST /api/admin/login`
+- Logout: `POST /api/admin/logout`
+- Session check: `GET /api/admin/me`
+
+---
+
+## 2. Provider Management
+
+### 2.1 Supported Providers
+
+PluginCompass supports multiple AI providers for both planning and building:
+
+**Build Providers:**
+- OpenRouter (primary aggregator)
+- Mistral
+- Google (Gemini)
+- Groq
+- NVIDIA (NIM)
+- Chutes AI
+- OpenCode (internal/self-hosted)
+- Ollama (self-hosted planning)
+
+**Planning Providers:**
+- OpenRouter
+- Mistral
+- Google (Gemini)
+- Groq
+- NVIDIA (NIM)
+- Ollama (self-hosted)
+
+### 2.2 Provider Configuration
+
+Each provider requires API credentials configured via environment variables:
+
+| Provider | Environment Variable | Default API URL |
+|---------|---------------------|----------------|
+| OpenRouter | `OPENROUTER_API_KEY` | `https://openrouter.ai/api/v1` |
+| Mistral | `MISTRAL_API_KEY` | `https://api.mistral.ai/v1` |
+| Google | `GOOGLE_API_KEY` | `https://generativelanguage.googleapis.com/v1beta2` |
+| Groq | `GROQ_API_KEY` | `https://api.groq.com/openai/v1/chat/completions` |
+| NVIDIA | `NVIDIA_API_KEY` | `https://api.nvidia.com/v1` |
+| Chutes AI | `CHUTES_API_KEY` or `PLUGIN_COMPASS_CHUTES_API_KEY` | `https://api.chutes.ai/v1` |
+| Ollama | `OLLAMA_API_URL` | Configurable self-hosted URL |
+
+### 2.3 Model Discovery
+
+The system automatically discovers available models from each provider:
+
+1. **CLI-based Discovery**: Queries OpenCode CLI for available models
+2. **Provider API Discovery**: Fetches model lists directly from provider APIs
+3. **Manual Configuration**: Admin can manually add model configurations
+
+**Model Configuration per Provider:**
+Each configured model includes:
+- Model identifier (provider/model format)
+- Display label for users
+- Tier classification (free, plus, pro)
+- Icon association
+- Provider priority order
+- Media support flag (image uploads)
+
+---
+
+## 3. Provider Limits & Usage Tracking
+
+### 3.1 Rate Limiting System
+
+PluginCompass implements a flexible rate limiting system with the following components:
+
+**Limit Types:**
+- **Tokens per Minute (TPM)**: Token consumption rate limit
+- **Tokens per Day (TPD)**: Daily token consumption limit
+- **Requests per Minute (RPM)**: API call rate limit
+- **Requests per Day (RPD)**: Daily API call limit
+
+**Scope Levels:**
+- **Per Provider**: Limits apply to all models from that provider
+- **Per Model**: Limits apply to specific models only
+
+**Default Behavior:**
+- All limits default to 0 (unlimited)
+- Limits are configurable per provider or per model
+- Usage is tracked independently for each provider
+
+### 3.2 Usage Tracking
+
+The system tracks usage in real-time with the following characteristics:
+
+**Tracked Metrics:**
+- Tokens consumed per request
+- Number of API requests
+- Timestamps for rate window calculation
+- Per-model breakdown when scoped
+
+**Data Retention:**
+- Usage data retained for 48 hours for rate limiting
+- Aggregated statistics persisted for reporting
+- Separate tracking for planning vs building
+
+**State Files:**
+- `provider-limits.json`: Saved limit configurations
+- `provider-usage.json`: Recent usage data
+- `token-usage.json`: User token consumption
+
+### 3.3 Limit Enforcement
+
+When a request is made, the system:
+
+1. Identifies the provider and model
+2. Checks applicable limits (provider-level or model-level)
+3. Compares current usage against limits
+4. Returns rate limit error if exceeded
+5. Records usage after successful requests
+
+---
+
+## 4. Fallback System
+
+### 4.1 Fallback Architecture
+
+PluginCompass implements a multi-level fallback system for reliability:
+
+**Fallback Levels:**
+1. **Model-level Fallback**: Alternative models within same provider
+2. **Provider-level Fallback**: Switch to alternative providers
+3. **Ultimate Backup**: Final fallback model when all providers fail
+
+### 4.2 Model Fallback Chain
+
+**For Each Provider:**
+Each provider has a configured fallback chain:
+
+**OpenRouter:**
+```
+Primary Model → Backup 1 → Backup 2 → Backup 3 → Environment Fallbacks → Static Fallbacks
+```
+
+**Mistral:**
+```
+Primary Model → Backup 1 → Backup 2 → Backup 3 → Default (mistral-large-latest)
+```
+
+**Groq:**
+```
+llama-3.3-70b-versatile → mixtral-8x7b-32768 → llama-3.1-70b-versatile
+```
+
+**Google (Gemini):**
+```
+gemini-1.5-flash → gemini-1.5-pro → gemini-pro
+```
+
+**NVIDIA (NIM):**
+```
+meta/llama-3.1-70b-instruct → meta/llama-3.1-8b-instruct
+```
+
+### 4.3 Planning Chain Fallback
+
+The planning system follows a configured priority chain:
+
+1. Attempts first provider in chain
+2. If rate limited or error occurs, moves to next provider
+3. Continues through all configured providers
+4. Returns error if all providers fail
+
+**Planning Chain Configuration:**
+- Configurable via admin panel
+- Each entry: `{ provider, model }`
+- Priority determines fallback order
+- Supports provider prefix in model names (e.g., "groq/compound-mini")
+
+### 4.4 Build Fallback Chain
+
+For building operations, the system follows this sequence:
+
+1. **Primary Model**: User-selected or auto-assigned model
+2. **Provider Fallback**: Alternative providers with same model
+3. **OpenCode Fallback**: Internal OpenCode processing
+4. **Ultimate Backup**: Configured backup model (last resort)
+
+### 4.5 Error Classification & Fallback Decision
+
+Errors are classified to determine fallback behavior:
+
+| Error Type | Example | Fallback Action |
+|-----------|---------|-----------------|
+| Rate Limit (429) | "Too many requests" | Wait 30s, switch provider |
+| Server Error (5xx) | "Internal error" | Wait 30s, switch provider |
+| Auth Error (401) | "Invalid API key" | Switch immediately |
+| Billing Error (402) | "Insufficient credits" | Switch immediately |
+| Model Not Found (404) | "Unknown model" | Wait 30s, switch model |
+| User Error (400) | "Invalid request" | Return error, no fallback |
+| Token Limit | "Context length exceeded" | Return error, no fallback |
+
+**Continue Mechanism:**
+- For early terminations, system sends "continue" message
+- Retries up to 3 times with same model
+- After 3 failures, switches to fallback model
+
+---
+
+## 5. Admin Configuration Interface
+
+### 5.1 Model Management
+
+The admin panel allows configuration of:
+
+- **Add/Update Models**: Select from discovered models or add custom
+- **Provider Priority**: Drag-and-drop reordering of provider fallback order
+- **Tier Assignment**: free (1x), plus (2x), pro (3x) multipliers
+- **Icon Selection**: Associate icons with models
+- **Media Support**: Enable/disable image upload capability
+
+### 5.2 Provider Limits Configuration
+
+The limits interface provides:
+
+- **Provider Selection**: Dropdown for each configured provider
+- **Scope Selection**: Per-provider or per-model limits
+- **Limit Inputs**: Numeric fields for TPM, TPD, RPM, RPD
+- **Live Usage Display**: Current usage statistics per provider
+- **Save/Reset**: Persist or revert limit changes
+
+### 5.3 Ultimate Backup Configuration
+
+The admin can configure a final fallback model that will be used when all other providers fail. This ensures system availability even during widespread outages.
+
+---
+
+## 6. Configuration Files
+
+### 6.1 Environment Variables
+
+Key configuration is done via environment variables:
+
+```bash
+# Provider API Keys
+OPENROUTER_API_KEY=
+MISTRAL_API_KEY=
+GOOGLE_API_KEY=
+GROQ_API_KEY=
+NVIDIA_API_KEY=
+CHUTES_API_KEY=
+
+# Admin Authentication
+ADMIN_USER=
+ADMIN_PASSWORD=
+
+# Rate Limiting
+ADMIN_LOGIN_RATE_LIMIT=5
+USER_LOGIN_RATE_LIMIT=10
+API_RATE_LIMIT=100
+```
+
+### 6.2 Runtime State
+
+The system maintains runtime state in:
+
+- `.data/.opencode-chat/provider-limits.json` - Persisted limits
+- `.data/.opencode-chat/provider-usage.json` - Recent usage
+- In-memory state for active sessions and rate tracking
+
+---
+
+## 7. Security Considerations
+
+### 7.1 Rate Limiting
+
+- **Login Protection**: 5 attempts/minute, 15-minute lockout
+- **API Protection**: 100 requests/minute per user
+- **Provider Protection**: Configurable limits prevent abuse
+
+### 7.2 Authentication
+
+- Session-based auth with secure cookies
+- OAuth support for Google and GitHub
+- Rate-limited login attempts
+- Session timeout enforcement
+
+### 7.3 Data Protection
+
+- Provider API keys stored securely
+- Usage data retained only as needed
+- No sensitive data in logs
+- Encrypted session storage
+
+---
+
+## 8. Monitoring & Analytics
+
+### 8.1 Tracking Metrics
+
+The system tracks:
+
+- **User Analytics**: Session duration, feature usage, model preferences
+- **Business Metrics**: MRR, LTV, churn rate, CAC
+- **Technical Metrics**: AI response times, error rates, queue wait times
+- **Provider Metrics**: Per-provider usage and error rates
+
+### 8.2 Admin Dashboard
+
+The tracking page provides:
+
+- Daily/weekly/monthly active users
+- Revenue analytics
+- Conversion funnels
+- Error rate monitoring
+- Resource utilization
+
+---
+
+## 9. Summary
+
+PluginCompass provides a robust, multi-provider AI infrastructure with:
+
+1. **Flexible Provider Management**: Support for 6+ AI providers with automatic model discovery
+2. **Granular Rate Limiting**: Per-provider and per-model limits with configurable thresholds
+3. **Intelligent Fallback**: Multi-level fallback chains ensure high availability
+4. **Comprehensive Admin Control**: Full configuration through web-based admin panel
+5. **Usage Tracking**: Real-time monitoring of token consumption and API usage
+6. **Security Measures**: Rate limiting, authentication, and session management
+
+This architecture ensures reliable AI-powered development while maintaining control over costs and system availability.
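
Review note: the error classification table in section 4.5 and the planning chain walk in section 4.3 could be illustrated with a short sketch along the following lines. This is only an illustration under stated assumptions, not the PluginCompass implementation: `classify_error`, `ProviderError`, `run_chain`, and the three action constants are hypothetical names, and treating unclassified statuses as "no fallback" is an assumption the document does not state.

```python
import time

# Hypothetical action names; the document's table only describes the behavior.
SWITCH_AFTER_WAIT = "switch_after_wait"  # 429, 5xx, 404: wait 30s, then switch
SWITCH_NOW = "switch_now"                # 401, 402: switch immediately
NO_FALLBACK = "no_fallback"              # 400, token limit: return the error


class ProviderError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed call."""

    def __init__(self, status, message=""):
        super().__init__(message)
        self.status = status
        self.message = message


def classify_error(status, message=""):
    """Map a status code and message to a fallback action per section 4.5."""
    if "context length" in message.lower():
        return NO_FALLBACK  # token limit errors are returned to the caller
    if status in (429, 404) or 500 <= status <= 599:
        return SWITCH_AFTER_WAIT
    if status in (401, 402):
        return SWITCH_NOW
    # Assumption: 400 and anything unclassified is not retried elsewhere.
    return NO_FALLBACK


def run_chain(chain, call, sleep=time.sleep, wait_seconds=30):
    """Try each {provider, model} entry in priority order (section 4.3)."""
    last_error = None
    for entry in chain:
        try:
            return call(entry["provider"], entry["model"])
        except ProviderError as err:
            action = classify_error(err.status, err.message)
            if action == NO_FALLBACK:
                raise  # user errors are surfaced without trying other entries
            if action == SWITCH_AFTER_WAIT:
                sleep(wait_seconds)
            last_error = err
    raise RuntimeError("all providers in the chain failed") from last_error
```

Passing `sleep` as a parameter keeps the 30-second wait out of tests; a caller would supply its own `call` function that performs the actual provider request and raises `ProviderError` on failure.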