Add PluginCompass Provider System documentation

This document describes the architecture and functionality of the PluginCompass AI provider management system, including:

- Admin panel structure and authentication
- Provider management with supported providers (OpenRouter, Mistral, Google, Groq, NVIDIA, Chutes, Ollama)
- Rate limiting system with per-provider and per-model limits
- Fallback system architecture with multi-level fallback chains
- Usage tracking and monitoring capabilities

The documentation covers both technical implementation details and operational guidance for managing the provider infrastructure.
2026-02-09 18:23:55 +00:00
parent 91308fd061
commit 58bab1c5d8
1 changed files with 361 additions and 0 deletions
--- a/PLUGIN_COMPASS_PROVIDER_SYSTEM.md
+++ b/PLUGIN_COMPASS_PROVIDER_SYSTEM.md
@@ -0,0 +1,361 @@
+# PluginCompass Provider Management & Fallback System
+
+## Overview
+
+This document describes the architecture and functionality of the PluginCompass AI provider management system, including the admin panel configuration, provider limits, usage tracking, and fallback mechanisms.
+
+---
+
+## 1. Admin Panel Structure
+
+### 1.1 Admin Panel Sections
+
+The PluginCompass admin panel is accessible at `/admin` and provides the following management areas:
+
+**Main Admin Areas:**
+- **Build Models** (`/admin/build`) - Configure AI models available to users
+- **Plan Models** (`/admin/plan`) - Configure the planning provider chain
+- **Plans** (`/admin/plans`) - Manage subscription plans and pricing
+- **Accounts** (`/admin/accounts`) - User account management
+- **Affiliates** (`/admin/affiliates`) - Affiliate program management
+- **Withdrawals** (`/admin/withdrawals`) - Affiliate payout management
+- **Tracking** (`/admin/tracking`) - Analytics and usage statistics
+- **Resources** (`/admin/resources`) - System resource monitoring
+- **External Testing** (`/admin/external-testing`) - WordPress testing configuration
+- **Contact Messages** (`/admin/contact-messages`) - Customer inquiries
+
+### 1.2 Admin Authentication
+
+The admin panel uses session-based authentication with the following security measures:
+
+- **Credentials**: Configured via environment variables `ADMIN_USER` and `ADMIN_PASSWORD`
+- **Session Duration**: 24 hours (configurable via `ADMIN_SESSION_TTL_MS`)
+- **Rate Limiting**: Maximum 5 login attempts per minute per IP
+- **Account Lockout**: 15 minutes once the login attempt limit is exceeded
+
+**API Authentication:**
+All admin API endpoints require authentication via session cookies. The endpoints include:
+- Login: `POST /api/admin/login`
+- Logout: `POST /api/admin/logout`
+- Session check: `GET /api/admin/me`
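The login limits listed above (5 attempts per minute per IP, 15-minute lockout) can be illustrated with a small in-memory limiter. This is only a sketch: the `LoginLimiter` class, its method names, and the choice to trigger the lockout when the per-minute limit is exceeded are assumptions, not the actual PluginCompass implementation.

```python
from collections import defaultdict, deque

MAX_ATTEMPTS = 5            # documented: 5 login attempts per minute per IP
WINDOW_SECONDS = 60
LOCKOUT_SECONDS = 15 * 60   # documented: 15-minute lockout


class LoginLimiter:
    """Hypothetical in-memory limiter keyed by client IP."""

    def __init__(self):
        self._attempts = defaultdict(deque)   # ip -> timestamps of recent attempts
        self._locked_until = {}               # ip -> time at which the lockout ends

    def allow(self, ip, now):
        """Record a login attempt and report whether it may proceed."""
        if now < self._locked_until.get(ip, 0.0):
            return False
        window = self._attempts[ip]
        # Drop attempts that fell out of the one-minute window.
        while window and now - window[0] >= WINDOW_SECONDS:
            window.popleft()
        if len(window) >= MAX_ATTEMPTS:
            # Assumption: exceeding the per-minute limit starts the lockout.
            self._locked_until[ip] = now + LOCKOUT_SECONDS
            window.clear()
            return False
        window.append(now)
        return True
```

Passing `now` explicitly (for example `time.time()`) keeps the sketch easy to test without waiting for real time to elapse.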
+
+---
+
+## 2. Provider Management
+
+### 2.1 Supported Providers
+
+PluginCompass supports multiple AI providers for both planning and building:
+
+**Build Providers:**
+- OpenRouter (primary aggregator)
+- Mistral
+- Google (Gemini)
+- Groq
+- NVIDIA (NIM)
+- Chutes AI
+- OpenCode (internal/self-hosted)
+- Ollama (self-hosted planning)
+
+**Planning Providers:**
+- OpenRouter
+- Mistral
+- Google (Gemini)
+- Groq
+- NVIDIA (NIM)
+- Ollama (self-hosted)
+
+### 2.2 Provider Configuration
+
+Each provider requires API credentials configured via environment variables:
+
+| Provider | Environment Variable | Default API URL |
+|---------|---------------------|----------------|
+| OpenRouter | `OPENROUTER_API_KEY` | `https://openrouter.ai/api/v1` |
+| Mistral | `MISTRAL_API_KEY` | `https://api.mistral.ai/v1` |
+| Google | `GOOGLE_API_KEY` | `https://generativelanguage.googleapis.com/v1beta2` |
+| Groq | `GROQ_API_KEY` | `https://api.groq.com/openai/v1/chat/completions` |
+| NVIDIA | `NVIDIA_API_KEY` | `https://api.nvidia.com/v1` |
+| Chutes AI | `CHUTES_API_KEY` or `PLUGIN_COMPASS_CHUTES_API_KEY` | `https://api.chutes.ai/v1` |
+| Ollama | `OLLAMA_API_URL` | Configurable self-hosted URL |
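As an illustration of the table above, the following sketch reports which providers have credentials present in the environment. The lowercase provider ids and the `configured_providers` helper are hypothetical; only the environment variable names come from the table.

```python
import os

# Environment variable names as listed in the table above.
API_KEY_VARS = {
    "openrouter": ["OPENROUTER_API_KEY"],
    "mistral": ["MISTRAL_API_KEY"],
    "google": ["GOOGLE_API_KEY"],
    "groq": ["GROQ_API_KEY"],
    "nvidia": ["NVIDIA_API_KEY"],
    "chutes": ["CHUTES_API_KEY", "PLUGIN_COMPASS_CHUTES_API_KEY"],
}


def configured_providers(env):
    """Return the ids of providers whose credentials are set and non-empty."""
    found = [
        name
        for name, variables in API_KEY_VARS.items()
        if any(env.get(variable) for variable in variables)
    ]
    # Ollama is configured through a URL rather than an API key.
    if env.get("OLLAMA_API_URL"):
        found.append("ollama")
    return found
```

In a running process this would be called as `configured_providers(os.environ)`; accepting any mapping makes it straightforward to exercise with a plain dictionary.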
+
+### 2.3 Model Discovery
+
+The system automatically discovers available models from each provider:
+
+1. **CLI-based Discovery**: Queries the OpenCode CLI for available models
+2. **Provider API Discovery**: Fetches model lists directly from provider APIs
+3. **Manual Configuration**: Admins can manually add model configurations
+
+**Model Configuration per Provider:**
+Each configured model includes:
+- Model identifier (provider/model format)
+- Display label for users
+- Tier classification (free, plus, pro)
+- Icon association
+- Provider priority order
+- Media support flag (image uploads)
+
+---
+
+## 3. Provider Limits & Usage Tracking
+
+### 3.1 Rate Limiting System
+
+PluginCompass implements a flexible rate limiting system with the following components:
+
+**Limit Types:**
+- **Tokens per Minute (TPM)**: Token consumption rate limit
+- **Tokens per Day (TPD)**: Daily token consumption limit
+- **Requests per Minute (RPM)**: API call rate limit
+- **Requests per Day (RPD)**: Daily API call limit
+
+**Scope Levels:**
+- **Per Provider**: Limits apply to all models from that provider
+- **Per Model**: Limits apply to specific models only
+
+**Default Behavior:**
+- All limits default to 0, which means unlimited
+- Limits are configurable per provider or per model
+- Usage is tracked independently for each provider
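A minimal sketch of how the four limit types and the two scope levels might fit together is shown below. The class names are hypothetical, and treating a model without its own entry as unlimited is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Limits:
    tpm: int = 0   # tokens per minute; 0 means unlimited
    tpd: int = 0   # tokens per day
    rpm: int = 0   # requests per minute
    rpd: int = 0   # requests per day


@dataclass
class ProviderLimits:
    scope: str = "provider"                         # "provider" or "model"
    provider: Limits = field(default_factory=Limits)
    per_model: dict = field(default_factory=dict)   # model name -> Limits

    def applicable(self, model):
        """Return the limits that apply to a request for the given model."""
        if self.scope == "model":
            # Assumption: a model without its own entry is unlimited.
            return self.per_model.get(model, Limits())
        return self.provider


def exceeds(limit, used):
    """A limit of 0 is never exceeded; otherwise compare usage to the limit."""
    return limit > 0 and used >= limit
```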
+
+### 3.2 Usage Tracking
+
+The system tracks usage in real-time with the following characteristics:
+
+**Tracked Metrics:**
+- Tokens consumed per request
+- Number of API requests
+- Timestamps for rate window calculation
+- Per-model breakdown when scoped
+
+**Data Retention:**
+- Usage data retained for 48 hours for rate limiting
+- Aggregated statistics persisted for reporting
+- Separate tracking for planning vs building
+
+**State Files:**
+- `provider-limits.json`: Saved limit configurations
+- `provider-usage.json`: Recent usage data
+- `token-usage.json`: User token consumption
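To make the rate-window calculation and the 48-hour retention concrete, here is a sketch over a list of usage records. The record shape (`ts` in seconds, `tokens` as an integer) is a guess; the actual layout of `provider-usage.json` is not described in this document.

```python
RETENTION_SECONDS = 48 * 3600   # documented: usage data retained for 48 hours


def prune(records, now):
    """Keep only records that are still inside the retention period."""
    return [r for r in records if now - r["ts"] < RETENTION_SECONDS]


def window_usage(records, now, window_seconds):
    """Sum requests and tokens for records within the trailing window."""
    recent = [r for r in records if 0 <= now - r["ts"] < window_seconds]
    return {
        "requests": len(recent),
        "tokens": sum(r["tokens"] for r in recent),
    }
```

With this shape, per-minute figures use `window_seconds=60` and per-day figures use `window_seconds=86400` over the same record list.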
+
+### 3.3 Limit Enforcement
+
+When a request is made, the system:
+
+1. Identifies the provider and model
+2. Checks applicable limits (provider-level or model-level)
+3. Compares current usage against limits
+4. Returns a rate limit error if any limit is exceeded
+5. Records usage after successful requests
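The five steps above can be sketched as a single function. Everything here is illustrative: `handle_request`, `RateLimitError`, the dictionary layout, and the rule that a model-level entry takes precedence over a provider-level one are assumptions. Window resets are omitted, so the counters stand in for usage already aggregated per window.

```python
LIMIT_NAMES = ("tpm", "tpd", "rpm", "rpd")


class RateLimitError(Exception):
    """Raised when a configured limit has been reached."""


def handle_request(provider, model, limits, usage, send):
    # Steps 1-2: use the model-level entry if one exists, else the provider-level one.
    key = (provider, model) if (provider, model) in limits else (provider, None)
    active = limits.get(key, {})
    used = usage.setdefault(key, dict.fromkeys(LIMIT_NAMES, 0))
    # Steps 3-4: compare current usage against each non-zero limit.
    for name in LIMIT_NAMES:
        limit = active.get(name, 0)
        if limit > 0 and used[name] >= limit:
            raise RateLimitError(
                f"{provider}: {name} limit reached ({used[name]}/{limit})"
            )
    # Step 5: record usage only after the request has succeeded.
    tokens = send(provider, model)   # hypothetical callable returning tokens consumed
    used["tpm"] += tokens
    used["tpd"] += tokens
    used["rpm"] += 1
    used["rpd"] += 1
    return tokens
```

Because usage is recorded after `send` returns, a request that raises does not count against the limits, which matches step 5.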
+
+---
+
+## 4. Fallback System
+
+### 4.1 Fallback Architecture
+
+PluginCompass implements a multi-level fallback system for reliability:
+
+**Fallback Levels:**
+1. **Model-level Fallback**: Alternative models within the same provider
+2. **Provider-level Fallback**: Switch to alternative providers
+3. **Ultimate Backup**: Final fallback model when all providers fail
+
+### 4.2 Model Fallback Chain
+
+Each provider has a configured fallback chain:
+
+**OpenRouter:**
+```
+Primary Model → Backup 1 → Backup 2 → Backup 3 → Environment Fallbacks → Static Fallbacks
+```
+
+**Mistral:**
+```
+Primary Model → Backup 1 → Backup 2 → Backup 3 → Default (mistral-large-latest)
+```
+
+**Groq:**
+```
+llama-3.3-70b-versatile → mixtral-8x7b-32768 → llama-3.1-70b-versatile
+```
+
+**Google (Gemini):**
+```
+gemini-1.5-flash → gemini-1.5-pro → gemini-pro
+```
+
+**NVIDIA (NIM):**
+```
+meta/llama-3.1-70b-instruct → meta/llama-3.1-8b-instruct
+```
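One way to assemble such a chain from a primary model, admin-configured backups, and provider defaults is sketched below. The `build_chain` helper is hypothetical, and skipping empty slots and duplicates is an assumption about how repeated entries would be handled.

```python
def build_chain(primary, backups, defaults):
    """Return primary, then backups, then defaults, skipping blanks and duplicates."""
    chain = []
    for model in [primary, *backups, *defaults]:
        if model and model not in chain:
            chain.append(model)
    return chain


# Default Groq chain as documented above.
GROQ_DEFAULTS = [
    "llama-3.3-70b-versatile",
    "mixtral-8x7b-32768",
    "llama-3.1-70b-versatile",
]
```

For example, `build_chain("mixtral-8x7b-32768", [], GROQ_DEFAULTS)` puts the selected model first and then falls through the remaining defaults without trying the same model twice.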
+
+### 4.3 Planning Chain Fallback
+
+The planning system follows a configured priority chain:
+
+1. Attempts the first provider in the chain
+2. If a rate limit or other error occurs, moves to the next provider
+3. Continues through all configured providers
+4. Returns an error if all providers fail
+
+**Planning Chain Configuration:**
+- Configurable via admin panel
+- Each entry: `{ provider, model }`
+- Priority determines fallback order
+- Supports provider prefix in model names (e.g., "groq/compound-mini")
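The planning fallback loop might look roughly like this. `ProviderError`, `run_plan`, and the `call` callback are hypothetical names, and stripping a leading `provider/` prefix from the model name is only one possible reading of the prefix support mentioned above.

```python
class ProviderError(Exception):
    """Raised by a provider call on a rate limit or other failure."""


def split_model(entry):
    """Return (provider, model), dropping a redundant 'provider/' prefix."""
    provider, model = entry["provider"], entry["model"]
    prefix, sep, rest = model.partition("/")
    if sep and prefix == provider:
        model = rest
    return provider, model


def run_plan(chain, call):
    """Try each chain entry in priority order and return the first result."""
    errors = []
    for entry in chain:
        provider, model = split_model(entry)
        try:
            return call(provider, model)
        except ProviderError as exc:
            errors.append(f"{provider}/{model}: {exc}")
    raise ProviderError("all planning providers failed: " + "; ".join(errors))
```

Collecting the per-provider messages before raising keeps the final error informative when every entry in the chain fails.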
+
+### 4.4 Build Fallback Chain
+
+For building operations, the system follows this sequence:
+
+1. **Primary Model**: User-selected or auto-assigned model
+2. **Provider Fallback**: Alternative providers offering the same model
+3. **OpenCode Fallback**: Internal OpenCode processing
+4. **Ultimate Backup**: Configured backup model (last resort)
+
+### 4.5 Error Classification & Fallback Decision
+
+Errors are classified to determine fallback behavior:
+
+| Error Type | Example | Fallback Action |
+|-----------|---------|-----------------|
+| Rate Limit (429) | "Too many requests" | Wait 30s, switch provider |
+| Server Error (5xx) | "Internal error" | Wait 30s, switch provider |
+| Auth Error (401) | "Invalid API key" | Switch immediately |
+| Billing Error (402) | "Insufficient credits" | Switch immediately |
+| Model Not Found (404) | "Unknown model" | Wait 30s, switch model |
+| User Error (400) | "Invalid request" | Return error, no fallback |
+| Token Limit | "Context length exceeded" | Return error, no fallback |
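The table above can be expressed as a small classification function. The action names, the substring check used to detect token limit errors, and the handling of unlisted status codes are assumptions made for illustration.

```python
def classify(status, message=""):
    """Map an HTTP status and error message to (action, wait_seconds)."""
    # Assumption: token limit errors are recognised by their message text.
    if "context length" in message.lower():
        return ("return_error", 0)
    if status == 429 or 500 <= status <= 599:
        return ("switch_provider", 30)
    if status in (401, 402):
        # The table only says "switch immediately"; a provider switch is assumed.
        return ("switch_provider", 0)
    if status == 404:
        return ("switch_model", 30)
    # 400 and anything not listed in the table: surface the error, no fallback.
    return ("return_error", 0)
```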
+
+**Continue Mechanism:**
+- For early terminations, the system sends a "continue" message
+- It retries up to 3 times with the same model
+- After 3 failures, it switches to the fallback model
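A sketch of the continue mechanism follows. The `send` callback, which is assumed to return the generated text and a flag saying whether the response finished, is hypothetical, as is discarding partial output when switching to the next model.

```python
MAX_CONTINUES = 3   # documented: retry up to 3 times with the same model


def run_with_continue(models, send, prompt):
    """Return (model, text) from the first model that finishes its response."""
    for model in models:
        text, done = send(model, prompt)
        attempts = 0
        while not done and attempts < MAX_CONTINUES:
            # Ask the same model to pick up where it stopped.
            more, done = send(model, "continue")
            text += more
            attempts += 1
        if done:
            return model, text
        # Assumption: partial output is dropped before trying the fallback model.
    raise RuntimeError("all models terminated early")
```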
+
+---
+
+## 5. Admin Configuration Interface
+
+### 5.1 Model Management
+
+The admin panel allows configuration of:
+
+- **Add/Update Models**: Select from discovered models or add custom ones
+- **Provider Priority**: Drag-and-drop reordering of provider fallback order
+- **Tier Assignment**: free (1x), plus (2x), pro (3x) multipliers
+- **Icon Selection**: Associate icons with models
+- **Media Support**: Enable/disable image upload capability
+
+### 5.2 Provider Limits Configuration
+
+The limits interface provides:
+
+- **Provider Selection**: Dropdown for each configured provider
+- **Scope Selection**: Per-provider or per-model limits
+- **Limit Inputs**: Numeric fields for TPM, TPD, RPM, RPD
+- **Live Usage Display**: Current usage statistics per provider
+- **Save/Reset**: Persist or revert limit changes
+
+### 5.3 Ultimate Backup Configuration
+
+The admin can configure a final fallback model that will be used when all other providers fail. This ensures system availability even during widespread outages.
+
+---
+
+## 6. Configuration Files
+
+### 6.1 Environment Variables
+
+Key configuration is done via environment variables:
+
+```bash
+# Provider API Keys
+OPENROUTER_API_KEY=
+MISTRAL_API_KEY=
+GOOGLE_API_KEY=
+GROQ_API_KEY=
+NVIDIA_API_KEY=
+CHUTES_API_KEY=
+
+# Admin Authentication
+ADMIN_USER=
+ADMIN_PASSWORD=
+
+# Rate Limiting
+# Admin login attempts per minute per IP
+ADMIN_LOGIN_RATE_LIMIT=5
+USER_LOGIN_RATE_LIMIT=10
+# API requests per minute per user
+API_RATE_LIMIT=100
+```
+
+### 6.2 Runtime State
+
+The system maintains runtime state in:
+
+- `.data/.opencode-chat/provider-limits.json` - Persisted limits
+- `.data/.opencode-chat/provider-usage.json` - Recent usage
+- In-memory state for active sessions and rate tracking
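Reading and writing these JSON state files could be done along the following lines. The helper names are hypothetical, and the write-to-temporary-file-then-rename pattern is a common safeguard against partially written files rather than something the source confirms PluginCompass uses.

```python
import json
import os
import tempfile


def save_state(path, data):
    """Write JSON to a temporary file, then atomically replace the target."""
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(data, handle)
    os.replace(tmp_path, path)


def load_state(path, default):
    """Return the parsed JSON file, or the default if the file does not exist."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        return default
```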
+
+---
+
+## 7. Security Considerations
+
+### 7.1 Rate Limiting
+
+- **Login Protection**: 5 attempts/minute, 15-minute lockout
+- **API Protection**: 100 requests/minute per user
+- **Provider Protection**: Configurable limits prevent abuse
+
+### 7.2 Authentication
+
+- Session-based auth with secure cookies
+- OAuth support for Google and GitHub
+- Rate-limited login attempts
+- Session timeout enforcement
+
+### 7.3 Data Protection
+
+- Provider API keys stored securely
+- Usage data retained only as needed
+- No sensitive data in logs
+- Encrypted session storage
+
+---
+
+## 8. Monitoring & Analytics
+
+### 8.1 Tracking Metrics
+
+The system tracks:
+
+- **User Analytics**: Session duration, feature usage, model preferences
+- **Business Metrics**: MRR, LTV, churn rate, CAC
+- **Technical Metrics**: AI response times, error rates, queue wait times
+- **Provider Metrics**: Per-provider usage and error rates
+
+### 8.2 Admin Dashboard
+
+The tracking page provides:
+
+- Daily/weekly/monthly active users
+- Revenue analytics
+- Conversion funnels
+- Error rate monitoring
+- Resource utilization
+
+---
+
+## 9. Summary
+
+PluginCompass provides a robust, multi-provider AI infrastructure with:
+
+1. **Flexible Provider Management**: Support for eight build providers and six planning providers, with automatic model discovery
+2. **Granular Rate Limiting**: Per-provider and per-model limits with configurable thresholds
+3. **Intelligent Fallback**: Multi-level fallback chains ensure high availability
+4. **Comprehensive Admin Control**: Full configuration through web-based admin panel
+5. **Usage Tracking**: Real-time monitoring of token consumption and API usage
+6. **Security Measures**: Rate limiting, authentication, and session management
+
+This architecture ensures reliable AI-powered development while maintaining control over costs and system availability.