Add PluginCompass Provider System documentation

This document describes the architecture and functionality of the PluginCompass AI provider management system, including:

- Admin panel structure and authentication
- Provider management with supported providers (OpenRouter, Mistral, Google, Groq, NVIDIA, Chutes, Ollama)
- Rate limiting system with per-provider and per-model limits
- Fallback system architecture with multi-level fallback chains
- Usage tracking and monitoring capabilities

The documentation covers both technical implementation details and operational guidance for managing the provider infrastructure.
361
PLUGIN_COMPASS_PROVIDER_SYSTEM.md
Normal file
@@ -0,0 +1,361 @@
# PluginCompass Provider Management & Fallback System

## Overview

This document describes the architecture and functionality of the PluginCompass AI provider management system, including the admin panel configuration, provider limits, usage tracking, and fallback mechanisms.

---

## 1. Admin Panel Structure

### 1.1 Admin Panel Sections

The PluginCompass admin panel is accessible at `/admin` and provides the following management areas:

**Main Admin Areas:**
- **Build Models** (`/admin/build`) - Configure AI models available to users
- **Plan Models** (`/admin/plan`) - Configure planning provider chain
- **Plans** (`/admin/plans`) - Manage subscription plans and pricing
- **Accounts** (`/admin/accounts`) - User account management
- **Affiliates** (`/admin/affiliates`) - Affiliate program management
- **Withdrawals** (`/admin/withdrawals`) - Affiliate payout management
- **Tracking** (`/admin/tracking`) - Analytics and usage statistics
- **Resources** (`/admin/resources`) - System resource monitoring
- **External Testing** (`/admin/external-testing`) - WordPress testing configuration
- **Contact Messages** (`/admin/contact-messages`) - Customer inquiries

### 1.2 Admin Authentication

The admin panel uses session-based authentication with the following security measures:

- **Credentials**: Configured via the environment variables `ADMIN_USER` and `ADMIN_PASSWORD`
- **Session Duration**: 24 hours (configurable via `ADMIN_SESSION_TTL_MS`)
- **Rate Limiting**: Maximum 5 login attempts per minute per IP
- **Account Lockout**: 15-minute lockout after repeated failed attempts

**API Authentication:**
All admin API endpoints other than login require authentication via a session cookie. The session endpoints are:
- Login: `POST /api/admin/login`
- Logout: `POST /api/admin/logout`
- Session check: `GET /api/admin/me`
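
The login limits above (5 attempts per minute per IP, 15-minute lockout) could be enforced along the following lines. This is a minimal sketch rather than the actual PluginCompass implementation: the `LoginThrottle` class and its method names are hypothetical, and it assumes every attempt counts toward the limit, since the document does not say whether successful logins are exempt.

```python
import time
from collections import defaultdict, deque

MAX_ATTEMPTS = 5           # attempts allowed per window, per IP
WINDOW_SECONDS = 60        # length of the rate window
LOCKOUT_SECONDS = 15 * 60  # lockout applied once the limit is hit


class LoginThrottle:
    """Hypothetical per-IP login limiter with a sliding window and lockout."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._attempts = defaultdict(deque)  # ip -> timestamps of recent attempts
        self._locked_until = {}              # ip -> time at which the lockout ends

    def allow(self, ip):
        now = self._clock()
        if self._locked_until.get(ip, 0.0) > now:
            return False  # still locked out
        attempts = self._attempts[ip]
        # Drop attempts that have fallen out of the one-minute window.
        while attempts and now - attempts[0] >= WINDOW_SECONDS:
            attempts.popleft()
        if len(attempts) >= MAX_ATTEMPTS:
            self._locked_until[ip] = now + LOCKOUT_SECONDS
            attempts.clear()
            return False
        attempts.append(now)
        return True
```

Passing the clock in as a parameter keeps the limiter testable without sleeping.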

---

## 2. Provider Management

### 2.1 Supported Providers

PluginCompass supports multiple AI providers for both planning and building:

**Build Providers:**
- OpenRouter (primary aggregator)
- Mistral
- Google (Gemini)
- Groq
- NVIDIA (NIM)
- Chutes AI
- OpenCode (internal/self-hosted)
- Ollama (self-hosted planning)

**Planning Providers:**
- OpenRouter
- Mistral
- Google (Gemini)
- Groq
- NVIDIA (NIM)
- Ollama (self-hosted)

### 2.2 Provider Configuration

Each provider requires API credentials configured via environment variables:

| Provider | Environment Variable | Default API URL |
|---------|---------------------|----------------|
| OpenRouter | `OPENROUTER_API_KEY` | `https://openrouter.ai/api/v1` |
| Mistral | `MISTRAL_API_KEY` | `https://api.mistral.ai/v1` |
| Google | `GOOGLE_API_KEY` | `https://generativelanguage.googleapis.com/v1beta2` |
| Groq | `GROQ_API_KEY` | `https://api.groq.com/openai/v1/chat/completions` |
| NVIDIA | `NVIDIA_API_KEY` | `https://api.nvidia.com/v1` |
| Chutes AI | `CHUTES_API_KEY` or `PLUGIN_COMPASS_CHUTES_API_KEY` | `https://api.chutes.ai/v1` |
| Ollama | `OLLAMA_API_URL` | Configurable self-hosted URL |
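
As an illustration of how the table above might be consumed, the sketch below maps each provider to its environment variables and default URL and reports which providers are configured. The `Provider` tuple and the `credential` and `configured` helpers are hypothetical names, and checking `CHUTES_API_KEY` before `PLUGIN_COMPASS_CHUTES_API_KEY` is an assumption; the document does not state which takes precedence.

```python
import os
from typing import NamedTuple, Optional


class Provider(NamedTuple):
    env_vars: tuple             # variables checked in order (order is an assumption)
    default_url: Optional[str]  # None when the URL itself is the configured value


# Values copied from the provider table in this document.
PROVIDERS = {
    "openrouter": Provider(("OPENROUTER_API_KEY",), "https://openrouter.ai/api/v1"),
    "mistral": Provider(("MISTRAL_API_KEY",), "https://api.mistral.ai/v1"),
    "google": Provider(("GOOGLE_API_KEY",), "https://generativelanguage.googleapis.com/v1beta2"),
    "groq": Provider(("GROQ_API_KEY",), "https://api.groq.com/openai/v1/chat/completions"),
    "nvidia": Provider(("NVIDIA_API_KEY",), "https://api.nvidia.com/v1"),
    "chutes": Provider(("CHUTES_API_KEY", "PLUGIN_COMPASS_CHUTES_API_KEY"), "https://api.chutes.ai/v1"),
    "ollama": Provider(("OLLAMA_API_URL",), None),
}


def credential(name, env=os.environ):
    """Return the first non-empty configured value for a provider, or None."""
    for var in PROVIDERS[name].env_vars:
        value = env.get(var, "").strip()
        if value:
            return value
    return None


def configured(env=os.environ):
    """List providers that have a usable credential, in table order."""
    return [name for name in PROVIDERS if credential(name, env)]
```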

### 2.3 Model Discovery

The system automatically discovers available models from each provider:

1. **CLI-based Discovery**: Queries OpenCode CLI for available models
2. **Provider API Discovery**: Fetches model lists directly from provider APIs
3. **Manual Configuration**: Admin can manually add model configurations

**Model Configuration per Provider:**
Each configured model includes:
- Model identifier (provider/model format)
- Display label for users
- Tier classification (free, plus, pro)
- Icon association
- Provider priority order
- Media support flag (image uploads)
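
The per-model fields listed above could be represented by a record like the one below. This is only a sketch: the `ModelConfig` class and its field names are hypothetical, and splitting the identifier at the first `/` is an assumption based on the `provider/model` format described here.

```python
from dataclasses import dataclass

TIERS = ("free", "plus", "pro")


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical record for one configured model."""
    model_id: str                 # "provider/model" format
    label: str                    # display label shown to users
    tier: str = "free"            # one of TIERS
    icon: str = ""                # icon association
    provider_priority: tuple = () # providers to try, in order
    supports_media: bool = False  # whether image uploads are allowed

    def __post_init__(self):
        if self.tier not in TIERS:
            raise ValueError(f"unknown tier: {self.tier!r}")
        if "/" not in self.model_id:
            raise ValueError("model_id must use provider/model format")

    @property
    def provider(self):
        # Everything before the first slash is treated as the provider.
        return self.model_id.split("/", 1)[0]

    @property
    def model_name(self):
        return self.model_id.split("/", 1)[1]
```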

---

## 3. Provider Limits & Usage Tracking

### 3.1 Rate Limiting System

PluginCompass implements a flexible rate limiting system with the following components:

**Limit Types:**
- **Tokens per Minute (TPM)**: Token consumption rate limit
- **Tokens per Day (TPD)**: Daily token consumption limit
- **Requests per Minute (RPM)**: API call rate limit
- **Requests per Day (RPD)**: Daily API call limit

**Scope Levels:**
- **Per Provider**: Limits apply to all models from that provider
- **Per Model**: Limits apply to specific models only

**Default Behavior:**
- All limits default to 0, which means unlimited
- Limits are configurable per provider or per model
- Usage is tracked independently for each provider
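
One way to model the four limit types and the two scopes is sketched below. The names `Limits`, `ProviderLimits`, `resolve_limits`, and `is_exceeded` are hypothetical, and the rule that a model-level entry overrides the provider-level one is an assumption; the document only states that both scopes exist.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Limits:
    tpm: int = 0  # tokens per minute (0 = unlimited)
    tpd: int = 0  # tokens per day
    rpm: int = 0  # requests per minute
    rpd: int = 0  # requests per day


@dataclass
class ProviderLimits:
    provider: Limits = field(default_factory=Limits)  # applies to all models
    per_model: dict = field(default_factory=dict)     # model name -> Limits


def resolve_limits(config, provider, model):
    """Pick the limits for a request; model-level wins (assumed precedence)."""
    entry = config.get(provider)
    if entry is None:
        return Limits()  # nothing configured: everything unlimited
    return entry.per_model.get(model, entry.provider)


def is_exceeded(limit, used):
    """A limit of 0 never trips, matching the documented default."""
    return limit > 0 and used >= limit
```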

### 3.2 Usage Tracking

The system tracks usage in real-time with the following characteristics:

**Tracked Metrics:**
- Tokens consumed per request
- Number of API requests
- Timestamps for rate window calculation
- Per-model breakdown when scoped

**Data Retention:**
- Usage data retained for 48 hours for rate limiting
- Aggregated statistics persisted for reporting
- Separate tracking for planning vs building

**State Files:**
- `provider-limits.json`: Saved limit configurations
- `provider-usage.json`: Recent usage data
- `token-usage.json`: User token consumption
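
The tracked metrics and the 48-hour retention window can be illustrated with a timestamped event log for a single provider. This is a sketch under stated assumptions: the `UsageLog` class is hypothetical, and the real system may aggregate or persist usage differently.

```python
import time
from collections import deque

MINUTE = 60
DAY = 24 * 60 * 60
RETENTION = 48 * 60 * 60  # usage older than 48 hours is discarded


class UsageLog:
    """Hypothetical in-memory usage log for one provider."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._events = deque()  # (timestamp, model, tokens), oldest first

    def record(self, model, tokens):
        self._events.append((self._clock(), model, tokens))
        self._prune()

    def _prune(self):
        cutoff = self._clock() - RETENTION
        while self._events and self._events[0][0] < cutoff:
            self._events.popleft()

    def totals(self, window_seconds, model=None):
        """Return (requests, tokens) within the window, optionally for one model."""
        since = self._clock() - window_seconds
        requests = tokens = 0
        for timestamp, event_model, event_tokens in self._events:
            if timestamp >= since and (model is None or event_model == model):
                requests += 1
                tokens += event_tokens
        return requests, tokens
```

Calling `totals(MINUTE)` yields the values to compare against RPM and TPM, and `totals(DAY)` the values for RPD and TPD.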

### 3.3 Limit Enforcement

When a request is made, the system:

1. Identifies the provider and model
2. Checks applicable limits (provider-level or model-level)
3. Compares current usage against limits
4. Returns rate limit error if exceeded
5. Records usage after successful requests
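
The five steps above can be sketched as a single function. Everything here is illustrative: `handle_request`, `RateLimitError`, and the dictionary shapes are hypothetical, the counters are assumed to cover only the current rate window (window rollover is omitted), and `send` stands in for the real provider call and is assumed to return the tokens it consumed.

```python
class RateLimitError(Exception):
    """Raised when a configured limit has already been reached."""


def handle_request(provider, model, limits, usage, send):
    # Steps 1-2: identify the scope; a model-level entry wins if one exists.
    scope = (provider, model) if (provider, model) in limits else (provider, None)
    limit = limits.get(scope, {})

    # Step 3: compare current usage against each configured limit.
    used = usage.setdefault(scope, {"requests": 0, "tokens": 0})
    for kind in ("requests", "tokens"):
        cap = limit.get(kind, 0)  # 0 means unlimited
        if cap > 0 and used[kind] >= cap:
            # Step 4: reject before contacting the provider.
            raise RateLimitError(f"{kind} limit reached for {scope}")

    # Step 5: record usage only after the call succeeds.
    tokens_spent = send(provider, model)
    used["requests"] += 1
    used["tokens"] += tokens_spent
    return tokens_spent
```

Because usage is recorded after `send` returns, a request that raises inside `send` does not count against the limit in this sketch.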

---

## 4. Fallback System

### 4.1 Fallback Architecture

PluginCompass implements a multi-level fallback system for reliability:

**Fallback Levels:**
1. **Model-level Fallback**: Alternative models within same provider
2. **Provider-level Fallback**: Switch to alternative providers
3. **Ultimate Backup**: Final fallback model when all providers fail

### 4.2 Model Fallback Chain

**For Each Provider:**
Each provider has a configured fallback chain:

**OpenRouter:**
```
Primary Model → Backup 1 → Backup 2 → Backup 3 → Environment Fallbacks → Static Fallbacks
```

**Mistral:**
```
Primary Model → Backup 1 → Backup 2 → Backup 3 → Default (mistral-large-latest)
```

**Groq:**
```
llama-3.3-70b-versatile → mixtral-8x7b-32768 → llama-3.1-70b-versatile
```

**Google (Gemini):**
```
gemini-1.5-flash → gemini-1.5-pro → gemini-pro
```

**NVIDIA (NIM):**
```
meta/llama-3.1-70b-instruct → meta/llama-3.1-8b-instruct
```
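
A chain like the ones above can be assembled and walked as in the following sketch. The helper names `build_chain` and `first_success` are hypothetical, and de-duplicating repeated models is an assumption made so that a model listed both as primary and as a backup is tried only once.

```python
def build_chain(primary, backups, defaults):
    """Combine primary, backups, and defaults into one ordered chain."""
    chain, seen = [], set()
    for model in [primary, *backups, *defaults]:
        if model and model not in seen:  # skip unset slots and repeats
            seen.add(model)
            chain.append(model)
    return chain


def first_success(chain, call):
    """Try each model in order; return (model, result) for the first that works."""
    errors = []
    for model in chain:
        try:
            return model, call(model)
        except Exception as exc:  # a real system would classify the error first
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


# The Groq chain as listed in this document.
GROQ_CHAIN = build_chain(
    "llama-3.3-70b-versatile",
    ["mixtral-8x7b-32768", "llama-3.1-70b-versatile"],
    [],
)
```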

### 4.3 Planning Chain Fallback

The planning system follows a configured priority chain:

1. Attempts first provider in chain
2. If rate limited or error occurs, moves to next provider
3. Continues through all configured providers
4. Returns error if all providers fail

**Planning Chain Configuration:**
- Configurable via admin panel
- Each entry: `{ provider, model }`
- Priority determines fallback order
- Supports provider prefix in model names (e.g., "groq/compound-mini")
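
The planning chain walk can be sketched as follows, using the `{ provider, model }` entry shape described above. The function names are hypothetical, and stripping a leading provider prefix from the model name before the call is an assumption about how the prefix support works.

```python
def split_model(entry):
    """Return (provider, model), dropping a redundant provider prefix if present."""
    provider, model = entry["provider"], entry["model"]
    prefix = provider + "/"
    if model.startswith(prefix):
        model = model[len(prefix):]
    return provider, model


def run_planning_chain(chain, call):
    """Try each entry in priority order; raise only if every provider fails."""
    failures = []
    for entry in chain:
        provider, model = split_model(entry)
        try:
            return call(provider, model)
        except Exception as exc:  # rate limit or any other provider error
            failures.append(f"{provider}/{model}: {exc}")
    raise RuntimeError("all planning providers failed: " + "; ".join(failures))
```

For example, the entry `{"provider": "groq", "model": "groq/compound-mini"}` is called as provider `groq` with model `compound-mini`.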

### 4.4 Build Fallback Chain

For building operations, the system follows this sequence:

1. **Primary Model**: User-selected or auto-assigned model
2. **Provider Fallback**: Alternative providers with same model
3. **OpenCode Fallback**: Internal OpenCode processing
4. **Ultimate Backup**: Configured backup model (last resort)

### 4.5 Error Classification & Fallback Decision

Errors are classified to determine fallback behavior:

| Error Type | Example | Fallback Action |
|-----------|---------|-----------------|
| Rate Limit (429) | "Too many requests" | Wait 30s, switch provider |
| Server Error (5xx) | "Internal error" | Wait 30s, switch provider |
| Auth Error (401) | "Invalid API key" | Switch immediately |
| Billing Error (402) | "Insufficient credits" | Switch immediately |
| Model Not Found (404) | "Unknown model" | Wait 30s, switch model |
| User Error (400) | "Invalid request" | Return error, no fallback |
| Token Limit | "Context length exceeded" | Return error, no fallback |
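
The table above maps naturally onto a small classifier. The sketch below is one possible encoding: the `Action` enum and `classify` function are hypothetical, detecting the token-limit case by matching the message text is an assumption, and so is returning the error for any status the table does not list.

```python
from enum import Enum

WAIT_SECONDS = 30  # delay used by the "wait" actions in the table


class Action(Enum):
    WAIT_THEN_SWITCH_PROVIDER = "wait 30s, switch provider"
    SWITCH_IMMEDIATELY = "switch immediately"
    WAIT_THEN_SWITCH_MODEL = "wait 30s, switch model"
    RETURN_ERROR = "return error, no fallback"


def classify(status, message=""):
    """Map an HTTP status and error message to a fallback action."""
    if "context length" in message.lower():
        return Action.RETURN_ERROR  # token limit: retrying elsewhere will not help
    if status == 429 or 500 <= status <= 599:
        return Action.WAIT_THEN_SWITCH_PROVIDER
    if status in (401, 402):
        return Action.SWITCH_IMMEDIATELY
    if status == 404:
        return Action.WAIT_THEN_SWITCH_MODEL
    return Action.RETURN_ERROR  # 400 and anything unlisted (assumed default)
```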

**Continue Mechanism:**
- For early terminations, system sends "continue" message
- Retries up to 3 times with same model
- After 3 failures, switches to fallback model
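
The continue mechanism could look roughly like the sketch below. The function `run_with_continue` is hypothetical, `send(model, message)` is assumed to return a `(text, finished)` pair, and concatenating the partial outputs is an assumption about how continued responses are combined.

```python
MAX_CONTINUES = 3  # "continue" retries allowed per model


def run_with_continue(models, prompt, send):
    """Ask each model in turn, nudging it with "continue" when it stops early."""
    for model in models:
        text, finished = send(model, prompt)
        parts = [text]
        attempts = 0
        while not finished and attempts < MAX_CONTINUES:
            attempts += 1
            text, finished = send(model, "continue")
            parts.append(text)
        if finished:
            return model, "".join(parts)
        # Three continues were not enough: fall through to the next model.
    raise RuntimeError("no model produced a complete response")
```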

---

## 5. Admin Configuration Interface

### 5.1 Model Management

The admin panel allows configuration of:

- **Add/Update Models**: Select from discovered models or add custom
- **Provider Priority**: Drag-and-drop reordering of provider fallback order
- **Tier Assignment**: free (1x), plus (2x), pro (3x) multipliers
- **Icon Selection**: Associate icons with models
- **Media Support**: Enable/disable image upload capability

### 5.2 Provider Limits Configuration

The limits interface provides:

- **Provider Selection**: Dropdown for each configured provider
- **Scope Selection**: Per-provider or per-model limits
- **Limit Inputs**: Numeric fields for TPM, TPD, RPM, RPD
- **Live Usage Display**: Current usage statistics per provider
- **Save/Reset**: Persist or revert limit changes

### 5.3 Ultimate Backup Configuration

The admin can configure a final fallback model that will be used when all other providers fail. This ensures system availability even during widespread outages.

---

## 6. Configuration Files

### 6.1 Environment Variables

Key configuration is done via environment variables:

```bash
# Provider API Keys
OPENROUTER_API_KEY=
MISTRAL_API_KEY=
GOOGLE_API_KEY=
GROQ_API_KEY=
NVIDIA_API_KEY=
CHUTES_API_KEY=

# Admin Authentication
ADMIN_USER=
ADMIN_PASSWORD=

# Rate Limiting
ADMIN_LOGIN_RATE_LIMIT=5
USER_LOGIN_RATE_LIMIT=10
API_RATE_LIMIT=100
```

### 6.2 Runtime State

The system maintains runtime state in:

- `.data/.opencode-chat/provider-limits.json` - Persisted limits
- `.data/.opencode-chat/provider-usage.json` - Recent usage
- In-memory state for active sessions and rate tracking
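
Reading and writing these JSON state files might be done as sketched below. The helpers `load_state` and `save_state` are hypothetical, and writing through a temporary file followed by an atomic rename is a swapped-in technique to avoid half-written files; the document does not say how the real system persists state.

```python
import json
import os
import tempfile
from pathlib import Path

STATE_DIR = Path(".data/.opencode-chat")  # directory named in this document


def load_state(path, default):
    """Load a JSON state file, falling back to a default if missing or corrupt."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except (FileNotFoundError, json.JSONDecodeError):
        return default


def save_state(path, state):
    """Write state to a temp file in the same directory, then rename it into place."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state, handle, indent=2)
    os.replace(tmp_name, path)  # atomic when source and target share a filesystem
```

A caller would use, for example, `load_state(STATE_DIR / "provider-limits.json", {})` at startup and `save_state` after each change made in the admin panel.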

---

## 7. Security Considerations

### 7.1 Rate Limiting

- **Login Protection**: 5 attempts/minute, 15-minute lockout
- **API Protection**: 100 requests/minute per user
- **Provider Protection**: Configurable limits prevent abuse

### 7.2 Authentication

- Session-based auth with secure cookies
- OAuth support for Google and GitHub
- Rate-limited login attempts
- Session timeout enforcement

### 7.3 Data Protection

- Provider API keys stored securely
- Usage data retained only as needed
- No sensitive data in logs
- Encrypted session storage

---

## 8. Monitoring & Analytics

### 8.1 Tracking Metrics

The system tracks:

- **User Analytics**: Session duration, feature usage, model preferences
- **Business Metrics**: MRR, LTV, churn rate, CAC
- **Technical Metrics**: AI response times, error rates, queue wait times
- **Provider Metrics**: Per-provider usage and error rates

### 8.2 Admin Dashboard

The tracking page provides:

- Daily/weekly/monthly active users
- Revenue analytics
- Conversion funnels
- Error rate monitoring
- Resource utilization

---

## 9. Summary

PluginCompass provides a robust, multi-provider AI infrastructure with:

1. **Flexible Provider Management**: Support for 6+ AI providers with automatic model discovery
2. **Granular Rate Limiting**: Per-provider and per-model limits with configurable thresholds
3. **Intelligent Fallback**: Multi-level fallback chains ensure high availability
4. **Comprehensive Admin Control**: Full configuration through web-based admin panel
5. **Usage Tracking**: Real-time monitoring of token consumption and API usage
6. **Security Measures**: Rate limiting, authentication, and session management

This architecture ensures reliable AI-powered development while maintaining control over costs and system availability.