shopify-ai-backup/PLUGIN_COMPASS_PROVIDER_SYSTEM.md
southseact-3d 58bab1c5d8 Add PluginCompass Provider System documentation
This document describes the architecture and functionality of the PluginCompass AI provider management system, including:
- Admin panel structure and authentication
- Provider management with supported providers (OpenRouter, Mistral, Google, Groq, NVIDIA, Chutes, Ollama)
- Rate limiting system with per-provider and per-model limits
- Fallback system architecture with multi-level fallback chains
- Usage tracking and monitoring capabilities

The documentation covers both technical implementation details and operational guidance for managing the provider infrastructure.
2026-02-09 18:23:55 +00:00


PluginCompass Provider Management & Fallback System

Overview

This document describes the architecture and functionality of the PluginCompass AI provider management system, including the admin panel configuration, provider limits, usage tracking, and fallback mechanisms.


1. Admin Panel Structure

1.1 Admin Panel Sections

The PluginCompass admin panel is accessible at /admin and provides the following management areas:

Main Admin Areas:

  • Build Models (/admin/build) - Configure AI models available to users
  • Plan Models (/admin/plan) - Configure planning provider chain
  • Plans (/admin/plans) - Manage subscription plans and pricing
  • Accounts (/admin/accounts) - User account management
  • Affiliates (/admin/affiliates) - Affiliate program management
  • Withdrawals (/admin/withdrawals) - Affiliate payout management
  • Tracking (/admin/tracking) - Analytics and usage statistics
  • Resources (/admin/resources) - System resource monitoring
  • External Testing (/admin/external-testing) - WordPress testing configuration
  • Contact Messages (/admin/contact-messages) - Customer inquiries

1.2 Admin Authentication

The admin panel uses session-based authentication with the following security measures:

  • Credentials: Configured via environment variables ADMIN_USER and ADMIN_PASSWORD
  • Session Duration: 24 hours (configurable via ADMIN_SESSION_TTL_MS)
  • Rate Limiting: Maximum 5 login attempts per minute per IP
  • Account Lockout: 15 minutes once the login attempt limit is exceeded
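
The login rate limit and lockout described above could be enforced with a small in-memory tracker. The following is a minimal sketch, not PluginCompass's actual code: the class name `LoginGuard`, its methods, and the choice to start the lockout as soon as the fifth failure lands inside one minute are all assumptions made for illustration.

```python
import time
from collections import defaultdict, deque


class LoginGuard:
    """Hypothetical per-IP limiter: 5 attempts per minute, 15-minute lockout."""

    def __init__(self, max_attempts=5, window_s=60, lockout_s=15 * 60,
                 clock=time.monotonic):
        self.max_attempts = max_attempts
        self.window_s = window_s
        self.lockout_s = lockout_s
        self.clock = clock                  # injectable so the logic can be tested
        self.attempts = defaultdict(deque)  # ip -> timestamps of recent failures
        self.locked_until = {}              # ip -> time at which the lockout ends

    def _prune(self, recent, now):
        # Drop failures that have fallen out of the rate window.
        while recent and now - recent[0] >= self.window_s:
            recent.popleft()

    def is_allowed(self, ip):
        """Return True if this IP may attempt a login right now."""
        now = self.clock()
        if self.locked_until.get(ip, float("-inf")) > now:
            return False
        recent = self.attempts[ip]
        self._prune(recent, now)
        return len(recent) < self.max_attempts

    def record_failure(self, ip):
        """Record a failed login and start a lockout once the limit is reached."""
        now = self.clock()
        recent = self.attempts[ip]
        recent.append(now)
        self._prune(recent, now)
        if len(recent) >= self.max_attempts:
            self.locked_until[ip] = now + self.lockout_s
            recent.clear()
```

A login handler would call `is_allowed(ip)` before checking credentials and `record_failure(ip)` after a rejected attempt.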

API Authentication: Admin API endpoints authenticate requests via session cookies. The session endpoints are:

  • Login: POST /api/admin/login
  • Logout: POST /api/admin/logout
  • Session check: GET /api/admin/me

2. Provider Management

2.1 Supported Providers

PluginCompass supports multiple AI providers for both planning and building:

Build Providers:

  • OpenRouter (primary aggregator)
  • Mistral
  • Google (Gemini)
  • Groq
  • NVIDIA (NIM)
  • Chutes AI
  • OpenCode (internal/self-hosted)
  • Ollama (self-hosted planning)

Planning Providers:

  • OpenRouter
  • Mistral
  • Google (Gemini)
  • Groq
  • NVIDIA (NIM)
  • Ollama (self-hosted)

2.2 Provider Configuration

Each provider requires API credentials configured via environment variables:

| Provider | Environment Variable | Default API URL |
|---|---|---|
| OpenRouter | OPENROUTER_API_KEY | https://openrouter.ai/api/v1 |
| Mistral | MISTRAL_API_KEY | https://api.mistral.ai/v1 |
| Google | GOOGLE_API_KEY | https://generativelanguage.googleapis.com/v1beta2 |
| Groq | GROQ_API_KEY | https://api.groq.com/openai/v1/chat/completions |
| NVIDIA | NVIDIA_API_KEY | https://api.nvidia.com/v1 |
| Chutes AI | CHUTES_API_KEY or PLUGIN_COMPASS_CHUTES_API_KEY | https://api.chutes.ai/v1 |
| Ollama | OLLAMA_API_URL | Configurable self-hosted URL |
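
As an illustration of how the table above might be turned into runtime configuration, the sketch below reads the listed environment variables and returns only the providers that are actually configured. The variable names and URLs come from the table; the dictionary layout, the function name `configured_providers`, and the precedence of `CHUTES_API_KEY` over `PLUGIN_COMPASS_CHUTES_API_KEY` are assumptions.

```python
import os

# Provider -> (accepted key variables, default API URL), as listed in the table.
PROVIDERS = {
    "openrouter": (["OPENROUTER_API_KEY"], "https://openrouter.ai/api/v1"),
    "mistral": (["MISTRAL_API_KEY"], "https://api.mistral.ai/v1"),
    "google": (["GOOGLE_API_KEY"],
               "https://generativelanguage.googleapis.com/v1beta2"),
    "groq": (["GROQ_API_KEY"],
             "https://api.groq.com/openai/v1/chat/completions"),
    "nvidia": (["NVIDIA_API_KEY"], "https://api.nvidia.com/v1"),
    # Assumed precedence: the first variable that is set wins.
    "chutes": (["CHUTES_API_KEY", "PLUGIN_COMPASS_CHUTES_API_KEY"],
               "https://api.chutes.ai/v1"),
}


def configured_providers(env=None):
    """Return {provider: {"api_key", "base_url"}} for providers with credentials."""
    env = os.environ if env is None else env
    result = {}
    for name, (key_vars, url) in PROVIDERS.items():
        key = next((env[v] for v in key_vars if env.get(v)), None)
        if key:
            result[name] = {"api_key": key, "base_url": url}
    # Ollama is self-hosted: it is enabled by a URL rather than an API key.
    if env.get("OLLAMA_API_URL"):
        result["ollama"] = {"api_key": None, "base_url": env["OLLAMA_API_URL"]}
    return result
```

Passing a plain dictionary as `env` makes the function easy to exercise without touching the real environment.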

2.3 Model Discovery

The system automatically discovers available models from each provider:

  1. CLI-based Discovery: Queries OpenCode CLI for available models
  2. Provider API Discovery: Fetches model lists directly from provider APIs
  3. Manual Configuration: Admin can manually add model configurations

Model Configuration per Provider: Each configured model includes:

  • Model identifier (provider/model format)
  • Display label for users
  • Tier classification (free, plus, pro)
  • Icon association
  • Provider priority order
  • Media support flag (image uploads)
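
One way to picture a model configuration entry is as a small validated record. The sketch below is hypothetical: the field names and the `ModelConfig` class are invented for illustration, and only the list of attributes and the three tier names come from this document.

```python
from dataclasses import dataclass, field
from typing import List, Optional

TIERS = ("free", "plus", "pro")


@dataclass
class ModelConfig:
    model_id: str                       # "provider/model" format
    label: str                          # display label shown to users
    tier: str = "free"                  # one of TIERS
    icon: Optional[str] = None          # icon association
    provider_priority: List[str] = field(default_factory=list)
    supports_media: bool = False        # image uploads allowed

    def __post_init__(self):
        if self.tier not in TIERS:
            raise ValueError(f"unknown tier: {self.tier!r}")
        if "/" not in self.model_id:
            raise ValueError("model_id must use the provider/model format")

    @property
    def provider(self):
        # Everything before the first slash is treated as the provider.
        return self.model_id.split("/", 1)[0]
```

Validating in `__post_init__` means a malformed entry is rejected when it is created rather than when a request is routed.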

3. Provider Limits & Usage Tracking

3.1 Rate Limiting System

PluginCompass implements a flexible rate limiting system with the following components:

Limit Types:

  • Tokens per Minute (TPM): Token consumption rate limit
  • Tokens per Day (TPD): Daily token consumption limit
  • Requests per Minute (RPM): API call rate limit
  • Requests per Day (RPD): Daily API call limit

Scope Levels:

  • Per Provider: Limits apply to all models from that provider
  • Per Model: Limits apply to specific models only

Default Behavior:

  • All limits default to 0, and a value of 0 means unlimited
  • Limits are configurable per provider or per model
  • Usage is tracked independently for each provider
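
The limit types, scopes, and zero-means-unlimited default can be sketched as follows. This is an illustration under stated assumptions: the `Limits` record, the nested dictionary layout, and the rule that a model-scoped entry takes precedence over the provider-scoped one are not confirmed by this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    tpm: int = 0  # tokens per minute (0 = unlimited)
    tpd: int = 0  # tokens per day
    rpm: int = 0  # requests per minute
    rpd: int = 0  # requests per day


def resolve_limits(config, provider, model):
    """Pick the limits for a request: model scope first, then provider scope."""
    entry = config.get(provider, {})
    per_model = entry.get("models", {})
    if model in per_model:
        return per_model[model]
    # No entry at all falls back to the all-zero (unlimited) default.
    return entry.get("provider", Limits())


def is_exceeded(limit, used):
    """A limit of 0 never triggers; otherwise usage at or above the cap does."""
    return limit > 0 and used >= limit
```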

3.2 Usage Tracking

The system tracks usage in real-time with the following characteristics:

Tracked Metrics:

  • Tokens consumed per request
  • Number of API requests
  • Timestamps for rate window calculation
  • Per-model breakdown when scoped

Data Retention:

  • Usage data retained for 48 hours for rate limiting
  • Aggregated statistics persisted for reporting
  • Separate tracking for planning vs building

State Files:

  • provider-limits.json: Saved limit configurations
  • provider-usage.json: Recent usage data
  • token-usage.json: User token consumption
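
A rolling usage log is one plausible way to support both per-minute and per-day windows from the same data. The sketch below is an assumption about how such tracking could work, not a description of the real implementation; `UsageTracker` and its methods are hypothetical names, and only the 48-hour retention figure comes from this document.

```python
import time
from collections import deque

RETENTION_S = 48 * 3600  # the 48-hour retention described above


class UsageTracker:
    def __init__(self, clock=time.time):
        self.clock = clock
        self.events = {}  # (provider, model) -> deque of (timestamp, tokens)

    def record(self, provider, model, tokens):
        """Log one request and discard entries older than the retention period."""
        now = self.clock()
        q = self.events.setdefault((provider, model), deque())
        q.append((now, tokens))
        while q and now - q[0][0] > RETENTION_S:
            q.popleft()

    def usage(self, provider, window_s, model=None):
        """Return (requests, tokens) within the last window_s seconds.

        With model=None the totals cover every model of the provider.
        """
        cutoff = self.clock() - window_s
        requests = tokens = 0
        for (p, m), q in self.events.items():
            if p != provider or (model is not None and m != model):
                continue
            for ts, n in q:
                if ts > cutoff:
                    requests += 1
                    tokens += n
        return requests, tokens
```

Calling `usage("groq", 60)` would give the figures to compare against RPM and TPM, and `usage("groq", 86400)` those for RPD and TPD.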

3.3 Limit Enforcement

When a request is made, the system:

  1. Identifies the provider and model
  2. Checks applicable limits (provider-level or model-level)
  3. Compares current usage against limits
  4. Returns rate limit error if exceeded
  5. Records usage after successful requests
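
The five steps above can be sketched as a single guard around the provider call. Everything named here (`enforce`, `RateLimitError`, the dictionary shapes) is hypothetical, and the sketch deliberately omits window resets and the daily limits to stay short.

```python
class RateLimitError(Exception):
    """Raised when a configured limit has been reached."""


def enforce(limits, usage, provider, model, send_request):
    """Check limits, send the request, then record usage on success.

    limits: {(provider, model_or_None): {"rpm": int, "tpm": int}}
    usage:  {provider: {"rpm": int, "tpm": int}} for the current window
    send_request: callable(provider, model) returning the tokens consumed
    """
    # Steps 1-2: prefer model-level limits, otherwise provider-level ones.
    limit = limits.get((provider, model)) or limits.get((provider, None)) or {}
    used = usage.setdefault(provider, {"rpm": 0, "tpm": 0})

    # Steps 3-4: compare current usage to each cap; 0 means unlimited.
    for key in ("rpm", "tpm"):
        cap = limit.get(key, 0)
        if cap > 0 and used[key] >= cap:
            raise RateLimitError(f"{provider}: {key} limit of {cap} reached")

    tokens = send_request(provider, model)

    # Step 5: reached only if send_request did not raise.
    used["rpm"] += 1
    used["tpm"] += tokens
    return tokens
```

Because usage is recorded after the call returns, a request that fails at the provider does not count against the limits in this sketch.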

4. Fallback System

4.1 Fallback Architecture

PluginCompass implements a multi-level fallback system for reliability:

Fallback Levels:

  1. Model-level Fallback: Alternative models within same provider
  2. Provider-level Fallback: Switch to alternative providers
  3. Ultimate Backup: Final fallback model when all providers fail

4.2 Model Fallback Chain

Each provider has a configured fallback chain:

OpenRouter:

Primary Model → Backup 1 → Backup 2 → Backup 3 → Environment Fallbacks → Static Fallbacks

Mistral:

Primary Model → Backup 1 → Backup 2 → Backup 3 → Default (mistral-large-latest)

Groq:

llama-3.3-70b-versatile → mixtral-8x7b-32768 → llama-3.1-70b-versatile

Google (Gemini):

gemini-1.5-flash → gemini-1.5-pro → gemini-pro

NVIDIA (NIM):

meta/llama-3.1-70b-instruct → meta/llama-3.1-8b-instruct
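
Walking a model chain amounts to trying each entry in order and returning the first success. The sketch below uses the Groq chain listed above as sample data; the function name `first_success` and the decision to treat any exception as a reason to move on are assumptions.

```python
GROQ_CHAIN = [
    "llama-3.3-70b-versatile",
    "mixtral-8x7b-32768",
    "llama-3.1-70b-versatile",
]


def first_success(chain, call):
    """Try each model in order; return (model, result) for the first that works.

    call: callable(model) that returns a result or raises on failure.
    """
    failed = []
    for model in chain:
        try:
            return model, call(model)
        except Exception:  # a real system would filter by error type
            failed.append(model)
    raise RuntimeError("all models failed: " + ", ".join(failed))
```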

4.3 Planning Chain Fallback

The planning system follows a configured priority chain:

  1. Attempts first provider in chain
  2. If rate limited or error occurs, moves to next provider
  3. Continues through all configured providers
  4. Returns error if all providers fail

Planning Chain Configuration:

  • Configurable via admin panel
  • Each entry: { provider, model }
  • Priority determines fallback order
  • Supports provider prefix in model names (e.g., "groq/compound-mini")
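
A planning chain of `{ provider, model }` entries might be processed as in the sketch below. The handling of provider prefixes is an assumption: here a prefix is stripped only when it matches the entry's own provider, so that model names containing slashes for other reasons are left intact. The function names are hypothetical.

```python
def normalize(entry):
    """Return (provider, model), stripping a redundant "provider/" prefix."""
    provider, model = entry["provider"], entry["model"]
    prefix = provider + "/"
    if model.startswith(prefix):
        model = model[len(prefix):]
    return provider, model


def run_planning_chain(chain, call):
    """Try each chain entry in priority order and return the first result.

    call: callable(provider, model) that returns a plan or raises on
    rate limiting or any other error.
    """
    failures = []
    for entry in chain:
        provider, model = normalize(entry)
        try:
            return call(provider, model)
        except Exception as exc:
            failures.append(f"{provider}/{model}: {exc}")
    raise RuntimeError("all planning providers failed: " + "; ".join(failures))
```

For example, `normalize({"provider": "groq", "model": "groq/compound-mini"})` yields `("groq", "compound-mini")`.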

4.4 Build Fallback Chain

For building operations, the system follows this sequence:

  1. Primary Model: User-selected or auto-assigned model
  2. Provider Fallback: Alternative providers with same model
  3. OpenCode Fallback: Internal OpenCode processing
  4. Ultimate Backup: Configured backup model (last resort)

4.5 Error Classification & Fallback Decision

Errors are classified to determine fallback behavior:

| Error Type | Example | Fallback Action |
|---|---|---|
| Rate Limit (429) | "Too many requests" | Wait 30s, switch provider |
| Server Error (5xx) | "Internal error" | Wait 30s, switch provider |
| Auth Error (401) | "Invalid API key" | Switch immediately |
| Billing Error (402) | "Insufficient credits" | Switch immediately |
| Model Not Found (404) | "Unknown model" | Wait 30s, switch model |
| User Error (400) | "Invalid request" | Return error, no fallback |
| Token Limit | "Context length exceeded" | Return error, no fallback |
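
The classification table can be expressed as a small lookup from HTTP status to fallback action. In this sketch the `Action` names, the message-based detection of token-limit errors, and the choice to return an error for any status not listed in the table are assumptions.

```python
from enum import Enum


class Action(Enum):
    WAIT_THEN_SWITCH_PROVIDER = "wait 30s, switch provider"
    SWITCH_IMMEDIATELY = "switch immediately"
    WAIT_THEN_SWITCH_MODEL = "wait 30s, switch model"
    RETURN_ERROR = "return error, no fallback"


def classify(status, message=""):
    """Map an HTTP status (and optional error message) to a fallback action."""
    # Token-limit errors carry no dedicated status in the table, so this
    # sketch recognizes them by message text.
    if "context length" in message.lower():
        return Action.RETURN_ERROR
    if status == 429 or 500 <= status <= 599:
        return Action.WAIT_THEN_SWITCH_PROVIDER
    if status in (401, 402):
        return Action.SWITCH_IMMEDIATELY
    if status == 404:
        return Action.WAIT_THEN_SWITCH_MODEL
    return Action.RETURN_ERROR  # 400 and anything unlisted


def wait_seconds(action):
    """Delay before acting: 30 seconds for the two 'wait' actions, else none."""
    waits = (Action.WAIT_THEN_SWITCH_PROVIDER, Action.WAIT_THEN_SWITCH_MODEL)
    return 30 if action in waits else 0
```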

Continue Mechanism:

  • When a response terminates early, the system sends a "continue" message
  • Retries up to 3 times with same model
  • After 3 failures, switches to fallback model
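
The continue mechanism might look like the loop below. This is a sketch under several assumptions: that a "failure" means the response is still unfinished after a "continue" message, that the fallback model restarts from the original prompt, and that the `send` callable returns a `(text, finished)` pair. None of these details are specified in this document.

```python
def run_with_continue(send, model, fallback_model, prompt, max_continues=3):
    """Return (model_used, text), nudging a truncated response to continue.

    send: callable(model, message) -> (text, finished)
    """
    text, finished = send(model, prompt)
    attempts = 0
    while not finished and attempts < max_continues:
        # Ask the same model to pick up where it stopped.
        more, finished = send(model, "continue")
        text += more
        attempts += 1
    if finished:
        return model, text
    # Still unfinished after three "continue" attempts: switch models.
    text, _ = send(fallback_model, prompt)
    return fallback_model, text
```

In this sketch a fallback response is returned even if it is also truncated; a fuller version would apply the same loop to the fallback model.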

5. Admin Configuration Interface

5.1 Model Management

The admin panel allows configuration of:

  • Add/Update Models: Select from discovered models or add custom
  • Provider Priority: Drag-and-drop reordering of provider fallback order
  • Tier Assignment: free (1x), plus (2x), pro (3x) multipliers
  • Icon Selection: Associate icons with models
  • Media Support: Enable/disable image upload capability

5.2 Provider Limits Configuration

The limits interface provides:

  • Provider Selection: Dropdown for each configured provider
  • Scope Selection: Per-provider or per-model limits
  • Limit Inputs: Numeric fields for TPM, TPD, RPM, RPD
  • Live Usage Display: Current usage statistics per provider
  • Save/Reset: Persist or revert limit changes

5.3 Ultimate Backup Configuration

The admin can configure a final fallback model that will be used when all other providers fail. This ensures system availability even during widespread outages.


6. Configuration Files

6.1 Environment Variables

Key configuration is done via environment variables:

# Provider API Keys
OPENROUTER_API_KEY=
MISTRAL_API_KEY=
GOOGLE_API_KEY=
GROQ_API_KEY=
NVIDIA_API_KEY=
CHUTES_API_KEY=

# Admin Authentication
ADMIN_USER=
ADMIN_PASSWORD=

# Rate Limiting
ADMIN_LOGIN_RATE_LIMIT=5
USER_LOGIN_RATE_LIMIT=10
API_RATE_LIMIT=100
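
Numeric settings such as these are usually parsed defensively so that a missing or malformed value does not crash startup. The sketch below treats the values shown in the listing, plus the 24-hour session duration from Section 1.2, as defaults; whether the application actually falls back to them is an assumption, and `env_int` and `load_settings` are hypothetical helper names.

```python
import os

# Assumed defaults, mirroring the values quoted in this document.
DEFAULTS = {
    "ADMIN_LOGIN_RATE_LIMIT": 5,
    "USER_LOGIN_RATE_LIMIT": 10,
    "API_RATE_LIMIT": 100,
    "ADMIN_SESSION_TTL_MS": 24 * 60 * 60 * 1000,  # 24 hours
}


def env_int(name, default, env=None):
    """Read a non-negative integer setting, falling back to the default."""
    env = os.environ if env is None else env
    raw = (env.get(name) or "").strip()
    try:
        value = int(raw)
    except ValueError:
        return default  # unset, empty, or not a number
    return value if value >= 0 else default


def load_settings(env=None):
    """Return every numeric setting with defaults applied."""
    return {name: env_int(name, default, env)
            for name, default in DEFAULTS.items()}
```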

6.2 Runtime State

The system maintains runtime state in:

  • .data/.opencode-chat/provider-limits.json - Persisted limits
  • .data/.opencode-chat/provider-usage.json - Recent usage
  • In-memory state for active sessions and rate tracking
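
JSON state files like these are typically loaded with a fallback for the first run and saved atomically so that a crash cannot leave a half-written file. The following sketch shows one common pattern; the helper names are hypothetical and this document does not describe how PluginCompass actually writes its state.

```python
import json
import os
import tempfile
from pathlib import Path


def load_state(path, default):
    """Return the parsed JSON file, or the default if it is missing or corrupt."""
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return default


def save_state(path, data):
    """Write JSON to a temporary file, then atomically swap it into place."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(data, f)
    os.replace(tmp, path)  # atomic when both paths are on the same filesystem
```

Creating the temporary file in the target directory keeps both paths on the same filesystem, which is what makes the final rename atomic.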

7. Security Considerations

7.1 Rate Limiting

  • Login Protection: 5 attempts/minute, 15-minute lockout
  • API Protection: 100 requests/minute per user
  • Provider Protection: Configurable limits prevent abuse

7.2 Authentication

  • Session-based auth with secure cookies
  • OAuth support for Google and GitHub
  • Rate-limited login attempts
  • Session timeout enforcement

7.3 Data Protection

  • Provider API keys stored securely
  • Usage data retained only as needed
  • No sensitive data in logs
  • Encrypted session storage

8. Monitoring & Analytics

8.1 Tracking Metrics

The system tracks:

  • User Analytics: Session duration, feature usage, model preferences
  • Business Metrics: MRR, LTV, churn rate, CAC
  • Technical Metrics: AI response times, error rates, queue wait times
  • Provider Metrics: Per-provider usage and error rates

8.2 Admin Dashboard

The tracking page provides:

  • Daily/weekly/monthly active users
  • Revenue analytics
  • Conversion funnels
  • Error rate monitoring
  • Resource utilization

9. Summary

PluginCompass provides a robust, multi-provider AI infrastructure with:

  1. Flexible Provider Management: Support for the eight build providers and six planning providers listed in Section 2.1, with automatic model discovery
  2. Granular Rate Limiting: Per-provider and per-model limits with configurable thresholds
  3. Intelligent Fallback: Multi-level fallback chains ensure high availability
  4. Comprehensive Admin Control: Full configuration through web-based admin panel
  5. Usage Tracking: Real-time monitoring of token consumption and API usage
  6. Security Measures: Rate limiting, authentication, and session management

This architecture ensures reliable AI-powered development while maintaining control over costs and system availability.