From b313994578011e3ffbebeb072d4c5838a5369200 Mon Sep 17 00:00:00 2001
From: Liam Hetherington
Date: Thu, 12 Feb 2026 09:13:53 +0000
Subject: [PATCH] Delete Chutes AI Tool Call Issue Plan document

Removed detailed analysis and proposed solutions for the Chutes AI tool
call counting issue.
---
 opencode/CHUTES_AI_TOOL_CALL_ISSUE_PLAN.md | 391 ---------------------
 1 file changed, 391 deletions(-)

diff --git a/opencode/CHUTES_AI_TOOL_CALL_ISSUE_PLAN.md b/opencode/CHUTES_AI_TOOL_CALL_ISSUE_PLAN.md
index 3727d26..8b13789 100644
--- a/opencode/CHUTES_AI_TOOL_CALL_ISSUE_PLAN.md
+++ b/opencode/CHUTES_AI_TOOL_CALL_ISSUE_PLAN.md
@@ -1,392 +1 @@

# Chutes AI Tool Call Counting Issue - Analysis and Plan

**Status:** Under Investigation
**Date:** February 11, 2026
**Related Files:**
- `opencode/packages/opencode/src/session/llm.ts` (lines 87-167)
- `opencode/packages/opencode/src/session/processor.ts`
- `opencode/packages/opencode/src/session/prompt.ts` (lines 716-745, 755-837)

---

## Executive Summary

When using Chutes AI with opencode, every tool call round-trip is billed as a separate API request, causing excessive billing. The cause is the Vercel AI SDK's `streamText` function, which handles multi-step tool execution automatically.

---

## The Problem

### Current Behavior (Undesired)

The Vercel AI SDK's `streamText` function automatically executes tools and sends their results back to the model in **multiple HTTP requests**:

```
Step 1: Initial Request
- Model receives prompt + available tools
- Model returns: "read file X"
- SDK executes tool locally

Step 2: Automatic Follow-up Request (NEW API CALL!)
- SDK sends tool results back to model
- Model returns: "edit file X with changes"
- SDK executes tool locally

Step 3: Automatic Follow-up Request (NEW API CALL!)
- SDK sends tool results back to model
- Model returns final response
- Stream ends
```

**Result:** Each step after the first counts as a separate Chutes AI API request, multiplying costs.

### Root Cause

In `llm.ts` lines 87-167, `streamText` is called with tools that include `execute` functions:

```typescript
// prompt.ts lines 716-745
const result = await item.execute(args, ctx) // <-- Tools have execute functions
```

When tools have `execute` functions, the SDK automatically:
1. Executes the tools when the model requests them
2. Sends the results back to the model in a **new API request**
3. Continues this process for multiple steps

---

## User Concerns (Critical Issues)

### Concern #1: Model Won't See Tool Results in Time

**Issue:** If we limit the SDK to `maxSteps: 1`, the model will:
1. Call "read file"
2. SDK executes it
3. SDK STOPS (doesn't send results back)
4. Model never sees the file contents, so it cannot make edit decisions

**Impact:** Breaks sequential workflows like read→edit.

### Concern #2: Model Can't Do Multiple Tool Calls

**Issue:** Will the model be limited to only one tool call per session/iteration?

**Impact:** Complex multi-step tasks become impossible.

### Concern #3: Session Completion Timing

**Issue:** Will tool results only become available after the entire session finishes?

**Impact:** Model can't react to tool outputs in real time.

---

## Technical Analysis

### How Tool Execution Currently Works

1. **Opencode's Outer Loop** (`prompt.ts:282`):
   ```typescript
   while (true) {
     // Each iteration is one "step"
     const tools = await resolveTools({...})
     const result = await processor.process({tools, ...})
   }
   ```

2. **SDK's Internal Multi-Step** (`llm.ts:87`):
   ```typescript
   const result = streamText({
     tools, // Tools with execute functions
     // No maxSteps or stopWhen parameter!
   })
   ```

3. **Processor Handles Events** (`processor.ts:94`):
   - `tool-input-start`: Tool call begins
   - `tool-call`: Tool is called
   - `tool-result`: Tool execution completes
   - `finish-step`: Step ends

### Message Flow

**Current Flow (SDK Multi-Step):**
```
┌─────────────┐      ┌────────────┐      ┌──────────────┐
│  Opencode   │─────▶│    SDK     │─────▶│  Chutes AI   │
│    Loop     │      │ streamText │      │ API Request 1│
└─────────────┘      └────────────┘      └──────────────┘
                           │                     │
                           │                     ▼
                           │             ┌──────────────┐
                           │             │ Model decides│
                           │             │ to call tools│
                           │             └──────────────┘
                           │                     │
                           ▼                     │
                     ┌──────────┐                │
                     │ Executes │                │
                     │  tools   │                │
                     └──────────┘                │
                           │                     │
                           ▼                     ▼
                    ┌──────────────┐      ┌──────────────┐
                    │  Chutes AI   │◀─────│  SDK sends   │
                    │ API Request 2│      │ tool results │
                    └──────────────┘      └──────────────┘
```

**Proposed Flow (maxSteps: 1):**
```
┌─────────────┐      ┌─────────────┐      ┌──────────────┐
│  Opencode   │─────▶│     SDK     │─────▶│  Chutes AI   │
│    Loop     │      │ streamText  │      │ API Request 1│
└─────────────┘      │ (maxSteps:1)│      └──────────────┘
       ▲             └─────────────┘              │
       │                    │                     │
       │                    ▼                     ▼
       │              ┌──────────┐        ┌──────────────┐
       │              │ Executes │        │ Model decides│
       │              │  tools   │        │ to call tools│
       │              └──────────┘        └──────────────┘
       │                    │                     │
       │                    │              SDK STOPS here
       │                    │             (doesn't send the
       │                    ▼              results back)
       │            ┌──────────────┐              │
       │            │ Tool results │              │
       │            │  stored in   │              │
       │            │   opencode   │              │
       │            └──────────────┘              │
       │                    │                     │
       └────────────────────┴─────────────────────┘
                            │
              Next opencode loop iteration:
           tool results included in messages
                            │
                            ▼
             ┌──────────────────────────────┐
             │ Model now sees tool results  │
             │ and can make next decision   │
             └──────────────────────────────┘
```

---

## Proposed Solutions

### Option A: Add `maxSteps: 1` / `stopWhen: stepCountIs(1)` (Quick Fix)

**Change:** Add a stop condition to the `streamText` call in `llm.ts:87`. (In AI SDK 5, the older `maxSteps` option is expressed as `stopWhen: stepCountIs(n)`.)

```typescript
import { stepCountIs } from 'ai'

const result = streamText({
  // ... existing options
  stopWhen: stepCountIs(1),
  // ...
})
```

**Pros:**
- Prevents the SDK from making multiple LLM calls internally
- Each `streamText()` call = exactly 1 Chutes AI request
- Opencode's outer loop handles iterations with full control

**Cons:**
- **MAY BREAK SEQUENTIAL WORKFLOWS:** the model won't see tool results until the next opencode loop iteration
- Tool execution still happens, but results aren't automatically fed back

**Risk Level:** HIGH - May break multi-step tool workflows

### Option B: Remove `execute` Functions from Tools

**Change:** Modify `prompt.ts` to pass tools WITHOUT `execute` functions:

```typescript
// Instead of:
tools[item.id] = tool({
  execute: async (args, options) => { ... } // Remove this
})

// Use:
tools[item.id] = tool({
  description: item.description,
  inputSchema: jsonSchema(schema as any),
  // NO execute function - SDK won't auto-execute
})
```

Then manually execute tools in `processor.ts` when `tool-call` events are received; a sketch of this manual loop follows below.

**Pros:**
- SDK never automatically executes tools
- Full control over execution flow
- No hidden API requests

**Cons:**
- Requires significant refactoring of tool handling
- Need to manually implement the tool execution loop
- Risk of introducing bugs

**Risk Level:** MEDIUM-HIGH - Requires substantial code changes
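To make Option B concrete, here is a minimal sketch of the manual execution loop. It is not opencode's actual implementation: the `registry` object and its `run` function are hypothetical stand-ins for opencode's real tool registry, and the message shapes assume the AI SDK 5 `ModelMessage` types.

```typescript
import { streamText, tool, jsonSchema, type LanguageModel, type ModelMessage, type ToolSet } from "ai"

// Hypothetical stand-ins for opencode's real model wiring and tool registry.
declare const model: LanguageModel
declare const registry: Record<
  string,
  { description: string; schema: object; run: (args: unknown) => Promise<string> }
>

// Tools are declared WITHOUT `execute`, so the SDK reports tool calls in the
// stream but never runs them and never issues a follow-up HTTP request.
const tools: ToolSet = Object.fromEntries(
  Object.entries(registry).map(([id, t]) => [
    id,
    tool({ description: t.description, inputSchema: jsonSchema(t.schema as any) }),
  ]),
)

// One opencode loop iteration = exactly one Chutes AI request.
async function runOneStep(messages: ModelMessage[]): Promise<ModelMessage[]> {
  const result = streamText({ model, tools, messages })

  for await (const part of result.fullStream) {
    if (part.type === "tool-call") {
      // Execute the tool ourselves, then record both the call and its result
      // so the model sees them in the NEXT iteration's messages.
      const output = await registry[part.toolName].run(part.input)
      messages.push(
        {
          role: "assistant",
          content: [{ type: "tool-call", toolCallId: part.toolCallId, toolName: part.toolName, input: part.input }],
        },
        {
          role: "tool",
          content: [
            {
              type: "tool-result",
              toolCallId: part.toolCallId,
              toolName: part.toolName,
              output: { type: "text", value: output },
            },
          ],
        },
      )
    }
  }
  return messages
}
```

Because tool results are appended to `messages` by our own code rather than echoed back by the SDK, the read→edit chain from Concern #1 continues on the next outer-loop iteration, with exactly one Chutes request per iteration.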
### Option C: Provider-Specific Configuration for Chutes

**Change:** Detect the Chutes provider and apply special handling:

```typescript
const result = streamText({
  // ... existing options
  ...(input.model.providerID === 'chutes' && {
    stopWhen: stepCountIs(1),
  }),
  // ...
})
```

**Pros:**
- Only affects Chutes AI; other providers work as before
- Minimal code changes
- Can test specifically with Chutes

**Cons:**
- Still has the same risks as Option A
- Provider-specific code adds complexity

**Risk Level:** MEDIUM - Targeted fix but still risky

### Option D: Keep Current Behavior + Documentation

**Change:** None - just document the behavior

**Pros:**
- No code changes = no risk of breaking anything
- Works correctly for sequential workflows

**Cons:**
- Chutes AI users pay for multiple requests
- Not a real solution

**Risk Level:** NONE - But doesn't solve the problem

---

## Key Findings

1. **SDK Version:** Using `ai@5.0.124` (from root package.json line 43)

2. **Default Behavior:** The AI SDK docs say `stopWhen` defaults to `stepCountIs(1)`, and the code never sets it explicitly, yet the observed behavior suggests multi-step execution is happening; this discrepancy needs verification

3. **Tool Execution:** Even with `maxSteps: 1`, tools WILL still execute because they have `execute` functions; the SDK just won't automatically send results back to the model

4. **Message Conversion:** `MessageV2.toModelMessages()` (line 656 in message-v2.ts) already handles converting tool results back into model messages for the next iteration

5. **Opencode's Loop:** The outer `while (true)` loop in `prompt.ts:282` manages the conversation flow and WILL include tool results in the next iteration's messages

---

## Critical Questions to Resolve

1. **Does the model ACTUALLY lose context with `maxSteps: 1`?**
   - Theory: the SDK executes tools, stores results, and the opencode loop includes them in the next iteration
   - Need to verify: does the model see results in time to make sequential decisions?

2. **What happens to parallel tool calls?**
   - If the model calls 3 tools at once, will they all execute before the next iteration?
   - Or will opencode's loop serialize them?

3. **How does this affect Chutes AI billing specifically?**
   - Does Chutes count (a) HTTP requests, (b) tokens, or (c) conversation steps?
   - If (a), then `maxSteps: 1` definitely helps
   - If (b) or (c), it may not help as much

4. **Can we test without affecting production?**
   - Need a test environment or feature flag
   - Should A/B test with different providers

---

## Recommended Next Steps

1. **Create a Test Branch:** Implement Option A (`maxSteps: 1`) in isolation
2. **Test Sequential Workflows:** Verify read→edit workflows still work
3. **Monitor Request Count:** Log actual HTTP requests to the Chutes API (see the request-counting sketch under Code References below)
4. **Measure Latency:** Check if response times change significantly
5. **Test Parallel Tool Calls:** Ensure multiple tools in one step work correctly
6. **Document Behavior:** Update documentation to explain the flow
7. **Consider Option B:** If Option A breaks workflows, implement manual tool execution

---

## Code References

### streamText Call (llm.ts:87-167)
```typescript
const result = streamText({
  onError(error) { ... },
  async experimental_repairToolCall(failed) { ... },
  temperature: params.temperature,
  topP: params.topP,
  topK: params.topK,
  providerOptions: ProviderTransform.providerOptions(input.model, params.options),
  activeTools: Object.keys(tools).filter((x) => x !== "invalid"),
  tools, // <-- These have execute functions!
  maxOutputTokens,
  abortSignal: input.abort,
  headers: { ... },
  maxRetries: 0,
  messages: [ ... ],
  model: wrapLanguageModel({ ... }),
  experimental_telemetry: { ... },
  // MISSING: maxSteps or stopWhen parameter!
})
```
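### Request Counting Sketch (hypothetical)

In support of next step 3 above, a sketch of how outbound HTTP requests could be counted, assuming Chutes is reached through `@ai-sdk/openai-compatible`. The custom `fetch` option is a standard AI SDK provider option; the provider name, base URL, and environment variable here are illustrative, not opencode's actual configuration.

```typescript
import { createOpenAICompatible } from "@ai-sdk/openai-compatible"

// Count every HTTP request the SDK sends to the provider.
let requestCount = 0

const countingFetch: typeof fetch = async (input, init) => {
  requestCount++
  const url = typeof input === "string" ? input : input instanceof URL ? input.href : input.url
  console.log(`[chutes] request #${requestCount}: ${url}`)
  return fetch(input, init)
}

// Hypothetical provider wiring; substitute the real Chutes base URL and key.
const chutes = createOpenAICompatible({
  name: "chutes",
  baseURL: "https://llm.chutes.ai/v1",
  apiKey: process.env.CHUTES_API_KEY,
  fetch: countingFetch,
})
```

With the current behavior, `requestCount` should jump once per SDK step; if Option A works as intended, it should increase by exactly one per opencode loop iteration.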
### Tool Definition with Execute (prompt.ts:716-745)
```typescript
tools[item.id] = tool({
  id: item.id as any,
  description: item.description,
  inputSchema: jsonSchema(schema as any),
  async execute(args, options) {
    const ctx = context(args, options)
    await Plugin.trigger("tool.execute.before", ...)
    const result = await item.execute(args, ctx) // <-- Execute function!
    await Plugin.trigger("tool.execute.after", ...)
    return result
  },
})
```

### Opencode's Outer Loop (prompt.ts:282)
```typescript
let step = 0

while (true) {
  SessionStatus.set(sessionID, { type: "busy" })
  step++

  const tools = await resolveTools({...})

  const result = await processor.process({
    tools,
    model,
    // ...
  })

  if (result === "stop") break
}
```

---

## Conclusion

The issue is confirmed: the Vercel AI SDK's automatic multi-step execution causes multiple Chutes AI API requests per conversation turn. The proposed fix of adding `maxSteps: 1` / `stopWhen: stepCountIs(1)` would reduce this to one request per opencode loop iteration.

However, the user's concerns about breaking sequential workflows are valid and need thorough testing before implementation. The recommended approach is to create a test branch, verify all workflow types, and then decide on the best solution. A minimal one-request verification sketch is appended after this section.

**Priority:** HIGH
**Effort:** LOW for Option A, HIGH for Option B
**Risk:** MEDIUM-HIGH (may break existing workflows)
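---

## Appendix: One-Request Verification Sketch

As a starting point for the verification mentioned above, a sketch of a one-request assertion for the test branch. It assumes the `chutes` provider and `requestCount` counter from the request-counting sketch, plus opencode's existing tool set; the model ID and prompt are placeholders, and `generateText` is used here for brevity in place of the streaming path.

```typescript
import { generateText, stepCountIs, type LanguageModel, type ToolSet } from "ai"

// Assumed to exist from the request-counting sketch above.
declare const chutes: (modelId: string) => LanguageModel
declare let requestCount: number
// Opencode's tools, execute functions included: they still run locally.
declare const tools: ToolSet

async function assertSingleRequest() {
  const before = requestCount
  await generateText({
    model: chutes("deepseek-ai/DeepSeek-V3"), // hypothetical model ID
    stopWhen: stepCountIs(1), // the Option A change under test
    tools,
    prompt: "Read package.json and report the package name.",
  })
  const used = requestCount - before
  if (used !== 1) throw new Error(`expected 1 Chutes request, saw ${used}`)
  console.log("OK: one step, one billed request")
}
```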