Delete Chutes AI Tool Call Issue Plan document
Removed detailed analysis and proposed solutions for the Chutes AI tool call counting issue.
# Chutes AI Tool Call Counting Issue - Analysis and Plan

**Status:** Under Investigation

**Date:** February 11, 2026

**Related Files:**

- `opencode/packages/opencode/src/session/llm.ts` (lines 87-167)
- `opencode/packages/opencode/src/session/processor.ts`
- `opencode/packages/opencode/src/session/prompt.ts` (lines 716-745, 755-837)

---
## Executive Summary

When using Chutes AI with opencode, each tool call round-trip counts as a separate API request, causing excessive billing. The cause is the Vercel AI SDK's `streamText` function, which handles multi-step tool execution automatically and issues a new HTTP request for every follow-up step.

---
## The Problem

### Current Behavior (Undesired)

The Vercel AI SDK's `streamText` function automatically executes tools and sends results back to the model in **multiple HTTP requests**:

```
Step 1: Initial Request
- Model receives prompt + available tools
- Model returns: "read file X"
- SDK executes tool locally

Step 2: Automatic Follow-up Request (NEW API CALL!)
- SDK sends tool results back to model
- Model returns: "edit file X with changes"
- SDK executes tool locally

Step 3: Automatic Follow-up Request (NEW API CALL!)
- SDK sends tool results back to model
- Model returns final response
- Stream ends
```

**Result:** Each step after the first counts as a separate Chutes AI API request, multiplying costs.

### Root Cause

In `llm.ts` lines 87-167, `streamText` is called with tools that include `execute` functions:

```typescript
// prompt.ts lines 716-745
const result = await item.execute(args, ctx) // <-- Tools have execute functions
```

When tools have `execute` functions, the SDK automatically:

1. Executes the tools when the model requests them
2. Sends the results back to the model in a **new API request**
3. Continues this process for multiple steps
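The billing effect of that loop can be seen in a small simulation. This is an illustrative stand-in for the SDK, not its real code: the model stub, the tool registry, and the loop below are all hypothetical, but they mirror the multi-step behavior described above.

```typescript
// Hypothetical simulation of the SDK's multi-step loop; every call to
// the model stub stands for one billable HTTP request to Chutes AI.
type ToolCall = { name: string; args: string }
type ModelReply = { toolCall?: ToolCall; text?: string }

let apiRequests = 0

function fakeModel(messages: string[]): ModelReply {
  apiRequests++ // one HTTP request per model call
  // First call: ask for a tool. Once a tool result is present: answer.
  if (!messages.some((m) => m.startsWith("tool-result:"))) {
    return { toolCall: { name: "read", args: "file.txt" } }
  }
  return { text: "done" }
}

const simTools: Record<string, (args: string) => string> = {
  read: (args) => `contents of ${args}`,
}

// Because the tool has an execute function, the loop runs it locally
// and sends the result back to the model in a NEW request, repeating
// until the model stops calling tools.
function multiStepLoop(prompt: string): string {
  const messages = [prompt]
  while (true) {
    const reply = fakeModel(messages)
    if (!reply.toolCall) return reply.text!
    messages.push(`tool-result: ${simTools[reply.toolCall.name](reply.toolCall.args)}`)
  }
}

console.log(multiStepLoop("edit file.txt"), apiRequests) // "done", 2 requests
```

One read→answer round-trip already costs two requests; each additional tool step adds another.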

---

## User Concerns (Critical Issues)

### Concern #1: Model Won't See Tool Results in Time

**Issue:** If we limit to `maxSteps: 1`, the model will:

1. Call "read file"
2. SDK executes it
3. SDK STOPS (doesn't send results back)
4. Model never sees the file contents to make edit decisions

**Impact:** Breaks sequential workflows like read→edit.

### Concern #2: Model Can't Do Multiple Tool Calls

**Issue:** Will the model be limited to only one tool call per session/iteration?

**Impact:** Complex multi-step tasks become impossible.

### Concern #3: Session Completion Timing

**Issue:** Will tool results only be available after the entire session finishes?

**Impact:** Model can't react to tool outputs in real-time.

---

## Technical Analysis

### How Tool Execution Currently Works

1. **Opencode's Outer Loop** (`prompt.ts:282`):

   ```typescript
   while (true) {
     // Each iteration is one "step"
     const tools = await resolveTools({...})
     const result = await processor.process({tools, ...})
   }
   ```

2. **SDK's Internal Multi-Step** (`llm.ts:87`):

   ```typescript
   const result = streamText({
     tools, // Tools with execute functions
     // No maxSteps or stopWhen parameter!
   })
   ```

3. **Processor Handles Events** (`processor.ts:94`):

   - `tool-input-start`: Tool call begins
   - `tool-call`: Tool is called
   - `tool-result`: Tool execution completes
   - `finish-step`: Step ends
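A sketch of how those events relate to step (and therefore request) counting; the event payload shapes below are invented for illustration, only the event names come from `processor.ts`:

```typescript
// Hypothetical event shapes; real processor.ts payloads differ.
type StreamEvent =
  | { type: "tool-input-start"; id: string }
  | { type: "tool-call"; id: string; toolName: string }
  | { type: "tool-result"; id: string; output: string }
  | { type: "finish-step" }

// Each finish-step marks a step boundary - in the current setup, the
// point where the SDK may issue another API request.
function countSteps(events: StreamEvent[]): number {
  return events.filter((e) => e.type === "finish-step").length
}

const sampleEvents: StreamEvent[] = [
  { type: "tool-input-start", id: "1" },
  { type: "tool-call", id: "1", toolName: "read" },
  { type: "tool-result", id: "1", output: "..." },
  { type: "finish-step" },
  { type: "finish-step" },
]
console.log(countSteps(sampleEvents)) // 2
```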

### Message Flow

**Current Flow (SDK Multi-Step):**

```
┌─────────────┐      ┌────────────┐      ┌──────────────┐
│  Opencode   │─────▶│    SDK     │─────▶│  Chutes AI   │
│    Loop     │      │ streamText │      │ API Request 1│
└─────────────┘      └────────────┘      └──────────────┘
                           │                     │
                           │                     ▼
                           │            ┌──────────────┐
                           │            │ Model decides│
                           │            │ to call tools│
                           │            └──────────────┘
                           ▼                     │
                     ┌──────────┐                │
                     │ Executes │                │
                     │  tools   │                │
                     └──────────┘                │
                           │                     │
                           ▼                     ▼
                    ┌──────────────┐    ┌──────────────┐
                    │  Chutes AI   │◀───│  SDK sends   │
                    │ API Request 2│    │ tool results │
                    └──────────────┘    └──────────────┘
```

**Proposed Flow (maxSteps: 1):**

```
┌─────────────┐      ┌─────────────┐      ┌──────────────┐
│  Opencode   │─────▶│     SDK     │─────▶│  Chutes AI   │
│    Loop     │      │ streamText  │      │ API Request 1│
└─────────────┘      │ (maxSteps:1)│      └──────────────┘
       ▲             └─────────────┘              │
       │                    │                     │
       │                    ▼                     ▼
       │              ┌──────────┐        ┌──────────────┐
       │              │ Executes │        │ Model decides│
       │              │  tools   │        │ to call tools│
       │              └──────────┘        └──────────────┘
       │                    │                     │
       │                    │              ┌──────┘
       │                    │              │ SDK STOPS
       │                    ▼              │ (doesn't send
       │             ┌──────────────┐      │  results back)
       │             │ Tool results │      │
       │             │  stored in   │      │
       │             │   opencode   │      │
       │             └──────────────┘      │
       │                    │              │
       └────────────────────┴──────────────┘
                            │
              Next opencode loop iteration
           Tool results included in messages
                            │
                            ▼
             ┌──────────────────────────────┐
             │ Model now sees tool results  │
             │ and can make next decision   │
             └──────────────────────────────┘
```

---

## Proposed Solutions

### Option A: Add `maxSteps: 1` Parameter (Quick Fix)

**Change:** Add to `llm.ts:87` (in AI SDK v5, `stopWhen: stepCountIs(1)` replaces the older `maxSteps` option):

```typescript
import { stepCountIs } from 'ai'

const result = streamText({
  // ... existing options
  stopWhen: stepCountIs(1),
  // ...
})
```

**Pros:**

- Prevents the SDK from making multiple LLM calls internally
- Each `streamText()` call = exactly 1 Chutes AI request
- Opencode's outer loop handles iterations with full control

**Cons:**

- **MAY BREAK SEQUENTIAL WORKFLOWS**: Model won't see tool results until the next opencode loop iteration
- Tool execution still happens, but results aren't automatically fed back

**Risk Level:** HIGH - May break multi-step tool workflows
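For intuition about why this caps requests: `stepCountIs(n)` is essentially a predicate over completed steps. The toy reimplementation below is illustrative only (the real one ships in the `ai` package); the driver loop stands in for a multi-step runner.

```typescript
// Toy stand-in for the 'ai' package's stepCountIs; illustrative only.
const stepCountIs =
  (n: number) =>
  (state: { steps: unknown[] }): boolean =>
    state.steps.length >= n

// A driver in the spirit of a multi-step runner: one API request per
// step, checking the stop condition after each completed step.
function runSteps(stopWhen: (s: { steps: unknown[] }) => boolean): number {
  const steps: unknown[] = []
  let requests = 0
  do {
    requests++ // one API request per step
    steps.push("step")
  } while (!stopWhen({ steps }))
  return requests
}

console.log(runSteps(stepCountIs(1))) // 1: a single request, then stop
console.log(runSteps(stepCountIs(3))) // 3
```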

### Option B: Remove `execute` Functions from Tools

**Change:** Modify `prompt.ts` to pass tools WITHOUT `execute` functions:

```typescript
// Instead of:
tools[item.id] = tool({
  execute: async (args, options) => { ... } // Remove this
})

// Use:
tools[item.id] = tool({
  description: item.description,
  inputSchema: jsonSchema(schema as any),
  // NO execute function - SDK won't auto-execute
})
```

Then manually execute tools in `processor.ts` when `tool-call` events are received.

**Pros:**

- SDK never automatically executes tools
- Full control over execution flow
- No hidden API requests

**Cons:**

- Requires significant refactoring of tool handling
- Need to manually implement the tool execution loop
- Risk of introducing bugs

**Risk Level:** MEDIUM-HIGH - Requires substantial code changes
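A minimal sketch of what that manual loop could look like. The event list stands in for the SDK's stream, the registry for opencode's tool table, and the tool functions are synchronous only to keep the sketch short; none of these names are opencode's actual API.

```typescript
// Hypothetical manual tool execution (Option B): with no execute
// functions on the tools, the SDK only reports tool-call events, and
// the application runs the tool itself.
type Ev =
  | { type: "tool-call"; toolName: string; input: string }
  | { type: "text"; text: string }

const registry: Record<string, (input: string) => string> = {
  read: (input) => `contents of ${input}`,
}

// Results are stored in the message list, so the NEXT opencode loop
// iteration (not a hidden SDK follow-up request) shows them to the model.
function handleEvents(events: Ev[], messages: string[]): string[] {
  for (const ev of events) {
    if (ev.type === "tool-call") {
      messages.push(`tool-result(${ev.toolName}): ${registry[ev.toolName](ev.input)}`)
    }
  }
  return messages
}

const manualMessages = handleEvents(
  [{ type: "tool-call", toolName: "read", input: "file.txt" }],
  ["user: edit file.txt"],
)
console.log(manualMessages)
```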

### Option C: Provider-Specific Configuration for Chutes

**Change:** Detect the Chutes provider and apply special handling:

```typescript
const result = streamText({
  // ... existing options
  ...(input.model.providerID === 'chutes' && {
    stopWhen: stepCountIs(1),
  }),
  // ...
})
```

**Pros:**

- Only affects Chutes AI; other providers work as before
- Minimal code changes
- Can test specifically with Chutes

**Cons:**

- Still has the same risks as Option A
- Provider-specific code adds complexity

**Risk Level:** MEDIUM - Targeted fix but still risky
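A note on the conditional spread used above: spreading a falsy value adds no keys, so `stopWhen` only appears for the matching provider. A self-contained illustration, with a placeholder option value since the real stop condition comes from the `ai` package:

```typescript
// Placeholder standing in for stepCountIs(1) from the 'ai' package.
const singleStep = { stopWhen: "stepCountIs(1)" }

function buildOptions(providerID: string) {
  return {
    temperature: 0.7, // stands in for the existing streamText options
    // Spreading `false` contributes nothing, so only Chutes gets stopWhen.
    ...(providerID === "chutes" && singleStep),
  }
}

console.log("stopWhen" in buildOptions("chutes"))    // true
console.log("stopWhen" in buildOptions("anthropic")) // false
```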

### Option D: Keep Current Behavior + Documentation

**Change:** None - just document the behavior

**Pros:**

- No code changes = no risk of breaking anything
- Works correctly for sequential workflows

**Cons:**

- Chutes AI users pay for multiple requests
- Not a real solution

**Risk Level:** NONE - But doesn't solve the problem

---

## Key Findings

1. **SDK Version:** Using `ai@5.0.124` (from root package.json line 43)

2. **Default Behavior:** According to the docs, `stopWhen` defaults to `stepCountIs(1)`, but it's not explicitly set in the code, and the observed behavior suggests multi-step is enabled

3. **Tool Execution:** Even with `maxSteps: 1`, tools WILL still execute because they have `execute` functions - the SDK just won't automatically send results back to the model

4. **Message Conversion:** `MessageV2.toModelMessages()` (line 656 in message-v2.ts) already handles converting tool results back into model messages for the next iteration

5. **Opencode's Loop:** The outer `while (true)` loop in `prompt.ts:282` manages the conversation flow and WILL include tool results in the next iteration's messages

---

## Critical Questions to Resolve

1. **Does the model ACTUALLY lose context with `maxSteps: 1`?**
   - Theory: SDK executes tools, stores results, and the opencode loop includes them in the next iteration
   - Need to verify: Does the model see results in time to make sequential decisions?

2. **What happens to parallel tool calls?**
   - If the model calls 3 tools at once, will they all execute before the next iteration?
   - Or will opencode's loop serialize them?

3. **How does this affect Chutes AI billing specifically?**
   - Does Chutes count: (a) HTTP requests, (b) tokens, or (c) conversation steps?
   - If (a), then `maxSteps: 1` definitely helps
   - If (b) or (c), it may not help as much

4. **Can we test without affecting production?**
   - Need a test environment or feature flag
   - Should A/B test with different providers

---

## Recommended Next Steps

1. **Create a Test Branch:** Implement Option A (`maxSteps: 1`) in isolation
2. **Test Sequential Workflows:** Verify read→edit workflows still work
3. **Monitor Request Count:** Log actual HTTP requests to the Chutes API
4. **Measure Latency:** Check if response times change significantly
5. **Test Parallel Tool Calls:** Ensure multiple tools in one step work correctly
6. **Document Behavior:** Update documentation to explain the flow
7. **Consider Option B:** If Option A breaks workflows, implement manual tool execution
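Step 3 above can be done without touching billing data: wrap the `fetch` the provider uses and count outgoing calls. The wiring below is a sketch; many AI SDK providers accept a custom `fetch` option, but where exactly this plugs into opencode's provider setup is an assumption, not verified.

```typescript
// Sketch for step 3: count outgoing HTTP requests by wrapping fetch.
function countingFetch(base: typeof fetch) {
  let count = 0
  const wrapped: typeof fetch = async (input, init) => {
    count++ // incremented synchronously, before the request resolves
    console.log(`[chutes] request #${count}`)
    return base(input, init)
  }
  return { fetch: wrapped, requests: () => count }
}
```

Passing `counter.fetch` wherever the Chutes provider is constructed would then log one line per real API request, making the per-turn request count directly observable.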

---

## Code References

### streamText Call (llm.ts:87-167)

```typescript
const result = streamText({
  onError(error) { ... },
  async experimental_repairToolCall(failed) { ... },
  temperature: params.temperature,
  topP: params.topP,
  topK: params.topK,
  providerOptions: ProviderTransform.providerOptions(input.model, params.options),
  activeTools: Object.keys(tools).filter((x) => x !== "invalid"),
  tools, // <-- These have execute functions!
  maxOutputTokens,
  abortSignal: input.abort,
  headers: { ... },
  maxRetries: 0,
  messages: [ ... ],
  model: wrapLanguageModel({ ... }),
  experimental_telemetry: { ... },
  // MISSING: maxSteps or stopWhen parameter!
})
```

### Tool Definition with Execute (prompt.ts:716-745)

```typescript
tools[item.id] = tool({
  id: item.id as any,
  description: item.description,
  inputSchema: jsonSchema(schema as any),
  async execute(args, options) {
    const ctx = context(args, options)
    await Plugin.trigger("tool.execute.before", ...)
    const result = await item.execute(args, ctx) // <-- Execute function!
    await Plugin.trigger("tool.execute.after", ...)
    return result
  },
})
```

### Opencode's Outer Loop (prompt.ts:282)

```typescript
while (true) {
  SessionStatus.set(sessionID, { type: "busy" })
  let step = 0
  step++

  const tools = await resolveTools({...})

  const result = await processor.process({
    tools,
    model,
    // ...
  })

  if (result === "stop") break
}
```

---

## Conclusion

The issue is confirmed: the Vercel AI SDK's automatic multi-step execution causes multiple Chutes AI API requests per conversation turn. The proposed fix of adding `maxSteps: 1` or `stopWhen: stepCountIs(1)` would reduce this to one request per opencode loop iteration.

However, the user's concerns about breaking sequential workflows are valid and need thorough testing before implementation. The recommended approach is to create a test branch, verify all workflow types, and then decide on the best solution.

**Priority:** HIGH

**Effort:** LOW for Option A, HIGH for Option B

**Risk:** MEDIUM-HIGH (may break existing workflows)