Delete Chutes AI Tool Call Issue Plan document
Removed detailed analysis and proposed solutions for the Chutes AI tool call counting issue.
# Chutes AI Tool Call Counting Issue - Analysis and Plan

**Status:** Under Investigation

**Date:** February 11, 2026

**Related Files:**

- `opencode/packages/opencode/src/session/llm.ts` (lines 87-167)
- `opencode/packages/opencode/src/session/processor.ts`
- `opencode/packages/opencode/src/session/prompt.ts` (lines 716-745, 755-837)

---

## Executive Summary

When using Chutes AI with opencode, every tool call after the first triggers an additional billable API request. The cause is the Vercel AI SDK's `streamText` function, which automatically handles multi-step tool execution.

---

## The Problem

### Current Behavior (Undesired)

The Vercel AI SDK's `streamText` function automatically executes tools and sends the results back to the model in **multiple HTTP requests**:

```
Step 1: Initial Request
- Model receives prompt + available tools
- Model returns: "read file X"
- SDK executes tool locally

Step 2: Automatic Follow-up Request (NEW API CALL!)
- SDK sends tool results back to model
- Model returns: "edit file X with changes"
- SDK executes tool locally

Step 3: Automatic Follow-up Request (NEW API CALL!)
- SDK sends tool results back to model
- Model returns final response
- Stream ends
```

**Result:** Each step after the first counts as a separate Chutes AI API request, multiplying costs.

### Root Cause

In `llm.ts` lines 87-167, `streamText` is called with tools that include `execute` functions:

```typescript
// prompt.ts lines 716-745
const result = await item.execute(args, ctx) // <-- Tools have execute functions
```

When tools have `execute` functions, the SDK automatically (a sketch of the two tool shapes follows this list):

1. Executes the tools when the model requests them
2. Sends the results back to the model in a **new API request**
3. Continues this process for multiple steps

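For contrast, here is a minimal sketch of the two tool shapes using the AI SDK's `tool()` helper. The `readAuto`/`readManual` names and the zod schemas are illustrative only (opencode builds its tools with `jsonSchema`, as shown later in this document):

```typescript
import { tool } from "ai"
import { z } from "zod"
import { readFile } from "node:fs/promises"

// Auto-executed (current behavior): because `execute` is present, the SDK
// runs the tool itself, then sends the result back to the provider in a
// new, separately billed request.
const readAuto = tool({
  description: "Read a file",
  inputSchema: z.object({ path: z.string() }),
  execute: async ({ path }) => readFile(path, "utf8"),
})

// Client-executed: with no `execute`, the SDK emits a tool-call and ends
// the stream; the caller runs the tool and decides when (and whether) the
// next request happens.
const readManual = tool({
  description: "Read a file",
  inputSchema: z.object({ path: z.string() }),
})
```
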
---

## User Concerns (Critical Issues)

### Concern #1: Model Won't See Tool Results in Time

**Issue:** If we limit to `maxSteps: 1`, the model will:

1. Call "read file"
2. The SDK executes it
3. The SDK STOPS (doesn't send results back)
4. The model never sees the file contents it needs to make edit decisions

**Impact:** Breaks sequential workflows like read→edit.

### Concern #2: Model Can't Do Multiple Tool Calls

**Issue:** Will the model be limited to only one tool call per session/iteration?

**Impact:** Complex multi-step tasks become impossible.

### Concern #3: Session Completion Timing

**Issue:** Will tool results only be available after the entire session finishes?

**Impact:** Model can't react to tool outputs in real time.

---

## Technical Analysis

### How Tool Execution Currently Works

1. **Opencode's Outer Loop** (`prompt.ts:282`):

   ```typescript
   while (true) {
     // Each iteration is one "step"
     const tools = await resolveTools({...})
     const result = await processor.process({tools, ...})
   }
   ```

2. **SDK's Internal Multi-Step** (`llm.ts:87`):

   ```typescript
   const result = streamText({
     tools, // Tools with execute functions
     // No maxSteps or stopWhen parameter!
   })
   ```

3. **Processor Handles Events** (`processor.ts:94`; a consumption sketch follows this list):
   - `tool-input-start`: Tool call begins
   - `tool-call`: Tool is called
   - `tool-result`: Tool execution completes
   - `finish-step`: Step ends

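A hedged sketch of how a processor can consume these stream parts. The part type names are the ones listed above (AI SDK v5 naming); `result` is assumed to be the return value of the `streamText` call in item 2, and the handler bodies are placeholders, not opencode's actual processor logic:

```typescript
for await (const part of result.fullStream) {
  switch (part.type) {
    case "tool-input-start":
      // the model has started streaming input for a tool call
      break
    case "tool-call":
      // the model committed to a tool call with complete input
      break
    case "tool-result":
      // the tool's execute function finished; the output exists locally
      break
    case "finish-step":
      // one SDK step ended - every step after the first implies a new API request
      break
  }
}
```
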
### Message Flow

**Current Flow (SDK Multi-Step):**

```
┌─────────────┐      ┌────────────┐      ┌──────────────┐
│  Opencode   │─────▶│    SDK     │─────▶│  Chutes AI   │
│    Loop     │      │ streamText │      │ API Request 1│
└─────────────┘      └────────────┘      └──────────────┘
                           │                     │
                           │                     ▼
                           │              ┌──────────────┐
                           │              │ Model decides│
                           │              │ to call tools│
                           │              └──────────────┘
                           │                     │
                           ▼                     │
                     ┌──────────┐                │
                     │ Executes │                │
                     │  tools   │                │
                     └──────────┘                │
                           │                     │
                           ▼                     ▼
                    ┌──────────────┐      ┌──────────────┐
                    │  Chutes AI   │◀─────│  SDK sends   │
                    │ API Request 2│      │ tool results │
                    └──────────────┘      └──────────────┘
```

**Proposed Flow (maxSteps: 1):**

```
┌─────────────┐      ┌─────────────┐      ┌──────────────┐
│  Opencode   │─────▶│     SDK     │─────▶│  Chutes AI   │
│    Loop     │      │  streamText │      │ API Request 1│
└─────────────┘      │ (maxSteps:1)│      └──────────────┘
       ▲             └─────────────┘              │
       │                    │                     │
       │                    ▼                     ▼
       │              ┌──────────┐         ┌──────────────┐
       │              │ Executes │         │ Model decides│
       │              │  tools   │         │ to call tools│
       │              └──────────┘         └──────────────┘
       │                    │                     │
       │                    │              ┌──────┘
       │                    │              │ SDK STOPS
       │                    ▼              │ (doesn't send
       │             ┌──────────────┐      │  results back)
       │             │ Tool results │      │
       │             │  stored in   │      │
       │             │   opencode   │      │
       │             └──────────────┘      │
       │                    │              │
       └────────────────────┴──────────────┘
                            │
              Next opencode loop iteration:
          tool results included in messages
                            │
                            ▼
            ┌──────────────────────────────┐
            │ Model now sees tool results  │
            │ and can make next decision   │
            └──────────────────────────────┘
```

---

## Proposed Solutions

### Option A: Add `maxSteps: 1` Parameter (Quick Fix)

**Change:** Add to `llm.ts:87` (in AI SDK v5, `stopWhen: stepCountIs(1)` is the equivalent of the older `maxSteps: 1` option):

```typescript
import { stepCountIs } from 'ai'

const result = streamText({
  // ... existing options
  stopWhen: stepCountIs(1),
  // ...
})
```

**Pros:**

- Prevents the SDK from making multiple LLM calls internally
- Each `streamText()` call = exactly 1 Chutes AI request
- Opencode's outer loop handles iterations with full control

**Cons:**

- **MAY BREAK SEQUENTIAL WORKFLOWS**: Model won't see tool results until the next opencode loop iteration
- Tool execution still happens, but results aren't automatically fed back

**Risk Level:** HIGH - May break multi-step tool workflows

### Option B: Remove `execute` Functions from Tools

**Change:** Modify `prompt.ts` to pass tools WITHOUT `execute` functions:

```typescript
// Instead of:
tools[item.id] = tool({
  execute: async (args, options) => { ... } // Remove this
})

// Use:
tools[item.id] = tool({
  description: item.description,
  inputSchema: jsonSchema(schema as any),
  // NO execute function - SDK won't auto-execute
})
```

Then manually execute tools in `processor.ts` when `tool-call` events are received, as sketched below.

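A minimal sketch of that manual execution loop, assuming the v5 `tool-call` stream part shape; `registry` and `storeToolResult` are hypothetical stand-ins for opencode's tool table and session storage, not its actual APIs:

```typescript
// Hypothetical tool table: tool name -> implementation.
const registry: Record<string, (input: unknown) => Promise<string>> = {}

// Placeholder: opencode would persist this into the session transcript so
// the outer loop includes it in the next iteration's messages.
function storeToolResult(toolCallId: string, toolName: string, output: string) {}

for await (const part of result.fullStream) {
  if (part.type === "tool-call") {
    // The SDK did not auto-execute (no execute function on the tool), so we
    // run the tool ourselves instead of the SDK firing a hidden follow-up
    // request to the provider.
    const output = await registry[part.toolName](part.input)
    storeToolResult(part.toolCallId, part.toolName, output)
  }
}
```
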
**Pros:**

- SDK never automatically executes tools
- Full control over the execution flow
- No hidden API requests

**Cons:**

- Requires significant refactoring of tool handling
- Need to manually implement the tool execution loop
- Risk of introducing bugs

**Risk Level:** MEDIUM-HIGH - Requires substantial code changes

### Option C: Provider-Specific Configuration for Chutes

**Change:** Detect the Chutes provider and apply special handling:

```typescript
const result = streamText({
  // ... existing options
  ...(input.model.providerID === 'chutes' && {
    stopWhen: stepCountIs(1),
  }),
  // ...
})
```

**Pros:**

- Only affects Chutes AI; other providers work as before
- Minimal code changes
- Can test specifically with Chutes

**Cons:**

- Still has the same risks as Option A
- Provider-specific code adds complexity

**Risk Level:** MEDIUM - Targeted fix but still risky

### Option D: Keep Current Behavior + Documentation

**Change:** None - just document the behavior

**Pros:**

- No code changes = no risk of breaking anything
- Works correctly for sequential workflows

**Cons:**

- Chutes AI users pay for multiple requests
- Not a real solution

**Risk Level:** NONE - But doesn't solve the problem

---

## Key Findings

1. **SDK Version:** Using `ai@5.0.124` (from root package.json line 43)

2. **Default Behavior:** According to the docs, `stopWhen` defaults to `stepCountIs(1)`, but it is not explicitly set in the code, and the observed behavior suggests multi-step execution is enabled

3. **Tool Execution:** Even with `maxSteps: 1`, tools WILL still execute because they have `execute` functions - the SDK just won't automatically send results back to the model

4. **Message Conversion:** `MessageV2.toModelMessages()` (line 656 in message-v2.ts) already handles converting tool results back to model messages for the next iteration (a sketch of the expected shape follows this list)

5. **Opencode's Loop:** The outer `while (true)` loop in `prompt.ts:282` manages the conversation flow and WILL include tool results in the next iteration's messages

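A hedged sketch of the tool-result message shape that `toModelMessages()` must produce, using the AI SDK v5 `ModelMessage` types; the literal values are invented for illustration:

```typescript
import type { ModelMessage } from "ai"

const toolResultMessage: ModelMessage = {
  role: "tool",
  content: [
    {
      type: "tool-result",
      toolCallId: "call_123", // must match the model's earlier tool-call id
      toolName: "read",
      output: { type: "text", value: "contents of file X" },
    },
  ],
}
```
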
---
## Critical Questions to Resolve

1. **Does the model ACTUALLY lose context with `maxSteps: 1`?**
   - Theory: the SDK executes tools, stores results, and the opencode loop includes them in the next iteration
   - Need to verify: does the model see results in time to make sequential decisions? (a step-counting sketch follows this list)

2. **What happens to parallel tool calls?**
   - If the model calls 3 tools at once, will they all execute before the next iteration?
   - Or will opencode's loop serialize them?

3. **How does this affect Chutes AI billing specifically?**
   - Does Chutes count: (a) HTTP requests, (b) tokens, or (c) conversation steps?
   - If (a), then `maxSteps: 1` definitely helps
   - If (b) or (c), it may not help as much

4. **Can we test without affecting production?**
   - Need a test environment or feature flag
   - Should A/B test with different providers

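One way to answer question 1 empirically is to count steps per `streamText` call. A minimal sketch, assuming the AI SDK v5 `onStepFinish` callback and that `model`, `tools`, and `messages` are already in scope:

```typescript
import { streamText } from "ai"

let steps = 0
const result = streamText({
  model,
  tools,
  messages,
  onStepFinish: () => {
    steps++ // one step = one provider round-trip
  },
})

await result.text // drain the stream to completion
console.log(`streamText performed ${steps} step(s), i.e. ${steps} API request(s)`)
```
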
---
## Recommended Next Steps

1. **Create a Test Branch:** Implement Option A (`maxSteps: 1`) in isolation
2. **Test Sequential Workflows:** Verify read→edit workflows still work
3. **Monitor Request Count:** Log actual HTTP requests to the Chutes API (see the request-logging sketch after this list)
4. **Measure Latency:** Check if response times change significantly
5. **Test Parallel Tool Calls:** Ensure multiple tools in one step work correctly
6. **Document Behavior:** Update documentation to explain the flow
7. **Consider Option B:** If Option A breaks workflows, implement manual tool execution

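For step 3, a request-logging sketch: AI SDK providers accept a custom `fetch`, so wrapping it counts real HTTP requests. The `createOpenAICompatible` factory and the base URL are assumptions about how a Chutes provider might be wired up, not opencode's actual configuration:

```typescript
import { createOpenAICompatible } from "@ai-sdk/openai-compatible"

let requestCount = 0
const countingFetch: typeof fetch = async (url, init) => {
  requestCount++ // every provider round-trip lands here
  console.log(`[chutes] request #${requestCount}: ${url}`)
  return fetch(url, init)
}

const chutes = createOpenAICompatible({
  name: "chutes",
  baseURL: "https://llm.chutes.ai/v1", // assumption: actual URL may differ
  fetch: countingFetch,
})
```
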
---
## Code References

### streamText Call (llm.ts:87-167)

```typescript
const result = streamText({
  onError(error) { ... },
  async experimental_repairToolCall(failed) { ... },
  temperature: params.temperature,
  topP: params.topP,
  topK: params.topK,
  providerOptions: ProviderTransform.providerOptions(input.model, params.options),
  activeTools: Object.keys(tools).filter((x) => x !== "invalid"),
  tools, // <-- These have execute functions!
  maxOutputTokens,
  abortSignal: input.abort,
  headers: { ... },
  maxRetries: 0,
  messages: [ ... ],
  model: wrapLanguageModel({ ... }),
  experimental_telemetry: { ... },
  // MISSING: maxSteps or stopWhen parameter!
})
```

### Tool Definition with Execute (prompt.ts:716-745)

```typescript
tools[item.id] = tool({
  id: item.id as any,
  description: item.description,
  inputSchema: jsonSchema(schema as any),
  async execute(args, options) {
    const ctx = context(args, options)
    await Plugin.trigger("tool.execute.before", ...)
    const result = await item.execute(args, ctx) // <-- Execute function!
    await Plugin.trigger("tool.execute.after", ...)
    return result
  },
})
```

### Opencode's Outer Loop (prompt.ts:282)

```typescript
let step = 0
while (true) {
  SessionStatus.set(sessionID, { type: "busy" })
  step++

  const tools = await resolveTools({...})

  const result = await processor.process({
    tools,
    model,
    // ...
  })

  if (result === "stop") break
}
```

---

## Conclusion

The issue is confirmed: the Vercel AI SDK's automatic multi-step execution causes multiple Chutes AI API requests per conversation turn. The proposed fix of adding `maxSteps: 1` (via `stopWhen: stepCountIs(1)`) would reduce this to one request per opencode loop iteration.

However, the user's concerns about breaking sequential workflows are valid and need thorough testing before implementation. The recommended approach is to create a test branch, verify all workflow types, and then decide on the best solution.

**Priority:** HIGH

**Effort:** LOW for Option A, HIGH for Option B

**Risk:** MEDIUM-HIGH (may break existing workflows)
