Documents the issue where tool calls count as separate Chutes AI requests, covering proposed solutions, technical analysis, and user concerns about breaking sequential workflows. Includes:
- Root cause analysis of Vercel AI SDK multi-step execution
- 4 proposed solution options with pros/cons
- User concerns about model context and workflow breaks
- Code references and technical diagrams
- Recommended next steps for testing and implementation
Relates to: Tool call execution flow in session management
Chutes AI Tool Call Counting Issue - Analysis and Plan
Status: Under Investigation
Date: February 11, 2026
Related Files:
- opencode/packages/opencode/src/session/llm.ts (lines 87-167)
- opencode/packages/opencode/src/session/processor.ts
- opencode/packages/opencode/src/session/prompt.ts (lines 716-745, 755-837)
Executive Summary
When using Chutes AI with opencode, each tool call counts as a separate billable API request, causing excessive billing. The cause is the Vercel AI SDK's streamText function automatically handling multi-step tool execution.
The Problem
Current Behavior (Undesired)
The Vercel AI SDK's streamText function automatically executes tools and sends results back to the model in multiple HTTP requests:
Step 1: Initial Request
- Model receives prompt + available tools
- Model returns: "read file X"
- SDK executes tool locally
Step 2: Automatic Follow-up Request (NEW API CALL!)
- SDK sends tool results back to model
- Model returns: "edit file X with changes"
- SDK executes tool locally
Step 3: Automatic Follow-up Request (NEW API CALL!)
- SDK sends tool results back to model
- Model returns final response
- Stream ends
Result: Each step after the first counts as a separate Chutes AI API request, multiplying costs.
Root Cause
In llm.ts lines 87-167, streamText is called with tools that include execute functions:
// prompt.ts lines 716-745
const result = await item.execute(args, ctx) // <-- Tools have execute functions
When tools have execute functions, the SDK automatically:
- Executes the tools when the model requests them
- Sends the results back to the model in a new API request
- Continues this process for multiple steps (see the sketch below)
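A minimal sketch of the shape that triggers this, assuming AI SDK v5 APIs (model and readFile are placeholders, not opencode's actual code):

import { streamText, tool, jsonSchema } from 'ai'
import type { LanguageModel } from 'ai'

declare const model: LanguageModel // placeholder for the configured Chutes model
declare function readFile(path: string): Promise<string> // placeholder

// A tool WITH an execute function: the SDK runs it itself and, with
// multi-step enabled, sends the result back in a fresh API request.
const tools = {
  read: tool({
    description: 'Read a file from disk',
    inputSchema: jsonSchema<{ path: string }>({
      type: 'object',
      properties: { path: { type: 'string' } },
      required: ['path'],
    }),
    execute: async ({ path }) => readFile(path), // auto-executed by the SDK
  }),
}

const result = streamText({
  model,
  messages: [{ role: 'user', content: 'Fix the bug in x.ts' }],
  tools,
  // Nothing here caps the SDK's internal step loop.
})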
User Concerns (Critical Issues)
Concern #1: Model Won't See Tool Results in Time
Issue: If we limit to maxSteps: 1, the model will:
- Call "read file"
- SDK executes it
- SDK STOPS (doesn't send results back)
- Model never sees the file contents to make edit decisions
Impact: Breaks sequential workflows like read→edit.
Concern #2: Model Can't Do Multiple Tool Calls
Issue: Will the model be limited to only one tool call per session/iteration?
Impact: Complex multi-step tasks become impossible.
Concern #3: Session Completion Timing
Issue: Will tool results only be available after the entire session finishes?
Impact: Model can't react to tool outputs in real-time.
Technical Analysis
How Tool Execution Currently Works
- Opencode's Outer Loop (prompt.ts:282):

while (true) {
  // Each iteration is one "step"
  const tools = await resolveTools({...})
  const result = await processor.process({ tools, ... })
}

- SDK's Internal Multi-Step (llm.ts:87):

const result = streamText({
  tools, // Tools with execute functions
  // No maxSteps or stopWhen parameter!
})

- Processor Handles Events (processor.ts:94), one per event type (a condensed sketch follows):
  - tool-input-start: a tool call begins
  - tool-call: the tool is called
  - tool-result: tool execution completes
  - finish-step: the step ends
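A condensed sketch of that event handling, assuming AI SDK v5 stream part types (the real processor.ts is considerably more involved):

import type { streamText } from 'ai'

async function handleEvents(result: ReturnType<typeof streamText>) {
  for await (const part of result.fullStream) {
    switch (part.type) {
      case 'tool-input-start':
        // the model began streaming a tool call's arguments
        break
      case 'tool-call':
        // a complete tool call arrived (part.toolName, part.input)
        break
      case 'tool-result':
        // the SDK ran execute() and produced part.output
        break
      case 'finish-step':
        // one LLM round-trip ended, i.e. one billed API request
        break
    }
  }
}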
Message Flow
Current Flow (SDK Multi-Step):
┌─────────────┐ ┌──────────┐ ┌──────────────┐
│ Opencode │────▶│ SDK │────▶│ Chutes AI │
│ Loop │ │ streamText│ │ API Request 1│
└─────────────┘ └──────────┘ └──────────────┘
│ │
│ ▼
│ ┌──────────────┐
│ │ Model decides│
│ │ to call tools│
│ └──────────────┘
│ │
▼ │
┌──────────┐ │
│ Executes │ │
│ tools │ │
└──────────┘ │
│ │
▼ ▼
┌──────────────┐ ┌──────────────┐
│ Chutes AI │◀────│ SDK sends │
│ API Request 2│ │ tool results │
└──────────────┘ └──────────────┘
Proposed Flow (maxSteps: 1):
┌─────────────┐ ┌──────────┐ ┌──────────────┐
│ Opencode │────▶│ SDK │────▶│ Chutes AI │
│ Loop │ │ streamText│ │ API Request 1│
└─────────────┘ │ (maxSteps:1)│ └──────────────┘
▲ └──────────┘ │
│ │ │
│ ▼ ▼
│ ┌──────────┐ ┌──────────────┐
│ │ Executes │ │ Model decides│
│ │ tools │ │ to call tools│
│ └──────────┘ └──────────────┘
│ │ │
│ │ ┌──────┘
│ │ │ SDK STOPS
│ ▼ │ (doesn't send
│ ┌──────────────┐ │ results back)
│ │ Tool results │ │
│ │ stored in │ │
│ │ opencode │ │
│ └──────────────┘ │
│ │ │
└───────────────────┴─────────────┘
│
Next opencode loop iteration
Tool results included in messages
│
▼
┌──────────────────────────────┐
│ Model now sees tool results │
│ and can make next decision │
└──────────────────────────────┘
Proposed Solutions
Option A: Add maxSteps: 1 Parameter (Quick Fix)
Change: Add to llm.ts:87 (in AI SDK v5, stopWhen: stepCountIs(1) is the equivalent of the older maxSteps: 1):
import { stepCountIs } from 'ai'

const result = streamText({
  // ... existing options
  stopWhen: stepCountIs(1),
  // ...
})
Pros:
- Prevents SDK from making multiple LLM calls internally
- Each streamText() call = exactly 1 Chutes AI request
- Opencode's outer loop handles iterations with full control
Cons:
- MAY BREAK SEQUENTIAL WORKFLOWS: Model won't see tool results until next opencode loop iteration
- Tool execution still happens but results aren't automatically fed back
Risk Level: HIGH - May break multi-step tool workflows
Option B: Remove execute Functions from Tools
Change: Modify prompt.ts to pass tools WITHOUT execute functions:
// Instead of:
tools[item.id] = tool({
  execute: async (args, options) => { ... } // Remove this
})

// Use:
tools[item.id] = tool({
  description: item.description,
  inputSchema: jsonSchema(schema as any),
  // NO execute function - SDK won't auto-execute
})
Then manually execute tools in processor.ts when tool-call events are received.
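A rough sketch of that manual loop, assuming AI SDK v5 part shapes; runTool and saveToolResult are hypothetical stand-ins for opencode's plugin hooks and message storage:

import { streamText, tool, jsonSchema } from 'ai'
import type { LanguageModel, ModelMessage } from 'ai'

declare const model: LanguageModel // placeholder
declare const messages: ModelMessage[] // placeholder
declare function runTool(name: string, input: unknown): Promise<string> // hypothetical executor
declare function saveToolResult(id: string, output: string): void // hypothetical bookkeeping

// Tools defined WITHOUT execute: the SDK only reports the call.
const tools = {
  read: tool({
    description: 'Read a file',
    inputSchema: jsonSchema<{ path: string }>({
      type: 'object',
      properties: { path: { type: 'string' } },
      required: ['path'],
    }),
    // no execute function, so there is no hidden follow-up request
  }),
}

const result = streamText({ model, messages, tools })

for await (const part of result.fullStream) {
  if (part.type === 'tool-call') {
    // Execute ourselves, then store the output so the next opencode
    // loop iteration includes it via MessageV2.toModelMessages().
    const output = await runTool(part.toolName, part.input)
    saveToolResult(part.toolCallId, output)
  }
}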
Pros:
- SDK never automatically executes tools
- Full control over execution flow
- No hidden API requests
Cons:
- Requires significant refactoring of tool handling
- Need to manually implement tool execution loop
- Risk of introducing bugs
Risk Level: MEDIUM-HIGH - Requires substantial code changes
Option C: Provider-Specific Configuration for Chutes
Change: Detect Chutes provider and apply special handling:
const result = streamText({
  // ... existing options
  ...(input.model.providerID === 'chutes' && {
    stopWhen: stepCountIs(1),
  }),
  // ...
})
Pros:
- Only affects Chutes AI, other providers work as before
- Minimal code changes
- Can test specifically with Chutes
Cons:
- Still has the same risks as Option A
- Provider-specific code adds complexity
Risk Level: MEDIUM - Targeted fix but still risky
Option D: Keep Current Behavior + Documentation
Change: None - just document the behavior
Pros:
- No code changes = no risk of breaking anything
- Works correctly for sequential workflows
Cons:
- Chutes AI users pay for multiple requests
- Not a real solution
Risk Level: NONE - But doesn't solve the problem
Key Findings
- SDK Version: using ai@5.0.124 (root package.json, line 43)
- Default Behavior: according to the docs, stopWhen defaults to stepCountIs(1), but it is not explicitly set in the code, and the observed behavior suggests multi-step is enabled
- Tool Execution: even with maxSteps: 1, tools WILL still execute because they have execute functions; the SDK just won't automatically send results back to the model
- Message Conversion: MessageV2.toModelMessages() (line 656 in message-v2.ts) already converts tool results back into model messages for the next iteration (see the illustrative sketch below)
- Opencode's Loop: the outer while (true) loop in prompt.ts:282 manages the conversation flow and WILL include tool results in the next iteration's messages
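For illustration only, a hypothetical example of what those converted messages could look like for one completed tool call, based on AI SDK v5 message shapes (an assumption, not the literal toModelMessages() output):

import type { ModelMessage } from 'ai'

const nextIterationMessages: ModelMessage[] = [
  // The assistant's tool call from the previous streamText() call...
  {
    role: 'assistant',
    content: [{ type: 'tool-call', toolCallId: 'call_1', toolName: 'read', input: { path: 'x.ts' } }],
  },
  // ...followed by the stored tool result, so the model sees the file
  // contents at the start of the next opencode loop iteration.
  {
    role: 'tool',
    content: [
      {
        type: 'tool-result',
        toolCallId: 'call_1',
        toolName: 'read',
        output: { type: 'text', value: '...file contents...' },
      },
    ],
  },
]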
Critical Questions to Resolve
- Does the model ACTUALLY lose context with maxSteps: 1?
  - Theory: the SDK executes tools, stores results, and the opencode loop includes them in the next iteration
  - Need to verify: does the model see results in time to make sequential decisions?
- What happens to parallel tool calls?
  - If the model calls 3 tools at once, will they all execute before the next iteration?
  - Or will opencode's loop serialize them?
- How does this affect Chutes AI billing specifically?
  - Does Chutes count (a) HTTP requests, (b) tokens, or (c) conversation steps?
  - If (a), then maxSteps: 1 definitely helps
  - If (b) or (c), it may not help as much
- Can we test without affecting production?
  - Need a test environment or feature flag
  - Should A/B test with different providers
Recommended Next Steps
- Create a Test Branch: Implement Option A (maxSteps: 1) in isolation
- Test Sequential Workflows: Verify read→edit workflows still work
- Monitor Request Count: Log actual HTTP requests to the Chutes API (see the sketch below)
- Measure Latency: Check if response times change significantly
- Test Parallel Tool Calls: Ensure multiple tools in one step work correctly
- Document Behavior: Update documentation to explain the flow
- Consider Option B: If Option A breaks workflows, implement manual tool execution
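For the request-count step, one low-touch option is a counting fetch wrapper. A sketch, assuming Chutes is wired up through @ai-sdk/openai-compatible and accepts a custom fetch (official AI SDK providers do); the baseURL and env var name are placeholders:

import { createOpenAICompatible } from '@ai-sdk/openai-compatible'

let requestCount = 0

// Every outbound HTTP request to the provider passes through here.
const countingFetch: typeof fetch = async (input, init) => {
  requestCount++
  console.log(`[chutes] HTTP request #${requestCount}`)
  return fetch(input, init)
}

const chutes = createOpenAICompatible({
  name: 'chutes',
  baseURL: 'https://example-chutes-endpoint/v1', // placeholder, use the real endpoint
  apiKey: process.env.CHUTES_API_KEY, // illustrative env var name
  fetch: countingFetch,
})
// Pass e.g. chutes('some-model-id') as the model for streamText.

If the per-turn count drops from N steps to 1 after adding stopWhen: stepCountIs(1), the fix addresses the billing issue, assuming Chutes bills per HTTP request (question 3 above).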
Code References
streamText Call (llm.ts:87-167)
const result = streamText({
  onError(error) { ... },
  async experimental_repairToolCall(failed) { ... },
  temperature: params.temperature,
  topP: params.topP,
  topK: params.topK,
  providerOptions: ProviderTransform.providerOptions(input.model, params.options),
  activeTools: Object.keys(tools).filter((x) => x !== "invalid"),
  tools, // <-- These have execute functions!
  maxOutputTokens,
  abortSignal: input.abort,
  headers: { ... },
  maxRetries: 0,
  messages: [ ... ],
  model: wrapLanguageModel({ ... }),
  experimental_telemetry: { ... },
  // MISSING: maxSteps or stopWhen parameter!
})
Tool Definition with Execute (prompt.ts:716-745)
tools[item.id] = tool({
  id: item.id as any,
  description: item.description,
  inputSchema: jsonSchema(schema as any),
  async execute(args, options) {
    const ctx = context(args, options)
    await Plugin.trigger("tool.execute.before", ...)
    const result = await item.execute(args, ctx) // <-- Execute function!
    await Plugin.trigger("tool.execute.after", ...)
    return result
  },
})
Opencode's Outer Loop (prompt.ts:282)
let step = 0
while (true) {
  SessionStatus.set(sessionID, { type: "busy" })
  step++
  const tools = await resolveTools({...})
  const result = await processor.process({
    tools,
    model,
    // ...
  })
  if (result === "stop") break
}
Conclusion
The issue is confirmed: the Vercel AI SDK's automatic multi-step execution causes multiple Chutes AI API requests per conversation turn. The proposed fix of adding maxSteps: 1 or stopWhen: stepCountIs(1) would reduce this to one request per opencode loop iteration.
However, the user's concerns about breaking sequential workflows are valid and need thorough testing before implementation. The recommended approach is to create a test branch, verify all workflow types, and then decide on the best solution.
Priority: HIGH
Effort: LOW for Option A, HIGH for Option B
Risk: MEDIUM-HIGH (may break existing workflows)