Chutes AI Tool Call Counting Issue - Analysis and Plan

Status: Under Investigation
Date: February 11, 2026
Related Files:

  • opencode/packages/opencode/src/session/llm.ts (lines 87-167)
  • opencode/packages/opencode/src/session/processor.ts
  • opencode/packages/opencode/src/session/prompt.ts (lines 716-745, 755-837)

Executive Summary

When using Chutes AI with opencode, each tool call ends up counting as a separate API request, causing excessive billing. The cause is the Vercel AI SDK's streamText function, which automatically handles multi-step tool execution by sending tool results back to the model in additional requests.


The Problem

Current Behavior (Undesired)

The Vercel AI SDK's streamText function automatically executes tools and sends results back to the model in multiple HTTP requests:

Step 1: Initial Request
- Model receives prompt + available tools
- Model returns: "read file X"
- SDK executes tool locally

Step 2: Automatic Follow-up Request (NEW API CALL!)
- SDK sends tool results back to model
- Model returns: "edit file X with changes"
- SDK executes tool locally

Step 3: Automatic Follow-up Request (NEW API CALL!)
- SDK sends tool results back to model
- Model returns final response
- Stream ends

Result: Each step after the first counts as a separate Chutes AI API request, multiplying costs.

Root Cause

In llm.ts lines 87-167, streamText is called with tools that include execute functions:

// prompt.ts lines 716-745
const result = await item.execute(args, ctx)  // <-- Tools have execute functions

When tools have execute functions, the SDK automatically:

  1. Executes the tools when the model requests them
  2. Sends the results back to the model in a new API request
  3. Continues this process for multiple steps (see the sketch below)
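
A minimal, self-contained sketch of this mechanism (TypeScript, AI SDK 5) is shown below. It is not opencode code: the provider wiring, endpoint, model id, and read_file tool are placeholders, and multi-step is allowed explicitly with stopWhen: stepCountIs(5) so the behaviour is easy to observe. Every entry in result.steps after the first corresponds to an extra HTTP request in which the SDK sent tool results back to the model.

import { readFile } from "node:fs/promises"
import { streamText, tool, jsonSchema, stepCountIs } from "ai"
import { createOpenAICompatible } from "@ai-sdk/openai-compatible"

// Placeholder provider wiring; the endpoint and model id are illustrative only.
const chutes = createOpenAICompatible({
  name: "chutes",
  baseURL: "https://example.invalid/v1",
  apiKey: process.env.CHUTES_API_KEY ?? "",
})

const result = streamText({
  model: chutes("placeholder-model"),
  // Explicitly allow several steps so the multi-step behaviour is visible.
  stopWhen: stepCountIs(5),
  tools: {
    read_file: tool({
      description: "Read a file from disk",
      inputSchema: jsonSchema<{ path: string }>({
        type: "object",
        properties: { path: { type: "string" } },
        required: ["path"],
      }),
      // Because execute is present, the SDK runs the tool itself and then
      // sends the result back to the model in a follow-up request.
      execute: async ({ path }) => readFile(path, "utf8"),
    }),
  },
  prompt: "Read package.json and summarise it",
})

for await (const _ of result.textStream) {
  // Drain the stream so all steps actually run.
}
console.log(`provider round trips for this one streamText call: ${(await result.steps).length}`)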

User Concerns (Critical Issues)

Concern #1: Model Won't See Tool Results in Time

Issue: If we limit to maxSteps: 1, the model will:

  1. Call "read file"
  2. SDK executes it
  3. SDK STOPS (doesn't send results back)
  4. Model never sees the file contents to make edit decisions

Impact: Breaks sequential workflows like read→edit.

Concern #2: Model Can't Do Multiple Tool Calls

Issue: Will the model be limited to only one tool call per session/iteration?

Impact: Complex multi-step tasks become impossible.

Concern #3: Session Completion Timing

Issue: Will tool results only be available after the entire session finishes?

Impact: Model can't react to tool outputs in real-time.


Technical Analysis

How Tool Execution Currently Works

  1. Opencode's Outer Loop (prompt.ts:282):

    while (true) {
      // Each iteration is one "step"
      const tools = await resolveTools({...})
      const result = await processor.process({tools, ...})
    }
    
  2. SDK's Internal Multi-Step (llm.ts:87):

    const result = streamText({
      tools,  // Tools with execute functions
      // No maxSteps or stopWhen parameter!
    })
    
  3. Processor Handles Events (processor.ts:94), as sketched after this list:

    • tool-input-start: Tool call begins
    • tool-call: Tool is called
    • tool-result: Tool execution completes
    • finish-step: Step ends
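
A schematic of how those stream parts are consumed is below. This is a sketch, not the real processor.ts: result stands for the value returned by streamText in llm.ts, and the handler bodies are placeholder comments; only the part type names (which match AI SDK 5's fullStream and the list above) come from the source.

// Schematic only: "result" is the streamText return value from llm.ts and the
// handler bodies are placeholders, not the real opencode logic.
for await (const part of result.fullStream) {
  switch (part.type) {
    case "tool-input-start":
      // A tool call has started streaming its arguments.
      break
    case "tool-call":
      // The model requested a tool with complete arguments.
      break
    case "tool-result":
      // A result exists for that call (auto-executed by the SDK today).
      break
    case "finish-step":
      // One model round trip finished; with multi-step enabled the SDK may now
      // issue the next request on its own.
      break
  }
}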

Message Flow

Current Flow (SDK Multi-Step):

┌─────────────┐     ┌────────────┐     ┌──────────────┐
│ Opencode    │────▶│ SDK        │────▶│ Chutes AI    │
│ Loop        │     │ streamText │     │ API Request 1│
└─────────────┘     └────────────┘     └──────────────┘
                           │                  │
                           │                  ▼
                           │          ┌──────────────┐
                           │          │ Model decides│
                           │          │ to call tools│
                           │          └──────────────┘
                           │                  │
                           ▼                  │
                    ┌──────────┐              │
                    │ Executes │              │
                    │ tools    │              │
                    └──────────┘              │
                           │                  │
                           ▼                  ▼
                    ┌──────────────┐     ┌──────────────┐
                    │ Chutes AI    │◀────│ SDK sends    │
                    │ API Request 2│     │ tool results │
                    └──────────────┘     └──────────────┘

Proposed Flow (maxSteps: 1):

┌─────────────┐     ┌──────────────┐     ┌──────────────┐
│ Opencode    │────▶│ SDK          │────▶│ Chutes AI    │
│ Loop        │     │ streamText   │     │ API Request 1│
└─────────────┘     │ (maxSteps:1) │     └──────────────┘
     ▲              └──────────────┘          │
     │                   │                    │
     │                   ▼                    ▼
     │            ┌──────────┐          ┌──────────────┐
     │            │ Executes │          │ Model decides│
     │            │ tools    │          │ to call tools│
     │            └──────────┘          └──────────────┘
     │                   │                    │
     │                   │             ┌──────┘
     │                   │             │ SDK STOPS
     │                   ▼             │ (doesn't send
     │            ┌──────────────┐     │  results back)
     │            │ Tool results │     │
     │            │ stored in    │     │
     │            │ opencode     │     │
     │            └──────────────┘     │
     │                   │             │
     └───────────────────┴─────────────┘
                         │
            Next opencode loop iteration
            Tool results included in messages
                         │
                         ▼
            ┌──────────────────────────────┐
            │ Model now sees tool results  │
            │ and can make next decision   │
            └──────────────────────────────┘

Proposed Solutions

Option A: Add maxSteps: 1 Parameter (Quick Fix)

Change: Add to llm.ts:87:

import { stepCountIs } from 'ai'

const result = streamText({
  // ... existing options
  stopWhen: stepCountIs(1),
  // ...
})
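
If Option A is tried, a quick sanity check (a sketch, assuming AI SDK 5's result.steps promise; the warning text is illustrative) is to count how many steps the SDK actually performed once the stream has been consumed:

// With stopWhen: stepCountIs(1) in effect, exactly one step is expected.
const steps = await result.steps
if (steps.length !== 1) {
  console.warn(`expected 1 LLM round trip, but the SDK performed ${steps.length}`)
}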

Pros:

  • Prevents SDK from making multiple LLM calls internally
  • Each streamText() call = exactly 1 Chutes AI request
  • Opencode's outer loop handles iterations with full control

Cons:

  • MAY BREAK SEQUENTIAL WORKFLOWS: Model won't see tool results until next opencode loop iteration
  • Tool execution still happens but results aren't automatically fed back

Risk Level: HIGH - May break multi-step tool workflows

Option B: Remove execute Functions from Tools

Change: Modify prompt.ts to pass tools WITHOUT execute functions:

// Instead of:
tools[item.id] = tool({
  execute: async (args, options) => { ... }  // Remove this
})

// Use:
tools[item.id] = tool({
  description: item.description,
  inputSchema: jsonSchema(schema as any),
  // NO execute function - SDK won't auto-execute
})

Then manually execute tools in processor.ts when tool-call events are received.
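
A rough sketch of what that manual loop could look like is below. It is not the actual refactor: registry stands for opencode's resolved tool map, ctx for the tool context built in prompt.ts, and the part field names (toolCallId, toolName, input) follow AI SDK 5 and may differ by SDK version.

// Sketch only: execute tool calls ourselves when the SDK reports them,
// instead of letting the SDK auto-execute and send a follow-up request.
const pendingResults: Array<{ toolCallId: string; toolName: string; output: unknown }> = []

for await (const part of result.fullStream) {
  if (part.type === "tool-call") {
    const item = registry[part.toolName]                // our lookup, not the SDK's
    const output = await item.execute(part.input, ctx)  // same item.execute as today
    pendingResults.push({ toolCallId: part.toolCallId, toolName: part.toolName, output })
  }
}

// pendingResults is then turned into tool-result messages for the next
// opencode loop iteration, so no hidden follow-up request is ever made.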

Pros:

  • SDK never automatically executes tools
  • Full control over execution flow
  • No hidden API requests

Cons:

  • Requires significant refactoring of tool handling
  • Need to manually implement tool execution loop
  • Risk of introducing bugs

Risk Level: MEDIUM-HIGH - Requires substantial code changes

Option C: Provider-Specific Configuration for Chutes

Change: Detect Chutes provider and apply special handling:

const result = streamText({
  // ... existing options
  ...(input.model.providerID === 'chutes' && {
    stopWhen: stepCountIs(1),
  }),
  // ...
})

Pros:

  • Only affects Chutes AI, other providers work as before
  • Minimal code changes
  • Can test specifically with Chutes

Cons:

  • Still has the same risks as Option A
  • Provider-specific code adds complexity

Risk Level: MEDIUM - Targeted fix but still risky

Option D: Keep Current Behavior + Documentation

Change: None - just document the behavior

Pros:

  • No code changes = no risk of breaking anything
  • Works correctly for sequential workflows

Cons:

  • Chutes AI users pay for multiple requests
  • Not a real solution

Risk Level: NONE - But doesn't solve the problem


Key Findings

  1. SDK Version: Using ai@5.0.124 (from root package.json line 43)

  2. Default Behavior: According to docs, stopWhen defaults to stepCountIs(1), but it's not explicitly set in the code, and the behavior suggests multi-step is enabled

  3. Tool Execution: Even with maxSteps: 1, tools WILL still execute because they have execute functions - the SDK just won't automatically send results back to the model

  4. Message Conversion: MessageV2.toModelMessages() (line 656 in message-v2.ts) already handles converting tool results back to model messages for the next iteration (see the sketch after this list)

  5. Opencode's Loop: The outer while (true) loop in prompt.ts:282 manages the conversation flow and WILL include tool results in the next iteration's messages
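
For reference, the messages the next iteration sends back to the model look roughly like the hand-written example below. It uses AI SDK 5's ModelMessage type; the ids, tool name, and content are invented, and the real conversion is MessageV2.toModelMessages(), not this sketch.

import type { ModelMessage } from "ai"

// Invented example: the model sees the earlier tool output on the next
// iteration, which is why sequential read -> edit decisions can still work.
const nextIterationMessages: ModelMessage[] = [
  { role: "user", content: "Fix the typo in src/index.ts" },
  {
    role: "assistant",
    content: [
      { type: "tool-call", toolCallId: "call_1", toolName: "read", input: { path: "src/index.ts" } },
    ],
  },
  {
    role: "tool",
    content: [
      {
        type: "tool-result",
        toolCallId: "call_1",
        toolName: "read",
        output: { type: "text", value: "(file contents here)" },
      },
    ],
  },
]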


Critical Questions to Resolve

  1. Does the model ACTUALLY lose context with maxSteps: 1?

    • Theory: SDK executes tools, stores results, opencode loop includes them in next iteration
    • Need to verify: Does the model see results in time to make sequential decisions?
  2. What happens to parallel tool calls?

    • If model calls 3 tools at once, will they all execute before next iteration?
    • Or will opencode's loop serialize them?
  3. How does this affect Chutes AI billing specifically?

    • Does Chutes count: (a) HTTP requests, (b) tokens, or (c) conversation steps?
    • If (a), then maxSteps: 1 definitely helps
    • If (b) or (c), may not help as much
  4. Can we test without affecting production?

    • Need a test environment or feature flag
    • Should A/B test with different providers

Recommended Next Steps

  1. Create a Test Branch: Implement Option A (maxSteps: 1) in isolation
  2. Test Sequential Workflows: Verify read→edit workflows still work
  3. Monitor Request Count: Log actual HTTP requests to the Chutes API (see the sketch after this list)
  4. Measure Latency: Check if response times change significantly
  5. Test Parallel Tool Calls: Ensure multiple tools in one step work correctly
  6. Document Behavior: Update documentation to explain the flow
  7. Consider Option B: If Option A breaks workflows, implement manual tool execution
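
One low-touch way to do step 3 is sketched below. It assumes the Chutes requests go through global fetch and that their URL contains a recognisable host substring; both are assumptions about the runtime, not verified opencode behaviour.

// Sketch: wrap global fetch and count outbound requests that look like Chutes.
// The "chutes" substring check is an assumption about the endpoint URL.
let chutesRequests = 0
const originalFetch = globalThis.fetch

globalThis.fetch = (async (input: RequestInfo | URL, init?: RequestInit) => {
  const url =
    typeof input === "string" ? input : input instanceof URL ? input.href : input.url
  if (url.includes("chutes")) {
    chutesRequests += 1
    console.log(`[chutes] outbound request #${chutesRequests}: ${url}`)
  }
  return originalFetch(input, init)
}) as typeof fetch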

Code References

streamText Call (llm.ts:87-167)

const result = streamText({
  onError(error) { ... },
  async experimental_repairToolCall(failed) { ... },
  temperature: params.temperature,
  topP: params.topP,
  topK: params.topK,
  providerOptions: ProviderTransform.providerOptions(input.model, params.options),
  activeTools: Object.keys(tools).filter((x) => x !== "invalid"),
  tools,  // <-- These have execute functions!
  maxOutputTokens,
  abortSignal: input.abort,
  headers: { ... },
  maxRetries: 0,
  messages: [ ... ],
  model: wrapLanguageModel({ ... }),
  experimental_telemetry: { ... },
  // MISSING: maxSteps or stopWhen parameter!
})

Tool Definition with Execute (prompt.ts:716-745)

tools[item.id] = tool({
  id: item.id as any,
  description: item.description,
  inputSchema: jsonSchema(schema as any),
  async execute(args, options) {
    const ctx = context(args, options)
    await Plugin.trigger("tool.execute.before", ...)
    const result = await item.execute(args, ctx)  // <-- Execute function!
    await Plugin.trigger("tool.execute.after", ...)
    return result
  },
})

Opencode's Outer Loop (prompt.ts:282)

while (true) {
  SessionStatus.set(sessionID, { type: "busy" })
  let step = 0
  step++
  
  const tools = await resolveTools({...})
  
  const result = await processor.process({
    tools,
    model,
    // ...
  })
  
  if (result === "stop") break
}

Conclusion

The issue is confirmed: the Vercel AI SDK's automatic multi-step execution causes multiple Chutes AI API requests per conversation turn. The proposed fix of adding maxSteps: 1 or stopWhen: stepCountIs(1) would reduce this to one request per opencode loop iteration.

However, the user's concerns about breaking sequential workflows are valid and need thorough testing before implementation. The recommended approach is to create a test branch, verify all workflow types, and then decide on the best solution.

Priority: HIGH
Effort: LOW for Option A, HIGH for Option B
Risk: MEDIUM-HIGH (may break existing workflows)