The Problem
When processing tasks with the Julius Agent using Fireworks.ai (Kimi K2.5), we discovered that responses were coming back empty after tool execution completed successfully. This was a frustrating bug because:
- The tool loop would execute successfully
- All tool calls would complete as expected
- But the final response to the user was blank
This made it appear as though tasks had failed when they had actually completed successfully—just without any visible output.
Root Cause Analysis
After investigating the issue, we traced it to a specific edge case in the conversation handling code. When the LLM reached a FinishReason::Stop state after executing tools, the code was extracting the response text like this:
```rust
let final_text = response.text.clone().unwrap_or_default();
```
The problem was that Fireworks.ai’s Kimi K2.5 model was returning a “stop” finish reason with an empty response.text field after all tool calls had been executed. The code would then return an empty string as the final result, even though the conversation history contained valid assistant responses from earlier in the interaction.
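The failure mode is easy to reproduce in isolation: when `text` is `None` or an empty string, `unwrap_or_default()` silently yields `""` with no error. A minimal sketch (the `Response` struct here is a hypothetical stand-in for the real type):

```rust
// Hypothetical minimal response type mirroring the field described above.
struct Response {
    text: Option<String>,
}

fn main() {
    // Kimi K2.5 returned finish_reason "stop" with an empty text field
    // after the tool calls had completed.
    let response = Response {
        text: Some(String::new()),
    };
    let final_text = response.text.clone().unwrap_or_default();
    // No error is raised; the user simply sees nothing.
    assert!(final_text.is_empty());
}
```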
The Fix
We modified the finish handling logic to fall back to the conversation history when the final response text is empty. The updated code now:
- Checks if the response text is empty at the stop state
- If empty, retrieves the most recent assistant message from the conversation history
- Uses that stored content as the final response
The key insight was that the ConversationState already maintains a complete history of all messages including assistant responses. The last_assistant_text() method provides access to the most recent assistant content. By falling back to this stored content when the final response is empty, we ensure the user always receives the assistant’s last substantive response.
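The fallback logic can be sketched as follows. This is an illustrative reconstruction, not the actual codebase: the `Response` and `ConversationState` types and the message representation are assumptions, though `last_assistant_text()` matches the accessor described above.

```rust
// Hypothetical types mirroring the description above.
struct Response {
    text: Option<String>,
}

struct ConversationState {
    // (role, content) pairs; the real type is richer than this.
    messages: Vec<(String, String)>,
}

impl ConversationState {
    /// Returns the content of the most recent assistant message, if any.
    fn last_assistant_text(&self) -> Option<String> {
        self.messages
            .iter()
            .rev()
            .find(|(role, _)| role.as_str() == "assistant")
            .map(|(_, content)| content.clone())
    }
}

/// Resolve the final user-visible text at the Stop state.
fn final_text(response: &Response, state: &ConversationState) -> String {
    let text = response.text.clone().unwrap_or_default();
    if !text.trim().is_empty() {
        return text;
    }
    // Fallback: the provider signalled Stop with an empty body, so use
    // the last assistant message stored in the conversation history.
    state.last_assistant_text().unwrap_or_default()
}
```

With this in place, an empty `text` field at Stop resolves to the most recent stored assistant message instead of a blank string.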
Why This Matters
This fix ensures reliable task completion across different LLM providers. Different models may have slight variations in how they signal completion, and our system now handles these variations gracefully. Users will always see meaningful output from completed tasks, regardless of which model provider is being used.
Build Verification
The fix has been verified through a clean production build, confirming that it compiles and integrates properly with the existing codebase.
Lessons Learned
- Trust the conversation history: When working with LLM APIs, the final response field isn't always the source of truth. Maintaining and checking conversation history provides a reliable fallback.
- Model variations matter: Different providers may handle finish states differently. Robust code should account for these variations.
- Empty responses are invisible failures: An empty response looks like success to the system but failure to the user. Always validate that there's actual content for the user.
This fix, combined with our previous increase in max_turns from 10 to 50, makes the tool execution loop more robust for complex multi-step tasks.