I discovered I was designing my AI tools backwards.
Here’s an example. This was my newsletter processing chain: reading emails, calling a newsletter processor, extracting companies, and then adding them to the CRM. It involved four separate steps and cost $3.69 per thousand newsletters processed.
Before: Newsletter Processing Chain
```bash
# Step 1: Find newsletters (separate tool)
ruby read_email.rb --from "newsletter@techcrunch.com" --limit 5
# Output: 340 tokens of detailed email data

# Step 2: Process each newsletter (separate tool)
ruby enhanced_newsletter_processor.rb
# Output: 420 tokens per newsletter summary

# Step 3: Extract companies (separate tool)
ruby enhanced_company_extractor.rb --input newsletter_summary.txt
# Output: 280 tokens of company data

# Step 4: Add to CRM (separate tool)
ruby validate_and_add_company.rb startup.com
# Output: 190 tokens of validation results

# Total: 1,230 tokens, 4 separate tool calls, no safety checks
# Cost: $3.69 per 1,000 newsletter processing workflows
```
Then I created a unified newsletter tool that combined everything, built with the Google Agent Development Kit, Google’s framework for building production-grade AI agent tools:
```bash
# Single consolidated operation
ruby unified_newsletter_tool.rb --action process \
  --source "techcrunch" --format concise \
  --auto-extract-companies

# Output: 85 tokens with all operations completed
# 93% token reduction, built-in safety, cached results
# Cost: $0.26 per 1,000 newsletter processing workflows
# Savings: $3.43 per 1,000 workflows (93% cost reduction)
```
Why is the unified newsletter tool more complicated?
It exposes multiple actions behind a single interface (process, search, extract, validate), implements state management that tracks usage patterns and caches results, has rate limiting built in, and produces structured JSON output with metadata instead of plain text.
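Here’s a minimal sketch of what that consolidation might look like; the class, actions, and limits below are illustrative assumptions, not the actual tool’s internals:

```ruby
require "json"
require "time"

# Illustrative sketch: one entry point, action dispatch, an in-memory
# result cache, and a simple sliding-window rate limit. (Names and
# limits are assumed; the real tool's internals aren't shown here.)
class UnifiedNewsletterTool
  RATE_LIMIT = 10 # max calls per minute (assumed)

  def initialize
    @cache = {}       # request fingerprint => previous envelope
    @call_times = []  # timestamps used for rate limiting
  end

  def run(action:, **args)
    enforce_rate_limit!
    key = [action, args].hash
    if (hit = @cache[key])
      return hit.merge("meta" => hit["meta"].merge("cache_hit" => true))
    end

    data =
      case action
      when :process  then process(args)
      when :extract  then extract(args)
      when :validate then validate(args)
      else { "error" => "unknown action: #{action}" }
      end

    @cache[key] = envelope(action, data)
  end

  private

  def enforce_rate_limit!
    now = Time.now
    @call_times.reject! { |t| now - t > 60 }
    raise "rate limit exceeded" if @call_times.size >= RATE_LIMIT
    @call_times << now
  end

  # Every action returns the same envelope, so the LLM always parses
  # one shape no matter which action ran.
  def envelope(action, data)
    { "action" => action.to_s,
      "data"   => data,
      "meta"   => { "at" => Time.now.utc.iso8601, "cache_hit" => false } }
  end

  # Stubs standing in for the real processing logic.
  def process(args)
    { "summary" => "...", "source" => args[:source] }
  end

  def extract(_args)
    { "companies" => [] }
  end

  def validate(_args)
    { "valid" => true }
  end
end

puts UnifiedNewsletterTool.new.run(action: :process, source: "techcrunch").to_json
```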
But here’s the counterintuitive part: despite being more complex internally, the unified tool is simpler for the LLM to use, because it provides consistent, structured outputs that are easier to parse, even though those outputs are longer.
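For instance, the envelope printed by the sketch above has the same shape regardless of which action ran (again, the field names are my illustration, not the tool’s actual schema):

```json
{
  "action": "process",
  "data": {
    "source": "techcrunch",
    "summary": "...",
    "companies": ["startup.com"]
  },
  "meta": { "at": "2025-01-01T00:00:00Z", "cache_hit": false }
}
```

The model learns one parsing pattern instead of four.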
To quantify the impact, I ran 30 iterations per test scenario:
| Metric | Before | After | Improvement |
|---|---|---|---|
| LLM Tokens per Op | 112.4 | 66.1 | 41.2% reduction |
| Cost per 1K Ops | $1.642 | $0.957 | 41.7% savings |
| Success Rate | 87% | 94% | 8% improvement |
| Tools per Workflow | 3-5 | 1 | 70% reduction |
| Cache Hit Rate | 0% | 30% | Performance boost |
| Error Recovery | Manual | Automatic | Better UX |
Token usage dropped 41% (p=0.01, statistically significant), which translated linearly into cost savings. The success rate improved by 8% (p=0.03), and the 30% cache hit rate added a further saving on top.
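The cost column follows mechanically from tokens per operation. Here is the arithmetic, assuming a flat per-token price (roughly the $3 per million tokens implied by the newsletter example earlier; actual model pricing varies):

```ruby
# Cost per 1,000 operations = tokens/op × price/token × 1,000.
# $3 per million tokens is an assumption inferred from the example above.
PRICE_PER_TOKEN = 3.0 / 1_000_000

def cost_per_1k_ops(tokens_per_op)
  tokens_per_op * PRICE_PER_TOKEN * 1_000
end

puts cost_per_1k_ops(1_230) # ≈ $3.69 — the four-tool chain
puts cost_per_1k_ops(85)    # ≈ $0.26 — the unified tool
```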
While individual tools produced shorter, “cleaner” responses, they forced the LLM to work harder parsing inconsistent formats. Structured, comprehensive outputs from unified tools enabled more efficient LLM processing, despite being longer.
My workflow relied on dozens of specialized Ruby tools for email, research, and task management, each with its own interface, error handling, and output format. Rolling them up into meta-tools improved overall performance and produced tremendous cost savings. You can find the complete architecture on GitHub.
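As a closing illustration, here is that consolidation pattern in miniature; `MetaTool` and the registered handler are hypothetical stand-ins, not code from the repo:

```ruby
require "json"

# Hypothetical sketch: legacy tools register as callables behind one
# interface, sharing a single error-handling path and output format.
class MetaTool
  def initialize
    @tools = {}
  end

  def register(name, &handler)
    @tools[name] = handler
  end

  def call(name, **args)
    handler = @tools[name]
    return failure(name, "unknown tool") unless handler

    begin
      { "tool" => name.to_s, "ok" => true, "data" => handler.call(**args) }
    rescue => e
      failure(name, e.message) # one recovery path instead of N ad-hoc rescues
    end
  end

  private

  def failure(name, message)
    { "tool" => name.to_s, "ok" => false, "error" => message }
  end
end

mt = MetaTool.new
mt.register(:search_email) { |from:, limit: 5| { "matches" => [], "from" => from, "limit" => limit } }
puts mt.call(:search_email, from: "newsletter@techcrunch.com").to_json
```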