Every portfolio manager knows the efficient frontier - the set of optimal portfolios offering maximum returns for given risk levels. What if AI prompts had their own efficient frontier?
As we all start to use AI, prompt optimization will be a consistent challenge. GEPA, short for Genetic-Pareto, is a technique for discovering the equivalent of an efficient frontier for AI prompts.
Reading the paper, I noticed the initial results were promising: a 10-point improvement on certain benchmarks & prompts 9.2 times shorter. Shorter prompts matter because, as we all know, input prompts are the biggest driver of cost (see The Hungry, Hungry AI Model). So, I implemented GEPA in EvoBlog.
To use GEPA, we must identify the axes an LLM uses to score a post. Here are mine:
| Evaluation Axis | Weight | Description |
|---|---|---|
| Style Match | 25% | How well the post matches Tom Tunguz’s distinctive writing style |
| Argument Quality | 20% | Strength and logic of the arguments presented |
| Data Usage | 15% | Effective use of statistics, examples, and quantified metrics |
| Readability | 15% | Clarity, sentence structure, and ease of reading |
| Originality | 15% | Fresh perspectives, novel connections, avoiding clichés |
| Engagement | 10% | Hooks, emotional language, reader involvement |
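To make the weighting concrete, here is a minimal sketch of how per-axis scores could collapse into a single number. The weights come from the table above; the function name, the 0-100 scale, & the example scores are assumptions for illustration, not EvoBlog's actual code.

```python
# Weights copied from the evaluation table above.
WEIGHTS = {
    "style_match": 0.25,
    "argument_quality": 0.20,
    "data_usage": 0.15,
    "readability": 0.15,
    "originality": 0.15,
    "engagement": 0.10,
}


def weighted_score(scores: dict[str, float]) -> float:
    """Combine per-axis scores (assumed 0-100) into one weighted total."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing axes: {sorted(missing)}")
    return sum(WEIGHTS[axis] * scores[axis] for axis in WEIGHTS)


# Hypothetical scores for a single draft, made up for this example.
example_scores = {
    "style_match": 80,
    "argument_quality": 70,
    "data_usage": 90,
    "readability": 75,
    "originality": 60,
    "engagement": 50,
}

print(round(weighted_score(example_scores), 2))
```

A single weighted total is convenient for ranking, but it hides the trade-offs between axes, which is why the per-axis scores are worth keeping alongside it.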
Now that we have this framework, we can enter a prompt to generate a blog post & have the EvoBlog system iterate through different prompts to approach the efficient frontier, weighing all six dimensions rather than optimizing just one.
Here are the scores for two hypothetical blog posts. You can see one spikes more on style, while the other one focuses on data usage. Using GEPA, we can determine which is the better all-around post. In this case, it is the data-focused post.
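As a rough sketch of that comparison, the snippet below finds which candidates sit on the Pareto frontier (no other candidate beats them on every axis) & then uses the weighted total to pick one. The scores, candidate names, & helper functions are all hypothetical, & this is a simplified illustration of Pareto selection, not the GEPA algorithm or EvoBlog's implementation.

```python
# Axis weights from the evaluation table.
WEIGHTS = {
    "style_match": 0.25,
    "argument_quality": 0.20,
    "data_usage": 0.15,
    "readability": 0.15,
    "originality": 0.15,
    "engagement": 0.10,
}


def dominates(a: dict[str, float], b: dict[str, float]) -> bool:
    """True if a is at least as good as b on every axis & strictly better on one."""
    return all(a[k] >= b[k] for k in WEIGHTS) and any(a[k] > b[k] for k in WEIGHTS)


def pareto_front(candidates: dict[str, dict[str, float]]) -> dict[str, dict[str, float]]:
    """Keep only the candidates that no other candidate dominates."""
    return {
        name: scores
        for name, scores in candidates.items()
        if not any(
            dominates(other, scores)
            for other_name, other in candidates.items()
            if other_name != name
        )
    }


def weighted_total(scores: dict[str, float]) -> float:
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)


# Hypothetical scores (0-100), invented for this example.
candidates = {
    "style_focused": {"style_match": 90, "argument_quality": 65, "data_usage": 55,
                      "readability": 75, "originality": 70, "engagement": 70},
    "data_focused": {"style_match": 75, "argument_quality": 80, "data_usage": 90,
                     "readability": 80, "originality": 70, "engagement": 65},
    "weak_draft": {"style_match": 70, "argument_quality": 60, "data_usage": 50,
                   "readability": 70, "originality": 60, "engagement": 60},
}

front = pareto_front(candidates)
best = max(front, key=lambda name: weighted_total(front[name]))
print(sorted(front), best)
```

With these made-up numbers, both the style-focused & data-focused posts stay on the frontier because each wins on at least one axis, while the weak draft drops out. The weights then break the tie in favor of the data-focused post.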
All of this to say, dear reader, that I’ve only ever published one blog post fully generated by AI.
My goal with these automated systems is to learn how they work, how to tune them, & generate initial drafts that approximate my first & second drafts. I will always be completing drafts three & four.
The efficient frontier is no substitute for insight & an authentic voice.