Gradient Prompt Optimization: Enhancing LLM Performance with Incremental Adjustments
Have you ever spent a few hours (days, weeks) trying to optimize a prompt?
You make an edit. Run a few examples. See if they’re good. Make another edit to fix the last example. Oh no, the first example broke.
It’s a pain, right?
What if you could get the LLM to do that for you?
A recent paper from China uses LLMs as prompt optimizers. They call it Gradient Prompt Optimization (GPO). It works like this:
1. You write the task and initial prompt.
2. You ask an LLM to update your prompt using a metaprompt. You’ve seen metaprompts before; there’s one in the Anthropic Prompt Generator (link below). But GPO’s metaprompt is specific: it asks the LLM to fix performance on a few specific examples, and it is only allowed to change a few words at a time. Kind of like training model weights, where each gradient step only makes a small update.
3. Rinse and repeat. (A rough sketch of the loop is below.)
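Here’s a minimal sketch of that loop in Python. To be clear, this is my illustration, not the paper’s code: `call_llm` is a hypothetical wrapper around whatever chat API you use, the metaprompt wording is made up, and the “only change a few words” constraint is enforced by instruction rather than by code.

```python
# Sketch of a GPO-style prompt-optimization loop (illustrative, not the paper's code).

def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; swap in your own client (OpenAI, Anthropic, etc.)."""
    raise NotImplementedError

METAPROMPT = """You are optimizing a task prompt.
Task description: {task}
Current prompt: {prompt}
Failed examples (input -> expected vs. actual):
{failures}
Rewrite the prompt to fix these failures.
Change at most a few words; keep the rest of the prompt intact.
Return only the revised prompt."""

def score(prompt: str, examples: list[dict]) -> tuple[float, list[dict]]:
    """Run the prompt on a dev set; return accuracy and the failing examples."""
    failures = []
    for ex in examples:
        answer = call_llm(prompt + "\n\n" + ex["input"])
        if ex["expected"] not in answer:
            failures.append({**ex, "actual": answer})
    return 1 - len(failures) / len(examples), failures

def gpo_loop(task: str, prompt: str, examples: list[dict], steps: int = 10) -> str:
    best_prompt, best_acc = prompt, 0.0
    for _ in range(steps):
        acc, failures = score(prompt, examples)
        if acc > best_acc:
            best_prompt, best_acc = prompt, acc
        if not failures:
            break
        # Ask the optimizer LLM for a small, targeted edit (the "gradient step").
        shown = "\n".join(
            f"- {f['input']} -> expected {f['expected']}, got {f['actual']}"
            for f in failures[:3]  # show only a few failing examples per step
        )
        prompt = call_llm(METAPROMPT.format(task=task, prompt=prompt, failures=shown))
    return best_prompt
```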
GPO improves Llama-2’s performance on BBH from 30 to 35, on MMLU from 36 to 39, and on GSM8K from 22 to 28. It also beats baselines like chain-of-thought prompting and other prompt optimizers like APE.
I would have loved to see a test on a newer model than Llama-2.
Paper: https://lnkd.in/dkyqYN8m
Code: https://lnkd.in/dD6qbVUy
Anthropic prompt generator: https://lnkd.in/dfnHTXEs
#PromptEngineering #LLM #AI #ArtificialIntelligence #LanguageModels #AIResearch