evaluation

Better Agents Through Evaluation: How To

See how testing your agents with LLM-judged questions (evaluation) will improve their quality, prevent regressions, and boost reliability for years…

4 weeks ago

The Archive pt 3: Don’t Hack Away on Vibes Alone

Let's not hack away on The Archive on vibes alone; let's evaluate our code using Promptfoo!

7 months ago