Discussion about this post

Aditya Sharan

Thank you for writing this. Much needed. I think we'll soon be in an era where some Substack posts get more citations than papers. This might be one of those.

Brad K

For agent evals, should we treat prompts that share the same environment (a code repo, a synthetic database, etc.) as dependent and put them in the same cluster? What would that mean for a set of agent evals that all share one environment, like the agent company, as an example?
