Regression Testing | LangSmith Evaluations

Comments: 4

@MattJonesYT
4 months ago
This is extremely useful, especially for agent systems where the rules have been over-fit to a particular LLM. I find CrewAI often has that problem: it works well with the LLM it was written for but produces nonsense with a different LLM.
@MattJonesYT
4 months ago
An extension of this idea would be running regression tests on the prompt system as a whole in an agent system to see how well it adapts to other LLMs. Make a matrix of how its prompts perform on the original LLM vs. new, out-of-sample LLMs. If it immediately breaks on new LLMs, it is probably over-fit, and you can have an AI try to rewrite those prompts to be simpler, which should make the system more robust across different LLMs.
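The matrix idea above could be sketched roughly like this. It is only a minimal illustration under stated assumptions, not LangSmith's or CrewAI's API: a "model" is assumed to be any callable from prompt text to output text, and `score_matrix`, `overfit_prompts`, `grade`, and the `max_drop` threshold are hypothetical names chosen for this example.

```python
from typing import Callable, Dict, List

# Assumption: a model is any callable that maps a prompt string to an output string.
Model = Callable[[str], str]
# Assumption: a grader maps (prompt name, model output) to a score in [0, 1].
Grader = Callable[[str, str], float]


def score_matrix(
    prompts: Dict[str, str],
    models: Dict[str, Model],
    grade: Grader,
) -> Dict[str, Dict[str, float]]:
    """Run every prompt against every model and record one score per cell."""
    return {
        prompt_name: {
            model_name: grade(prompt_name, model(prompt_text))
            for model_name, model in models.items()
        }
        for prompt_name, prompt_text in prompts.items()
    }


def overfit_prompts(
    matrix: Dict[str, Dict[str, float]],
    baseline: str,
    max_drop: float = 0.3,
) -> List[str]:
    """Flag prompts whose worst score on a non-baseline model falls more than
    max_drop below their score on the baseline (original) model."""
    flagged = []
    for prompt_name, row in matrix.items():
        others = [score for name, score in row.items() if name != baseline]
        if others and row[baseline] - min(others) > max_drop:
            flagged.append(prompt_name)
    return flagged
```

Prompts returned by `overfit_prompts` would be the candidates for simplification; the right threshold depends on how noisy the grader is.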
@UtopIA-IA
4 months ago
Thank you
@nachoeigu
1 month ago
Where could we find the Jupyter Notebook files?

Regression Testing | LangSmith Evaluations - Part 15