Why Evals Matter | LangSmith Evaluations - Part 1

6,213 views

With the rapid pace of AI, developers often face a paradox of choice: how to choose the right prompt, and how to trade off LLM quality against cost. Evaluations can accelerate development by providing a structured process for making these decisions. But we've heard that it is challenging to get started, so we are launching a series of short videos explaining how to perform evaluations using LangSmith.
This video lays out four main considerations for evaluation: (1) dataset, (2) evaluator, (3) task, and (4) how to apply evaluation to improve your product (e.g., unit tests, A/B tests).
Getting started documentation:
docs.smith.langchain.com/eval...
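To make the four considerations above concrete, here is a minimal sketch in plain Python. It deliberately does not use the LangSmith SDK: every name in it (`Example`, `toy_task`, `exact_match`, `run_eval`) is hypothetical and chosen only for illustration, and the "task" is a canned lookup standing in for a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Example:
    """One dataset row: an input plus a ground-truth reference answer."""
    question: str
    reference: str


# (1) Dataset: a tiny hand-curated set of examples (illustrative only).
DATASET: List[Example] = [
    Example("What is 2 + 2?", "4"),
    Example("What is the capital of France?", "Paris"),
]


# (3) Task: the thing under test. A canned lookup stands in for an LLM call,
# and it intentionally gets the second question wrong.
def toy_task(question: str) -> str:
    canned = {
        "What is 2 + 2?": "4",
        "What is the capital of France?": "Lyon",
    }
    return canned.get(question, "")


# (2) Evaluator: a ground-truth based judge using case-insensitive exact match.
def exact_match(output: str, reference: str) -> float:
    return 1.0 if output.strip().lower() == reference.strip().lower() else 0.0


def run_eval(
    task: Callable[[str], str],
    evaluator: Callable[[str, str], float],
    dataset: List[Example],
) -> float:
    """Run the task on every example and return the mean evaluator score."""
    scores = [evaluator(task(ex.question), ex.reference) for ex in dataset]
    return sum(scores) / len(scores)


# (4) Applying the evaluation: compute an aggregate score that a unit test
# could compare against a threshold, or that an A/B test could compare
# across two prompts or models.
score = run_eval(toy_task, exact_match, DATASET)
print(f"mean exact-match score: {score:.2f}")
```

In practice the evaluator could just as well be an LLM judge or a human reviewer, and the dataset could come from logged user interactions or synthetic generation; the loop structure stays the same.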

Comments: 4

@chaitanyagoel9837
5 days ago
🎯 Key points for quick navigation:
00:00 *🎥 Introduction to Evaluations*
- Introduction to the importance of evaluations for new models.
- Overview of public evaluations and the components involved.
00:54 *🧪 Evaluation Methods*
- Explanation of human evaluations and their structure.
- Comparative evaluation methods like Chatbot Arena.
- Different metrics used to interpret results, such as Elo scores.
02:44 *🔍 Personalized Testing*
- Discussion of the trend toward personalized testing and evaluations.
- Methods to build and curate datasets for evaluations.
- Examples of user interactions and synthetic data generation.
04:05 *🤖 Evaluation Judges*
- Various types of judges for evaluations, including humans and LLMs.
- Modes of evaluation, both reference-free and ground-truth based.
- Application of evaluations in different contexts like unit tests and A/B testing.
05:28 *🔧 Implementing Evaluations with LangSmith*
- Introduction to the LangSmith platform for running evaluations.
- Overview of LangSmith features: dataset creation, evaluator definition, trace inspection.
- Future videos will explore detailed steps to build evaluations using LangSmith.
Made with HARPA AI
@andrianantenainaprincyraso7162
1 month ago
cool !
@aaronbiliyok4553
1 month ago
Hey Lance, good job. Can you please share your slides?
@kareammohamad
1 month ago
Fine
