Drawing on first-hand experience from Discord’s engineering team, Ian shares best practices for evaluating and refining LLM apps for the mass market. He’ll cover how to build evaluation into your development workflow so you can measure model and prompt improvements, mitigate LLM risks, and speed up development. Ian will then show how to select the best model and prompt for your app, test LLM chains and RAG pipelines, and establish a feedback loop that improves your product. We’ll also take a deep dive into LLM evaluation and describe best practices for using evaluation methods to measure and improve your LLM product’s performance.
Then, focusing on what practical LLM tools can do for your business today, we’ll run through rapid-fire examples of how your business can become 10% more productive using LLMs, drawn from how Weights & Biases uses LLMs internally.
Let’s Get Better Step By Step - LLMs for the Mass Market
What to expect?
In this session we’ll cover best practices from Discord’s engineering team on how to evaluate and improve LLM systems, followed by how we are making Weights & Biases more productive internally using LLMs.