Evaluating and Optimizing LLM Agents
Learn to evaluate and optimize LLM agents using tools like G-Eval, DeepEval, and LangSmith. Apply metrics, build custom tests, and tune quality, cost, and latency for real-world performance and reliability.
What you'll learn
This course is designed for AI engineers, developers, and data scientists who build intelligent agents and need to ensure those agents produce accurate, relevant, and efficient responses, especially in complex enterprise environments. In this course, Evaluating and Optimizing LLM Agents, you’ll gain the skills needed to assess and improve agent performance in real-world settings. First, you’ll explore core evaluation metrics such as answer relevancy, hallucination rate, and contextual fit, and apply them using tools like G-Eval and DeepEval. Next, you’ll create domain-specific test suites with open-rag-eval and build dashboards with LangSmith to monitor performance across cost, latency, and quality. Finally, you’ll learn how to apply these strategies across various architectures, including RAG agents, multi-agent systems, and chat-based tools. When you’re finished with this course, you’ll have a practical, repeatable framework for evaluating and optimizing LLM agents at scale.
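To give a flavor of the kind of evaluation the course covers, here is a minimal sketch of scoring a single agent response with DeepEval's built-in answer relevancy metric plus a G-Eval-style custom criterion. It assumes DeepEval's LLMTestCase, AnswerRelevancyMetric, and GEval classes and a configured LLM judge (for example, an OpenAI API key); exact class names, parameters, and defaults may differ between library versions, and the question, answer, and context strings are purely illustrative.

```python
# Minimal sketch: evaluating one agent interaction with DeepEval (assumed API; may vary by version).
# Requires: pip install deepeval, plus a judge-model API key (e.g., OPENAI_API_KEY).
from deepeval.metrics import AnswerRelevancyMetric, GEval
from deepeval.test_case import LLMTestCase, LLMTestCaseParams

# A single agent interaction captured as a test case (illustrative values).
test_case = LLMTestCase(
    input="What is our refund window for enterprise contracts?",
    actual_output="Enterprise customers can request a refund within 30 days of invoice.",
    retrieval_context=[
        "Refunds for enterprise contracts are accepted within 30 days of the invoice date."
    ],
)

# Built-in metric: does the answer actually address the question?
relevancy = AnswerRelevancyMetric(threshold=0.7)
relevancy.measure(test_case)
print("Answer relevancy:", relevancy.score, "-", relevancy.reason)

# G-Eval-style custom metric: LLM-as-judge scoring against free-form criteria,
# used here as a rough hallucination check against the retrieved context.
faithfulness = GEval(
    name="Faithfulness",
    criteria="Penalize any claim in the actual output that is not supported by the retrieval context.",
    evaluation_params=[
        LLMTestCaseParams.ACTUAL_OUTPUT,
        LLMTestCaseParams.RETRIEVAL_CONTEXT,
    ],
    threshold=0.7,
)
faithfulness.measure(test_case)
print("Faithfulness:", faithfulness.score)
```

In the course, per-interaction scores like these are rolled up into domain-specific test suites and monitored alongside cost and latency so you can track quality across releases rather than one response at a time.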
About the author
Dr. Daniel “Brian” Letort is a veteran of more than 22 years in information technology. During a 21-year tenure at Northrop Grumman, Brian held roles spanning software engineering and systems engineering, including Chief Applications Architect, Chief Data Scientist, and Chief Enterprise Architect. He held the NG Fellow title for six years and the Technical Fellow title for the four years prior. In 2022, Brian joined Digital Realty as Chief Architect - Product and Artificial Intelligence. In addition to his work at Digital Realty, Brian has more than 12 years of experience teaching data science and computer science courses as an adjunct professor. He has authored two books and holds two patents.