Tutorials
Welcome to the TrainLoop Evals tutorials! These step-by-step guides will take you from complete beginner to advanced user.
TrainLoop Evals Architecture
TrainLoop Evals is a comprehensive evaluation framework that captures LLM interactions, applies custom metrics, and provides powerful visualization tools for analysis and model comparison.
Learning Path
Follow these tutorials in order for the best learning experience:
🚀 Getting Started
Quick Start Guide - Set up TrainLoop Evals and run your first evaluation (5 minutes)
Start here if you're new to TrainLoop Evals. You'll learn the core concepts and get a working evaluation setup.
📊 Core Evaluation Skills
Writing Your First Evaluation - Create custom metrics and understand the evaluation process (15 minutes)
Learn how to write effective metrics and organize them into suites for comprehensive evaluation.
Advanced Metrics with LLM Judge - Use AI to evaluate AI with sophisticated metrics (20 minutes)
Move beyond simple rule-based checks and use LLM Judge for complex quality assessments.
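As a rough illustration of what "metrics organized into suites" can mean, here is a minimal, self-contained sketch. It does not use the TrainLoop Evals API: `Sample`, `run_suite`, and both metric functions are hypothetical names made up for this example, and the pass/fail convention (1 or 0) is an assumption. See the tutorials themselves for the actual interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Sample:
    """Hypothetical stand-in for one captured LLM interaction."""
    prompt: str
    response: str


# Assumption: a metric maps one sample to pass (1) or fail (0).
Metric = Callable[[Sample], int]


def is_non_empty(sample: Sample) -> int:
    """Pass when the response contains something other than whitespace."""
    return 1 if sample.response.strip() else 0


def stays_under_limit(sample: Sample, max_chars: int = 200) -> int:
    """Pass when the response is at most max_chars characters long."""
    return 1 if len(sample.response) <= max_chars else 0


def run_suite(samples: List[Sample], metrics: List[Metric]) -> Dict[str, float]:
    """Return the pass rate of each metric across all samples."""
    results: Dict[str, float] = {}
    for metric in metrics:
        passed = sum(metric(s) for s in samples)
        results[metric.__name__] = passed / len(samples) if samples else 0.0
    return results


samples = [
    Sample(prompt="Say hi", response="Hello!"),
    Sample(prompt="Say nothing", response="   "),
]
pass_rates = run_suite(samples, [is_non_empty, stays_under_limit])
print(pass_rates)
```

A judge-style metric would have the same shape as the rule-based ones here, except that the pass/fail decision would come from a model call instead of a string check.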
🏆 Model Optimization
Benchmarking and Model Comparison - Compare LLM providers and find the best model for your use case (15 minutes)
Learn to systematically compare different LLM providers and models to optimize both performance and cost.
🔧 Production Deployment
Production Setup and CI/CD - Deploy TrainLoop Evals in production environments (30 minutes)
Set up automated evaluation pipelines, monitoring, and integration with your development workflow.
What You'll Learn
By completing these tutorials, you'll be able to:
- ✅ Set up TrainLoop Evals for any LLM application
- ✅ Write custom metrics to evaluate your specific use cases
- ✅ Use LLM Judge for sophisticated quality evaluation
- ✅ Compare and benchmark different LLM providers
- ✅ Deploy evaluation pipelines in production
- ✅ Monitor and track LLM performance over time
Prerequisites
- Basic familiarity with Python or your chosen programming language
- An LLM application or the desire to build one
- API keys for at least one LLM provider (OpenAI, Anthropic, etc.)
Need Help?
- Stuck on a tutorial? Check our guides
- Want to go deeper? Explore our guides
- Need reference information? Check our reference documentation
- Want to join the community? Ask questions on Discord
Ready to start? Begin with the Quick Start Guide!