🏆 MLE-Dojo Benchmark Leaderboard
MLE-Dojo is a Gym-style framework for systematically training, evaluating, and improving autonomous large language model (LLM) agents in iterative machine learning engineering (MLE) workflows.
Rank | Model | Organizer | License | Elo Score |
---|---|---|---|---|
1 | DeepSeek | Proprietary | 1214 |
MLE-Dojo
MLE-Dojo is a Gym-style framework for systematically training, evaluating, and improving autonomous large language model (LLM) agents in iterative machine learning engineering (MLE) workflows. Built upon 200+ real-world Kaggle challenges, MLE-Dojo covers diverse, open-ended MLE tasks carefully curated to reflect realistic machine learning engineering scenarios such as data processing, architecture search, hyperparameter tuning, and code debugging. Its fully executable environment and flexible interface support comprehensive agent training via both supervised fine-tuning and reinforcement learning, facilitating iterative experimentation, realistic data sampling, and real-time outcome verification.
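To make the "Gym-style" interaction pattern concrete, the sketch below shows a generic reset/step loop in Python. It is an illustration only and is not MLE-Dojo's actual API: `ToyMLEEnv`, `run_episode`, the `validation_score` action field, and the improvement-based reward are all hypothetical names and assumptions. In particular, the toy environment reads a score straight from the action, whereas a real environment would presumably execute the agent's submitted code and compute the score itself.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMLEEnv:
    """Hypothetical stand-in for a Gym-style MLE environment (not MLE-Dojo's API)."""

    max_steps: int = 5
    best_score: float = 0.0
    steps_taken: int = 0
    history: list = field(default_factory=list)

    def reset(self) -> dict:
        """Start a new episode and return the initial observation."""
        self.best_score = 0.0
        self.steps_taken = 0
        self.history = []
        return {"task": "toy-task", "best_score": self.best_score}

    def step(self, action: dict) -> tuple:
        """Apply one action and return (observation, reward, done, info)."""
        self.steps_taken += 1
        # Assumption: the action already carries a validation score.
        score = float(action["validation_score"])
        # Assumed reward: improvement over the best score seen so far.
        reward = max(0.0, score - self.best_score)
        self.best_score = max(self.best_score, score)
        self.history.append(score)
        done = self.steps_taken >= self.max_steps
        observation = {"task": "toy-task", "best_score": self.best_score}
        return observation, reward, done, {"steps_taken": self.steps_taken}


def run_episode(env: ToyMLEEnv, proposed_scores: list) -> tuple:
    """Feed a fixed list of scores to the environment, one per step."""
    observation = env.reset()
    total_reward = 0.0
    for score in proposed_scores:
        observation, reward, done, _info = env.step({"validation_score": score})
        total_reward += reward
        if done:
            break
    return observation["best_score"], total_reward


if __name__ == "__main__":
    env = ToyMLEEnv(max_steps=3)
    best, total = run_episode(env, [0.5, 0.25, 0.75, 1.0])
    print(f"best score: {best}, total reward: {total}, steps: {env.steps_taken}")
```

The loop stops once the environment reports `done`, so the fourth proposed score is never evaluated here. An LLM agent would replace the fixed list of scores with actions chosen from each observation, which is what makes the workflow iterative.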
New Updates
We actively maintain this as a long-term, real-time leaderboard, adding new models and evaluation tasks over time to foster community-driven innovation.