🚀Introducing Code Arena: the next generation of live coding evals for frontier AI models. Built to test how models plan, scaffold, debug, and build real web apps step-by-step.
Try Claude, GPT-5, GLM-4.6 and Gemini in Code Arena today!
🚀 Introducing Arena Expert: a new LMArena evaluation framework to identify the toughest, most expert-level prompts from real users, powering a new Expert leaderboard.
We also introduce Occupational Categories that underlie eight new leaderboards:
💻 Software & IT Services
✍️ Writing, Literature, & Language
🔬 Life, Physical, & Social Science
🎭 Entertainment, Sports, & Media
📈 Business, Management, & Financial Ops
🧮 Mathematical
⚖️ Legal & Government
🩺 Medicine & Healthcare
Explore how models perform across fields in thread 🧵 👇