gpt-oss 120B fell off hard on lmarena: it loses to Qwen 30B-A3B *instruct* (not thinking) in every category (except a ≈tie in math), to say nothing of its weight-class and category peer glm-4.5 air. I don't get how this can happen.