Will an LLM break 1400 ELO on LMSys before February? | Manifold

Premium

49

Ṁ120k

Feb 2

3% chance

Google currently leads with Gemini, which has two models at around 1370 ELO.

But OpenAI just announced O3, which is getting great marks on things like hard science questions.
https://deepnewz.com/ai-modeling/openai-unveils-o3-o3-mini-models-exceeding-human-performance-on-arc-agi-4f05e4f7

The resolution is simple. Will an LMSys update contain a model with 1400 ELO? The cutoff is the last day of January (East Coast time).

Update 2025-01-26 (PST): Resolution criteria update: the resolution will be based on the information available on the website on February 1st. (AI summary of creator comment)

This question is managed and resolved by Manifold.

#Technical AI Timelines

bought Ṁ3,250 NO

@Moscow25 is it per status on the last of January, or will you include whatever's the first update in February? I think you usually go by the latter on your markets (am I conflating creators, perhaps?), but unless it's in description I imagine it should be by the former.

@Moscow25 ping :)

@HenriThunberg whatever is on the website on Feb 1st

bought Ṁ333 YES

Let's see if DeepSeek R1 makes a dent!

We are running low on time. But the models are pretty good. 1374 ELO

Big question is will there be a model launch in time...

You can bet on which one crosses 1400 first here

https://manifold.markets/ChinmayTheMathGuy/what-will-be-true-of-the-first-mode

@ChinmayTheMathGuy cool -- gave you some liquidity for that market -- needs more

Worth noting: this market is essentially https://manifold.markets/bobbill/will-any-llm-outrank-gpt4-by-150-el but with a 1 month later close date

Will any LLM outrank GPT-4 by 150 Elo in LMSYS chatbot arena before 2025?

2% chance. https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard If GPT-4.5 reaches 150 Elo above GPT-4, the market resolves yes. If at any point any LLM or chatbot reaches this threshold, the market resolves yes. I will not bet in this market.

I would recommend rewording the title to "before February".

bought Ṁ1,000 NO

also @Moscow25 wanna bet more at 55%? Got a limit up for an hour or so

Related questions

Which organization will have the top LLM on LMSys on March 1st?

Will LLMs mostly overcome the Reversal Curse by the end of 2025?

Will an LLM improve its own ability along some important metric well beyond the best trained LLMs before 2026?

Llama 3 405B ELO on Lmsys Arena Leaderboard 2 weeks after first appearance?

Will an LLM be able to solve a Rubik's Cube by 2025?

EOY 2025: Will open LLMs match closed-source LLMs on coding to within 50 ELO points?

What organization will top the LLM leaderboards on LMArena at end of 2025? 🤖📊

400-point pwn solved by an LLM by 2025

Will any LLM outrank GPT-4 by 150 Elo in LMSYS chatbot arena before 2025?

Will LLM progress stall in 2024?
