Top OSWorld score in 2025?


Background

OSWorld is a benchmark for evaluating multimodal AI agents on real-world computer tasks in open-ended environments. It tests an AI's ability to navigate operating systems, use applications, and complete practical tasks through a combination of vision and text inputs/outputs.

As of January 24, 2025, the highest OSWorld score is held by OpenAI CUA (200 steps) with a score of 38.1. Other notable scores include:

  • UI-TARS-72B-DPO (50 steps): 24.6

  • UI-TARS-72B-DPO (15 steps): 22.7

  • Claude 3.5 Sonnet (50 steps): 22.0

Resolution Criteria

This market will resolve to the highest verified OSWorld score achieved by any AI model during the 2025 calendar year (January 1, 2025 to December 31, 2025). The score must be publicly reported and verifiable through official sources such as the OSWorld leaderboard, academic publications, or credible tech news outlets.

If multiple models achieve the same highest score, the market will resolve to that score. If scores are reported with different decimal precisions, they will be considered at their reported precision.
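
The resolution rule above can be sketched in a few lines of Python. This is only an illustration of the stated criteria, not how the market is actually resolved: the `Report` record, the `resolve` function, and the sample entries are all hypothetical, and deciding whether a report is "verified" is left as a manual flag. Scores are kept as `Decimal` values so they are compared at their reported precision rather than as rounded floats.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Iterable, Optional

# Resolution window taken from the criteria: calendar year 2025, inclusive.
WINDOW_START = date(2025, 1, 1)
WINDOW_END = date(2025, 12, 31)


@dataclass(frozen=True)
class Report:
    """One publicly reported OSWorld result (hypothetical record type)."""
    model: str
    score: Decimal      # kept exactly as reported, e.g. Decimal("38.1")
    reported_on: date
    verified: bool      # assumption: set by hand after checking the source


def resolve(reports: Iterable[Report]) -> Optional[Decimal]:
    """Return the highest verified score reported inside the window.

    Ties need no special handling: if several models share the top
    score, the result is simply that score. Returns None when no
    report qualifies.
    """
    eligible = [
        r.score
        for r in reports
        if r.verified and WINDOW_START <= r.reported_on <= WINDOW_END
    ]
    return max(eligible) if eligible else None


# Hypothetical sample data, for illustration only.
sample = [
    Report("hypothetical-agent-a", Decimal("38.1"), date(2025, 1, 24), True),
    Report("hypothetical-agent-b", Decimal("41.0"), date(2024, 12, 30), True),   # before the window
    Report("hypothetical-agent-c", Decimal("45.5"), date(2025, 6, 1), False),    # not verified
]
print(resolve(sample))  # 38.1
```

In this sketch the two higher scores are ignored because one falls outside the 2025 window and the other is not verified, so the result is 38.1.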
