My attempts to get DALL-E 2 to draw very simple shapes went... poorly.
(https://outsidetheasylum.blog/testing-dall-e-2-mathematics-comprehension/)
When this market closes, I'll test the most advanced image-generation models that I'm able to get access to. If there are multiple, I'll try them all.
If any of them can consistently return a pentagon from the description "a pentagon", a heptagon from the description "a heptagon", and similar, I'll resolve to YES. If they need a bit of nudging like specific prompt wording but can still generalize correctly to any polygon, that'll be good enough to resolve YES. Otherwise I'll resolve NO.
A model that was trained specifically to make geometric shapes doesn't count; it has to be a generalist like DALL-E 2. In order for a new model to qualify for this market, it needs to be no worse than DALL-E 2 at the vast majority of things it's asked to draw.
It will be interesting to see how inference-time scaling methods improve the ability of generative models to solve these kinds of problems.
@IsaacKing is there a version of this question for 2025?
I'm thinking of writing another question that's specifically about inference-time scaling for generating images with correct geometric shapes. But I like this question, and it's a bit more general.
@IsaacKing is the EOY for this market 2023 or 2024? The title says "end of 2024", but the market itself is marked "Jan 1". I'm making a bet only if it's the end of 2023.
@LukaChrelashvili Whoops, sorry about that. I think more people are likely to go by the title rather than end date, so I've changed the end date to match the title.

DALL-E 3 can't create a pentagon or any convex polygon other than a square and a triangle. Even if I specify or nudge the prompts by saying "shape with 5 sides", the resulting image almost always shows a hexagon.
Shape words are very hard. Be more explicit here. Most babies cannot understand more sophisticated shape words.
@IsaacKing There is no basic difference between machine and human intelligence. You would not ask a one-year-old child to draw you a "pentagon", whatever that is. Ask it to draw a form with five sides, or use whatever words are appropriate.