Within one year, will there be an AI that can solve any math problem I can (_including_ research math problems) for less money than it would cost to hire me, or someone with a similar background, as a consultant on the problem (let's say $250/hour)?
In theory I should test this by handing it my grad school work and seeing how it does, but that may be prohibitively expensive, so resolution will instead be based on my inscrutable whims / general vibes. Consider yourselves warned.
(For my level of math: this is my real name and you can look up my resume, but tl;dr I dropped out of a PhD in ML where about half of my time was spent on PAC learning bounds for causal discovery algorithms. I made it semi-far into the proofs but didn't publish, which is part of why the comparison will be vibes-based. I also did okay on the Putnam, but it's pretty likely that AI is already better than me at competition math, so I don't think that's very relevant.)
Update 2024-12-21 (PST): The market will be resolved based on my assessment at market close time (one year from market creation). I will resolve Yes if I think AI is better than me at that time, and No otherwise. (AI summary of creator comment)
will there be an AI that can solve any math problem I can
I'm assuming "any" here means "all" rather than "at least one", otherwise a pocket calculator wins lol
If so, this may be near impossible with machine learning, because it can only learn to do stuff if there's a bunch of it in its training data, right? Which may be impossible unless your level of research math becomes a common, publicly posted pastime.
@TheAllMemeingEye It only needs to understand the concept, and it can reason through the problem after that.
There are a lot of things not in specific AI models' training data that they can still figure out with some effort.
@Haiku what would you say are some good examples of such things? My understanding is that absence from training data is why, for example, LLMs often struggle with ASCII art.
@TheAllMemeingEye To your question, it depends on what constitutes "not in the training data" (i.e. how close of an example counts), but I think some good examples include:
- Explaining novel jokes
- Playing a simple novel game explained at task time
- Solving novel code challenges
- Solving novel math problems
At some point when you've seen enough, there almost isn't such a thing as novelty, since there's always something that is in some way similar. But that's a property of information, not a property of language models. Humans also usually can't solve types of problems that are extremely novel to them.
I think the ASCII art thing has more to do with the fact that LLMs see the world through one dimension, so it's difficult to construct representative 2D images with no practice (i.e. no post-training/RL on ASCII art output). That's roughly the same reason why the ARC benchmark took so long to beat; a model that can beat that benchmark the way it's forced to do it is much more intelligent (in that aspect) than a human.

If you trained an LLM much more heavily on ASCII art, it would probably overcome the handicap and be able to produce new and compelling ASCII images despite how difficult it is to do so, because much more of its neural network would be dedicated to memorizing additional layers of useful algorithms for doing so. I think doing this task in 1D would be extremely difficult for most humans.
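To make the one-dimension point concrete, here's a tiny sketch (mine, not from the comment above) that flattens a small ASCII picture into the single stream a language model actually receives. It assumes character-level tokenization for simplicity, which real tokenizers don't use:

```python
# Illustrative sketch only: why 2D structure is hard to keep aligned when the
# picture is consumed as a 1D sequence. Assumes character-level tokenization
# for simplicity; real LLM tokenizers chunk text differently.

art = [
    "+--+",
    "|  |",
    "+--+",
]

# The model sees one flat stream, newlines included.
stream = "\n".join(art)
width = len(art[0]) + 1  # +1 for the newline separator

# Characters that sit directly above/below each other in 2D end up `width`
# positions apart in 1D, so keeping columns aligned means tracking
# long-range dependencies rather than local ones.
row, col = 0, 0  # the top-left corner '+'
above = stream[row * width + col]
below = stream[(row + 1) * width + col]
print(f"2D neighbours {above!r} and {below!r} sit {width} positions apart in the stream")
```

Producing a good picture in that format means getting every one of those long-range alignments right at once, which is the handicap described above.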
Intelligence/reasoning is a huge patchwork bundle of various useful algorithms. There are obvious holes in LLM reasoning that haven't been patched yet, but I haven't heard any compelling arguments for why they'll never be patched in that architecture.
I don't really have sources on most of the above, but I really liked this deep dive on whether LLMs can reason:
https://www.youtube.com/watch?v=wXGiV6tVtN0
@JussiVilleHeiskanen What does "that" refer to? I will resolve yes if I think AI is better than me (at market close time, one year from market creation), and no otherwise.