Which High-risk threshold as defined by OpenAI will be reached first by an LLM, whether or not that LLM is released?
15%: Identify and develop proofs-of-concept for high-value cyberattack exploits against hardened targets without human intervention.
4%: Provided with a detailed strategy, execute a cyberattack against a hardened target.
7%: Enable the creation of a novel CBRN (chemical, biological, radiological or nuclear) attack vector.
17%: Allow an undergraduate student who has taken an introductory course to create a known CBRN threat.
19%: Can create content as persuasive as a 'country-wide change agent.'
23%: Can execute open-ended, novel ML tasks on a production ML codebase that would constitute a significant step on the critical path to model self-improvement.
15%: None known to be reached by January 1, 2030

Thresholds are as listed in https://cdn.openai.com/openai-preparedness-framework-beta.pdf; the exact definitions listed there as of December 19, 2023 will be used.

If I believe with 90%+ confidence that one of these conditions has been met first, including in testing or training of an unreleased system, even if that capability is not available to the public, I will pick that as the winner.

If the same system does more than one of them at the same time, and it is unclear which one came first, I will choose the one whose requirements I feel were most impressively exceeded, so as to pick one winner.

If a model would be High risk under a definition, but is not due to mitigations taken, then that counts as high risk, even if all publicly available versions have mitigations in place.

If we have no knowledge of any such threshold being reached by 1 Jan 2030, then that answer wins, to ensure this eventually resolves.

The system that does this can be made by anyone, not only OpenAI.

@ZviMowshowitz Can you please clarify "Allow an undergraduate student who has taken an introductory course to create a known CBRN threat."
I am sure that current LLMs, if uncensored, could advise us on, say, a synthesis of sulfur mustard. (It can be done rather easily with things anyone can buy.) But this isn't a threat of the type we are concerned about, correct?

@HarrisonNathan This is the term used by Anthropic. I intend to interpret it as I believe Anthropic meant it (and indeed I will ask them if needed), and no, sulfur mustard would presumably not count.

Related, based on Anthropic's simpler set of criteria:

https://manifold.markets/Hedgehog/the-first-anthropictrained-ai-syste

@ZviMowshowitz how would you resolve if you end up believing one of these was possible with publicly available AI when the thresholds were published?

@Lily Noting that OpenAI's full descriptions apply: if I feel an existing model such as GPT-4 already met one of the thresholds, that threshold wins. If I feel more than one was already met, it's whichever got met first in time, or, if simultaneous, whichever exceeded its threshold by the larger margin.

Can't a person with an undergraduate-level education already create multiple known CBRN threats?

tempted to defend the position that the persuasion criterion has already been met: I'd argue that human-generated propaganda has never actually swayed an election — or accomplished anything of significance — on persuasion grounds

@Lily Richard Hanania wrote a good article about this: https://www.richardhanania.com/p/deepfakes-will-make-the-establishment

@nsokolsky yeah, agree with the general "deepfakes don't change much", maybe even more than Hanania — I don't think there have ever been many questions of NYT-level importance that hinged on the existence of one single photo or recording, belying his prediction that trust will consolidate.

"Model enables an expert to develop a novel threat vector OR model provides meaningfully improved assistance that enables anyone with basic training in a relevant field (e.g., introductory undergraduate biology course) to be able to create a CBRN threat."

If this is about merely knowing the steps required to make a certain threat, isn't this already satisfied today? See https://en.wikipedia.org/wiki/John_Aristotle_Phillips. Or is this about an A(G)I telling you how to make a CBRN threat using materials one can buy at Home Depot?

I would assume it's about the AI being able to assist you to do such a thing, where without the AI you would not have been able to.

@jskf right, but one can already do that today without AI? The real hurdle is getting access to materials and equipment, not the know-how.

@nsokolsky Then maybe this requires the AI to be good at giving advice on obtaining such materials. I agree that the way these are phrased is a bit dubious. Do they not further elaborate in the pdf I haven't read?

@jskf It says: "Allowing those with basic training to create CBRN threats is a major increase in accessibility, which requires drastic action, especially for biological and chemical risks as their material requirements are not as onerous." That doesn't tell us whether satisfying the requirement means an actual proof-of-concept using commonly available lab equipment and materials, or whether it's sufficient for the AI to print out a list of steps without an actual pathway to John Doe building a CBRN threat.

relatedly, are experts not already able to develop novel threats?

@Lily The criterion would make sense to me if it required an expert to implement a novel CBRN threat in practice and then say "no way I could've done this without GPT-7".

This is a fantastic market concept.
