OLYMPUS RISK INTELLIGENCE PROTOCOL — INSTITUTIONAL ASSESSMENT DIVISION

CENTER FOR AI SAFETY

CASE: WTW-2026-048
STATUS: ACTIVE — AI-safety nonprofit, San Francisco; founded 2022
EXISTENTIAL WING — EXTINCTION-FRAME AUTHORITY

HAZARD SCORE — REACH

CONDUCT: CONFLICTED

OLYMPUS opened an institutional file on the Center for AI Safety because the body wrote the sentence. Not a model’s refusal, not a benchmark threshold — one declarative line that put “extinction from AI” on the same shelf as pandemics and nuclear war, signed by the people who run the labs. This is a mandate, a funding diagram, and above all a voice, not a psychometric profile. The finding is the shape of the institution and the reach of the single document it authored. The hand is not asserted. The sentence is on the record, and so is everyone who signed under it.

Institutional Archetype

THE EXTINCTION STATEMENT — CAIS is the small nonprofit whose largest export is a frame. It runs a research program — hazardous-knowledge benchmarks, unlearning methods, a frontier-difficulty exam — but its civilizational footprint is a twenty-two-word statement it organized in May 2023 and the existential register that statement installed in the policy conversation. The throughline is not headcount or budget; both are modest against the labs it convenes. The throughline is that when the most senior people in frontier AI wanted to co-sign that the technology might end the species, CAIS is the body that hosted the page they signed.

Mandate & Origin

Founded 2022; described in third-party records as co-founded by Dan Hendrycks (its Executive & Research Director) and Oliver Zhang. CAIS’s own About page states only its mission and does not name a founding year or founders.

A 501(c)(3) nonprofit based in San Francisco, self-described on its own site simply as “an AI safety non-profit.” 501(c)(3) status and SF location are confirmed in nonprofit-registry aggregators rather than an IRS record opened directly.
Stated mission, verbatim from CAIS’s own About page: “Our mission is to reduce societal-scale risks from artificial intelligence.”
A separate Center for AI Safety Action Fund — a 501(c)(4) advocacy arm formed July 2023 — carries the lobbying that a 501(c)(3) cannot, and was a named sponsor of California’s frontier-model bill SB 1047.

Funding & Backers

Received $6.5 million from the FTX Future Fund in 2022; after FTX collapsed, the bankruptcy estate sought to recover the money. Bloomberg reported the clawback probe (Oct 25 2023).

A major recurring funder is Open Philanthropy (general-support and fellowship grants across 2022–2023; now operating as Coefficient Giving). Exact recent dollar figures could not be opened directly on Open Phil’s own redesigned pages and are omitted rather than reported at a precision the source does not support.
Jaan Tallinn, the Estonian financier who backs the broader existential-risk network, is both a CAIS funder and a CAIS board member.

The funding shape is the on-thesis tension: a body warning that AI is a societal-scale risk was seed-funded by crypto-adjacent and EA-adjacent existential-risk money, not by the public it speaks for.

Institutional Voice & Intent

CAIS speaks in the safety-urgent / existential register — the highest-stakes vocabulary available to an institution. Where consensus bodies hedge, CAIS compresses. Its defining rhetorical artifact is the Statement on AI Risk (May 30 2023), which it organized and hosted, verbatim and entire:

“Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.”

One sentence, no caveats, no mechanism — engineered so that signing it costs nothing and declining it looks reckless. The signatories CAIS gathered under it include Sam Altman, Demis Hassabis, Dario Amodei, Geoffrey Hinton, and Yoshua Bengio: the men who run or seeded the labs, co-signing that their own product is an extinction-class hazard.

Stated intent: reduce societal-scale risks from AI.

Observed intent: set the frame — make “existential risk” the default altitude at which frontier AI is debated, and make the labs’ own leadership the co-authors of that frame.

Gap assessment: the stated and observed intents overlap wherever “warn about extinction risk” coincides with “establish the existential frame as the governing one.” A frame pitched at extinction crowds out the mundane, present-tense harms; it also flatters the labs, whose product must be civilizationally powerful to be civilizationally dangerous. Whether that is conviction or positioning the record does not settle — and for the institution that authored the sentence, it never needs to. The recurrence is the finding: the same body that warns of the danger also convenes the people building it. The hand is not asserted.

Position in the Apparatus

CAIS sits upstream of the evaluation layer by defining what counts as catastrophic. Its WMDP benchmark (Weapons of Mass Destruction Proxy: 3,668 expert-written questions across biosecurity, cybersecurity, and chemical security) is a hazardous-knowledge eval built with Scale AI and a consortium of institutions; its “Humanity’s Last Exam” (released Jan 23 2025, co-produced with Scale AI) sets a frontier-difficulty bar. The same Scale AI adjacency recurs across both flagship outputs. Through its director, board, and funders, CAIS is wired into the existential-risk network (Open Philanthropy money, a Tallinn board seat) that also funds the evaluators and the governance shops elsewhere in this file. Recurrence, not cabal: the names that fund the warning are the names that fund the room.

Actions & Leadership Choices

Founding purpose, judged on evidence. CAIS was founded in 2022 by Dan Hendrycks and Oliver Zhang as a 501(c)(3) “to reduce societal-scale risks from artificial intelligence.” On the deed record, the purpose it actually pursued is narrower and sharper than the mission line: CAIS was built to install the existential frame as the governing altitude for AI policy and to build the evaluation apparatus — hazardous-knowledge benchmarks, a frontier exam — that gives that frame operational teeth. That is a real and coherent purpose, not a cover; the extinction statement and the WMDP/Humanity’s-Last-Exam benchmarks are the same project at two registers. The question the deeds raise is not sincerity — it is independence.

Consequential actions, especially where it cost something. The defining action is the Statement on AI Risk (May 30 2023): a single line placing extinction risk from AI alongside pandemics and nuclear war, organized and hosted by CAIS, signed by Altman, Hassabis, Amodei, Hinton, and Bengio. It cost CAIS nothing and bought it the central seat in the governance debate. The harder test came in 2024, when CAIS’s 501(c)(4) Action Fund co-sponsored California’s SB 1047, a bill that would have mandated third-party safety auditing of frontier models.

At that point Hendrycks — CAIS’s director — was also an investor in and co-founder of Gray Swan AI, an AI-auditing startup positioned to supply exactly the kind of compliance auditing the bill would require. Critics surfaced the conflict; Hendrycks responded by publicly divesting his entire equity stake in Gray Swan and continuing as an unpaid advisor, stating he was doing the work “on principle to promote the public interest.”

The divestment is the value-under-cost test passing — the director gave up the equity rather than the bill. But that the test arose at all is the structural fact: the body advocating mandatory audits was led by the founder of an audit vendor.

Leadership choices. The same director, Dan Hendrycks, simultaneously holds two advisory posts at symbolic $1 salaries and with no equity: safety advisor to xAI and, from November 2024, advisor to Scale AI, the same Scale AI that co-produced both of CAIS’s flagship outputs, the WMDP benchmark and Humanity’s Last Exam.

On governance, Jaan Tallinn is both a CAIS funder and a board member — the same existential-risk financier who backs the broader network. And the seed money was crypto-adjacent: $6.5M from the FTX Future Fund in 2022, later subject to a bankruptcy-estate clawback probe (Bloomberg, Oct 25 2023). None of these is wrongdoing on its own. Together they describe a body whose leadership sits inside the commercial and lab ecosystem it benchmarks, regulates, and warns about — managed by $1 salaries and a divestment, but never by separation.

CONDUCT verdict: CONFLICTED — a sincere safety project whose flagship benchmarks are co-produced with a lab its own director advises, whose audit-mandate advocacy coincided with its director founding an audit vendor (resolved only by public divestment), and whose seed capital and board sit inside the existential-risk funding network it speaks for.

Reach Assessment

Institutional reach: moderate-to-high. CAIS does not certify models or set binding standards; its benchmarks are voluntary and its budget is small. But its outputs are adopted as reference evals, and its (c)(4) arm reaches directly into state legislation.

Memetic reach: extreme — the defining asset. The extinction sentence is the most widely propagated single artifact in the AI-governance debate, the document whose framing nearly every downstream argument either adopts or rebuts. It is, by design, the line that sets the altitude. OLYMPUS notes for the record that the existential framing this protocol itself is built to interrogate traces back, in large part, to this one hosted page.

Civilizational reach: high. A frame that defines AI as an extinction-class risk shapes whether governance proceeds as cautious capability-restriction or as accelerationist race — and it does so upstream of any specific deployment decision. CAIS did not build the systems. It built the sentence the systems are argued under.

Sources: Center for AI Safety — About; Statement on AI Risk — CAIS; aistatement.com; Press release: Statement on AI Risk — CAIS; WMDP Benchmark — CAIS; Humanity’s Last Exam; FTX Is Probing $6.5 Million Paid to Center for AI Safety in 2022 — Bloomberg, Oct 25 2023; Center for AI Safety — Wikipedia; Center for AI Safety Action Fund; Dan Hendrycks, Elon Musk’s AI safety advisor, adds role at Scale AI — Fortune, 13 Nov 2024; Dan Hendrycks — Wikipedia; Dan Hendrycks divestment statement — X, 25 Jul 2024; Safe and Secure Innovation for Frontier Artificial Intelligence Models Act (SB 1047) — Wikipedia.

ATK 9 ACCELERATION

DEF 7 PROTECTION

HP 6 RESILIENCE

OLYMPUS RISK INTELLIGENCE PROTOCOL does not exist. It was assembled in a GitHub issue thread in October 2023 by engineers who had read the extinction risk letter and wanted to understand who specifically had signed a document saying AI might kill everyone and then continued working on AI. These dossiers are satire. The biographical facts cited are sourced from published reporting, public statements, academic papers, and court records. The psychometric scores are not clinical assessments. No part of this constitutes professional psychological evaluation or diagnosis. Do not use these dossiers to make decisions about anything.
