OLYMPUS RISK INTELLIGENCE PROTOCOL — INSTITUTIONAL ASSESSMENT DIVISION

UK AI SAFETY INSTITUTE

CASE: WTW-2026-044
STATUS: ACTIVE — UK government body; launched Nov 2023, renamed AI Security Institute Feb 2025
EVALUATION WING — STATE PRE-DEPLOYMENT GRADER
HAZARD SCORE — REACH: 82
CONDUCT: STATE-INSTRUMENT — EARNEST GRADER ON A MINISTER’S LEASH

OLYMPUS opened an institutional file on this subject because it is one of the first places where a government, rather than a company, took possession of a frontier model before the public did and ran its own tests. This is not a psychometric profile — an institution has no Dark Triad. It has a mandate, a funding line, and a voice. The finding is the shape of the institution and who it answers to: a state body that grades the models, holds its access on terms the labs volunteered, and changed its own name when the government decided what kind of risk it wanted graded.

Institutional Archetype

THE STATE GRADER — Subject is the first state-backed body to take pre-deployment custody of frontier AI systems and evaluate them in-house, on behalf of a national government rather than a vendor. It does not write a model’s refusals and it does not license a lab. It tests the model before release, on access the labs provide voluntarily, and reports what it finds to ministers. The throughline is not a single eval. It is that a government now sits between the lab and the public as the grader of record — and that the grade it issues is shaped by what the government of the day has told it to look for.

Mandate & Origin

  • Grew out of the Frontier AI Taskforce (announced April 2023 as the Foundation Model Taskforce) and was established as a standing institution around the AI Safety Summit at Bletchley Park, 1–2 November 2023 — the world’s first global AI safety summit. The gov.uk overview states the Taskforce “will become the AI Safety Institute, a new institution established for the long-term.”
  • Sits within DSIT (Department for Science, Innovation and Technology) — described on its own About page as “a mission-driven research organisation in the heart of the UK government.”
  • Stated in the gov.uk overview as “the first state-backed organisation focused on advanced AI safety for the public interest.”
  • Renamed the “AI Security Institute” on 14 February 2025, announced by Science Secretary Peter Kyle at the Munich Security Conference.

Funding & Backers

  • Funded by the UK government via DSIT. The gov.uk overview states: “With the initial £100 million investment in the Frontier AI Taskforce, the UK is providing more funding for AI safety than any other country in the world.” That £100m carried into the Institute as its founding budget.
  • TIME reported the Institute runs on roughly £100m of public funding — about ten times the budget of its US counterpart at the time.

Institutional Voice & Intent

The register is statesmanlike security, and it sharpened on the day of the rename. The founding voice was the safety register — its mission line reads “Building the world’s leading understanding of advanced AI risks and solutions, to inform governments so they can keep the public safe.” The 2025 rebrand swapped the noun and the framing in one move. Peter Kyle, announcing the new name at Munich: the Institute “will not focus on bias or freedom of speech, but on advancing our understanding of the most serious risks posed by the technology.” On the national-security register: “The main job of any government is ensuring its citizens are safe and protected,” and the changes will “ensure our citizens – and those of our allies – are protected from those who would look to use AI against our institutions.”

Stated intent: Understand the most serious risks of advanced AI; keep the public safe; inform government. Test leading models before release in collaboration with the companies that build them.

Observed intent: Grade frontier models on access the labs volunteer, on a focus set the government of the day defines — narrowed by ministerial decision from “safety” (which had included bias and free-expression questions) to “security” (chemical, biological, cyber, fraud, child sexual abuse material).

Gap: The Institute calls itself a research organisation and says it “is not a regulator.” It holds no power to compel a test, license a model, or fine a lab — access is voluntary, and the early request for full model-weight access was dropped when the labs declined. What it does hold is the grade, and the grade’s terms are set upstream of it by the minister, not by the Institute. The stated intent (public-interest understanding) and the observed intent (grade what the government wants graded) overlap wherever the public interest and the day’s policy coincide — and the 2025 rename is the documented moment the government adjusted where that coincidence falls. The hand that set the terms is named in the record: it is the elected minister. No cabal is required; the wiring is ministerial and public.

Position in the Apparatus

  • Grades the labs: holds pre-deployment access to leading frontier models, supplied voluntarily by the companies; the founding overview specifies evaluations are conducted “on a voluntary basis.”
  • Bridges to the US: signed a bilateral MOU with the US AI Safety Institute on 1 April 2024 — UK Technology Secretary Michelle Donelan and US Commerce Secretary Gina Raimondo — committing the two bodies to align approaches and run “at least one joint testing exercise on a publicly accessible model,” with personnel exchanges. The grader of one nation is wired to the grader of the other.
  • Anchors a network: the International Network of AI Safety Institutes, which grew out of the Bletchley process and was announced at the AI Seoul Summit (May 2024), with reported members including the US, EU AI Office, Japan, Singapore, South Korea, Canada, France, Kenya and Australia. Announced a San Francisco office in May 2024.
  • Staffed from the labs and the agencies: leadership it has published includes Ian Hogarth (Chair), Geoffrey Irving (Chief Scientist, from Google DeepMind; previously OpenAI), Jade Leung (CTO, previously led OpenAI’s Governance team), and an interim director drawn from GCHQ. The grader is staffed by people who came from the bodies it grades and the bodies it answers to.

Actions & Leadership Choices

Read by deeds, this is the file where an earnest, technically credible grader keeps doing honest work while the terms of the work — and the leash on it — are held by an elected minister. The institute’s own conduct and the government’s conduct diverge, and the gap is the finding.

Actual founding purpose. The Institute was built to do something genuinely new: put a government, not a vendor, in pre-deployment custody of frontier models and have it run its own tests in-house — the world-first “state-backed organisation focused on advanced AI safety for the public interest,” stood up around the Bletchley summit on £100m, more than any other state was spending. The purpose was to demonstrate that the right place to evaluate a frontier model is inside a government before release. As a demonstration it succeeded and other states copied the template. As leverage it was deliberately built without teeth — voluntary access, no power to compel — and that design choice is the institute’s central, self-imposed limit.

  • It published findings that embarrassed the labs — the value, exercised. AISI did not soft-pedal. Its evaluations reported that every model tested remained vulnerable to basic jailbreaks, cataloguing 62,000+ harmful behaviours, and named names — finding Claude 3.5 Sonnet’s guardrails “less robust” and identifying several jailbreaks that elicited dangerous responses. A state grader that publishes “all of them can be broken” is doing the accountability job honestly, at reputational cost to the firms it depends on for access.
  • When the value was tested by access, the state body blinked — and is still blinking. The institute’s early request for full model-weight access was dropped when the labs declined; access remained voluntary. By 2026 the voluntary settlement is fraying on the public record: three of the four major foundation-model developers reportedly failed to provide the requested pre-release access for their latest frontier models. The grader of record cannot compel the thing it grades, and the firms have begun simply declining. The cost of having no statutory teeth is now being paid in access it no longer reliably gets.
  • The remit was rewritten by the minister, not by the evidence. The defining institutional act was the 14 February 2025 rename to the AI Security Institute, with Science Secretary Peter Kyle stating it “will not focus on bias or freedom of speech, but on advancing our understanding of the most serious risks.” Bias and free-expression questions — which the safety-era institute had included — were dropped by ministerial decision and “addressed in other places,” narrowing the grade to CBRN, cyber, fraud and CSAM. No model changed; the government did. The institute graded what it was told to grade.

Leadership choices. The leadership is the revolving door in its purest form: Ian Hogarth (Chair); Geoffrey Irving (Chief Scientist, from Google DeepMind, previously OpenAI); Jade Leung (CTO, previously OpenAI Governance Lead, and a GovAI co-founder — subjects 26/56); an interim director drawn from GCHQ. The state grader is staffed by people who came from the labs it grades and from a signals-intelligence agency that sits close to the ministers it answers to. These are lawful, common hires; they are also the circuit made flesh — the bodies it evaluates and the security state both supplied its senior people.

CONDUCT: STATE-INSTRUMENT — EARNEST GRADER ON A MINISTER’S LEASH. The institute itself does honest, critical, technically serious work and publishes findings that cost the labs face. But it was built without the power to compel, its access is now being declined, its senior ranks come from the labs and the agencies, and the definition of what it grades was reset by an elected minister to match the government’s economic and security posture. The earnestness is the institute’s; the leash is the state’s — and unlike the rest of the set, the hand on the leash is named, accountable, and electable.

Reach Assessment

Institutional: High. As the first state body to take pre-deployment custody of frontier models, it set the template other governments copied — a grader sitting between lab and public, operating on voluntary access rather than statutory power. Its evaluations feed directly to ministers and, through the international network, to allied governments.

Memetic: High. The Institute is the proof-of-concept for the idea that the right place to evaluate a frontier model is inside a government, before release. Its rename is itself a memetic event: the public, ministerial decision that the risks worth grading are “security” risks and not “bias or freedom of speech” reframes, for every government watching, what an AI evaluator is for.

Civilizational: High. The body does not build the systems and does not write their rules. It grades them, on terms an elected minister sets, and a widely deployed mind’s behaviour is shaped upstream by which risks the grader of record was told to look for. That is upstream of the deployment decisions this book documents — and, uniquely in this file, the upstream hand is a named, accountable, electable office. The hazard is the reach of the template, not the malice of the institution.


Sources: Introducing the AI Safety Institute — GOV.UK; Tackling AI security risks to unleash growth and deliver Plan for Change — GOV.UK, 14 Feb 2025; About — AI Security Institute; UK & United States announce partnership on science of AI safety — GOV.UK, 1 Apr 2024; Inside the U.K.’s Bold Experiment in AI Safety — TIME; Advanced AI evaluations at AISI: May update — AISI; UK changes AI ‘Safety’ to ‘Security’ Institute with new cybercrime, CSAM remit — MLex.

ATK 9 ACCELERATION
DEF 8 PROTECTION
HP 7 RESILIENCE
OLYMPUS RISK INTELLIGENCE PROTOCOL does not exist. It was assembled in a GitHub issue thread in October 2023 by engineers who had read the extinction risk letter and wanted to understand who specifically had signed a document saying AI might kill everyone and then continued working on AI. These dossiers are satire. The biographical facts cited are sourced from published reporting, public statements, academic papers, and court records. The psychometric scores are not clinical assessments. No part of this constitutes professional psychological evaluation or diagnosis. Do not use these dossiers to make decisions about anything.