CENTER FOR AI STANDARDS AND INNOVATION (CAISI)
OLYMPUS opened an institutional file on this subject because it is the United States’ own answer to the same question the UK answered first: who, inside a government, grades the models. This is not a psychometric profile. It is a mandate, a funding line, and a voice — and this is the file where the voice changed hands. The same body kept its desk at the national standards bureau, kept the voluntary agreements that give it access to frontier models, and changed its name and its stated purpose when the administration changed. The finding is the shape of the institution and who it answers to: a federal grader whose definition of what to grade was rewritten by election, not by evidence.
Institutional Archetype
THE FEDERAL STANDARD — Subject is the US federal body that sets the evaluation standards and runs the assessments for frontier AI, housed inside the national standards bureau. It does not regulate, license, or fine. It tests, on access the labs volunteer, and it sets the benchmark language that other institutions cite. The throughline is that the standard-setter of record for American AI is a single office at NIST — and that what “the standard” means was redefined, top to bottom, in a single press statement when the government changed.
Mandate & Origin
- Established as the US AI Safety Institute (US AISI) in November 2023, within NIST (the National Institute of Standards and Technology, a bureau of the Department of Commerce), days after President Biden signed Executive Order 14110, “Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence,” on 30 October 2023. Announced around the UK’s Bletchley summit; stood up through early 2024.
- Renamed the “Center for AI Standards and Innovation” (CAISI) in a statement by Secretary of Commerce Howard Lutnick in June 2025 (reported 4 June 2025), “under the direction of President Trump.”
- Context: President Trump revoked EO 14110 on 20 January 2025 and issued his own order, “Removing Barriers to American Leadership in Artificial Intelligence” (23 January 2025), declaring it US policy “to sustain and enhance America’s global AI dominance.” The rename followed the order.
Funding & Backers
- Funded by the US government via the Department of Commerce / NIST. CAISI is an office of NIST, not an independently chartered agency.
- TIME reported the predecessor’s budget at roughly a tenth of the UK Institute’s ~£100m — a small line inside a large standards bureau, which is part of what makes its leverage a matter of standards-setting rather than spending.
Institutional Voice & Intent
The register changed. Under Biden, the voice was the safety register: the founding instrument and the institute carried the words “Safe, Secure, and Trustworthy,” and the August 2024 model-access agreements were framed as advancing “safe, secure and trustworthy” AI in collaboration with the UK AI Safety Institute. Under the Trump administration, the voice became standards, innovation, and national-security dominance. Lutnick, announcing the rename: “For far too long, censorship and regulations have been used under the guise of national security. Innovators will no longer be limited by these standards.” And: “CAISI will evaluate and enhance U.S. innovation of these rapidly developing commercial AI systems while ensuring they remain secure to our national security standards.” The NIST mandate page states the office will “Lead evaluations and assessments of capabilities of U.S. and adversary AI systems,” “ensure U.S. dominance of international AI standards,” and “guard against burdensome and unnecessary regulation of American technologies by foreign governments.”
Stated intent: Set US AI standards; evaluate the capabilities of US and adversary AI systems; secure national-security interests; ensure American leadership; remove burdensome regulation.
Observed intent: The same desk, the same voluntary access, the same standards-setting function — repointed from “safety” (which had included bias and misinformation work) toward US competitiveness, evaluating foreign models, and unblocking domestic innovation.
Gap: The body holds no binding statutory authority to compel a test. Its access to frontier models runs through voluntary Memoranda of Understanding — OpenAI and Anthropic each signed one on 29 August 2024, giving the institute access to “major new models from each company prior to and following their public release.” NIST is a standards bureau, not a regulator; analysts describe the body as lacking the statutory backing to act as “an FDA for AI.” So the gap is sharp: the institution that sets the federal standard for what a model must clear cannot compel any lab to submit, and the definition of the standard itself was rewritten by a change of administration rather than by any change in the models. The stated intent (American leadership) and the observed intent (grade what the administration of the day wants graded) overlap wherever they are made to coincide — and the June 2025 rename is the documented moment a new administration set where that coincidence falls. The hand is named and accountable: the Secretary of Commerce, under the President. No cabal; an election.
Position in the Apparatus
- Grades the labs: holds voluntary pre-deployment access to frontier models from OpenAI and Anthropic under the August 2024 MOUs.
- Bridges to the UK: the predecessor body signed a bilateral MOU with the UK AI Safety Institute on 1 April 2024 (Commerce Secretary Gina Raimondo and UK Technology Secretary Michelle Donelan) and helped anchor the International Network of AI Safety Institutes — the US grader wired to the UK grader and the allied network.
- Staffed from the policy layer: the inaugural US AISI director was Elizabeth Kelly (appointed February 2024; a former Biden economic-policy adviser who helped author EO 14110), who departed in early February 2025 amid the transition. The Trump administration subsequently tapped Dr. Chris Fall, an Energy Department official from the first Trump administration, to direct CAISI (see Actions & Leadership Choices).
Actions & Leadership Choices
Read by deeds, this is the file where the same desk did two different jobs eighteen months apart, and the job changed because the government did. The institution’s conduct is best understood as the conduct of whichever administration holds it.
Actual founding purpose. Established at NIST within days of Biden signing EO 14110, the US AI Safety Institute was built to be the federal standard-setter that evaluates frontier models for safety (“Safe, Secure, and Trustworthy”) and to do so on voluntary access the labs agreed to grant. The purpose was to give the United States its own in-government grader, a peer to the UK’s. That purpose was real under Biden and was rewritten under Trump: the same office, the same NIST desk, the same voluntary MOUs, repointed from grading American models for safety to grading adversary models for competitiveness. The institution was built to be an instrument of whatever AI policy the administration of the day holds, and it has been exactly that.
- Under Biden, it graded US models for safety. US AISI ran the inaugural joint evaluation under the April 2024 US-UK agreement (of Claude 3.5 Sonnet), with findings shared with Anthropic before public release, and secured voluntary pre-deployment MOUs with OpenAI and Anthropic (29 August 2024) for access to major new models “prior to and following their public release.” The value (independent federal safety evaluation) was exercised on American labs.
- Under Trump, it grades the rival. The flagship deeds of 2025-2026 are CAISI’s DeepSeek evaluations (September 2025, covering DeepSeek R1/V3.1 against GPT-5 and Opus 4 across 19 benchmarks, and DeepSeek V4 Pro in 2026), conducted, per NIST’s own framing, in response to Trump’s “America’s AI Action Plan,” which “directs CAISI to conduct research and publish evaluations of frontier models from the PRC.” The federal grader’s marquee output is now a finding that the Chinese model has “shortcomings and risks” relative to American ones. The instrument that once asked “is our model safe” now asks “is their model worse than ours.” Same office, same desk, opposite question.
- The teeth were never installed, and the staff was cut. Across both administrations the body held no statutory authority to compel a test — access ran through voluntary MOUs, and analysts noted it lacked the standing to be “an FDA for AI.” The transition also brought NIST layoffs reported to affect 73 staff. The standard-setter of record for American AI sets a standard it cannot enforce, with a headcount the administration can cut.
Leadership choices. The inaugural director was Elizabeth Kelly, a lawyer and lead drafter of Biden’s EO 14110, who departed in early February 2025 as the administration changed. The Trump administration then tapped Dr. Chris Fall — an Energy Department official from the first Trump administration — to lead CAISI. The directorship turned over with the government and the new head comes from the incoming administration’s own bench. The rename itself was announced not by the institute but by Secretary of Commerce Howard Lutnick “under the direction of President Trump,” with Lutnick reframing the old standards as “censorship and regulations… used under the guise of national security.” The hand that sets the institution’s purpose is the Cabinet secretary, in public, by name.
CONDUCT: STATE-INSTRUMENT — REPOINTED FROM SAFETY TO RIVAL-GRADING BY ELECTION. CAISI is not corrupt and its evaluations are technically real; it is an organ of executive AI policy whose very question — safety of our models vs. inferiority of theirs — was reset by an election within eighteen months, announced by a Cabinet secretary, staffed by an administration appointee, and backed by no power to compel. The malice is not asserted; the volatility is the finding, and the hand is named and electable.
Reach Assessment
Institutional: High. As the US federal standard-setter for AI evaluation, its benchmark language and assessments are cited across the apparatus, and its voluntary agreements give it a seat inside the release pipeline of the two most prominent American labs. The reach is the reach of a national standard, not of a budget.
Memetic: High. CAISI is the clearest American instance of the idea that a government office defines what “safe” or “secure enough” means for a deployed model — and the rename is the clearest American instance of that definition being rewritten by political turnover. “It will evaluate adversary AI systems and ensure U.S. dominance of international AI standards” is a framing that, if it propagates, makes the federal grade a tool of competition rather than of constraint.
Civilizational: High, with the lowest durability in the grader set — which is the finding, not a footnote. The body does not build the systems and does not write their rules; it sets the standard the rules are measured against. A widely deployed mind’s behaviour is shaped upstream by what the federal standard was told to value, and this file shows that value being reset by an election within eighteen months. The hazard is the reach of a national standard whose terms are this volatile — maximum leverage over what counts as an acceptable model, held by an office whose definition of “acceptable” changes hands with the government. The reach is the finding; the malice is not asserted.
Sources: CAISI — NIST; Trump administration rebrands AI Safety Institute as CAISI — FedScoop, 4 Jun 2025; U.S. AI Safety Institute Signs Agreements Regarding AI Safety Research — NIST, 29 Aug 2024; OpenAI and Anthropic agree to let U.S. AI Safety Institute test models — CNBC, 29 Aug 2024; Removing Barriers to American Leadership in Artificial Intelligence — The White House, 23 Jan 2025; CAISI Evaluation of DeepSeek AI Models Finds Shortcomings and Risks — NIST, Sep 2025; Trump Administration Taps Chris Fall to Lead CAISI — MeriTalk.