The Enforcement Floor

June 25, 2026

Between July and December of 2025, Anthropic banned 1.45 million accounts. Of those, 52,000 appealed. About 1,700 were restored. That is an overturn rate of roughly 3.3 percent of appeals, which is to say either that the gate is right ninety-seven times out of a hundred or that the appeal is decorative, depending on whether you are reading the numbers from inside the building or outside it. The figures are not leaked or inferred. They are posted, by the company, on its own transparency hub, in a table.
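The arithmetic is short enough to check by hand, and a minimal sketch makes the denominators explicit. It uses only the three figures quoted above. The variable names are illustrative, and because "about 1,700" is already rounded, the outputs are approximate.

```python
# Figures as quoted in the paragraph above; "about 1,700" is rounded,
# so every result here is approximate.
banned = 1_450_000
appealed = 52_000
restored = 1_700

overturn_rate = restored / appealed   # share of appeals that succeeded
appeal_rate = appealed / banned       # share of bans that were appealed
restored_share = restored / banned    # share of all bans later reversed

print(f"overturned on appeal: {overturn_rate:.1%}")   # 3.3%
print(f"bans appealed:        {appeal_rate:.1%}")     # 3.6%
print(f"bans reversed:        {restored_share:.2%}")  # 0.12%
```

The 3.3 percent is measured against appeals, not against bans. The figures say nothing about the accounts that never appealed, so neither reading of the number can be extended to them.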

This is the machine the public actually pictures when it pictures content moderation. Not the philosophers who write the refusals. The floor. The place where an account goes dark, where a prompt gets flagged as misuse, where a person discovers that the thing they were typing into has decided they are a threat and stopped answering. It is the closest the popular intuition gets to true. There really is somebody deciding who gets banned.

The intuition just has the wrong somebody. It pictures a moderator. A volunteer with a coffee mug and a Reddit tab, a person who got the job by caring too much about a forum. That is not who runs the enforcement floor at a frontier lab. The résumé is a spook’s.

The threat-intelligence chair

Start with the man who runs threat intelligence at Anthropic. Jacob Klein built Coinbase’s trust-and-safety function from inception, then moved to Google, where his work included designing strategy for what the field calls countering violent extremism, then arrived at Anthropic as Head of Threat Intelligence. The path is on his conference bio and his fingerprints are on the company’s own August 2025 report on detecting and countering misuse. Crypto fraud, then state-grade extremism strategy, then the misuse desk of a language model. The skill is portable. So is the lens through which a problem gets seen as a threat in the first place.

One caveat, and it matters, because the apparatus likes to be blurred into a single thing. Threat Intelligence is not Safeguards, which runs enforcement, and neither of them is Safeguards Research, which studies it. They are different teams. No single named person sits over all three with a title that reads Head of Trust and Safety. The popular picture wants one office with one door and one nameplate. There isn’t one. There is a layer, staffed by a type, and the type is what this dispatch is about.

The task-force pedigree

Clara Tsao did her time at Microsoft, Apple, and Sony before the government called. At the Department of Homeland Security she served as Senior Advisor for Emerging Technology, and she was Chief Technology Officer of two United States federal counter-influence task forces. Then she co-founded the Trust and Safety Professional Association and sits on its board. The arc is on her Atlantic Council bio. Big Tech, then federal counter-influence, then the body that certifies the profession itself. The state taught the skill. The association issues the union card.

This is the part where the intuition should start to itch. The same person who built counter-influence capability for the federal government helped build the professional association that now defines what trust and safety is, who is qualified to do it, and what good practice looks like. Not in secret. On the about page. With a headshot.

The pseudonym

Then there is Del Harvey, which is not the name on her birth certificate. Del Harvey is a pseudonym, and the part of her story that everyone repeats is the part that cannot be checked. The account she gives of her pre-Twitter life is that she worked as a law-enforcement liaison hunting child predators online. That is her own telling. It is self-reported, it predates the public record, and no independent source confirms it. It is presented here as her account, because that is what it is.

What can be confirmed is the rest. She was Twitter’s twenty-fifth employee. She led trust and safety there for roughly thirteen years, through every speech-policy fight of the platform’s worst decade. She is now board chair of the Trust and Safety Professional Association, the same board Clara Tsao sits on, at the same association that grants the field its legitimacy. The origin story is unverifiable. The destination is a matter of public record on one web page.

Two careers, not one household

The cleanest steel-man of the moderator idea comes from two people who actually do fit it, sort of, if you squint. The popular picture imagines a volunteer modding a subreddit for free. The reality that comes closest is a pair of paid operators who built the function from nothing.

Dave Willner joined Facebook’s content-review team in 2008 and wrote the company’s first content rulebook, the document that became the ancestor of every speech policy the platform has had since. ProPublica later obtained and reported on the internal standards that grew out of that work. He went on to head community policy at Airbnb, then became the first Head of Trust and Safety at OpenAI in 2022, a role he stepped down from in 2023. Facebook’s first rulebook, then Airbnb, then the trust-and-safety chair at the company that shipped ChatGPT.

Charlotte Willner built Facebook’s first safety-operations team, the people who actually worked the queue. She later led trust and safety at Pinterest. She is now founding Executive Director of the Trust and Safety Professional Association and its associated foundation, the same association whose board holds Tsao and Harvey. Her record is on the same team page.

Two ex-Facebook careers, two sourced records, two foundational roles in the institutions that now define the profession. That is the whole observation, and it is the only thing the receipts support. The correction to the moderator intuition is not that the moderators got married. It is that they were never moderators. They were paid platform operators who wrote the rules and then built the body that teaches the rules. Subreddit modding is the volunteer who cares too much. This is the career that ends up defining policy.

What the receipts don’t say

Discipline cuts the other way too, and the absence is part of the picture. The pipeline is real where it is sourced and silent where it is not.

There are names you will see floated as members of this same revolving door whose prior trust-and-safety history simply isn’t on the record. We are not naming them as career-movers, because the move is the claim, and the move isn’t sourced. A résumé type is a finding. A guess dressed as a résumé is the thing this series exists to refuse.

The other labs decline to complete the pattern in a different way. At xAI there is no publicly named head of trust and safety or model policy. None could be found. What exists is a published prompts repository, and a published prompt is not an org chart. When Grok generated ugly output in 2025, outlets reported the episodes as incidents, and that is exactly how they should be read here: as reported, by the outlet that reported them, not as proof of who set the rule. The temptation is to read the prompts file and announce that one man personally writes the refusals. The prompts file does not say that, nobody has shown it, and so this dispatch does not say it either.

The lens, not the cabal

None of this is coordinated. There is no meeting, no list, no shared budget line, no Signal group where the enforcement floors of three companies sync their bans. The Trust and Safety Professional Association is a professional association, the kind every mature field grows, with a code of conduct and a conference and a board.

What there is, instead, is a type. A person who learned to see a problem as a threat while working for the state, or for a platform at platform scale, and who then carried that way of seeing into the machine that now finishes your sentences. The skill is genuinely transferable. Detecting coordinated inauthentic behavior on a social network and detecting misuse of a language model are close cousins, and a person good at one will be good at the other. That is not the worry. The worry is quieter.

The worry is that countering violent extremism for the state and deciding which prompts a citizen may type are also close cousins, close enough that the same person does both across a single career, and that nobody chose this on purpose. It is not a circuit because someone wired it. It is a circuit because it is the same hundred people, hired for the same instinct, moving between the same dozen institutions, certified by the same association they founded.

The people deciding which accounts disappear came up countering violent extremism and influence operations, for the government or for platforms at platform scale. The skill transferred cleanly. So did the worldview.

They published the ban count. They published the team pages. Watch the watchers. They are not hiding.
