## Why compute

Regulators want to catch frontier risk without freezing the whole field. They need a trigger that's measurable, hard to game, and correlated with capability. Training compute (the number of floating-point operations used during model training) is the best proxy currently available.
## The main thresholds (as of ~2025)

| Regime | Threshold | Applies to |
|---|---|---|
| EU AI Act (systemic risk) | 10^25 FLOPs | General-purpose models |
| Biden EO 14110 (rescinded) | 10^26 FLOPs | Reporting to US government |
| Biden EO (bio subset) | 10^23 FLOPs | Biologically-focused models |
| California SB 1047 (vetoed) | 10^26 FLOPs + $100M | Covered models |
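As a quick sanity aid, the regimes and FLOP values above can be encoded as a lookup. This is an illustrative sketch, not legal logic: the `THRESHOLDS` dict and `triggered_regimes` function are invented names, and real applicability turns on model type and extra criteria (the bio-subset row applies only to biologically-focused models, and SB 1047 also required roughly $100M in training cost).

```python
# Illustrative lookup of the compute thresholds in the table above.
# Regime names and FLOP values come from the table; everything else
# (names, structure) is a sketch, not how any regulator evaluates models.
THRESHOLDS = {
    "EU AI Act (systemic risk)": 1e25,
    "Biden EO 14110 (rescinded)": 1e26,
    "Biden EO (bio subset)": 1e23,        # biologically-focused models only
    "California SB 1047 (vetoed)": 1e26,  # also required ~$100M training cost
}

def triggered_regimes(training_flops: float) -> list[str]:
    """Return the regimes whose compute threshold is met or exceeded."""
    return [name for name, floor in THRESHOLDS.items() if training_flops >= floor]

# A model trained with ~2e25 FLOPs (GPT-4 scale) clears the EU line but
# stays under both 1e26 lines.
print(triggered_regimes(2e25))
```

Note that a pure FLOP comparison over-triggers the bio-subset row for general-purpose models, which is exactly the kind of scoping detail the raw numbers leave out.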
## Scale context

GPT-4 is estimated at roughly 2×10^25 FLOPs. Llama 3.1 405B at roughly 4×10^25. Future frontier models will cross 10^26. The thresholds aren't arbitrary: they're set just above the current top of the field and below the next generation.

## Why this approach has critics

- Algorithmic efficiency delivers the same capability at lower compute over time, so thresholds drift
- Small specialized models can be dangerous without being compute-heavy
- Inference compute (test-time reasoning like o1) is not captured by training FLOPs
- Distillation can transfer capability from a big model to a small one
- Open-source releases create compute-independent diffusion

## Better proxies needed

Researchers are exploring capability-based triggers (can the model do dangerous task X?) as alternatives. These are harder to measure but more meaningful. Expect compute thresholds to be joined by capability triggers over the next several years.

Key terms: FLOP · compute threshold · training compute · inference compute

The big idea: compute is the only thing regulators can easily count. That makes it the default regulatory hook, for better and worse.
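The scale figures above can be sanity-checked with the widely used approximation that dense-transformer training compute is about 6 × parameters × training tokens. A back-of-envelope sketch (the function name is ours, and the ~15-trillion-token figure for Llama 3.1 is a rough public estimate; treat all results as order-of-magnitude):

```python
# Back-of-envelope training-compute estimate using the common
# C ≈ 6 * N * D approximation (N = parameters, D = training tokens).
# Inputs are rough public figures; results are order-of-magnitude only.

def training_flops(params: float, tokens: float) -> float:
    """Approximate training FLOPs for a dense transformer."""
    return 6 * params * tokens

# Llama 3.1 405B, trained on roughly 15 trillion tokens:
c = training_flops(405e9, 15e12)
print(f"{c:.1e}")  # on the order of 4e25: above the EU's 1e25 line,
                   # below the (rescinded) US EO's 1e26 line
```

This is also why threshold drift matters: the same formula says a 10x efficiency gain in tokens-per-capability pulls tomorrow's equivalent model an order of magnitude under today's line.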
## Key insight

Almost every AI regulation uses training compute as a trigger: 10^25 here, 10^26 there. Why compute, and why those numbers?

## End-of-lesson check

15 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-safety2-compute-thresholds-builders
1. What is the core idea behind "Compute Thresholds: Regulating by FLOPs"?
   - a) Almost every AI regulation uses training compute as a trigger. 10^25 here, 10^26 there. Why compute, and why those numbers?
   - b) training-time alignment
   - c) Attackers need one path. Defenders must close all paths.
   - d) Coordination with the US AI Safety Institute (later CAISI) and similar bodies

2. Which term best describes a foundational idea in "Compute Thresholds: Regulating by FLOPs"?
   - a) compute threshold
   - b) FLOP
   - c) training compute
   - d) inference compute

3. A learner studying Compute Thresholds: Regulating by FLOPs would need to understand which concept?
   - a) FLOP
   - b) training compute
   - c) compute threshold
   - d) inference compute

4. Which of these is directly relevant to Compute Thresholds: Regulating by FLOPs?
   - a) FLOP
   - b) compute threshold
   - c) inference compute
   - d) training compute

5. Which of the following is a key point about Compute Thresholds: Regulating by FLOPs?
   - a) Algorithmic efficiency means same capability at lower compute over time — thresholds drift
   - b) Small specialized models can be dangerous without being compute-heavy
   - c) Inference compute (test-time reasoning like o1) is not captured by training FLOPs
   - d) Distillation can transfer capability from a big model to a small one

6. Which of these does NOT belong in a discussion of Compute Thresholds: Regulating by FLOPs?
   - a) Algorithmic efficiency means same capability at lower compute over time — thresholds drift
   - b) Small specialized models can be dangerous without being compute-heavy
   - c) Inference compute (test-time reasoning like o1) is not captured by training FLOPs
   - d) training-time alignment

7. What is the key insight about "Scale context" in the context of Compute Thresholds: Regulating by FLOPs?
   - a) training-time alignment
   - b) Attackers need one path. Defenders must close all paths.
   - c) GPT-4 is estimated at roughly 2×10^25 FLOPs. Llama 3.1 405B at roughly 4×10^25. Future frontier models will cross 10^26.
   - d) Coordination with the US AI Safety Institute (later CAISI) and similar bodies

8. What is the key insight about "Better proxies needed" in the context of Compute Thresholds: Regulating by FLOPs?
   - a) training-time alignment
   - b) Attackers need one path. Defenders must close all paths.
   - c) Coordination with the US AI Safety Institute (later CAISI) and similar bodies
   - d) Researchers are exploring capability-based triggers — can the model do dangerous task X? — as alternatives.

9. Which statement accurately describes an aspect of Compute Thresholds: Regulating by FLOPs?
   - a) Regulators want to catch frontier risk without freezing the whole field.
   - b) training-time alignment
   - c) Attackers need one path. Defenders must close all paths.
   - d) Coordination with the US AI Safety Institute (later CAISI) and similar bodies

10. What does working with Compute Thresholds: Regulating by FLOPs typically involve?
    - a) training-time alignment
    - b) The big idea: compute is the only thing regulators can easily count. That makes it the default regulatory hook, for better and worse.
    - c) Attackers need one path. Defenders must close all paths.
    - d) Coordination with the US AI Safety Institute (later CAISI) and similar bodies

11. Which best describes the scope of "Compute Thresholds: Regulating by FLOPs"?
    - a) It is unrelated to ethics workflows
    - b) It applies only to the opposite beginner tier
    - c) It focuses on how almost every AI regulation uses training compute as a trigger: 10^25 here, 10^26 there
    - d) It was deprecated in 2024 and no longer relevant

12. Which section heading best belongs in a lesson about Compute Thresholds: Regulating by FLOPs?
    - a) training-time alignment
    - b) Attackers need one path. Defenders must close all paths.
    - c) Coordination with the US AI Safety Institute (later CAISI) and similar bodies
    - d) The main thresholds (as of ~2025)

13. Which section heading best belongs in a lesson about Compute Thresholds: Regulating by FLOPs?
    - a) Why this approach has critics
    - b) training-time alignment
    - c) Attackers need one path. Defenders must close all paths.
    - d) Coordination with the US AI Safety Institute (later CAISI) and similar bodies

14. Which of the following is a concept covered in Compute Thresholds: Regulating by FLOPs?
    - a) compute threshold
    - b) FLOP
    - c) training compute
    - d) inference compute

15. Which of the following is a concept covered in Compute Thresholds: Regulating by FLOPs?
    - a) FLOP
    - b) training compute
    - c) compute threshold
    - d) inference compute