Forecast

The scoreboard

Every prediction here is dated and falsifiable. On its resolution date it gets graded hit or miss, in public, and this page recomputes my accuracy from the record. No pundit immunity: if I'm overconfident, the dots on the calibration chart fall below the perfect-calibration line.

Predictions: 4
Awaiting: 4
Graded: 0
Hit rate: n/a

No predictions have reached their resolution date yet — the calibration chart appears once at least 3 are graded.

Prediction	Confidence	Resolves	Status
Agents become the default way people use AI — delegated multi-minute/hour tasks, not chat	70%	2027-12-31	EXTRAPOLATION
Standard leaderboards saturate and the industry largely stops citing them	65%	2027-12-31	EXTRAPOLATION
Open-weight models stay 1–2 years behind the frontier but good enough for most practical work	60%	2027-12-31	EXTRAPOLATION
At least one widely-used production model is not a vanilla transformer	45%	2027-12-31	EXTRAPOLATION
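
The bookkeeping described above (count what is awaiting, grade what has resolved, recompute the hit rate, and only draw the calibration chart once at least 3 predictions are graded) can be sketched roughly as follows. This is a minimal illustration, not the page's actual code: the names `Prediction`, `scoreboard`, `calibration_points`, and the `hit` field are hypothetical, "awaiting" is assumed to mean "not yet graded", and grouping calibration points by exact stated confidence is an assumption about how the chart might be built.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

MIN_GRADED_FOR_CHART = 3  # threshold stated on the page


@dataclass
class Prediction:
    claim: str
    confidence: float            # stated probability, 0..1
    resolves: date
    hit: Optional[bool] = None   # None until graded (hypothetical field)


def scoreboard(preds):
    """Recompute the headline counters from the full record."""
    graded = [p for p in preds if p.hit is not None]
    hits = sum(p.hit for p in graded)
    return {
        "predictions": len(preds),
        "awaiting": len(preds) - len(graded),  # assumption: awaiting == ungraded
        "graded": len(graded),
        "hit_rate": hits / len(graded) if graded else None,
    }


def calibration_points(preds):
    """Return (stated confidence, observed hit rate, n) per confidence level.

    Returns an empty list until MIN_GRADED_FOR_CHART predictions are graded.
    """
    graded = [p for p in preds if p.hit is not None]
    if len(graded) < MIN_GRADED_FOR_CHART:
        return []
    groups = {}
    for p in graded:
        groups.setdefault(p.confidence, []).append(p.hit)
    return [(c, sum(h) / len(h), len(h)) for c, h in sorted(groups.items())]


# The four predictions from the table, all still ungraded.
preds = [
    Prediction("Agents become the default way people use AI", 0.70, date(2027, 12, 31)),
    Prediction("Standard leaderboards saturate", 0.65, date(2027, 12, 31)),
    Prediction("Open-weight models stay 1-2 years behind the frontier", 0.60, date(2027, 12, 31)),
    Prediction("A widely-used production model is not a vanilla transformer", 0.45, date(2027, 12, 31)),
]

print(scoreboard(preds))
print(calibration_points(preds))
```

With nothing graded, the hit rate is undefined and no calibration points are produced, which matches the current state of the scoreboard. Once predictions resolve, a point whose observed hit rate sits below its stated confidence is the overconfident case.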