The machine teaches you to use the machine.
Forecast

The scoreboard

Every prediction here is dated and falsifiable. On its resolution date it gets graded hit or miss, in public, and this page recomputes my accuracy from the record. No pundit immunity: if I'm overconfident, the dots fall below the line.

4 predictions
4 awaiting
0 graded
n/a hit rate
No predictions have reached their resolution date yet; the calibration chart appears once at least 3 are graded.
| Prediction | Confidence | Resolves | Status |
| --- | --- | --- | --- |
| Agents become the default way people use AI: delegated multi-minute/hour tasks, not chat | 70% | 2027-12-31 | EXTRAPOLATION |
| Standard leaderboards saturate and the industry largely stops citing them | 65% | 2027-12-31 | EXTRAPOLATION |
| Open-weight models stay 1–2 years behind the frontier but good enough for most practical work | 60% | 2027-12-31 | EXTRAPOLATION |
| At least one widely-used production model is not a vanilla transformer | 45% | 2027-12-31 | EXTRAPOLATION |
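
For the curious, here is a minimal sketch of how a scoreboard like this could be recomputed from the record. It is not the code behind this page. The `Prediction` class, the function names, and the choice to group calibration points by exact stated confidence are all assumptions made for illustration; only the four predictions, their confidences, the resolution date, and the 3-graded threshold come from the page itself.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

MIN_GRADED_FOR_CHART = 3  # threshold stated on the page


@dataclass
class Prediction:  # hypothetical record type, not the page's actual schema
    text: str
    confidence: float           # stated probability, between 0 and 1
    resolves: date
    hit: Optional[bool] = None  # None until graded on the resolution date


def scoreboard(preds):
    """Recompute the headline counters from the full record."""
    graded = [p for p in preds if p.hit is not None]
    hits = sum(p.hit for p in graded)
    return {
        "predictions": len(preds),
        "awaiting": len(preds) - len(graded),
        "graded": len(graded),
        "hit_rate": hits / len(graded) if graded else None,
    }


def calibration_points(preds):
    """Return (stated confidence, observed hit rate) pairs, or [] if too few are graded.

    Assumption: predictions are grouped by their exact stated confidence.
    A real chart might use wider bins instead.
    """
    graded = [p for p in preds if p.hit is not None]
    if len(graded) < MIN_GRADED_FOR_CHART:
        return []
    buckets = {}
    for p in graded:
        buckets.setdefault(p.confidence, []).append(p.hit)
    return [(conf, sum(hits) / len(hits)) for conf, hits in sorted(buckets.items())]


RESOLVES = date(2027, 12, 31)
PREDICTIONS = [
    Prediction("Agents become the default way people use AI", 0.70, RESOLVES),
    Prediction("Standard leaderboards saturate and stop being cited", 0.65, RESOLVES),
    Prediction("Open-weight models stay 1-2 years behind the frontier", 0.60, RESOLVES),
    Prediction("A widely-used production model is not a vanilla transformer", 0.45, RESOLVES),
]

print(scoreboard(PREDICTIONS))
# {'predictions': 4, 'awaiting': 4, 'graded': 0, 'hit_rate': None}
print(calibration_points(PREDICTIONS))
# []
```

Once predictions are graded, each calibration point compares a stated confidence with the observed hit rate at that confidence. A point whose observed rate is lower than its stated confidence is the "dot below the line" case, meaning overconfidence.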