Forecast
The scoreboard
Every prediction here is dated and falsifiable. On its resolution date it gets graded hit or miss, in public, and this page recomputes my accuracy from the record. No pundit immunity: if I'm overconfident, the dots fall below the line.
4 predictions
4 awaiting
0 graded
Hit rate: n/a
No predictions have reached their resolution date yet; the calibration chart appears once at least 3 are graded.
| Prediction | Confidence | Resolves | Status |
|---|---|---|---|
| Agents become the default way people use AI: delegated multi-minute/hour tasks, not chat | 70% | 2027-12-31 | EXTRAPOLATION |
| Standard leaderboards saturate and the industry largely stops citing them | 65% | 2027-12-31 | EXTRAPOLATION |
| Open-weight models stay 1–2 years behind the frontier but good enough for most practical work | 60% | 2027-12-31 | EXTRAPOLATION |
| At least one widely-used production model is not a vanilla transformer | 45% | 2027-12-31 | EXTRAPOLATION |
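
For the curious, here is a minimal sketch of how the counters and the calibration dots could be recomputed from the record. It is not this page's actual code: the `Prediction` class, `summarize`, and `calibration_points` are hypothetical names, and grouping graded predictions by their exact stated confidence is an assumption about how the chart is binned. Only the four rows in the table and the 3-graded threshold come from the page itself.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

MIN_GRADED_FOR_CHART = 3  # the page shows the chart once at least 3 are graded


@dataclass
class Prediction:
    claim: str
    confidence: float           # stated probability, between 0 and 1
    resolves: date
    hit: Optional[bool] = None  # None until graded on the resolution date


def summarize(record):
    """Recompute the scoreboard counters from the full record."""
    graded = [p for p in record if p.hit is not None]
    hits = sum(p.hit for p in graded)
    return {
        "predictions": len(record),
        "awaiting": len(record) - len(graded),
        "graded": len(graded),
        "hit_rate": hits / len(graded) if graded else None,
    }


def calibration_points(record):
    """Return (stated confidence, observed hit frequency, count) per level.

    A point whose observed frequency is lower than its stated confidence
    sits below the diagonal, which is what overconfidence looks like.
    Returns None while too few predictions are graded.
    """
    graded = [p for p in record if p.hit is not None]
    if len(graded) < MIN_GRADED_FOR_CHART:
        return None
    buckets = {}
    for p in graded:
        buckets.setdefault(p.confidence, []).append(p.hit)
    return [(c, sum(h) / len(h), len(h)) for c, h in sorted(buckets.items())]


resolves = date(2027, 12, 31)
record = [
    Prediction("Agents become the default way people use AI", 0.70, resolves),
    Prediction("Standard leaderboards saturate", 0.65, resolves),
    Prediction("Open-weight models stay 1-2 years behind the frontier", 0.60, resolves),
    Prediction("A widely-used production model is not a vanilla transformer", 0.45, resolves),
]

print(summarize(record))
print(calibration_points(record))
```

With nothing graded yet, `summarize` reports 4 predictions, 4 awaiting, 0 graded, and no hit rate, and `calibration_points` returns `None`, matching the scoreboard above.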