Project TD Tuning — Version 8

# V8 Context

Not every version moves the needle. V8 is one of those versions, and that’s worth saying plainly.

V7 showed us that calibrating the model against the full historical record made a real difference. So for V8 we went back to the calibration and turned up the pressure, applying more regularisation to prevent any single signal from dominating the others. The idea was that a more balanced model might handle the stretches where V7 struggled.

It didn’t change anything. Week by week, the predictions landed in almost exactly the same places. The same games the model called correctly in V7, it called correctly in V8. The same games it missed, it missed again. V8 ran twelve weeks and stopped at the same point for the same reason.

That’s not failure; it’s information. It tells us the model’s behavior is stable. It tells us that the adjustment we made, while reasonable, isn’t what’s limiting performance. The problem isn’t in the calibration settings. It’s somewhere else.

We don’t know exactly where yet. Weeks 11 and 12 keep coming up short, and we haven’t cracked why. That’s the question V9 will go after: not another calibration tweak, but a harder look at what those weeks have in common that the model isn’t seeing.

One observation worth noting: Weeks 11 and 12 of the 2000 season fell in mid-to-late November, right in the middle of the Bush-Gore Florida recount. The country was in a sustained state of political uncertainty from election night on November 7th through December. The model doesn’t see any of that. It sees what teams did on the field. Whether that kind of national disruption bleeds into how games are played or called is an open question, but it’s the kind of thing our model currently has no way to account for. Local disruptions (a natural disaster near a stadium, a game relocated, unusual weather that wasn’t predicted) fall into the same blind spot. It’s on the list for future versions.

Sometimes the news is that you ran the experiment and the experiment told you what doesn’t work. That’s still progress.

Weeks Run: 12

Accuracy: 65%

Avg Error: 10.1 pts

Season	Week	Correct/Games	Weekly Accuracy	Cumulative Accuracy
2000	1	13/15	87%	86.67%
2000	2	9/15	60%	73.33%
2000	3	9/14	64%	70.45%
2000	4	8/14	57%	67.24%
2000	5	10/14	71%	68.06%
2000	6	8/14	57%	66.28%
2000	7	12/14	86%	69.00%
2000	8	11/14	79%	70.18%
2000	9	7/14	50%	67.97%
2000	10	10/15	67%	67.83%
2000	11	8/15	53%	66.46%
2000	12	8/15	53%	65.32%
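
For anyone who wants to check the cumulative column, here is a minimal sketch of how it can be reproduced from the weekly results. It assumes the second number in each pair is the total games picked that week and that cumulative accuracy is total correct picks divided by total games so far. The names `weekly` and `cumulative_accuracy` are made up for this illustration and are not taken from the project's code.

```python
from itertools import accumulate

# Weekly (correct picks, games) pairs copied from the 2000-season table above.
weekly = [
    (13, 15), (9, 15), (9, 14), (8, 14), (10, 14), (8, 14),
    (12, 14), (11, 14), (7, 14), (10, 15), (8, 15), (8, 15),
]


def cumulative_accuracy(results):
    """Return the running accuracy in percent after each week."""
    correct_totals = accumulate(correct for correct, _ in results)
    game_totals = accumulate(games for _, games in results)
    return [round(100 * c / g, 2) for c, g in zip(correct_totals, game_totals)]


running = cumulative_accuracy(weekly)
for week, ((correct, games), cum) in enumerate(zip(weekly, running), start=1):
    weekly_pct = 100 * correct / games
    print(f"Week {week:>2}: {correct}/{games} = {weekly_pct:.0f}%  cumulative {cum:.2f}%")
```

Under those assumptions the running totals end at 113 correct out of 173 games, which is where the 65% headline figure comes from.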

For Science and the love of the game!!! What you do with this data is up to you!!!