How Independent Baseball Projections Works — At a Glance
- Projects expected runs for each team using a Dual-Poisson model calibrated on park, weather, pitcher quality, and lineup data
- 30+ signals (pitcher quality, lineup confirmation, bullpen form, BvP, precipitation, line movement, and more) refine the base win probability through three pathways: run-environment inputs to the run rates, log-odds adjustments to the win probability, and confidence-shrinkage multipliers
- Platt-scaled on a 7,359-game 2022–24 out-of-sample backtest, lightly blended with 2026 live results as the live sample grows
- Edge = blended model probability − Pinnacle no-vig price. Only picks where the model sees a mispricing of at least 4 percentage points (pp) are posted as official plays. Pure model probability (no market input) is tracked separately for calibration; market-blended numbers are never used to evaluate model skill.
- Every pick is logged with a timestamp before first pitch. Closing-line value (CLV) is tracked against the closing line to check that the signal is real, not luck
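The Dual-Poisson step in the first bullet can be illustrated with a minimal sketch. It assumes each team's runs follow an independent Poisson distribution and that ties are split in proportion to each side's non-tie win share, as a stand-in for extra innings. The function names, the truncation at 30 runs, and the tie rule are illustrative assumptions, not the model's published implementation.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def dual_poisson_win_prob(lam_home: float, lam_away: float, max_runs: int = 30) -> float:
    """Home win probability from two independent Poisson run totals.

    Assumption: tied scores are redistributed in proportion to each
    side's non-tie win probability (a simple proxy for extra innings).
    """
    home_pmf = [poisson_pmf(k, lam_home) for k in range(max_runs + 1)]
    away_pmf = [poisson_pmf(k, lam_away) for k in range(max_runs + 1)]

    p_home = 0.0
    p_away = 0.0
    for h, ph in enumerate(home_pmf):
        for a, pa in enumerate(away_pmf):
            if h > a:
                p_home += ph * pa
            elif a > h:
                p_away += ph * pa

    # Dropping the tie mass and renormalizing implements the tie rule above.
    return p_home / (p_home + p_away)
```

Calling `dual_poisson_win_prob(4.8, 4.2)` returns the home side's win probability for hypothetical expected run rates of 4.8 and 4.2. In the pipeline described here, the run-environment signals would move those two rates before this step.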
Raw team strength, pitching, lineup, park, weather, and situational signals generate the baseline projection. Market information — specifically the book O/U total and Pinnacle no-vig probability — is used as a stabilizing input (via the book_total_constraint factor, α₁₆) to reduce extreme run-total outputs and align the model with the sharp betting market's run environment. After Platt calibration, the final Independent Baseball Projections probability is compared against the no-vig Pinnacle market probability to identify remaining pricing gaps where the model and market diverge.
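The calibrate-then-compare step can be sketched as follows. The sketch makes three assumptions that the text does not confirm: Platt scaling is applied to the logit of the raw probability with two fitted coefficients (`a`, `b`), the vig is removed by proportional normalization of the two implied probabilities, and odds are quoted in American format. All function names are hypothetical.

```python
import math


def american_to_implied(odds: int) -> float:
    """Implied probability (vig included) from American odds."""
    if odds < 0:
        return -odds / (-odds + 100)
    return 100 / (odds + 100)


def no_vig_probs(odds_a: int, odds_b: int) -> tuple[float, float]:
    """Remove the vig by proportional normalization (one of several methods)."""
    pa = american_to_implied(odds_a)
    pb = american_to_implied(odds_b)
    total = pa + pb
    return pa / total, pb / total


def platt_scale(p_raw: float, a: float, b: float) -> float:
    """Sigmoid recalibration of a raw probability: sigmoid(a * logit(p) + b).

    a and b are assumed to come from a fit on historical outcomes.
    """
    logit = math.log(p_raw / (1.0 - p_raw))
    return 1.0 / (1.0 + math.exp(-(a * logit + b)))


def edge_pp(model_prob: float, market_prob: float) -> float:
    """Edge in percentage points: model probability minus no-vig market price."""
    return 100.0 * (model_prob - market_prob)


def is_official_play(model_prob: float, market_prob: float, threshold_pp: float = 4.0) -> bool:
    """True when the edge meets the 4pp posting threshold stated in the text."""
    return edge_pp(model_prob, market_prob) >= threshold_pp
```

For example, `no_vig_probs(-120, 110)` yields the market's two-sided fair probabilities, and `is_official_play(platt_scale(p_raw, a, b), market_prob)` applies the posting rule to a calibrated model probability.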
Because the market is used as both a stabilizing input and a comparison benchmark, Independent Baseball Projections should be understood as a market-informed model rather than a fully market-independent projection. This is disclosed transparently; the model still identifies genuine pricing gaps in approximately 15–25% of games per day.
Inference Pipeline — Input → Transform → Output at Each Stage
Core Mathematics — Formulas Behind Each Pipeline Stage
Signal Importance — Avg. Abs. Log-Odds Contribution per Factor (αᵢ) · Relative Magnitudes Approximate · v2 = 2026 update
Feature Engineering — 30+ Signals (v2/v3: 10 new signals added 2026 · a few currently inactive, see note above)
From Model to Pick — A Practical Example
Out-of-Sample Backtest (Hypothetical) — 2022–2024 · Flat $100/Bet · 4pp+ Edge
Other Markets — Run Line & Over/Under Calibration
Honest Limitations — What This Model Does & Doesn't Do
- Backtests flatter themselves. The 2022–2024 numbers are hypothetical and were tuned over the same period they report. The live, timestamped record is the real test — judge the model there.
- The live sample is still small. A few months of forward picks can't yet confirm a durable edge; expect wide swings.
- CLV is still being validated. Closing-line value is the durability signal we trust most, but the live sample is thin and not yet conclusive — we'd rather show it honestly than over-claim.
- Market-informed, not independent. The model uses the sharp market as both a stabilizing input and a benchmark, so by design it won't diverge wildly from Pinnacle.
- Blind to late news. Once lineups post, the model doesn't react to scratches, first-pitch weather swings, or in-game information.
- A real edge still loses often. Even genuine value loses a large share of the time — only flat, disciplined, bankroll-aware staking survives the variance.
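The last limitation can be made concrete with a short calculation. The sketch below computes the exact binomial probability that a flat one-unit bettor is showing a net loss after a given number of bets. It assumes every bet has the same true win probability and the same decimal price, which real picks do not; the function name and any inputs you pass are purely illustrative.

```python
import math


def prob_losing_after(n_bets: int, win_prob: float, decimal_odds: float) -> float:
    """P(net profit < 0) after n flat one-unit bets at a fixed price.

    Assumes independent bets with identical win probability and odds.
    """
    total = 0.0
    for wins in range(n_bets + 1):
        # Each win pays (decimal_odds - 1) units; each loss costs 1 unit.
        profit = wins * (decimal_odds - 1.0) - (n_bets - wins)
        if profit < 0:
            total += (
                math.comb(n_bets, wins)
                * win_prob**wins
                * (1.0 - win_prob) ** (n_bets - wins)
            )
    return total
```

Comparing `prob_losing_after(100, 0.54, 2.0)` with `prob_losing_after(400, 0.54, 2.0)` shows how slowly the chance of being down shrinks as the sample grows, even when every bet is assumed to carry a 4pp edge at even money.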