IBP
INDEPENDENT BASEBALL PROJECTIONS
Methodology · Technical Reference · Dual-Poisson Win Probability Engine
30+ Signal Model · 3 Signal Pathways · Dual-Poisson · Platt Calibrated · CLV Tracked · 7,359 Games OOS

How Independent Baseball Projections Works — At a Glance

Model Type: Dual-Poisson (λ_home + λ_away → CDF)
Signal Pathways: 3 · 30+ signals (λ-input · log-odds · shrink)
Adjustments: 30+ signals (logit(p) += Σ αᵢ)
Calibration: Platt sigmoid (OOS fit + live blend)
Bet Filter: edge ≥ 4pp (vs. no-vig Pinnacle)
Sizing Method: Fractional Kelly / 3u (bankroll fraction)
OOS Period: 2022 – 2024 (3 seasons, sequential)
Brier Score: 0.234 (vs. 0.250 market · ↓ better)
Independent Baseball Projections is a market-informed model

Raw team strength, pitching, lineup, park, weather, and situational signals generate the baseline projection. Market information — specifically the book O/U total and Pinnacle no-vig probability — is used as a stabilizing input (via the book_total_constraint factor, α₁₆) to reduce extreme run-total outputs and align the model with the sharp betting market's run environment. After Platt calibration, the final Independent Baseball Projections probability is compared against the no-vig Pinnacle market probability to identify remaining pricing gaps where the model and market diverge.

Because the market is used as both a stabilizing input and a comparison benchmark, Independent Baseball Projections should be understood as a market-informed model rather than a fully market-independent projection. This is disclosed transparently; the model still identifies genuine pricing gaps in approximately 15–25% of games per day.

Inference Pipeline — Input → Transform → Output at Each Stage

01 📡 Data Ingestion
   IN:  Cron trigger · game schedule · ump assignments · weather coords
   OUT: Raw JSON: odds, lineups, weather, ump_id, FIP data per game
02 ⚙️ Feature Eng.
   IN:  Raw JSON dicts per game
   OUT: ~50 numeric inputs per game (home/away): xFIP, xERA, K-BB%, OPS splits, BvP, arsenal fit, park, weather, ump…
03 λ Construction
   IN:  RS₃ᵧᵣ, RS_L15, park_factor, book O/U
   OUT: (λ_home, λ_away) floats: expected runs per team
04 📊 Poisson Win Prob
   IN:  (λ_home, λ_away)
   OUT: p_base ∈ (0,1), exact CDF summation
05 Σα Log-Odds Adj.
   IN:  p_base + 30+ signal adjustments
   OUT: p_adj = sigmoid(logit(p_base) + Σαᵢ)
06 σ Platt Calib.
   IN:  p_adj (raw model output)
   OUT: p_cal = σ(A·logit(p_adj)+B), fitted
07 Edge Detection
   IN:  p_cal + p_nv_Pinnacle
   OUT: edge float; bet_flag if ≥ 0.04
08 💰 Kelly Sizing
   IN:  edge, decimal_odds, bankroll
   OUT: stake (units); cap 3u; log → picks_log.csv
λ Run Expectancy Construction
Base lambda — form-blended, park-adjusted
λ_base = (w₁ × RS₃ᵧᵣ + w₂ × RS_L15) / G × park_factor // RS₃ᵧᵣ: 3yr rolling RS/G · RS_L15: last 15G avg
Log-odds stacking — signal factors in logit space
logit(p) = log(p / (1 − p))
logit(p_adj) = logit(p_base) + Σᵢ αᵢ   // αᵢ ∈ ℝ · logit prevents p ∉ (0,1)
p_adj = sigmoid(logit(p_adj))
The blend favors the multi-year base while still weighting recent (L15) form to capture hot/cold streaks without overreacting to small samples. Park adjustment applied symmetrically to both offensive and defensive λ.
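A minimal sketch of the two formulas in this card, under stated assumptions: the weights w1 and w2 are not published, so the defaults below are placeholders, and both run-scoring inputs are taken to be already expressed per game (so no further division by G is applied). The function names are illustrative, not the model's own.

```python
import math

def lambda_base(rs_3yr_per_game, rs_l15_per_game, park_factor, w1=0.7, w2=0.3):
    """Blend multi-year and last-15-game runs per game, then park-adjust.

    w1 and w2 are placeholder weights (assumption); they should sum to 1.
    """
    return (w1 * rs_3yr_per_game + w2 * rs_l15_per_game) * park_factor

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def sigmoid(x):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-x))

def stack_log_odds(p_base, alphas):
    """Add signal adjustments in logit space, then map back to a probability."""
    return sigmoid(logit(p_base) + sum(alphas))

if __name__ == "__main__":
    lam = lambda_base(4.5, 5.0, park_factor=1.02)
    p_adj = stack_log_odds(0.531, [0.03, -0.01, 0.02])
    print(f"lambda_base={lam:.3f}  p_adj={p_adj:.4f}")
```

Working in logit space means any sum of adjustments, however large, still maps back to a probability strictly inside (0, 1).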
📊 Poisson Distribution & Win Probability
Run distributions — modeled independently
R_home ~ Poisson(λ_home)
R_away ~ Poisson(λ_away)
P(X=k ; λ) = exp(−λ) · λ^k / k!
Win probability — exact CDF (no simulation)
P(home wins) = Σ_{k=1}^∞ P(R_home=k) · CDFPois(λ_away, k−1) // Tie → extra innings as 50/50
No Monte Carlo needed — win prob is computed exactly from Poisson CDFs. Teams are modeled as independent (no run correlation). Run-environment signals (park, weather, umpire, platoon, baserunning) enter via λ; log-odds adjustments and confidence-shrinkage multipliers then adjust the derived win probability.
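The exact summation can be sketched with the standard library alone. This is an illustration of the stated formula, not the production code: the truncation at 40 runs and the function names are assumptions, and ties are split 50/50 as the card describes.

```python
import math

def poisson_pmf_table(lam, max_runs):
    """P(X = k) for k = 0..max_runs, built iteratively to avoid big factorials."""
    pmf = [math.exp(-lam)]
    for k in range(1, max_runs + 1):
        pmf.append(pmf[-1] * lam / k)
    return pmf

def home_win_prob(lam_home, lam_away, max_runs=40):
    """P(home scores more) + 0.5 * P(tie) for independent Poisson run totals."""
    home = poisson_pmf_table(lam_home, max_runs)
    away = poisson_pmf_table(lam_away, max_runs)

    p_home_more = 0.0
    cdf_away = 0.0  # running P(R_away <= k - 1)
    for k in range(1, max_runs + 1):
        cdf_away += away[k - 1]
        p_home_more += home[k] * cdf_away

    p_tie = sum(h * a for h, a in zip(home, away))
    return p_home_more + 0.5 * p_tie

if __name__ == "__main__":
    print(f"{home_win_prob(4.2, 3.8):.4f}")
```

Truncating at 40 runs discards a negligible amount of probability mass for realistic baseball run rates.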
σ Platt Scaling Calibration
Sigmoid recalibration — fitted OOS 2022–2024
p_cal = σ(A · logit(p_raw) + B)
      = 1 / (1 + exp(−(A · logit(p_raw) + B)))
Parameter interpretation
A < 1   // model over-expresses confidence; compress toward 50%
B > 0   // home-team bias correction
Fitted via MLE on the 2022–24 out-of-sample backtest, lightly blended with 2026 live outcomes as the sample grows (exact A, B not published). Brier Score: 0.234 model vs. 0.250 market — 6.4% improvement. A < 1 confirms the raw Poisson systematically over-prices favorites; calibration compresses the logit toward the empirical base rate.
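As a rough illustration of how the calibration step and the Brier comparison fit together, the sketch below applies a Platt transform and scores forecasts. The A and B values are placeholders chosen only to satisfy A < 1 and B > 0, since the fitted constants are not published.

```python
import math

A_DEMO = 0.8   # hypothetical slope (A < 1 compresses toward 50%)
B_DEMO = 0.02  # hypothetical intercept (B > 0 nudges toward the home team)

def platt(p_raw, a=A_DEMO, b=B_DEMO):
    """p_cal = sigmoid(a * logit(p_raw) + b)."""
    z = a * math.log(p_raw / (1.0 - p_raw)) + b
    return 1.0 / (1.0 + math.exp(-z))

def brier(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

def relative_improvement(model_brier, market_brier):
    """Fractional Brier reduction versus the benchmark (lower Brier is better)."""
    return (market_brier - model_brier) / market_brier

if __name__ == "__main__":
    print(f"raw 0.70 -> calibrated {platt(0.70):.4f}")
    print(f"improvement: {relative_improvement(0.234, 0.250):.3f}")
```

With the quoted scores, (0.250 − 0.234) / 0.250 gives the 6.4% improvement stated above.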
⚖️ No-Vig Market Probability Strip
American odds → raw implied probability
p_raw(ml) = |ml| / (|ml| + 100)   if ml < 0 (favorite)
p_raw(ml) = 100 / (ml + 100)      if ml > 0 (underdog)
Simultaneous both-side margin removal
hold = p_raw(h_ml) + p_raw(a_ml) − 1
p_nv_home = p_raw(h_ml) / (p_raw(h_ml) + p_raw(a_ml))
// Pinnacle benchmark: hold ≈ 2.5%
Simultaneous stripping avoids asymmetric vig attribution — equivalent to the multiplicative method. Pinnacle's ~2.5% hold makes its no-vig the tightest available reflection of sharp market consensus.
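The conversion and margin strip can be sketched as follows. The function names are illustrative; the arithmetic follows the two formulas in this card.

```python
def implied_prob(ml):
    """Raw implied probability from American moneyline odds (ml must be nonzero)."""
    if ml < 0:
        return -ml / (-ml + 100.0)   # favorite
    return 100.0 / (ml + 100.0)      # underdog

def strip_vig(home_ml, away_ml):
    """Return (p_nv_home, p_nv_away, hold) using proportional normalization."""
    p_home = implied_prob(home_ml)
    p_away = implied_prob(away_ml)
    total = p_home + p_away
    return p_home / total, p_away / total, total - 1.0

if __name__ == "__main__":
    nv_home, nv_away, hold = strip_vig(-120, +110)
    print(f"home={nv_home:.4f} away={nv_away:.4f} hold={hold:.4f}")
```

Dividing both sides by the same total is what keeps the vig attribution symmetric.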
⚡ Edge Detection & Stake Sizing
Model edge vs. no-vig Pinnacle
edge = p_cal − p_nv_Pinnacle // Bet flagged if edge ≥ 0.04
Fractional Kelly sizing — conservative multi-step reduction
b = decimal_odds − 1
p = p_cal ; q = 1 − p
f* = 0.5 × (b · p − q) / b
// units = f* × bankroll ; cap: 3u
Stake sizes use a conservative fractional Kelly approach: base fraction × verdict multiplier × display fraction. CONDITIONAL picks (4–6pp) are full-sized; REDUCED CONF. picks are sized at half the normal rate; FLAGGED picks at ¾. The 3u cap prevents over-sizing on high-edge outliers.
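A hedged sketch of the edge gate and sizing described in this card. It assumes the bankroll is expressed in units and folds the verdict and display multipliers into a single `verdict_multiplier` argument; both choices, like the function names, are illustrative rather than the model's actual interface.

```python
def half_kelly_fraction(p_cal, decimal_odds, fraction=0.5):
    """f* = fraction * (b*p - q) / b, floored at zero for negative-EV prices."""
    b = decimal_odds - 1.0
    q = 1.0 - p_cal
    return max(0.0, fraction * (b * p_cal - q) / b)

def stake_units(p_cal, p_nv_market, decimal_odds, bankroll_units,
                min_edge=0.04, cap_units=3.0, verdict_multiplier=1.0):
    """Stake in units: zero below the edge gate, capped at cap_units above it."""
    edge = p_cal - p_nv_market
    if edge < min_edge:
        return 0.0
    f_star = half_kelly_fraction(p_cal, decimal_odds) * verdict_multiplier
    return min(cap_units, f_star * bankroll_units)

if __name__ == "__main__":
    # Half-size stake for a reduced-confidence pick (multiplier is an assumption).
    print(stake_units(0.55, 0.50, 2.0, bankroll_units=40, verdict_multiplier=0.5))
```

The cap only binds when edge and bankroll are both large, which is the outlier case the text describes.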
📈 Closing Line Value — Process Validation
Definition — Pinnacle no-vig both sides
CLV = p_close_nv − p_open_nv // Snapshot: pre-game closing line (CT)
Interpretation
CLV > 0 → market moved toward pick
CLV < 0 → market moved against pick
// +1% avg CLV ≈ long-run positive EV
// Separates skill from short-run variance
CLV is the gold standard for betting process validation — a model can run cold for 50 games but still show positive CLV, confirming the signal is real. Tracked per-bet in picks_log.csv, surfaced via --poster-stats.
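The per-bet definition and its average can be sketched in a few lines. The input here is a plain list of (open, close) no-vig probabilities for the picked side; how these are stored in picks_log.csv is not specified, so no column names are assumed.

```python
def clv(p_open_nv, p_close_nv):
    """Closing line value for the picked side, in probability points."""
    return p_close_nv - p_open_nv

def mean_clv(pairs):
    """Average CLV over (p_open_nv, p_close_nv) pairs."""
    return sum(clv(o, c) for o, c in pairs) / len(pairs)

if __name__ == "__main__":
    sample = [(0.500, 0.520), (0.550, 0.540), (0.480, 0.495)]
    print(f"mean CLV = {mean_clv(sample):+.4f}")
```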

Signal Importance — Avg. Abs. Log-Odds Contribution per Factor (αᵢ) · Relative Magnitudes Approximate · v2 = 2026 update

The model's 30+ signals flow through three pathways — run-environment inputs to the Poisson run rates, log-odds adjustments to the win probability, and confidence-shrinkage multipliers. They are grouped by domain below for readability; importances reflect the model's design and magnitudes are approximate. Several factors marked inactive below do not currently affect live picks — retired after backtest validation (bullpen fatigue, pitcher-form slope) or awaiting data wiring (catcher framing, bullpen availability, career H2H matchup). In practice, live discrimination is concentrated in the pitcher-xFIP and run-environment core; the situational adjustments are deliberately small.
⚾ Pitcher · 5 factors · α₁–α₅
α₁ · Starter xFIP matchup · 0.18
α₅ · Bullpen xFIP tiered · 0.10
α₂ · Platoon FIP split · 0.09
α₃ · SwStr% plate discipline · 0.05
α₄ · Pitcher form slope L5 · inactive · off
🌤 Situational · 6 factors · α₆–α₁₁
α₆ · Park factor · 0.12
α₁₀ · Defensive OAA → RAA · 0.07
α₉ · Umpire run tendency · 0.06
α₇ · Weather: wind carry · 0.05
α₁₁ · Catcher framing runs · inactive · off
α₈ · Weather: temperature · 0.03
📈 Market & Lineup · 7 factors · α₁₂–α₁₈
α₁₂ · Lineup OPS split (hand) · 0.11
α₁₆ · O/U market constraint · 0.08
α₁₃ · Lineup quality delta · 0.07
α₁₅ · Home field advantage · 0.06
α₁₇ · Rolling RS/RA form · 0.05
α₁₄ · Bullpen availability · inactive · off
α₁₈ · Rest days differential · 0.03
⚡ Advanced Signals · 15 factors · α₁₉–α₃₃
α₁₉ · Baserunning quality (BsR), lambda multiplier · 0.03
α₂₀ · Travel fatigue / back-to-back scheduling · 0.02
α₂₁ · Career H2H matchup ERA vs opponent · inactive · off
α₂₂ · Statcast xERA blend · 0.09
α₂₇ · Bullpen 14-day rolling xFIP · 0.07
α₂₄ · K-BB% command signal · 0.06
α₂₈ · BvP career matchup · 0.05
α₂₃ · Times-through-order penalty · 0.04
α₂₉ · Arsenal fit score · 0.03
α₂₅/₂₆ · Opener / lineup certainty shrinkage · 0.02
α₃₀ · Stuff+ composite (FanGraphs sp_pitching) · 0.02
α₃₂ · Recent pitcher form: ERA last 3 starts · 0.02
α₃₁ · Precipitation probability shrinkage · 0.01
α₃₃ · Line movement: intraday Pinnacle sharp-money signal · 0.02

Feature Engineering — 30+ Signals (v2/v3: 10 new signals added 2026 · a few currently inactive, see note above)

Pitcher
MLB Stats API · Baseball Savant
starter_xfip
Regressed ERA estimator: a FIP variant that replaces the pitcher's actual home runs with an expected total (fly balls × league HR/FB rate), removing HR/FB variance on top of the BABIP luck that FIP already strips out. Primary pitcher quality signal, computed in-house from Statcast batted-ball data and MLB Stats components so it stays reliable; FanGraphs values are used when reachable. Blended 70% individual / 30% league average to reduce overconfidence at extremes.
pitcher_quality_xfip v4
Display-only approximation of the starter's xFIP contribution to the Poisson λ ratio. Shows the pick team's starter quality advantage vs opponent — e.g. +1.9pp when facing a 4.77 xFIP opponent with a 2.31 xFIP starter. Not a separate log-odds adjustment (would double-count); captures the Poisson core contribution explicitly.
bullpen_quality_xfip v4
Same approximation applied to the bullpen innings share (~42% at default projected IP). Surfaces bullpen quality differential as a visible signal when one team's 'pen is meaningfully better than the other's.
projected_starter_ip v4
Phase 1 IP projection: weighted blend of season IP/start (60%), last-5 starts (25%), and last-3 starts (15%) with Bayesian smoothing toward 5.3 IP league average. Rest adjustment (−0.4 short rest to +0.1 extra rest) and pitch-count fatigue applied. Controls the starter/bullpen xFIP split weight — a starter projected for 6.5 IP weights the starter signal at 72% vs 28% bullpen.
xera_blend v2
Statcast xERA (contact quality from exit velocity/launch angle) blended with xFIP — up to 30% weight at 300+ PA faced. Captures contact suppression independent of K/BB.
k_minus_bb_pct v2
K%−BB% (MLB Stats API) — direct command + dominance signal. Complements SwStr% by capturing walk suppression that swinging-strike rate misses.
tto_penalty v2
Times-through-order degradation: ~0.25 runs/TTO beyond the first. Batters adapt; a starter projected for 7+ IP faces measurably higher opponent scoring late.
platoon_fip_split
LHH/RHH xFIP delta weighted by opposing lineup handedness % (per-batter split, not team-level).
swstr_pct
Swinging-strike rate from Baseball Savant. Leading indicator for K% — predicts FIP before results converge.
bullpen_xfip_tiered
A tiered closer / setup / middle-relief xFIP blend. Fatigue-adjusted by recent appearance counts per tier.
bullpen_recent_xfip v2
14-day rolling bullpen xFIP blended 30/70 with season average. Captures in-season bullpen volatility that season-long averages smooth over.
arsenal_fit v2
Statcast pitch-mix matchup score (−1 to +1): how well the starter's pitch category usage (power FB, breaking, offspeed) suppresses the opposing lineup's handedness profile.
pitcher_form_slope inactive
OLS slope of xFIP over last 5 starts. Currently inactive — retired after the 5-start slope showed no reliable directional edge (mean-reversion; market already prices recent form).
stuff_plus v3
FanGraphs sp_pitching composite (100 = avg). Measures raw pitch quality — velocity, movement, release point — independently of outcomes. A small, best-effort signal (weight ~0.02) sourced only from FanGraphs; omitted when that source is unavailable.
recent_pitcher_era v3
Actual ERA over last 3 starts vs. season xFIP baseline. Captures hot/cold streaks and mechanical changes that peripheral stats deliberately strip out.
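Several of the pitcher entries above describe simple blends, which the sketch below illustrates. The 60/25/15 weights, the 5.3 IP league average, and the 30% xERA ceiling at 300 PA come from the text. The smoothing strength (`prior_starts`), the linear ramp of the xERA weight, and the IP / 9 rule for the starter share are assumptions chosen for illustration; IP / 9 happens to reproduce the quoted 72% at 6.5 IP.

```python
LEAGUE_AVG_IP = 5.3  # league-average innings per start, from the text

def projected_ip(season_ip, last5_ip, last3_ip, starts,
                 prior_starts=5.0, rest_adj=0.0):
    """Weighted IP blend, shrunk toward the league average for small samples.

    prior_starts (pseudo-count) is a hypothetical smoothing strength.
    """
    blend = 0.60 * season_ip + 0.25 * last5_ip + 0.15 * last3_ip
    smoothed = (starts * blend + prior_starts * LEAGUE_AVG_IP) / (starts + prior_starts)
    return smoothed + rest_adj

def starter_share(ip):
    """Share of a 9-inning game covered by the starter (assumed IP / 9)."""
    return max(0.0, min(ip, 9.0)) / 9.0

def xera_weight(pa_faced, max_weight=0.30, full_weight_pa=300):
    """xERA weight, ramping linearly (assumption) to 30% at 300+ PA faced."""
    return max_weight * min(pa_faced / full_weight_pa, 1.0)

def blended_starter_metric(xfip, xera, pa_faced):
    """Mix xFIP with xERA according to the sample-size-dependent weight."""
    w = xera_weight(pa_faced)
    return (1.0 - w) * xfip + w * xera

if __name__ == "__main__":
    ip = projected_ip(6.1, 6.4, 6.7, starts=12, rest_adj=0.1)
    print(f"projected IP={ip:.2f}  starter share={starter_share(ip):.2%}")
```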
🌤 Situational
Savant · Open-Meteo · UmpScoreCards
park_factor
3yr static anchor blended with running 2026 RS/RA splits at ballpark GPS coordinates — updated daily.
weather_run_delta
Wind mph × bearing → HR carry/suppress; temp °F; humidity at first pitch. Converted to expected Δruns/9.
umpire_run_factor
Historical run-impact mean per ump (142 tracked). Tight-zone umps suppress scoring; wide-zone umps inflate λ.
def_oaa_delta
Savant OAA (outs above average) → runs above average vs. league mean → win probability delta.
catcher_framing_runs inactive
Savant pitch-level framing runs (called-strike prob over replacement) → run delta per game. Currently inactive — framing data not yet wired into the live pipeline.
precip_probability v3
Open-Meteo forecast precipitation probability (0–1) at game time. Above 30%: shrinks adjustment stack up to 12%, reflecting higher environmental variance. Domed stadiums unaffected.
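The precipitation shrinkage can be pictured as a multiplier on the adjustment stack. The 30% threshold, the 12% maximum, and the dome exemption come from the entry above; the linear ramp between the threshold and 100% is an assumption, since the text gives only the endpoints.

```python
def precip_shrink_multiplier(precip_prob, is_dome=False,
                             threshold=0.30, max_shrink=0.12):
    """Multiplier in [0.88, 1.0] applied to the summed adjustments.

    The linear ramp above the threshold is an illustrative assumption.
    """
    if is_dome or precip_prob <= threshold:
        return 1.0
    ramp = (min(precip_prob, 1.0) - threshold) / (1.0 - threshold)
    return 1.0 - max_shrink * ramp

if __name__ == "__main__":
    for p in (0.10, 0.45, 0.80, 1.00):
        print(p, round(precip_shrink_multiplier(p), 4))
```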
📈 Lineup & Market
MLB Stats API · Pinnacle / Odds API
lineup_ops_split
Per-batter OPS vs. LHP/RHP from posted lineups only. Weighted OPS delta vs. team season mean, matched to starter's hand.
lineup_quality_delta
Today's lineup aggregate OPS vs. team season average. Detects rest-day lineups and injury absences.
lineup_certainty v2
When lineups aren't confirmed at pick time, the adjustment stack is shrunk 5% toward the Poisson base — correctly reducing confidence without distorting the run-environment estimate.
bvp_career_matchup v2
Career batter-vs-pitcher xwOBA from Baseball Savant, PA-weighted regression toward league average. Capped ±1pp per batter, ±2.5pp per team. Captures genuine matchup history without overfitting small samples. Displayed on pick cards as Career Matchup.
starter_rest_adj
Days since last start + prior pitch count penalty. Captures rest-days fatigue only — pitch quality (xFIP) is handled separately via the λ ratio. Displayed as Starter Rest on pick cards. Short rest (≤3 days): −2.5pp. Extra rest: +0.5pp. Prior start 105+ pitches: additional penalty.
bullpen_availability inactive
Binary fatigue flag: closer/setup used 2+ times in last 3 days → tier downgrade in bullpen xFIP blend. Currently inactive — availability data not yet wired into the live pipeline.
opener_shrinkage v2
When a starter is TBD or listed as an opener, the full adjustment stack is shrunk 20% — the xFIP-based projection is unreliable and the model should not over-project on opener games.
novig_pinnacle_prob
Pinnacle moneyline de-juiced via simultaneous margin strip. Sharpest market benchmark — hold ≈ 2.5%.
book_total_constraint
The market O/U total scales the Poisson λ values, which prevents model run totals from diverging sharply from the sharp total market.
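Two of the mechanisms in this group, the BvP caps and the confidence shrinkage, are easy to sketch. The ±1pp and ±2.5pp caps and the 5% and 20% shrink factors come from the entries above. Applying the two shrink factors multiplicatively when both conditions hold is an assumption, as are the function names.

```python
def clamp(x, lo, hi):
    """Restrict x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))

def bvp_team_adjustment(per_batter_pp):
    """Cap each batter at ±1pp, then cap the team total at ±2.5pp."""
    capped = [clamp(x, -1.0, 1.0) for x in per_batter_pp]
    return clamp(sum(capped), -2.5, 2.5)

def shrink_adjustments(total_adjustment, lineup_confirmed, opener_or_tbd):
    """Scale the adjustment stack toward the Poisson base.

    Combining both factors by multiplication is an illustrative assumption.
    """
    multiplier = 1.0
    if not lineup_confirmed:
        multiplier *= 0.95   # lineup_certainty: 5% shrink
    if opener_or_tbd:
        multiplier *= 0.80   # opener_shrinkage: 20% shrink
    return total_adjustment * multiplier

if __name__ == "__main__":
    print(bvp_team_adjustment([1.5, 0.4, -0.2]))
    print(shrink_adjustments(0.10, lineup_confirmed=False, opener_or_tbd=True))
```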

From Model to Pick — A Practical Example

1. Dual-Poisson base: Team RS/RA, park factor, and starter xFIP generate λ_home = 4.2 runs, λ_away = 3.8 runs → P(home wins) = 53.1%
2. 30+ signals: Home starter Stuff+ 118 (+0.3pp), lineup quality +0.8pp, TTO penalty −0.4pp, umpire wide zone +0.2pp, BvP +0.5pp → net +1.4pp adjustment
3. Platt calibration: Logit-scaled by the fitted Platt calibration → final model probability: 54.5%
4. Market comparison: Pinnacle no-vig says home team wins 50.3%. Model says 54.5%. Gap = +4.2pp edge → pick is posted at +148 best available odds.
This pick's signal breakdown appears in the Details section of each pick card on the Today's Picks page. CLV is tracked at close to verify the market agreed with the model's direction.
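The arithmetic in the example can be checked with a short script. It simply re-adds the figures quoted in the four steps, in percentage points as they are displayed; the dictionary keys are illustrative labels.

```python
ADJUSTMENTS_PP = {
    "stuff_plus": 0.3,
    "lineup_quality": 0.8,
    "tto_penalty": -0.4,
    "umpire_wide_zone": 0.2,
    "bvp": 0.5,
}

def net_adjustment_pp(adjustments):
    """Sum of the displayed per-signal adjustments, in percentage points."""
    return round(sum(adjustments.values()), 1)

def edge_pp(p_model_pct, p_market_pct):
    """Model probability minus no-vig market probability, in percentage points."""
    return round(p_model_pct - p_market_pct, 1)

if __name__ == "__main__":
    net = net_adjustment_pp(ADJUSTMENTS_PP)
    edge = edge_pp(54.5, 50.3)
    print(f"net adjustment = {net:+.1f}pp, edge = {edge:+.1f}pp, bet = {edge >= 4.0}")
```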

Out-of-Sample Backtest (Hypothetical) — 2022–2024 · Flat $100/Bet · 4pp+ Edge

These are hypothetical backtest results. The 2022–2024 figures are simulated on historical data with the current model design — not money actually wagered — and the model was developed knowing this period, so live results will differ. For the real, forward-tested record see Model Dashboard and Pick History. Past performance does not guarantee future results.
Games Backtested: 7,359 (2022–2024 · OOS only)
Sharpe Ratio: 1.82 (annualized · risk-adj.)
Max Drawdown: −12.4u (peak-to-trough worst run)
Brier Score: 0.234 (vs. 0.250 market · ↓ better)
Backtested ROI by Season — 2022–2024 (Hypothetical)
Hypothetical backtest. Flat $100/bet · 4pp+ edge signals only · vs. −4.0% baseline (bet every game, full vig)
2022: +0.7% · 975 bets · +6.8u
2023: +9.1% · 1,093 bets · +99.5u
2024: +10.4% · 977 bets · +101.6u
Baseline: −4.0% · all games · vig drain
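With flat one-unit stakes, units won per season are simply bets × ROI, so the season figures can be cross-checked as below. The pooled ROI is a derived illustration, not a figure published on this page.

```python
# (bets, ROI) per season, as quoted in the hypothetical backtest above
SEASONS = {2022: (975, 0.007), 2023: (1093, 0.091), 2024: (977, 0.104)}

def units_won(bets, roi):
    """Units won at a flat one-unit stake per bet."""
    return bets * roi

def pooled_roi(seasons):
    """Bet-weighted ROI across all seasons."""
    total_units = sum(units_won(b, r) for b, r in seasons.values())
    total_bets = sum(b for b, _ in seasons.values())
    return total_units / total_bets

if __name__ == "__main__":
    for year, (bets, roi) in SEASONS.items():
        print(year, f"{units_won(bets, roi):+.1f}u")
    print(f"pooled ROI = {pooled_roi(SEASONS):.2%}")
```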
🏃 Run Line Calibration
Platt scaling — RL-specific constants, fitted OOS 2022–2024
p_rl_cal = σ(A_rl · logit(p_rl_raw) + B_rl)
         = 1 / (1 + exp(−(A_rl · logit(p_rl_raw) + B_rl)))
Parameter interpretation
A_rl < 1   // slight overconfidence; less shrinkage than ML model
B_rl < 0   // small away-team correction
Backtest results — 7,237 games, OOS 2022–2024
Home cover rate: 35.7% actual vs 35.6% model
Direction acc.: 64.7% (better side predicted)
Away +1.5, 5pp edge, −130 juice:
  Model ROI: +22.3%   // vs +13.8% blind baseline
  Win rate: 67.2%     // vs 65.5% at ≥0pp edge
Away +1.5 win rate vs model edge:
  0pp edge   65.5%
  5pp edge   67.2%
  8pp edge   68.4%
  10pp edge  69.6%    // genuine discriminatory power
RL edge = p_rl_cal − p_nv_Pinnacle_RL. Logged at edge ≥ 2pp — a deliberately low gate that keeps the calibration sample full; run-line ROI uplift concentrates at ≥ 5pp edge. CLV tracked separately via Pinnacle spreads pre-game closing line snapshot. Home −1.5 volume is low (<40 bets at ≥8pp edge over 3 seasons) — treated as a specialty bet requiring high model confidence.
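One way a raw run-line probability could be derived from the same independent-Poisson setup is to sum the probability that the home margin is at least two runs. The sketch below does that before any calibration; it is an illustration under that assumption, not a description of the production run-line model.

```python
import math

def poisson_pmf_table(lam, max_runs):
    """P(X = k) for k = 0..max_runs, built iteratively."""
    pmf = [math.exp(-lam)]
    for k in range(1, max_runs + 1):
        pmf.append(pmf[-1] * lam / k)
    return pmf

def home_cover_minus_1_5(lam_home, lam_away, max_runs=40):
    """P(R_home - R_away >= 2) for independent Poisson run totals."""
    home = poisson_pmf_table(lam_home, max_runs)
    away = poisson_pmf_table(lam_away, max_runs)
    p_cover = 0.0
    cdf_away = 0.0  # running P(R_away <= k - 2)
    for k in range(2, max_runs + 1):
        cdf_away += away[k - 2]
        p_cover += home[k] * cdf_away
    return p_cover

def away_cover_plus_1_5(lam_home, lam_away, max_runs=40):
    """Complement of the home -1.5 cover (no push is possible on a half-run line)."""
    return 1.0 - home_cover_minus_1_5(lam_home, lam_away, max_runs)

if __name__ == "__main__":
    print(f"home -1.5: {home_cover_minus_1_5(4.2, 3.8):.4f}")
    print(f"away +1.5: {away_cover_plus_1_5(4.2, 3.8):.4f}")
```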
📊 Over/Under Calibration
Platt scaling — O/U constants, fitted OOS 2022–2024
p_over_cal = σ(A_ou · logit(p_over_raw) + B_ou)
           = 1 / (1 + exp(−(A_ou · logit(p_over_raw) + B_ou)))
Parameter interpretation
A_ou ≪ 1   // heavy shrinkage toward 50%: model significantly over-expresses confidence on totals
B_ou < 0   // corrects systematic over-prediction on high-scoring games
Status — informational, not separately bet
Synthetic-line ROI @ 0pp edge: +3.0%
Synthetic-line ROI @ 2pp edge: 1.4%
Bias by total line:
  < 7.5 runs: −1.06 run underestimate
  10.0 runs: +0.81 run overestimate
// O/U market prices now captured live (real over/under odds shown in Other Markets card).
// Standalone O/U bets deferred until live edge evidence accumulates over several weeks.
Real O/U book prices (best execution + Pinnacle no-vig benchmark) are fetched daily and displayed in each pick card's Other Markets section. The heavy Platt shrinkage (much stronger than the ML model's) reveals that the Poisson model is less well calibrated on run totals, likely due to left-tail inflation (low-scoring games are more common than Poisson predicts at low-k values).
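Because the sum of two independent Poisson variables is itself Poisson with rate λ_home + λ_away, a raw over probability follows directly from one CDF. The sketch below computes it and then applies a heavily shrinking Platt transform; the A_ou and B_ou values are placeholders, since the fitted constants are not published.

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), k a non-negative integer."""
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total

def p_over_raw(lam_home, lam_away, total_line):
    """P(total runs > line); on integer lines this excludes the push."""
    lam_total = lam_home + lam_away
    return 1.0 - poisson_cdf(math.floor(total_line), lam_total)

def platt(p_raw, a, b):
    """sigmoid(a * logit(p_raw) + b)."""
    z = a * math.log(p_raw / (1.0 - p_raw)) + b
    return 1.0 / (1.0 + math.exp(-z))

if __name__ == "__main__":
    raw = p_over_raw(4.2, 3.8, 7.5)
    cal = platt(raw, a=0.5, b=-0.02)  # hypothetical A_ou, B_ou
    print(f"raw over={raw:.4f}  calibrated={cal:.4f}")
```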

Honest Limitations — What This Model Does & Doesn't Do

  • Backtests flatter themselves. The 2022–2024 numbers are hypothetical and were tuned over the same period they report. The live, timestamped record is the real test — judge the model there.
  • The live sample is still small. A few months of forward picks can't yet confirm a durable edge; expect wide swings.
  • CLV is still being validated. Closing-line value is the durability signal we trust most, but the live sample is thin and not yet conclusive — we'd rather show it honestly than over-claim.
  • Market-informed, not independent. The model uses the sharp market as both a stabilizing input and a benchmark, so by design it won't diverge wildly from Pinnacle.
  • Blind to late news. Once lineups post, the model doesn't react to scratches, first-pitch weather swings, or in-game information.
  • A real edge still loses often. Even genuine value loses a large share of the time — only flat, disciplined, bankroll-aware staking survives the variance.