INDEPENDENT BASEBALL PROJECTIONS

Methodology · Technical Reference · Dual-Poisson Win Probability Engine

30+ Signal Model · 3 Signal Pathways · Dual-Poisson · Platt Calibrated · CLV Tracked · 7,359 Games OOS

How Independent Baseball Projections Works — At a Glance

Projects expected runs for each team using a Dual-Poisson model calibrated on park, weather, pitcher quality, and lineup data
30+ signals refine the base win probability through three pathways — run-environment inputs to the run rates, log-odds adjustments to the win probability, and confidence-shrinkage multipliers (pitcher quality, lineup confirmation, bullpen form, BvP, precipitation, line movement, and more)
Platt-scaled on a 7,359-game 2022–24 out-of-sample backtest, lightly blended with 2026 live results as the live sample grows
Edge = blended model probability − Pinnacle no-vig price. Only picks where the model sees ≥4pp mispricing are posted as official plays. Pure model probability (no market input) is tracked separately for calibration, so market-blended numbers are never used to evaluate model skill.
Every pick is logged with a timestamp before first pitch. CLV is tracked to the closing line to check that the signal is real, not luck.


Model Type: Dual-Poisson (λ_home + λ_away → CDF)
Signal Pathways: 3 · 30+ signals (λ-input · log-odds · shrink)
Adjustments: 30+ signals (logit(p) += Σ αᵢ)
Calibration: Platt sigmoid (OOS fit + live blend)
Bet Filter: edge ≥ 4pp (vs. no-vig Pinnacle)
Sizing Method: Fractional Kelly / 3u (bankroll fraction)
OOS Period: 2022 – 2024 (3 seasons, sequential)
Brier Score: 0.234 (vs. 0.250 market · ↓ better)


Independent Baseball Projections is a market-informed model

Raw team strength, pitching, lineup, park, weather, and situational signals generate the baseline projection. Market information — specifically the book O/U total and Pinnacle no-vig probability — is used as a stabilizing input (via the book_total_constraint factor, α₁₆) to reduce extreme run-total outputs and align the model with the sharp betting market's run environment. After Platt calibration, the final Independent Baseball Projections probability is compared against the no-vig Pinnacle market probability to identify remaining pricing gaps where the model and market diverge.

Because the market is used as both a stabilizing input and a comparison benchmark, Independent Baseball Projections should be understood as a market-informed model rather than a fully market-independent projection. This is disclosed transparently; the model still identifies genuine pricing gaps in approximately 15–25% of games per day.

Inference Pipeline — Input → Transform → Output at Each Stage

📡 Data Ingestion
IN: Cron trigger · game schedule · ump assignments · weather coords
OUT: Raw JSON: odds, lineups, weather, ump_id, FIP data per game

⚙️ Feature Eng.
IN: Raw JSON dicts per game
OUT: ~50 numeric inputs per game (home/away): xFIP, xERA, K-BB%, OPS splits, BvP, arsenal fit, park, weather, ump…

λ Construction
IN: RS₃ᵧᵣ, RS_L15, park_factor, book O/U
OUT: (λ_home, λ_away) floats, expected runs per team

📊 Poisson Win Prob
IN: (λ_home, λ_away)
OUT: p_base ∈ (0,1), exact CDF summation

Σα Log-Odds Adj.
IN: p_base + 30+ signal adjustments
OUT: p_adj = sigmoid(logit(p_base) + Σαᵢ)

Platt Calib.
IN: p_adj (raw model output)
OUT: p_cal = σ(A·logit(p_adj) + B), fitted

⚡ Edge Detection
IN: p_cal + p_nv_Pinnacle
OUT: edge float; bet_flag if ≥ 0.04

💰 Kelly Sizing
IN: edge, decimal_odds, bankroll
OUT: stake (units); cap 3u; log → picks_log.csv

Core Mathematics: Formulas Behind Each Pipeline Stage

λ Run Expectancy Construction

Base lambda (form-blended, park-adjusted):
λ_base = (w₁ × RS₃ᵧᵣ + w₂ × RS_L15) / G × park_factor
// RS₃ᵧᵣ: 3yr rolling RS/G · RS_L15: last 15G avg

Log-odds stacking (signal factors in logit space):
logit(p) = log(p / (1 − p))
logit(p_adj) = logit(p_base) + Σᵢ αᵢ
// αᵢ ∈ ℝ · logit prevents p ∉ (0,1)
p_adj = sigmoid(logit(p_adj))

The blend favors the multi-year base while still weighting recent (L15) form to capture hot/cold streaks without overreacting to small samples. Park adjustment applied symmetrically to both offensive and defensive λ.
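A minimal Python sketch of the two formulas above. The weights w₁ and w₂ are not published, so the 0.7/0.3 defaults are placeholders, and the function names are illustrative rather than the model's own. The sketch assumes RS₃ᵧᵣ and RS_L15 arrive as per-game averages (as the inline comment describes them), so it applies no further division by G.

```python
import math

def lambda_base(rs_3yr_pg, rs_l15_pg, park_factor, w1=0.7, w2=0.3):
    """Form-blended, park-adjusted expected runs per game.

    Inputs are assumed to be per-game run averages. w1 and w2 are
    placeholder weights (the real values are not published).
    """
    return (w1 * rs_3yr_pg + w2 * rs_l15_pg) * park_factor

def logit(p):
    return math.log(p / (1.0 - p))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def stack_adjustments(p_base, alphas):
    """Add signal adjustments in log-odds space, then map back to (0, 1)."""
    return sigmoid(logit(p_base) + sum(alphas))

# Hypothetical inputs: 4.5 RS/G over three years, 5.1 RS/G over the last 15,
# and a slightly hitter-friendly park.
lam = lambda_base(4.5, 5.1, 1.02)
p_adj = stack_adjustments(0.531, [0.05, -0.02, 0.01])
```

Because the adjustments are summed before the sigmoid, even a very large Σαᵢ cannot push p_adj outside (0, 1), which is the point of stacking in logit space.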

📊 Poisson Distribution & Win Probability

Run distributions (modeled independently):
R_home ~ Poisson(λ_home)
R_away ~ Poisson(λ_away)
P(X=k ; λ) = exp(−λ) · λ^k / k!

Win probability (exact CDF, no simulation):
P(home wins) = Σ_{k=1}^∞ P(R_home=k) · CDFPois(λ_away, k−1) + ½ · P(R_home = R_away)
// Tie → extra innings treated as 50/50, hence the ½ · P(tie) term

No Monte Carlo needed: win prob is computed exactly from Poisson CDFs. Teams are modeled as independent (no run correlation). Run-environment signals (park, weather, umpire, platoon, baserunning) enter via λ; log-odds adjustments and confidence-shrinkage multipliers then adjust the derived win probability.
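The exact summation above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the production code: the infinite sum is truncated at a hypothetical max_runs of 60 (the tail mass is negligible for baseball-sized λ), and ties are split 50/50 as the comment describes.

```python
import math

def poisson_pmf(k, lam):
    # exp(-lam) * lam**k / k!, evaluated in log space for numerical stability
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def home_win_prob(lam_home, lam_away, max_runs=60):
    """P(home wins) for independent Poisson run totals, ties split 50/50."""
    p_win = 0.0
    p_tie = 0.0
    cdf_away = 0.0  # running P(R_away <= k - 1)
    for k in range(max_runs + 1):
        pmf_home = poisson_pmf(k, lam_home)
        pmf_away = poisson_pmf(k, lam_away)
        p_win += pmf_home * cdf_away   # home scores k, away scores fewer
        p_tie += pmf_home * pmf_away   # both score exactly k
        cdf_away += pmf_away
    return p_win + 0.5 * p_tie

p_base = home_win_prob(4.2, 3.8)
```

A useful sanity check is symmetry: with equal run rates the function returns 0.5, and swapping the two arguments returns the complement.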

σ Platt Scaling Calibration

Sigmoid recalibration (fitted OOS 2022-2024):
p_cal = σ(A · logit(p_raw) + B) = 1 / (1 + exp(−(A · logit(p_raw) + B)))
// p_raw is the pre-calibration model output, called p_adj in the pipeline

Parameter interpretation:
A < 1 // model over-expresses confidence; compress toward 50%
B > 0 // home-team bias correction

Fitted via MLE on the 2022-24 out-of-sample backtest, lightly blended with 2026 live outcomes as the sample grows (exact A, B not published). Brier Score: 0.234 model vs. 0.250 market, a 6.4% improvement. A < 1 confirms the raw Poisson systematically over-prices favorites; calibration compresses the logit toward the empirical base rate.
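A short sketch of applying the Platt transform and scoring it with the Brier score. The fitted A and B are not published, so the values used here (A = 0.85, B = 0.03) are purely hypothetical and chosen only to satisfy A < 1 and B > 0.

```python
import math

def platt(p_raw, a, b):
    """Platt-scaled probability: sigmoid(a * logit(p_raw) + b)."""
    z = a * math.log(p_raw / (1.0 - p_raw)) + b
    return 1.0 / (1.0 + math.exp(-z))

def brier(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

# Hypothetical parameters; the real fitted values are not published.
A, B = 0.85, 0.03
p_cal = platt(0.60, A, B)  # pulled toward 50% because A < 1

# Relative improvement quoted in the text: (0.250 - 0.234) / 0.250
improvement = (0.250 - 0.234) / 0.250
```

With A < 1 a raw 60% forecast comes out closer to 50%, which is the compression described above. As a reference point, a constant 50% forecast scores exactly 0.25 on any set of binary outcomes.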

⚖️ No-Vig Market Probability

Strip American odds → raw implied probability:
p_raw(ml) = |ml| / (|ml| + 100)   if ml < 0 (favorite)
p_raw(ml) = 100 / (ml + 100)      if ml > 0 (underdog)

Simultaneous both-side margin removal:
hold = p_raw(h_ml) + p_raw(a_ml) − 1
p_nv_home = p_raw(h_ml) / (p_raw(h_ml) + p_raw(a_ml))
// Pinnacle benchmark: hold ≈ 2.5%

Simultaneous stripping avoids asymmetric vig attribution; it is equivalent to the multiplicative method. Pinnacle's ~2.5% hold makes its no-vig the tightest available reflection of sharp market consensus.
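The margin strip above translates directly into code. The sketch below follows the formulas as written; the function names and the −120/+110 example line are illustrative, not taken from the model.

```python
def implied_prob(ml):
    """Raw implied probability from an American moneyline (vig included)."""
    if ml < 0:
        return -ml / (-ml + 100.0)   # favorite
    return 100.0 / (ml + 100.0)      # underdog (or even money)

def no_vig(home_ml, away_ml):
    """Return (p_nv_home, p_nv_away, hold) via simultaneous normalization."""
    p_home = implied_prob(home_ml)
    p_away = implied_prob(away_ml)
    total = p_home + p_away
    return p_home / total, p_away / total, total - 1.0

# Hypothetical line: home -120, away +110
p_nv_home, p_nv_away, hold = no_vig(-120, 110)
```

The two no-vig probabilities always sum to 1, and the hold is whatever the raw implied probabilities summed to beyond 100%.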

⚡ Edge Detection & Stake Sizing

Model edge vs. no-vig Pinnacle:
edge = p_cal − p_nv_Pinnacle
// Bet flagged if edge ≥ 0.04

Fractional Kelly sizing (conservative multi-step reduction):
b = decimal_odds − 1
p = p_cal ; q = 1 − p
f* = 0.5 × (b · p − q) / b
// units = f* × bankroll ; cap: 3u

Stake sizes use a conservative fractional Kelly approach: base fraction × verdict multiplier × display fraction. CONDITIONAL picks (4–6pp) are full-sized; REDUCED CONF. picks are sized at half the normal rate; FLAGGED picks at ¾. The 3u cap prevents over-sizing on high-edge outliers.
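A hedged sketch of the edge gate and half-Kelly sizing described above. The verdict multipliers come from the text; the "display fraction" is not specified, so it is omitted here, and the bankroll is assumed to be expressed in units. Function and parameter names are illustrative.

```python
# Verdict multipliers as stated in the text.
VERDICT_MULTIPLIER = {"CONDITIONAL": 1.0, "FLAGGED": 0.75, "REDUCED CONF.": 0.5}

def size_pick(p_cal, p_nv, decimal_odds, bankroll_units, verdict="CONDITIONAL",
              min_edge=0.04, kelly_fraction=0.5, cap_units=3.0):
    """Return (edge, stake_units). Stake is 0 when the edge gate is not met."""
    edge = p_cal - p_nv
    if edge < min_edge:
        return edge, 0.0
    b = decimal_odds - 1.0            # net payout per unit staked
    q = 1.0 - p_cal
    f_star = kelly_fraction * (b * p_cal - q) / b
    units = max(f_star, 0.0) * bankroll_units * VERDICT_MULTIPLIER[verdict]
    return edge, min(units, cap_units)

# Hypothetical pick at even money (decimal 2.0) with a 20-unit bankroll.
edge, stake = size_pick(0.545, 0.503, 2.0, 20.0)
```

With these inputs the half-Kelly fraction is 0.5 × (0.545 − 0.455) = 0.045, so the stake is 0.9 units; with a 100-unit bankroll the same pick would hit the 3u cap.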

📈 Closing Line Value: Process Validation

Definition (Pinnacle no-vig, both sides):
CLV = p_close_nv − p_open_nv
// Snapshot: pre-game closing line (CT)

Interpretation:
CLV > 0 → market moved toward pick
CLV < 0 → market moved against pick
// +1% avg CLV ≈ long-run positive EV
// Separates skill from short-run variance

CLV is the gold standard for betting process validation: a model can run cold for 50 games but still show positive CLV, confirming the signal is real. Tracked per-bet in picks_log.csv, surfaced via --poster-stats.
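As a sketch, CLV for a single pick can be computed by stripping the vig from both the opening and closing Pinnacle lines and differencing the pick side. The helper names and the example prices are hypothetical; the layout of picks_log.csv is not documented here, so the sketch works on plain numbers.

```python
def implied_prob(ml):
    """Raw implied probability from an American moneyline."""
    return -ml / (-ml + 100.0) if ml < 0 else 100.0 / (ml + 100.0)

def no_vig_pick_prob(pick_ml, other_ml):
    """No-vig probability of the picked side, both sides stripped together."""
    p_pick, p_other = implied_prob(pick_ml), implied_prob(other_ml)
    return p_pick / (p_pick + p_other)

def clv(open_pick_ml, open_other_ml, close_pick_ml, close_other_ml):
    """CLV = p_close_nv - p_open_nv for the picked side."""
    return (no_vig_pick_prob(close_pick_ml, close_other_ml)
            - no_vig_pick_prob(open_pick_ml, open_other_ml))

# Hypothetical pick: taken at +110 (other side -120), closed at +100 / -110.
example_clv = clv(110, -120, 100, -110)

def average_clv(values):
    return sum(values) / len(values)
```

A positive value means the closing market priced the picked side as more likely than it did when the pick was logged.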

Signal Importance — Avg. Abs. Log-Odds Contribution per Factor (αᵢ) · Relative Magnitudes Approximate · v2 = 2026 update

The model's 30+ signals flow through three pathways — run-environment inputs to the Poisson run rates, log-odds adjustments to the win probability, and confidence-shrinkage multipliers. They are grouped by domain below for readability; importances reflect the model's design and magnitudes are approximate. Several factors marked inactive below do not currently affect live picks — retired after backtest validation (bullpen fatigue, pitcher-form slope) or awaiting data wiring (catcher framing, bullpen availability, career H2H matchup). In practice, live discrimination is concentrated in the pitcher-xFIP and run-environment core; the situational adjustments are deliberately small.

⚾ Pitcher · 5 factors · α₁–α₅

α₁ · Starter xFIP matchup · 0.18
α₅ · Bullpen xFIP tiered · 0.10
α₂ · Platoon FIP split · 0.09
α₃ · SwStr% plate discipline · 0.05
α₄ · Pitcher form slope L5 · inactive (off)

🌤 Situational · 6 factors · α₆–α₁₁

α₆ · Park factor · 0.12
α₁₀ · Defensive OAA → RAA · 0.07
α₉ · Umpire run tendency · 0.06
α₇ · Weather: wind carry · 0.05
α₁₁ · Catcher framing runs · inactive (off)
α₈ · Weather: temperature · 0.03

📈 Market & Lineup · 7 factors · α₁₂–α₁₈

α₁₂ · Lineup OPS split (hand) · 0.11
α₁₆ · O/U market constraint · 0.08
α₁₃ · Lineup quality delta · 0.07
α₁₅ · Home field advantage · 0.06
α₁₇ · Rolling RS/RA form · 0.05
α₁₄ · Bullpen availability · inactive (off)
α₁₈ · Rest days differential · 0.03

⚡ Advanced Signals · 15 factors · α₁₉–α₃₃

α₁₉ · Baserunning quality (BsR), lambda multiplier · 0.03
α₂₀ · Travel fatigue / back-to-back scheduling · 0.02
α₂₁ · Career H2H matchup ERA vs opponent · inactive (off)
α₂₂ · Statcast xERA blend · 0.09
α₂₇ · Bullpen 14-day rolling xFIP · 0.07
α₂₄ · K-BB% command signal · 0.06
α₂₈ · BvP career matchup · 0.05
α₂₃ · Times-through-order penalty · 0.04
α₂₉ · Arsenal fit score · 0.03
α₂₅/₂₆ · Opener / lineup certainty shrinkage · 0.02
α₃₀ · Stuff+ composite (FanGraphs sp_pitching) · 0.02
α₃₂ · Recent pitcher form: ERA last 3 starts · 0.02
α₃₁ · Precipitation probability shrinkage · 0.01
α₃₃ · Line movement: intraday Pinnacle sharp-money signal · 0.02

Feature Engineering — 30+ Signals (v2/v3: 10 new signals added 2026 · a few currently inactive, see note above)

⚾

Pitcher

MLB Stats API · Baseball Savant

starter_xfip

Regressed ERA estimator: FIP with the pitcher's actual home runs replaced by his fly balls × the league-average HR/FB rate, which removes BABIP luck and HR/FB variance. Primary pitcher quality signal, computed in-house from Statcast batted-ball data and MLB Stats components so it stays available when FanGraphs is unreachable (FanGraphs is used when reachable). Blended 70% individual / 30% league average to reduce overconfidence at extremes.

pitcher_quality_xfip v4

Display-only approximation of the starter's xFIP contribution to the Poisson λ ratio. Shows the pick team's starter quality advantage vs opponent — e.g. +1.9pp when facing a 4.77 xFIP opponent with a 2.31 xFIP starter. Not a separate log-odds adjustment (would double-count); captures the Poisson core contribution explicitly.

bullpen_quality_xfip v4

Same approximation applied to the bullpen innings share (~42% at default projected IP). Surfaces bullpen quality differential as a visible signal when one team's 'pen is meaningfully better than the other's.

projected_starter_ip v4

Phase 1 IP projection: weighted blend of season IP/start (60%), last-5 starts (25%), and last-3 starts (15%) with Bayesian smoothing toward 5.3 IP league average. Rest adjustment (−0.4 short rest to +0.1 extra rest) and pitch-count fatigue applied. Controls the starter/bullpen xFIP split weight — a starter projected for 6.5 IP weights the starter signal at 72% vs 28% bullpen.

xera_blend v2

Statcast xERA (contact quality from exit velocity/launch angle) blended with xFIP — up to 30% weight at 300+ PA faced. Captures contact suppression independent of K/BB.

k_minus_bb_pct v2

K%−BB% (MLB Stats API) — direct command + dominance signal. Complements SwStr% by capturing walk suppression that swinging-strike rate misses.

tto_penalty v2

Times-through-order degradation: ~0.25 runs/TTO beyond the first. Batters adapt; a starter projected for 7+ IP faces measurably higher opponent scoring late.

platoon_fip_split

LHH/RHH xFIP delta weighted by opposing lineup handedness % (per-batter split, not team-level).

swstr_pct

Swinging-strike rate from Baseball Savant. Leading indicator for K% — predicts FIP before results converge.

bullpen_xfip_tiered

A tiered closer / setup / middle-relief xFIP blend. Fatigue-adjusted by recent appearance counts per tier.

bullpen_recent_xfip v2

14-day rolling bullpen xFIP blended 30/70 with season average. Captures in-season bullpen volatility that season-long averages smooth over.

arsenal_fit v2

Statcast pitch-mix matchup score (−1 to +1): how well the starter's pitch category usage (power FB, breaking, offspeed) suppresses the opposing lineup's handedness profile.

pitcher_form_slope inactive

OLS slope of xFIP over last 5 starts. Currently inactive — retired after the 5-start slope showed no reliable directional edge (mean-reversion; market already prices recent form).

stuff_plus v3

FanGraphs sp_pitching composite (100 = avg). Measures raw pitch quality — velocity, movement, release point — independently of outcomes. A small, best-effort signal (weight ~0.02) sourced only from FanGraphs; omitted when that source is unavailable.

recent_pitcher_era v3

Actual ERA over last 3 starts vs. season xFIP baseline. Captures hot/cold streaks and mechanical changes that peripheral stats deliberately strip out.
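Several of the pitcher features above are simple weighted blends, which the sketch below illustrates. The blend weights (70/30 regression, 60/25/15 IP blend) come from the descriptions; the idea that the starter's share is projected IP divided by nine innings is an inference from the "6.5 IP → 72%" example, not a documented formula. Bayesian smoothing toward 5.3 IP, the rest adjustment, and pitch-count fatigue are omitted because their parameters are not given. All function names are illustrative.

```python
def regress_xfip(pitcher_xfip, league_xfip, individual_weight=0.70):
    """starter_xfip: 70% individual / 30% league-average blend."""
    return individual_weight * pitcher_xfip + (1.0 - individual_weight) * league_xfip

def blended_ip(season_ip_per_start, last5_ip, last3_ip):
    """projected_starter_ip before smoothing: 60% season, 25% L5, 15% L3."""
    return 0.60 * season_ip_per_start + 0.25 * last5_ip + 0.15 * last3_ip

def starter_share(projected_ip, game_innings=9.0):
    """Assumed split: share of innings the starter is projected to cover."""
    return projected_ip / game_innings

def staff_xfip(starter_xfip, bullpen_xfip, projected_ip):
    """Starter/bullpen xFIP weighted by the projected innings split."""
    s = starter_share(projected_ip)
    return s * starter_xfip + (1.0 - s) * bullpen_xfip

# Hypothetical inputs
share = starter_share(6.5)                 # about 0.72, matching the text
xfip = regress_xfip(2.31, 4.10)            # pulled toward the league mean
ip = blended_ip(6.0, 5.0, 4.0)
```

Under this assumption the default 5.3 IP projection leaves roughly 41% of innings to the bullpen, close to the ~42% share quoted for bullpen_quality_xfip.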

🌤

Situational

Savant · Open-Meteo · UmpScoreCards

park_factor

3yr static anchor blended with running 2026 RS/RA splits at ballpark GPS coordinates — updated daily.

weather_run_delta

Wind mph × bearing → HR carry/suppress; temp °F; humidity at first pitch. Converted to expected Δruns/9.

umpire_run_factor

Historical run-impact mean per ump (142 tracked). Wide-zone umps (more called strikes) suppress scoring; tight-zone umps (more walks and hitter's counts) inflate λ.

def_oaa_delta

Savant OAA (outs above average) → runs above average vs. league mean → win probability delta.

catcher_framing_runs inactive

Savant pitch-level framing runs (called-strike prob over replacement) → run delta per game. Currently inactive — framing data not yet wired into the live pipeline.

precip_probability v3

Open-Meteo forecast precipitation probability (0–1) at game time. Above 30%: shrinks adjustment stack up to 12%, reflecting higher environmental variance. Domed stadiums unaffected.

📈

Lineup & Market

MLB Stats API · Pinnacle / Odds API

lineup_ops_split

Per-batter OPS vs. LHP/RHP from posted lineups only. Weighted OPS delta vs. team season mean, matched to starter's hand.

lineup_quality_delta

Today's lineup aggregate OPS vs. team season average. Detects rest-day lineups and injury absences.

lineup_certainty v2

When lineups aren't confirmed at pick time, the adjustment stack is shrunk 5% toward the Poisson base — correctly reducing confidence without distorting the run-environment estimate.

bvp_career_matchup v2

Career batter-vs-pitcher xwOBA from Baseball Savant, PA-weighted regression toward league average. Capped ±1pp per batter, ±2.5pp per team. Captures genuine matchup history without overfitting small samples. Displayed on pick cards as Career Matchup.

starter_rest_adj

Days since last start + prior pitch count penalty. Captures rest-days fatigue only — pitch quality (xFIP) is handled separately via the λ ratio. Displayed as Starter Rest on pick cards. Short rest (≤3 days): −2.5pp. Extra rest: +0.5pp. Prior start 105+ pitches: additional penalty.

bullpen_availability inactive

Binary fatigue flag: closer/setup used 2+ times in last 3 days → tier downgrade in bullpen xFIP blend. Currently inactive — availability data not yet wired into the live pipeline.

opener_shrinkage v2

When a starter is TBD or listed as an opener, the full adjustment stack is shrunk 20% — the xFIP-based projection is unreliable and the model should not over-project on opener games.

novig_pinnacle_prob

Pinnacle moneyline de-juiced via simultaneous margin strip. Sharpest market benchmark — hold ≈ 2.5%.

book_total_constraint

The market O/U rescales the Poisson run rates (λ_home, λ_away), which prevents model run totals from diverging sharply from the sharp total market.
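The capping and shrinkage rules described in the feature lists above (BvP caps, lineup certainty, opener shrinkage, precipitation) can be sketched as follows. The cap and shrink percentages come from the descriptions. Two things are assumptions: the precipitation shrink is ramped linearly from 0% at a 30% forecast to 12% at a 100% forecast, and multiple shrinkages are combined multiplicatively. Function names are illustrative.

```python
import math

def clamp(x, lo, hi):
    return max(lo, min(hi, x))

def bvp_team_adjustment(batter_pp):
    """Cap each batter at +/-1pp, then the team total at +/-2.5pp."""
    capped = [clamp(x, -1.0, 1.0) for x in batter_pp]
    return clamp(sum(capped), -2.5, 2.5)

def precip_shrink(precip_prob):
    """Fraction removed from the adjustment stack (assumed linear ramp)."""
    if precip_prob <= 0.30:
        return 0.0
    return 0.12 * (precip_prob - 0.30) / 0.70

def shrink_multiplier(lineup_confirmed, opener_or_tbd, precip_prob, domed=False):
    m = 1.0
    if not lineup_confirmed:
        m *= 0.95                       # lineup_certainty: 5% shrink
    if opener_or_tbd:
        m *= 0.80                       # opener_shrinkage: 20% shrink
    if not domed:
        m *= 1.0 - precip_shrink(precip_prob)
    return m

def shrunk_prob(p_base, alphas, multiplier):
    """Shrink the adjustment stack toward the Poisson base, in logit space."""
    z = math.log(p_base / (1.0 - p_base)) + multiplier * sum(alphas)
    return 1.0 / (1.0 + math.exp(-z))
```

A multiplier of 1.0 leaves the adjustment stack untouched, while 0.0 would return the Poisson base probability unchanged, so the run-environment estimate itself is never altered by shrinkage.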

From Model to Pick — A Practical Example

Dual-Poisson base

Team RS/RA, park factor, and starter xFIP generate λ_home = 4.2 runs, λ_away = 3.8 runs → P(home wins) = 53.1%

30+ signals

Home starter Stuff+ 118 (+0.3pp), lineup quality +0.8pp, TTO penalty −0.4pp, umpire wide zone +0.2pp, BvP +0.5pp → net +1.4pp adjustment

Platt calibration

Logit-scaled by the fitted Platt calibration → final model probability: 54.5%

Market comparison

Pinnacle no-vig says home team wins 50.3%. Model says 54.5%. Gap = +4.2pp edge → pick is posted at +148 best available odds.

Each pick's signal breakdown appears in the Details section of its pick card on the Today's Picks page. CLV is tracked at close to check whether the market moved in the model's direction.
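The arithmetic in the example above can be checked in a few lines. The sketch uses only the numbers quoted in the example; the last line is an extra illustration that converts the displayed percentage-point shift into the equivalent total log-odds adjustment, assuming the shift from 53.1% to 54.5% is expressed in logit space as in the Core Mathematics section.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

# Signal contributions quoted in the example, in percentage points.
adjustments_pp = {
    "stuff_plus": 0.3,
    "lineup_quality": 0.8,
    "tto_penalty": -0.4,
    "umpire": 0.2,
    "bvp": 0.5,
}
net_pp = round(sum(adjustments_pp.values()), 1)

p_base, p_model, p_market = 0.531, 0.545, 0.503
edge_pp = round((p_model - p_market) * 100.0, 1)
posted = edge_pp >= 4.0

# Equivalent total log-odds adjustment for the 53.1% -> 54.5% move.
implied_sum_alpha = logit(p_model) - logit(p_base)
```

Near 50%, one percentage point corresponds to roughly 0.04 in log-odds, which is why the per-signal magnitudes in the importance table look small next to the pp figures shown on pick cards.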

Out-of-Sample Backtest (Hypothetical) — 2022–2024 · Flat $100/Bet · 4pp+ Edge

These are hypothetical backtest results. The 2022–2024 figures are simulated on historical data with the current model design — not money actually wagered — and the model was developed knowing this period, so live results will differ. For the real, forward-tested record see Model Dashboard and Pick History. Past performance does not guarantee future results.

Games Backtested: 7,359 (2022–2024 · OOS only)
Sharpe Ratio: 1.82 (annualized · risk-adj.)
Max Drawdown: −12.4u (peak-to-trough worst run)
Brier Score: 0.234 (vs. 0.250 market · ↓ better)

Backtested ROI by Season — 2022–2024 (Hypothetical)

Hypothetical backtest. Flat $100/bet · 4pp+ edge signals only · vs. −4.0% baseline (bet every game, full vig)

2022: +0.7% · 975 bets · +6.8u
2023: +9.1% · 1,093 bets · +99.5u
2024: +10.4% · 977 bets · +101.6u
Baseline: −4.0% · all games · vig drain
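The per-season ROI figures above follow from the bet counts and unit totals under flat staking. The sketch assumes one unit equals the flat $100 stake, so ROI is simply units won divided by bets placed. The combined three-season figure is derived here from the listed numbers and is not quoted in the backtest itself.

```python
# (bets, units won) per season, as listed in the backtest summary.
SEASONS = {2022: (975, 6.8), 2023: (1093, 99.5), 2024: (977, 101.6)}

def roi_pct(bets, units_won):
    """Flat-stake ROI in percent, assuming 1 unit staked per bet."""
    return 100.0 * units_won / bets

season_roi = {year: round(roi_pct(b, u), 1) for year, (b, u) in SEASONS.items()}

total_bets = sum(b for b, _ in SEASONS.values())
total_units = sum(u for _, u in SEASONS.values())
combined_roi = round(roi_pct(total_bets, total_units), 1)
```

Rounded to one decimal, the three season values reproduce the +0.7%, +9.1%, and +10.4% shown above.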

Other Markets: Run Line & Over/Under Calibration

🏃 Run Line Calibration

Platt scaling (RL-specific constants, fitted OOS 2022-2024):
p_rl_cal = σ(A_rl · logit(p_rl_raw) + B_rl) = 1 / (1 + exp(−(A_rl · logit(p_rl_raw) + B_rl)))

Parameter interpretation:
A_rl < 1 // slight overconfidence; less shrinkage than ML model
B_rl < 0 // small away-team correction

Backtest results (7,237 games, OOS 2022-2024):
Home cover rate: 35.7% actual vs 35.6% model
Direction acc.: 64.7% (better side predicted)

Away +1.5, ≥ 5pp edge, −130 juice:
Model ROI: +22.3% // vs +13.8% blind baseline
Win rate: 67.2% // vs 65.5% at ≥ 0pp edge

Away +1.5 win rate vs model edge:
≥ 0pp edge → 65.5%
≥ 5pp edge → 67.2%
≥ 8pp edge → 68.4%
≥ 10pp edge → 69.6%
// genuine discriminatory power

RL edge = p_rl_cal − p_nv_Pinnacle_RL. Logged at edge ≥ 2pp, a deliberately low gate that keeps the calibration sample full; run-line ROI uplift concentrates at ≥ 5pp edge. CLV tracked separately via Pinnacle spreads pre-game closing line snapshot. Home −1.5 volume is low (<40 bets at ≥ 8pp edge over 3 seasons) and is treated as a specialty bet requiring high model confidence.

📊 Over/Under Calibration

Platt scaling (O/U constants, fitted OOS 2022-2024):
p_over_cal = σ(A_ou · logit(p_over_raw) + B_ou) = 1 / (1 + exp(−(A_ou · logit(p_over_raw) + B_ou)))

Parameter interpretation:
A_ou ≪ 1 // heavy shrinkage toward 50%; model significantly over-expresses confidence on totals
B_ou < 0 // corrects systematic over-prediction on high-scoring games

Status (informational, not separately bet):
Synthetic-line ROI @ ≥ 0pp edge: +3.0%
Synthetic-line ROI @ ≥ 2pp edge: −1.4%

Bias by total line:
< 7.5 runs: −1.06 run underestimate
≥ 10.0 runs: +0.81 run overestimate
// O/U market prices now captured live (real over/under odds shown in Other Markets card).
// Standalone O/U bets deferred until live edge evidence accumulates over several weeks.

Real O/U book prices (best execution + Pinnacle no-vig benchmark) are fetched daily and displayed in each pick card's Other Markets section. The heavy Platt shrinkage (much stronger than the ML model's) reveals the Poisson model has weaker calibration on run totals, likely due to left-tail inflation (low-scoring games more common than Poisson predicts at high-k values).

Honest Limitations — What This Model Does & Doesn't Do

Backtests flatter themselves. The 2022–2024 numbers are hypothetical and were tuned over the same period they report. The live, timestamped record is the real test — judge the model there.
The live sample is still small. A few months of forward picks can't yet confirm a durable edge; expect wide swings.
CLV is still being validated. Closing-line value is the durability signal we trust most, but the live sample is thin and not yet conclusive — we'd rather show it honestly than over-claim.
Market-informed, not independent. The model uses the sharp market as both a stabilizing input and a benchmark, so by design it won't diverge wildly from Pinnacle.
Blind to late news. Once lineups post, the model doesn't react to scratches, first-pitch weather swings, or in-game information.
A real edge still loses often. Even genuine value loses a large share of the time — only flat, disciplined, bankroll-aware staking survives the variance.
