Back

Mathematical Model

The Mathematics of Baseball Prediction

The Logistic Regression Model
The mathematical foundation of our baseball prediction engine

The Logistic Function

P(outcome) = 1 / (1 + e-z)

where z = β0 + β1x1 + β2x2 + ... + βnxn

The logistic function transforms any input (z) into a probability between 0 and 1. This makes it perfect for predicting baseball outcomes, where we need to estimate the probability of events like walks, strikeouts, or hits.

In our model, z is a linear combination of player statistics (xi) and their corresponding coefficients (βi). These coefficients determine how much weight each statistic has in the final prediction.

z = β₀ + β₁x₁ + β₂x₂ + ...-6-303600.51ProbabilityP = 0.5 when z = 0
Implementation in Our Codebase
How we've implemented the mathematical model in our prediction engine
// Calculate base probabilities using logistic regression
const walkProb = logisticRegression(
  REGRESSION_COEFFICIENTS.WALK.CONSTANT +
  REGRESSION_COEFFICIENTS.WALK.BATTER_WALK_COEF * batter.stats.walkPct +
  REGRESSION_COEFFICIENTS.WALK.PITCHER_BB9_COEF * pitcher.stats.bb9
)

// Logistic regression function
function logisticRegression(z: number): number {
  return 1 / (1 + Math.exp(-z))
}

Implementation Notes

  • We calculate probabilities for all possible outcomes (Single, Double, Triple, Home Run, Walk, Strikeout).
  • The remaining probability is assigned to "Out" (any other out that's not a strikeout).
  • We apply additional adjustments for factors like handedness matchups and historical data.
  • The fixed walk coefficients have been implemented in our codebase to provide more accurate predictions.