1932 Hotelling-Solomons Inequality: Mean and Median Proven to Differ by No More Than One Standard Deviation

A fundamental yet "surprising and little-known" result in classical statistics, first published in a 1932 paper by Harold Hotelling and Leonard M. Solomons, has recently been highlighted on social media by Peyman Milanfar. The inequality establishes a precise relationship between the mean (μ), median (m), and standard deviation (σ) of a dataset, stating that the absolute difference between the mean and median is always less than or equal to the standard deviation: |μ−m| ≤ σ.
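The inequality can be checked numerically. The following sketch (assuming NumPy is available) draws a heavily right-skewed sample and treats it as an empirical distribution, so the population standard deviation (NumPy's default, ddof=0) plays the role of σ:

```python
import numpy as np

# Numerical check of the Hotelling-Solomons inequality |mean - median| <= sigma
# on a right-skewed sample, treated as an empirical distribution.
rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=10_000)  # heavily right-skewed data

mean, median = x.mean(), np.median(x)
sigma = x.std()  # population standard deviation (ddof=0)
gap = abs(mean - median)

print(f"|mean - median| = {gap:.4f}, sigma = {sigma:.4f}")
assert gap <= sigma  # the gap never exceeds one standard deviation
```

Even though the exponential sample is strongly skewed, the mean-median gap stays comfortably inside one standard deviation, as the inequality guarantees.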

The 1932 paper, titled "The limits of a measure of skewness," also established a tighter bound for unimodal densities, where the distribution has a single peak. In that case, the absolute difference between the mean and median is at most √(3/5) ≈ 0.7746 times the standard deviation. This sharper constant shows how much more tightly the two measures of central tendency track each other once the distribution is known to have a single mode.
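The unimodal bound can be verified in closed form for a simple single-peaked distribution. For the exponential distribution with rate 1 (a standard textbook case, used here purely as an illustration), the mean is 1, the median is ln 2, and the standard deviation is 1, so the mean-median gap is well inside the √(3/5) limit:

```python
import math

# Analytic check of the unimodal bound |mean - median| <= sqrt(3/5) * sigma
# for the exponential distribution with rate 1 (a unimodal density):
mean = 1.0            # E[X] = 1/rate
median = math.log(2)  # ln 2, where the CDF 1 - exp(-x) reaches 1/2
sigma = 1.0           # std of Exp(1) equals its mean

ratio = abs(mean - median) / sigma  # about 0.3069
bound = math.sqrt(3 / 5)            # about 0.7746, the unimodal bound

print(f"ratio = {ratio:.4f}, unimodal bound = {bound:.4f}")
assert ratio <= bound
```

Here the ratio is roughly 0.31, so even this quite skewed unimodal density uses less than half of the allowed √(3/5) slack.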

The intuitive significance of the inequality, as noted by Delip Rao in a social media comment, lies in how it relates two measures of center—one sensitive to outliers (the mean) and one robust to them (the median)—to the overall spread of the data. Outliers pull the mean away from the median, but only to an extent bounded by the standard deviation, a point Rao summarized as a "duh" explanation once it is stated.

Historically, the Hotelling-Solomons inequality was introduced as a measure of skewness, a statistical property describing the asymmetry of a probability distribution. The relationship between the mean and median is a key indicator of skewness; in a perfectly symmetric distribution, the mean and median are equal. This classical result provides a quantitative limit to how much these two measures can diverge, even in skewed distributions.

The enduring relevance of this 93-year-old statistical principle is evident in contemporary research. Recent academic work, such as a 2023 paper by Yuzo Maruyama, has explored "sharper bounds" for the Hotelling-Solomons inequality, introducing new limits that depend on the sample size and are strictly smaller than the original constant of 1 on the standard deviation. These ongoing refinements demonstrate the foundational importance of this classical result in understanding data distribution and robust statistical analysis.