Hi team,
scipy.stats.multinomial currently forces the p shape parameter array to sum to 1.0, silently ignoring the last element of each row.
import numpy as np
from scipy import stats
res = stats.multinomial.entropy(10, [0.25, 75])
ref = stats.multinomial.entropy(10, [0.25, 0.75])
np.testing.assert_equal(res, ref) # passes, no warning
This behavior is documented, but has led to issues like gh-22565 and gh-11860.
A new PR, gh-22585, proposes to warn when the rows don’t sum to 1.0 within a fixed tolerance:
res = stats.multinomial.entropy(10, [0.25, 75])
# <ipython-input-2-5540b560d502>:3: FutureWarning: Some rows of `p` do not sum to 1.0 within tolerance of eps=1e-15. Currently, the last element of these rows is adjusted to compensate, but this condition will produce NaNs beginning in SciPy 1.18.0. Please ensure that rows of `p` sum to 1.0 to avoid futher disruption.
# res = stats.multinomial.entropy(10, [0.25, 75])
The warning would be emitted until SciPy 1.18.0, when such input would begin producing NaNs. This is consistent with the behavior of other distributions, in which invalid shape parameters produce a NaN, e.g.
stats.binom(n=10, p=-1).entropy() # nan, no warning
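For anyone who currently passes unnormalized weights and relies on the adjustment, a minimal sketch of renormalizing p before calling the distribution might look like this. The helper name normalize_rows is purely illustrative; it is not part of SciPy or of gh-22585.

```python
import numpy as np
from scipy import stats


def normalize_rows(p):
    """Hypothetical helper (not a SciPy API): rescale each row of `p` to sum to 1.0."""
    p = np.asarray(p, dtype=float)
    totals = p.sum(axis=-1, keepdims=True)
    # Reject input that cannot be interpreted as probabilities.
    if np.any(p < 0) or np.any(totals <= 0):
        raise ValueError("rows of `p` must be nonnegative with a positive sum")
    return p / totals


raw = [1.0, 3.0]              # unnormalized weights
p = normalize_rows(raw)       # rows now sum to 1.0
res = stats.multinomial.entropy(10, p)
ref = stats.multinomial.entropy(10, [0.25, 0.75])
np.testing.assert_allclose(res, ref)
```

Note that this rescales every element, which is not the same as the current behavior of adjusting only the last element, so it is only appropriate when the input really is a set of relative weights.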
Comments? Concerns? Please join the conversation in gh-22585.
Thanks!
Matt