NEP 50: Promotion rules for Python scalars

seberg · June 1, 2022, 3:48pm

I would like to share the first formal draft of

NEP 50: Promotion rules for Python scalars

with everyone. The full text can be found here:

https://numpy.org/neps/nep-0050-scalar-promotion.html

NEP 50 is an attempt to remove value-based casting/promotion. We wish to replace it with clearer rules for the resulting dtype when mixing NumPy arrays and Python scalars. As a brief example, the proposal allows the following (unchanged):

>>> np.array([1, 2, 3], dtype=np.int8) + 100
np.array([101, 102, 103], dtype=np.int8)

At the same time, it clears up the confusion caused by the value-inspecting behavior that we sometimes see, such as:

>>> np.array([1, 2, 3], dtype=np.int8) + 300
np.array([301, 302, 303], dtype=np.int16)  # note the int16

Here 300 is too large to fit in an int8, so the result is currently upcast to int16. The proposal also removes the special behavior of 0-D arrays and NumPy scalars, which currently give:

>>> res = np.array(1, dtype=np.int8) + 100
>>> res.dtype
dtype('int64')
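If you want to see which behavior your own installation shows for the three examples above, a small probe script along these lines may help. It is only a sketch: the helper name `result_dtype` and the `cases` dictionary are made up for illustration, and the outcome depends on the NumPy version and promotion mode in use. With the current value-based rules the second case should report int16 and the third the default integer dtype, while under the proposal the second case is meant to raise an error and the third to stay int8.

```python
import numpy as np


def result_dtype(operand, scalar):
    """Return the dtype of `operand + scalar` as a string.

    If NumPy rejects the Python integer as too large for the array dtype,
    return the string "OverflowError" instead of raising.
    """
    try:
        return str((operand + scalar).dtype)
    except OverflowError:
        return "OverflowError"


# The three examples from the post: label -> (NumPy operand, Python scalar).
cases = {
    "int8 array + 100": (np.array([1, 2, 3], dtype=np.int8), 100),
    "int8 array + 300": (np.array([1, 2, 3], dtype=np.int8), 300),
    "int8 0-D array + 100": (np.array(1, dtype=np.int8), 100),
}

report = {label: result_dtype(arr, scalar) for label, (arr, scalar) in cases.items()}

print("NumPy", np.__version__)
for label, outcome in report.items():
    print(f"{label:22} -> {outcome}")
```

Only the first line of the report should be the same everywhere (int8); the other two are exactly the cases the NEP changes.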

This is the continuation of a long discussion (see the “Discussion” section), including the poll I once posted: Poll: Future NumPy behavior when mixing arrays, NumPy scalars, and Python scalars

I would be happy for any feedback, be it just editorial or fundamental discussion. There are many alternatives which I have tried to capture in the NEP.

For smaller edits, don’t hesitate to open a NumPy PR, or propose edits on my branch (you can use the edit button to create a PR): numpy/nep-0050-scalar-promotion.rst at nep50 · seberg/numpy · GitHub

An important part of moving forward will be assessing the real world impact. To start that process, I have created a branch as a draft PR (at this time): API: Introduce optional (and partial) NEP 50 weak scalar logic by seberg · Pull Request #21626 · numpy/numpy · GitHub

It is missing some parts, but should allow preliminary testing. The main missing part is that the integer warnings and errors are less strict than proposed in the NEP.
It would be invaluable to get a better idea of the extent to which existing code, especially end-user code, is affected by the proposed changes.

Thanks in advance for any input! This is a big, complicated proposal, but finding a way forward will hopefully clear up a source of confusion and inconsistencies that makes life harder for both maintainers and users.

seberg · July 7, 2022, 1:07am

As a brief update on this, as noted in the NEP (Note at the end of the abstract), our nightly wheels can now be used to try out the first changes here. To use them, you have to install NumPy from the nightlies:

pip install -i https://pypi.anaconda.org/scipy-wheels-nightly/simple numpy --upgrade

(I added --upgrade in case you already have a NumPy version installed.) Then run Python, e.g. with:

NPY_PROMOTION_STATE=weak ipython

Please see the NEP note for more information. As of now, the main missing piece is the error raised when integers are too large (this is added in an open PR).
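For trying this on an existing script rather than in an interactive session, the same variable can also be set from within Python. The sketch below assumes that NumPy reads NPY_PROMOTION_STATE once, when it is first imported, and that the installed build supports it at all. Builds without that support will simply ignore it, so check the printed dtype rather than trusting the setting.

```python
import os

# Assumption: the variable is read when NumPy is first imported, so it has to
# be set before the import below (and before anything else imports NumPy).
os.environ["NPY_PROMOTION_STATE"] = "weak"

import numpy as np

# 0-D int8 array plus a Python integer: int8 if the weak-scalar logic is
# active, the default integer dtype if the legacy rules still apply.
res = np.array(1, dtype=np.int8) + 100
print("NumPy", np.__version__, "->", res.dtype)
```

If the printed dtype is still the default integer, the build most likely does not include the opt-in logic.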

It would be very interesting to hear whether your use case (e.g. scripts) is affected by the changes! So far, mainly SciPy and scikit-learn have been tried, but for this change I believe the impact on end users is much more important than that on libraries.