In "RFC: Naming convention for generalised ufuncs in special" (scipy/scipy issue #20448), @izaid proposes a naming convention for certain pairs of functions in `scipy.special`.

First I’ll give the general background. Here is a prototypical example for this kind of pair of functions:

The function `scipy.special.sph_harm` for computing spherical harmonics has the signature `sph_harm(m, n, theta, phi, out=None)`, where `m` and `n` are `array_like`s of integers giving the order and degree of the spherical harmonic respectively, and `theta` and `phi` are angles giving spherical coordinates for a point on the surface of a sphere.

This is an instance of a rather common situation where there are integer parameters and the result is computed through a recurrence relation on these parameters, so that, for example, computing `sph_harm(m, n, ...)` requires computing `sph_harm(i, j, ...)` for all `0 <= i <= m`, `0 <= j <= n`. For the ufunc `sph_harm`, if arrays are passed in for `m` and `n`, redundant work is done to recompute the recurrence for each pair of values from these arrays.
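To make the redundancy concrete, here is a minimal sketch that uses a simpler one-parameter recurrence, the three-term recurrence for Legendre polynomials, as a stand-in. The function `legendre_single` is hypothetical and is not how `sph_harm` is implemented; it only counts how many recurrence steps an elementwise evaluation repeats.

```python
def legendre_single(n, x):
    """Return (P_n(x), number of recurrence steps taken).

    Hypothetical stand-in for a ufunc kernel: it restarts the
    recurrence from P_0 and P_1 on every call.
    """
    if n == 0:
        return 1.0, 0
    prev, curr = 1.0, x  # P_0(x), P_1(x)
    steps = 0
    for k in range(1, n):
        # (k + 1) P_{k+1}(x) = (2k + 1) x P_k(x) - k P_{k-1}(x)
        prev, curr = curr, ((2 * k + 1) * x * curr - k * prev) / (k + 1)
        steps += 1
    return curr, steps


x = 0.5
degrees = range(6)

# Elementwise evaluation: each degree restarts the recurrence.
results = [legendre_single(n, x) for n in degrees]
values = [value for value, _ in results]
total_steps = sum(steps for _, steps in results)

# A single pass up to the largest degree visits every lower degree anyway.
single_pass_steps = legendre_single(max(degrees), x)[1]

print(total_steps, single_pass_steps)
```

In this toy case the elementwise evaluation takes more recurrence steps in total than one pass to the highest degree, even though that one pass already passes through every intermediate value.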

@izaid introduced a `gufunc` version of `sph_harm` called `sph_harm_all`, whose scalar kernel returns the table of all values computed up until `sph_harm(m, n, ...)`. The gufunc version only takes integers for `m` and `n` and computes the entire table of values for `0 <= i <= m`, `0 <= j <= n`.

The gufunc version does not supersede the ufunc version: if one only needs the results for one or a small number of `(m, n)` pairs with large `m` and `n`, storing and returning the entire table of results would use excessive memory. The ufunc version does not store the entries of the table during the recurrence; it keeps only what is needed to produce the final result. Both versions of the function are useful.
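The trade-off between the two versions can be sketched with the same kind of one-parameter stand-in. Both functions below are hypothetical illustrations based on the Legendre recurrence, not SciPy's implementation: one keeps only the last two terms, the other keeps every term.

```python
def legendre_single(n, x):
    """P_n(x), keeping only the two most recent terms (constant memory)."""
    prev, curr = 1.0, x  # P_0(x), P_1(x)
    if n == 0:
        return prev
    for k in range(1, n):
        prev, curr = curr, ((2 * k + 1) * x * curr - k * prev) / (k + 1)
    return curr


def legendre_all(n, x):
    """[P_0(x), ..., P_n(x)], keeping the whole table (memory grows with n)."""
    table = [1.0, x][: n + 1]
    for k in range(1, n):
        table.append(((2 * k + 1) * x * table[k] - k * table[k - 1]) / (k + 1))
    return table


n, x = 1000, 0.3
last = legendre_single(n, x)   # one float
table = legendre_all(n, x)     # n + 1 floats

print(len(table), abs(table[-1] - last))
```

The final entry of the table matches the single-value result, so the table version is the better fit only when the intermediate entries are actually wanted.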

@izaid proposes the convention of naming these pairs of functions like `sph_harm` and `sph_harm_all`, with `all` signifying that results are returned for all parameter values visited by the recurrence on the way to the final result. There has been some objection to `all` on the grounds that it's not clear from the name "all of what?", but no one has been able to think of a better name. I think this one is good enough, particularly if it is well documented and becomes a convention for all such pairs of functions.

We have some existing pairs of functions like this, such as `pbvv` and `pbvv_seq`, but the `seq` seems specific to the 1d case where there is only one parameter involved in the recurrence. (`pbvv_seq` is also not a gufunc, and doesn't take array arguments for any of its parameters.)

Another suggestion has been to have only a single function, e.g. `sph_harm`, and to change the behavior based on a keyword-only flag. I don't think there is anything inherently wrong with this approach, but @izaid has pointed out that it leads to data-dependent output shapes, which the array API standard recommends avoiding because they can cause problems for array libraries that build compute graphs, such as Dask and JAX. This settles the tie in my mind between two API options for which I have no real preference.
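As a rough illustration of the concern, the sketch below uses a hypothetical one-parameter function `legendre` with an invented keyword-only flag `return_all`. Neither name comes from SciPy or from the proposal. The point is only that the kind of result depends on the value of the flag, not on the other arguments.

```python
def legendre(n, x, *, return_all=False):
    """Hypothetical single-function API: the flag changes the result's shape."""
    table = [1.0, x][: n + 1]  # P_0(x), P_1(x)
    for k in range(1, n):
        table.append(((2 * k + 1) * x * table[k] - k * table[k - 1]) / (k + 1))
    # Scalar result without the flag, length n + 1 table with it.
    return table if return_all else table[-1]


# The two-function alternative: each name has one fixed kind of output.
def legendre_single(n, x):
    return legendre(n, x)


def legendre_all(n, x):
    return legendre(n, x, return_all=True)


single = legendre(3, 0.5)
table = legendre(3, 0.5, return_all=True)

print(type(single).__name__, type(table).__name__, len(table))
```

With two separately named functions, a caller (or a library tracing the call) can tell what kind of output to expect from the function name alone.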

Feel free to post in gh-20448 if you’re interested in joining the discussion!