I’ve been using the skimage package for a while now and recently began contributing to the project. Maintainers on my PR (@lagru, @michaelbratsch) have shown interest in starting a general discussion about defining a coordinate reference system (CRS). I hope it’s alright to kick off this conversation here, and I look forward to reaching a conclusion so we can implement the necessary changes.
While adding rotation functionality to a 3D draw function (e.g., drawing ellipsoids), I realized that skimage lacks a clearly defined CRS. I initially considered using napari’s CRS but was informed by Juan that it uses a left-handed coordinate system (napari#4633). Alternatively, we might use SciPy’s CRS since we already utilize many of its functions. The rotation module suggests it may align with matplotlib’s definition.
Why Is CRS Important for Rotation?
In 2D, defining rotation is straightforward with one angle and one direction convention, typically counterclockwise as positive. In 3D, the situation is more complex, involving three angles, the reference system for these angles (body or independent axis), rotation type (active or passive), and rotation order (see Euler angles). All these factors depend on the chosen CRS.
CRS Options
Here are some options for the CRS:
Matplotlib/SciPy Convention:
(x, y, z) ↔ (horizontal right, vertical up, out of plane).
This is different from the (0, 1, 2)/(plane, row, col) axis system used commonly in skimage.
Current (0, 1, 2)/(plane, row, col) Axis System:
(x, y, z) ↔ (vertical down, horizontal right, out of plane).
Compared to the present napari axes display, this option changes the direction of the 0/plane-axis to come out of the plane, making it a right-handed CRS. (right now equivalent to sending scale=(-1,1,1) to flip the 0th axis order in napari.
Please suggest any other viable options or share your views on the ones above. Establishing a clear CRS is crucial for consistency and will directly impact the PRs involving 3D spatial transforms, like #7479 and #7503. Hopefully, napari can use this same choice of CRS, which “solves” its left handed CRS!
Finally, this CRS should be clearly documented, similar to how data types are documented.
It will come as no surprise that I favour option 2 , but also, maybe, option 3: option 2 is the default CRS, but users can configure the CRS for reletvant operations.
I’ll also note that axis names are completely arbitrary, only the handedness matters. See this post by Zachary Pincus (and the surrounding context).
Based on my understanding (these things get so confusing!), the issue here for defining rotations is different than just defining access names for axis dimensions. The direction of the axis needs to be defined too. This in turn effects the handedness of CRS.
For pedagogical purpose, it will be good to have an opinionated default (preferably which makes the most sense - option 2), with the user having the option to change it. Then, all of this should be documented in user docs.
I found a case where axes names might be important: the user specifying the order of rotation. Right now, scipy allows passing a string to its euler rotation function as a string of rotation axis (e.g. ‘XYZ’). If we are to add a similar functionality, we need to ask this from the user, but the axis must be named for this to make sense. (note- rotation is not commutative, so the order matters)
Note that “arbitrary naming” and “no naming” are not the same thing. My suggestion is that we provide the ability to provide axis names, and otherwise use axis ordering by default (and map the axis order to the SciPy names internally). e.g. skimage.transform.rotate(array, angles, axes=(0, 1, 0)).
I seem to remember some discussion with @stefanv about the n-dimensional generalisation — you don’t rotate around an axis, you rotate from one vector to another vector. (There is always a 2D plane of rotation, and a direction of rotation.) So, maybe the above expression should in general be axes=((1, 2), (0, 2), (1, 2)). We can of course have the simpler convention for 3D, which we then translate to the general form (which has the advantage that it makes it easier to figure out where to put things in the rotation matrix).
My suggestion is that we provide the ability to provide axis names, and otherwise use axis ordering by default (and map the axis order to the SciPy names internally). e.g. skimage.transform.rotate(array, angles, axes=(0, 1, 0)).
I agree! This will still require the coordinate system to be defined. Which directions does axis 0, 1, 2 point? And if the user is providing names - what corresponds to x, y, z etc. Hence the options in the original post. I will just go ahead and implement option 2, as everyone here favors that!
I seem to remember some discussion with @stefanv about the n-dimensional generalisation — you don’t rotate around an axis, you rotate from one vector to another vector. (There is always a 2D plane of rotation, and a direction of rotation.)
I don’t know if it is worth it to generalize rotation to n-dim for an imaging package, unless someone wants to rotate a 5D = (T, C) + 3D image matrix!
The 3D situation is a tad confusing to me. We use XY / row, cols in 2D, but then ZXY / pln, row, col in 3D. Intuitively, most users would probably expect XYZ or row, col, pln? How does this intuition map to the handed-ness, or does it at all?
@lagru asked whether we could just change the docstrings, without ill effect, and I suspect that might be fine. And the way it should be, I think is M, N[, P], and that the user guide is wrong, at least as far as the intuition goes that a 3D volume is a stack of 2D images.
We probably want to update the crash course to use the more generic (M, N[, P]) which is consistent with most of our functions. In this specific case it may refer to (row, col) but if all axis get the same treatment we shouldn’t make those assumptions in our API. And in our guide we should just state something like “in this case M refers to the row” or something like that.
I think a lot of the confusion is from P being the same letter that “plane” starts with. I’m starting to think Zacchary’s “N0, N1[, N2]” is the best option because it comes with the least mental baggage. Except, as I just told @stefanv in a DM, that also doesn’t match with NumPy broadcasting, which pads on the left. So the only way that (a) does not come with mental baggage, (b) generalises to any dimension (2, 3, and beyond), and (c) matches broadcasting conventions is [[[n_{-4},] n_{-3},] n_{-2}, n_{-1}]. (til there’s no (obvious?) latex formatting in this instance. Edit: there is! wrap inline latex expressions in single $, wrap multi-line expressions in double $s.)
Then, handedness is specified by the documentation and nothing else.
If we go with ZYX, then some folks will expect this to be a mirror of XYZ and thus left-handed. but even then, we could just specify that it’s right-handed.
Either way, I feel rather strongly that the “optional” axes / dim sizes should be in the front, because that is what matches np.stack defaults and the NumPy broadcasting rules.