Full support for >4 channel images in imsave?

bethac07 · February 8, 2023, 1:59pm

Hey skimage team,

Apologies if I missed a discussion I should have found elsewhere - I checked here and in the GH, but it’s always possible I just totally missed something.

I was wondering about the future of imsave; I did notice an open discussion issue about possibly deprecating it entirely. Assuming it IS sticking around, I was wondering whether anyone had ever raised the idea of working towards removing the restrictions that it be (M,N,{1,3,4}) - I assume this is probably tricky due to not all possible backends supporting other color dimensionalities, but maybe it could be supported for some backends?

Briefly, as highly-multichannel microscopy gets more and more popular, highly-multichannel use cases (and in some cases, highly-multichannel AND many-plane 5D images) are becoming more common. In CellProfiler, we deploy some tricks for working around the current saving API, but we’re realizing that the way we currently do it will probably involve some rewrites of our image class. We’re totally capable of taking that on, but if there were enthusiasm for and momentum towards changing the underlying skimage behavior, we’d be happy to try to move our efforts there instead - I don’t think we’re the only tool that has butted up against this, and I suspect there may be more in years to come.

LMK what you think!

jni · February 8, 2023, 10:41pm

Hi @bethac07!!!

Well, as you might have gathered from my involvement in that discussion, imho you should be contributing to imageio, not skimage. To the extent that skimage improves here it’ll probably be as a thin wrapper around imageio.

At any rate, I think the iio folks will be grateful to get some more eyeballs on their code… Currently it’s pretty much just @FirefoxMetzger and a bit of Almar Klein keeping that going…

bethac07 · February 8, 2023, 11:15pm

Thanks for the speedy as always response @jni !

It looks like imageio v3 has removed the (M,N,{1,3,4}) restrictions from imwrite - is your suggestion then to just jump ship from using skimage.io for writing and just use that instead? Or will skimage be incorporating those changes too? We already have an imageio reader, so switching for writing is definitely fine too, we’re happy to use what’s good and what works!

jni · February 8, 2023, 11:25pm

Yes. That will continue to work regardless of what we do in skimage.

Maybe/probably, but it could take a long time.

stefanv · February 8, 2023, 11:26pm

Hi Beth,

Here’s the full code for the imread plugin that gets invoked on io.imread():

from functools import wraps
import numpy as np

from imageio.v2 import imread as imageio_imread, imsave

@wraps(imageio_imread)
def imread(*args, **kwargs):
    return np.asarray(imageio_imread(*args, **kwargs))

So, perfectly fine to use it independently or via skimage; the skimage interface is unlikely to change much.

FirefoxMetzger · February 9, 2023, 8:17am

At any rate, I think the iio folks will be grateful to get some more eyeballs on their code… Currently it’s pretty much just @FirefoxMetzger and a bit of Almar Klein keeping that going…

Thanks for the ping @jni. We would indeed be happy about more engagement at the repo level. There is some cool stuff I’d like to support (slowly warming up to the idea of entrypoints), but we currently lack the manpower to turn all of this into reality in a timely fashion.

working towards removing the restrictions that it be (M,N,{1,3,4}) - I assume this is probably tricky due to not all possible backends supporting other color dimensionalities, but maybe it could be supported for some backends?

@bethac07 Which container/format (TIFF, PNG, …) do you have in mind? iio.imwrite(...) has no restriction regarding the shape of the (nd)image to read/write; however, most container formats I am aware of do, and no popular/widely used interchange formats support 5+ interleaved channels.

(One might be tempted to use TIFF, and it’s certainly popular in bioinformatics. While it is able to store a (..., H, W, C) shaped image in shaped or hyperstack flavor and reconstruct it in that shape when read, doing so is foot-shooting galore. It kills read performance and results in comparatively large files that don’t compress well.)

Here’s the full code for the imread plugin that gets invoked on io.imread()

@stefanv Anything in particular holding you back from migrating from imageio.v2 to imageio.v3?

bethac07 · February 9, 2023, 3:03pm

@FirefoxMetzger, I don’t in any way disagree about TIFF’s downsides, but for the vast majority of CellProfiler’s userbase (users with <1,000 images), the interoperability with all existing open source image analysis tools vastly outweighs them.

Right now, we offer jpeg, png, tiff, npy, and h5 outputs - we’re also probably going to add some sort of NGFF output format but it’s not implemented yet. We aren’t planning to offer 5D pngs/jpegs, and we don’t need to do anything fancy to support that in npy or h5 (or presumably any NGFFs), so really all of this is in the context of tiff.

stefanv · February 9, 2023, 4:37pm

Genuine question: is there any advantage to switching?

FirefoxMetzger · February 11, 2023, 5:35am

@bethac07 Oh, sorry, I think I wasn’t very clear. TIFF is the go-to bioinformatics format for a reason and you should definitely use it for most things. What you should avoid, however, is to use TIFF to store interleaved (channel-last) images with more than 3 color channels, i.e., the (H, W, C>4) images you were asking about.

Internally, TIFF stores images in pages, and a single page can only store images with either 1 channel (gray/binary/palette) or 3 channels (RGB/lab/…), i.e., with shapes (H, W) and (H, W, 3). Images with other shapes are broken down into a sequence of multiple pages which get stored alongside some metadata on how to reconstruct the original shape from this sequence of pages. For (H, W, C > 4) images this means creating H pages of shape (W, C > 4), and this is bad for almost every scenario. It will confuse some readers because it looks identical to a (stack, W, H) image, so it’s not great for interchange. It will load much slower than “normal” TIFF files because there are a lot of pages, so it’s not great for processing. Compression works per page in a TIFF file and single pages are small, so the result is a large file and not good for archiving.

ImageIO will create (H, W, C>4) TIFFs if you request that, no problem, but I don’t think they will serve you well, hence my suggestion to not do so.
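One way to act on this advice, assuming each channel can be treated as its own grayscale plane, is to move the channel axis to the front before writing, so the array becomes a (C, H, W) stack. The sketch below uses only NumPy; the image size and the variable names are made up for illustration.

```python
import numpy as np

# Hypothetical 7-channel image in channel-last (interleaved) layout.
height, width, channels = 64, 48, 7
interleaved = np.arange(height * width * channels, dtype=np.uint32).reshape(
    height, width, channels
)

# Move the channel axis to the front: (H, W, C) -> (C, H, W).
# np.moveaxis returns a view, so make the result contiguous before
# handing it to a writer that expects plane-by-plane data.
planar = np.ascontiguousarray(np.moveaxis(interleaved, -1, 0))

# Each entry along the first axis is now one full 2D channel plane.
first_plane = planar[0]
```

The pixel values are unchanged; only the memory layout and axis order differ, so the original array can be recovered with np.moveaxis(planar, 0, -1).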

Genuine question: is there any advantage to switching?

@stefanv I think there are three that stand out when I think of scikit-image:

1. The ability to write nD images rather than being limited to 2D only.
2. Access to v3 plugins. In particular tifffile_v3, which allows better customization by exposing all of tifffile’s kwargs instead of just a select number (of old ones).
3. Better control over plugin and format selection. In v2, selecting a plugin and format uses the same kwarg (format="PNG-PIL") and sometimes isn’t possible at all (the plugin reads multiple extensions, but only registers format="plugin-name"). In v3 these are separate arguments: extension=".png" and plugin="pillow". This means we can set more fine-grained defaults, which give you better performance.

So overall, you don’t lose anything but gain performance and a less complex mental model for the user. They only need to remember imread and imwrite to do 95% of their IO tasks, and they don’t need to know which plugins are available if they want to write a specific format, i.e., extension=".png" will create a PNG using the best plugin available on the user’s system. (Contrast that with format="PNG-PIL", where users need to know that pillow is installed and also that the pillow plugin specifically registers itself using various FMT-PIL names.)

cgohlke · February 12, 2023, 8:58pm

TIFF supports up to 65535 samples per pixel. Refer to the ExtraSamples definition in TIFF6.

E.g., using tifffile:

import tifffile

tifffile.imwrite('test.tif', shape=(32, 32, 65535), dtype='uint8', planarconfig='contig')
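A smaller round-trip along the same lines, as a sketch: it writes an 8-channel, channel-last array with contiguous samples and reads it back to check the shape. The array contents, the file name, and the explicit photometric="minisblack" setting are choices made for this example, not part of the post above.

```python
import os
import tempfile

import numpy as np
import tifffile

# Hypothetical 8-channel, channel-last test image with arbitrary values.
rng = np.random.default_rng(0)
image = rng.integers(0, 255, size=(32, 32, 8), dtype=np.uint8)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "multichannel.tif")
    # planarconfig="contig" tells tifffile the last axis holds the samples
    # of each pixel; channels beyond the first are stored as extra samples.
    tifffile.imwrite(path, image, photometric="minisblack", planarconfig="contig")

    # Inspect the first page, then read the pixel data back.
    with tifffile.TiffFile(path) as tif:
        samples_per_pixel = tif.pages[0].samplesperpixel
    restored = tifffile.imread(path)
```

Whether other TIFF readers handle such files gracefully is a separate question; this only shows what tifffile itself writes and reads.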