We’ve done a comparative benchmark (speed and memory usage) of Smil and scikit-image.
Smil is a library dedicated to mathematical morphology, so the comparison covers only that area.
We’ve been working on mathematical morphology for more than 50 years: the discipline was created here at our research department in the sixties. Smil inherits the experience of the earlier libraries and software we’ve been writing since the ’70s.
In short, Smil can be orders of magnitude faster than scikit-image (hundreds or even thousands of times) on some operations, thanks to parallelization and vectorization (SIMD), depending on the computer architecture.
Smil doesn’t replace scikit-image, but it may be a good complement when speed is important.
Do you think that it would make sense to incorporate Smil into skimage, or merely use it alongside?
We have recently integrated Pythran support, which will allow us to write fast SIMD kernels; but we don’t necessarily need to maintain our own versions.
(I am out of office at the moment, so may be slow to respond.)
Thank you for your reply. Happy to share some ideas.
At our lab we use both, often Smil together with skImage, since there are lots of functions in the Sk-??? that we don’t have, and we don’t want to reinvent the wheel. So we have functions to convert images from Smil to NumPy and back.
But some people, even at our lab, ask for easier integration with skImage, without the data type conversions. One example is using Smil with Keras and, of course, skImage.
Is better integration possible? (I mean transparent use of both.) The answer is surely yes. Some areas to work on:
Smil was created as a C++ library, and we added the Python interface thanks to SWIG. skImage aims to be Python only. This is a different design concept. So the real question is whether to replace SWIG or just use a wrapper.
Smil doesn’t support images with a float data type. There are two main reasons: mathematical morphology doesn’t need them, and you can’t implement hierarchical queue algorithms if the data type is float (infinite number of levels). From time to time people ask for float images in Smil. Implementing float data types is not impossible, but it needs some code review.
Some time ago I was asked about the feasibility of using NumPy as the native data type for Smil. The answer was no. And with uarray coming, I think I should wait and see what happens in this direction.
Also, we use both SIMD and OpenMP threads, whenever possible.
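To illustrate the point above about hierarchical queues and float data, here is a minimal sketch of a bucket-based queue with one FIFO per integer gray level. It is not Smil's implementation; the class name `HierarchicalQueue` and its methods are hypothetical. It only shows why the structure relies on a finite, known number of levels.

```python
from collections import deque


class HierarchicalQueue:
    """Hypothetical priority queue with one FIFO bucket per integer gray level."""

    def __init__(self, n_levels=256):
        # A finite, known number of levels is what makes one bucket per level possible.
        self.buckets = [deque() for _ in range(n_levels)]
        # Lowest level that may be non-empty (n_levels means "nothing queued yet").
        self.current = n_levels

    def push(self, level, item):
        self.buckets[level].append(item)
        if level < self.current:
            self.current = level

    def pop(self):
        # Skip forward to the next non-empty bucket.
        while self.current < len(self.buckets) and not self.buckets[self.current]:
            self.current += 1
        if self.current == len(self.buckets):
            raise IndexError("pop from an empty hierarchical queue")
        return self.current, self.buckets[self.current].popleft()


hq = HierarchicalQueue(n_levels=256)
hq.push(200, "a")
hq.push(3, "b")
hq.push(3, "c")
order = [hq.pop() for _ in range(3)]
print(order)
```

Items come out by increasing level, and in insertion order within a level. With float pixel values there is no fixed set of levels to allocate buckets for, so an algorithm built on this structure would need a different kind of priority queue.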
That said, we use both, and if something can be done to improve their use together, whether incorporated into skimage or not, it’s a good idea to me.
Can you elaborate on this? In what ways is the image data structure in SMIL fundamentally different from a NumPy array?
Yes, my instinct with this discussion is that we should work on improving interoperability and dispatch between our libraries, rather than incorporating SMIL directly into or as a dependency of scikit-image. I’m hoping that for skimage2 we can provide a more pluggable API, so that it would be easy for SMIL (or a wrapper) to declare: “this function is equivalent to the scikit-image ‘erode’ function,” and so on.
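As a rough sketch of what such a declaration could look like, assuming a simple name-based registry: none of the names below (`implements`, `dispatch`, the `"reference"` and `"fast"` backends) exist in scikit-image or Smil, and the toy 1-D erosion only stands in for real implementations.

```python
# Hypothetical registry mapping a function name to the implementations
# that different backends have declared for it.
_BACKENDS = {}


def implements(skimage_name, backend):
    """Decorator a backend could use to declare an equivalent function."""
    def decorator(func):
        _BACKENDS.setdefault(skimage_name, {})[backend] = func
        return func
    return decorator


def dispatch(skimage_name, *args, backend="reference", **kwargs):
    """Call the implementation registered for `skimage_name` under `backend`."""
    try:
        func = _BACKENDS[skimage_name][backend]
    except KeyError:
        raise LookupError(
            f"no {backend!r} implementation of {skimage_name!r}"
        ) from None
    return func(*args, **kwargs)


@implements("erode", backend="reference")
def erode_reference(row):
    # Toy 1-D erosion with a 3-sample window (edges use the available neighbours).
    return [min(row[max(i - 1, 0):i + 2]) for i in range(len(row))]


@implements("erode", backend="fast")
def erode_fast(row):
    # Stand-in for a wrapper around an optimized library call.
    return erode_reference(row)


result = dispatch("erode", [5, 2, 7, 9], backend="fast")
print(result)
```

The caller only names the operation and a backend, so an optimized library can be swapped in without changing user code.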
@jni
Can you elaborate on this? In what ways is the image data structure in SMIL fundamentally different from a NumPy array?
A Smil image is just a C++ class (template) with some metadata, methods and an array of values…
There are some tricks in how it is handled. For some operations, the image can be treated as if it were on a hexagonal grid, which is not natural with NumPy arrays.
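To show why a hexagonal grid sits awkwardly on a rectangular array, here is a small sketch of neighbour lookup where the offsets depend on row parity. It assumes an "odd-r" layout (odd rows shifted half a pixel to the right); Smil's actual convention may differ, and `hex_neighbors` is a hypothetical helper, not a Smil or NumPy function.

```python
def hex_neighbors(y, x, shape):
    """Return the in-bounds hexagonal neighbours of pixel (y, x).

    Assumption: "odd-r" layout, i.e. odd rows are shifted half a pixel right.
    """
    if y % 2 == 0:
        offsets = [(0, -1), (0, 1), (-1, -1), (-1, 0), (1, -1), (1, 0)]
    else:
        offsets = [(0, -1), (0, 1), (-1, 0), (-1, 1), (1, 0), (1, 1)]
    height, width = shape
    return [
        (y + dy, x + dx)
        for dy, dx in offsets
        if 0 <= y + dy < height and 0 <= x + dx < width
    ]


print(hex_neighbors(2, 2, (5, 5)))  # interior pixel on an even row
print(hex_neighbors(1, 1, (5, 5)))  # interior pixel on an odd row
```

Each interior pixel has six neighbours instead of four or eight, and the set of array offsets changes from one row to the next, so a single fixed structuring-element footprint does not describe it.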
For the moment, my goal is to make it more visible (get more users) and easier to use with skImage.
Incorporate it into skImage? Why not, but it’s secondary and, being realistic, maybe not a short-term goal. But I can keep this in mind and prepare for it, in case it becomes interesting.
The array is what I’m interested in — if we can do zero-copy views of NumPy arrays as Smil images, and of Smil images as NumPy arrays, then that goes a heck of a long way towards making interop between the two libraries “pleasant”. It’s not great if you have to pay a copy cost every time you run Smil on NumPy inputs.
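A minimal NumPy-only sketch of what "zero-copy view" means here: a `bytearray` stands in for pixel memory owned by another library (it is not a Smil image), and `np.frombuffer` wraps that memory without copying it.

```python
import numpy as np

# Stand-in for pixel memory owned by another library (here a plain bytearray).
height, width = 4, 6
foreign_buffer = bytearray(height * width)

# Wrap the existing memory; no pixel data is copied.
view = np.frombuffer(foreign_buffer, dtype=np.uint8).reshape(height, width)

# A write through the NumPy view is visible in the original buffer...
view[1, 2] = 255
written_through = foreign_buffer[1 * width + 2]

# ...whereas an explicit copy is detached from it.
copy = view.copy()
copy[0, 0] = 7

print(written_through, foreign_buffer[0], np.shares_memory(view, copy))
```

The view costs a few bytes of metadata regardless of image size, while the copy costs as much memory as the image itself.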
You’ve hit the point. In interesting applications, we have big 3D images (e.g. 1000 x 1000 x 1000). This is one reason why we write “erode(imIn, imOut, se)” and not “imOut = erode(imIn, se)”.
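The same calling style can be sketched in NumPy with a caller-supplied output array. The function `erode_into` below is hypothetical, not Smil's `erode`, and uses a fixed horizontal 3-pixel window rather than a structuring element argument. At one byte per pixel, a 1000 x 1000 x 1000 image is about 1 GB, so allocating a fresh result on every call adds up quickly.

```python
import numpy as np


def erode_into(im_in, im_out):
    """Horizontal 3-pixel erosion that writes into a caller-supplied array.

    im_in and im_out must be distinct arrays of the same shape.
    """
    im_out[...] = im_in
    # Minimum with the left neighbour, then with the right neighbour,
    # written directly into im_out.
    np.minimum(im_out[:, 1:], im_in[:, :-1], out=im_out[:, 1:])
    np.minimum(im_out[:, :-1], im_in[:, 1:], out=im_out[:, :-1])


im_in = np.array([[5, 2, 7, 9],
                  [4, 4, 1, 8]], dtype=np.uint8)
im_out = np.empty_like(im_in)  # allocated once, reusable across calls

erode_into(im_in, im_out)
print(im_out)
```

Because the caller owns `im_out`, the same buffer can be reused across a whole chain of operations instead of being reallocated at each step.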
You’re talking about passing data between Smil and NumPy. From Smil to NumPy, you can get a NumPy pointer from Smil images. But I have noticed that most of the time it’s faster to copy than to use pointers directly. This is an area to improve.