Releasing (or not) 32-bit Windows wheels

rgommers · June 5, 2022, 2:27pm

Thanks for the data points @mckib2 and @steppi, very interesting. And thanks @certik for weighing in. I’m following LFortran development from a distance, with great interest!

This is indeed a key part, and I’m afraid it wouldn’t be that quick. Redoing BLAS/LAPACK support in particular is challenging, and we’d need to keep supporting cython_blas and cython_lapack as a service to the rest of the ecosystem. That said, while not “quick”, I do think it’s a feasible task in principle.

They don’t work (well), unfortunately. @steppi’s qualification of specfun applies to many other vendored libraries, like ARPACK, FITPACK, QUADPACK, interpolative, and so on. They’re not well-tested libraries with a few bugs - many of them deserve to be thrown away and completely replaced; they have segfaults and correctness issues that we just aren’t able to address, and in practice it is impossible to extend them with new code.

Looking at experiences over the past few years, this is clearly not true:

Fortran: few hard bugs get fixed, and zero new code gets written. The only significant addition of Fortran code in the past 5 (or more?) years was PROPACK, I think, and as @mckib2 describes, it was so frustrating that he started a port to C++. We typically also don’t have more than 1-2 maintainers who even want to review Fortran PRs.
Several newer maintainers and contributors are quite enthusiastic about C++. New features or rewrites of code focusing on performance do happen (e.g., scipy.spatial.cKDTree, scipy.fft, scipy.spatial.distance, scipy.special.logit/expit, earlier also the scipy.sparse matrix data structures). Some of the most significant new functionality we added recently is based on Boost and HiGHS, both high-quality C++ libraries. And it’s a lot easier to find new folks with C++ skills willing to work on high-performance numerical Python libraries than it is to find folks with Fortran skills. Finally, for existing maintainers who don’t know a language, learning C++ potentially makes sense from a career perspective. Learning how to deal with old Fortran code, not so much.

C++ has its issues and can be complex, but the reality is that we attract new talented maintainers who want to use it. For Fortran, it’s zero. And those folks who are enthusiastic about Fortran are talking about modern Fortran (!= F77/F90), which can indeed be nice - but only has a good story for HPC / on Linux. Fortran on Windows is just never-ending pain, and responsible for our worst packaging issues. It was also the worst problem for getting things to work on macOS M1. In terms of negative externalities, it is also a problem for Pyodide, for example, while something like Cython or Pythran isn’t, even though those tools are more niche overall (because they are transpilers, they basically work wherever C/C++ work).

There’s a whole bunch of reasons pro/con for any other language too:

C is the most portable and the simplest to integrate, but limited by the lack of templating, and not many people write new code in it,
C++ is the most popular with people who like writing native code and is feature-rich, but is harder to understand than C,
Cython is the most approachable to write new code in for the largest number of maintainers and is nice for binding generation too, but is a pain for build system integration, creates binaries that are too large, and there are long-term maintenance worries because it relies on 1-2 maintainers only,
Modern Fortran: nice language for array-based algorithms and fast, but no good support for interfacing with Python, still niche, lack of compilers, and we lack maintainers/reviewers.

For Fortran as we have it in SciPy though (F77 mostly), there are just no pros at all beyond “we already have the code”, and many cons.

Yes, that is a problem. It would depend on the component whether a line-by-line translation (or auto-translation) would make sense, or a rewrite from scratch.

Yes, that would be nice if anyone is looking for a potentially high-impact project.