Very nice summary Lucas, thank you for this.
I’d open with “we are slowly starting to discuss the same points”.
I think a bunch of us need to at least attempt to translate some Fortran code, to understand what is what and to get some real experience with it.
Here are a few facts, things that are objectively true about this discussion, that I can verify or back with first-hand evidence from the current translation work:
- F77 is very bad. Zero lines of F77 should be in. F77 needs to die. Legacy or otherwise, it does not matter. It has no place in modern codebases. Way too many footguns for very little benefit. It is too verbose due to the lack of data structures, and it encourages you to write bad code. I don’t care about some folks who can’t move on. Not our problem.
- Not all old code is bug-free. Not all old code is battle-tested.
- The more we translate, the easier it gets. There is no Fortran-specific benefit to code being in Fortran. Performance is identical. Fortran has array syntax, which is superior to pointer math. That’s it.
- Fortran code can’t be further optimized, offloaded to other hosts, or implemented with SIMD tricks (within our capacity).
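To make the array syntax versus pointer math point concrete, here is a minimal sketch of the index bookkeeping a translation has to spell out by hand. It assumes a column-major flat buffer with leading dimension `lda` and 1-based Fortran indices; the names `fortran_offset` and `col_major_matvec` are made up for illustration and are not from SciPy or any translated routine.

```python
def fortran_offset(i, j, lda):
    """Flat 0-based offset of the Fortran element A(i, j).

    Assumes 1-based indices and column-major storage with leading
    dimension lda (hypothetical helper, for illustration only).
    """
    return (i - 1) + (j - 1) * lda


def col_major_matvec(a, m, n, lda, x):
    """Compute y = A @ x for an m-by-n matrix stored in the flat buffer a."""
    y = [0.0] * m
    for j in range(1, n + 1):      # Fortran-style 1-based column loop
        for i in range(1, m + 1):  # rows are contiguous in memory
            y[i - 1] += a[fortran_offset(i, j, lda)] * x[j - 1]
    return y


# A = [[1, 2], [3, 4], [5, 6]] stored column by column, with lda = 3
a = [1.0, 3.0, 5.0, 2.0, 4.0, 6.0]
y = col_major_matvec(a, m=3, n=2, lda=3, x=[1.0, 1.0])
```

In Fortran the loop body is just `y(i) = y(i) + A(i,j)*x(j)`; the offset arithmetic is the part that the array syntax hides.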
Here are some things I learned about “modern Fortran”:
- “Modern Fortran” means nothing and multiple things at once. It’s an umbrella term Fortran folks use to distance themselves from F77. It can be F95, 03, 08, 23, whatever. Each version adds more weird syntax, and the language is driven by folks who are HPC-oriented. The committee does not seek more adoption but more specialization.
- There is nothing Fortran-specific that makes Fortran code better. All languages have converged on the same point. It’s the syntax and ergonomics that make the difference.
- Almost everything in a modern Fortran codebase is new, including the compilers. Flang is new, LFortran is new, gfortran stays new because it keeps following the standards, and ifx is new compared to ifort. This is not the case with C, C++, Python and so on. So compiler stability is still an issue for now, though it will get better over time.
We don’t need to discuss these points anymore, so let’s leave them at that. I am removing F77 slowly, so we will be left with LAPACK and PRIMA as the only Fortran. For the remaining parts of this discussion, you can replace Fortran with Swift and everything stays identical. Do we want Swift in our codebase? It is relatively new but many people use it, it is maintained, and so on.
My thinking is the following: if PRIMA was not there, I would translate COBYLA in its buggy state. It would have been the same, but at least we would not need to compile anything in Fortran. Now that PRIMA is here, we are tempted to add more because it looks like a free lunch. I would argue otherwise. It is a future cost.
I don’t find this argument strong enough to repopulate SciPy with new Fortran and miss this opportunity to become Fortran-free. Because then, finally, we can work on what to do with BLAS and LAPACK as a standalone linear algebra issue. I am more invested in that subject than in SciPy’s F77, because it is every language’s issue and not just Python’s.
We got PRIMA in and that’s fine. It’s not too much code (much less than what we have already translated) and it can be translated. We already track upstream codebases, so it would not be the first one, in case tracking the code in two places is an issue.
I recommend that folks translate at least one function of their choosing from the Fortran codebase, to check how hard this work is and what it entails. Even just in their own time, without committing it to any codebase. There is so much mythical thinking involved until you actually do it. And it is not really different from regular Python work. It is just very annoying.