My optical engineering application, over 20 years old and 55,000 statements long, has been ported to dozens of compilers and/or OSs. My current personal setup is an M2 Ultra Mac, and I use MacPorts to build both Gfortran and Flang on it. I’ve had little to no problems with the mature Gfortran. Flang-19 was the first Flang version to run my benchmarks correctly, although twice as slow as Gfortran. Flang-20 versions then attained nearly the same performance as Gfortran, BUT Flang-21 has gone back to twice as slow and now has erratic multithreading (OpenMP). All three Flang versions used the same compiler optimization options: -O3 -ffast-math -fstack-arrays -mcpu=native -fopenmp (just -Ofast with Gfortran). Has anyone else seen this kind of performance gain appear and then get taken away? Is this a Flang or LLVM issue, or is it specific to my OS/CPU?
I think LLVM Flang is still relatively “young” (compared to Gfortran), so it is perhaps not too surprising that its optimization and performance still fluctuate considerably from version to version. That said, I wonder whether the factor of ~2 in speed comes mainly from the “fast math” optimization, and if so, whether the speeds are similar when fast-math is not used. I am interested in the latter because I usually turn off fast-math in my calculations, even in production codes, for several reasons.
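One way to check this would be to build the same benchmark twice per compiler, once with and once without -ffast-math, and compare wall-clock times. Below is a minimal sketch in Python, not a definitive harness. The flags are the ones quoted in the original post; everything else (the executable naming scheme, the `build_commands`, `time_run` and `compare` helpers, and the assumption that the benchmark is a single source file that runs without arguments) is hypothetical.

```python
import subprocess
import time

# Flags quoted in the original post. The second variant only drops -ffast-math.
VARIANTS = {
    "fast-math": ["-O3", "-ffast-math", "-fstack-arrays", "-mcpu=native", "-fopenmp"],
    "no-fast-math": ["-O3", "-fstack-arrays", "-mcpu=native", "-fopenmp"],
}


def build_commands(compilers, source):
    """Return {(compiler, variant): (compile_argv, exe_name)}.

    `compilers` are executable names on PATH (assumption: no path separators).
    """
    commands = {}
    for compiler in compilers:
        for variant, flags in VARIANTS.items():
            exe = f"bench_{compiler}_{variant}"
            argv = [compiler, *flags, source, "-o", exe]
            commands[(compiler, variant)] = (argv, exe)
    return commands


def time_run(exe, repeats=3):
    """Best wall-clock time in seconds over `repeats` runs of ./exe."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run([f"./{exe}"], check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def compare(compilers, source):
    """Compile every (compiler, variant) pair and time the resulting binary."""
    results = {}
    for key, (argv, exe) in build_commands(compilers, source).items():
        subprocess.run(argv, check=True)  # stops here if compilation fails
        results[key] = time_run(exe)
    return results
```

Calling `compare([...], "bench.f90")` with whatever names the installed compilers actually have would give one timing per compiler and variant. If the factor of ~2 disappears in the "no-fast-math" rows, that points at the fast-math path rather than at the compiler as a whole. Taking the best of several runs is a deliberate choice to reduce noise, though it will also hide the erratic multithreading behaviour, so the raw per-run times may be worth keeping too.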
If you have a reproducer for the performance issue or the multithreading issue, then the developers can have a look. You can provide more details in a GitHub issue.