Aarch64 support #569
Comments
We, at the
|
While packaging torchcodec on conda-forge, it was trivial to also add the aarch64 build. However, it seems that both C++ and Python tests are failing on linux-aarch64. I initially thought this may be a problem of conda-forge's pytorch package on Log of test failures: https://github.com/traversaro/torchcodec-test-debug/actions/runs/14695881468/job/41237423866 . Output of the C++ test failing on linux-aarch64
|
Thank you for the logs @traversaro. Most of the errors look like
which don't seem too bad. We do exact checks on Linux x86, but we already have to relax the tolerances on macOS and on CUDA: Lines 38 to 51 in ed13ac5
So relaxing the tolerances on aarch64 seems reasonable. I'm not sure how to best do that robustly though, especially if those new architectures are supported out-of-core. This is something we may have to enable soon for XPU support as well: CC @scotts @dvrogozh for ideas? |
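A minimal sketch of what per-platform tolerance selection could look like, assuming the strict check is kept only for Linux on x86_64 and everything else gets a relaxed absolute tolerance. The names `pick_atol` and `frames_close` and the tolerance values are hypothetical and are not taken from torchcodec's test/utils.py; frames are modelled as flat sequences of pixel values to keep the example dependency-free.

```python
import platform
import sys

# Hypothetical tolerances, for illustration only.
STRICT_ATOL = 0   # exact match expected
RELAXED_ATOL = 3  # allow small per-pixel differences


def pick_atol(sys_platform=None, machine=None):
    """Return the absolute tolerance to use for frame comparisons.

    Defaults to the current interpreter's platform; the arguments exist
    so the selection logic can be exercised without the real hardware.
    """
    sys_platform = sys.platform if sys_platform is None else sys_platform
    machine = platform.machine() if machine is None else machine
    is_linux_x86_64 = sys_platform.startswith("linux") and machine.lower() in (
        "x86_64",
        "amd64",
    )
    return STRICT_ATOL if is_linux_x86_64 else RELAXED_ATOL


def frames_close(actual, expected, atol):
    """True if every pixel differs by at most `atol`."""
    if len(actual) != len(expected):
        return False
    return all(abs(a - e) <= atol for a, e in zip(actual, expected))
```

With this shape, a new architecture falls into the relaxed branch by default, so it does not need to be listed explicitly.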
Thanks, that is clear! So if I understood correctly the C++ tests are only expected to pass on Linux on x86_64? |
They are ideally meant to pass on all platforms, but if they don't, it's safe to disable them. Most of the C++ tests are legacy at this point anyway. As long as the Python tests pass (possibly slightly increasing the tolerance), an architecture can be considered safe to release. |
@traversaro, thanks for digging into this! Did you need to make any changes outside of testing? |
No, with just the test-related patch suggested by @NicolasHug (https://github.com/conda-forge/torchcodec-feedstock/blob/fe701d8f1f535381f4779310bc158584b8420c7c/recipe/use_strict_threshold_only_for_linux_x86_64.patch) all the Python tests run fine on CI on linux-aarch64: conda-forge/torchcodec-feedstock#12 . |
Just relaxing the tolerance won't actually work. The problem is with color conversion algorithms, which may differ significantly per implementation and per platform, so you can't expect bit-exact matches. That's not something you can overcome with the current threshold, which expects differences in no more than N pixels; the differences can be more substantial. I ran into this while working on XPU support for torchcodec. This is a known problem in the media domain, and it is usually solved by comparing frames with metrics such as PSNR, SSIM, etc. That was part of my initial #558, see the change in test/utils.py:
As you can see, I did that for XPU, but we can reuse the idea and code for other cases. I will extract this into a separate PR today, but I will need help testing it on aarch64 since I don't have this platform at hand. |
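For readers unfamiliar with metric-based comparison, here is a minimal PSNR sketch. It is not the code from #558: the function names, the flat-sequence frame representation, and the 30 dB threshold are all assumptions made for illustration, and a real test would operate on decoded frame tensors.

```python
import math


def psnr(actual, expected, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(actual) != len(expected) or len(actual) == 0:
        raise ValueError("frames must be non-empty and have the same size")
    # Mean squared error over all pixels.
    mse = sum((a - e) ** 2 for a, e in zip(actual, expected)) / len(actual)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value**2 / mse)


def assert_frames_similar(actual, expected, min_psnr_db=30.0):
    """Fail if the frames are less similar than the (hypothetical) threshold."""
    value = psnr(actual, expected)
    assert value >= min_psnr_db, f"PSNR {value:.2f} dB is below {min_psnr_db} dB"
```

Unlike a per-pixel tolerance, this accepts a few large local differences as long as the frame as a whole stays close to the reference, which is the property needed when color conversion differs between platforms.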
Opening this issue to track requests for support of aarch64 (Linux ARM). I think our build/test/release infra should be able to support those fairly naturally. We'd need to build and push more non-GPL FFmpeg binaries as well.