What
Adds an AMD ROCm variant of the Docker setup alongside the NVIDIA/Blackwell image, so Unsloth runs in a container on AMD GPUs (RDNA2/3/4 and CDNA/Instinct). It mirrors the structure of the CUDA image so the two stay symmetric. Branch: LeoBorcherding/unsloth@feature/docker-rocm-support
Why / relationship to existing work
#5748 (Add Docker build for Blackwell that runs on any NVIDIA GPU host, branch docker-blackwell-build) added the CUDA side. This is the AMD counterpart: same docker/ layout, same build.sh/run.sh entry points, same smoke-test shape.
#3324 ([ROCm] add rocm dockerfile by @billishyahao) was an earlier ROCm Dockerfile attempt that was closed; there is also the dh/recover-3324-rocm-dockerfile branch. This PR picks that thread back up with a maintained, CI-published image and the AMD-specific gotchas baked in.
What's included
docker/Dockerfile.rocm: ROCm torch wheels + the bitsandbytes pre-release wheel that carries the 4-bit decode fix (bitsandbytes <= 0.49.2 produces NaNs at decode on AMD). Installs the [huggingface] extra and uses the SDPA fallback (no xformers on ROCm).
docker/Dockerfile.studio-rocm: Studio variant layered on the ROCm base.
docker/entrypoint-rocm.sh: preflight (/dev/kfd reachable, rocm-smi, HIP torch, gfx-arch check with HSA_OVERRIDE hints).
docker/smoke_test_rocm.py: ROCm smoke test incl. a 5-step LoRA train.
docker/test_locally-rocm.sh: end-to-end local build + smoke + notebook check.
.github/workflows/docker-publish-rocm.yml: GPU-free amd64 build + publish to Docker Hub.
build.sh / run.sh gain a --rocm flag; .dockerignore allowlists the new files.
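For reviewers who want a feel for the preflight idea behind docker/entrypoint-rocm.sh, here is a minimal Python sketch of that kind of check. It is not the script itself. The function names, the `OVERRIDE_HINTS` table, and the specific gfx-to-version pairs are illustrative assumptions based on commonly used `HSA_OVERRIDE_GFX_VERSION` values. The HIP torch check is left out so the sketch stays standard-library only.

```python
import os
import shutil

# Assumption: an illustrative table mapping consumer gfx targets to the
# HSA_OVERRIDE_GFX_VERSION value of a closely related target. The real
# entrypoint may use a different table or different logic.
OVERRIDE_HINTS = {
    "gfx1031": "10.3.0",
    "gfx1032": "10.3.0",
    "gfx1101": "11.0.0",
    "gfx1102": "11.0.0",
}


def override_hint(gfx_arch, env):
    """Return a suggested override value, or None if no hint applies."""
    if "HSA_OVERRIDE_GFX_VERSION" in env:
        return None  # the user already set an override; do not second-guess it
    return OVERRIDE_HINTS.get(gfx_arch)


def preflight(gfx_arch=None, kfd_path="/dev/kfd"):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    # The KFD device node has to be passed into the container by the host.
    if not os.access(kfd_path, os.R_OK | os.W_OK):
        problems.append(f"{kfd_path} is not accessible inside the container")
    # rocm-smi is only looked up on PATH here, not executed.
    if shutil.which("rocm-smi") is None:
        problems.append("rocm-smi not found on PATH")
    hint = override_hint(gfx_arch, os.environ)
    if hint is not None:
        problems.append(f"{gfx_arch} may need HSA_OVERRIDE_GFX_VERSION={hint}")
    return problems
```

Returning a list of problems instead of exiting on the first failure lets the entrypoint print every issue in one pass, which is usually more helpful when a container is started with the wrong device flags.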
Validation so far
Base RDNA4 image (ROCM_VERSION=7.2.4, torch rocm7.2) builds cleanly on a GPU-free host and was verified end to end at build time: torch 2.12.0+rocm7.2, HIP 7.2.53211, bitsandbytes 0.50.0.dev0 imports cleanly, all required packages present.
Not yet built: the default rocm6.2 image and the Studio image (Dockerfile.studio-rocm).
Pending: the GPU smoke test (5-step LoRA) on real AMD hardware. Needs an AMD GPU runner or a cloud instance; the workflow's self-hosted amd-gpu smoke job can also cover it.
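A build-time check of this kind can run without a GPU because importing a package and reading its version string does not touch the device. The sketch below shows one way to do it. The helper names are hypothetical, and the module list in the usage note is an assumption, not the list the image actually verifies.

```python
import importlib


def check_imports(modules):
    """Import each module and return {name: version string or failure text}."""
    report = {}
    for name in modules:
        try:
            mod = importlib.import_module(name)
        except Exception as exc:  # ImportError or a native-library load error
            report[name] = f"FAILED: {type(exc).__name__}: {exc}"
            continue
        # Not every module exposes __version__, so fall back to "unknown".
        report[name] = str(getattr(mod, "__version__", "unknown"))
    return report


def failed(report):
    """Return the names of modules whose import did not succeed."""
    return [name for name, status in report.items() if status.startswith("FAILED")]
```

A build step could call `check_imports(["torch", "bitsandbytes"])` and fail the build when `failed(...)` is non-empty. On a ROCm wheel, `torch.version.hip` can also be read at this point without a GPU present.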