
CUDA Out of Memory during 3d_fullres inference on RTX 5090 (32 GB VRAM, Windows/WDDM) #2963

@jdmerkur

Description


First of all, thank you for all the work put into this segmentation model and for making it publicly accessible. I really appreciate everything that has gone into nnU-Net.


We encounter reproducible CUDA out-of-memory (OOM) errors during 3d_fullres inference in nnU-Net v2 despite using an RTX 5090 with ~32 GB VRAM. Inference is run conservatively (single fold, no TTA, minimal workers).

One important characteristic of our task is a large number of segmentation labels, which may significantly increase memory usage during softmax, argmax, and resampling/export of multi-class probability maps. The same dataset runs successfully with 3d_lowres.
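To illustrate why the label count matters, here is a back-of-envelope calculation of the VRAM footprint of a full-resolution multi-class probability map. The volume shape and class count are hypothetical placeholders, not taken from our dataset:

```python
def prob_map_bytes(num_classes, shape, bytes_per_voxel=4):
    """Size of a float32 softmax output with one channel per class."""
    voxels = 1
    for dim in shape:
        voxels *= dim
    return num_classes * voxels * bytes_per_voxel

# Hypothetical example: 60 labels on a 512 x 512 x 400 volume, float32.
size_gib = prob_map_bytes(60, (512, 512, 400)) / 1024**3
print(f"{size_gib:.1f} GiB")  # → 23.4 GiB, before argmax/resampling buffers
```

A single probability map of this shape already approaches the card's 32 GB, so any extra copy held during softmax, argmax, or resampling can tip inference over the limit.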

The OOM typically occurs after several cases complete, suggesting memory accumulation or incomplete VRAM release between cases, rather than a single-volume forward-pass failure.
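If memory really is retained between cases, a crude mitigation is to force a cleanup after each predicted case. This is a hypothetical workaround sketch, not part of nnU-Net's own inference loop (`release_cuda_between_cases` is our name); the guarded import keeps it runnable on machines without torch:

```python
import gc

try:
    import torch
except ImportError:  # torch not installed; sketch degrades to a plain gc pass
    torch = None

def release_cuda_between_cases():
    """Encourage VRAM release after each predicted case."""
    gc.collect()  # drop Python-side references to the finished case first
    if torch is not None and torch.cuda.is_available():
        # Return cached allocator blocks to the driver; this does not fix a
        # genuine reference leak, but rules out allocator caching as the cause.
        torch.cuda.empty_cache()
```

If OOMs persist even with this called between cases, that would point at live references (e.g. retained probability maps) rather than allocator caching.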

Environment:

OS: Windows (WDDM)
GPU: NVIDIA GeForce RTX 5090
VRAM: 32607 MiB (~32 GB)
Driver: 576.83
CUDA (driver): 12.9
CUDA (PyTorch): 12.8
Python: 3.13.10 (Anaconda)
PyTorch: 2.8.0+cu128
nnU-Net: nnUNetv2 (latest)

This environment uses Python 3.13.10, which is newer than typical nnU-Net setups. If this is a known limitation, we are happy to test on Python 3.10/3.11 for comparison; however, the observed behavior suggests a memory accumulation issue rather than a single-allocation peak.
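To back up the accumulation hypothesis, we can log allocator usage after each case (e.g. `torch.cuda.memory_allocated()` converted to MiB) and look for upward drift across cases. A minimal heuristic over such readings; all values shown are illustrative, not measured:

```python
def looks_like_accumulation(readings_mib, slack_mib=256):
    """Heuristic over per-case memory readings (MiB), taken after each case
    completes: steady upward drift suggests retained references, while a
    roughly flat series points to a single-volume allocation peak instead."""
    return readings_mib[-1] - readings_mib[0] > slack_mib

print(looks_like_accumulation([4200, 7900, 11800, 15600]))  # drifting upward
print(looks_like_accumulation([4200, 4300, 4250, 4280]))    # roughly flat
```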
