
CUDA Out of Memory during 3d_fullres inference on RTX 5090 (32 GB VRAM, Windows/WDDM) #2963

@jdmerkur

Description


First of all, thank you for all the work put into this segmentation model and for making it publicly accessible. I really appreciate everything that has gone into nnU-Net.


We encounter reproducible CUDA out-of-memory (OOM) errors during 3d_fullres inference in nnU-Net v2 despite using an RTX 5090 with ~32 GB VRAM. Inference is run conservatively (single fold, no TTA, minimal workers).

One important characteristic of our task is a large number of segmentation labels, which may significantly increase memory usage during softmax, argmax, and resampling/export of multi-class probability maps. The same dataset runs successfully with 3d_lowres.
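To illustrate why the label count matters, here is a back-of-envelope calculation of the VRAM footprint of a full-resolution multi-class probability map. The volume shape and class count are hypothetical placeholders, not taken from our dataset:

```python
def prob_map_bytes(num_classes, shape, bytes_per_voxel=4):
    """Size of a float32 softmax output with one channel per class."""
    voxels = 1
    for dim in shape:
        voxels *= dim
    return num_classes * voxels * bytes_per_voxel

# Hypothetical example: 60 labels on a 512 x 512 x 400 volume, float32.
size_gib = prob_map_bytes(60, (512, 512, 400)) / 1024**3
print(f"{size_gib:.1f} GiB")  # → 23.4 GiB, before argmax/resampling buffers
```

A single probability map of this shape already approaches the card's 32 GB, so any extra copy held during softmax, argmax, or resampling can tip inference over the limit.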

The OOM typically occurs after several cases complete, suggesting memory accumulation or incomplete VRAM release between cases, rather than a single-volume forward-pass failure.
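If memory really is retained between cases, a crude mitigation is to force a cleanup after each predicted case. This is a hypothetical workaround sketch, not part of nnU-Net's own inference loop (`release_cuda_between_cases` is our name); the guarded import keeps it runnable on machines without torch:

```python
import gc

try:
    import torch
except ImportError:  # torch not installed; sketch degrades to a plain gc pass
    torch = None

def release_cuda_between_cases():
    """Encourage VRAM release after each predicted case."""
    gc.collect()  # drop Python-side references to the finished case first
    if torch is not None and torch.cuda.is_available():
        # Return cached allocator blocks to the driver; this does not fix a
        # genuine reference leak, but rules out allocator caching as the cause.
        torch.cuda.empty_cache()
```

If OOMs persist even with this called between cases, that would point at live references (e.g. retained probability maps) rather than allocator caching.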

Environment:

OS: Windows (WDDM)
GPU: NVIDIA GeForce RTX 5090
VRAM: 32607 MiB (~32 GB)
Driver: 576.83
CUDA (driver): 12.9
CUDA (PyTorch): 12.8
Python: 3.13.10 (Anaconda)
PyTorch: 2.8.0+cu128
nnU-Net: nnUNetv2 (latest)

This environment uses Python 3.13.10, which is newer than typical nnU-Net setups. If this is a known limitation, we are happy to test on Python 3.10/3.11 for comparison; however, the observed behavior suggests a memory accumulation issue rather than a single-allocation peak.
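To back up the accumulation hypothesis, we can log allocator usage after each case (e.g. `torch.cuda.memory_allocated()` converted to MiB) and look for upward drift across cases. A minimal heuristic over such readings; all values shown are illustrative, not measured:

```python
def looks_like_accumulation(readings_mib, slack_mib=256):
    """Heuristic over per-case memory readings (MiB), taken after each case
    completes: steady upward drift suggests retained references, while a
    roughly flat series points to a single-volume allocation peak instead."""
    return readings_mib[-1] - readings_mib[0] > slack_mib

print(looks_like_accumulation([4200, 7900, 11800, 15600]))  # drifting upward
print(looks_like_accumulation([4200, 4300, 4250, 4280]))    # roughly flat
```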
