Reproduce fine tuning but score poorly on the evaluation dataset #20

@BenjaminChuang00

Hi,
Thanks for sharing this work. I ran into a problem while trying to reproduce the fine-tuning results.
After fine-tuning on my RTX 3090, I get poor scores on the evaluation datasets. What could be causing this?
My scores are as follows:

| Dataset | AbsRel ↓ | Delta_1 ↑ |
|---------|----------|-----------|
| NYUv2   | 0.056    | 0.963     |
| KITTI   | 0.092    | 0.928     |
| ETH3D   | 0.064    | 0.961     |
| ScanNet | 0.062    | 0.956     |
| DIODE   | 0.299    | 0.780     |
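
For reference, here is a minimal sketch of how I understand the two metrics in the table: AbsRel is the mean absolute relative error, and Delta_1 is the fraction of pixels whose ratio max(pred/gt, gt/pred) is below 1.25. The function name `depth_metrics` and the `valid_mask` argument are my own, not taken from the repository. The sketch assumes the prediction is strictly positive and has already been aligned to the ground truth (for example by a scale-and-shift fit), so it is not necessarily identical to the official evaluation code.

```python
import numpy as np


def depth_metrics(pred, gt, valid_mask=None):
    """Return (AbsRel, Delta_1) for an already-aligned depth prediction.

    Hypothetical helper for illustration; assumes pred > 0 on valid pixels.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    if valid_mask is None:
        # Assumption: pixels with non-positive ground truth are invalid.
        valid_mask = gt > 0
    p, g = pred[valid_mask], gt[valid_mask]

    # AbsRel: mean of |pred - gt| / gt over valid pixels (lower is better).
    abs_rel = np.mean(np.abs(p - g) / g)

    # Delta_1: share of pixels with max(pred/gt, gt/pred) < 1.25 (higher is better).
    ratio = np.maximum(p / g, g / p)
    delta_1 = np.mean(ratio < 1.25)
    return float(abs_rel), float(delta_1)


if __name__ == "__main__":
    gt = np.array([1.0, 2.0, 4.0])
    pred = np.array([1.1, 2.0, 6.0])
    print(depth_metrics(pred, gt))
```

In the toy example, two of the three pixels fall inside the 1.25 threshold, and the third contributes most of the relative error.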

Below is the training script configuration I used, based on train_marigold_e2e_ft_depth.sh.

Note:
I modified the following parts:
--checkpointing_steps 500 => save a checkpoint every 500 steps, so that I can pick the best one
--dataloader_num_workers 4 => speed up data loading
--mixed_precision "bf16" => reduce memory usage
--seed 1234 => fix the random seed

The complete script is as follows:

#!/bin/bash

accelerate launch training/train.py \
--pretrained_model_name_or_path "prs-eth/marigold-v1-0" \
--modality "depth" \
--noise_type "zeros" \
--max_train_steps 20000 \
--checkpointing_steps 500 \
--train_batch_size 2 \
--gradient_accumulation_steps 16 \
--gradient_checkpointing \
--learning_rate 3e-05 \
--lr_total_iter_length 20000 \
--lr_exp_warmup_steps 100 \
--dataloader_num_workers 4 \
--mixed_precision "bf16" \
--output_dir "model-finetuned/marigold_e2e_ft_depth_bf16" \
--enable_xformers_memory_efficient_attention \
--seed 1234 \
"$@"

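To make the batch settings explicit, here is a small sketch of the effective batch size implied by the script above. The helper name `effective_batch_size` is my own. It assumes a single process (one GPU) and that `--max_train_steps` counts optimizer updates rather than micro-batches; I have not verified the second assumption against training/train.py.

```python
def effective_batch_size(train_batch_size, gradient_accumulation_steps, num_processes=1):
    """Samples contributing to one optimizer update (hypothetical helper)."""
    return train_batch_size * gradient_accumulation_steps * num_processes


if __name__ == "__main__":
    # Values taken from the script above; num_processes=1 is an assumption.
    per_update = effective_batch_size(train_batch_size=2, gradient_accumulation_steps=16)
    # Assumption: each of the 20000 training steps is one optimizer update.
    total_samples = per_update * 20000
    print(per_update, total_samples)
```

With these settings, each update sees 2 x 16 = 32 samples on one GPU.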