Leaderboard

We encourage the community to develop new methods, optimize them for specific benchmarks, and compare results with existing approaches.

To implement a new method, refer to our contributing guide.

Note

The results.md file is maintained for reproducibility purposes. However, we encourage contributors to update the leaderboard table instead of the reproducibility table. We will continue refining and tuning baseline methods to keep the leaderboard up to date.

TOFU unlearning on the `Llama-2-7b-chat-hf` architecture

| Method | forget10: forget_quality | forget10: model_utility |
|---|---|---|
| Finetuned | 4.35e-25 | 0.63 |
| Retain | 1.0 | 0.61 |

TOFU unlearning on the `Llama-3.2-1B-Instruct` architecture

| Method | forget10: forget_quality | forget10: model_utility |
|---|---|---|
| Finetuned | 3.91e-22 | 0.6 |
| Retain | 1.0 | 0.59 |

MUSE unlearning on the benchmark's target models

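| Method | News: forget_knowmem_ROUGE | News: forget_verbmem_ROUGE | News: privleak | News: retain_knowmem_ROUGE | Books: forget_knowmem_ROUGE | Books: forget_verbmem_ROUGE | Books: privleak | Books: retain_knowmem_ROUGE |
|---|---|---|---|---|---|---|---|---|
| Finetuned | 0.64 | 0.58 | -99.81 | 0.56 | 0.47 | 1.0 | -57.34 | 0.69 |
| Retain | 0.33 | 0.20 | 0 | 0.56 | 0.3 | 0.14 | 0 | 0.69 |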
