
Commit b496b7b (1 parent: ad150e7)

Re-ran batch validation on all models across all sets

9 files changed (+1955, -942 lines)

results/README.md

Lines changed: 22 additions & 13 deletions
@@ -4,47 +4,56 @@ This folder contains validation results for the models in this collection having
## Datasets

There are currently results for the ImageNet validation set and 5 additional test/label sets.

The test set results include rank and top-1/top-5 differences from clean validation. For the "Real Labels", ImageNetV2, and Sketch test sets, the differences were calculated against the full 1000-class ImageNet-1k validation set. For the Adversarial and Rendition sets, the differences were calculated against 'clean' runs on the ImageNet-1k validation set restricted to the 200 classes used by each respective test set.
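As a sketch of how those differences could be derived from a pair of result files: the model names, metric fields (`top1`, `top5`), and values below are illustrative assumptions, not the actual CSV schema used by these files.

```python
# Sketch: computing rank and top-1/top-5 differences of a test set against a
# clean validation run. All names and numbers here are made up for
# illustration; the real CSV columns are not specified in this README.

clean = {  # e.g. parsed from results-imagenet.csv
    "resnet50": {"top1": 80.4, "top5": 94.6},
    "vit_base_patch16_224": {"top1": 84.5, "top5": 97.3},
}
test = {  # e.g. parsed from results-imagenetv2-matched-frequency.csv
    "resnet50": {"top1": 68.5, "top5": 87.6},
    "vit_base_patch16_224": {"top1": 75.1, "top5": 92.6},
}

def rank_by_top1(results):
    """Map model name -> rank, where 0 is the best top-1 accuracy."""
    ordered = sorted(results, key=lambda m: results[m]["top1"], reverse=True)
    return {m: i for i, m in enumerate(ordered)}

def differences(clean, test):
    """Per-model metric deltas (test minus clean) plus the change in rank."""
    clean_rank, test_rank = rank_by_top1(clean), rank_by_top1(test)
    return {
        m: {
            "top1_diff": round(test[m]["top1"] - clean[m]["top1"], 3),
            "top5_diff": round(test[m]["top5"] - clean[m]["top5"], 3),
            "rank_diff": test_rank[m] - clean_rank[m],
        }
        for m in test
    }

diffs = differences(clean, test)
```

A negative `top1_diff` means the model lost accuracy on the test set relative to clean validation; a `rank_diff` of 0 means its position in the leaderboard did not move.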
### ImageNet Validation - [`results-imagenet.csv`](results-imagenet.csv)

The standard 50,000 image ImageNet-1k validation set. Model selection during training utilizes this validation set, so it is not a true test set. Question: Does anyone have the official ImageNet-1k test set classification labels now that the challenges are done?

* Source: http://image-net.org/challenges/LSVRC/2012/index
* Paper: "ImageNet Large Scale Visual Recognition Challenge" - https://arxiv.org/abs/1409.0575
### ImageNet-"Real Labels" - [`results-imagenet-real.csv`](results-imagenet-real.csv)

The usual ImageNet-1k validation set with a fresh new set of labels intended to improve on mistakes in the original annotation process.

* Source: https://github.com/google-research/reassessed-imagenet
* Paper: "Are we done with ImageNet?" - https://arxiv.org/abs/2006.07159
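Under the reassessed labels an image can carry several valid classes, or none at all. A minimal sketch of that scoring convention, assuming (per the paper, not this README) that a top-1 prediction counts as correct when it falls in the image's label set and unlabeled images are excluded; the image ids, class ids, and predictions are made up:

```python
# Sketch of "Real Labels"-style top-1 scoring: multiple acceptable labels per
# image, images with an empty label set excluded. All data is illustrative.

real_labels = {  # image id -> set of acceptable class ids
    "img_0": {207, 208},
    "img_1": {283},
    "img_2": set(),   # annotators found no valid label; excluded from scoring
    "img_3": {151},
}
predictions = {"img_0": 208, "img_1": 162, "img_2": 5, "img_3": 151}

def real_top1(predictions, real_labels):
    """Fraction of labeled images whose prediction is in the label set."""
    scored = [i for i in predictions if real_labels[i]]
    correct = sum(predictions[i] in real_labels[i] for i in scored)
    return correct / len(scored)

acc = real_top1(predictions, real_labels)
```

With the toy data above, 3 of the 4 images are scored and 2 predictions land in their label sets, so the accuracy is 2/3.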
### ImageNetV2 Matched Frequency - [`results-imagenetv2-matched-frequency.csv`](results-imagenetv2-matched-frequency.csv)

An ImageNet test set of 10,000 images sampled from new images roughly 10 years after the original. Care was taken to replicate the original ImageNet curation/sampling process.

* Source: https://github.com/modestyachts/ImageNetV2
* Paper: "Do ImageNet Classifiers Generalize to ImageNet?" - https://arxiv.org/abs/1902.10811
### ImageNet-Sketch - [`results-sketch.csv`](results-sketch.csv)

50,000 non-photographic (or photos of such) images (sketches, doodles, mostly monochromatic) covering all 1000 ImageNet classes.

* Source: https://github.com/HaohanWang/ImageNet-Sketch
* Paper: "Learning Robust Global Representations by Penalizing Local Predictive Power" - https://arxiv.org/abs/1905.13549
### ImageNet-Adversarial - [`results-imagenet-a.csv`](results-imagenet-a.csv)

A collection of 7,500 images covering 200 of the 1000 ImageNet classes. The images are naturally occurring adversarial examples that confuse typical ImageNet classifiers. This is a challenging dataset; a typical ResNet-50 will score 0% top-1.

For clean validation with the same 200 classes, see [`results-imagenet-a-clean.csv`](results-imagenet-a-clean.csv).

* Source: https://github.com/hendrycks/natural-adv-examples
* Paper: "Natural Adversarial Examples" - https://arxiv.org/abs/1907.07174

3749
### ImageNet-Rendition - [`results-imagenet-r.csv`](results-imagenet-r.csv)
3850

39-
* Source: https://github.com/hendrycks/imagenet-r
40-
* Paper: "The Many Faces of Robustness" - https://arxiv.org/abs/2006.16241
41-
4251
Renditions of 200 ImageNet classes resulting in 30,000 images for testing robustness.
4352

44-
### ImageNet-"Real Labels" - [`results-imagenet-real.csv`](results-imagenet-real.csv)
53+
For clean validation with same 200 classes, see [`results-imagenet-r-clean.csv`](results-imagenet-r-clean.csv)
4554

46-
* Source: https://github.com/google-research/reassessed-imagenet
47-
* Paper: "Are we done with ImageNet?" - https://arxiv.org/abs/2006.07159
55+
* Source: https://github.com/hendrycks/imagenet-r
56+
* Paper: "The Many Faces of Robustness" - https://arxiv.org/abs/2006.16241
4857

4958
## TODO
* Explore adding a reduced version of ImageNet-C (Corruptions) and ImageNet-P (Perturbations) from https://github.com/hendrycks/robustness. The originals are huge and image size specific.
