### Aug 1, 2020

Universal feature extraction, new models, new weights, new test sets.

* All models support the `features_only=True` argument for the `create_model` call to return a network that extracts feature maps from the deepest layer at each stride (see the sketch after this list).
* New models
  * CSPResNet, CSPResNeXt, CSPDarkNet, DarkNet
  * ReXNet
  * (Modified Aligned) Xception41/65/71 (a proper port of the TF models)
* New trained weights
  * SEResNet50 - 80.3 top-1
  * CSPDarkNet53 - 80.1 top-1
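A minimal sketch of the new feature extraction interface (the model name and input shape are arbitrary examples):

```python
import torch
import timm

# Any model can be built as a feature backbone; its forward pass then returns
# a list of feature maps, one from the deepest layer at each output stride.
model = timm.create_model('resnet50', features_only=True, pretrained=True)

x = torch.randn(1, 3, 224, 224)
features = model(x)

# feature_info describes the channels and reduction (stride) of each map.
for fmap, ch, red in zip(features, model.feature_info.channels(), model.feature_info.reduction()):
    print(fmap.shape, 'channels:', ch, 'reduction:', red)
```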
## Introduction

Py**T**orch **Im**age **M**odels (`timm`) is a collection of image models, layers, utilities, optimizers, schedulers, data-loaders / augmentations, and reference training / validation scripts that aims to pull together a wide variety of SOTA models with the ability to reproduce ImageNet training results.

The work of many others is present here. I've tried to make sure all source material is acknowledged via links to github, arxiv papers, etc. in the README, documentation, and code docstrings. Please let me know if I missed anything.

## Models
Most included models have pretrained weights. The weights are either from their original sources, ported by myself from their original framework (e.g. Tensorflow models), or trained from scratch using the included training script. A full version of the list below with source links and references can be found in the [documentation](https://rwightman.github.io/pytorch-image-models/models/).

* CspNet (Cross-Stage Partial Networks) - https://arxiv.org/abs/1911.11929
* DenseNet - https://arxiv.org/abs/1608.06993
* DLA - https://arxiv.org/abs/1707.06484
* DPN (Dual-Path Network) - https://arxiv.org/abs/1707.01629
* EfficientNet (MBConvNet Family)
  * EfficientNet NoisyStudent (B0-B7, L2) - https://arxiv.org/abs/1911.04252
  * EfficientNet AdvProp (B0-B8) - https://arxiv.org/abs/1911.09665
  * EfficientNet (B0-B7) - https://arxiv.org/abs/1905.11946
  * EfficientNet-EdgeTPU (S, M, L) - https://ai.googleblog.com/2019/08/efficientnet-edgetpu-creating.html
  * FBNet-C - https://arxiv.org/abs/1812.03443
  * MixNet - https://arxiv.org/abs/1907.09595
  * MNASNet B1, A1 (Squeeze-Excite), and Small - https://arxiv.org/abs/1807.11626
  * MobileNet-V2 - https://arxiv.org/abs/1801.04381
  * Single-Path NAS - https://arxiv.org/abs/1904.02877
* HRNet - https://arxiv.org/abs/1908.07919
* Inception-V3 - https://arxiv.org/abs/1512.00567
* Inception-ResNet-V2 and Inception-V4 - https://arxiv.org/abs/1602.07261
* MobileNet-V3 (MBConvNet w/ Efficient Head) - https://arxiv.org/abs/1905.02244
* NASNet-A - https://arxiv.org/abs/1707.07012
* PNasNet - https://arxiv.org/abs/1712.00559
* RegNet - https://arxiv.org/abs/2003.13678
* ResNet/ResNeXt
  * ResNet (v1b/v1.5) - https://arxiv.org/abs/1512.03385
  * ResNeXt - https://arxiv.org/abs/1611.05431
  * 'Bag of Tricks' / Gluon C, D, E, S variations - https://arxiv.org/abs/1812.01187
  * Weakly-supervised (WSL) Instagram pretrained / ImageNet tuned ResNeXt101 - https://arxiv.org/abs/1805.00932
  * Semi-supervised (SSL) / Semi-weakly Supervised (SWSL) ResNet/ResNeXts - https://arxiv.org/abs/1905.00546
  * ECA-Net (ECAResNet) - https://arxiv.org/abs/1910.03151v4
  * Squeeze-and-Excitation Networks (SEResNet) - https://arxiv.org/abs/1709.01507
* Res2Net - https://arxiv.org/abs/1904.01169
* ResNeSt - https://arxiv.org/abs/2004.08955
* ReXNet - https://arxiv.org/abs/2007.00992
* SelecSLS - https://arxiv.org/abs/1907.00837
* Selective Kernel Networks - https://arxiv.org/abs/1903.06586
* TResNet - https://arxiv.org/abs/2003.13630
* VovNet V2 (with V1 support) - https://arxiv.org/abs/1911.06667
* Xception - https://arxiv.org/abs/1610.02357
* Xception (Modified Aligned, Gluon) - https://arxiv.org/abs/1802.02611
* Xception (Modified Aligned, TF) - https://arxiv.org/abs/1802.02611
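Models are created by name, matching the all-lowercase creation fn for the architecture you'd like. A minimal sketch (the wildcard filter and model name are arbitrary examples):

```python
import timm

# List registered architectures, optionally filtered by a wildcard pattern.
print(timm.list_models('*resnext*'))

# Instantiate by lowercase name; pretrained=True loads weights when available.
model = timm.create_model('seresnext26d_32x4d', pretrained=True)
```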
## Features

Several (less common) features that I often utilize in my projects are included. Many of their additions are the reason why I maintain my own set of models, instead of using others' via PIP:

* All models have a common default configuration interface and API for
  * accessing/changing the classifier - `get_classifier` and `reset_classifier`
  * doing a forward pass on just the features - `forward_features` (see the sketch after this list)
* DropBlock (https://arxiv.org/abs/1810.12890)
* Efficient Channel Attention - ECA (https://arxiv.org/abs/1910.03151)
* Blur Pooling (https://arxiv.org/abs/1904.11486)
* Space-to-Depth by [mrT23](https://github.com/mrT23/TResNet/blob/master/src/models/tresnet/layers/space_to_depth.py) (https://arxiv.org/abs/1801.04590) -- original paper?
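A minimal usage sketch of this common interface (the model name, class count, and input size here are arbitrary examples):

```python
import torch
import timm

model = timm.create_model('resnet50', pretrained=True)

# Inspect and replace the classification head.
print(model.get_classifier())
model.reset_classifier(num_classes=10)  # fresh, randomly initialized 10-class head

# Run just the feature trunk, skipping global pooling and the classifier.
x = torch.randn(1, 3, 224, 224)
features = model.forward_features(x)
print(features.shape)
```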
## Results

Model validation results can be found in the [documentation](https://rwightman.github.io/pytorch-image-models/results/) and in the [results tables](results/README.md).

## Getting Started

See the [documentation](https://rwightman.github.io/pytorch-image-models/).