Commit 6e7ec86

add PVT model
1 parent 75e5089 commit 6e7ec86

6 files changed

Lines changed: 396 additions & 0 deletions

README.md

Lines changed: 11 additions & 0 deletions
@@ -110,6 +110,8 @@ A PaddlePaddle version image model zoo.
 
 * [PiT](./docs/en/model_zoo/pit.md)
 
+* [PVT](./docs/en/model_zoo/pvt.md)
+
 * [TNT](./docs/en/model_zoo/tnt.md)
 
 * [DeiT](./docs/en/model_zoo/deit.md)
@@ -187,4 +189,13 @@ A PaddlePaddle version image model zoo.
 archivePrefix={arXiv},
 primaryClass={cs.CV}
 }
+
+@misc{wang2021pyramid,
+title={Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions},
+author={Wenhai Wang and Enze Xie and Xiang Li and Deng-Ping Fan and Kaitao Song and Ding Liang and Tong Lu and Ping Luo and Ling Shao},
+year={2021},
+eprint={2102.12122},
+archivePrefix={arXiv},
+primaryClass={cs.CV}
+}
 ```

README_CN.md

Lines changed: 11 additions & 0 deletions
@@ -110,6 +110,8 @@
 
 * [PiT](./docs/cn/model_zoo/pit.md)
 
+* [PVT](./docs/cn/model_zoo/pvt.md)
+
 * [TNT](./docs/cn/model_zoo/tnt.md)
 
 * [DeiT](./docs/cn/model_zoo/deit.md)
@@ -187,4 +189,13 @@
 archivePrefix={arXiv},
 primaryClass={cs.CV}
 }
+
+@misc{wang2021pyramid,
+title={Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions},
+author={Wenhai Wang and Enze Xie and Xiang Li and Deng-Ping Fan and Kaitao Song and Ding Liang and Tong Lu and Ping Luo and Ling Shao},
+year={2021},
+eprint={2102.12122},
+archivePrefix={arXiv},
+primaryClass={cs.CV}
+}
 ```

docs/cn/model_zoo/pvt.md

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
# PVT
* Paper: [Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions](https://arxiv.org/abs/2102.12122)
* Official Repo: [whai362/PVT](https://github.com/whai362/PVT)
* Model Code: [pvt.py](../../../ppim/models/pvt.py)
* Validation Set Preprocessing:

```python
# Image backend: pil
# Input image size: 224x224
transforms = T.Compose([
    T.Resize(248, interpolation='bicubic'),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])
```

* Model Details:

| Model | Model Name | Params (M) | FLOPs (G) | Top-1 (%) | Top-5 (%) | Pretrained Model |
|:---------------------:|:---------------------:|:----------:|:---------:|:---------:|:---------:|:----------------------------:|
| PVT-Tiny | pvt_ti | 13.2 | 1.9 | 74.96 | 92.47 | [Download][pvt_ti] |
| PVT-Small | pvt_s | 24.5 | 3.8 | 79.87 | 95.05 | [Download][pvt_s] |
| PVT-Medium | pvt_m | 44.2 | 6.7 | 81.48 | 95.75 | [Download][pvt_m] |
| PVT-Large | pvt_l | 61.4 | 9.8 | 81.74 | 95.87 | [Download][pvt_l] |


[pvt_ti]:https://bj.bcebos.com/v1/ai-studio-online/f833d36454ae4c11be0f5d2eb3041a7e9c2df10b8518434193c0b7c8853dfddf?responseContentDisposition=attachment%3B%20filename%3Dpvt_tiny.pdparams
[pvt_s]:https://bj.bcebos.com/v1/ai-studio-online/608703b1387b44a78d01f09f0c572bd163edecf2354243dda1afeab2b58abb06?responseContentDisposition=attachment%3B%20filename%3Dpvt_small.pdparams
[pvt_m]:https://bj.bcebos.com/v1/ai-studio-online/232d73f40a3b45bb96786a8ae6a58f93967ada580a354266910bb63caa96201b?responseContentDisposition=attachment%3B%20filename%3Dpvt_medium.pdparams
[pvt_l]:https://bj.bcebos.com/v1/ai-studio-online/08b2064702304e13893337d1b1017941ced31fc4f7c644acb4a44a1a81c66e55?responseContentDisposition=attachment%3B%20filename%3Dpvt_large.pdparams
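
Review note: each pretrained-model link above carries its suggested file name in a percent-encoded `responseContentDisposition` query parameter. The sketch below shows one way to recover that name with the Python standard library only. The helper `suggested_filename` is hypothetical and not part of this commit, and the sketch assumes the parameter always has the `attachment; filename=...` shape seen in these four links.

```python
from urllib.parse import urlsplit, parse_qs

# The PVT-Tiny link from the table above, split across lines for readability.
PVT_TI_URL = (
    "https://bj.bcebos.com/v1/ai-studio-online/"
    "f833d36454ae4c11be0f5d2eb3041a7e9c2df10b8518434193c0b7c8853dfddf"
    "?responseContentDisposition=attachment%3B%20filename%3Dpvt_tiny.pdparams"
)


def suggested_filename(url: str) -> str:
    """Hypothetical helper: read the file name hinted at in the URL query."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs percent-decodes the value to "attachment; filename=pvt_tiny.pdparams".
    disposition = query["responseContentDisposition"][0]
    for part in disposition.split(";"):
        key, _, value = part.strip().partition("=")
        if key == "filename":
            return value
    raise ValueError("no filename found in responseContentDisposition")


print(suggested_filename(PVT_TI_URL))  # pvt_tiny.pdparams
```

Saving the weights under this name keeps local files consistent with the names the server would suggest in a browser download.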

docs/en/model_zoo/pvt.md

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
# PVT
* Paper: [Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions](https://arxiv.org/abs/2102.12122)
* Official Repo: [whai362/PVT](https://github.com/whai362/PVT)
* Code: [pvt.py](../../../ppim/models/pvt.py)
* Evaluation Transforms:

```python
# backend: pil
# input_size: 224x224
transforms = T.Compose([
    T.Resize(248, interpolation='bicubic'),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])
```

* Model Details:

| Model | Model Name | Params (M) | FLOPs (G) | Top-1 (%) | Top-5 (%) | Pretrained Model |
|:---------------------:|:---------------------:|:----------:|:---------:|:---------:|:---------:|:----------------------------:|
| PVT-Tiny | pvt_ti | 13.2 | 1.9 | 74.96 | 92.47 | [Download][pvt_ti] |
| PVT-Small | pvt_s | 24.5 | 3.8 | 79.87 | 95.05 | [Download][pvt_s] |
| PVT-Medium | pvt_m | 44.2 | 6.7 | 81.48 | 95.75 | [Download][pvt_m] |
| PVT-Large | pvt_l | 61.4 | 9.8 | 81.74 | 95.87 | [Download][pvt_l] |


[pvt_ti]:https://bj.bcebos.com/v1/ai-studio-online/f833d36454ae4c11be0f5d2eb3041a7e9c2df10b8518434193c0b7c8853dfddf?responseContentDisposition=attachment%3B%20filename%3Dpvt_tiny.pdparams
[pvt_s]:https://bj.bcebos.com/v1/ai-studio-online/608703b1387b44a78d01f09f0c572bd163edecf2354243dda1afeab2b58abb06?responseContentDisposition=attachment%3B%20filename%3Dpvt_small.pdparams
[pvt_m]:https://bj.bcebos.com/v1/ai-studio-online/232d73f40a3b45bb96786a8ae6a58f93967ada580a354266910bb63caa96201b?responseContentDisposition=attachment%3B%20filename%3Dpvt_medium.pdparams
[pvt_l]:https://bj.bcebos.com/v1/ai-studio-online/08b2064702304e13893337d1b1017941ced31fc4f7c644acb4a44a1a81c66e55?responseContentDisposition=attachment%3B%20filename%3Dpvt_large.pdparams
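
Review note: the evaluation transforms in this file resize to 248, center-crop to 224, and then normalize with the ImageNet mean and standard deviation. The pure-Python sketch below works through that arithmetic without any framework. It assumes `T` in the snippet refers to `paddle.vision.transforms`, that `Resize(248)` matches the shorter edge to 248, and that `ToTensor` scales 8-bit values to [0, 1]. The helper names are hypothetical, and the library's own rounding for odd-sized margins may differ.

```python
# ImageNet statistics copied from the T.Normalize call in the snippet.
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def center_crop_box(width, height, crop=224):
    """Return (left, top, right, bottom) of a centered crop-by-crop window."""
    left = (width - crop) // 2
    top = (height - crop) // 2
    return left, top, left + crop, top + crop


def normalize_pixel(rgb):
    """Map one 8-bit RGB pixel to per-channel normalized values."""
    scaled = [c / 255.0 for c in rgb]  # scale to [0, 1], as assumed for ToTensor
    return [(c - m) / s for c, m, s in zip(scaled, MEAN, STD)]


# A square image already resized to 248x248 loses a 12-pixel border.
print(center_crop_box(248, 248))  # (12, 12, 236, 236)

# Fraction of each side kept by the crop (224 / 248).
print(224 / 248)

# Black and white pixels mark the extremes of the normalized range.
print(normalize_pixel((0, 0, 0)))
print(normalize_pixel((255, 255, 255)))
```

The 224/248 ratio is roughly 0.9, so the crop keeps about the central 90% of each side.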

ppim/models/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 # Transformer
 # from .tnt import tnt_s, TNT
 from .vit import VisionTransformer
+from .pvt import pvt_ti, pvt_s, pvt_m, pvt_l, PyramidVisionTransformer
 from .pit import pit_ti, pit_s, pit_xs, pit_b, pit_ti_distilled, pit_s_distilled, pit_xs_distilled, pit_b_distilled, PoolingTransformer, DistilledPoolingTransformer
 from .deit import deit_ti, deit_s, deit_b, deit_b_384, deit_ti_distilled, deit_s_distilled, deit_b_distilled, deit_b_distilled_384, DistilledVisionTransformer
