pvt_v2_b0

Converted TIMM image classification model for LiteRT.

Source architecture: pvt_v2_b0
File: model.tflite

Model Details

Model Type: Image classification / feature backbone
Model Stats:
- Params (M): 3.7
- GMACs: 0.6
- Activations (M): 8.0
- Image size: 224 x 224
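
As a rough guide to what these figures mean on a device, the sketch below turns them into approximate storage and compute numbers. It assumes float32 weights and inputs, which this card does not state, and uses the common rule of thumb that one MAC equals two floating-point operations. The function names are illustrative only.

```python
# Figures copied from the Model Stats list above.
PARAMS_M = 3.7          # parameters, in millions
GMACS = 0.6             # multiply-accumulates per forward pass, in billions
IMAGE_SIZE = 224        # input height and width, in pixels

BYTES_PER_FLOAT32 = 4   # assumption: weights and inputs are stored as float32


def weight_megabytes(params_m, bytes_per_param=BYTES_PER_FLOAT32):
    """Approximate weight storage in MB (10^6 bytes)."""
    return params_m * bytes_per_param


def input_bytes(image_size, channels=3, bytes_per_value=BYTES_PER_FLOAT32):
    """Size in bytes of one RGB input tensor with batch size 1."""
    return image_size * image_size * channels * bytes_per_value


def approx_gflops(gmacs):
    """Rule of thumb: one MAC counts as two floating-point operations."""
    return 2 * gmacs


print(f"weights: ~{weight_megabytes(PARAMS_M):.1f} MB")
print(f"input tensor: {input_bytes(IMAGE_SIZE)} bytes")
print(f"compute: ~{approx_gflops(GMACS):.1f} GFLOPs per image")
```

The actual size of model.tflite may differ from the weight estimate, since the file also stores graph metadata and may use a different numeric precision.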
Papers:
- PVT v2: Improved Baselines with Pyramid Vision Transformer: https://arxiv.org/abs/2106.13797
Dataset: ImageNet-1k
Original: https://github.com/whai362/PVT
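
A minimal inference sketch follows. It is not taken from the original card. It assumes the `ai-edge-litert` Python package is installed, that the model takes a single float32 image tensor, and that inputs are normalized with the standard ImageNet mean and standard deviation; check the base model's preprocessing config before relying on these values. Because the card does not say whether the converted model expects channels first or last, the sketch reads the layout from the interpreter's input details. The helper names `preprocess`, `top_k`, and `classify` are hypothetical.

```python
import numpy as np

# Assumed preprocessing constants: the standard ImageNet mean/std.
# Verify against the base model's config before relying on them.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess(image_hwc_uint8, input_shape):
    """Normalize a 224 x 224 RGB uint8 image and match the model's input layout."""
    x = image_hwc_uint8.astype(np.float32) / 255.0
    x = (x - IMAGENET_MEAN) / IMAGENET_STD   # broadcasts over the channel axis
    if input_shape[1] == 3:                  # model expects channels first (NCHW)
        x = np.transpose(x, (2, 0, 1))
    return x[np.newaxis, ...]                # add the batch dimension


def top_k(logits, k=5):
    """Return the indices of the k largest logits, best first."""
    return np.argsort(logits)[::-1][:k].tolist()


def classify(model_path, image_hwc_uint8, k=5):
    """Run one image through the LiteRT model and return the top-k class indices."""
    # Imported lazily so the helpers above work without LiteRT installed.
    from ai_edge_litert.interpreter import Interpreter

    interpreter = Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]

    interpreter.set_tensor(inp["index"], preprocess(image_hwc_uint8, inp["shape"]))
    interpreter.invoke()
    logits = interpreter.get_tensor(out["index"])[0]
    return top_k(logits, k)
```

With an image already resized to 224 x 224 and loaded as an `(H, W, 3)` uint8 array named `image`, `classify("model.tflite", image)` would return ImageNet-1k class indices. Resizing and cropping are left out because the card does not specify them.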

Citation

@article{wang2021pvtv2,
  title={{PVT} v2: Improved Baselines with Pyramid Vision Transformer},
  author={Wang, Wenhai and Xie, Enze and Li, Xiang and Fan, Deng-Ping and Song, Kaitao and Liang, Ding and Lu, Tong and Luo, Ping and Shao, Ling},
  journal={Computational Visual Media},
  volume={8},
  number={3},
  pages={415--424},
  year={2022},
  publisher={Springer}
}

Base model: timm/pvt_v2_b0.in1k
