Understanding the "Scaling Vision Transformers" Paper
- 1. Abstract
- 2. Summary of main findings
- 2.1 Few-shot transfer learning
- 2.2 Pareto front
- 3. Discussion
- 3.1 Limitations
- 3.2 Societal impact
- 4. Conclusions of the paper
- References
1. Abstract
Attention-based neural networks such as the Vision Transformer (ViT) have recently attained state-of-the-art results on many computer vision benchmarks. Scale is a primary ingredient in attaining excellent results, therefore, understanding a model’s scaling propertie