---
language:
- en
license: apache-2.0
tags:
- image-to-text
---

# ViTSTR small v1.0

ViTSTR model pre-trained on various real [STR datasets](https://github.com/baudm/parseq/blob/main/Datasets.md) at image size 224x224 with a patch size of 16x16.

Disclaimer: this model card was not written by the original author.

## Model description

*TODO*

## Intended uses & limitations

You can use the model for STR on images containing Latin characters (62 case-sensitive alphanumeric + 32 punctuation marks).

### How to use

*TODO*

### BibTeX entry and citation info

```bibtex
@InProceedings{atienza2021vision,
  title={Vision transformer for fast and efficient scene text recognition},
  author={Atienza, Rowel},
  booktitle={International Conference on Document Analysis and Recognition},
  pages={319--334},
  year={2021},
  organization={Springer}
}
```
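
### Input size and character set (sketch)

The numbers quoted in this card (224x224 images, 16x16 patches, 62 alphanumeric characters plus 32 punctuation marks) can be sanity-checked with a few lines of plain Python. This is a minimal sketch, not part of the model's code: it assumes the supported characters correspond to the ASCII digit, letter, and punctuation classes in Python's `string` module, and the names `CHARSET`, `patch_grid`, and `is_supported` are hypothetical helpers introduced only for illustration.

```python
import string

# Assumption: the 94 supported characters match Python's ASCII classes.
ALPHANUMERIC = string.digits + string.ascii_letters  # 10 digits + 52 letters
PUNCTUATION = string.punctuation                     # 32 ASCII punctuation marks
CHARSET = ALPHANUMERIC + PUNCTUATION

# Values stated in this model card.
IMAGE_SIZE = (224, 224)
PATCH_SIZE = (16, 16)


def patch_grid(image_size, patch_size):
    """Return (rows, cols) of non-overlapping patches covering the image."""
    height, width = image_size
    patch_h, patch_w = patch_size
    if height % patch_h or width % patch_w:
        raise ValueError("image size must be divisible by patch size")
    return height // patch_h, width // patch_w


def is_supported(text):
    """True if every character of `text` is in the assumed character set."""
    return all(ch in CHARSET for ch in text)


rows, cols = patch_grid(IMAGE_SIZE, PATCH_SIZE)
print(len(CHARSET), rows * cols)  # prints "94 196"
```

Under these assumptions, each image is split into a 14x14 grid of 196 patches, and a label such as `Hello,World!` is within the character set, while one containing a space or an accented letter is not.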