OpenGVLab
/

InternViT-6B-448px-V1-5

Image Feature Extraction

feature-extraction

Model card Files Files and versions Community

czczup commited on 29 days ago

Commit

e7669c3

•

1 Parent(s): 1bc8042

Update README.md

Files changed (1) hide show

README.md +2 -2

README.md CHANGED Viewed

@@ -14,9 +14,9 @@ base_model_relation: finetune
 # InternViT-6B-448px-V1-5
-[\[🆕 Blog\]](https://internvl.github.io/blog/)  [\[📜 InternVL 1.0 Paper\]](https://arxiv.org/abs/2312.14238)  [\[📜 InternVL 1.5 Report\]](https://arxiv.org/abs/2404.16821)  [\[🗨️ Chat Demo\]](https://internvl.opengvlab.com/)
-[\[🤗 HF Demo\]](https://huggingface.co/spaces/OpenGVLab/InternVL)  [\[🚀 Quick Start\]](#model-usage)  [\[🌐 Community-hosted API\]](https://rapidapi.com/adushar1320/api/internvl-chat)  [\[📖 中文解读\]](https://zhuanlan.zhihu.com/p/675877376)
 We develop InternViT-6B-448px-V1-5 based on the pre-training of the strong foundation of [InternViT-6B-448px-V1-2](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-2). In this update, the resolution of training images is expanded from 448&times;448 to dynamic 448&times;448, where the basic tile size is 448&times;448 and the number of tiles ranges from 1 to 12.
 Additionally, we enhance the data scale, quality, and diversity of the pre-training dataset, resulting in the powerful robustness, OCR capability, and high-resolution processing capability of our

 # InternViT-6B-448px-V1-5
+[\[📂 GitHub\]](https://github.com/OpenGVLab/InternVL)  [\[🆕 Blog\]](https://internvl.github.io/blog/)  [\[📜 InternVL 1.0 Paper\]](https://arxiv.org/abs/2312.14238)  [\[📜 InternVL 1.5 Report\]](https://arxiv.org/abs/2404.16821)
+[\[🗨️ Chat Demo\]](https://internvl.opengvlab.com/)  [\[🤗 HF Demo\]](https://huggingface.co/spaces/OpenGVLab/InternVL)  [\[🚀 Quick Start\]](#quick-start)  [\[📖 中文解读\]](https://zhuanlan.zhihu.com/p/706547971)  [\[📖 Documents\]](https://internvl.readthedocs.io/en/latest/)
 We develop InternViT-6B-448px-V1-5 based on the pre-training of the strong foundation of [InternViT-6B-448px-V1-2](https://huggingface.co/OpenGVLab/InternViT-6B-448px-V1-2). In this update, the resolution of training images is expanded from 448&times;448 to dynamic 448&times;448, where the basic tile size is 448&times;448 and the number of tiles ranges from 1 to 12.
 Additionally, we enhance the data scale, quality, and diversity of the pre-training dataset, resulting in the powerful robustness, OCR capability, and high-resolution processing capability of our