Hugging Face
Models: 232
Sort: Trending
Active filters: image-text-to-text, transformers
microsoft/Florence-2-large • Image-Text-to-Text • Updated 8 days ago • 98.8k • 789
qnguyen3/nanoLLaVA-1.5 • Image-Text-to-Text • Updated 2 days ago • 829 • 64
OpenGVLab/InternVL2-26B • Image-Text-to-Text • Updated 1 day ago • 1.74k • 51
OpenGVLab/InternVL2-8B • Image-Text-to-Text • Updated 1 day ago • 1.8k • 22
OpenGVLab/InternVL2-2B • Image-Text-to-Text • Updated 1 day ago • 1.97k • 12
microsoft/Florence-2-large-ft • Image-Text-to-Text • Updated 8 days ago • 33.2k • 230
vikhyatk/moondream2 • Image-Text-to-Text • Updated May 22 • 78.4k • 504
OpenGVLab/InternVL2-40B • Image-Text-to-Text • Updated 1 day ago • 117 • 9
llava-hf/llava-v1.6-mistral-7b-hf • Image-Text-to-Text • Updated 11 days ago • 3.33M • 171
OpenGVLab/InternVL-Chat-V1-5 • Image-Text-to-Text • Updated 1 day ago • 32k • 377
microsoft/Florence-2-base • Image-Text-to-Text • Updated 8 days ago • 38.4k • 101
HuggingFaceM4/Florence-2-DocVQA • Image-Text-to-Text • Updated 1 day ago • 856 • 28
HuggingFaceM4/idefics2-8b • Image-Text-to-Text • Updated May 30 • 535k • 536
OpenGVLab/InternVL2-4B • Image-Text-to-Text • Updated 1 day ago • 667 • 7
microsoft/Florence-2-base-ft • Image-Text-to-Text • Updated 8 days ago • 27.4k • 69
fal/moondream2-docci-instruct • Image-Text-to-Text • Updated May 10 • 17 • 4
google/paligemma-3b-pt-224 • Image-Text-to-Text • Updated 12 days ago • 58.3k • 194
gokaygokay/Florence-2-SD3-Captioner • Image-Text-to-Text • Updated 15 days ago • 752 • 5
deepseek-ai/deepseek-vl-7b-chat • Image-Text-to-Text • Updated Mar 15 • 5.39k • 205
google/paligemma-3b-ft-ocrvqa-896 • Image-Text-to-Text • Updated 12 days ago • 781 • 8
google/paligemma-3b-mix-224 • Image-Text-to-Text • Updated 12 days ago • 157k • 45
OpenGVLab/Mini-InternVL-Chat-4B-V1-5 • Image-Text-to-Text • Updated 1 day ago • 23.2k • 51
openvla/openvla-7b • Image-Text-to-Text • Updated 25 days ago • 20.2k • 44
AIDC-AI/Ovis-Clip-Llama3-8B • Image-Text-to-Text • Updated 25 days ago • 54 • 5
AIDC-AI/Ovis-Clip-Qwen1_5-7B • Image-Text-to-Text • Updated 25 days ago • 31 • 2
AIDC-AI/Ovis-Clip-Qwen1_5-14B • Image-Text-to-Text • Updated 25 days ago • 22 • 3
OpenGVLab/InternVL2-2B-AWQ • Image-Text-to-Text • Updated 1 day ago • 41 • 2
mlx-community/dolphin-vision-72b-4bit • Image-Text-to-Text • Updated 5 days ago • 12 • 2
liuhaotian/llava-v1.5-7b • Image-Text-to-Text • Updated May 8 • 358k • 285
liuhaotian/llava-v1.5-13b • Image-Text-to-Text • Updated May 9 • 136k • 438