Models: 232

Active filters: image-text-to-text, transformers

microsoft/Florence-2-large

Image-Text-to-Text • Updated 8 days ago • 98.8k • 789

qnguyen3/nanoLLaVA-1.5

Image-Text-to-Text • Updated 2 days ago • 829 • 64

OpenGVLab/InternVL2-26B

Image-Text-to-Text • Updated 1 day ago • 1.74k • 51

OpenGVLab/InternVL2-8B

Image-Text-to-Text • Updated 1 day ago • 1.8k • 22

OpenGVLab/InternVL2-2B

Image-Text-to-Text • Updated 1 day ago • 1.97k • 12

microsoft/Florence-2-large-ft

Image-Text-to-Text • Updated 8 days ago • 33.2k • 230

vikhyatk/moondream2

Image-Text-to-Text • Updated May 22 • 78.4k • 504

OpenGVLab/InternVL2-40B

Image-Text-to-Text • Updated 1 day ago • 117 • 9

llava-hf/llava-v1.6-mistral-7b-hf

Image-Text-to-Text • Updated 11 days ago • 3.33M • 171

OpenGVLab/InternVL-Chat-V1-5

Image-Text-to-Text • Updated 1 day ago • 32k • 377

microsoft/Florence-2-base

Image-Text-to-Text • Updated 8 days ago • 38.4k • 101

HuggingFaceM4/Florence-2-DocVQA

Image-Text-to-Text • Updated 1 day ago • 856 • 28

HuggingFaceM4/idefics2-8b

Image-Text-to-Text • Updated May 30 • 535k • 536

OpenGVLab/InternVL2-4B

Image-Text-to-Text • Updated 1 day ago • 667 • 7

microsoft/Florence-2-base-ft

Image-Text-to-Text • Updated 8 days ago • 27.4k • 69

fal/moondream2-docci-instruct

Image-Text-to-Text • Updated May 10 • 17 • 4

google/paligemma-3b-pt-224

Image-Text-to-Text • Updated 12 days ago • 58.3k • 194

gokaygokay/Florence-2-SD3-Captioner

Image-Text-to-Text • Updated 15 days ago • 752 • 5

deepseek-ai/deepseek-vl-7b-chat

Image-Text-to-Text • Updated Mar 15 • 5.39k • 205

google/paligemma-3b-ft-ocrvqa-896

Image-Text-to-Text • Updated 12 days ago • 781 • 8

google/paligemma-3b-mix-224

Image-Text-to-Text • Updated 12 days ago • 157k • 45

OpenGVLab/Mini-InternVL-Chat-4B-V1-5

Image-Text-to-Text • Updated 1 day ago • 23.2k • 51

openvla/openvla-7b

Image-Text-to-Text • Updated 25 days ago • 20.2k • 44

AIDC-AI/Ovis-Clip-Llama3-8B

Image-Text-to-Text • Updated 25 days ago • 54 • 5

AIDC-AI/Ovis-Clip-Qwen1_5-7B

Image-Text-to-Text • Updated 25 days ago • 31 • 2

AIDC-AI/Ovis-Clip-Qwen1_5-14B

Image-Text-to-Text • Updated 25 days ago • 22 • 3

OpenGVLab/InternVL2-2B-AWQ

Image-Text-to-Text • Updated 1 day ago • 41 • 2

mlx-community/dolphin-vision-72b-4bit

Image-Text-to-Text • Updated 5 days ago • 12 • 2

liuhaotian/llava-v1.5-7b

Image-Text-to-Text • Updated May 8 • 358k • 285

liuhaotian/llava-v1.5-13b

Image-Text-to-Text • Updated May 9 • 136k • 438