VLM Benchmarks - a marcusinthesky Collection

VLM Benchmarks

updated 1 day ago

MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models
Paper • 2410.10139 • Published 3 days ago • 47

MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks
Paper • 2410.10563 • Published 2 days ago • 30

LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content
Paper • 2410.10783 • Published 2 days ago • 24

TVBench: Redesigning Video-Language Evaluation
Paper • 2410.07752 • Published 6 days ago • 4