Introducing IDEFICS: An Open Reproduction of State-of-the-Art Visual Language Model — Aug 22, 2023
We release Idefics2-chatty, the chatbot-optimized version of Idefics2: HuggingFaceM4/idefics2-8b-chatty. Idefics2-chatty is better at following instructions and at chain-of-thought reasoning. We also release a paper containing many findings on how to build an efficient and performant vision-language model: What matters when building vision-language models? (2405.02246). How are you going to use the model, or what data are you going to fine-tune it on?
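As a minimal sketch of how one might prompt the model: the typed role/content message layout below follows the chat format accepted by the Idefics2 processor in 🤗 Transformers, but the `build_user_turn` helper is hypothetical, and the commented-out loading/generation lines assume the standard `AutoProcessor`/`AutoModelForVision2Seq` API.

```python
# Sketch: building a chat-format request for Idefics2-chatty.
# `build_user_turn` is a hypothetical helper, not part of the library;
# the message structure (role + typed content parts) matches the
# Idefics2 processor's chat template format.

def build_user_turn(text, n_images=1):
    """Build one user turn: n image placeholders followed by a text part."""
    content = [{"type": "image"} for _ in range(n_images)]
    content.append({"type": "text", "text": text})
    return {"role": "user", "content": content}

messages = [build_user_turn("What is in this image?", n_images=1)]

# In practice (requires downloading the 8B checkpoint):
# from transformers import AutoProcessor, AutoModelForVision2Seq
# processor = AutoProcessor.from_pretrained("HuggingFaceM4/idefics2-8b-chatty")
# model = AutoModelForVision2Seq.from_pretrained("HuggingFaceM4/idefics2-8b-chatty")
# prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
# inputs = processor(text=prompt, images=[image], return_tensors="pt")
# generated = model.generate(**inputs, max_new_tokens=128)
```

Each image in a turn gets its own `{"type": "image"}` placeholder, which is how the model accepts an arbitrary number of images per prompt.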
Idefics2 is trained mostly on OBELICS, our open interleaved image-text document dataset. Training on interleaved data is crucial for reaching high performance on VQA tasks, taking an arbitrary number of images as input, and doing in-context learning.
Dataset: HuggingFaceM4/OBELICS
Nomic visualization: https://atlas.nomic.ai/map/f2fba2aa-3647-4f49-a0f3-9347daeee499/ee4a84bd-f125-4bcc-a683-1b4e231cb10f
Link to OBELICS thread: https://twitter.com/HugoLaurencon/status/1694005892839006301
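A sketch of what "interleaved" means here, assuming an OBELICS-style row schema of parallel `images`/`texts` lists where one slot is `None` wherever the other holds a value; the `flatten_document` helper is ours for illustration, not part of the dataset tooling.

```python
# Sketch: flattening one interleaved image-text document into an
# ordered sequence of (kind, value) parts. Assumes parallel `images`
# and `texts` lists, with None marking the absent slot at each position.

def flatten_document(row):
    parts = []
    for image, text in zip(row["images"], row["texts"]):
        if image is not None:
            parts.append(("image", image))
        if text is not None:
            parts.append(("text", text))
    return parts

# Toy document: text, then an image, then more text.
doc = {
    "images": [None, "https://example.com/cat.jpg", None],
    "texts": ["A cat sat on a mat.", None, "It was a tabby."],
}
parts = flatten_document(doc)

# In practice one would stream the real dataset, e.g.:
# from datasets import load_dataset
# ds = load_dataset("HuggingFaceM4/OBELICS", streaming=True, split="train")
```

Preserving this document order is what lets a model learn from text that refers back and forth to nearby images, rather than from isolated caption pairs.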
Datasets: VLM Preferences
- openbmb/RLAIF-V-Dataset
- MMInstruction/VLFeedback
- zhiqings/LLaVA-Human-Preference-10K
- sqrti/SPA-VL