---
pipeline_tag: zero-shot-image-classification
---

This repository contains the models for the paper [Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality](https://huggingface.co/papers/2410.05210).
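
The `zero-shot-image-classification` tag means the models score an image against a set of candidate text labels in a shared embedding space. The snippet below is a minimal sketch of that scoring step only, assuming CLIP-style embeddings. It does not load the models in this repository: the function name `zero_shot_classify`, the toy vectors, the labels, and the `logit_scale` value of 100 are all illustrative assumptions.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def zero_shot_classify(image_embedding, text_embeddings, labels, logit_scale=100.0):
    """Return a label -> probability mapping for one image.

    Hypothetical helper: the embeddings are assumed to come from the image
    and text encoders of a CLIP-style model. logit_scale is an assumed
    temperature, not a value taken from this repository.
    """
    image = l2_normalize(image_embedding)
    logits = []
    for text_embedding in text_embeddings:
        text = l2_normalize(text_embedding)
        cosine = sum(a * b for a, b in zip(image, text))
        logits.append(logit_scale * cosine)

    # Numerically stable softmax over the candidate labels.
    peak = max(logits)
    exps = [math.exp(logit - peak) for logit in logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(labels, exps)}


# Toy 3-dimensional embeddings standing in for real encoder outputs.
labels = ["a photo of a cat", "a photo of a dog"]
image_embedding = [0.9, 0.1, 0.0]
text_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

probs = zero_shot_classify(image_embedding, text_embeddings, labels)
prediction = max(probs, key=probs.get)
print(prediction)
```

With real checkpoints, the toy vectors would be replaced by the outputs of the model's image and text encoders; the similarity and softmax steps stay the same.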