Cylingo/Xinyuan-VL-2B · Hugging Face

We evaluated XinYuan-VL-2B with the VLMEvalKit toolkit on the benchmarks below. On most of them, XinYuan-VL-2B outperforms Qwen/Qwen2-VL-2B-Instruct, released by Alibaba Cloud (12 of the 18 benchmarks), as well as other influential open-source models of comparable parameter scale.

Benchmark	MiniCPM-2B	InternVL-2B	Qwen2-VL-2B	XinYuan-VL-2B
MMB-CN-V11-Test	64.5	68.9	71.2	74.3
MMB-EN-V11-Test	65.8	70.2	73.2	76.5
MMB-EN	69.1	74.4	74.3	78.9
MMB-CN	66.5	71.2	73.8	76.12
CCBench	45.3	74.7	53.7	55.5
MMT-Bench	53.5	50.8	54.5	55.2
RealWorld	55.8	57.3	62.9	63.9
SEEDBench_IMG	67.1	70.9	72.86	73.4
AI2D	56.3	74.1	74.7	74.2
MMMU	38.2	36.3	41.1	40.9
HallusionBench	36.2	36.2	42.4	55.00
POPE	86.3	86.3	86.82	89.42
MME	1808.6	1876.8	1872.0	1854.9
MMStar	39.1	49.8	47.5	51.87
SEEDBench2_Plus	51.9	59.9	62.23	62.98
BLINK	41.2	42.8	43.92	42.98
OCRBench	605	781	794	782
TextVQA	74.1	73.4	79.7	77.6
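
The head-to-head comparison with Qwen2-VL-2B can be tallied directly from the table. The following is a minimal sketch, not part of the original evaluation: the scores are copied from the table above, the names `SCORES` and `compare` are hypothetical, and it assumes that a higher score is better on every benchmark listed.

```python
# Scores copied from the table above as (Qwen2-VL-2B, XinYuan-VL-2B).
# Assumption: higher is better for every benchmark listed here.
SCORES = {
    "MMB-CN-V11-Test": (71.2, 74.3),
    "MMB-EN-V11-Test": (73.2, 76.5),
    "MMB-EN": (74.3, 78.9),
    "MMB-CN": (73.8, 76.12),
    "CCBench": (53.7, 55.5),
    "MMT-Bench": (54.5, 55.2),
    "RealWorld": (62.9, 63.9),
    "SEEDBench_IMG": (72.86, 73.4),
    "AI2D": (74.7, 74.2),
    "MMMU": (41.1, 40.9),
    "HallusionBench": (42.4, 55.00),
    "POPE": (86.82, 89.42),
    "MME": (1872.0, 1854.9),
    "MMStar": (47.5, 51.87),
    "SEEDBench2_Plus": (62.23, 62.98),
    "BLINK": (43.92, 42.98),
    "OCRBench": (794, 782),
    "TextVQA": (79.7, 77.6),
}


def compare(scores):
    """Split benchmark names into those where XinYuan scores higher or lower."""
    wins = [name for name, (qwen, xinyuan) in scores.items() if xinyuan > qwen]
    losses = [name for name, (qwen, xinyuan) in scores.items() if xinyuan < qwen]
    return wins, losses


wins, losses = compare(SCORES)
print(f"XinYuan-VL-2B is higher on {len(wins)} of {len(SCORES)} benchmarks")
print("Qwen2-VL-2B is higher on:", ", ".join(losses))
```

Counting wins treats every benchmark equally and ignores margin size, so it complements the per-benchmark scores rather than replacing them.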
