	---
	license: cc-by-nc-sa-4.0
	language:
	- en
	---
	# The VILD Dataset (VIdeo and Long-Description)

	This dataset was introduced in [VideoCLIP-XL](https://arxiv.org/abs/2410.00741).
	We built an automatic data collection system that aggregates sufficient high-quality pairs from multiple data sources.
	With it, we collected over 2M (VIdeo, Long Description) pairs, which together form the VILD dataset.

	# Format
	```json
	{
	    "short_captions": [
	        "..."
	    ],
	    "long_captions": [
	        "..."
	    ],
	    "video_id": "..."
	},
	{
	    .....
	},
	.....
	```
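
As a minimal sketch of how records in this layout might be consumed, the Python snippet below flattens each record into (video_id, long caption) pairs. It assumes the annotation file is a single JSON array of such records, which the format above suggests but does not confirm. The file name `vild.json` and the helpers `load_records` and `iter_pairs` are hypothetical and not part of the dataset release, and the sample records are invented for illustration.

```python
import json
from typing import Iterator, List, Tuple, TypedDict


class VildRecord(TypedDict):
    """One record, mirroring the three keys shown in the format above."""
    short_captions: List[str]
    long_captions: List[str]
    video_id: str


def load_records(path: str) -> List[VildRecord]:
    """Read the annotation file, ASSUMING it holds one JSON array of records."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def iter_pairs(records: List[VildRecord]) -> Iterator[Tuple[str, str]]:
    """Yield one (video_id, long_caption) pair per long caption."""
    for record in records:
        for caption in record["long_captions"]:
            yield record["video_id"], caption


# Invented stand-in data so the sketch runs without the real file.
sample: List[VildRecord] = [
    {
        "short_captions": ["a dog runs"],
        "long_captions": ["A brown dog runs across a field.", "The dog chases a ball."],
        "video_id": "vid_0",
    },
    {
        "short_captions": ["a man cooks"],
        "long_captions": ["A man stirs a pot on a stove."],
        "video_id": "vid_1",
    },
]

pairs = list(iter_pairs(sample))
```

With the real data, `iter_pairs(load_records("vild.json"))` would be the equivalent call under the same assumption. If the file turns out to store one JSON object per line instead, `load_records` would need to parse each line with `json.loads`.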


	# Source
	~~~
	@misc{wang2024videoclipxladvancinglongdescription,
	  title={VideoCLIP-XL: Advancing Long Description Understanding for Video CLIP Models},
	  author={Jiapeng Wang and Chengyu Wang and Kunzhe Huang and Jun Huang and Lianwen Jin},
	  year={2024},
	  eprint={2410.00741},
	  archivePrefix={arXiv},
	  primaryClass={cs.CL},
	  url={https://arxiv.org/abs/2410.00741},
	}
	~~~