jpWang committed on
Commit
c472462
1 Parent(s): f4e60ae

Upload 11 files

.gitattributes CHANGED
@@ -33,3 +33,11 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ VILD-a-VidLN-Kinetics-train.jsonl filter=lfs diff=lfs merge=lfs -text
+ VILD-a-VidLN-Oops-train.jsonl filter=lfs diff=lfs merge=lfs -text
+ VILD-a-VidLN-UVO-all.jsonl filter=lfs diff=lfs merge=lfs -text
+ VILD-b-VideoChat.jsonl filter=lfs diff=lfs merge=lfs -text
+ VILD-b-VideoInstruct100K.jsonl filter=lfs diff=lfs merge=lfs -text
+ VILD-c-MiraData.jsonl filter=lfs diff=lfs merge=lfs -text
+ VILD-c-Open-Sora-Dataset.jsonl filter=lfs diff=lfs merge=lfs -text
+ VILD-d-Panda-70M.jsonl filter=lfs diff=lfs merge=lfs -text
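The newly added `.jsonl` splits are tracked with Git LFS, so a checkout without LFS support only contains small pointer stubs like the ones shown further below. A minimal sketch of fetching one split through `huggingface_hub`; the `repo_id` here is a placeholder assumption, not something stated in this commit:

```python
# Minimal sketch: resolve one LFS-backed VILD split to a real local file.
# NOTE: the repo_id below is a placeholder assumption, not stated in this commit.
from huggingface_hub import hf_hub_download

local_path = hf_hub_download(
    repo_id="<namespace>/VILD",        # placeholder; substitute the actual dataset repo id
    filename="VILD-b-VideoChat.jsonl",
    repo_type="dataset",
)
print(local_path)  # path to the downloaded JSONL in the local HF cache
```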
README.md ADDED
@@ -0,0 +1,36 @@
+ # The VILD Dataset (VIdeo and Long-Description)
+ 
+ This dataset is proposed in [VideoCLIP-XL](https://arxiv.org/abs/2410.00741).
+ We establish an automatic data collection system designed to aggregate a large volume of high-quality pairs from multiple data sources.
+ With this system, we have collected over 2M (VIdeo, Long-Description) pairs, which constitute the VILD dataset.
+ 
+ # Format
+ Each `.jsonl` file contains one JSON record per line:
+ ```json
+ {
+     "short_captions": [
+         "..."
+     ],
+     "long_captions": [
+         "..."
+     ],
+     "video_id": "..."
+ }
+ {
+     ...
+ }
+ ...
+ ```
+ 
+ # Citation
+ ~~~
+ @misc{wang2024videoclipxladvancinglongdescription,
+       title={VideoCLIP-XL: Advancing Long Description Understanding for Video CLIP Models},
+       author={Jiapeng Wang and Chengyu Wang and Kunzhe Huang and Jun Huang and Lianwen Jin},
+       year={2024},
+       eprint={2410.00741},
+       archivePrefix={arXiv},
+       primaryClass={cs.CL},
+       url={https://arxiv.org/abs/2410.00741},
+ }
+ ~~~
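The Format section above describes plain JSON Lines: one record per line, each with `short_captions`, `long_captions`, and `video_id` fields. A minimal sketch for iterating over a downloaded split, assuming those field names; the helper `iter_vild_records` and the file path are illustrative, not part of the dataset:

```python
# Minimal sketch: iterate (video_id, caption) pairs from one VILD .jsonl split.
# Field names follow the README's Format section; the file path is illustrative.
import json

def iter_vild_records(path):
    """Yield one dict per non-empty line of a VILD JSONL file."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)

for record in iter_vild_records("VILD-b-VideoChat.jsonl"):
    video_id = record["video_id"]
    for caption in record.get("long_captions", []):
        print(video_id, caption[:80])  # e.g. pair each long description with its video
    break  # remove to process the whole split
```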
VILD-a-VidLN-Kinetics-train.jsonl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:82388a2a364b821f45c1693cf1dfe5f0c315dcc78f4e864ed43a755d145e5569
+ size 113156271
VILD-a-VidLN-OVIS-train.jsonl ADDED
The diff for this file is too large to render. See raw diff
 
VILD-a-VidLN-Oops-train.jsonl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:aec63ead07b2447036e30b1b20d7b121b8d9ae68d63fa409ea441e5a19af95f0
+ size 46049729
VILD-a-VidLN-Oops-val.jsonl ADDED
The diff for this file is too large to render. See raw diff
 
VILD-a-VidLN-UVO-all.jsonl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c7c39457f4c6f5bfe07d1169d8523114e7422ff3d552602b91052daab4059673
+ size 30377587
VILD-b-VideoChat.jsonl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:74bc42ac06af246314a5553492c2dde83da22167430bf354719223773ce2518c
+ size 46685938
VILD-b-VideoInstruct100K.jsonl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a19ff34523e73de27958821097914564575383daa3a1bf2bfc3efffdff0805e7
+ size 175620221
VILD-c-MiraData.jsonl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f630622ad36085d59855299352506d614a971fff1e1417331769979725231352
+ size 42743061
VILD-c-Open-Sora-Dataset.jsonl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7a824d4395e972e3d9fd2a5ce578edc5031fd5f7e4112c9d7141e85e465f4d01
+ size 57968303
VILD-d-Panda-70M.jsonl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f25be88b2f005802364ab03c964b00a64f7663d785a451e597fb44cc58ec005c
+ size 1134381417