|
--- |
|
library_name: v-jepa |
|
tags: |
|
- video-embeddings |
|
- pytorch_model_hub_mixin |
|
- model_hub_mixin |
|
repo_url: https://github.com/facebookresearch/jepa |
|
--- |
|
|
|
## V-JEPA model |
|
|
|
This is a Vision Transformer (ViT) Large model trained with V-JEPA, a self-supervised method that learns video representations by predicting masked spatio-temporal regions in feature space. |
|
|
|
## Installation |
|
|
|
First, clone and install the [JEPA package](https://github.com/facebookresearch/jepa/tree/main): |
|
|
|
```bash |
git clone -b add_hf https://github.com/nielsrogge/jepa.git |
cd jepa |
pip install -r requirements.txt |
``` |
|
|
|
## Usage |
|
|
|
One can then instantiate the model as follows (run this from the root of the cloned repository, since the import relies on its `src` module): |
|
|
|
```python |
|
from src.models.vision_transformer import VisionTransformer |
|
|
|
model = VisionTransformer.from_pretrained("nielsr/vit-large-patch16-v-jepa") |
|
``` |
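The encoder tokenizes a video clip into spatio-temporal patches before embedding it. As a rough guide to the sequence length the model sees, the arithmetic below is a minimal sketch assuming V-JEPA's default configuration of 16×16 spatial patches and a tubelet size of 2 (patches span pairs of consecutive frames); the clip dimensions are illustrative.

```python
# Sketch of patch-token arithmetic for a ViT-L/16 video encoder.
# Assumptions (not taken from this card): 16-frame 224x224 clip,
# patch size 16, tubelet size 2 as in the default V-JEPA config.
frames, height, width = 16, 224, 224
patch_size, tubelet_size = 16, 2

# Each frame pair contributes a 14x14 grid of spatial patches.
tokens_per_frame_pair = (height // patch_size) * (width // patch_size)  # 196

# Temporal patching halves the frame count, so the sequence length is:
num_tokens = (frames // tubelet_size) * tokens_per_frame_pair

print(num_tokens)  # 1568 patch tokens per clip
```

Longer clips or higher resolutions grow this sequence length linearly in frames and quadratically in spatial resolution, which is the main driver of attention cost.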