---
title: Segment Anything
emoji: 🚀
colorFrom: gray
colorTo: pink
sdk: gradio
sdk_version: 3.25.0
app_file: app.py
pinned: false
---

# Segment Anything WebUI

[![Duplicate this Space](https://huggingface.co/datasets/huggingface/badges/raw/main/duplicate-this-space-sm.svg)](https://huggingface.co/spaces/AIBoy1993/segment_anything_webui?duplicate=true) 
[![Duplicate this Space](https://huggingface.co/datasets/huggingface/badges/raw/main/duplicate-this-space-sm-dark.svg)](https://huggingface.co/spaces/AIBoy1993/segment_anything_webui?duplicate=true)

This project is based on the **[Segment Anything Model](https://segment-anything.com/)** by Meta. The UI is built with [Gradio](https://gradio.app/).

- Try the demo on HF: [AIBoy1993/segment_anything_webui](https://huggingface.co/spaces/AIBoy1993/segment_anything_webui)
- [GitHub](https://github.com/5663015/segment_anything_webui)

![](./images/20230408023615.png)

## Change Logs

- [2023-4-11] 
  - Added video segmentation. A short video can be segmented automatically by SAM.
  - Added text prompt segmentation using the [OWL-ViT](https://huggingface.co/docs/transformers/v4.27.2/en/model_doc/owlvit#overview) (Vision Transformer for Open-World Localization) model.
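
As a rough illustration of the text prompt path, the sketch below detects boxes for a text query with OWL-ViT through the `transformers` API; boxes like these could then be handed to SAM as box prompts. This is an assumption about the general approach, not the code used in this repository. The names `detect_boxes` and `clip_box` and the `0.1` threshold are hypothetical, and `post_process_object_detection` is the call shown in the v4.27 documentation linked above, so other `transformers` releases may differ.

```python
def clip_box(box, width, height):
    """Round an (x0, y0, x1, y1) box and clamp it to the image bounds."""
    x0, y0, x1, y1 = (int(round(v)) for v in box)
    return (max(0, x0), max(0, y0), min(width, x1), min(height, y1))


def detect_boxes(image, text, threshold=0.1):
    """Return [(box, score)] for regions of a PIL image that match `text`."""
    # Lazy imports: torch and transformers are only needed when this runs.
    import torch
    from transformers import OwlViTForObjectDetection, OwlViTProcessor

    name = "google/owlvit-base-patch32"
    processor = OwlViTProcessor.from_pretrained(name)
    model = OwlViTForObjectDetection.from_pretrained(name)

    inputs = processor(text=[[text]], images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # PIL reports (width, height); post-processing expects (height, width).
    target_sizes = torch.tensor([image.size[::-1]])
    result = processor.post_process_object_detection(
        outputs=outputs, threshold=threshold, target_sizes=target_sizes
    )[0]
    width, height = image.size
    return [
        (clip_box(box.tolist(), width, height), float(score))
        for box, score in zip(result["boxes"], result["scores"])
    ]
```

Calling `detect_boxes(Image.open("photo.jpg"), "a photo of a cat")` with a PIL image would return pixel-space boxes with their scores; the file name here is only a placeholder.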


## Usage

The following steps run the web UI on your own computer.

- Install Segment Anything ([see its installation instructions for more details](https://github.com/facebookresearch/segment-anything#installation)):

```
pip install git+https://github.com/facebookresearch/segment-anything.git
```

- `git clone` this repository:

```
git clone https://github.com/5663015/segment_anything_webui.git
```

- Create a new folder named `checkpoints` in the project root and put the downloaded weight files in it. The weights can be downloaded from the following URLs:

  - `vit_h`: [ViT-H SAM model](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth)

  - `vit_l`: [ViT-L SAM model](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_l_0b3195.pth)

  - `vit_b`: [ViT-B SAM model](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth)

- Under `checkpoints`, create a new folder named `models--google--owlvit-base-patch32` and put the downloaded [OWL-ViT weights](https://huggingface.co/google/owlvit-base-patch32) files in it.
- Run:

```
python app.py
```
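
The checkpoint step above can also be scripted. The following is a minimal sketch, not part of this repository: the helper names `checkpoint_path` and `download_checkpoint` are hypothetical, and it assumes the local file keeps the file name from the download URL. Whether `app.py` expects exactly these file names is not confirmed here.

```python
import os
import urllib.request

# Download URLs copied from the list above, keyed by model type.
SAM_CHECKPOINT_URLS = {
    "vit_h": "https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth",
    "vit_l": "https://dl.fbaipublicfiles.com/segment_anything/sam_vit_l_0b3195.pth",
    "vit_b": "https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth",
}


def checkpoint_path(model_type, root="checkpoints"):
    """Return the local path for a model type (assumes the URL's file name is kept)."""
    url = SAM_CHECKPOINT_URLS[model_type]
    return os.path.join(root, url.rsplit("/", 1)[-1])


def download_checkpoint(model_type, root="checkpoints"):
    """Download the weights into `root` unless the file already exists."""
    path = checkpoint_path(model_type, root)
    if not os.path.exists(path):
        os.makedirs(root, exist_ok=True)
        urllib.request.urlretrieve(SAM_CHECKPOINT_URLS[model_type], path)
    return path
```

Calling `download_checkpoint("vit_b")` from the project root would fetch the weights for the default model on first use and return the existing path on later calls.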

**Note:** The default model is `vit_b`, and the demo can run on a CPU. The default device is `cpu`.
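
For orientation, the sketch below shows how a `vit_b` checkpoint can be loaded on the CPU with the `segment_anything` package and used for automatic mask generation. It is an illustration rather than the code in `app.py`: the function names `load_sam`, `segment_image`, and `overlay_masks` and the checkpoint file name are assumptions.

```python
import numpy as np


def load_sam(model_type="vit_b",
             checkpoint="checkpoints/sam_vit_b_01ec64.pth",
             device="cpu"):
    """Build a SAM model from a local checkpoint and move it to `device`."""
    # Imported lazily so this module can be imported without the package.
    from segment_anything import sam_model_registry

    sam = sam_model_registry[model_type](checkpoint=checkpoint)
    sam.to(device=device)
    return sam


def segment_image(image_rgb, sam):
    """Return the list of mask records for an HxWx3 uint8 RGB array."""
    from segment_anything import SamAutomaticMaskGenerator

    generator = SamAutomaticMaskGenerator(sam)
    return generator.generate(image_rgb)


def overlay_masks(image_rgb, masks, alpha=0.5, seed=0):
    """Blend each boolean mask into a copy of the image with a random colour."""
    rng = np.random.default_rng(seed)
    out = image_rgb.astype(np.float32)
    for record in masks:
        colour = rng.integers(0, 256, size=3).astype(np.float32)
        m = record["segmentation"]  # boolean HxW array
        out[m] = (1 - alpha) * out[m] + alpha * colour
    return out.astype(np.uint8)
```

A typical call sequence would be `sam = load_sam()`, then `masks = segment_image(image, sam)`, then `overlay_masks(image, masks)` to get a preview image. Larger models (`vit_l`, `vit_h`) can be selected by passing the matching `model_type` and checkpoint path.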

## TODO

- [x] Video segmentation

- [x] Add text prompt

- [ ] Add segmentation prompt (point and box)

## Reference

- Thanks to the wonderful work of [Segment Anything](https://segment-anything.com/) and [OWL-ViT](https://arxiv.org/abs/2205.06230).
- Some of the video processing code draws on [kadirnar/segment-anything-video](https://github.com/kadirnar/segment-anything-video), and some of the OWL-ViT code draws on [ngthanhtin/owlvit_segment_anything](https://github.com/ngthanhtin/owlvit_segment_anything).