File size: 4,511 Bytes

---
license: other
licence_name: bria-rmbg-1.4
license_link: https://bria.ai/bria-huggingface-model-license-agreement/

tags:
- remove background
- background
- background removal
- Pytorch
- vision
- legal liability

extra_gated_prompt: This model weights by BRIA AI can be obtained after a commercial license is agreed upon. Fill in the form below and we reach out to you.
extra_gated_fields:
  Name: text
  Company/Org name: text
  Org Type (Early/Growth Startup, Enterprise, Academy): text
  Role: text
  Country: text
  Email: text
  By submitting this form, I agree to BRIA’s Privacy policy and Terms & conditions, see links below: checkbox
---

# BRIA Background Removal v1.4 Model Card

RMBG v1.4 is our state-of-the-art background removal model, designed to effectively separate foreground from background in a range of
categories and image types. This model has been trained on a carefully selected dataset, which includes:
general stock images, e-commerce, gaming, and advertising content, making it suitable for various use cases. 
Developed by BRIA AI, RMBG v1.4 is available as an open-source tool for non-commercial use.


![examples](t4.png)

### Model Description

- **Developed by:** [BRIA AI](https://bria.ai/)
- **Model type:** Background Removal 
- **License:** [bria-rmbg-1.4](https://bria.ai/bria-huggingface-model-license-agreement/)
  - The model is open for non-commercial use.
  - Commercial use is subject to a commercial agreement with BRIA. [Contact Us](https://bria.ai/contact-us)

- **Model Description:** BRIA RMBG 1.4 is an saliency segmentation model trained exclusively on a professional-grade dataset.



## Training data
Bria-RMBG model was trained over 12,000 high-quality, high-resolution, manually labeled (pixel-wise accuracy), fully licensed images.
For clarity, we provide our data distribution according to different categories, demonstrating our model’s versatility.

### Distribution of images:

| Category | Distribution |
| -----------------------------------| -----------------------------------:|
| Objects only | 45.11% |
| People with objects/animals | 25.24% |
| People only | 17.35% |
| people/objects/animals with text | 8.52% |
| Text only | 2.52% |
| Animals only | 1.89% |

| Category | Distribution |
| -----------------------------------| -----------------------------------------:|
| Photorealistic | 87.70% |
| Non-Photorealistic | 12.30% |


| Category | Distribution |
| -----------------------------------| -----------------------------------:|
| Non Solid Background | 52.05% |
| Solid Background | 47.95% 


| Category | Distribution |
| -----------------------------------| -----------------------------------:|
| Single main foreground object | 51.42% |
| Multiple objects in the foreground | 48.58% |


## Qualitative Evaluation

![examples](results.png)

- **Inference Time :** 1 sec on Nvidia A10 GPU

## Architecture

The model’s architecture is based on [IS-Net](https://github.com/xuebinqin/DIS). 
Yet, we employ a distinct training scheme and utilize our proprietary data for the training process, enhancing the model's effectiveness.


## Usage

```python
import os
import numpy as np
from skimage import io
from glob import glob
from tqdm import tqdm
import cv2
import torch.nn.functional as F
from torchvision.transforms.functional import normalize
from models import BriaRMBG

input_size=[1024,1024]
net=BriaRMBG()

model_path = "./model.pth"
im_path = "./example_image.jpg"
result_path = "."

if torch.cuda.is_available():
    net.load_state_dict(torch.load(model_path))
    net=net.cuda()
else:
    net.load_state_dict(torch.load(model_path,map_location="cpu"))
net.eval()    

# prepare input
im = io.imread(im_path)
if len(im.shape) < 3:
    im = im[:, :, np.newaxis]
im_size=im.shape[0:2]
im_tensor = torch.tensor(im, dtype=torch.float32).permute(2,0,1)
im_tensor = F.interpolate(torch.unsqueeze(im_tensor,0), size=input_size, mode='bilinear').type(torch.uint8)
image = torch.divide(im_tensor,255.0)
image = normalize(image,[0.5,0.5,0.5],[1.0,1.0,1.0])

if torch.cuda.is_available():
    image=image.cuda()

# inference 
result=net(image)

# post process
result = torch.squeeze(F.interpolate(result[0][0], size=im_size, mode='bilinear') ,0)
ma = torch.max(result)
mi = torch.min(result)
result = (result-mi)/(ma-mi)

# save result
im_name=im_path.split('/')[-1].split('.')[0]
im_array = (result*255).permute(1,2,0).cpu().data.numpy().astype(np.uint8)
cv2.imwrite(os.path.join(result_path, im_name+".png"), im_array)
```