schirrmacher committed

Commit: c35fa1f
Parent: 31d06fa

Upload folder using huggingface_hub

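The commit message indicates the files were pushed with `huggingface_hub`. A minimal sketch of what such an upload looks like, assuming a local working copy as `folder_path` (the repo id matches the links in this repo; the exact call is not recorded in the commit):

```python
# Hypothetical reconstruction of the upload; folder_path is an assumption.
from huggingface_hub import upload_folder

upload_folder(
    folder_path=".",                # local copy of the model repo
    repo_id="schirrmacher/ormbg",   # target model repository on the Hub
    commit_message="Upload folder using huggingface_hub",
)
```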
Files changed (5)
  1. .DS_Store +0 -0
  2. README.md +10 -55
  3. models/.DS_Store +0 -0
  4. models/ormbg.pth +2 -2
  5. utils/pth_to_onnx.py +2 -2
.DS_Store ADDED
Binary file (6.15 kB)
 
README.md CHANGED
````diff
@@ -15,11 +15,9 @@ datasets:
 
 [>>> DEMO <<<](https://huggingface.co/spaces/schirrmacher/ormbg)
 
-Join our [Research Discord Group](https://discord.gg/YYZ3D66t)!
-
 ![](examples.jpg)
 
-This model is a **fully open-source background remover** optimized for images with humans. It is based on [Highly Accurate Dichotomous Image Segmentation research](https://github.com/xuebinqin/DIS). The model was trained with the synthetic [Human Segmentation Dataset](https://huggingface.co/datasets/schirrmacher/humans), a dataset crafted with [LayerDiffuse](https://github.com/layerdiffusion/LayerDiffuse) and [IC-Light](https://github.com/lllyasviel/IC-Light).
+This model is a **fully open-source background remover** optimized for images with humans. It is based on [Highly Accurate Dichotomous Image Segmentation research](https://github.com/xuebinqin/DIS). The model was trained with the synthetic [Human Segmentation Dataset](https://huggingface.co/datasets/schirrmacher/humans), [P3M-10k](https://paperswithcode.com/dataset/p3m-10k) and [AIM-500](https://paperswithcode.com/dataset/aim-500).
 
 ![](explanation.jpg)
 
@@ -31,62 +29,19 @@ This model is similar to [RMBG-1.4](https://huggingface.co/briaai/RMBG-1.4), but
 python utils/inference.py
 ```
 
-## Training
-
-The model was trained on a NVIDIA GeForce RTX 4090 (10.000 iterations) with the [Human Segmentation Dataset](https://huggingface.co/datasets/schirrmacher/humans) which was created with [LayerDiffuse](https://github.com/layerdiffusion/LayerDiffuse) and [IC-Light](https://github.com/lllyasviel/IC-Light).
-
-## Want to train your own model?
-
-Checkout _Highly Accurate Dichotomous Image Segmentation_ code:
-
-```
-git clone https://github.com/xuebinqin/DIS.git
-cd DIS
-```
-
-Follow the installation instructions on https://github.com/xuebinqin/DIS?tab=readme-ov-file#1-clone-this-repo.
-Download or create some data ([like this](https://huggingface.co/datasets/schirrmacher/humans)) and place it into the DIS project folder.
-
-I am using the folder structure:
-
-- training/im (images)
-- training/gt (ground truth)
-- validation/im (images)
-- validation/gt (ground truth)
-
-Apply this git patch for setting the right paths and remove normalization of images:
-
-```
-git apply dis-repo.patch
-```
-
-Start training:
-
-```
-cd IS-Net
-python train_valid_inference_main.py
-```
-
-Export to ONNX (modify paths if needed):
-
-```
-python utils/pth_to_onnx.py
-```
-
 # Research
 
-Synthetic datasets have limitations for achieving great segmentation results. This is because artificial lighting, occlusion, scale or backgrounds create a gap between synthetic and real images. A "model trained solely on synthetic data generated with naïve domain randomization struggles to generalize on the real domain", see [PEOPLESANSPEOPLE: A Synthetic Data Generator for Human-Centric Computer Vision (2022)](https://arxiv.org/pdf/2112.09290).
-
-Currently I am doing research how to close this gap. Latest research is about creating segmented humans with [LayerDiffuse](https://github.com/layerdiffusion/LayerDiffuse) and then apply [IC-Light](https://github.com/lllyasviel/IC-Light) for creating realistic light effects and shadows.
+I started training the model with the synthetic [Human Segmentation Dataset](https://huggingface.co/datasets/schirrmacher/humans).
 
-## Support
+Synthetic datasets have limitations for achieving great segmentation results. This is because artificial lighting, occlusion, scale or backgrounds create a gap between synthetic and real images. A "model trained solely on synthetic data generated with naïve domain randomization struggles to generalize on the real domain", see [PEOPLESANSPEOPLE: A Synthetic Data Generator for Human-Centric Computer Vision (2022)](https://arxiv.org/pdf/2112.09290).
 
-This is the first iteration of the model, so there will be improvements!
+Latest changes (05/07/2024):
 
-If you identify cases were the model fails, <a href='https://huggingface.co/schirrmacher/ormbg/discussions' target='_blank'>upload your examples</a>!
+- Added [P3M-10K](https://paperswithcode.com/dataset/p3m-10k) dataset for training and validation
+- Added [AIM-500](https://paperswithcode.com/dataset/aim-500) dataset for training and validation
+- Applied [Grid Dropout](https://albumentations.ai/docs/api_reference/augmentations/dropout/grid_dropout/) to improve robustness against occlusions
 
-Known issues (work in progress):
+Next steps:
 
-- close-ups: from above, from below, profile, from side
-- minor issues with hair segmentation when hair creates loops
-- more various backgrounds needed
+- Expand dataset
+- Research on multi-step segmentation by incorporating [ViTMatte](https://github.com/hustvl/ViTMatte)
````
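The "Latest changes" list above mentions Grid Dropout. A minimal sketch of that augmentation with Albumentations, applied to a training image and its ground-truth mask; the parameter values are assumptions, since the commit does not record the ones used for ormbg:

```python
import albumentations as A
import numpy as np

# Grid Dropout masks out cells of a regular grid, simulating occlusion so the
# model must segment from partial evidence instead of memorized context.
transform = A.Compose(
    [A.GridDropout(ratio=0.3, random_offset=True, p=0.5)]  # assumed values
)

image = np.zeros((1024, 1024, 3), dtype=np.uint8)  # placeholder training image
mask = np.zeros((1024, 1024), dtype=np.uint8)      # placeholder ground truth
out = transform(image=image, mask=mask)  # mask is unchanged unless mask_fill_value is set
```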
 
models/.DS_Store ADDED
Binary file (6.15 kB)
 
models/ormbg.pth CHANGED
```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ea91a08b901277e040640859312c76048a6505cea56ecdcdd3ce6b1a27cfe8d3
-size 176717548
+oid sha256:ba5817f4d73b494e60d077b4fa2c008c90ad1dc1eb5a7234a958fb0a699907c2
+size 176720018
```
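Because `models/ormbg.pth` is stored with Git LFS, the diff above only rewrites the pointer file (object hash and byte size). A quick way to check that a downloaded checkpoint matches the new pointer, assuming a path relative to the repo root:

```python
import hashlib

# The expected digest is the sha256 oid from the updated LFS pointer above.
with open("models/ormbg.pth", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

assert digest == "ba5817f4d73b494e60d077b4fa2c008c90ad1dc1eb5a7234a958fb0a699907c2"
```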
utils/pth_to_onnx.py CHANGED
```diff
@@ -30,7 +30,7 @@ def export_to_onnx(model_path, onnx_path):
         dummy_input,
         onnx_path,
         export_params=True,
-        opset_version=10,
+        opset_version=11,
         do_constant_folding=True,
         input_names=["input"],
         output_names=["output"],
@@ -50,7 +50,7 @@ if __name__ == "__main__":
     parser.add_argument(
         "--onnx_path",
         type=str,
-        default="./models/example.onnx",
+        default="./models/gpu_itr_28000_traLoss_0.102_traTarLoss_0.0105_valLoss_0.1293_valTarLoss_0.015_maxF1_0.9947_mae_0.0059_time_0.015454.pth",
         help="The path where the ONNX model will be saved.",
     )
 
```
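This diff bumps the ONNX opset from 10 to 11 and swaps the `--onnx_path` default; note the committed default points at a `.pth` checkpoint even though the help text describes an ONNX output path, which looks unintentional. A minimal runnable sketch of the export call the first hunk modifies, with a trivial stand-in model and an assumed input resolution:

```python
import torch

# Stand-in for the ORMBG network; the real script loads it from a checkpoint.
model = torch.nn.Conv2d(3, 1, kernel_size=3, padding=1)
dummy_input = torch.randn(1, 3, 1024, 1024)  # assumed input resolution

torch.onnx.export(
    model,
    dummy_input,
    "example.onnx",
    export_params=True,
    opset_version=11,          # raised from 10 by this commit
    do_constant_folding=True,
    input_names=["input"],
    output_names=["output"],
)
```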