SpatialBot is a vision-language model (VLM) with spatial understanding and reasoning abilities, which it gains by precisely understanding depth maps and using them to perform high-level tasks.

This Hugging Face repository provides LoRA checkpoints of SpatialBot-3B, which is based on Phi-2 and SigLIP. The model performs well on general VLM tasks and on spatial understanding benchmarks such as SpatialBench.

To use the LoRA checkpoints, you will also need to download the pretrained checkpoint.

Paper:

https://arxiv.org/abs/2406.13642

GitHub repo:

https://github.com/BAAI-DCAI/SpatialBot

SpatialBench, the benchmark:

https://huggingface.co/datasets/RussRobin/SpatialBench

Merged SpatialBot-3B:

https://huggingface.co/RussRobin/SpatialBot-3B
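
The resources linked above can also be fetched programmatically. The following is a minimal sketch, not the authors' official workflow: it assumes the `huggingface_hub` package is installed and that any access conditions on the repository have already been accepted. The helper names `parse_hub_url` and `download`, and the local directory in the usage comment, are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlparse


def parse_hub_url(url):
    """Split a Hugging Face URL into (repo_type, repo_id).

    Hypothetical helper: dataset URLs carry a 'datasets/' prefix,
    model URLs are just 'owner/name'.
    """
    parts = urlparse(url).path.strip("/").split("/")
    if parts[0] == "datasets":
        return "dataset", "/".join(parts[1:3])
    return "model", "/".join(parts[0:2])


def download(url, local_dir, token=None):
    """Download every file of the linked repository into local_dir.

    Hypothetical wrapper around huggingface_hub.snapshot_download.
    `token` is only needed for gated or private repositories when
    no cached login is available.
    """
    # Imported lazily so parse_hub_url works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    repo_type, repo_id = parse_hub_url(url)
    return snapshot_download(
        repo_id=repo_id,
        repo_type=repo_type,
        local_dir=local_dir,
        token=token,
    )


# Example usage (requires network access; the target directory is an assumption):
# download("https://huggingface.co/RussRobin/SpatialBot-3B", "checkpoints/SpatialBot-3B")
```

If a repository is gated, logging in once with `huggingface-cli login` stores a token locally, so the `token` argument can be left unset.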
