Hugging Face – Posts

All HF Hub posts

fdaudens

posted an update 2 days ago

Post

2304

🚀 1,000,000 public models milestone achieved on Hugging Face! 🤯

This chart by @cfahlgren1 shows the explosive growth of open-source AI. It's not just about numbers - it's a thriving community combining cutting-edge ML with real-world applications. cfahlgren1/hub-stats

Can't wait to see what's next!

1 reply

abidlabs

posted an update 1 day ago

Post

1302

👋 Hi Gradio community,

I'm excited to share that Gradio 5 will launch in October with improvements across security, performance, SEO, design (see the screenshot for Gradio 4 vs. Gradio 5), and user experience, making Gradio a mature framework for web-based ML applications.

Gradio 5 is currently in beta, so if you'd like to try it out early, please refer to the instructions below:

---------- Installation -------------

Gradio 5 requires Python 3.10 or higher. If you are running Gradio locally, make sure your Python version meets this requirement, or download a newer one here: https://www.python.org/downloads/

* Locally: If you are running Gradio locally, install the beta release with pip install gradio --pre
* Spaces: If you would like to update an existing Gradio Space to use Gradio 5, set sdk_version to 5.0.0b3 in the Space's README.md file.

In most cases, that’s all you have to do to run Gradio 5.0. If you start your Gradio application, you should see your Gradio app running, with a fresh new UI.
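As a quick sanity check before installing, the Python requirement above can be verified with a few lines of standard-library code. This is only a minimal sketch: the helper name python_ok is hypothetical and is not part of Gradio.

```python
import sys

# Minimum Python version stated in the post for Gradio 5.
MIN_PYTHON = (3, 10)


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if python_ok():
        print("Python version is new enough for the Gradio 5 beta.")
    else:
        print("Python 3.10 or higher is required; found", sys.version.split()[0])
```

If the check passes, pip install gradio --pre should be able to pick up the beta release.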

-----------------------------

For more information, please see: https://github.com/gradio-app/gradio/issues/9463

2 replies

merve

posted an update 1 day ago

Post

1241

We've shipped new computer vision/multimodal tasks to Hugging Face Hub 🫡
Keypoint detection just landed with many docs and goodies 🎁
https://huggingface.co/models?pipeline_tag=keypoint-detection

In Hugging Face transformers we have SuperPoint, a foundation model for keypoint detection. Check out the demo here: merve/SuperPoint

Shipped a transformers task guide on keypoint detection: https://huggingface.co/docs/transformers/tasks/keypoint_detection 📖

Also shipped the task page https://huggingface.co/tasks/keypoint-detection (easiest way to get started!) 🔖

Jaward

posted an update 1 day ago

Post

1389

This is supercool!!
LLaVA-3D adds 3D-awareness to LVMs without compromising 2D understanding capabilities.

Method: they developed a unified architecture that maps 2D CLIP patch features to their corresponding positions in 3D space, enabling joint 2D and 3D vision-language instruction tuning.

Project: https://zcmax.github.io/projects/LLaVA-3D/

singhsidhukuldeep

posted an update 2 days ago

Post

2133

The good folks at Meta have just unveiled Llama 3.2, pushing the boundaries of language models and computer vision.

Even more interesting is how they trained this cutting-edge model:

1️⃣ Architecture:
Llama 3.2 uses an optimized transformer architecture with auto-regressive capabilities. The larger models (11B and 90B) now support multimodal inputs, integrating both text and images.

2️⃣ Training Pipeline:
• Started with pretrained Llama 3.1 text models
• Added image adapters and encoders
• Pretrained on large-scale noisy (image, text) pair data
• Fine-tuned on high-quality in-domain and knowledge-enhanced (image, text) pairs

3️⃣ Vision Integration:
• Trained adapter weights to integrate a pre-trained image encoder
• Used cross-attention layers to feed image representations into the language model
• Preserved text-only capabilities by not updating language model parameters during adapter training

4️⃣ Post-Training Alignment:
• Multiple rounds of supervised fine-tuning (SFT)
• Rejection sampling (RS)
• Direct preference optimization (DPO)
• Synthetic data generation using Llama 3.1 for Q&A augmentation
• Reward model ranking for high-quality fine-tuning data
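
To make the DPO step above concrete, here is a minimal sketch of the standard DPO loss for a single preference pair. It assumes you already have summed log-probabilities of the chosen and rejected responses under the policy and under a frozen reference model. The function name, argument names, and the beta value are illustrative assumptions, not details from Meta's training setup.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    loss = -log(sigmoid(beta * margin)), where margin measures how much
    more the policy prefers the chosen response than the reference does.
    """
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(z)) equals softplus(-z); computed in a numerically stable way.
    x = -beta * margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


if __name__ == "__main__":
    # The policy prefers the chosen response more than the reference does,
    # so the loss is lower than for a policy with the opposite preference.
    print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
    print(dpo_loss(-3.0, -1.0, -2.0, -2.0))
```

When the policy and the reference agree exactly, the margin is zero and the loss equals log(2).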

5️⃣ Lightweight Models:
• Used pruning and distillation techniques for 1B and 3B models
• Structured pruning from Llama 3.1 8B model
• Knowledge distillation using Llama 3.1 8B and 70B as teachers
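
The post does not say which distillation objective Meta used. As a rough illustration of the general idea, the sketch below implements the classic temperature-softened KL divergence between a teacher's and a student's output distributions for one token position. The function names and the temperature value are assumptions made for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for stability."""
    scaled = [z / temperature for z in logits]
    top = max(scaled)
    exps = [math.exp(z - top) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * (math.log(pi) - math.log(qi))
             for pi, qi in zip(p, q) if pi > 0.0)
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [2.0, 1.0, 0.1]
    print(distillation_loss([2.0, 1.0, 0.1], teacher))  # student matches teacher
    print(distillation_loss([0.1, 1.0, 2.0], teacher))  # student disagrees
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge.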

6️⃣ Context Length:
All models support an impressive 128K token context length.

7️⃣ Safety Measures:
Incorporated safety mitigation data to balance helpfulness and safety.

The result? A suite of models ranging from edge-friendly 1B parameters to powerful 90B parameter versions, capable of sophisticated reasoning across text and images. Llama 3.2 is set to revolutionize AI applications from mobile devices to enterprise-scale solutions.

What are your thoughts on these advancements? How do you see Llama 3.2 impacting your industry? Let's discuss in the comments!

DmitryRyumin

posted an update 2 days ago

Post

1486

🚀🕺🌟 New Research Alert - ECCV 2024 (Avatars Collection)! 🌟💃🚀
📄 Title: Expressive Whole-Body 3D Gaussian Avatar 🔝

📝 Description: ExAvatar is a model that generates animatable 3D human avatars with facial expressions and hand movements from short monocular videos using a hybrid mesh and 3D Gaussian representation.

👥 Authors: Gyeongsik Moon, Takaaki Shiratori, and @psyth

📅 Conference: ECCV, 29 Sep – 4 Oct, 2024 | Milano, Italy 🇮🇹

📄 Paper: MeshAvatar: Learning High-quality Triangular Human Avatars from Multi-view Videos (2407.08414)

📄 Paper: Expressive Whole-Body 3D Gaussian Avatar (2407.21686)

🌐 Github Page: https://mks0601.github.io/ExAvatar
📁 Repository: https://github.com/mks0601/ExAvatar_RELEASE

🚀 CVPR-2023-24-Papers: https://github.com/DmitryRyumin/CVPR-2023-24-Papers

🚀 WACV-2024-Papers: https://github.com/DmitryRyumin/WACV-2024-Papers

🚀 ICCV-2023-Papers: https://github.com/DmitryRyumin/ICCV-2023-Papers

📚 More Papers: more cutting-edge research presented at other conferences is available in the DmitryRyumin/NewEraAI-Papers collection, curated by @DmitryRyumin

🚀 Added to the Avatars Collection: DmitryRyumin/avatars-65df37cdf81fec13d4dbac36

🔍 Keywords: #ExAvatar #3DAvatar #FacialExpressions #HandMotions #MonocularVideo #3DModeling #GaussianSplatting #MachineLearning #ComputerVision #ComputerGraphics #DeepLearning #AI #ECCV2024

appvoid

posted an update about 8 hours ago

Post

254

700M parameters is the sweet spot for CPU usage. Please, let's make more models of that size!

2 replies

sayakpaul

posted an update about 21 hours ago

Post

623

I did a little experimentation on resizing pre-trained LoRAs on Flux. I explored two themes:

* Decrease the rank of a LoRA
* Increase the rank of a LoRA

The first one helps reduce memory requirements when the LoRA has a high rank, while the second one is merely an experiment. Another implication of this study is that LoRA ranks can be unified, which is useful when you would like to torch.compile() them.
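
To illustrate why ranks can be unified at all, here is a minimal pure-Python sketch of one simple way to raise a LoRA's rank: zero-padding the two low-rank factors, which leaves their product unchanged. This is an assumption made for illustration and is not necessarily the method used in the linked repository. The names pad_lora_rank, lora_a, and lora_b are hypothetical.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]


def pad_lora_rank(lora_a, lora_b, new_rank):
    """Zero-pad LoRA factors to a higher rank.

    lora_a has shape (rank, in_features); lora_b has shape (out_features, rank).
    The update lora_b @ lora_a is unchanged because the added rows of lora_a
    and the added columns of lora_b are all zeros.
    """
    rank = len(lora_a)
    if new_rank < rank:
        raise ValueError("new_rank must be >= the current rank")
    in_features = len(lora_a[0])
    extra = new_rank - rank
    padded_a = [row[:] for row in lora_a] + [[0.0] * in_features for _ in range(extra)]
    padded_b = [row[:] + [0.0] * extra for row in lora_b]
    return padded_a, padded_b


if __name__ == "__main__":
    a = [[1.0, 2.0, 3.0]]   # rank 1, in_features 3
    b = [[2.0], [0.5]]      # out_features 2, rank 1
    a4, b4 = pad_lora_rank(a, b, new_rank=4)
    print(matmul(b4, a4) == matmul(b, a))
```

Once every LoRA shares the same rank, their weight tensors have identical shapes, so swapping them should not change the tensor shapes a compiled model sees. Decreasing the rank is lossy in general and needs a different approach, such as a truncated SVD of the product.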

Check it out here:
sayakpaul/flux-lora-resizing

1 reply

singhsidhukuldeep

posted an update 1 day ago

Post

661

I'm thrilled to share that I’ve just released the Contextual Multi-Armed Bandits Library, a comprehensive Python toolkit that brings together a suite of both contextual and non-contextual bandit algorithms. Whether you're delving into reinforcement learning research or building practical applications, this library is designed to accelerate your work.

What's Inside:

- Contextual Algorithms:
  - LinUCB
  - Epsilon-Greedy
  - KernelUCB
  - NeuralLinearBandit
  - DecisionTreeBandit

- Non-Contextual Algorithms:
  - Upper Confidence Bound (UCB)
  - Thompson Sampling
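
For readers new to bandits, the non-contextual UCB algorithm listed above can be sketched in a few lines of plain Python. This is a generic UCB1 implementation written for illustration. It does not use the library's API, and the class and function names here are hypothetical.

```python
import math
import random


class UCB1:
    """Generic UCB1 bandit: pick the arm with the highest optimistic estimate."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms     # number of pulls per arm
        self.values = [0.0] * n_arms   # running mean reward per arm

    def select_arm(self):
        # Pull every arm once before using the confidence bound.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total = sum(self.counts)
        scores = [value + math.sqrt(2.0 * math.log(total) / count)
                  for value, count in zip(self.values, self.counts)]
        return scores.index(max(scores))

    def update(self, arm, reward):
        # Incremental update of the mean reward for the pulled arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def run(probabilities, steps, seed=0):
    """Simulate Bernoulli arms with the given success probabilities."""
    rng = random.Random(seed)
    bandit = UCB1(len(probabilities))
    for _ in range(steps):
        arm = bandit.select_arm()
        reward = 1.0 if rng.random() < probabilities[arm] else 0.0
        bandit.update(arm, reward)
    return bandit


if __name__ == "__main__":
    bandit = run([0.2, 0.8], steps=1000)
    print(bandit.counts)
```

Contextual algorithms such as LinUCB follow the same select-then-update loop, but the score for each arm also depends on a feature vector observed at each step.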

Key Features:

- Modular Design: Easily integrate and customize algorithms for your specific needs.
- Comprehensive Documentation: Clear instructions and examples to get you started quickly.
- Educational Value: Ideal for learning and teaching concepts in reinforcement learning and decision-making under uncertainty.

GitHub Repository: https://github.com/singhsidhukuldeep/contextual-bandits
PyPI: https://pypi.org/project/contextual-bandits-algos/

I am eager to hear your feedback, contributions, and ideas. Feel free to open issues, submit pull requests, or fork the project to make it your own.

erinys

posted an update 2 days ago

Post

1187

We did a thing! Eight weeks into our Hugging Face tenure, we can demo a round-trip of Xet-backed files from our local machine to a prod Hugging Face S3 bucket and back. 🚀

It’s been exciting to dive into how the Hub is built and design our steel thread through the infrastructure. Now that the thread is up, we can kick off project Capacious Extremis 🪄 to add all the other goodies: authentication, authorization, deduplication, privacy, and more.

What does this mean for you? You’re one step closer to ⚡ faster downloads, uploads, and iterative development on Hugging Face Hub! This is our first step toward replacing Git LFS as the Hub's storage backend: https://huggingface.co/blog/xethub-joins-hf

Check out the demo on LinkedIn to see the transfer in action: https://www.linkedin.com/posts/annux_youve-heard-of-blue-steel-but-have-activity-7245062126535405568-3cvJ
