MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages Paper • 2410.01036 • Published 5 days ago • 12
HeadGAP: Few-shot 3D Head Avatar via Generalizable Gaussian Priors Paper • 2408.06019 • Published Aug 12 • 13
Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction Paper • 2409.18124 • Published 10 days ago • 23
Llama 3.2 All Versions Collection Meta's new Llama 3.2 vision and text models including 1B, 3B, 11B and 90B. Includes GGUF, 4-bit bnb and original versions. • 20 items • Updated about 19 hours ago • 31
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated 18 days ago • 226
ReMamba: Equip Mamba with Effective Long-Sequence Modeling Paper • 2408.15496 • Published Aug 28 • 10
The Mamba in the Llama: Distilling and Accelerating Hybrid Models Paper • 2408.15237 • Published Aug 27 • 36
The Russian-focused embedders' exploration: ruMTEB benchmark and Russian embedding model design Paper • 2408.12503 • Published Aug 22 • 21
Controllable Text Generation for Large Language Models: A Survey Paper • 2408.12599 • Published Aug 22 • 61
Jamba-1.5 Collection The AI21 Jamba family of models are state-of-the-art, hybrid SSM-Transformer instruction-following foundation models • 2 items • Updated Aug 22 • 76
Article Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging By akjindal53244 • Aug 19 • 72
Transformer Language Models without Positional Encodings Still Learn Positional Information Paper • 2203.16634 • Published Mar 30, 2022 • 5
Qwen2-Audio Collection Audio-language model series based on Qwen2 • 4 items • Updated 19 days ago • 41
🦅 🐍 FalconMamba 7B Collection This collection features the FalconMamba 7B base model, the instruction-tuned version, their 4-bit and GGUF variants, and the demo. • 13 items • Updated 19 days ago • 25
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated 11 days ago • 587
YouTube-SL-25: A Large-Scale, Open-Domain Multilingual Sign Language Parallel Corpus Paper • 2407.11144 • Published Jul 15 • 7
LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control Paper • 2407.03168 • Published Jul 3 • 2
Human4DiT: Free-view Human Video Generation with 4D Diffusion Transformer Paper • 2405.17405 • Published May 27 • 14
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention Paper • 2405.12981 • Published May 21 • 28
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 27 items • Updated 18 days ago • 474
SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization Paper • 2405.11582 • Published May 19 • 12
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 16 items • Updated Jul 31 • 136
InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation Paper • 2404.19427 • Published Apr 30 • 71
🤔 Facial Expressions Recognition Collection Embrace the future of Facial Expressions Recognition with the latest AI-powered technologies! 🚀 • 4 items • Updated Jun 11 • 6
Russian-speaking 7B models Collection Some of my 7B models that speak and understand Russian well. Verified on some datasets with my own tests. A link to the GitHub repo will follow soon...🪬 • 7 items • Updated May 17 • 4
🤗 Big Five Personality Traits Collection The latest AI technologies usher in a new era of Big Five personality assessment 🚀 • 4 items • Updated May 1 • 2
Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community Apr 15 • 161
🤖 LLM Spaces Collection A collection of applications demonstrating large language models (LLMs) 🚀 • 17 items • Updated May 30 • 6
PhysAvatar: Learning the Physics of Dressed 3D Avatars from Visual Observations Paper • 2404.04421 • Published Apr 5 • 16
GeneAvatar: Generic Expression-Aware Volumetric Head Avatar Editing from a Single Image Paper • 2404.02152 • Published Apr 2 • 3
Efficient 3D Implicit Head Avatar with Mesh-anchored Hash Table Blendshapes Paper • 2404.01543 • Published Apr 2 • 3
Audio-Visual Compound Expression Recognition Method based on Late Modality Fusion and Rule-based Decision Paper • 2403.12687 • Published Mar 19 • 3
VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis Paper • 2403.08764 • Published Mar 13 • 34
🎭 Avatars Collection The latest AI-powered technologies usher in a new era of realistic avatars! 🚀 • 67 items • Updated 8 days ago • 73
🖼️ Image Enhancement Collection Embrace the future of Image Enhancement with the latest AI-powered technologies! 🚀 • 1 item • Updated May 1 • 5
🔊 Speech Enhancement Collection Unlocking a new era in Speech Enhancement, powered by the latest AI technologies, for superior audio quality improvements! 🚀 • 8 items • Updated May 1 • 9
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation Paper • 2403.04692 • Published Mar 7 • 40
YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information Paper • 2402.13616 • Published Feb 21 • 45
Vision-Based Hand Gesture Customization from a Single Demonstration Paper • 2402.08420 • Published Feb 13 • 7
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages Paper • 2309.09400 • Published Sep 17, 2023 • 82