Arabic NLI & Semantic Similarity Datasets Collection The Arabic versions of the SNLI and MultiNLI datasets, originally used for Natural Language Inference (NLI), may be used for fine-tuning embedding models. • 6 items • Updated 17 days ago • 3
Article EU Training Data Transparency: A Proposal for a Sufficiently Detailed Summary 📑📚🖼️🇪🇺 By yjernite • 2 days ago • 7
Arabic Matryoshka Embedding Models Collection A collection of advanced Arabic Matryoshka Embedding Models designed for efficient and high-performance Arabic NLP, available publicly on Hugging Face • 6 items • Updated 1 day ago • 5
Article Formatting Datasets for Chat Template Compatibility By nroggendorff • 7 days ago • 6
Probably DPO datasets Collection A collection of datasets that probably support DPO • 146 items • Updated 9 days ago • 8
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published 10 days ago • 73
Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models • 12 days ago • 124
Article Enhancing Image Model Dreambooth Training Through Effective Captioning: Key Observations By alvdansen • 16 days ago • 11
How Do Large Language Models Acquire Factual Knowledge During Pretraining? Paper • 2406.11813 • Published 18 days ago • 28
MobileCLIP Models + DataCompDR Data Collection MobileCLIP: Mobile-friendly image-text models with SOTA zero-shot capabilities. DataCompDR: Improved datasets for training image-text SOTA models. • 22 items • Updated 16 days ago • 18
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis Paper • 2403.03206 • Published Mar 5 • 47
Quantum Embedding with Transformer for High-dimensional Data Paper • 2402.12704 • Published Feb 20 • 2
INDUS: Effective and Efficient Language Models for Scientific Applications Paper • 2405.10725 • Published May 17 • 28
Article Multimodal Augmentation for Documents: Recovering “Comprehension” in “Reading and Comprehension” task By danaaubakirova • May 16 • 15
Article A Guide to Designing New Functional Proteins and Improving Protein Function, Stability, and Diversity with Generative AI By AmelieSchreiber • 4 days ago • 25
Article Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent • Apr 22 • 75
Article seemore: Implement a Vision Language Model from Scratch By AviSoori1x • 12 days ago • 48
Article Leveraging Transformers and PyTorch for Multiple Choice Question Tasks By Andyrasika • Dec 25, 2023 • 1
Article Robust image watermarking with Stable Signature + IMATAG's BZH By imatag-vch • Jan 22 • 1
Article Serverless Image Similarity with Upstash Vector and Huggingface Models, Datasets and Spaces By omerXfaruq • Jan 31 • 2
Article Streamline Computer Vision Workflows with Hugging Face Transformers and FiftyOne By jamarks • Feb 27 • 8
Article Orchestration of Experts: The First-Principle Multi-Model System By alirezamsh • May 30 • 14
Article RAG Empowerment: Cohere C4AI Command-R and Transformers Unveiled By Andyrasika • Apr 7 • 10