LLM Reasoning Papers Collection Papers to improve the reasoning capabilities of LLMs • 13 items • Updated 9 days ago • 45
Critique-out-Loud Reward Models Collection Paper: https://arxiv.org/abs/2408.11791 | Code: https://github.com/zankner/CLoud • 7 items • Updated 29 days ago • 2
Article ColPali: Efficient Document Retrieval with Vision Language Models 👀 By manu • Jul 5 • 109
Article Fine-tuning a token classification model for legal data using Argilla and AutoTrain By bikashpatra • 27 days ago • 11
Article Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging By akjindal53244 • Aug 19 • 72
Building and better understanding vision-language models: insights and future directions Paper • 2408.12637 • Published Aug 22 • 110
Llama 3.1 GPTQ, AWQ, and BNB Quants Collection Optimised quants for high-throughput deployments! Compatible with Transformers, TGI & vLLM 🤗 • 9 items • Updated 8 days ago • 51
Article Announcing Finance Commons and the Bad Data Toolbox: Pioneering Open Data and Advanced Document Processing By Pclanglais • Jul 19 • 17
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback Paper • 2406.00888 • Published Jun 2 • 30
Are You Sure? Rank Them Again: Repeated Ranking For Better Preference Datasets Paper • 2405.18952 • Published May 29 • 10
Article ⚗️ 🔥 Building High-Quality Datasets with distilabel and Prometheus 2 By burtenshaw • Jun 3 • 26
Article Training and Finetuning Embedding Models with Sentence Transformers v3 May 28 • 148
Article Can we create pedagogically valuable multi-turn synthetic datasets from Cosmopedia? By davanstrien • May 7 • 7
Article 🧑‍⚖️ "Replacing Judges with Juries" using distilabel By alvarobartt • May 3 • 17
Article ⚗️ 🧑🏼‍🌾 Let's grow some Domain Specific Datasets together By burtenshaw • Apr 29 • 28
Zephyr ORPO Collection Models and datasets to align LLMs with Odds Ratio Preference Optimisation (ORPO). Recipes here: https://github.com/huggingface/alignment-handbook • 3 items • Updated Apr 12 • 16
Creación de corpus en comunidad Collection A collection of collaborative efforts to build high-quality Spanish-language corpora. Any Spanish speaker can contribute :) • 7 items • Updated Jul 17 • 6
DIBT Prompt Collective SPIN Collection This collection contains resources related to the replication of SPIN with the DIBT Prompt Collective dataset • 8 items • Updated Jul 30 • 7
Apple MLX-compatible 7B LLMs on the 🤗 Hub Collection This collection contains the model weights for 7B LLMs for Apple's MLX framework. Find more information at https://github.com/ml-explore/mlx • 8 items • Updated Sep 2 • 9
Datasets built with ⚗️ distilabel Collection This collection contains some datasets generated and/or labelled using https://github.com/argilla-io/distilabel • 7 items • Updated Aug 6 • 9
Preference Datasets for DPO Collection This collection contains a list of curated preference datasets for DPO fine-tuning for intent alignment of LLMs • 7 items • Updated Jul 30 • 28
Comparing DPO with IPO and KTO Collection A collection of chat models to explore the differences between three alignment techniques: DPO, IPO, and KTO. • 56 items • Updated Jan 9 • 31
Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4 Paper • 2312.16171 • Published Dec 26, 2023 • 34
Awesome feedback datasets Collection A curated list of datasets with human or AI feedback. Useful for training reward models or applying techniques like DPO. • 19 items • Updated Apr 12 • 65
Notus 7B v1 Collection Notus 7B v1 models (DPO fine-tune of Zephyr SFT) and datasets used. More information at https://github.com/argilla-io/notus • 11 items • Updated Jul 30 • 17
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models Paper • 2310.08491 • Published Oct 12, 2023 • 53
OpenChat Collection OpenChat: Advancing Open-source Language Models with Mixed-Quality Data • 7 items • Updated Jul 31 • 33
Adapting Large Language Models via Reading Comprehension Paper • 2309.09530 • Published Sep 18, 2023 • 75