205516.8 TFLOPS

Leandro von Werra

lvwerra

https://github.com/lvwerra

AI & ML interests

NLP and RL

Articles

BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks

StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation

Welcome Llama 3 - Meta's new open LLM

StarCoder2 and The Stack v2

Constitutional AI with Open LLMs

Preference Tuning LLMs with Direct Preference Optimization Methods

Welcome Mixtral - a SOTA Mixture of Experts on Hugging Face

The N Implementation Details of RLHF with PPO

Finetune Stable Diffusion Models with DDPO via TRL

Spread Your Wings: Falcon 180B is here

Code Llama: Llama 2 learns to code

Fine-tune Llama 2 with DPO

The Falcon has landed in the Hugging Face ecosystem

Creating a Coding Assistant with StarCoder

StarCoder: A State-of-the-Art LLM for Code

StackLLaMA: A hands-on guide to train LLaMA with RLHF

Fine-tuning 20B LLMs with RLHF on a 24GB consumer GPU

Illustrating Reinforcement Learning from Human Feedback (RLHF)

Evaluating Language Model Bias with 🤗 Evaluate

Announcing Evaluation on the Hub

Organizations

lvwerra's activity

upvoted an article 5 days ago

Article

Our Transformers Code Agent beats the GAIA benchmark!

8 days ago

• 26

upvoted a paper 6 days ago

Agentless: Demystifying LLM-based Software Engineering Agents

Paper • 2407.01489 • Published 7 days ago • 36

upvoted a paper 13 days ago

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Paper • 2406.17557 • Published 14 days ago • 73

upvoted a paper 14 days ago

BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions

Paper • 2406.15877 • Published 17 days ago • 42

upvoted an article 14 days ago

Article

Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models

15 days ago

• 130

upvoted an article 27 days ago

Article

Putting RL back in RLHF

27 days ago

• 53

upvoted a paper about 1 month ago

Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations

Paper • 2405.18392 • Published May 28 • 12

upvoted a collection about 2 months ago

Leaderboards and benchmarks ✨

Cool leaderboard spaces collection for models across modalities! Text, vision, audio, ... • 64 items • Updated 28 days ago • 69

upvoted 2 articles about 2 months ago

Article

2024-04-22 - Hub Incident Post Mortem

May 17

• 17

Article

License to Call: Introducing Transformers Agents 2.0

May 13

• 100

upvoted an article 2 months ago

Article

StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation

Apr 29

• 70

upvoted 2 papers 3 months ago

Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent

Paper • 2402.09844 • Published Feb 15 • 20

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Paper • 2404.14219 • Published Apr 22 • 241

upvoted an article 3 months ago

Article

Welcome Llama 3 - Meta's new open LLM

Apr 18

• 254

upvoted a paper 4 months ago

DoGE: Domain Reweighting with Generalization Estimation

Paper • 2310.15393 • Published Oct 23, 2023 • 1

upvoted a collection 4 months ago

💫 StarCoder2

StarCoder2 models and datasets! • 8 items • Updated Mar 1 • 77

upvoted a paper 4 months ago

StarCoder 2 and The Stack v2: The Next Generation

Paper • 2402.19173 • Published Feb 29 • 126

upvoted a paper 6 months ago

Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18 • 136

upvoted a collection 6 months ago

Comparing DPO with IPO and KTO

A collection of chat models to explore the differences between three alignment techniques: DPO, IPO, and KTO. • 56 items • Updated Jan 9 • 31

upvoted 2 papers 6 months ago

Possible Meissner effect near room temperature in copper-substituted lead apatite

Paper • 2401.00999 • Published Jan 2 • 5

Improving Text Embeddings with Large Language Models

Paper • 2401.00368 • Published Dec 31, 2023 • 77

upvoted a paper 7 months ago

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Paper • 2312.00752 • Published Dec 1, 2023 • 132

upvoted a collection 8 months ago

⭐ StarCoder

All models, datasets, and demos related to StarCoder! • 11 items • Updated Feb 27 • 20

upvoted a collection 9 months ago

Zephyr 7B

Models, datasets, and demos associated with Zephyr 7B. For code to train the models, see: https://github.com/huggingface/alignment-handbook • 9 items • Updated Apr 12 • 142

upvoted a paper 9 months ago

Zephyr: Direct Distillation of LM Alignment

Paper • 2310.16944 • Published Oct 25, 2023 • 119

upvoted a paper 12 months ago

Llama 2: Open Foundation and Fine-Tuned Chat Models

Paper • 2307.09288 • Published Jul 18, 2023 • 237

upvoted a paper about 1 year ago

RepoFusion: Training Code Models to Understand Your Repository

Paper • 2306.10998 • Published Jun 19, 2023 • 14