Distributed Training: Train BART/T5 for Summarization using 🤗 Transformers and Amazon SageMaker • Apr 8, 2021
Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models Paper • 2407.01906 • Published 4 days ago • 18
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems Paper • 2407.01370 • Published 4 days ago • 60
LiveBench: A Challenging, Contamination-Free LLM Benchmark Paper • 2406.19314 • Published 8 days ago • 12
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published 10 days ago • 73
Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models Paper • 2406.13542 • Published 16 days ago • 15
How Do Large Language Models Acquire Factual Knowledge During Pretraining? Paper • 2406.11813 • Published 18 days ago • 28
Nemotron 4 340B Collection Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated 21 days ago • 148
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing Paper • 2406.08464 • Published 23 days ago • 48
Article • Introducing the Hugging Face Embedding Container for Amazon SageMaker • 29 days ago • 11
Qwen2 Collection Qwen2 language models, including pretrained and instruction-tuned models in 5 sizes: 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 29 items • Updated 29 days ago • 231
Article • Training and Finetuning Embedding Models with Sentence Transformers v3 • May 28 • 115
SimPO: Simple Preference Optimization with a Reference-Free Reward Paper • 2405.14734 • Published May 23 • 8
Article • From cloud to developers: Hugging Face and Microsoft Deepen Collaboration • May 21 • 8
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model Paper • 2405.04434 • Published May 7 • 11
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences Paper • 2404.03715 • Published Apr 4 • 58
Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks Paper • 2404.14723 • Published Apr 23 • 10
HF-curated models available on Workers AI Collection A collection of models curated with Hugging Face that can be run on Cloudflare's Workers AI serverless inference platform. • 15 items • Updated Apr 2 • 50
Aligning Modalities in Vision Large Language Models via Preference Fine-tuning Paper • 2402.11411 • Published Feb 18 • 1
Simple and Scalable Strategies to Continually Pre-train Large Language Models Paper • 2403.08763 • Published Mar 13 • 48
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12 • 59
Awesome SFT datasets Collection A curated list of interesting datasets for fine-tuning language models. • 43 items • Updated Apr 12 • 101
Distil-Whisper Models Collection The first version of the Distil-Whisper models released with the Distil-Whisper paper. • 4 items • Updated Mar 21 • 34
Zephyr 7B Collection Models, datasets, and demos associated with Zephyr 7B. For code to train the models, see: https://github.com/huggingface/alignment-handbook • 9 items • Updated Apr 12 • 142
Textbooks Are All You Need II: phi-1.5 technical report Paper • 2309.05463 • Published Sep 11, 2023 • 84