Distributed Training: Train BART/T5 for Summarization using 🤗 Transformers and Amazon SageMaker • Blog post • Apr 8, 2021
Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models • Paper • arXiv:2407.01906 • Published 4 days ago
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems • Paper • arXiv:2407.01370 • Published 4 days ago
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale • Paper • arXiv:2406.17557 • Published 10 days ago
How Do Large Language Models Acquire Factual Knowledge During Pretraining? • Paper • arXiv:2406.11813 • Published 18 days ago
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model • Paper • arXiv:2405.04434 • Published May 7
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models • Paper • arXiv:2404.02258 • Published Apr 2
Improved Baselines with Visual Instruction Tuning • Paper • arXiv:2310.03744 • Published Oct 5, 2023
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models • Paper • arXiv:2402.13064 • Published Feb 20