yushun0410 posted an update · Jun 28
Hi Huggingfacers!

Thrilled to introduce Adam-mini, an optimizer that achieves on-par or better performance than AdamW with a 45%–50% smaller memory footprint. Adam-mini can also achieve 49.5% higher throughput than AdamW on Llama2-7B pre-training.

The design of Adam-mini is inspired by certain Hessian structures we observed on Transformers.

Feel free to try it out! Switch to Adam-mini with the same hyperparameters you use for AdamW, and it works with only about half the memory (see the sketch below). Hope Adam-mini can help save time, cost, and energy in your tasks!
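To illustrate the drop-in swap, here is a minimal sketch of replacing AdamW with Adam-mini in a PyTorch training step. The import path and the `Adam_mini` constructor arguments shown are assumptions based on an AdamW-style interface; check the GitHub README for the exact signature.

```python
import torch
import torch.nn as nn

# Hypothetical import path; see the Adam-mini repo README for the actual one.
from adam_mini import Adam_mini

model = nn.Linear(1024, 1024)

# Before: AdamW
# optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4,
#                               betas=(0.9, 0.95), weight_decay=0.1)

# After: Adam-mini, keeping the same hyperparameters (assumed AdamW-like API).
optimizer = Adam_mini(
    named_parameters=model.named_parameters(),
    lr=1e-4,
    betas=(0.9, 0.95),
    weight_decay=0.1,
)

# The training loop itself is unchanged.
x = torch.randn(8, 1024)
loss = model(x).pow(2).mean()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```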

Paper: "Adam-mini: Use Fewer Learning Rates To Gain More" https://arxiv.org/abs/2406.16793

Code: https://github.com/zyushun/Adam-mini

For AI enthusiasts like me, who look for maximum power at the lowest possible consumption, this is great news! Congratulations on this powerful creation, and I hope it keeps evolving... I'll start testing it in the upcoming refinements.