Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2404.07965

How Far Are We from Intelligent Visual Deductive Reasoning?

Paper • 2403.04732 • Published Mar 7 • 18
MoAI: Mixture of All Intelligence for Large Language and Vision Models

Paper • 2403.07508 • Published Mar 12 • 75
DragAnything: Motion Control for Anything using Entity Representation

Paper • 2403.07420 • Published Mar 12 • 12
Learning and Leveraging World Models in Visual Representation Learning

Paper • 2403.00504 • Published Mar 1 • 30

Papers - Math - Research

AutoNumerics-Zero: Automated Discovery of State-of-the-Art Mathematical Functions

Paper • 2312.08472 • Published Dec 13, 2023 • 2
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?

Paper • 2403.14624 • Published Mar 21 • 50
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline

Paper • 2404.02893 • Published Apr 3 • 20
Rho-1: Not All Tokens Are What You Need

Paper • 2404.07965 • Published Apr 11 • 83

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Paper • 2403.03507 • Published Mar 6 • 182
RAFT: Adapting Language Model to Domain Specific RAG

Paper • 2403.10131 • Published Mar 15 • 66
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models

Paper • 2403.13372 • Published Mar 20 • 60
InternLM2 Technical Report

Paper • 2403.17297 • Published Mar 26 • 28

ibm/AttaQ

Viewer • Updated Jan 26 • 1.4k • 988 • 11
snorkelai/snorkel-curated-instruction-tuning

Preview • Updated Mar 11 • 8
corbyrosset/researchy_questions

Viewer • Updated Feb 29 • 96.4k • 650 • 24
argilla/ultrafeedback-binarized-preferences

Viewer • Updated Nov 30, 2023 • 63.6k • 801 • 65

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 594
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Paper • 2403.03507 • Published Mar 6 • 182
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models

Paper • 2402.19427 • Published Feb 29 • 52
ResLoRA: Identity Residual Mapping in Low-Rank Adaption

Paper • 2402.18039 • Published Feb 28 • 11

The papers I would love to read

Ultralytics/YOLOv8

Object Detection • Updated 25 days ago • 140
YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information

Paper • 2402.13616 • Published Feb 21 • 45
Gradio: Hassle-Free Sharing and Testing of ML Models in the Wild

Paper • 1906.02569 • Published Jun 6, 2019
SpeechAlign: a Framework for Speech Translation Alignment Evaluation

Paper • 2309.11585 • Published Sep 20, 2023

Data Efficient Approaches

How to Train Data-Efficient LLMs

Paper • 2402.09668 • Published Feb 15 • 38
LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement

Paper • 2403.15042 • Published Mar 22 • 24
MAGID: An Automated Pipeline for Generating Synthetic Multi-modal Datasets

Paper • 2403.03194 • Published Mar 5 • 12
Orca-Math: Unlocking the potential of SLMs in Grade School Math

Paper • 2402.14830 • Published Feb 16 • 24

AI Paper of the Day

A collection of papers that I think are interesting, one added each day

Can Large Language Models Understand Context?

Paper • 2402.00858 • Published Feb 1 • 21
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1 • 79
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18 • 142
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30 • 25

Daily paper that is inspiring (abstract is enough)

World Model on Million-Length Video And Language With RingAttention

Paper • 2402.08268 • Published Feb 13 • 36
Improving Text Embeddings with Large Language Models

Paper • 2401.00368 • Published Dec 31, 2023 • 79
Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15 • 99
FiT: Flexible Vision Transformer for Diffusion Model

Paper • 2402.12376 • Published Feb 19 • 48

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

Paper • 2401.10774 • Published Jan 19 • 53
APAR: LLMs Can Do Auto-Parallel Auto-Regressive Decoding

Paper • 2401.06761 • Published Jan 12 • 1
Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache

Paper • 2401.02669 • Published Jan 5 • 14
MambaByte: Token-free Selective State Space Model

Paper • 2401.13660 • Published Jan 24 • 49

Previous
1
2
3
4
5
Next

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs