SpreadsheetLLM: Encoding Spreadsheets for Large Language Models Paper • 2407.09025 • Published 12 days ago • 110
Learning to Refuse: Towards Mitigating Privacy Risks in LLMs Paper • 2407.10058 • Published 10 days ago • 28
Gemma: Open Models Based on Gemini Research and Technology Paper • 2403.08295 • Published Mar 13 • 45
ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation Paper • 2406.09961 • Published Jun 14 • 54
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning Paper • 2406.06469 • Published Jun 10 • 22
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation Paper • 2406.06525 • Published Jun 10 • 62
Large Language Model Confidence Estimation via Black-Box Access Paper • 2406.04370 • Published Jun 1 • 19
Small Language Models Learn Enhanced Reasoning Skills from Medical Textbooks Paper • 2404.00376 • Published Mar 30 • 3
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline Paper • 2404.02893 • Published Apr 3 • 19
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference Paper • 2403.04132 • Published Mar 7 • 38
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect Paper • 2403.03853 • Published Mar 6 • 61
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking Paper • 2403.09629 • Published Mar 14 • 54
Teaching Large Language Models to Reason with Reinforcement Learning Paper • 2403.04642 • Published Mar 7 • 46
FAX: Scalable and Differentiable Federated Primitives in JAX Paper • 2403.07128 • Published Mar 11 • 11
Self-Discover: Large Language Models Self-Compose Reasoning Structures Paper • 2402.03620 • Published Feb 6 • 107
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 581
Learning to Learn Faster from Human Feedback with Language Model Predictive Control Paper • 2402.11450 • Published Feb 18 • 20
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence Paper • 2401.14196 • Published Jan 25 • 45
MM-LLMs: Recent Advances in MultiModal Large Language Models Paper • 2401.13601 • Published Jan 24 • 42
Textbooks Are All You Need II: phi-1.5 technical report Paper • 2309.05463 • Published Sep 11, 2023 • 84
DocLLM: A layout-aware generative language model for multimodal document understanding Paper • 2401.00908 • Published Dec 31, 2023 • 178
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models Paper • 2401.01335 • Published Jan 2 • 62
Learning Vision from Models Rivals Learning Vision from Data Paper • 2312.17742 • Published Dec 28, 2023 • 14
Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4 Paper • 2312.16171 • Published Dec 26, 2023 • 33
ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent Paper • 2312.10003 • Published Dec 15, 2023 • 33
Photorealistic Video Generation with Diffusion Models Paper • 2312.06662 • Published Dec 11, 2023 • 23
Chain of Code: Reasoning with a Language Model-Augmented Code Emulator Paper • 2312.04474 • Published Dec 7, 2023 • 29
LivePhoto: Real Image Animation with Text-guided Motion Control Paper • 2312.02928 • Published Dec 5, 2023 • 15
MagicDance: Realistic Human Dance Video Generation with Motions & Facial Expressions Transfer Paper • 2311.12052 • Published Nov 18, 2023 • 29
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks Paper • 2005.11401 • Published May 22, 2020 • 11
ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs Paper • 2311.13600 • Published Nov 22, 2023 • 41
Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers Paper • 2311.10642 • Published Nov 17, 2023 • 23
SelfEval: Leveraging the discriminative nature of generative models for evaluation Paper • 2311.10708 • Published Nov 17, 2023 • 14
Llamas Know What GPTs Don't Show: Surrogate Models for Confidence Estimation Paper • 2311.08877 • Published Nov 15, 2023 • 5
Single-Image 3D Human Digitization with Shape-Guided Diffusion Paper • 2311.09221 • Published Nov 15, 2023 • 18
The Chosen One: Consistent Characters in Text-to-Image Diffusion Models Paper • 2311.10093 • Published Nov 16, 2023 • 55
Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs Paper • 2311.02262 • Published Nov 3, 2023 • 9