Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2309.11495

LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models

Paper • 2309.12307 • Published Sep 21, 2023 • 86
Chain-of-Verification Reduces Hallucination in Large Language Models

Paper • 2309.11495 • Published Sep 20, 2023 • 38
Knowledge-Driven CoT: Exploring Faithful Reasoning in LLMs for Knowledge-intensive Question Answering

Paper • 2308.13259 • Published Aug 25, 2023 • 2
Cognitive Mirage: A Review of Hallucinations in Large Language Models

Paper • 2309.06794 • Published Sep 13, 2023 • 1

SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding

Paper • 2408.15545 • Published Aug 28 • 33
Controllable Text Generation for Large Language Models: A Survey

Paper • 2408.12599 • Published Aug 22 • 61
To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20 • 40
Automated Design of Agentic Systems

Paper • 2408.08435 • Published Aug 15 • 38

Prompt Engineering

Chain-of-Verification Reduces Hallucination in Large Language Models

Paper • 2309.11495 • Published Sep 20, 2023 • 38

Papers - Chain of Verification

Chain-of-Verification Reduces Hallucination in Large Language Models

Paper • 2309.11495 • Published Sep 20, 2023 • 38

Papers - Hallucination - Reduction

Chain-of-Verification Reduces Hallucination in Large Language Models

Paper • 2309.11495 • Published Sep 20, 2023 • 38

Papers - Scaling

Chain-of-Verification Reduces Hallucination in Large Language Models

Paper • 2309.11495 • Published Sep 20, 2023 • 38

Papers - Training

SELF: Language-Driven Self-Evolution for Large Language Model

Paper • 2310.00533 • Published Oct 1, 2023 • 2
GrowLength: Accelerating LLMs Pretraining by Progressively Growing Training Length

Paper • 2310.00576 • Published Oct 1, 2023 • 2
A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity

Paper • 2305.13169 • Published May 22, 2023 • 3
Transformers Can Achieve Length Generalization But Not Robustly

Paper • 2402.09371 • Published Feb 14 • 12

Papers - Training Research

Measuring the Effects of Data Parallelism on Neural Network Training

Paper • 1811.03600 • Published Nov 8, 2018 • 2
Adafactor: Adaptive Learning Rates with Sublinear Memory Cost

Paper • 1804.04235 • Published Apr 11, 2018 • 2
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

Paper • 1905.11946 • Published May 28, 2019 • 3
Yi: Open Foundation Models by 01.AI

Paper • 2403.04652 • Published Mar 7 • 61

Papers - Reasoning

Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models

Paper • 2402.14848 • Published Feb 19 • 18
Teaching Large Language Models to Reason with Reinforcement Learning

Paper • 2403.04642 • Published Mar 7 • 46
How Far Are We from Intelligent Visual Deductive Reasoning?

Paper • 2403.04732 • Published Mar 7 • 18
Learning to Reason and Memorize with Self-Notes

Paper • 2305.00833 • Published May 1, 2023 • 4

Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15 • 99
How to Train Data-Efficient LLMs

Paper • 2402.09668 • Published Feb 15 • 38
BitDelta: Your Fine-Tune May Only Be Worth One Bit

Paper • 2402.10193 • Published Feb 15 • 17
A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts

Paper • 2402.09727 • Published Feb 15 • 35

Previous
1
2
3
Next

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs