Collections
Collections including paper arxiv:2403.10131
- Mistral 7B
  Paper • 2310.06825 • Published • 47
- Instruction Tuning with Human Curriculum
  Paper • 2310.09518 • Published • 3
- RAFT: Adapting Language Model to Domain Specific RAG
  Paper • 2403.10131 • Published • 66
- Instruction-tuned Language Models are Better Knowledge Learners
  Paper • 2402.12847 • Published • 24

- Jamba: A Hybrid Transformer-Mamba Language Model
  Paper • 2403.19887 • Published • 103
- sDPO: Don't Use Your Data All at Once
  Paper • 2403.19270 • Published • 38
- ViTAR: Vision Transformer with Any Resolution
  Paper • 2403.18361 • Published • 51
- Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models
  Paper • 2403.18814 • Published • 44

- Uni-SMART: Universal Science Multimodal Analysis and Research Transformer
  Paper • 2403.10301 • Published • 51
- Recurrent Drafter for Fast Speculative Decoding in Large Language Models
  Paper • 2403.09919 • Published • 20
- RAFT: Adapting Language Model to Domain Specific RAG
  Paper • 2403.10131 • Published • 66
- Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations
  Paper • 2403.09704 • Published • 31

- Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
  Paper • 1701.06538 • Published • 4
- Attention Is All You Need
  Paper • 1706.03762 • Published • 41
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
  Paper • 2005.11401 • Published • 11
- Language Model Evaluation Beyond Perplexity
  Paper • 2106.00085 • Published