Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation Paper • 2401.08417 • Published Jan 16 • 29
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models Paper • 2401.06066 • Published Jan 11 • 39
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training Paper • 2401.05566 • Published Jan 10 • 25
Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models Paper • 2401.06102 • Published Jan 11 • 19
Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk Paper • 2401.05033 • Published Jan 10 • 15
Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM Paper • 2401.02994 • Published Jan 4 • 46
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts Paper • 2401.04081 • Published Jan 8 • 69
Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon Paper • 2401.03462 • Published Jan 7 • 26
GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation Paper • 2401.04092 • Published Jan 8 • 20
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism Paper • 2401.02954 • Published Jan 5 • 39
Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache Paper • 2401.02669 • Published Jan 5 • 14
Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation Paper • 2401.02117 • Published Jan 4 • 26
LLM Augmented LLMs: Expanding Capabilities through Composition Paper • 2401.02412 • Published Jan 4 • 36
LLaVA-φ: Efficient Multi-Modal Assistant with Small Language Model Paper • 2401.02330 • Published Jan 4 • 14
Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions Paper • 2401.01827 • Published Jan 3 • 14
Multilingual Instruction Tuning With Just a Pinch of Multilinguality Paper • 2401.01854 • Published Jan 3 • 10
LLaMA Beyond English: An Empirical Study on Language Capability Transfer Paper • 2401.01055 • Published Jan 2 • 52
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models Paper • 2401.01335 • Published Jan 2 • 62
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning Paper • 2401.01325 • Published Jan 2 • 26
VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM Paper • 2401.01256 • Published Jan 2 • 19
Q-Refine: A Perceptual Quality Refiner for AI-Generated Image Paper • 2401.01117 • Published Jan 2 • 8
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws Paper • 2401.00448 • Published Dec 31, 2023 • 27
Boosting Large Language Model for Speech Synthesis: An Empirical Study Paper • 2401.00246 • Published Dec 30, 2023 • 9
FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis Paper • 2312.17681 • Published Dec 29, 2023 • 18
Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies Paper • 2407.13623 • Published 4 days ago • 37
Scaling Retrieval-Based Language Models with a Trillion-Token Datastore Paper • 2407.12854 • Published 13 days ago • 26
Streetscapes: Large-scale Consistent Street View Generation Using Autoregressive Video Diffusion Paper • 2407.13759 • Published 4 days ago • 12
Understanding Reference Policies in Direct Preference Optimization Paper • 2407.13709 • Published 4 days ago • 11
Shape of Motion: 4D Reconstruction from a Single Video Paper • 2407.13764 • Published 3 days ago • 14
Splatfacto-W: A Nerfstudio Implementation of Gaussian Splatting for Unconstrained Photo Collections Paper • 2407.12306 • Published 5 days ago • 5
Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models Paper • 2407.12327 • Published 5 days ago • 61
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases Paper • 2407.12784 • Published 4 days ago • 43
E5-V: Universal Embeddings with Multimodal Large Language Models Paper • 2407.12580 • Published 5 days ago • 31
VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control Paper • 2407.12781 • Published 4 days ago • 10
Audio Conditioning for Music Generation via Discrete Bottleneck Features Paper • 2407.12563 • Published 5 days ago • 5
AUITestAgent: Automatic Requirements Oriented GUI Function Testing Paper • 2407.09018 • Published 10 days ago • 4
QuIP: 2-Bit Quantization of Large Language Models With Guarantees Paper • 2307.13304 • Published Jul 25, 2023 • 2
Prompt Expansion for Adaptive Text-to-Image Generation Paper • 2312.16720 • Published Dec 27, 2023 • 5
SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation Paper • 2312.16272 • Published Dec 26, 2023 • 6
DL3DV-10K: A Large-Scale Scene Dataset for Deep Learning-based 3D Vision Paper • 2312.16256 • Published Dec 26, 2023 • 15
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones Paper • 2312.16862 • Published Dec 28, 2023 • 29
DreamCatalyst: Fast and High-Quality 3D Editing via Controlling Editability and Identity Preservation Paper • 2407.11394 • Published 6 days ago • 10
Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes Paper • 2407.10957 • Published 7 days ago • 23
Animate3D: Animating Any 3D Model with Multi-view Video Diffusion Paper • 2407.11398 • Published 6 days ago • 7
OmniBind: Large-scale Omni Multimodal Representation via Binding Spaces Paper • 2407.11895 • Published 6 days ago • 7
From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients Paper • 2407.11239 • Published 6 days ago • 5
Scaling Diffusion Transformers to 16 Billion Parameters Paper • 2407.11633 • Published 6 days ago • 21
ShowRoom3D: Text to High-Quality 3D Room Generation Using 3D Priors Paper • 2312.13324 • Published Dec 20, 2023 • 9
DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation Paper • 2312.13578 • Published Dec 21, 2023 • 24