HiRED: Attention-Guided Token Dropping for Efficient Inference of High-Resolution Vision-Language Models in Resource-Constrained Environments Paper • 2408.10945 • Published Aug 20 • 6
HMoE: Heterogeneous Mixture of Experts for Language Modeling Paper • 2408.10681 • Published Aug 20 • 8