AquilaMoE: Efficient Training for MoE Models with Scale-Up and Scale-Out Strategies • Paper • 2408.06567 • Published 25 days ago