Papers
arxiv:2403.07487

Motion Mamba: Efficient and Long Sequence Motion Generation with Hierarchical and Bidirectional Selective SSM

Published on Mar 12
· Submitted by akhaliq on Mar 13
Authors: Zeyu Zhang, Akide Liu, Ian Reid, Richard Hartley, Bohan Zhuang, Hao Tang
Abstract

Human motion generation is a significant pursuit in generative computer vision, yet achieving efficient, long-sequence motion generation remains challenging. Recent advances in state space models (SSMs), notably Mamba, have shown considerable promise for long-sequence modeling thanks to an efficient, hardware-aware design, making SSMs a promising foundation for motion generation models. Nevertheless, adapting SSMs to motion generation faces hurdles owing to the lack of a specialized architecture for modeling motion sequences. To address these challenges, we propose Motion Mamba, a simple and efficient approach that presents the pioneering motion generation model built on SSMs. Specifically, we design a Hierarchical Temporal Mamba (HTM) block that processes temporal data through an ensemble of varying numbers of isolated SSM modules across a symmetric U-Net architecture, aimed at preserving motion consistency between frames. We also design a Bidirectional Spatial Mamba (BSM) block that processes latent poses bidirectionally to enhance accurate motion generation within a temporal frame. Our method achieves up to a 50% FID improvement and up to 4x faster inference on the HumanML3D and KIT-ML datasets compared with the previous best diffusion-based method, demonstrating strong capabilities for high-quality long-sequence motion modeling and real-time human motion generation. See the project website: https://steve-zeyu-zhang.github.io/MotionMamba/
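As a rough illustration of the bidirectional idea behind the BSM block, the toy sketch below runs a simple linear state-space scan over a latent pose sequence in both directions and sums the results. This is a minimal sketch in plain NumPy with hand-picked scalar parameters; the actual block uses Mamba's selective scan with learned, input-dependent projections, which are not reproduced here.

```python
import numpy as np

def ssm_scan(x, A=0.9, B=1.0, C=1.0):
    """Toy linear SSM recurrence: h[t] = A*h[t-1] + B*x[t], y[t] = C*h[t]."""
    h = np.zeros(x.shape[1])
    ys = []
    for t in range(x.shape[0]):
        h = A * h + B * x[t]
        ys.append(C * h)
    return np.stack(ys)

def bidirectional_ssm(x):
    """Scan the sequence forward and backward in time, then sum the outputs,
    so each frame's output depends on both past and future frames."""
    fwd = ssm_scan(x)
    bwd = ssm_scan(x[::-1])[::-1]  # reverse time, scan, reverse back
    return fwd + bwd

seq = np.random.randn(16, 8)  # 16 frames of an 8-dim latent pose
out = bidirectional_ssm(seq)
print(out.shape)  # (16, 8)
```

A causal (forward-only) scan would leave later frames unable to influence earlier ones; combining the two scan directions is what gives each temporal frame access to the whole sequence.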

Community



Paper author • edited Jul 16

🔥🔥 We are delighted to announce that our paper, Motion Mamba, has been accepted to ECCV 2024!

Motion Mamba 🐍 presents the pioneering work that customizes SSMs for motion generation. This method achieves an exceptional balance between high quality in long-sequence motion generation and outstanding generation efficiency. It delivers up to a 50% improvement 💪 in FID and a 4x speedup 🚀 in text-to-motion generation.

🎉🎉 Congratulations and thanks to all co-authors for their effort and support! See you in Milan!

Motion Mamba: Efficient and Long Sequence Motion Generation
Zeyu Zhang*, Akide Liu*, Ian Reid, Richard Hartley, Bohan Zhuang, Hao Tang

