LLaVa-NeXT-Video Collection LLaVa-NeXT-Video extends LLaVa-NeXT for video understanding. • 5 items • Updated 29 days ago • 2
Article seemore: Implement a Vision Language Model from Scratch By AviSoori1x • 16 days ago • 48
Article Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent Apr 22 • 75
Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference Paper • 2403.14520 • Published Mar 21 • 31