OuteAI/Lite-Oute-2-Mamba2Attn-250M-Instruct

edwko committed 29 days ago

Commit 334ec1e • 1 Parent(s): d478a4a

Update README.md

Files changed (1)

README.md +3 -0

@@ -143,6 +143,9 @@ For instruction training, we first trained the model with Supervised Fine-tuning
 </table>
 ## Interfacing with the Instruct Model
+Model weights were converted to be Hugging Face compatible, with custom modeling files included due to the lack of official support for Mamba2 attention layers.
+The attention layer implementation was incorporated from [#32027 PR](https://github.com/huggingface/transformers/pull/32027)
 > [!IMPORTANT]
 > To ensure optimal performance, please use the following template when interacting with the model: