edwko committed on
Commit
334ec1e
1 Parent(s): d478a4a

Update README.md

Files changed (1)
  1. README.md +3 -0
README.md CHANGED
@@ -143,6 +143,9 @@ For instruction training, we first trained the model with Supervised Fine-tuning
  </table>
 
  ## Interfacing with the Instruct Model
+ Model weights were converted to be Hugging Face compatible, with custom modeling files included due to the lack of official support for Mamba2 attention layers.
+ The attention layer implementation was incorporated from [#32027 PR](https://github.com/huggingface/transformers/pull/32027)
+
  > [!IMPORTANT]
  > To ensure optimal performance, please use the following template when interacting with the model: