---
license: mit
---

# Model

miniALBERT is a recursive transformer model that uses cross-layer parameter sharing, embedding factorisation, and bottleneck adapters to achieve high parameter efficiency.

Since miniALBERT is a compact model, it is trained with a layer-to-layer distillation technique, using the bert-base model as the teacher. Currently, this model is trained for one epoch on the English subset of Wikipedia.

In terms of architecture, this model uses an embedding dimension of 128, a hidden size of 768, an MLP expansion rate of 4, and a reduction factor of 16 for the bottleneck adapters. In total, the model uses 6 recursions and has a unique parameter count of 11 million.
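
For reference, these hyperparameters can be summarised as a plain Python dictionary. This is only an illustrative sketch; the key names below are hypothetical and are not the actual configuration fields used by the miniALBERT code.

```python
# Illustrative summary of the architecture described above.
# These key names are hypothetical, not miniALBERT's real config fields.
minialbert_128_config = {
    "embedding_dim": 128,             # factorised embedding size
    "hidden_size": 768,               # transformer hidden size
    "mlp_expansion_rate": 4,          # FFN width = 4 * 768 = 3072
    "adapter_reduction_factor": 16,   # adapter bottleneck = 768 / 16 = 48
    "num_recursions": 6,              # the shared block is applied 6 times
    "unique_parameters": 11_000_000,  # approximate unique parameter count
}
```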

# Usage

Since miniALBERT uses a unique architecture, it cannot currently be loaded with `transformers.AutoModel`. To load the model, first clone the miniALBERT GitHub repository:
```bash
git clone https://github.com/nlpie-research/MiniALBERT.git
```

Then use `sys.path.append` to add the miniALBERT files to your project, and import the miniALBERT modeling classes:
```python
import sys

# Add the cloned repository to the import path
sys.path.append("PATH_TO_CLONED_PROJECT/MiniALBERT/")

from minialbert_modeling import MiniAlbertForSequenceClassification, MiniAlbertForTokenClassification
```

Finally, load the model like a regular model from the transformers library:
```python
# For token classification (e.g. NER)
model = MiniAlbertForTokenClassification.from_pretrained("nlpie/miniALBERT-128")

# For sequence classification
model = MiniAlbertForSequenceClassification.from_pretrained("nlpie/miniALBERT-128")
```
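
As a quick sanity check, here is a minimal inference sketch. It assumes the hub repository provides a tokenizer loadable via `transformers.AutoTokenizer` and that the model classes follow the standard transformers output convention; adjust if the repository recommends otherwise.

```python
import torch
from transformers import AutoTokenizer

# Assumption: the hub repo ships a compatible tokenizer.
tokenizer = AutoTokenizer.from_pretrained("nlpie/miniALBERT-128")
model = MiniAlbertForTokenClassification.from_pretrained("nlpie/miniALBERT-128")
model.eval()

inputs = tokenizer("MiniALBERT is a compact recursive transformer.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Assumption: outputs follow the standard transformers convention,
# with per-token logits of shape (batch, sequence_length, num_labels).
print(outputs.logits.shape)
```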

# Citation

If you use the model, please cite our paper:

```bibtex
@article{nouriborji2022minialbert,
  title={MiniALBERT: Model Distillation via Parameter-Efficient Recursive Transformers},
  author={Nouriborji, Mohammadmahdi and Rohanian, Omid and Kouchaki, Samaneh and Clifton, David A},
  journal={arXiv preprint arXiv:2210.06425},
  year={2022}
}
```