haonan-li committed on
Commit
7f10c49
1 Parent(s): 00cee19

update model

Files changed (3)
  1. README.md +69 -0
  2. adapter_config.json +19 -0
  3. pytorch_model.bin +3 -0
README.md ADDED
@@ -0,0 +1,69 @@
+ ---
+ license: mit
+ ---
+
+ #### Current Training Steps: 68,000
+
+ This repo contains a low-rank adapter (LoRA) for LLaMA-13b,
+ fitted on the [Stanford-Alpaca-52k](https://github.com/tatsu-lab/stanford_alpaca)
+ and [databricks-dolly-15k](https://github.com/databrickslabs/dolly/tree/master/data) data in 52 languages.
+
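+ As a quick orientation (an editorial sketch, not part of the original card): the adapter can be applied on top of the base model with `transformers` and `peft`. The adapter repo ID and the Alpaca-style prompt below are assumptions, not values confirmed by this card.
+
+ ```python
+ # Hypothetical loading sketch -- the adapter repo ID and prompt format are placeholders.
+ import torch
+ from transformers import LlamaForCausalLM, LlamaTokenizer
+ from peft import PeftModel
+
+ base = "decapoda-research/llama-13b-hf"       # base model named in the training command below
+ adapter = "MBZUAI/bactrian-x-llama-13b-lora"  # placeholder adapter repo ID
+
+ tokenizer = LlamaTokenizer.from_pretrained(base)
+ model = LlamaForCausalLM.from_pretrained(base, torch_dtype=torch.float16, device_map="auto")
+ model = PeftModel.from_pretrained(model, adapter)  # applies the LoRA weights from this repo
+ model.eval()
+
+ prompt = "### Instruction:\nTranslate 'good morning' to Indonesian.\n\n### Response:\n"
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ with torch.no_grad():
+     output = model.generate(**inputs, max_new_tokens=64)
+ print(tokenizer.decode(output[0], skip_special_tokens=True))
+ ```
+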
+ ### Dataset Creation
+
+ 1. English Instructions: The English instructions are obtained from [alpaca-52k](https://github.com/tatsu-lab/stanford_alpaca) and [dolly-15k](https://github.com/databrickslabs/dolly/tree/master/data).
+ 2. Instruction Translation: The instructions (and inputs) are translated into the target languages using the Google Translate API (conducted in April 2023).
+ 3. Output Generation: We generate outputs with `gpt-3.5-turbo` for each language (conducted in April 2023); a sketch of steps 2 and 3 follows below.
+
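+ The sketch below illustrates steps 2 and 3 under stated assumptions: it uses the `google-cloud-translate` v2 client and the pre-1.0 `openai` SDK, with a placeholder target language. It is an editorial illustration of the pipeline described above, not the authors' actual script.
+
+ ```python
+ # Illustrative translate-then-generate sketch (not the original code).
+ # Assumes: pip install google-cloud-translate "openai<1.0", plus valid credentials for both APIs.
+ from google.cloud import translate_v2 as translate
+ import openai
+
+ translator = translate.Client()
+ target_lang = "id"  # placeholder target language (Indonesian)
+
+ def translate_text(text: str) -> str:
+     """Step 2: translate an English instruction/input into the target language."""
+     return translator.translate(text, target_language=target_lang)["translatedText"]
+
+ def generate_output(instruction: str) -> str:
+     """Step 3: ask gpt-3.5-turbo to answer the translated instruction."""
+     response = openai.ChatCompletion.create(
+         model="gpt-3.5-turbo",
+         messages=[{"role": "user", "content": instruction}],
+     )
+     return response["choices"][0]["message"]["content"]
+
+ translated = translate_text("Describe the structure of an atom.")
+ print(generate_output(translated))
+ ```
+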
+ <h3 align="center">
+ <img src="https://raw.githubusercontent.com/fajri91/eval_picts/master/BactrianX_dataset.jpg" width="950" align="center">
+ </h3>
+
+ ### Training Parameters
+
+ The code for training the model is provided in our [GitHub repository](https://github.com/mbzuai-nlp/Bactrian-X), which is adapted from [Alpaca-LoRA](https://github.com/tloen/alpaca-lora).
+ This version of the weights was trained with the following hyperparameters:
+
+ - Epochs: 10
+ - Batch size: 128
+ - Cutoff length: 512
+ - Learning rate: 3e-4
+ - LoRA _r_: 64
+ - LoRA target modules: q_proj, k_proj, v_proj, o_proj
+
+ That is:
+
+ ```bash
+ python finetune.py \
+     --base_model='decapoda-research/llama-13b-hf' \
+     --num_epochs=10 \
+     --batch_size=128 \
+     --cutoff_len=512 \
+     --group_by_length \
+     --output_dir='./bactrian-x-llama-13b-lora' \
+     --lora_target_modules='q_proj,k_proj,v_proj,o_proj' \
+     --lora_r=64 \
+     --micro_batch_size=32
+ ```
+
+ Instructions for running it can be found at https://github.com/MBZUAI-nlp/Bactrian-X.
+
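+ For reference, the adapter settings recorded in this repo's `adapter_config.json` (shown further down this page) correspond roughly to the `peft.LoraConfig` sketched below. This is an editorial illustration of how those values map onto the PEFT API, not a snippet taken from the training code.
+
+ ```python
+ # Sketch: the shipped adapter_config.json expressed as a peft.LoraConfig.
+ from peft import LoraConfig
+
+ lora_config = LoraConfig(
+     r=64,           # LoRA rank, as in the hyperparameter list above
+     lora_alpha=16,  # from adapter_config.json
+     lora_dropout=0.05,  # from adapter_config.json
+     bias="none",
+     target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
+     task_type="CAUSAL_LM",
+ )
+ ```
+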
+ ### Discussion of Biases
+
+ (1) Translation bias; (2) potential English-culture bias in the translated dataset.
+
+ ### Citation Information
+
+ ```bibtex
+ @misc{bactrian,
+   author = {Haonan Li and Fajri Koto and Minghao Wu and Alham Fikri Aji and Timothy Baldwin},
+   title = {Bactrian-X: A Multilingual Replicable Instruction-Following Model},
+   year = {2023},
+   publisher = {GitHub},
+   journal = {GitHub repository},
+   howpublished = {\url{https://github.com/MBZUAI-nlp/Bactrian-X}},
+ }
+ ```
adapter_config.json ADDED
@@ -0,0 +1,19 @@
+ {
+   "base_model_name_or_path": "decapoda-research/llama-13b-hf",
+   "bias": "none",
+   "fan_in_fan_out": false,
+   "inference_mode": true,
+   "init_lora_weights": true,
+   "lora_alpha": 16,
+   "lora_dropout": 0.05,
+   "modules_to_save": null,
+   "peft_type": "LORA",
+   "r": 64,
+   "target_modules": [
+     "q_proj",
+     "k_proj",
+     "v_proj",
+     "o_proj"
+   ],
+   "task_type": "CAUSAL_LM"
+ }
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1a5209f73bf5942bfda97ea438272b02e43246db60dfd2c5e794086a17e2eeb5
+ size 419546189
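
As an editorial aside (not part of the commit): the file above is a Git LFS pointer rather than the weights themselves. A minimal check that a locally downloaded `pytorch_model.bin` matches the pointer's `oid` and `size` could look like the sketch below.

```python
# Editorial sketch: verify a downloaded pytorch_model.bin against the LFS pointer above.
import hashlib
from pathlib import Path

EXPECTED_SHA256 = "1a5209f73bf5942bfda97ea438272b02e43246db60dfd2c5e794086a17e2eeb5"
EXPECTED_SIZE = 419546189  # bytes, from the pointer's "size" field

path = Path("pytorch_model.bin")
digest = hashlib.sha256()
with path.open("rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        digest.update(chunk)

assert path.stat().st_size == EXPECTED_SIZE, "size mismatch"
assert digest.hexdigest() == EXPECTED_SHA256, "sha256 mismatch"
print("pytorch_model.bin matches the LFS pointer")
```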