Update README.md
README.md
CHANGED
@@ -33,7 +33,7 @@ outputs = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
 ### Dataset
 Since the Axolotl corpus contains misaligments, we just select the best samples (~8,000 samples). We also use the [bible-corpus](https://github.com/christos-c/bible-corpus) (7,821 samples).
 
-| Axolotl best aligned
+| Axolotl books best aligned |
 |:-----------------------------------------------------:|
 | Anales de Tlatelolco |
 | Diario |