Commit History

Add Llama tokenizer creation for Dutch, English, Code, Markdown and TeX.
c78da21

yhavinga committed on

Hack some Dutch tokenizers into it
55df72d

yhavinga committed on

update
f331792

xu-song committed on

add compression leaderboard
1b7fc74

xu-song committed on

add grok mixtral
480ae5d

xu-song committed on

add compress rate
814ee6b

xu-song committed on

add zephyr
a6aee1d

xu-song committed on

add xlm-roberta
057bc67

xu-song committed on

add amber and crystal_coder
5db13e0

xu-song committed on

add character glm
f0f84b2

xu-song committed on

fix unicode error: 'unicodeescape' codec can't decode bytes in position 602-608: unknown Unicode character name
bce41d0

xu-song committed on

fix fastchat_t5_3b
c766a08

xu-song committed on

fix tiktoken special tokens
adcfb97

xu-song committed on

add aya
44c3329

xu-song committed on

fix olmo
2442c83

xu-song committed on

add olmo tokenizer
bbefe94

xu-song committed on

fix tiktoken
a6c67ec

xu-song committed on

fix gemma_7b
7011963

xu-song committed on

add gemma_7b
9c8ace5

xu-song committed on

add more tokenizer
5425d5d

xu-song committed on

fix tokenize
e6543ac

xu-song committed on

add more tokenizer
c75633b

xu-song committed on

update
6bdf6c6

xu-song committed on

update
9820e00

xu-song committed on

fix chatglm; new feature about add_special_tokens;
d27a756

xu-song committed on

add more tokenizer
d2417c7

xu-song committed on

add more tokenizers
a1b0cd0

xu-song committed on

add more tokenizer
3030d21

xu-song committed on

add more tokenizer
293bad6

xu-song committed on

fix moss
aa0c637

xu-song committed on

update
da93e39

xu-song committed on

update
2d550af

xu-song committed on

add skywork
c7ed4a2

xu-song committed on

add more tokenizers
f4973d4

xu-song committed on

add qwen
ef8594d

xu-song committed on

update
7cb27ea

xu-song committed on

update
9495a4f

xu-song committed on

update
0ce6477

xu-song committed on

update
614012d

xu-song committed on

update
8e0e4e9

xu-song committed on

update
819cf7f

xu-song committed on

update
7156337

xu-song committed on

update
d10ecd7

xu-song committed on

update
1ee0570

xu-song committed on

update
428b731

xu-song committed on

update
751936e

xu-song committed on