EricLee committed on
Commit
f033a37
1 Parent(s): 6d463d2

Update README.md

Files changed (1)
  1. README.md +16 -1
README.md CHANGED
@@ -9,5 +9,20 @@ tags:
  datasets:
  - shibing624/nli_zh
  pipeline_tag: sentence-similarity
+
  ---
- Based on the derivative model of https://huggingface.co/shibing624/text2vec-base-chinese, replace MacBERT with hfl/chinese-roberta-wwm-ext, expand max_seq_length from 128 to 512, and keep other training conditions unchanged.
+ Introduction:
+ Reference: https://github.com/shibing624/text2vec
+ Built on the CoSENT model architecture with hfl/chinese-roberta-wwm-ext as the base model, fine-tuned again on the Chinese STS-B dataset, with max_seq_length extended from the original 128 to 512.
+ eval_spearman: 0.833
+
+ ---
+ Downstream tasks:
+ The model can be loaded with either the text2vec library or the sentence-transformers library.
+ Text embeddings:
+ ```
+ >>> from text2vec import SentenceModel, EncoderType
+ >>> model = SentenceModel('EricLee/text2vec-roberta-512', encoder_type=EncoderType.FIRST_LAST_AVG, max_seq_length=512)
+ >>> model.encode("今天天气不错啊").shape
+ (768,)
+ ```
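
Note on the `eval_spearman` figure added in this commit: the README does not say how it was computed. A minimal sketch, assuming it follows the usual STS-style evaluation (Spearman rank correlation between the cosine similarity of each sentence pair's embeddings and the gold similarity label). The vectors and labels below are toy values invented for illustration, not model outputs, and the helper names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def average_ranks(values):
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy "embedding" pairs (assumption: stand-ins for real sentence embeddings).
pairs = [
    ([1.0, 0.0], [1.0, 0.1]),   # nearly parallel
    ([1.0, 0.0], [0.5, 0.5]),   # 45 degrees apart
    ([1.0, 0.0], [0.0, 1.0]),   # orthogonal
    ([1.0, 0.0], [-1.0, 0.0]),  # opposite
]
gold = [5.0, 3.0, 1.0, 0.0]     # toy gold similarity labels

predicted = [cosine_similarity(a, b) for a, b in pairs]
score = spearman(predicted, gold)
print(f"spearman = {score:.3f}")
```

Because Spearman correlation depends only on rank order, it rewards a model for ordering pairs correctly even when its cosine scores are on a different scale from the gold labels.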