
Yinka

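The Yinka embedding model was produced by continued training of the open-source model stella-v3.5-mrl, using the multi-task hybrid loss training described for piccolo2. Like stella-v3.5-mrl, this model also supports variable embedding dimensions.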

Usage

The model is used in the same way as stella-v3.5-mrl; no prefix is required.

from sentence_transformers import SentenceTransformer
from sklearn.preprocessing import normalize

model = SentenceTransformer("Classical/Yinka")
# Do not normalize yet! Take the first n dimensions, then normalize.
vectors = model.encode(["text1", "text2"], normalize_embeddings=False)
print(vectors.shape)  # shape is [2,1792]
n_dims = 768
cut_vecs = normalize(vectors[:, :n_dims])
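The order of operations in the snippet above matters: truncating a vector that is already unit length leaves it shorter than unit length. The following is a minimal sketch of that point using NumPy only. The random matrix is a stand-in for the output of `model.encode(..., normalize_embeddings=False)`, and the helper name `truncate_and_normalize` is hypothetical, not part of the model or of sentence-transformers.

```python
import numpy as np


def truncate_and_normalize(vectors: np.ndarray, n_dims: int) -> np.ndarray:
    """Keep the first n_dims components of each row, then L2-normalize the rows."""
    cut = vectors[:, :n_dims]
    norms = np.linalg.norm(cut, axis=1, keepdims=True)
    # Guard against division by zero for an all-zero row.
    return cut / np.clip(norms, 1e-12, None)


# Assumption: random data standing in for un-normalized 1792-dim embeddings.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(2, 1792)).astype(np.float32)

# Correct order: truncate first, then normalize.
cut_vecs = truncate_and_normalize(vectors, 768)
row_norms = np.linalg.norm(cut_vecs, axis=1)

# With unit-length rows, cosine similarity is a plain dot product.
similarity = cut_vecs @ cut_vecs.T

# Wrong order: normalize the full vector, then truncate.
full_unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
wrong_norms = np.linalg.norm(full_unit[:, :768], axis=1)

print(row_norms)    # each row has norm 1 up to float error
print(wrong_norms)  # each row has norm below 1
```

If the truncated vectors are only ever compared by cosine similarity computed with explicit norms, the order makes no difference; it matters when downstream code assumes unit-length vectors and uses a raw dot product.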

Results

| Model Name | Model Size (GB) | Dimension | Sequence Length | Classification (9) | Clustering (4) | Pair Classification (2) | Reranking (4) | Retrieval (8) | STS (8) | Average (35) |
|---|---|---|---|---|---|---|---|---|---|---|
| Yinka | 1.21 | 1792 | 512 | 74.30 | 61.99 | 89.87 | 69.77 | 74.40 | 63.30 | 70.79 |
| stella-v3.5-mrl | 1.21 | 1792 | 512 | 71.56 | 54.39 | 88.09 | 68.45 | 73.51 | 62.48 | 68.56 |
| piccolo-large-zh-v2 | 1.21 | 1792 | 512 | 74.59 | 62.17 | 90.24 | 70 | 74.36 | 63.5 | 70.95 |
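The number in parentheses after each task type is the count of datasets in that category, and the counts sum to 35. Assuming the Average (35) column is the mean over all 35 datasets, it can be approximated as a count-weighted mean of the category scores. The sketch below recomputes it from the rounded values in the table, so small deviations from the reported figure are expected; the variable and function names are illustrative only.

```python
# Number of datasets per task category, taken from the table header.
task_counts = [9, 4, 2, 4, 8, 8]  # Classification, Clustering, Pair Cls., Reranking, Retrieval, STS

# Category scores and the reported average, copied from the table.
results = {
    "Yinka": ([74.30, 61.99, 89.87, 69.77, 74.40, 63.30], 70.79),
    "stella-v3.5-mrl": ([71.56, 54.39, 88.09, 68.45, 73.51, 62.48], 68.56),
    "piccolo-large-zh-v2": ([74.59, 62.17, 90.24, 70.0, 74.36, 63.5], 70.95),
}


def weighted_average(scores, counts):
    """Mean of category scores weighted by the number of datasets per category."""
    return sum(s * c for s, c in zip(scores, counts)) / sum(counts)


recomputed = {
    name: weighted_average(scores, task_counts)
    for name, (scores, _reported) in results.items()
}

for name, (_scores, reported) in results.items():
    print(f"{name}: recomputed {recomputed[name]:.2f}, reported {reported:.2f}")
```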

Training details

TODO

Licence

This model is released under the MIT licence.
