Yinka
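The Yinka embedding model was trained further from the open-source model stella-v3.5-mrl, using the multi-task hybrid loss training described in piccolo2. Like its base model, it also supports variable embedding dimensions.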
Usage
The model is used the same way as stella-v3.5-mrl; no prefix is required.
from sentence_transformers import SentenceTransformer
from sklearn.preprocessing import normalize
model = SentenceTransformer("Classical/Yinka")
# Do not normalize yet! Take the first n dimensions, then normalize.
vectors = model.encode(["text1", "text2"], normalize_embeddings=False)
print(vectors.shape)  # (2, 1792)
n_dims = 768
cut_vecs = normalize(vectors[:, :n_dims])
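The order of operations above (truncate first, then normalize) can be checked without downloading the model. The following is a minimal sketch using NumPy only: `truncate_and_normalize` is a hypothetical helper name, and the random matrix is just a stand-in for the unnormalized output of `model.encode`, not real Yinka embeddings.

```python
import numpy as np


def truncate_and_normalize(vectors: np.ndarray, n_dims: int) -> np.ndarray:
    """Keep the first n_dims columns, then L2-normalize each row.

    Hypothetical helper for illustration; it mirrors
    sklearn.preprocessing.normalize(vectors[:, :n_dims]).
    """
    if not 0 < n_dims <= vectors.shape[1]:
        raise ValueError("n_dims must be between 1 and the full embedding width")
    cut = vectors[:, :n_dims]
    norms = np.linalg.norm(cut, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    return cut / np.clip(norms, 1e-12, None)


# Assumption: random data standing in for
# model.encode([...], normalize_embeddings=False).
rng = np.random.default_rng(0)
fake_vectors = rng.normal(size=(2, 1792)).astype(np.float32)

cut_vecs = truncate_and_normalize(fake_vectors, 768)

# Rows are unit length, so the dot product is the cosine similarity.
similarity = cut_vecs @ cut_vecs.T
print(cut_vecs.shape)
```

Normalizing before truncation would leave the shortened vectors with norms below 1, so dot products would no longer be cosine similarities; that is why truncation comes first.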
Results
Model Name | Model Size (GB) | Dimension | Sequence Length | Classification (9) | Clustering (4) | Pair Classification (2) | Reranking (4) | Retrieval (8) | STS (8) | Average (35) |
---|---|---|---|---|---|---|---|---|---|---|
Yinka | 1.21 | 1792 | 512 | 74.30 | 61.99 | 89.87 | 69.77 | 74.40 | 63.30 | 70.79 |
stella-v3.5-mrl | 1.21 | 1792 | 512 | 71.56 | 54.39 | 88.09 | 68.45 | 73.51 | 62.48 | 68.56 |
piccolo-large-zh-v2 | 1.21 | 1792 | 512 | 74.59 | 62.17 | 90.24 | 70 | 74.36 | 63.5 | 70.95 |
Training details
TODO
Licence
This model is released under the MIT licence.
Evaluation results
- cos_sim_pearson on MTEB AFQMC validation set (self-reported): 56.306
- cos_sim_spearman on MTEB AFQMC validation set (self-reported): 61.020
- euclidean_pearson on MTEB AFQMC validation set (self-reported): 58.618
- euclidean_spearman on MTEB AFQMC validation set (self-reported): 60.131
- manhattan_pearson on MTEB AFQMC validation set (self-reported): 58.619
- manhattan_spearman on MTEB AFQMC validation set (self-reported): 60.126
- cos_sim_pearson on MTEB ATEC test set (self-reported): 55.861
- cos_sim_spearman on MTEB ATEC test set (self-reported): 59.020
- euclidean_pearson on MTEB ATEC test set (self-reported): 62.028
- euclidean_spearman on MTEB ATEC test set (self-reported): 58.605