cat-emb-2-128 / README.md
license: apache-2.0

Cat Embeddings

A set of embedding models trained to study how embedding quality varies with model architecture (width vs. depth) under a size constraint of 12M parameters.

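  • cat-emb-2-128: 2 layers, hidden size 128, 4.4M parameters
  • cat-emb-4-128: 4 layers, hidden size 128, 4.8M parameters
  • cat-emb-8-128: 8 layers, hidden size 128, 5.6M parameters
  • cat-emb-12-128: 12 layers, hidden size 128, 6.4M parameters
  • cat-emb-2-256: 2 layers, hidden size 256, 9.7M parameters
  • cat-emb-4-256: 4 layers, hidden size 256, 11.3M parameters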
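
The sizes listed above imply a roughly constant cost per added layer at each width. The sketch below derives that marginal cost from the listed numbers and compares it with a back-of-the-envelope estimate of 12·H² weights per layer. That estimate is an assumption (a standard Transformer encoder layer with a 4H feed-forward block, ignoring biases and layer norms); this card does not state the exact layer layout. The function names are illustrative only.

```python
# Parameter counts in millions, copied from the list above,
# keyed by (number of layers, hidden size).
listed = {
    (2, 128): 4.4, (4, 128): 4.8, (8, 128): 5.6, (12, 128): 6.4,
    (2, 256): 9.7, (4, 256): 11.3,
}

def per_layer_millions(hidden, shallow, deep):
    """Marginal parameters per added layer, from two listed models of equal width."""
    return (listed[(deep, hidden)] - listed[(shallow, hidden)]) / (deep - shallow)

def estimate_layer_millions(hidden):
    """Assumed layer size: 4*H^2 attention weights + 8*H^2 feed-forward weights."""
    return 12 * hidden ** 2 / 1e6

for hidden, shallow, deep in [(128, 2, 12), (256, 2, 4)]:
    observed = per_layer_millions(hidden, shallow, deep)
    estimate = estimate_layer_millions(hidden)
    # Whatever is not explained by the layers (embeddings and other weights).
    outside_layers = listed[(shallow, hidden)] - shallow * observed
    print(f"H={hidden}: observed {observed:.2f}M/layer, "
          f"estimated {estimate:.2f}M/layer, "
          f"{outside_layers:.1f}M outside the layers")
```

Under this assumption the listed sizes are consistent with about 0.2M parameters per layer at hidden size 128 and about 0.8M at hidden size 256, which means most of each model's budget sits outside the Transformer layers (presumably in the embedding tables).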

Training

  • Stage 1: sequence length 192, batch size 2048, 50k steps, sentence pairs.
  • Stage 2: sequence length 512, batch size 64, 5k steps, sentence triplets.

Performance

MRL dim \ Task  BIOSSES  SICK-R  STS12   STS13   STS14   STS15   STS16   STSB    SummEval
128             0.7107   0.7126  0.6815  0.7343  0.7038  0.8163  0.7495  0.7652  0.2958
64              0.713    0.7123  0.6829  0.7348  0.7008  0.813   0.7475  0.7609  0.2861
32              0.6714   0.7094  0.6847  0.7345  0.6911  0.7989  0.7385  0.7545  0.3106
16              0.6637   0.697   0.669   0.7096  0.6665  0.7589  0.7183  0.7307  0.3164
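
The rows above compare embeddings at different "MRL dim" values. Assuming this refers to Matryoshka-style truncation (keeping only the leading components of the full 128-dimensional embedding), a minimal sketch of truncating and scoring a sentence pair could look like the following. The helper names are hypothetical and the vectors are random stand-ins, not output of the actual model.

```python
import numpy as np

def truncate_embeddings(embeddings, dim):
    """Keep the first `dim` components of each row and L2-normalize the result."""
    truncated = embeddings[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / np.clip(norms, 1e-12, None)

def cosine_scores(a, b):
    """Row-wise cosine similarity of two already-normalized matrices."""
    return np.sum(a * b, axis=1)

rng = np.random.default_rng(0)
# Stand-in data: random 128-d vectors and slightly perturbed copies,
# used here in place of real sentence embeddings.
left = rng.normal(size=(4, 128))
right = left + 0.1 * rng.normal(size=(4, 128))

for dim in (128, 64, 32, 16):
    scores = cosine_scores(truncate_embeddings(left, dim),
                           truncate_embeddings(right, dim))
    print(dim, np.round(scores, 3))
```

Re-normalizing after truncation matters: the leading components of a unit vector do not generally have unit length, so a plain dot product would otherwise no longer be a cosine similarity.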