SetFit with sentence-transformers/all-MiniLM-L6-v2

This is a SetFit model for text classification. It uses sentence-transformers/all-MiniLM-L6-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
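
As a rough illustration of the first step, the sketch below builds contrastive training pairs from a handful of labeled texts: pairs that share a label get a target similarity of 1.0, and pairs with different labels get 0.0. The helper name `make_pairs` and the toy data are hypothetical, and SetFit's own pair sampler differs in detail (for example, it can oversample to balance positives and negatives).

```python
from itertools import combinations


def make_pairs(texts, labels):
    """Build (text_a, text_b, target) triples for contrastive fine-tuning.

    target is 1.0 when both texts share a label, else 0.0.
    Hypothetical helper for illustration; not SetFit's actual sampler.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(texts), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((a, b, target))
    return pairs


# Toy data: placeholder texts, two labels borrowed from this card.
texts = ["query A", "query B", "query C", "query D"]
labels = ["lexical", "lexical", "semantic", "semantic"]

pairs = make_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))
```

Even four labeled texts yield six pairs, which is why this approach can work with few examples per class.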

Model Details

Model Description

Model Sources

Model Labels

Label Examples
very_semantic
  • 'What are the key considerations when proposing names for a project or initiative?'
  • 'What are the key aspects of team life and events in a company?'
  • 'What is being asked for or sought in this conversation?'
lexical
  • 'Who is responsible for reviewing and signing documents related to conference submissions?'
  • 'How do data architecture and management systems enable digital transformation and address its associated challenges?'
  • 'How do keys or access credentials get shared or transferred among team members in a workplace?'
very_lexical
  • 'What are some of the key challenges associated with handling and storing large amounts of genomic data?'
  • "What is the focus of Eurobiomed's partnership with Digital113?"
  • 'What are the key considerations for generating well-formatted JSON instances that conform to a given schema?'
semantic
  • 'How can visualizations be used to enhance documentation and collaboration in software development?'
  • 'What are the key considerations when choosing a distance metric for a vector database?'
  • 'How can AI be leveraged to support HR departments in detecting and addressing gender bias?'

Evaluation

Metrics

Label Accuracy
all 0.3077

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("yaniseuranova/setfit-rag-hybrid-search-query-router-test")
# Run inference
preds = model("What is the purpose of the message posted by the CR?")
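
The model name suggests the predicted label is meant to steer a hybrid (lexical plus semantic) search. A minimal sketch of one way to consume a prediction is shown here. The weight values and the `route` helper are assumptions made for illustration and do not come from this model card.

```python
# Hypothetical mapping from predicted label to a semantic-search weight
# (0.0 = purely lexical, 1.0 = purely semantic). These values are
# illustrative assumptions, not taken from the model card.
SEMANTIC_WEIGHT = {
    "very_lexical": 0.0,
    "lexical": 0.25,
    "semantic": 0.75,
    "very_semantic": 1.0,
}


def route(label, default=0.5):
    """Return the semantic weight for a predicted label, or a neutral default."""
    return SEMANTIC_WEIGHT.get(str(label), default)


# In practice `label` would come from `model(query)`; here it is hard-coded.
print(route("very_semantic"))
print(route("unknown_label"))
```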

Training Details

Training Set Metrics

Training set Min Median Max
Word count 7 14.1913 24

Label Training Sample Count
lexical 41
semantic 24
very_lexical 17
very_semantic 33
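
The label counts above sum to 115 training samples and are not balanced. The short calculation below gives each label's share and the share of the most common label. If the evaluation set had a similar label mix (an assumption, since the card does not describe it), always predicting the most common label would score about 0.357, which gives some context for the reported accuracy of 0.3077.

```python
# Training sample counts per label, copied from the table above.
counts = {"lexical": 41, "semantic": 24, "very_lexical": 17, "very_semantic": 33}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}

# Share of the most frequent label, i.e. a majority-class baseline
# on a set with the same label distribution.
majority_label = max(counts, key=counts.get)
majority_share = shares[majority_label]

# Accuracy expected from uniform random guessing over the four labels.
uniform_share = 1 / len(counts)

for label, share in shares.items():
    print(f"{label}: {share:.3f}")
print(f"majority baseline ({majority_label}): {majority_share:.3f}")
print(f"uniform baseline: {uniform_share:.3f}")
```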

Training Hyperparameters

  • batch_size: (4, 4)
  • num_epochs: (2, 2)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: True
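
These settings can be cross-checked against the training log with a little arithmetic. The sketch assumes that each logged step consumes one batch of `batch_size` sentence pairs with no gradient accumulation, and that warmup is a fixed fraction of the total step count. Both are assumptions about the training loop, not statements from this card.

```python
import math

batch_size = 4            # embedding-phase batch size from the list above
num_epochs = 2            # embedding-phase epochs from the list above
total_steps = 4796        # last step shown in the training log
warmup_proportion = 0.1

steps_per_epoch = total_steps // num_epochs

# With one batch per step, the number of sentence pairs per epoch lies
# between a final batch holding a single pair and a full final batch.
max_pairs_per_epoch = steps_per_epoch * batch_size
min_pairs_per_epoch = (steps_per_epoch - 1) * batch_size + 1

# Approximate number of learning-rate warmup steps.
warmup_steps = math.ceil(total_steps * warmup_proportion)

print(steps_per_epoch, min_pairs_per_epoch, max_pairs_per_epoch, warmup_steps)
```

Under these assumptions, roughly 9,590 contrastive pairs per epoch were generated from 115 labeled samples, which is consistent with the `oversampling` strategy.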

Training Results

Epoch Step Training Loss Validation Loss
0.0004 1 0.4883 -
0.0209 50 0.3738 -
0.0417 100 0.2192 -
0.0626 150 0.1503 -
0.0834 200 0.1514 -
0.1043 250 0.1829 -
0.1251 300 0.4191 -
0.1460 350 0.2136 -
0.1668 400 0.1847 -
0.1877 450 0.1681 -
0.2085 500 0.222 -
0.2294 550 0.0397 -
0.2502 600 0.2626 -
0.2711 650 0.1343 -
0.2919 700 0.1769 -
0.3128 750 0.1704 -
0.3336 800 0.401 -
0.3545 850 0.1405 -
0.3753 900 0.1892 -
0.3962 950 0.1444 -
0.4170 1000 0.2337 -
0.4379 1050 0.1848 -
0.4587 1100 0.0601 -
0.4796 1150 0.2467 -
0.5004 1200 0.1829 -
0.5213 1250 0.1695 -
0.5421 1300 0.3892 -
0.5630 1350 0.1408 -
0.5838 1400 0.0506 -
0.6047 1450 0.1835 -
0.6255 1500 0.3284 -
0.6464 1550 0.1797 -
0.6672 1600 0.1118 -
0.6881 1650 0.1502 -
0.7089 1700 0.112 -
0.7298 1750 0.0401 -
0.7506 1800 0.117 -
0.7715 1850 0.1287 -
0.7923 1900 0.0623 -
0.8132 1950 0.2128 -
0.8340 2000 0.1542 -
0.8549 2050 0.1774 -
0.8757 2100 0.3252 -
0.8966 2150 0.0152 -
0.9174 2200 0.0539 -
0.9383 2250 0.0047 -
0.9591 2300 0.1232 -
0.9800 2350 0.3466 -
1.0 2398 - 0.3644
1.0008 2400 0.0296 -
1.0217 2450 0.3459 -
1.0425 2500 0.0867 -
1.0634 2550 0.1343 -
1.0842 2600 0.2074 -
1.1051 2650 0.0052 -
1.1259 2700 0.0548 -
1.1468 2750 0.0441 -
1.1676 2800 0.0821 -
1.1885 2850 0.0546 -
1.2093 2900 0.1286 -
1.2302 2950 0.1222 -
1.2510 3000 0.0227 -
1.2719 3050 0.3011 -
1.2927 3100 0.018 -
1.3136 3150 0.0581 -
1.3344 3200 0.0485 -
1.3553 3250 0.2369 -
1.3761 3300 0.1681 -
1.3970 3350 0.1289 -
1.4178 3400 0.1664 -
1.4387 3450 0.1467 -
1.4595 3500 0.1399 -
1.4804 3550 0.3045 -
1.5013 3600 0.2155 -
1.5221 3650 0.061 -
1.5430 3700 0.0787 -
1.5638 3750 0.3649 -
1.5847 3800 0.1202 -
1.6055 3850 0.1004 -
1.6264 3900 0.154 -
1.6472 3950 0.0944 -
1.6681 4000 0.0004 -
1.6889 4050 0.1843 -
1.7098 4100 0.2233 -
1.7306 4150 0.2203 -
1.7515 4200 0.0986 -
1.7723 4250 0.2295 -
1.7932 4300 0.1763 -
1.8140 4350 0.3487 -
1.8349 4400 0.3285 -
1.8557 4450 0.0152 -
1.8766 4500 0.1108 -
1.8974 4550 0.2416 -
1.9183 4600 0.0476 -
1.9391 4650 0.2929 -
1.9600 4700 0.1006 -
1.9808 4750 0.0925 -
2.0 4796 - 0.3669
  • The bold row denotes the saved checkpoint.
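
With `load_best_model_at_end: True`, the saved checkpoint is presumably the evaluation row with the lowest validation loss. Assuming that selection rule, a small sketch over the two evaluation rows in the table picks it out:

```python
# (epoch, step, validation_loss) for the two evaluation rows in the log.
eval_rows = [
    (1.0, 2398, 0.3644),
    (2.0, 4796, 0.3669),
]

# Assumed selection rule: lowest validation loss wins.
best = min(eval_rows, key=lambda row: row[2])
print(f"best checkpoint: epoch {best[0]}, step {best[1]}, val loss {best[2]}")
```

Under this rule, the end-of-first-epoch checkpoint is selected, because validation loss rose slightly during the second epoch.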

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.3
  • Sentence Transformers: 2.6.1
  • Transformers: 4.39.0
  • PyTorch: 2.3.1+cu121
  • Datasets: 2.18.0
  • Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Model size: 22.7M params (Safetensors)
Tensor type: F32