LaBSE-Malach-Multilabel

A multilabel text classification model fine-tuned on an English subset (Malach ASR) of the Visual History Archive (VHA). It starts from LaBSE pretrained weights but uses the general Hugging Face framework rather than sentence-transformers. The input text segments used for fine-tuning averaged about 350 words.

Given an input string, the model predicts probabilities for 1063 keyword IDs from the VHA ontology, sorted by probability. When the predictions are encoded as a binary vector, probabilities >= 0.5 are typically treated as "True".
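
The thresholding step can be sketched as follows. This is a minimal illustration, not the author's code. It assumes the checkpoint loads with `AutoModelForSequenceClassification` from the `transformers` library, and that the classification head returns one raw logit per keyword, which must be passed through a sigmoid. The `model_id` argument is a placeholder for the repository name, and the helper names `logits_to_predictions` and `predict` are hypothetical.

```python
import math

THRESHOLD = 0.5  # cutoff mentioned in the model card


def sigmoid(z):
    """Numerically stable logistic function for a single float."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logits_to_predictions(logits, threshold=THRESHOLD):
    """Convert per-label logits into a ranked list and a binary vector.

    Multilabel classification scores each label independently, so a
    sigmoid is applied per label instead of a softmax over all labels.
    """
    probs = [sigmoid(z) for z in logits]
    # (label index, probability) pairs, highest probability first
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    # 1 where the probability reaches the threshold, else 0
    binary = [1 if p >= threshold else 0 for p in probs]
    return ranked, binary


def predict(text, model_id):
    """Run the model on one string. `model_id` is a placeholder (assumption)."""
    # Imported lazily so the post-processing above works without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Assumption: long segments are truncated to 512 tokens, the usual
    # limit for BERT-style encoders such as LaBSE.
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return logits_to_predictions(logits)
```

The indices in `ranked` are positions in the model's output vector. Translating them into VHA keyword IDs or labels requires the mapping mentioned below.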

The mapping from keyword IDs to labels will be added to the repository.

Safetensors · Model size: 472M params · Tensor type: F32