AntiBERTa2 🧬

AntiBERTa2 is an antibody-specific language model based on the RoFormer model and pre-trained using masked language modelling. We also provide a multimodal version, AntiBERTa2-CSSP, trained with a contrastive objective similar to the CLIP method. Both AntiBERTa2 and AntiBERTa2-CSSP are described in further detail in our paper, accepted at the NeurIPS MLSB Workshop 2023.
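
For readers unfamiliar with CLIP-style training, the following is a minimal sketch of a generic symmetric contrastive loss over a batch of paired embeddings from two modalities. It is not the AntiBERTa2-CSSP implementation: the function names, the temperature of 0.07, and the toy embeddings are all assumptions made for illustration, and the paper remains the reference for the actual objective.

```python
import math


def l2_normalise(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def log_softmax(row):
    """Numerically stable log-softmax of a list of logits."""
    peak = max(row)
    log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
    return [x - log_z for x in row]


def clip_style_loss(emb_a, emb_b, temperature=0.07):
    """Generic CLIP-style loss (hypothetical helper, not the authors' code).

    emb_a[i] and emb_b[i] are embeddings of the same item in two modalities.
    Each item should be most similar to its own partner, in both directions.
    """
    a = [l2_normalise(v) for v in emb_a]
    b = [l2_normalise(v) for v in emb_b]
    n = len(a)
    # Pairwise cosine similarities, sharpened by the temperature.
    logits = [
        [sum(x * y for x, y in zip(a[i], b[j])) / temperature for j in range(n)]
        for i in range(n)
    ]
    # Cross-entropy from modality A to B (rows) and from B to A (columns).
    loss_a_to_b = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_b_to_a = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return (loss_a_to_b + loss_b_to_a) / 2


# Toy embeddings: correctly paired items give a much lower loss than mismatched ones.
aligned = clip_style_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = clip_style_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

Minimising this loss pulls matching pairs together and pushes non-matching pairs in the same batch apart, which is the general idea behind a contrastive objective of this kind.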

Both AntiBERTa2 models are available for non-commercial use only. Output antibody sequences (e.g. from infilling via masked language models) may likewise be used only for non-commercial purposes. Users seeking commercial use of our models or generated antibodies should contact us at info@alchemab.com.

| Model variant | Parameters | Config |
|---|---|---|
| AntiBERTa2 | 202M | 16L, 16H, 1024d |
| AntiBERTa2-CSSP | 202M | 16L, 16H, 1024d |
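
As a rough sanity check on the parameter count, the sketch below estimates the size of the Transformer encoder stack from the configuration above (16 layers, hidden size 1024). It assumes a feed-forward width of 4 times the hidden size, which is a common default but is not stated in this card, and it ignores the embeddings and the language-modelling head. The function name is hypothetical.

```python
def estimate_encoder_params(num_layers, hidden_size, ffn_multiplier=4):
    """Approximate parameter count of a standard Transformer encoder stack.

    Assumption: feed-forward width = ffn_multiplier * hidden_size (not given
    in the model card). Embeddings and task heads are not counted.
    """
    d = hidden_size
    ffn = ffn_multiplier * d
    attention = 4 * (d * d + d)                     # Q, K, V and output projections, with biases
    feed_forward = (d * ffn + ffn) + (ffn * d + d)  # two linear layers, with biases
    layer_norms = 2 * (2 * d)                       # two LayerNorms, each with scale and shift
    return num_layers * (attention + feed_forward + layer_norms)


encoder_params = estimate_encoder_params(num_layers=16, hidden_size=1024)
print(f"{encoder_params / 1e6:.1f}M")  # prints 201.5M
```

Under these assumptions the encoder alone accounts for about 201.5M parameters, consistent with the roughly 202M listed in the table once the embeddings and output head are added. The number of attention heads (16H) does not change the count, because the heads split the same projection matrices.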

Example usage

>>> from transformers import (
...     RoFormerForMaskedLM,
...     RoFormerForSequenceClassification,
...     RoFormerTokenizer,
...     pipeline,
... )
>>> tokenizer = RoFormerTokenizer.from_pretrained("alchemab/antiberta2")
>>> model = RoFormerForMaskedLM.from_pretrained("alchemab/antiberta2")

>>> filler = pipeline("fill-mask", model=model, tokenizer=tokenizer)
>>> filler("Ḣ Q V Q ... C A [MASK] D ... T V S S") # fill in the mask

>>> # Loading the checkpoint for sequence classification emits warnings that a
>>> # new linear layer has been added and randomly initialized.
>>> new_model = RoFormerForSequenceClassification.from_pretrained("alchemab/antiberta2")
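
The fill-mask example above passes residues as space-separated tokens with `[MASK]` at the position to infill. The following is a small sketch of two convenience helpers for that workflow. Both function names are hypothetical and not part of `transformers` or this repository. The first formats a contiguous amino-acid string and adds no prefix or special tokens; the second ranks the list of dictionaries that a fill-mask pipeline returns for a single masked position.

```python
def to_model_input(sequence, mask_positions=()):
    """Space-separate residues and replace the given 0-based positions with [MASK].

    Hypothetical helper: it only reproduces the spacing and [MASK] convention
    shown in the usage example.
    """
    mask_positions = set(mask_positions)
    for position in mask_positions:
        if not 0 <= position < len(sequence):
            raise IndexError(f"mask position {position} is outside the sequence")
    tokens = [
        "[MASK]" if index in mask_positions else residue
        for index, residue in enumerate(sequence.upper())
    ]
    return " ".join(tokens)


def top_predictions(pipeline_output, k=3):
    """Return the k highest-scoring (token, score) pairs for one masked position.

    Expects the single-mask output of a fill-mask pipeline: a list of dicts
    with "token_str" and "score" keys.
    """
    ranked = sorted(pipeline_output, key=lambda item: item["score"], reverse=True)
    return [(item["token_str"], round(item["score"], 3)) for item in ranked[:k]]


# Toy fragment for illustration only, not a complete antibody sequence.
masked = to_model_input("QVQL", mask_positions=[2])
print(masked)  # prints Q V [MASK] L
```

With the objects from the usage example loaded, `top_predictions(filler(masked))` would list the most likely residues for the masked position. When an input contains several `[MASK]` tokens, the pipeline returns one list per mask, so each inner list would need to be passed to `top_predictions` separately.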