
Model Card for the test-version of instructionBERT for Bertology

[Figure: instructionBERT drawing]

A minimalistic instruction model built on BERT, a pretrained encoder that has already been analysed extensively. This lets us study Bertology with instruction-tuned models, inspect the attention, and investigate what happens to BERT embeddings during fine-tuning.

The training code is released in the instructionBERT repository. We used the Hugging Face API for warm-starting BertGeneration with encoder-decoder models for this purpose.
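As a rough illustration of what warm-starting an encoder-decoder model from a BERT checkpoint can look like, here is a minimal sketch using `EncoderDecoderModel.from_encoder_decoder_pretrained` from the transformers library. It is not taken from the instructionBERT repository: the helper name `build_warm_started_model` is hypothetical, and the choice of `[CLS]`/`[SEP]` as start and end tokens is an assumption borrowed from the common BERT2BERT recipe.

```python
def build_warm_started_model(checkpoint: str = "bert-base-cased"):
    """Warm-start a BERT2BERT encoder-decoder from one pretrained checkpoint.

    Hypothetical helper for illustration; the actual training code may differ.
    """
    # Imported lazily so the function can be defined without loading the library.
    from transformers import AutoTokenizer, EncoderDecoderModel

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    # Encoder and decoder both start from the same BERT weights; the decoder's
    # cross-attention layers are newly initialised and still have to be trained.
    model = EncoderDecoderModel.from_encoder_decoder_pretrained(checkpoint, checkpoint)

    # BERT has no dedicated BOS/EOS tokens, so [CLS] and [SEP] are reused here
    # (an assumption, not something the model card states).
    model.config.decoder_start_token_id = tokenizer.cls_token_id
    model.config.eos_token_id = tokenizer.sep_token_id
    model.config.pad_token_id = tokenizer.pad_token_id
    return model, tokenizer
```

Calling `build_warm_started_model()` downloads the checkpoint and returns an untrained seq2seq model together with its tokenizer; it only becomes useful after fine-tuning on instruction data.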

Run the model with a longer output

from transformers import AutoTokenizer, EncoderDecoderModel

# load the fine-tuned seq2seq model and corresponding tokenizer
model_name = "Bachstelze/instructionBERTtest"
model = EncoderDecoderModel.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

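# tokenize the instruction ("prompt" avoids shadowing the built-in input())
prompt = "Write a poem about love, peace and pancake."
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
# generate up to 200 new tokens and decode the first returned sequence
output_ids = model.generate(input_ids, max_new_tokens=200)
print(tokenizer.decode(output_ids[0]))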

Training parameters

  • base model: "bert-base-cased"
  • test subset of the Muennighoff/flan dataset
  • trained for 0.97 epochs
  • batch size of 14
  • 10000 warm-up steps
  • learning rate of 0.00005
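The list above gives 10000 warm-up steps and a learning rate of 0.00005 but does not name the schedule. The sketch below assumes a linear warm-up, which is a common default, and shows how the learning rate would ramp up under that assumption. The function name `warmup_lr` is hypothetical, and any decay after warm-up is left out because the total number of training steps is not stated.

```python
def warmup_lr(step: int, base_lr: float = 5e-5, warmup_steps: int = 10_000) -> float:
    """Learning rate at a given optimizer step under an assumed linear warm-up."""
    if step < 0:
        raise ValueError("step must be non-negative")
    # Ramp linearly from 0 to base_lr over warmup_steps, then hold base_lr.
    scale = min(1.0, step / warmup_steps)
    return base_lr * scale


# Print the learning rate at a few points before and after warm-up ends.
for step in (0, 2_500, 10_000, 20_000):
    print(step, warmup_lr(step))
```

With these values the full learning rate is only reached after 10000 optimizer steps, so a long warm-up takes up a noticeable share of a run that lasts less than one epoch.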

Purpose of instructionBERT

InstructionBERT is intended for research purposes. The model-generated text should be treated as a starting point rather than a definitive solution for potential use cases. Users should be cautious when employing these models in their applications.

Model size: 137M params (Safetensors, tensor type F32)