	---
	title: README
	emoji: 🏃
	colorFrom: gray
	colorTo: purple
	sdk: static
	pinned: false
	license: mit
	---

	# Model Description
	TinyBioBERT is a distilled version of [BioBERT](https://huggingface.co/dmis-lab/biobert-base-cased-v1.2?text=The+goal+of+life+is+%5BMASK%5D.), trained for 100k steps with a total batch size of 192 on the PubMed dataset.
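
A minimal sketch of how the checkpoint might be loaded to produce sentence embeddings. It assumes the `transformers` and `torch` packages are installed and that the model is published under the repository id `nlpie/tiny-biobert`. The `embed` helper and the mean-pooling step are illustrative choices, not part of the model itself.

```python
MODEL_ID = "nlpie/tiny-biobert"  # assumed Hugging Face repository id


def embed(sentences, model_id=MODEL_ID):
    """Return one mean-pooled embedding per input sentence (hypothetical helper)."""
    # Imported lazily so this file can be imported without the dependencies.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()

    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state  # (batch, tokens, hidden)

    # Average over real tokens only, ignoring padding positions.
    mask = batch["attention_mask"].unsqueeze(-1).to(hidden.dtype)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)


# Example call (downloads the checkpoint on first use):
# vectors = embed(["Aspirin inhibits platelet aggregation."])
```

Mean pooling is only one way to reduce token states to a single vector. For token-level tasks such as named entity recognition, the unpooled `last_hidden_state` would normally be used instead.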

	# Distillation Procedure
	This model uses a distillation method called ‘transformer-layer distillation’, which is applied to each layer of the student to align its attention maps and hidden states with those of the teacher.
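
The layer-wise objective can be sketched as a sum of mean squared errors between paired student and teacher layers. The sketch below uses plain Python lists for readability, where a real implementation would use tensors. It makes two assumptions that the description above does not state: student layers are paired with teacher layers at a uniform stride (as in TinyBERT), and the hidden states being compared already have matching shapes, for example after a learned projection. All function names are hypothetical.

```python
def uniform_layer_map(num_student_layers, num_teacher_layers):
    """0-based teacher layer index paired with each student layer (assumed mapping)."""
    if num_teacher_layers % num_student_layers != 0:
        raise ValueError("teacher depth must be a multiple of student depth")
    stride = num_teacher_layers // num_student_layers
    return [stride * (m + 1) - 1 for m in range(num_student_layers)]


def _flatten(values):
    """Yield every number in an arbitrarily nested list."""
    if isinstance(values, (int, float)):
        yield float(values)
    else:
        for item in values:
            yield from _flatten(item)


def mse(a, b):
    """Mean squared error between two equally shaped nested lists."""
    flat_a, flat_b = list(_flatten(a)), list(_flatten(b))
    if len(flat_a) != len(flat_b):
        raise ValueError("inputs must have the same number of elements")
    return sum((x - y) ** 2 for x, y in zip(flat_a, flat_b)) / len(flat_a)


def layer_distillation_loss(student_attn, student_hidden, teacher_attn, teacher_hidden):
    """Sum attention-map and hidden-state MSE over all paired layers.

    Each argument is a list with one entry per layer of that model.
    """
    layer_map = uniform_layer_map(len(student_attn), len(teacher_attn))
    total = 0.0
    for s, t in enumerate(layer_map):
        total += mse(student_attn[s], teacher_attn[t])
        total += mse(student_hidden[s], teacher_hidden[t])
    return total
```

Under this assumed mapping, a 4-layer student distilled from a 12-layer teacher would pair its layers with every third teacher layer, so `uniform_layer_map(4, 12)` gives `[2, 5, 8, 11]`.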

	# Architecture and Initialisation
	This model has 4 hidden layers, with a hidden dimension size and an embedding size of 768, resulting in a total of 15M parameters. Due to the model's small hidden dimension size, it uses random initialisation.
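
The layer count, hidden size, and parameter total can be checked against the published checkpoint. The following is a rough sketch, assuming a standard BERT encoder layout (word, position, and token-type embeddings, identical transformer layers, a pooler, and no language-modelling head) and the repository id `nlpie/tiny-biobert`. Both function names are hypothetical, and `count_from_hub` requires the `transformers` package.

```python
def count_bert_parameters(vocab_size, hidden_size, num_layers, intermediate_size,
                          max_positions=512, type_vocab_size=2):
    """Count trainable parameters of a standard BERT encoder with a pooler."""
    h = hidden_size
    layer_norm = 2 * h  # scale and bias

    # Word, position, and token-type embeddings followed by LayerNorm.
    embeddings = (vocab_size + max_positions + type_vocab_size) * h + layer_norm

    # Query, key, value, and output projections (weights and biases) plus LayerNorm.
    attention = 4 * (h * h + h) + layer_norm

    # Two-layer feed-forward block plus LayerNorm.
    feed_forward = ((h * intermediate_size + intermediate_size)
                    + (intermediate_size * h + h)
                    + layer_norm)

    pooler = h * h + h
    return embeddings + num_layers * (attention + feed_forward) + pooler


def count_from_hub(model_id="nlpie/tiny-biobert"):
    """Read the published config and estimate the parameter count from it."""
    # Imported lazily so the pure-Python counter works without the dependency.
    from transformers import AutoConfig

    cfg = AutoConfig.from_pretrained(model_id)
    return count_bert_parameters(
        cfg.vocab_size,
        cfg.hidden_size,
        cfg.num_hidden_layers,
        cfg.intermediate_size,
        cfg.max_position_embeddings,
        cfg.type_vocab_size,
    )
```

Because the vocabulary embedding scales with `vocab_size * hidden_size`, the hidden size has a large effect on the total for a shallow model, which makes this a useful sanity check on the figures quoted above.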

	# Citation

	If you use this model, please consider citing the following paper:

	```bibtex
	@article{rohanian2023effectiveness,
	title={On the effectiveness of compact biomedical transformers},
	author={Rohanian, Omid and Nouriborji, Mohammadmahdi and Kouchaki, Samaneh and Clifton, David A},
	journal={Bioinformatics},
	volume={39},
	number={3},
	pages={btad103},
	year={2023},
	publisher={Oxford University Press}
	}
	```