
Material SciBERT (TPU): Improving language understanding in materials science

Work in progress

Introduction

MatTPUSciBERT is a SciBERT-based model further pre-trained on the full text of materials science articles.

Authors

Luca Foppiano, Pedro Ortiz Suarez

TLDR

  • Collected the full text of ~700,000 articles provided by the National Institute for Materials Science (NIMS) TDM platform (https://dice.nims.go.jp/services/TDM-PF/en/); the resulting dataset is called ScienceCorpus (SciCorpus)
  • Added to the SciBERT vocabulary (32k tokens) 100 domain-specific unknown words extracted from SciCorpus with a keyword extraction tool (KeyBERT)
  • Starting point: the original SciBERT weights
  • Pre-trained the model, MatTPUSciBERT, on Google Cloud TPUs (Tensor Processing Units) as follows:
    • 800,000 steps with batch_size: 256, max_seq_length: 512
    • 100,000 steps with batch_size: 2048, max_seq_length: 128
  • Fine-tuned and tested on NER for superconductors (https://github.com/lfoppiano/grobid-superconductors) and physical quantities (https://github.com/kermitt2/grobid-quantities)
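
To make the two pre-training phases listed above easier to compare, the following minimal sketch computes how many sequences each phase processes and an upper bound on the token positions seen. It assumes that batch_size is the global batch size per step and that every sequence is filled to max_seq_length; neither is stated in this card, and the names PHASES and phase_budget are illustrative only.

```python
# Pre-training phases as listed in the TLDR: (steps, batch_size, max_seq_length).
# Assumption: batch_size is the global batch size per optimisation step.
PHASES = [
    (800_000, 256, 512),
    (100_000, 2048, 128),
]


def phase_budget(steps, batch_size, max_seq_length):
    """Return (sequences, max_token_positions) for one phase.

    max_token_positions is an upper bound: it assumes every sequence is
    filled to max_seq_length, so padding is counted as well.
    """
    sequences = steps * batch_size
    return sequences, sequences * max_seq_length


budgets = [phase_budget(*phase) for phase in PHASES]

for (steps, batch_size, max_len), (seqs, tokens) in zip(PHASES, budgets):
    print(
        f"{steps:,} steps x {batch_size} x {max_len}: "
        f"{seqs:,} sequences, at most {tokens:,} token positions"
    )
```

Under these assumptions both phases process the same number of sequences (204,800,000); the first phase covers four times as many token positions because its sequences are four times longer.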

Related work

BERT Implementations

Relevant models

Results

Results obtained via 10-fold cross-validation, using DeLFT (https://github.com/kermitt2/delft)
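
As a rough illustration of what 10-fold cross-validation involves, the sketch below partitions item indices into 10 folds so that each item is held out exactly once. It is a generic, standard-library sketch and not DeLFT's implementation: the shuffling, the seed, the fold assignment and the helper name k_fold_indices are all assumptions made for the example.

```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at offset i,
    # so fold sizes differ by at most one item.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Toy example with 25 annotated documents (hypothetical size).
splits = list(k_fold_indices(n_items=25, k=10))

# Collect every held-out index across the folds.
held_out = sorted(j for _, test_idx in splits for j in test_idx)

for fold_no, (train_idx, test_idx) in enumerate(splits, start=1):
    print(f"fold {fold_no}: train={len(train_idx)} test={len(test_idx)}")
```

In a setup like this, a model is trained and evaluated once per fold and the per-fold precision, recall and F1 are then aggregated, typically by averaging.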

NER Superconductors

| Model | Precision | Recall | F1 |
|---|---|---|---|
| SciBERT (baseline) | 81.62% | 84.23% | 82.90% |
| MatSciBERT (Gupta) | 81.45% | 84.36% | 82.88% |
| MatTPUSciBERT | 82.13% | 85.15% | 83.61% |
| MatBERT (Ceder) | 81.25% | 83.99% | 82.60% |
| BatteryScibert-cased | 81.09% | 84.14% | 82.59% |

NER Quantities

| Model | Precision | Recall | F1 |
|---|---|---|---|
| SciBERT (baseline) | 88.73% | 86.76% | 87.73% |
| MatSciBERT (Gupta) | 84.98% | 90.12% | 87.47% |
| MatTPUSciBERT | 88.62% | 86.33% | 87.46% |
| MatBERT (Ceder) | 85.08% | 89.93% | 87.44% |
| BatteryScibert-cased | 85.02% | 89.30% | 87.11% |
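
The F1 column in the tables above can be sanity-checked against the precision and recall columns. The sketch below assumes the reported F1 is the harmonic mean of the reported precision and recall; the card does not say how the scores were aggregated across folds, so small differences would not be surprising. The helper f1_score and the ROWS dictionary are illustrative, with values copied from four of the table rows.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the tables above.
ROWS = {
    "superconductors / SciBERT (baseline)": (81.62, 84.23, 82.90),
    "superconductors / MatTPUSciBERT": (82.13, 85.15, 83.61),
    "quantities / SciBERT (baseline)": (88.73, 86.76, 87.73),
    "quantities / MatTPUSciBERT": (88.62, 86.33, 87.46),
}

for name, (precision, recall, reported) in ROWS.items():
    computed = f1_score(precision, recall)
    print(f"{name}: computed {computed:.2f}, reported {reported:.2f}")
```

For these four rows the computed value agrees with the reported F1 to within 0.01 percentage points.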

References

This work was supported by Google through the researchers program (https://cloud.google.com/edu/researchers).

Acknowledgements

TBA
