BibTexer is a specialized language model trained by PleIAs for the structured extraction of bibliographies in BibTeX format.

BibTexer acts like a reversed Zotero: given an unstructured list of references, the model returns a series of BibTeX entries that can be loaded into any bibliographic database.

Like all models from the PleIAs Bad Data Toolbox, BibTexer has been deliberately trained on diverse and challenging data sources, covering nearly all the styles featured on Zotero, as well as examples of broken text sources (line breaks, digitization artifacts).

BibTexer has been trained on multilingual styles and formats and should work correctly on most European languages.

Along with Segmentext and Bibstyle-Detector, BibTexer can be tested on the Reversed-Zotero space.

Example

This copy-paste of unstructured references includes unwelcome line breaks as well as a heading that is not part of the original set:

References

  1. Postigo JAR. Leishmaniasis in the World Health Organization Eastern Mediterranean Region Int J Antimicrob Agents. 2010;36:S62-5.
  2. Alvar J, Vélez ID, Bern C, Herrero M, Desjeux P, Cano J, et al. Leishmaniasis worldwide and global estimates of its incidence. PLoS One. 2012;7:35671.
  3. World Health Organization. Control of the leishmaniases. World Health Organ Tech Rep Ser. 2010;7-8:1-186.
  4. Wallace MR, Hale BR, Utz CC, Olson PE, Earhart KC, Thornton SA, et al. Endemic infectious diseases of Afghanistan. Clin Infect Dis. 2002;34:171-207.
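To make the difficulty concrete, here is a minimal rule-based sketch of the kind of cleanup such input needs. It is not how Segmentext works (Segmentext is a trained model); the function name `split_numbered_references` and the position of the line break in the sample are assumptions made for illustration only.

```python
import re

# Matches a list marker such as "1. " at the start of a line.
NUMBERED = re.compile(r"^\s*\d+\.\s+")

def split_numbered_references(raw: str) -> list[str]:
    """Hypothetical helper: rejoin wrapped lines of a numbered reference list."""
    refs: list[str] = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        if NUMBERED.match(line):
            # A new numbered item starts a new reference.
            refs.append(NUMBERED.sub("", line))
        elif refs:
            # Any other line is treated as the continuation of the previous one.
            refs[-1] += " " + line
        # Lines before the first numbered item (e.g. a "References" heading) are dropped.
    return refs

# The line break inside the first reference is invented for this example.
raw = """References

  1. Postigo JAR. Leishmaniasis in the World Health
Organization Eastern Mediterranean Region Int J Antimicrob Agents. 2010;36:S62-5.
  2. Alvar J, Vélez ID, Bern C, Herrero M, Desjeux P, Cano J, et al. Leishmaniasis worldwide and global estimates of its incidence. PLoS One. 2012;7:35671.
"""

refs = split_numbered_references(raw)
for ref in refs:
    print(ref)
```

A rule like this breaks as soon as the list is unnumbered or a wrapped line happens to start with a number followed by a period, which is the kind of case a trained segmentation model is meant to handle.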

After parsing by Segmentext, BibTexer returns the following list of references:

@article{postigo2010,
  author = {Postigo JAR},
  title = {Leishmaniasis in the World Health Organization Eastern Mediterranean Region},
  journal = {Int J Antimicrob Agents},
  year = {2010},
  volume = {36},
  pages = {62-5}
}

@article{alvar2012,
  author = {Alvar J Vélez ID Bern C Herrero M Desjeux P Cano J et al},
  title = {Leishmaniasis worldwide and global estimates of its incidence},
  journal = {PLoS One},
  year = {2012},
  volume = {7},
  pages = {35671}
}

@article{world2010,
  author = {World Health Organization},
  title = {Control of the leishmaniases},
  journal = {World Health Organ Tech Rep Ser},
  year = {2010},
  volume = {78},
  pages = {1-186}
}

@article{wallace2002,
  author = {Wallace MR Hale BR Utz CC Olson PE Earhart KC Thornton SA et al},
  title = {Endemic infectious diseases of Afghanistan},
  journal = {Clin Infect Dis},
  year = {2002},
  volume = {34},
  pages = {171-207}
}
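Before importing generated entries into a reference manager, it can help to check that each one parses and carries the expected fields. The sketch below is one way to do that with the Python standard library only. The helper `parse_entries` and the `REQUIRED` field list are hypothetical and not part of BibTexer; the parser assumes flat entries like the ones above (one `key = {value}` field per line, no nested braces).

```python
import re

# Deliberately minimal patterns: they only cover flat entries without nested braces.
ENTRY_RE = re.compile(r"@(\w+)\{([^,\s]+),(.*?)\n\}", re.DOTALL)
FIELD_RE = re.compile(r"(\w+)\s*=\s*\{([^{}]*)\}")

def parse_entries(bibtex: str) -> list[dict[str, str]]:
    """Hypothetical helper: turn flat BibTeX entries into dictionaries."""
    entries = []
    for entry_type, key, body in ENTRY_RE.findall(bibtex):
        fields = dict(FIELD_RE.findall(body))
        fields["ENTRYTYPE"] = entry_type
        fields["ID"] = key
        entries.append(fields)
    return entries

sample = """@article{postigo2010,
  author = {Postigo JAR},
  title = {Leishmaniasis in the World Health Organization Eastern Mediterranean Region},
  journal = {Int J Antimicrob Agents},
  year = {2010},
  volume = {36},
  pages = {62-5}
}
"""

# Assumed minimum set of fields for an @article entry in this check.
REQUIRED = ("author", "title", "journal", "year")

parsed = parse_entries(sample)
for entry in parsed:
    missing = [name for name in REQUIRED if name not in entry]
    print(entry["ID"], "missing:", missing)
```

For anything beyond a quick sanity check, a full BibTeX parser is the safer choice, since real-world entries may contain nested braces, quoted values, or string macros that these patterns do not handle.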

The references can then be exported directly to Zotero.

Model size: 278M params (Safetensors, tensor type F32).