Pclanglais commited on
Commit
7018bf6
1 Parent(s): 18a2b69

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +1 -1
README.md CHANGED
@@ -12,7 +12,7 @@ language:
12
 
13
  OCROnos models are versatile tools supporting the correction of OCR errors, wrong word cut/merge and overall broken text structures. The training data includes a highly diverse set of ocrized texts in multiple languages from PleIAs open pre-training corpus, drawn from cultural heritage sources (Common Corpus) and financial and administrative documents in open data (Finance Commons).
14
 
15
- This release currently features a model based on llama-3-8b that has been the most tested to date. Future release will focus on smaller internal models that provides a better ratio of generation cost/quality.
16
 
17
  OCRonos is generally faithful to what the original material, provides sensible restitution of deteriorated text and will rarely rewrite correct words. On highly deteriorated content, OCRonos can act as a synthetic rewriting tool rather than a strict correction tool.
18
 
 
12
 
13
  OCROnos models are versatile tools supporting the correction of OCR errors, wrong word cut/merge and overall broken text structures. The training data includes a highly diverse set of ocrized texts in multiple languages from PleIAs open pre-training corpus, drawn from cultural heritage sources (Common Corpus) and financial and administrative documents in open data (Finance Commons).
14
 
15
+ This release currently features a model based on llama-3-8b that has been the most tested to date. The model was trained using HPC resources from GENCI–IDRIS (Grant 2023-AD011014736) on Jean-Zay. Future release will focus on smaller internal models that provides a better ratio of generation cost/quality.
16
 
17
  OCRonos is generally faithful to what the original material, provides sensible restitution of deteriorated text and will rarely rewrite correct words. On highly deteriorated content, OCRonos can act as a synthetic rewriting tool rather than a strict correction tool.
18