jeremyarancio committed · Commit b24df6e · 1 Parent(s): 17306ce
Add with torch no grad
app.py
CHANGED
@@ -18,7 +18,7 @@ logging.basicConfig(
 EXAMPLES = [
     ["images/ingredients_1.jpg", "24.36% chocolat noir 63% origine non UE (cacao, sucre, beurre de cacao, émulsifiant léci - thine de colza, vanille bourbon gousse), œuf, farine de blé, beurre, sucre, miel, sucre perlé, levure chimique, zeste de citron."],
     ["images/ingredients_2.jpg", "farine de froment, œufs, lait entier pasteurisé Aprigine: France), sucre, sel, extrait de vanille naturelle Conditi( 35."],
-    ["images/ingredients_3.jpg", "tural basmati rice - cooked (98%), rice bran oil, salt"],
+    # ["images/ingredients_3.jpg", "tural basmati rice - cooked (98%), rice bran oil, salt"],
     ["images/ingredients_4.jpg", "Eau de noix de coco 93.9%, Arôme natutel de fruit"],
     ["images/ingredients_5.jpg", "Sucre, pâte de cacao, beurre de cacao, émulsifiant: léci - thines (soja). Peut contenir des traces de lait. Chocolat noir: cacao: 50% minimum. À conserver à l'abri de la chaleur et de l'humidité. Élaboré en France."],
 ]
@@ -37,6 +37,8 @@ However, it often happens the information extracted by OCR contains typos and er
 To solve this problem, we developed an 🍊 **Ingredient Spellcheck** 🍊, a model capable of correcting typos in a list of ingredients following a defined guideline.
 The model, based on Mistral-7B-v0.3, was fine-tuned on thousand of corrected lists of ingredients extracted from the database. More information in the model card.
 
+## Project in progress
+
 ## 👇 Links
 
 * Open Food Facts website: https://world.openfoodfacts.org/discover
@@ -83,11 +85,12 @@ def process(text: str) -> str:
         add_special_tokens=True,
         return_tensors="pt"
     ).input_ids
-
-
-
-
-
+    with torch.no_grad():
+        output = model.generate(
+            input_ids.to(zero.device), # GPU
+            do_sample=False,
+            max_new_tokens=512,
+        )
     return tokenizer.decode(output[0], skip_special_tokens=True)[len(prompt):].strip()
 
 
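The last hunk wraps the `model.generate` call in `torch.no_grad()`. As a minimal, standalone sketch of what that context manager does, the snippet below runs the same forward pass twice, once normally and once inside `torch.no_grad()`, and checks whether the result is tracked by autograd. The toy `torch.nn.Linear` layer and the names `layer`, `x`, `y_tracked`, and `y_untracked` are assumptions made for illustration; they stand in for the Space's model and are not part of `app.py`.

```python
import torch

# Hypothetical stand-in for the real model: a tiny linear layer.
layer = torch.nn.Linear(4, 2)
x = torch.randn(1, 4)

# Normal forward pass: autograd records the operation, so the
# output is attached to the graph and requires gradients.
y_tracked = layer(x)

# Same forward pass inside torch.no_grad(): no graph is recorded,
# so the output does not require gradients.
with torch.no_grad():
    y_untracked = layer(x)

print(y_tracked.requires_grad)
print(y_untracked.requires_grad)
```

For inference-only code such as the `process` function in this diff, no backward pass is ever run, so disabling gradient tracking around the forward computation does not change the generated output.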