ankrgyl committed on
Commit 723ec3f
1 Parent(s): 1f7bede

Upload TFLayoutLMForQuestionAnswering

Files changed (3):

1. README.md +30 -39
2. config.json +5 -3
3. tf_model.h5 +3 -0
README.md CHANGED
@@ -1,57 +1,48 @@
  ---
- language: en
- thumbnail: https://uploads-ssl.webflow.com/5e3898dff507782a6580d710/614a23fcd8d4f7434c765ab9_logo.png
  license: mit
+ tags:
+ - generated_from_keras_callback
+ model-index:
+ - name: layoutlm-document-qa
+   results: []
  ---
 
- # LayoutLM for Visual Question Answering

- This is a fine-tuned version of the multi-modal [LayoutLM](https://aka.ms/layoutlm) model for the task of question answering on documents. It has been fine-tuned on the [SQuAD2.0](https://huggingface.co/datasets/squad_v2) and [DocVQA](https://www.docvqa.org/) datasets.

- ## Model details

- The LayoutLM model was developed at Microsoft ([paper](https://arxiv.org/abs/1912.13318)) as a general-purpose tool for understanding documents. This model is a fine-tuned checkpoint of [LayoutLM-Base-Uncased](https://huggingface.co/microsoft/layoutlm-base-uncased), using both the [SQuAD2.0](https://huggingface.co/datasets/squad_v2) and [DocVQA](https://www.docvqa.org/) datasets.

- ## Getting started with the model

- To run these examples, you must have [PIL](https://pillow.readthedocs.io/en/stable/installation.html), [pytesseract](https://pypi.org/project/pytesseract/), and [PyTorch](https://pytorch.org/get-started/locally/) installed in addition to [transformers](https://huggingface.co/docs/transformers/index).

- ```python
- from transformers import AutoTokenizer, pipeline

- tokenizer = AutoTokenizer.from_pretrained(
-     "impira/layoutlm-document-qa",
-     add_prefix_space=True,
-     trust_remote_code=True,
- )

- nlp = pipeline(
-     model="impira/layoutlm-document-qa",
-     tokenizer=tokenizer,
-     trust_remote_code=True,
- )

- nlp(
-     "https://templates.invoicehome.com/invoice-template-us-neat-750px.png",
-     "What is the invoice number?"
- )
- # {'score': 0.9943977, 'answer': 'us-001', 'start': 15, 'end': 15}

- nlp(
-     "https://miro.medium.com/max/787/1*iECQRIiOGTmEFLdWkVIH2g.jpeg",
-     "What is the purchase amount?"
- )
- # {'score': 0.9912159, 'answer': '$1,000,000,000', 'start': 97, 'end': 97}

- nlp(
-     "https://www.accountingcoach.com/wp-content/uploads/2013/10/income-statement-example@2x.png",
-     "What are the 2020 net sales?"
- )
- # {'score': 0.59147286, 'answer': '$ 3,750', 'start': 19, 'end': 20}
- ```

- **NOTE**: This model relies on a [model definition](https://github.com/huggingface/transformers/pull/18407) and [pipeline](https://github.com/huggingface/transformers/pull/18414) that are currently under review for inclusion in the transformers project. In the meantime, you'll have to use the `trust_remote_code=True` flag to run this model.

- ## About us

- This model was created by the team at [Impira](https://www.impira.com/).

+ <!-- This model card has been generated automatically according to the information Keras had access to. You should
+ probably proofread and complete it, then remove this comment. -->

+ # layoutlm-document-qa

+ This model is a fine-tuned version of [impira/layoutlm-document-qa](https://huggingface.co/impira/layoutlm-document-qa) on an unknown dataset.
+ It achieves the following results on the evaluation set:

+ ## Model description

+ More information needed

+ ## Intended uses & limitations

+ More information needed

+ ## Training and evaluation data

+ More information needed

+ ## Training procedure

+ ### Training hyperparameters

+ The following hyperparameters were used during training:
+ - optimizer: None
+ - training_precision: float32

+ ### Training results

+ ### Framework versions

+ - Transformers 4.22.0.dev0
+ - TensorFlow 2.9.2
+ - Datasets 2.4.0
+ - Tokenizers 0.12.1
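
With the TensorFlow weights from this commit in place, the checkpoint should be loadable directly in TensorFlow. A minimal sketch, assuming a transformers build that ships `TFLayoutLMForQuestionAnswering` (the card above lists Transformers 4.22.0.dev0):

```python
from transformers import AutoTokenizer, TFLayoutLMForQuestionAnswering

tokenizer = AutoTokenizer.from_pretrained(
    "impira/layoutlm-document-qa",
    add_prefix_space=True,
)

# For TF model classes, from_pretrained resolves the repo's tf_model.h5
# (the file added in this commit) rather than the PyTorch weights.
model = TFLayoutLMForQuestionAnswering.from_pretrained("impira/layoutlm-document-qa")
```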
config.json CHANGED
@@ -1,15 +1,17 @@
  {
- "attention_probs_dropout_prob": 0.1,
+ "_name_or_path": "impira/layoutlm-document-qa",
  "architectures": [
    "LayoutLMForQuestionAnswering"
  ],
+ "attention_probs_dropout_prob": 0.1,
+ "bos_token_id": 0,
+ "classifier_dropout": null,
  "custom_pipelines": {
    "document-question-answering": {
      "impl": "pipeline_document_question_answering.DocumentQuestionAnsweringPipeline",
      "pt": "AutoModelForQuestionAnswering"
    }
  },
- "bos_token_id": 0,
  "eos_token_id": 2,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
@@ -26,7 +28,7 @@
  "pad_token_id": 1,
  "position_embedding_type": "absolute",
  "tokenizer_class": "RobertaTokenizer",
- "transformers_version": "4.6.1",
+ "transformers_version": "4.22.0.dev0",
  "type_vocab_size": 1,
  "use_cache": true,
  "vocab_size": 50265
tf_model.h5 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2796bc2f67ac8e1abe55decfa104c7182376c4bf1f8b97ab87fd8bb4768f2f07
+ size 511465184
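
What this commit adds is a Git LFS pointer file, not the weights themselves: the `oid` is the SHA-256 of the actual ~511 MB `tf_model.h5` blob, which lives in LFS storage. A small sketch to check a downloaded copy against the pointer (the local path is a hypothetical example):

```python
import hashlib
from pathlib import Path

# Hypothetical local path; point this at your downloaded tf_model.h5.
path = Path("tf_model.h5")

# Recompute the SHA-256 in chunks and compare against the pointer's
# recorded oid and byte size.
h = hashlib.sha256()
with path.open("rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        h.update(chunk)

assert h.hexdigest() == "2796bc2f67ac8e1abe55decfa104c7182376c4bf1f8b97ab87fd8bb4768f2f07"
assert path.stat().st_size == 511465184
```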