---
license: apache-2.0
tags:
- generated_from_trainer
metrics:
- precision
- recall
- f1
- accuracy
widget:
- text: The process starts when the customer enters the shop. The customer then takes
    the product from the shelf. The customer then pays for the product and leaves
    the store.
  example_title: Example 1
- text: The process begins when the HR department hires the new employee. Next, the
    new employee completes necessary paperwork and provides documentation to the HR
    department. After the initial task, the HR department performs a decision to
    determine the employee's role and department assignment. The employee is trained
    by the Sales department. After the training, the Sales department assigns the
    employee a sales quota and performance goals. Finally, the process ends with an
    'End' event, when the employee begins their role in the Sales department.
  example_title: Example 2
- text: A customer places an order for a product on the company's website. Next, the
    customer service department checks the availability of the product and confirms
    the order with the customer. After the initial task, the warehouse processes
    the order. If the order is eligible for same-day shipping, the warehouse staff
    picks and packs the order, and it is sent to the shipping department. After the
    order is packed, the shipping department delivers the order to the customer. Finally,
    the process ends with an 'End' event, when the customer receives their order.
  example_title: Example 3
base_model: bert-base-cased
model-index:
- name: bpmn-information-extraction-v2
  results: []
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# bpmn-information-extraction-v2
This model is a fine-tuned version of [bert-base-cased](https://huggingface.co/bert-base-cased) on a dataset containing 104 textual process descriptions.
The dataset and the training scripts can be found here: https://github.com/jtlicardo/process-visualizer/tree/main/src/token_classification
The dataset contains 5 target labels:
* `AGENT`
* `TASK`
* `TASK_INFO`
* `PROCESS_INFO`
* `CONDITION`
It achieves the following results on the evaluation set:
- Loss: 0.2179
- Precision: 0.8826
- Recall: 0.9246
- F1: 0.9031
- Accuracy: 0.9516
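
As a quick consistency check, F1 is the harmonic mean of precision and recall, so the reported value can be recomputed from the two figures above. A minimal sketch (the function name `f1_score` is illustrative and not part of the training scripts):

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported above for the evaluation set.
f1 = f1_score(0.8826, 0.9246)
print(round(f1, 4))
```

Rounded to four decimals, this agrees with the reported F1 of 0.9031.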
## Model description
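A token-classification model, fine-tuned from `bert-base-cased`, that tags textual process descriptions with the five labels listed above (`AGENT`, `TASK`, `TASK_INFO`, `PROCESS_INFO`, `CONDITION`).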
## Intended uses & limitations
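The model is intended for extracting agents, tasks, and conditions from English process descriptions such as the widget examples on this card. It was fine-tuned on only 104 descriptions, so its accuracy on text that differs in style or domain from that data has not been established.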
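
A minimal usage sketch with the `transformers` token-classification pipeline follows. The Hub id `jtlicardo/bpmn-information-extraction-v2` is an assumption inferred from the author and model name, and both helper functions are illustrative rather than part of the original project.

```python
from typing import Dict, List

# Assumed Hub id (inferred from the author and model name; adjust if it differs).
MODEL_ID = "jtlicardo/bpmn-information-extraction-v2"


def extract_entities(text: str, model_id: str = MODEL_ID) -> List[Dict]:
    """Run token classification and return merged entity spans."""
    # Imported lazily so group_by_label can be used without transformers installed.
    from transformers import pipeline

    tagger = pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # merge word pieces into whole spans
    )
    return tagger(text)


def group_by_label(entities: List[Dict]) -> Dict[str, List[str]]:
    """Collect the surface text of each span under its predicted label."""
    grouped: Dict[str, List[str]] = {}
    for ent in entities:
        grouped.setdefault(ent["entity_group"], []).append(ent["word"])
    return grouped
```

Calling `group_by_label(extract_entities(text))` on one of the widget examples downloads the model on first use and returns a dictionary keyed by label.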
## Training and evaluation data
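The model was trained and evaluated on 104 textual process descriptions annotated with the five labels listed above. The train/evaluation split is not documented here; the dataset is available in the repository linked above.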
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 15
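
For anyone reproducing the run, the values above map onto `transformers.TrainingArguments` keywords roughly as sketched below. This assumes single-device training, so that `train_batch_size` corresponds to `per_device_train_batch_size`; the `output_dir` default and the helper name are hypothetical. The actual training script is in the linked repository.

```python
# Hyperparameters listed above, keyed by TrainingArguments keyword names.
# Assumption: one device, so the batch sizes are per-device values.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 15,
}


def build_training_args(output_dir: str = "bpmn-information-extraction-v2"):
    """Build TrainingArguments from the values above (output_dir is hypothetical)."""
    # Imported lazily so the dictionary can be inspected without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```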
### Training results
| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
| 1.9945 | 1.0 | 12 | 1.5128 | 0.2534 | 0.3730 | 0.3018 | 0.5147 |
| 1.2161 | 2.0 | 24 | 0.8859 | 0.2977 | 0.4524 | 0.3591 | 0.7256 |
| 0.6755 | 3.0 | 36 | 0.4876 | 0.5562 | 0.7262 | 0.6299 | 0.8604 |
| 0.372 | 4.0 | 48 | 0.3091 | 0.7260 | 0.8413 | 0.7794 | 0.9128 |
| 0.2412 | 5.0 | 60 | 0.2247 | 0.7526 | 0.8571 | 0.8015 | 0.9342 |
| 0.1636 | 6.0 | 72 | 0.2102 | 0.8043 | 0.8968 | 0.8480 | 0.9413 |
| 0.1325 | 7.0 | 84 | 0.1910 | 0.8667 | 0.9286 | 0.8966 | 0.9500 |
| 0.11 | 8.0 | 96 | 0.2352 | 0.8456 | 0.9127 | 0.8779 | 0.9389 |
| 0.0945 | 9.0 | 108 | 0.2179 | 0.8550 | 0.9127 | 0.8829 | 0.9429 |
| 0.0788 | 10.0 | 120 | 0.2203 | 0.8830 | 0.9286 | 0.9052 | 0.9445 |
| 0.0721 | 11.0 | 132 | 0.2079 | 0.8902 | 0.9325 | 0.9109 | 0.9516 |
| 0.0617 | 12.0 | 144 | 0.2367 | 0.8797 | 0.9286 | 0.9035 | 0.9445 |
| 0.0615 | 13.0 | 156 | 0.2183 | 0.8859 | 0.9246 | 0.9049 | 0.9492 |
| 0.0526 | 14.0 | 168 | 0.2179 | 0.8826 | 0.9246 | 0.9031 | 0.9516 |
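
The table logs 14 epochs, and the headline results quoted earlier match its last row. To see where validation loss and F1 were at their best, the rows can be scanned as in this small sketch (values copied from the table; the variable names are illustrative):

```python
# (epoch, validation_loss, f1) copied from the training results table.
RESULTS = [
    (1, 1.5128, 0.3018),
    (2, 0.8859, 0.3591),
    (3, 0.4876, 0.6299),
    (4, 0.3091, 0.7794),
    (5, 0.2247, 0.8015),
    (6, 0.2102, 0.8480),
    (7, 0.1910, 0.8966),
    (8, 0.2352, 0.8779),
    (9, 0.2179, 0.8829),
    (10, 0.2203, 0.9052),
    (11, 0.2079, 0.9109),
    (12, 0.2367, 0.9035),
    (13, 0.2183, 0.9049),
    (14, 0.2179, 0.9031),
]

best_f1 = max(RESULTS, key=lambda row: row[2])       # row with the highest F1
lowest_loss = min(RESULTS, key=lambda row: row[1])   # row with the lowest validation loss

print(f"best F1 {best_f1[2]} at epoch {best_f1[0]}")
print(f"lowest validation loss {lowest_loss[1]} at epoch {lowest_loss[0]}")
```

In this log F1 peaks at epoch 11 (0.9109) and validation loss is lowest at epoch 7 (0.1910), so an earlier checkpoint may be worth comparing if intermediate checkpoints were saved.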
### Framework versions
- Transformers 4.26.1
- PyTorch 1.13.1+cu116
- Datasets 2.10.0
- Tokenizers 0.13.2