---
license: mit
---

# Model

The Pandemic PACT Advanced Categorisation Engine (PPACE) is a fine-tuned 8B-parameter LLM that automatically classifies research abstracts from funded biomedical projects according to WHO-aligned research priorities. Developed as part of the GLOPID-R Pandemic PACT project, PPACE assists in tracking and analysing research funding and clinical evidence for a wide range of diseases with outbreak potential.

The model is trained on a human-annotated dataset expanded with rationales generated by a larger LLM. These rationales explain the chosen labels and are intended to improve the model's interpretability and accuracy.

# Usage

```python
### TODO
```
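
Until the official snippet is filled in, the following is a minimal sketch of how a fine-tuned causal LM of this kind is typically queried with the `transformers` library. It rests on several assumptions that this card does not confirm: `MODEL_ID` is a placeholder for the real repository id, the checkpoint is assumed to load directly with `AutoModelForCausalLM`, and the prompt layout in `build_prompt` is hypothetical rather than the template used during fine-tuning.

```python
# Placeholder: the repository id is not stated in this card; substitute the real one.
MODEL_ID = "<model-repo-id>"


def build_prompt(title: str, abstract: str) -> str:
    """Assemble a plain-text prompt (hypothetical layout, not the official template)."""
    return (
        "Classify the following research project and explain your choice.\n\n"
        f"Title: {title.strip()}\n"
        f"Abstract: {abstract.strip()}\n\n"
        "Categories and rationale:"
    )


def classify(title: str, abstract: str, model_id: str = MODEL_ID,
             max_new_tokens: int = 256) -> str:
    """Generate the model's free-text answer for one project."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(build_prompt(title, abstract), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `classify(title, abstract)` downloads the weights on first use, so an 8B model needs a machine with sufficient memory. The returned string still has to be parsed into category labels, and that format depends on the actual training template.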

# Model Details

PPACE is fine-tuned with Low-Rank Adaptation (LoRA), which keeps training efficient while maintaining high performance. The model is trained for 4 epochs on a dataset of 5142 projects, using 8 A100 GPUs with a batch size of 1 per GPU and 4 gradient accumulation steps.
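
As a rough illustration of what the figures quoted in the paragraph above imply, the sketch below works out the effective batch size and the number of optimizer steps. It assumes plain data-parallel training, that all 5142 projects are used for training, and that the final partial batch of each epoch is kept. None of these details is stated in the card, and the constant names are illustrative.

```python
import math

# Figures taken from the paragraph above.
NUM_GPUS = 8
PER_GPU_BATCH = 1
GRAD_ACCUM_STEPS = 4
NUM_PROJECTS = 5142
EPOCHS = 4

# Sequences contributing to each optimizer update under data parallelism.
effective_batch = NUM_GPUS * PER_GPU_BATCH * GRAD_ACCUM_STEPS

# Optimizer updates per epoch, keeping the final partial batch (assumption).
steps_per_epoch = math.ceil(NUM_PROJECTS / effective_batch)

total_steps = steps_per_epoch * EPOCHS

print(effective_batch, steps_per_epoch, total_steps)  # prints: 32 161 644
```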

## Hyperparameters

| Hyperparameter              | Value  |
|-----------------------------|--------|
| Total Batch Size            | 2      |
| Gradient Accumulation Steps | 4      |
| Learning Rate               | 2e-4   |
| LR Scheduler                | Linear |
| Epochs                      | 2      |
| LoRA Rank                   | 128    |
| LoRA α                      | 256    |
| LoRA Dropout                | 0.05   |
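
The LoRA rows of the table could be expressed in code roughly as follows. This is a sketch only: the card does not say which training library was used, so the mapping onto `peft.LoraConfig` is an assumption, and the `LoraHyperparameters` class and `to_peft_config` helper are hypothetical names introduced for illustration. The target modules are not listed in the card and are left to the library defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraHyperparameters:
    """LoRA settings copied from the hyperparameter table."""
    rank: int = 128
    alpha: int = 256
    dropout: float = 0.05
    learning_rate: float = 2e-4

    @property
    def scaling(self) -> float:
        # Standard LoRA multiplies the low-rank update by alpha / rank.
        return self.alpha / self.rank


def to_peft_config(hp: LoraHyperparameters):
    """Build a peft LoraConfig (assumes peft is installed and was the library used)."""
    from peft import LoraConfig  # imported lazily; optional dependency

    return LoraConfig(
        r=hp.rank,
        lora_alpha=hp.alpha,
        lora_dropout=hp.dropout,
        task_type="CAUSAL_LM",
    )


hp = LoraHyperparameters()
print(hp.scaling)  # prints: 2.0
```

With α set to twice the rank, the adapter update is scaled by a factor of 2 under the standard LoRA formulation.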