sunny3 committed on
Commit
e639067
1 Parent(s): d552201

Update README.md

Files changed (1)
  1. README.md +7 -8
README.md CHANGED
@@ -1,20 +1,19 @@
- Relation_Extraction
- ===================
- This repository provides code and additional materials of the paper: "Extraction of the Relations between Significant Pharmacological Entities in Russian-Language Reviews of Internet Users on Medications".
-
- In this work, we trained a model to recognize 4 types of relationships between entities in drug review texts: ADR–Drugname, Drugname–Diseasename, Drugname–SourceInfoDrug, Diseasename–Indication. The input of the model is a review text and a pair of entities, between which it is required to determine the fact of a relationship and one of the 4 types of relationship, listed above.
 
  Data
  ----
  The proposed model is trained on a subset of 908 reviews of the [Russian Drug Review Corpus (RDRS)](https://arxiv.org/pdf/2105.00059.pdf). The subset contains entity markup and pairs of entities marked with the 4 listed types of relationships:
  - ADR-Drugname — the relationship between the drug and its side effects
- - Drugname-SourceInfodrug — the relationship between the medication and the187source of information about it (e.g., “was advised at the pharmacy”, “the -Drugname-- - Drugname-Diseasename — the relationship between the drug and the disease
 
  - Diseasename-Indication — the connection between the illness and its symptoms (e.g., “cough”, “fever 39 degrees”)
  Also, this subset contains pairs of the same entity types between which there is no relationship: for example, a drug and an unrelated side effect that appeared after taking another drug; in other words, this side effect is related to another drug.
 
- Model
  ----
- Proposed model is based on the [XLM-RoBERTA-large](https://arxiv.org/abs/1911.02116) topology. After the additional training as a langauge model on corpus of unmarked drug reviews, this model was trained as a classification model on 80% of the texts from subset of the corps described above. This model showed the best accuracy on one of the folds of the cross-validation. For additional details see original paper.
 
  How to use
  ----
 
+ pharm-relation-extraction
+ ===
+ A model trained to recognize 4 types of relationships between significant pharmacological entities in Russian-language reviews: ADR–Drugname, Drugname–Diseasename, Drugname–SourceInfoDrug, Diseasename–Indication. The model takes a review text and a pair of entities as input and determines whether the pair is related and, if so, which of the 4 relationship types listed above applies.
 
 
 
  Data
  ----
  The proposed model is trained on a subset of 908 reviews of the [Russian Drug Review Corpus (RDRS)](https://arxiv.org/pdf/2105.00059.pdf). The subset contains entity markup and pairs of entities marked with the 4 listed types of relationships:
  - ADR-Drugname — the relationship between the drug and its side effects
+ - Drugname-SourceInfoDrug — the relationship between the medication and the source of information about it (e.g., “was advised at the pharmacy”, “the doctor recommended it”)
+ - Drugname-Diseasename — the relationship between the drug and the disease
  - Diseasename-Indication — the connection between the illness and its symptoms (e.g., “cough”, “fever 39 degrees”)
  Also, this subset contains pairs of the same entity types between which there is no relationship: for example, a drug and an unrelated side effect that appeared after taking another drug; in other words, this side effect is related to another drug.
 
+ Model topology and training
  ----
+ The proposed model is based on the [XLM-RoBERTa-large](https://arxiv.org/abs/1911.02116) architecture. After additional training as a language model on a corpus of unlabeled drug reviews, it was trained as a classification model on 80% of the texts from the corpus subset described above.
 
  How to use
  ----