TF-Keras
jamesdolezal committed on
Commit
f33f105
1 Parent(s): 1b4d893

Update README.md

  1. README.md +105 -0
README.md CHANGED
@@ -1,3 +1,108 @@
1
  ---
2
  license: gpl-3.0
3
  ---
4
+
5
+ # Thyroid BRAF-RAS Score (BRS) v1 Model Card
6
+ This model card describes a model associated with the manuscript "Deep learning prediction of BRAF-RAS gene expression signature identifies noninvasive follicular thyroid neoplasms with papillary-like nuclear features", by Dolezal _et al._, available [here](https://www.nature.com/articles/s41379-020-00724-3).
7
+
8
+ ## Model Details
9
+ - **Developed by:** James Dolezal
10
+ - **Model type:** Deep convolutional neural network image classifier
11
+ - **Language(s):** English
12
+ - **License:** GPL-3.0
13
+ - **Model Description:** This model predicts the BRAF-RAS Score (BRS) from H&E-stained pathologic images of thyroid neoplasms. BRS is a gene expression score scaled from -1 (BRAF-like) to +1 (RAS-like), indicating whether a tumor's gene expression more closely resembles that of a BRAF-mutant or a RAS-mutant tumor. The model is an [Xception](https://arxiv.org/abs/1610.02357) model with two dropout-enabled hidden layers.
14
+ - **Image processing:** This model expects tiles from H&E-stained pathology slides that are 299 x 299 px in size and cover 302 x 302 μm of tissue. Images should be stain-normalized using a modified Reinhard normalizer ("Reinhard-Fast") available [here](https://github.com/jamesdolezal/slideflow/blob/master/slideflow/norm/tensorflow/reinhard.py). The stain normalizer should be fit using the `target_means` and `target_stds` listed in the model `params.json` file. After normalization, images should be standardized with `tf.image.per_image_standardization()`.
15
+ - **Resources for more information:** [GitHub Repository](https://github.com/jamesdolezal/histologic-sheep)
16
+
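The tile geometry listed under image processing implies an effective resolution of roughly 1.01 μm per pixel (302 μm / 299 px). The sketch below works through that arithmetic in plain Python. The function names and the example base resolution of 0.25 μm/px are illustrative assumptions, not values taken from the model's `params.json`.

```python
TILE_PX = 299   # tile width in pixels expected by the model
TILE_UM = 302   # tile width in microns expected by the model


def effective_mpp(tile_um: float = TILE_UM, tile_px: int = TILE_PX) -> float:
    """Microns per pixel of a tile after resizing to the model input size."""
    return tile_um / tile_px


def extraction_width_px(slide_mpp: float, tile_um: float = TILE_UM) -> int:
    """Width, in full-resolution slide pixels, that covers one tile.

    slide_mpp is the slide's base resolution in microns per pixel
    (a hypothetical input; in practice it comes from the slide metadata).
    """
    return round(tile_um / slide_mpp)


if __name__ == "__main__":
    print(f"effective resolution: {effective_mpp():.3f} um/px")
    # Assumed example: a slide scanned at 0.25 um/px
    print(f"extract {extraction_width_px(0.25)} px, then resize to {TILE_PX} px")
```

A region extracted at this width would then be resized to 299 x 299 px before stain normalization.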
17
+ # Uses
18
+
19
+ ## Examples
20
+ For direct use, the model can be loaded using TensorFlow/Keras:
21
+
22
+ ```
23
+ import tensorflow as tf
24
+ model = tf.keras.models.load_model('/path/')
25
+ ```
26
+
27
+ or loaded with [Slideflow](https://github.com/jamesdolezal/slideflow) version 1.1+ with the following syntax:
28
+
29
+ ```
30
+ import slideflow as sf
31
+ model = sf.model.load('/path/')
32
+ ```
33
+
34
+ The stain normalizer can be loaded and fit using Slideflow:
35
+
36
+ ```
37
+ normalizer = sf.util.get_model_normalizer('/path/')
38
+ ```
39
+
40
+ The stain normalizer has a native TensorFlow transform and can be applied directly to a `tf.data.Dataset`:
41
+
42
+ ```
43
+ # Map the stain normalizer transformation
44
+ # to a tf.data.Dataset
45
+ dataset = dataset.map(normalizer.tf_to_tf)
46
+ ```
47
+
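After stain normalization, the model card calls for `tf.image.per_image_standardization()`. As a rough illustration of what that step computes, the sketch below reproduces the documented formula, `(x - mean) / max(stddev, 1 / sqrt(N))`, in plain Python on a flat list of pixel values. The helper name `per_image_standardize` is hypothetical, and the real pipeline should use the TensorFlow function on image tensors.

```python
import math
from statistics import fmean, pstdev


def per_image_standardize(pixels):
    """Standardize a flat sequence of pixel values to zero mean, unit variance.

    Follows the documented formula of tf.image.per_image_standardization:
    (x - mean) / max(stddev, 1 / sqrt(N)), with N the number of values.
    """
    values = [float(v) for v in pixels]
    mean = fmean(values)
    # The floor on the standard deviation avoids division by zero
    # when every pixel has the same value.
    adjusted_std = max(pstdev(values), 1.0 / math.sqrt(len(values)))
    return [(v - mean) / adjusted_std for v in values]


if __name__ == "__main__":
    tile = [0, 64, 128, 255]  # toy 4-value "image"
    print([round(v, 3) for v in per_image_standardize(tile)])
```

Because the statistics are computed per image, each tile is rescaled independently of the rest of the dataset.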
48
+ Alternatively, the model can be used to generate predictions for whole-slide images processed through Slideflow in an end-to-end [Project](https://slideflow.dev/project_setup.html). To do so, pass the model path to the [`Project.predict()`](https://slideflow.dev/project.html#slideflow.Project.predict) function:
49
+
50
+ ```
51
+ import slideflow as sf
52
+ P = sf.Project('/path/to/slideflow/project')
53
+ P.predict('/model/path')
54
+ ```
55
+
56
+ ## Direct Use
57
+ This model is intended for research purposes only. Possible research areas and tasks include:
58
+
59
+ - Applications in educational settings.
60
+ - Research on pathology classification models for thyroid neoplasms.
61
+
62
+ Excluded uses are described below.
63
+
64
+ ### Misuse and Out-of-Scope Use
65
+ This model should not be used in a clinical setting to generate predictions that inform patients, physicians, or any other members of a patient's health care team, outside the context of an approved research protocol. Using the model in a clinical setting outside the context of an approved research protocol is a misuse of this model. This includes, but is not limited to:
66
+
67
+ - Generating predictions of images from a patient's tumor and sharing those predictions with the patient
68
+ - Generating predictions of images from a patient's tumor and sharing those predictions with the patient's physician, or other members of the patient's healthcare team
69
+ - Influencing a patient's health care treatment in any way based on output from this model
70
+
71
+ ### Limitations
72
+
73
+ The model has not been validated in contexts where non-thyroid neoplasms, or rare thyroid subtypes such as anaplastic thyroid carcinoma, are possible.
74
+
75
+ ### Bias
76
+ This model was trained on The Cancer Genome Atlas (TCGA), which contains patient data from communities and cultures that may not reflect the general population. This dataset comprises images from multiple institutions, which may introduce bias from site-specific batch effects ([Howard, 2021](https://www.nature.com/articles/s41467-021-24698-1)).
77
+
78
+ ## Training
79
+
80
+ **Training Data**
81
+ The following dataset was used to train the model:
82
+
83
+ - The Cancer Genome Atlas (TCGA), THCA cohort (see next section)
84
+
85
+ This model was trained on a total of 369 slides, with 116 BRAF-like tumors and 271 RAS-like tumors.
86
+
87
+ **Training Procedure**
88
+ Each whole-slide image was sectioned in a grid-wise fashion into tiles covering 302 x 302 μm. Image tiles were extracted at the nearest downsample layer and resized to 299 x 299 px using [Libvips](https://www.libvips.org/API/current/libvips-resample.html#vips-resize). During training:
89
+
90
+ - Images are stain-normalized with a modified Reinhard normalizer ("Reinhard-Fast"), which excludes the brightness standardization step, available [here](https://github.com/jamesdolezal/slideflow/blob/master/slideflow/norm/tensorflow/reinhard.py)
91
+ - Images are randomly flipped and rotated (90, 180, or 270 degrees)
92
+ - Images have a 50% chance of being JPEG compressed with a quality level between 50 and 100%
93
+ - Images have a 10% chance of random Gaussian blur, with sigma between 0.5 and 2.0
94
+ - Images are standardized with `tf.image.per_image_standardization()`
95
+ - Images are classified through an Xception block, followed by two hidden layers with dropout (p=0.1) enabled during training
96
+ - The loss is mean squared error using the linear outcome BRS
97
+ - Training is completed after 1 epoch
98
+
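The random augmentations in the list above can be sketched as a parameter sampler. This is a minimal illustration in plain Python, not the Slideflow implementation: it only draws the augmentation settings for one tile and does not touch image data. The 50% flip probability, the independent horizontal and vertical flips, and the inclusion of 0 degrees as "no rotation" are assumptions, as is the function name `sample_augmentation`.

```python
import random

ROTATIONS = (0, 90, 180, 270)  # assumption: 0 degrees means "no rotation"


def sample_augmentation(rng: random.Random) -> dict:
    """Draw one set of augmentation parameters for a single tile."""
    params = {
        # Assumption: each flip is applied independently with probability 0.5
        "flip_horizontal": rng.random() < 0.5,
        "flip_vertical": rng.random() < 0.5,
        "rotation_degrees": rng.choice(ROTATIONS),
        "jpeg_quality": None,  # None means no JPEG compression
        "blur_sigma": None,    # None means no Gaussian blur
    }
    if rng.random() < 0.5:     # 50% chance of JPEG compression
        params["jpeg_quality"] = rng.randint(50, 100)
    if rng.random() < 0.1:     # 10% chance of Gaussian blur
        params["blur_sigma"] = rng.uniform(0.5, 2.0)
    return params


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(sample_augmentation(rng))
```

Passing a seeded `random.Random` keeps the draws reproducible, which is convenient when checking that the sampled rates match the stated probabilities.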
99
+ Additional training information:
100
+
101
+ - **Hardware:** 1 x A100 GPU
102
+ - **Optimizer:** Adam
103
+ - **Batch:** 128
104
+ - **Learning rate:** 0.0001, with a decay of 0.98 every 512 steps
105
+ - **Hidden layers:** 2 hidden layers of width 1024, with dropout p=0.1
106
+
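The learning rate schedule listed above (0.0001, decayed by 0.98 every 512 steps) can be written out as a short function. This sketch assumes a stepwise ("staircase") decay, which the model card does not specify; setting `staircase=False` gives the continuous variant. The arithmetic matches that of Keras's `ExponentialDecay` schedule, though whether that class was used for training is also an assumption.

```python
def learning_rate(step: int,
                  initial_lr: float = 1e-4,
                  decay_rate: float = 0.98,
                  decay_steps: int = 512,
                  staircase: bool = True) -> float:
    """Exponentially decayed learning rate at a given training step."""
    # Staircase decay applies the factor once per completed block of steps;
    # continuous decay applies it fractionally at every step.
    exponent = step // decay_steps if staircase else step / decay_steps
    return initial_lr * decay_rate ** exponent


if __name__ == "__main__":
    for step in (0, 512, 5120, 51200):
        print(step, learning_rate(step))
```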
107
+ ## Evaluation Results
108
+ External evaluation results are currently under peer review and will be posted once publicly available.