COVID-19 Detection using VGG19 and X-ray Images

Overview

This model detects COVID-19 from chest X-ray images using transfer learning with the VGG19 architecture. The dataset used for this project is the COVID-19 Radiography Database available on Kaggle.

Dataset

The dataset used in this project is the COVID-19 Radiography Database. It contains X-ray images categorized into three classes: COVID, Normal, and other pneumonia. The dataset is split into training, validation, and test sets to ensure robust evaluation of the model.

Methodology

1. Import Libraries

We start by importing the libraries needed for data processing, model building, and evaluation. These include TensorFlow for deep learning, matplotlib for visualization, and other supporting packages.

2. Load Dataset

The dataset is loaded from the specified directory. It contains X-ray images categorized into COVID, Normal, and other pneumonia classes. Each class is stored in its own folder; the images are read from these folders and then preprocessed.
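
As a rough illustration of the one-folder-per-class layout, the sketch below collects image paths for each class using only the standard library. The folder names in CLASS_NAMES and the helper list_images_by_class are hypothetical and would need to match the actual directory names in the downloaded dataset.

```python
from pathlib import Path

# Hypothetical folder names; adjust them to the extracted dataset.
CLASS_NAMES = ("COVID", "Normal", "Pneumonia")
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def list_images_by_class(root, class_names=CLASS_NAMES):
    """Return {class_name: sorted list of image paths}, one folder per class."""
    root = Path(root)
    images = {}
    for name in class_names:
        class_dir = root / name
        if not class_dir.is_dir():
            raise FileNotFoundError(f"missing class folder: {class_dir}")
        # Keep only files with a known image extension.
        images[name] = sorted(
            p for p in class_dir.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        )
    return images
```

Returning paths rather than decoded images keeps memory use low; the images can then be decoded lazily when batches are built.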

3. Data Preprocessing

Data Augmentation: To increase the diversity of the training data, random transformations such as rotation, zoom, and horizontal flip are applied to the training images. This makes the model more robust to small variations and reduces overfitting.
Rescaling: The pixel values are rescaled from [0, 255] to the range [0, 1] so that all inputs share a small, consistent scale, which helps training converge.
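
One possible way to express these two steps is with Keras preprocessing layers, as sketched below. The augmentation strengths (0.05 and 0.1) and the helper prepare_batch are illustrative assumptions; the values used in the project are not stated here.

```python
import tensorflow as tf

# Illustrative augmentation strengths; the project's actual values are not stated.
preprocess = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255),      # map [0, 255] to [0, 1]
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.05),      # up to +/- 5% of a full turn
    tf.keras.layers.RandomZoom(0.1),           # zoom in or out by up to 10%
])


def prepare_batch(images, training):
    """Rescale an image batch; random transforms are applied only when training."""
    return preprocess(tf.cast(images, tf.float32), training=training)
```

With training=False the random layers pass images through unchanged, so validation and test images are only rescaled.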

4. Split Dataset

The dataset is split into training, validation, and test sets. This is crucial for evaluating the model's performance on unseen data.

Training Set: Used to train the model.
Validation Set: Used to tune hyperparameters and prevent overfitting.
Test Set: Used to assess the final model's performance.
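
A minimal sketch of such a split is shown below. It shuffles each class separately so that all three sets keep roughly the same class balance. The 10% validation and 10% test fractions, the seed, and the function name stratified_split are assumptions for illustration, not the project's actual settings.

```python
import random


def stratified_split(paths_by_class, val_frac=0.1, test_frac=0.1, seed=42):
    """Split each class independently into train/val/test lists of (path, label)."""
    rng = random.Random(seed)  # fixed seed makes the split reproducible
    splits = {"train": [], "val": [], "test": []}
    for label, paths in sorted(paths_by_class.items()):
        items = list(paths)
        rng.shuffle(items)
        n_test = int(len(items) * test_frac)
        n_val = int(len(items) * val_frac)
        splits["test"] += [(p, label) for p in items[:n_test]]
        splits["val"] += [(p, label) for p in items[n_test:n_test + n_val]]
        splits["train"] += [(p, label) for p in items[n_test + n_val:]]
    return splits
```

Because the slices of each shuffled list do not overlap, no image can appear in more than one set.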

5. Build the Model using VGG19

Transfer Learning: The VGG19 model, pre-trained on the large ImageNet dataset, is used as the base network so that the visual features it learned on natural images can be reused for our specific task of COVID-19 detection.
Model Architecture: Custom layers are added on top of VGG19 to adapt it to our classification problem. This includes flattening the output, adding dense layers, and a final softmax layer for classification.
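
The architecture described above could look roughly like the following sketch. The function name build_vgg19_classifier, the 224x224 input size, the 256-unit dense layer, and the choice to freeze the base are assumptions for illustration; the layer sizes used in the project are not given here.

```python
import tensorflow as tf


def build_vgg19_classifier(num_classes=3, input_shape=(224, 224, 3),
                           weights="imagenet"):
    """VGG19 convolutional base (frozen) followed by a small dense head."""
    base = tf.keras.applications.VGG19(
        include_top=False, weights=weights, input_shape=input_shape
    )
    base.trainable = False  # keep the pre-trained features fixed

    inputs = tf.keras.Input(shape=input_shape)
    x = base(inputs, training=False)
    x = tf.keras.layers.Flatten()(x)
    x = tf.keras.layers.Dense(256, activation="relu")(x)  # hypothetical width
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)
```

Passing weights=None builds the same architecture with random initialization, which is convenient for a quick shape check without downloading the ImageNet weights.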

6. Compile the Model

Loss Function: 'categorical_crossentropy' is used as the loss function because we have more than two classes.
Optimizer: The Adam optimizer is used. It adapts the effective step size of each parameter using running averages of past gradients and their squares.
Metrics: Accuracy is tracked to monitor the performance of the model.
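
As a sketch of the compile step: for a softmax output over three classes with one-hot labels, the usual Keras pairing is categorical cross-entropy. The tiny stand-in model and the learning rate of 1e-4 below are assumptions made only so the snippet runs on its own.

```python
import tensorflow as tf

# Small stand-in for the VGG19-based network, only to make the snippet runnable.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),  # assumed value
    loss="categorical_crossentropy",  # expects one-hot labels, e.g. [0, 1, 0]
    metrics=["accuracy"],
)
```

If the labels are stored as integer class indices instead of one-hot vectors, 'sparse_categorical_crossentropy' is the matching loss.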

7. Train the Model

Epochs: The number of times the entire training dataset is passed forward and backward through the neural network.
Batch Size: The number of training examples utilized in one iteration.
Validation Data: Helps in monitoring the model's performance on unseen data during training to tune hyperparameters and avoid overfitting.
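
The relationship between epochs, batch size, and iterations can be made concrete with a little arithmetic. The helper names below are hypothetical, and the numbers in the usage note are purely illustrative.

```python
import math


def iterations_per_epoch(num_samples, batch_size):
    """Number of batches (weight updates) needed to see every sample once."""
    return math.ceil(num_samples / batch_size)


def total_iterations(num_samples, batch_size, epochs):
    """Total number of weight updates over the whole training run."""
    return epochs * iterations_per_epoch(num_samples, batch_size)
```

For example, with 1,000 training images and a batch size of 32, one epoch takes 32 iterations (the last batch holds only 8 images), so 10 epochs amount to 320 iterations.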

8. Evaluate the Model

The model is evaluated on the test set to determine its accuracy, precision, recall, and F1 score. This helps in understanding the model's performance comprehensively.
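
To make these metrics concrete, the sketch below computes them per class from lists of true and predicted labels, treating each class in turn as the positive class (one-vs-rest). The function names are hypothetical; libraries such as scikit-learn provide equivalent, more complete reports.

```python
def per_class_metrics(y_true, y_pred, labels):
    """Return {label: (precision, recall, f1)} using one-vs-rest counts."""
    results = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        results[label] = (precision, recall, f1)
    return results


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Reporting precision and recall per class matters here because a high overall accuracy can hide poor recall on a smaller class.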

9. Visualize Training Results

Loss and Accuracy Plots: Visualize the training and validation loss and accuracy to understand how well the model is learning and if it's overfitting or underfitting.
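Confusion Matrix: Shows, for every pair of true and predicted classes, how many test images fall into that pair. From it the true positives, false positives, true negatives, and false negatives of each class can be read off, giving insights into where the model is making errors.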
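
A confusion matrix for a multi-class problem can be built in a few lines, as in this sketch. The function name and the row/column convention (rows are true classes, columns are predicted classes) are choices made for this example.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count (true, predicted) pairs; rows are true classes, columns predicted."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix
```

The diagonal holds the correct predictions. For a given class, the other entries in its column are its false positives and the other entries in its row are its false negatives.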

10. Conclusion

The findings and the performance of the model are summarized. Potential improvements or future work such as experimenting with different architectures, more data, or advanced preprocessing techniques are discussed.

Results

The model achieves an accuracy of 98.1% on the test set, indicating its effectiveness in detecting COVID-19 from X-ray images. The high accuracy demonstrates the successful application of data preprocessing, augmentation, and model training techniques.
