metadata
license: apache-2.0
datasets:
  - joshuachou/SkinCAP
  - HemanthKumarK/SKINgpt
language:
  - en
pipeline_tag: image-text-to-text
tags:
  - biology
  - skin
  - skin disease
  - cancer
  - medical

Model Card for PaliGemma Dermatology Model

Model Details

Model Description

This model, based on the PaliGemma-3B architecture, has been fine-tuned for dermatology-related image and text processing tasks. The model is designed to assist in the identification of various skin conditions using a combination of image analysis and natural language processing.

Please let me know how the model works for you: https://forms.gle/cBA6apSevTyiEbp46

Thank you

Uses

Direct Use

The model can be directly used for analyzing dermatology images, providing insights into potential skin conditions.

Bias, Risks, and Limitations

  • Skin Tone Bias: The model may have been trained on a dataset that does not adequately represent all skin tones, potentially leading to biased results.
  • Geographic Bias: The model's performance may vary depending on the prevalence of certain conditions in different geographic regions.

How to Get Started with the Model


import torch
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration
from PIL import Image

# Pick the device once so the model and the inputs end up in the same place
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the model and processor
model_id = "brucewayne0459/paligemma_derm"
processor = AutoProcessor.from_pretrained(model_id)
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id).to(device)
model.eval()

# Load a sample image and text input
input_text = "Identify the skin condition?"
input_image_path = "path/to/image.jpg"  # Replace with your actual image path
input_image = Image.open(input_image_path).convert("RGB")

# Process the input and move the tensors to the same device as the model
inputs = processor(text=input_text, images=input_image, return_tensors="pt", padding="longest").to(device)

# Set the maximum number of new tokens to generate
max_new_tokens = 50

# Run inference
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)

# Decode the output
decoded_output = processor.decode(outputs[0], skip_special_tokens=True)
print("Model Output:", decoded_output)
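
In the snippet above, `generate` returns the prompt tokens followed by the newly generated ones, so the decoded string usually begins with the question itself. A minimal sketch of trimming the prompt is shown below. The helper name `drop_prompt_tokens` and the token IDs are hypothetical and only illustrate the slicing; they are not part of the model or of `transformers`.

```python
def drop_prompt_tokens(output_ids, prompt_length):
    """Return only the tokens that were generated after the prompt."""
    return output_ids[prompt_length:]


# Hypothetical token IDs, used only to illustrate the slicing
prompt_ids = [101, 7, 42]
output_ids = prompt_ids + [900, 901, 1]

new_ids = drop_prompt_tokens(output_ids, len(prompt_ids))
print(new_ids)  # [900, 901, 1]

# With the objects from the snippet above, the same idea would look like:
#   prompt_length = inputs["input_ids"].shape[-1]
#   answer = processor.decode(outputs[0][prompt_length:], skip_special_tokens=True)
```

Slicing by the input length rather than by string matching avoids surprises when the decoded prompt differs slightly from the text that was passed in.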

Training Details

Training Data

The model was fine-tuned on dermatological images paired with disease names. The card metadata lists joshuachou/SkinCAP and HemanthKumarK/SKINgpt as the datasets.

Training Procedure

The model was fine-tuned using LoRA (Low-Rank Adaptation) for more efficient training. Mixed precision (bfloat16) was used to speed up training and reduce memory usage.
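
To make the LoRA idea concrete, here is a small pure-Python sketch of a low-rank update on a single linear layer. The rank, scaling factor `alpha`, and matrix sizes are illustrative assumptions; the card does not state the LoRA settings that were actually used, and real training would normally go through a library rather than hand-written code like this.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha, rank):
    """Compute W x + (alpha / rank) * B (A x).

    W is the frozen d_out x d_in weight, A is rank x d_in and B is d_out x rank.
    Only A and B would be trained.
    """
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


def lora_param_counts(d_in, d_out, rank):
    """Return (parameters in the full weight, parameters in the LoRA factors)."""
    return d_in * d_out, rank * (d_in + d_out)


# Toy 2x2 layer with rank 1 (all values are hypothetical)
W = [[1.0, 2.0], [3.0, 4.0]]
A = [[1.0, 1.0]]
x = [1.0, 1.0]

# B is commonly initialised to zero, so the adapted layer starts out identical to the base layer
print(lora_forward(W, A, [[0.0], [0.0]], x, alpha=2, rank=1))  # [3.0, 7.0]

# After training, B is non-zero and shifts the output
print(lora_forward(W, A, [[1.0], [2.0]], x, alpha=2, rank=1))  # [7.0, 15.0]

# For a hypothetical 2048 x 2048 layer with rank 8
print(lora_param_counts(2048, 2048, 8))  # (4194304, 32768)
```

The parameter counts show where the efficiency comes from: in this hypothetical layer the trainable factors hold under 1% of the parameters in the full weight matrix.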

Training Hyperparameters

  • Training regime: Mixed precision (bfloat16)
  • Epochs: 10
  • Learning rate: 2e-5
  • Batch size: 6
  • Gradient accumulation steps: 4
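
The batch size and gradient accumulation steps listed above combine into an effective batch size per optimizer update. The sketch below works through that arithmetic, assuming the listed batch size is per device and that a single GPU was used, as stated under Environmental Impact. The function name is hypothetical.

```python
def effective_batch_size(per_device_batch_size, grad_accum_steps, num_devices=1):
    """Number of examples that contribute to each optimizer update."""
    return per_device_batch_size * grad_accum_steps * num_devices


# Values from the hyperparameter list: batch size 6, 4 accumulation steps, 1 GPU
print(effective_batch_size(6, 4))  # 24
```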

Evaluation

Testing Data, Factors & Metrics

Testing Data

The model was evaluated on a separate validation set of dermatological images and disease names, distinct from the training data.

Metrics

  • Validation Loss: The loss was tracked throughout the training process to evaluate model performance.
  • Accuracy: The primary metric for assessing model predictions.

Results

The model achieved a final validation loss of approximately 0.2214, indicating reasonable performance in predicting skin conditions based on the dataset used.
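
One way to read the loss value is to convert it to perplexity. The sketch below assumes the reported loss is a mean per-token cross-entropy in nats, which the card does not confirm. Under that assumption a loss of 0.2214 corresponds to a perplexity of roughly 1.25.

```python
import math


def perplexity_from_loss(mean_cross_entropy):
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


# Validation loss reported above
print(round(perplexity_from_loss(0.2214), 3))
```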

Environmental Impact

  • Hardware Type: 1 x L4 GPU
  • Hours used: ~22 hours
  • Cloud Provider: Lightning AI
  • Compute Region: USA
  • Carbon Emitted: 0.9 kg CO2 eq.

Technical Specifications

Model Architecture and Objective

  • Architecture: Vision-Language model based on PaliGemma-3B
  • Objective: To classify and diagnose dermatological conditions from images and text

Compute Infrastructure

Hardware

  • GPU: 1 x L4

Model Card Authors

Bruce_Wayne