PyTorch NN Classifier fMRI
This repository contains a PyTorch binary classification model for fMRI data. The architecture has configurable hidden layers, dropout regularization, and a choice of activation functions, with Xavier uniform weight initialization by default. Hyperparameters were tuned with Optuna.
Model Card
- Model Name: Pytorch_Classifier_fMRI
- Framework: PyTorch
- Task: Binary Classification (e.g., prediction based on fMRI data)
- Hyperparameter Optimization: Optuna
- Evaluation Metrics: Accuracy, ROC-AUC, Precision, Recall, F1-score
- License: j.lacoma
Model Architecture
The classification model is composed of several hidden layers, each followed by batch normalization and an activation layer (ReLU, LeakyReLU, or Tanh). Dropout is applied for regularization, and the final layer produces a single output that is mapped to a probability with a sigmoid, as is usual for binary classification.
The architecture supports multiple weight initialization strategies, with the default being Xavier uniform initialization. The output layer is designed for binary classification tasks, predicting probabilities for each class (0 or 1).
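The description above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the class name FMRIClassifier, the ACTIVATIONS mapping, the exact layer order, and the choice to return a raw logit (with the sigmoid applied outside the model) are all assumptions.

```python
import torch
from torch import nn

# Hypothetical mapping from activation names to PyTorch modules.
ACTIVATIONS = {"relu": nn.ReLU, "leaky_relu": nn.LeakyReLU, "tanh": nn.Tanh}


class FMRIClassifier(nn.Module):
    """Illustrative MLP: Linear -> BatchNorm -> activation -> Dropout per hidden layer."""

    def __init__(self, input_dim, hidden_layers=(512, 256), dropout_rate=0.37, activation="tanh"):
        super().__init__()
        layers = []
        in_features = input_dim
        for width in hidden_layers:
            layers += [
                nn.Linear(in_features, width),
                nn.BatchNorm1d(width),
                ACTIVATIONS[activation](),
                nn.Dropout(dropout_rate),
            ]
            in_features = width
        # Single output unit that returns a raw logit (assumption).
        layers.append(nn.Linear(in_features, 1))
        self.net = nn.Sequential(*layers)
        self.apply(self._init_weights)

    @staticmethod
    def _init_weights(module):
        # Xavier uniform weights and zero biases for every linear layer.
        if isinstance(module, nn.Linear):
            nn.init.xavier_uniform_(module.weight)
            nn.init.zeros_(module.bias)

    def forward(self, x):
        return self.net(x)


model = FMRIClassifier(input_dim=100)  # 100 features is an arbitrary example size
model.eval()
with torch.no_grad():
    probs = torch.sigmoid(model(torch.randn(4, 100)))
print(probs.shape)  # torch.Size([4, 1])
```

Returning a logit keeps the sketch compatible with BCEWithLogitsLoss during training; the sigmoid is only needed when probabilities are wanted.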
Example Usage
You can load the model directly from Hugging Face and use it for inference. Below is an example that demonstrates how to load the model, define custom weight initialization, and perform inference.
import torch
from transformers import AutoModelForSequenceClassification
# Load model from Hugging Face
model = AutoModelForSequenceClassification.from_pretrained("JayLacoma/Pytorch_Classifier_fMRI")
# Define weight initialization function (Xavier uniform weights, zero biases).
# Only needed when training from scratch: apply it with model.apply(weight_init).
def weight_init(m):
    if isinstance(m, torch.nn.Linear):
        torch.nn.init.xavier_uniform_(m.weight)
        if m.bias is not None:
            torch.nn.init.zeros_(m.bias)
# (Optional) Load saved model weights
model.load_state_dict(torch.load('Pytorch_Classifier_fMRI.pth'))
# Set model to evaluation mode
model.eval()
# Inference example
input_data = torch.tensor([[0.1, 0.2, 0.3, ..., 0.8]])  # Example input data; replace "..." with the remaining feature values
with torch.no_grad():
    logits = model(input_data).logits
prediction = torch.sigmoid(logits).round()  # Binary classification output: 0 or 1
print(f"Predicted label: {prediction.item()}")
Hyperparameter Optimization with Optuna
The model has been optimized using Optuna, a hyperparameter tuning framework. The following hyperparameters were fine-tuned through trials to achieve the best results:
{
  "hidden_layers": [512, 256],
  "dropout_rate": 0.37,
  "learning_rate": 0.00667,
  "weight_decay": 1.33e-06,
  "batch_size": 32,
  "patience": 10,
  "activation_function": "tanh",
  "optimizer": "AdamW",
  "scheduler": "StepLR",
  "clip_grad": 3.67
}
Training Workflow
The training loop includes gradient clipping, learning rate scheduling, and early stopping based on the validation loss. The optimizer is AdamW, and the loss function is BCEWithLogitsLoss, which combines a sigmoid with binary cross-entropy in a single numerically stable operation. The scheduler (StepLR) multiplies the learning rate by a fixed factor at regular epoch intervals.
Here’s a snippet that shows how the training is managed:
import torch
# Device used by the training and validation loops
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Model Training Function
def train_model(model, train_loader, test_loader, criterion, optimizer, scheduler, max_epochs, patience, clip_grad=None, scheduler_name=None):
    best_loss = float('inf')
    no_improvement = 0
    for epoch in range(max_epochs):
        # Training pass
        model.train()
        running_loss = 0.0
        for inputs, labels in train_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            # Clip the global gradient norm before the optimizer step
            if clip_grad is not None:
                torch.nn.utils.clip_grad_norm_(model.parameters(), clip_grad)
            optimizer.step()
            running_loss += loss.item()
        # Validation pass
        model.eval()
        val_loss = 0.0
        with torch.no_grad():
            for inputs, labels in test_loader:
                inputs, labels = inputs.to(device), labels.to(device)
                outputs = model(inputs)
                val_loss += criterion(outputs, labels).item()
        # Early stopping logic here...
    return model
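The early stopping step is elided in the snippet above. One common way to implement it, given the best_loss, no_improvement, and patience variables it already declares, might look like the sketch below. The EarlyStopping class and its min_delta parameter are hypothetical and are not taken from the repository.

```python
class EarlyStopping:
    """Tracks the best validation loss and signals when to stop."""

    def __init__(self, patience=10, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.no_improvement = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.no_improvement = 0
        else:
            self.no_improvement += 1
        return self.no_improvement >= self.patience


# Simulated validation losses: improvement stalls after the third epoch.
losses = [0.70, 0.62, 0.58, 0.59, 0.60, 0.58, 0.61]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, val_loss in enumerate(losses):
    if stopper.step(val_loss):
        stopped_at = epoch
        break
print(stopped_at, stopper.best_loss)  # 5 0.58
```

Inside a training loop, the same check would run once per epoch after the validation pass, typically alongside saving the model weights whenever the best loss improves.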
Evaluation Metrics
After training, the model can be evaluated using various metrics:
- Accuracy: Measures how often the model predicts the correct class.
- AUC-ROC: Area under the Receiver Operating Characteristic curve. A higher AUC indicates better separation between the two classes across all decision thresholds.
- Precision: Proportion of true positives among all positive predictions.
- Recall: Proportion of actual positives that the model correctly identifies.
- F1-Score: Harmonic mean of precision and recall, balancing both metrics.
import numpy as np
import torch
from sklearn.metrics import roc_auc_score, accuracy_score, precision_score, recall_score, f1_score
# Model Evaluation Function
# `device` is assumed to be the torch.device the model lives on.
def evaluate_model(model, test_loader):
    model.eval()
    all_probs, all_labels = [], []
    with torch.no_grad():
        for inputs, labels in test_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            # Assumes the model returns raw logits (as BCEWithLogitsLoss expects);
            # drop the sigmoid if the model already outputs probabilities.
            all_probs.append(torch.sigmoid(outputs).cpu().numpy())
            all_labels.append(labels.cpu().numpy())
    # Join the per-batch arrays into one flat vector each
    all_probs = np.concatenate(all_probs).ravel()
    all_labels = np.concatenate(all_labels).ravel()
    all_preds = np.round(all_probs)
    # Metrics calculation
    auc_roc = roc_auc_score(all_labels, all_probs)
    accuracy = accuracy_score(all_labels, all_preds)
    precision = precision_score(all_labels, all_preds)
    recall = recall_score(all_labels, all_preds)
    f1 = f1_score(all_labels, all_preds)
    return auc_roc, accuracy, precision, recall, f1
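To make the threshold-based metrics concrete, here is a small worked example that computes them by hand from confusion-matrix counts. The labels and predictions are made up for illustration and are not results from this model; binary_metrics is a hypothetical helper.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 from 0/1 labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up example: 4 positives and 6 negatives, with one miss and one false alarm.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Here TP = 3, FP = 1, FN = 1, and TN = 5, so accuracy is 8/10 and precision, recall, and F1 are all 3/4.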
Parameters
The following hyperparameters yielded the best results during Optuna trials:
- Hidden Layers: [512, 256]
- Dropout Rate: 0.37
- Learning Rate: 0.00667
- Weight Decay: 1.33e-06
- Batch Size: 32
- Activation Function: Tanh
- Optimizer: AdamW
- Scheduler: StepLR
- Clip Gradients: 3.67
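As a rough illustration of what the StepLR entry implies for the learning rate, the closed-form schedule can be computed by hand. The step_size and gamma values are not among the tuned parameters listed above; the sketch borrows step_size=5 and gamma=0.1 from the tuning example later in this document, which is an assumption. The step_lr helper is hypothetical.

```python
def step_lr(base_lr, epoch, step_size, gamma):
    """Learning rate at a given epoch under a step decay schedule."""
    return base_lr * gamma ** (epoch // step_size)


base_lr = 0.00667  # tuned learning rate from the list above
for epoch in (0, 4, 5, 9, 10):
    lr = step_lr(base_lr, epoch, step_size=5, gamma=0.1)
    print(f"epoch {epoch:2d}: lr = {lr:.3e}")
```

With these values the learning rate stays at 0.00667 for epochs 0 to 4, drops by a factor of ten at epoch 5, and drops again at epoch 10.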
Dataset
The model expects fMRI data as input. The input dimension should correspond to the number of features in the dataset, typically pre-processed fMRI signals. Data is loaded using PyTorch's DataLoader, ensuring efficient mini-batch processing during training.
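A minimal sketch of this loading step, using random tensors as a stand-in for pre-processed fMRI features. The sample count, the feature count, and the (N, 1) float label shape are assumptions chosen for illustration, not properties of the actual dataset.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for pre-processed fMRI features: 64 samples x 100 features.
features = torch.randn(64, 100)
# Float targets of shape (N, 1), the form BCEWithLogitsLoss expects for one output unit.
labels = torch.randint(0, 2, (64, 1)).float()

dataset = TensorDataset(features, labels)
train_loader = DataLoader(dataset, batch_size=32, shuffle=True)

# Pull one mini-batch to check the shapes the model will receive.
inputs, targets = next(iter(train_loader))
print(inputs.shape, targets.shape)  # torch.Size([32, 100]) torch.Size([32, 1])
```

The second dimension of the feature tensor is the input dimension the first linear layer must match.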
Hyperparameter Tuning
To tune the hyperparameters after obtaining a model from Hugging Face, you can integrate Optuna or other hyperparameter optimization frameworks into your workflow. Here’s how you can systematically tune hyperparameters for a pre-trained Hugging Face model using Optuna:
1. Install Necessary Libraries
Make sure you have installed all required dependencies:
pip install optuna torch transformers
2. Load the Pre-trained Model
First, load the pre-trained model from Hugging Face using the AutoModelForSequenceClassification class:
from transformers import AutoModelForSequenceClassification
# Load model from Hugging Face
model = AutoModelForSequenceClassification.from_pretrained("JayLacoma/Pytorch_Classifier_fMRI")
3. Define the Objective Function for Optuna
The objective function for Optuna includes loading the model, setting up the optimizer, loss function, and scheduler, and evaluating the performance on validation data.
Here’s an example objective function:
import optuna
import torch
import torch.optim as optim
from torch.utils.data import DataLoader
from transformers import AutoModelForSequenceClassification
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
def objective(trial):
    # Define hyperparameters to tune (log=True samples on a logarithmic scale;
    # it replaces the deprecated suggest_loguniform)
    dropout_rate = trial.suggest_float('dropout_rate', 0.2, 0.5)
    lr = trial.suggest_float('lr', 1e-5, 1e-2, log=True)
    weight_decay = trial.suggest_float('weight_decay', 1e-6, 1e-3, log=True)
    batch_size = trial.suggest_categorical('batch_size', [16, 32, 64])
    optimizer_name = trial.suggest_categorical('optimizer', ['AdamW', 'SGD', 'Adam'])
    scheduler_name = trial.suggest_categorical('scheduler', ['CosineAnnealingLR', 'StepLR', 'ReduceLROnPlateau'])
    # Reload the model so that every trial starts from the same weights
    model = AutoModelForSequenceClassification.from_pretrained("JayLacoma/Pytorch_Classifier_fMRI").to(device)
    # Apply dropout (if your model supports custom dropout; otherwise, it’s predefined in the architecture)
    model.dropout = torch.nn.Dropout(dropout_rate)
    # Optimizer setup
    if optimizer_name == 'AdamW':
        optimizer = optim.AdamW(model.parameters(), lr=lr, weight_decay=weight_decay)
    elif optimizer_name == 'SGD':
        optimizer = optim.SGD(model.parameters(), lr=lr, weight_decay=weight_decay, momentum=0.9)
    elif optimizer_name == 'Adam':
        optimizer = optim.Adam(model.parameters(), lr=lr, weight_decay=weight_decay)
    # Loss function
    criterion = torch.nn.BCEWithLogitsLoss()
    # Scheduler setup
    if scheduler_name == 'CosineAnnealingLR':
        scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)
    elif scheduler_name == 'StepLR':
        scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.1)
    elif scheduler_name == 'ReduceLROnPlateau':
        scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', patience=3)
    # Create data loaders for training and validation (replace with your dataset)
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
    val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)
    # Training loop (no early stopping here; limited to 10 epochs for brevity)
    for epoch in range(10):
        model.train()  # Re-enable dropout and batch-norm updates after each validation pass
        for inputs, labels in train_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            optimizer.zero_grad()
            outputs = model(inputs).logits
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
        # Validation loop
        model.eval()
        val_loss = 0
        with torch.no_grad():
            for inputs, labels in val_loader:
                inputs, labels = inputs.to(device), labels.to(device)
                outputs = model(inputs).logits
                val_loss += criterion(outputs, labels).item()
        # Step the scheduler once per epoch
        if scheduler_name == 'ReduceLROnPlateau':
            scheduler.step(val_loss)
        else:
            scheduler.step()
    return val_loss / len(val_loader)  # Return the average validation loss of the final epoch
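The learning rate and weight decay in the objective above are searched on a logarithmic scale. As a rough illustration of what that means (this is not Optuna's implementation, and sample_log_uniform is a hypothetical helper), log-uniform sampling draws uniformly in log space, so each decade of the range receives about the same share of samples.

```python
import math
import random


def sample_log_uniform(low, high, rng):
    """Draw a value whose logarithm is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)  # fixed seed so the illustration is repeatable
samples = [sample_log_uniform(1e-5, 1e-2, rng) for _ in range(10_000)]

# Share of samples in each of the three decades covered by [1e-5, 1e-2].
decades = [(1e-5, 1e-4), (1e-4, 1e-3), (1e-3, 1e-2)]
shares = [sum(lo <= s < hi for s in samples) / len(samples) for lo, hi in decades]
print([round(share, 2) for share in shares])
```

A plain uniform draw over the same range would put roughly 90% of samples above 1e-3, which is why a log scale is the usual choice for learning rates.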
4. Run the Hyperparameter Search with Optuna
You can now create an Optuna study and run the optimization process across multiple trials. Each trial will search for better hyperparameters by minimizing the validation loss.
study = optuna.create_study(direction='minimize') # Optimize to minimize the validation loss
study.optimize(objective, n_trials=100) # Run for 100 trials
# Retrieve the best hyperparameters found by Optuna
best_params = study.best_params
print(f"Best hyperparameters: {best_params}")
5. Use the Best Hyperparameters for Model Training
Once Optuna finishes running, you can train the model using the best hyperparameters found:
# Use the best hyperparameters to train your final model
model = AutoModelForSequenceClassification.from_pretrained("JayLacoma/Pytorch_Classifier_fMRI")
# Apply best parameters
dropout_rate = best_params['dropout_rate']
lr = best_params['lr']
weight_decay = best_params['weight_decay']
batch_size = best_params['batch_size']
optimizer_name = best_params['optimizer']
scheduler_name = best_params['scheduler']
# Continue to train using these hyperparameters as demonstrated earlier
Summary of Steps
- Load the Hugging Face model.
- Define an Optuna objective function that incorporates the model, optimizer, loss, and scheduler.
- Use Optuna to perform hyperparameter tuning by running multiple trials.
- Retrieve the best hyperparameters and re-train the model for final use.