Friday, 3 January 2025

Hour 25 Transfer Learning

#### Concept

Transfer learning is a machine learning technique in which a model trained on one task is reused for a second, related task. It leverages the knowledge gained on the source task to improve learning on the target task, which is especially valuable when the target dataset is too small to train a strong model from scratch.

#### Key Aspects

1. Pre-trained Models: Utilize models trained on large-scale datasets like ImageNet, which have learned rich feature representations from extensive data.

2. Fine-tuning: Adapt pre-trained models to new tasks by updating weights during training on the target dataset. Fine-tuning allows the model to adjust its learned representations to fit the new task better.

3. Domain Adaptation: Adjusting a model trained on one distribution (source domain) to perform well on another distribution (target domain) with different characteristics.

#### Implementation Steps

1. Select a Pre-trained Model: Choose a model pre-trained on a large dataset relevant to your task (e.g., VGG, ResNet, BERT).

2. Adaptation to New Task: 

   - Feature Extraction: Freeze most layers of the pre-trained model and use the outputs of its intermediate layers as fixed features for the new dataset (a short sketch of this approach follows this list).

   - Fine-tuning: Fine-tune the entire model or only a few top layers on the new dataset with a lower learning rate to avoid overfitting.

3. Evaluation: Evaluate the performance of the adapted model on the target task using appropriate metrics (e.g., accuracy, precision, recall).
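
As a concrete illustration of the feature-extraction strategy from step 2, here is a minimal sketch using the same Keras/TensorFlow stack as the full example below. The subset size, layer widths, and variable names are illustrative choices only, not part of the worked example.

from tensorflow.keras.applications import VGG16
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

# Load CIFAR-10 and scale pixels; the small subset is an illustrative choice to keep the sketch fast
(X_train, y_train), _ = cifar10.load_data()
X_small = X_train[:5000].astype('float32') / 255.0
y_small = y_train[:5000]

# Frozen VGG16 base used purely as a fixed feature extractor;
# pooling='avg' collapses each feature map to a single value per channel
feature_extractor = VGG16(weights='imagenet', include_top=False,
                          pooling='avg', input_shape=(32, 32, 3))
feature_extractor.trainable = False

# Compute the features once; no gradients ever flow through the base model
features = feature_extractor.predict(X_small, batch_size=128)

# Train a small classifier head on the extracted features
clf = Sequential([
    Dense(256, activation='relu', input_shape=(features.shape[1],)),
    Dense(10, activation='softmax')
])
clf.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
            metrics=['accuracy'])
clf.fit(features, y_small, epochs=5, batch_size=128, validation_split=0.1)

Because the base model's weights never change, the features only need to be computed once, which makes this approach much cheaper than fine-tuning when the target dataset is small.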

#### Example: Transfer Learning with Pre-trained CNN for Image Classification

Let's demonstrate transfer learning using a pre-trained VGG16 model for classifying images from a new dataset (e.g., CIFAR-10).

import numpy as np
import tensorflow as tf
from tensorflow.keras.applications import VGG16
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Dropout
from tensorflow.keras.optimizers import Adam

# Load CIFAR-10 dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

# Preprocess the data
X_train = X_train.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0

# Load pre-trained VGG16 model (excluding top layers)
base_model = VGG16(weights='imagenet', include_top=False,
                   input_shape=(32, 32, 3))

# Freeze all layers in the base model so only the new head is trained
for layer in base_model.layers:
    layer.trainable = False

# Create a new model on top of the pre-trained base model
model = Sequential([
    base_model,
    Flatten(),
    Dense(512, activation='relu'),
    Dropout(0.5),
    Dense(10, activation='softmax')
])


# Compile the model
model.compile(optimizer=Adam(learning_rate=0.0001),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])


# Train the model
history = model.fit(X_train, y_train, epochs=10, batch_size=128,
                    validation_data=(X_test, y_test))

# Evaluate the model
test_loss, test_acc = model.evaluate(X_test, y_test)
print(f'Test accuracy: {test_acc}')

# Fine-tuning: unfreeze the last few layers of the base model
for layer in base_model.layers[-4:]:
    layer.trainable = True

model.compile(optimizer=Adam(learning_rate=0.00001),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit(X_train, y_train, epochs=5, batch_size=128,
                    validation_data=(X_test, y_test))

# Evaluate the fine-tuned model
test_loss, test_acc = model.evaluate(X_test, y_test)
print(f'Fine-tuned test accuracy: {test_acc}')

#### Result

Epoch 10/10
391/391 ━━━━━━━━━━━━━━━━━━━━ 116s 296ms/step - accuracy: 0.5684 - loss: 1.2423 - val_accuracy: 0.5728 - val_loss: 1.2220
313/313 ━━━━━━━━━━━━━━━━━━━━ 25s 79ms/step - accuracy: 0.5747 - loss: 1.2192
Test accuracy: 0.5727999806404114

Epoch 5/5
391/391 ━━━━━━━━━━━━━━━━━━━━ 159s 406ms/step - accuracy: 0.7477 - loss: 0.7227 - val_accuracy: 0.7129 - val_loss: 0.8162
313/313 ━━━━━━━━━━━━━━━━━━━━ 26s 83ms/step - accuracy: 0.7185 - loss: 0.8140
Fine-tuned test accuracy: 0.7128999829292297

#### Explanation

1. Loading Data: Load and preprocess the CIFAR-10 dataset.

2. Base Model: Load VGG16 pre-trained on ImageNet without the top layers.

3. Model Construction: Add custom top layers (fully connected, dropout, output) to the pre-trained base.

4. Training: Train the model on the CIFAR-10 dataset.

5. Fine-tuning: Unfreeze the last few layers of the base model and continue training with a much lower learning rate, so the pre-trained weights adapt to the new task without being overwritten.

6. Evaluation: Evaluate the final model's performance on the test set (a short sketch of computing precision and recall follows this list).
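
Step 3 of the implementation mentions precision and recall, while the example above reports only accuracy. The following is a minimal sketch of computing per-class metrics for the trained model with scikit-learn, an extra dependency not used in the example itself:

import numpy as np
from sklearn.metrics import classification_report

# Predicted class = index of the highest softmax probability
y_pred = np.argmax(model.predict(X_test), axis=1)

# Per-class precision, recall and F1 for the 10 CIFAR-10 classes
# (y_test from cifar10.load_data() has shape (N, 1), hence the flatten)
print(classification_report(y_test.flatten(), y_pred))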

#### Applications

Transfer learning is widely used in:

- Computer Vision: Image classification, object detection, and segmentation.

- Natural Language Processing: Text classification, sentiment analysis, and machine translation, usually starting from a pre-trained language model such as BERT (see the sketch after this list).

- Audio Processing: Speech recognition and sound classification.
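
To make the NLP case concrete, here is a minimal sketch of fine-tuning a pre-trained BERT model for binary sentiment classification. It assumes the Hugging Face transformers library is installed (an extra dependency not used elsewhere in this post), and the two-sentence dataset is purely illustrative.

import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

# Toy data for illustration only; a real task would use a labelled corpus
texts = ["A wonderful, moving film.", "Dull plot and wooden acting."]
labels = [1, 0]

# Pre-trained BERT encoder with a freshly initialised 2-class head
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert_model = TFAutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

# Tokenize the texts into TensorFlow tensors
encodings = tokenizer(texts, padding=True, truncation=True, return_tensors="tf")

# Fine-tune the whole model with a small learning rate, as in the CNN example
bert_model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
                   loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                   metrics=["accuracy"])
bert_model.fit(dict(encodings), tf.constant(labels), epochs=2, batch_size=2)

The pattern mirrors the image example: start from pre-trained weights, attach a task-specific head, and train with a small learning rate.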

#### Advantages

- Reduced Training Time: Leveraging pre-trained models reduces the need for training from scratch.

- Improved Performance: Transfer learning can improve model accuracy, especially with limited labeled data.

- Broader Applicability: Models trained on diverse datasets can be adapted to various real-world applications.
