Friday, 3 January 2025

Hour 19 Convolutional Neural Networks (CNNs)

#### Concept

Convolutional Neural Networks (CNNs) are specialized neural networks designed to process data with a grid-like topology, such as images. They are particularly effective for image recognition and classification tasks due to their ability to capture spatial hierarchies in the data.

#### Key Features of CNNs

1. Convolutional Layers: Apply convolution operations to extract features from the input data.

2. Pooling Layers: Reduce the dimensionality of the data while retaining important features.

3. Fully Connected Layers: Perform classification based on the extracted features.

4. Activation Functions: Introduce non-linearity to the network (e.g., ReLU).

5. Filters/Kernels: Learnable parameters that detect specific patterns like edges, textures, etc. (see the NumPy sketch after this list).
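
A minimal NumPy sketch of what a filter does (a toy 5x5 image and a hand-written 3x3 vertical-edge kernel, both invented for illustration; in a CNN the kernel values are learned rather than hand-crafted):

import numpy as np

# Toy 5x5 grayscale image: dark on the left, bright on the right
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)

# Hand-crafted 3x3 vertical-edge kernel (Sobel-like)
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=float)

# Slide the kernel over the image (stride 1, no padding)
h = image.shape[0] - 2  # output height: 3
w = image.shape[1] - 2  # output width: 3
feature_map = np.zeros((h, w))
for i in range(h):
    for j in range(w):
        feature_map[i, j] = np.sum(image[i:i+3, j:j+3] * kernel)

print(feature_map)  # nonzero only where the window straddles the edge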

#### Key Steps

1. Convolution Operation: Slide filters over the input image to create feature maps.

2. Pooling Operation: Downsample the feature maps to reduce dimensions and computation (the shape arithmetic is sketched after this list).

3. Flattening: Convert the 2D feature maps into a 1D vector for the fully connected layers.

4. Fully Connected Layers: Perform the final classification based on the extracted features.
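
A quick sketch of the shape arithmetic behind these steps, worked for the model built below (assuming 'valid' convolutions with stride 1, where the output side is (input − kernel)/stride + 1, and 2x2 pooling halves each side):

def conv_out(size, kernel, stride=1):
    # 'valid' convolution output size: (input - kernel) / stride + 1
    return (size - kernel) // stride + 1

size = 28                 # MNIST input: 28x28
size = conv_out(size, 3)  # Conv2D 3x3     -> 26x26
size //= 2                # MaxPooling 2x2 -> 13x13
size = conv_out(size, 3)  # Conv2D 3x3     -> 11x11
size //= 2                # MaxPooling 2x2 -> 5x5
print(size * size * 64)   # Flatten with 64 filters -> 1600 values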

#### Implementation

Let's implement a simple CNN using Keras on the MNIST dataset, which consists of handwritten digit images.

##### Example

# Import necessary libraries

import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.utils import to_categorical


# Load the MNIST dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Preprocessing the data
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1).astype('float32') / 255
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1).astype('float32') / 255
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# Creating the CNN model
model = Sequential([
    Conv2D(32, kernel_size=(3, 3), activation='relu',
           input_shape=(28, 28, 1)),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(64, kernel_size=(3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])

# Compiling the model
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])

# Training the model
model.fit(X_train, y_train, epochs=10, batch_size=200,
          validation_split=0.2, verbose=1)


# Evaluating the model
loss, accuracy = model.evaluate(X_test, y_test, verbose=0)
print(f"Test Accuracy: {accuracy}")

Result

Epoch 10/10
240/240 ━━━━━━━━━━━━━━━━━━━━ 6s 25ms/step - accuracy: 0.9958 - loss: 0.0137 - val_accuracy: 0.9889 - val_loss: 0.0398
Test Accuracy: 0.9896000027656555

#### Explanation of the Code

1. Libraries: We import NumPy and the tensorflow.keras modules we need (the MNIST dataset, the Sequential model, the layer classes, and to_categorical).

2. Data Loading: We load the MNIST dataset with images of handwritten digits.

3. Data Preprocessing:

   - Reshape the images to include a single channel (grayscale).

   - Normalize pixel values to the range [0, 1].

   - Convert the labels to one-hot encoded format.

4. Model Creation:

   - Conv2D Layers: Apply 32 and 64 filters with a kernel size of (3, 3) for feature extraction.

   - MaxPooling2D Layers: Reduce the spatial dimensions of the feature maps.

   - Flatten Layer: Convert 2D feature maps to a 1D vector.

   - Dense Layers: Perform classification with 128 neurons in the hidden layer and 10 neurons in the output layer (one for each digit class).

5. Model Compilation: We compile the model with the Adam optimizer and categorical cross-entropy loss function.

6. Model Training: We train the model for 10 epochs with a batch size of 200 and validate on 20% of the training data.

7. Model Evaluation: We evaluate the model on the test set and print the accuracy (a short verification snippet combining these steps follows).
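
A short verification snippet tying steps 3, 4, and 7 together (a sketch meant to be run after the code above; the expected shapes follow from the arithmetic in Key Steps):

# Step 3 check: shapes and value ranges after preprocessing
print(X_train.shape, X_test.shape)   # (60000, 28, 28, 1) (10000, 28, 28, 1)
print(X_train.min(), X_train.max())  # 0.0 1.0
print(y_train.shape)                 # (60000, 10) one-hot labels

# Step 4 check: layer-by-layer output shapes
model.summary()
# Conv2D          -> (None, 26, 26, 32)
# MaxPooling2D    -> (None, 13, 13, 32)
# Conv2D          -> (None, 11, 11, 64)
# MaxPooling2D    -> (None, 5, 5, 64)
# Flatten         -> (None, 1600)
# Dense           -> (None, 128)
# Dense (softmax) -> (None, 10)

# Step 7 follow-up: inspect individual predictions on a few test images
probs = model.predict(X_test[:5], verbose=0)  # softmax outputs, shape (5, 10)
print("Predicted:", np.argmax(probs, axis=1))
print("Actual:   ", np.argmax(y_test[:5], axis=1))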


#### Advanced Features of CNNs

1. Deeper Architectures: Increase the number of convolutional and pooling layers for better feature extraction.

2. Data Augmentation: Enhance the training set by applying transformations like rotation, flipping, and scaling.

3. Transfer Learning: Use pre-trained models (e.g., VGG, ResNet) and fine-tune them on specific tasks (a sketch follows the augmentation example below).

4. Regularization Techniques: 

   - Dropout: Randomly drop neurons during training to prevent overfitting.

   - Batch Normalization: Normalize the inputs of each layer to stabilize and accelerate training (a minimal placement sketch follows this list).
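
Dropout is demonstrated in the example below; for Batch Normalization, here is a minimal placement sketch (one common pattern, not part of this post's original model, reusing the Sequential/layer imports from the first example):

from tensorflow.keras.layers import BatchNormalization

model_bn = Sequential([
    Conv2D(32, kernel_size=(3, 3), activation='relu',
           input_shape=(28, 28, 1)),
    BatchNormalization(),            # normalize conv activations
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(64, kernel_size=(3, 3), activation='relu'),
    BatchNormalization(),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])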

##### Example with Data Augmentation and Dropout

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import Dropout

# Data Augmentation

datagen = ImageDataGenerator(
    rotation_range=10,
    zoom_range=0.1,
    width_shift_range=0.1,
    height_shift_range=0.1
)

# Creating the CNN model with Dropout
model = Sequential([
    Conv2D(32, kernel_size=(3, 3), activation='relu',
           input_shape=(28, 28, 1)),
    MaxPooling2D(pool_size=(2, 2)),
    Dropout(0.25),
    Conv2D(64, kernel_size=(3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Dropout(0.25),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(10, activation='softmax')
])

# Compiling and training remain the same as before
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(datagen.flow(X_train, y_train, batch_size=200), epochs=10,
          validation_data=(X_test, y_test), verbose=1)

# Evaluating the model
loss, accuracy = model.evaluate(X_test, y_test, verbose=0)
print(f"Test Accuracy: {accuracy}")

Result

Epoch 10/10
300/300 ━━━━━━━━━━━━━━━━━━━━ 15s 48ms/step - accuracy: 0.9653 - loss: 0.1195 - val_accuracy: 0.9929 - val_loss: 0.0188
Test Accuracy: 0.992900013923645
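
And a hedged transfer-learning sketch for point 3 above. VGG16 expects RGB inputs of at least 32x32, so this assumes a CIFAR-10-sized dataset rather than the 28x28 grayscale MNIST images used above; it illustrates the pattern, not a drop-in replacement for the code in this post:

from tensorflow.keras.applications import VGG16

# Load the VGG16 convolutional base pre-trained on ImageNet, without its classifier
base = VGG16(weights='imagenet', include_top=False, input_shape=(32, 32, 3))
base.trainable = False  # freeze pre-trained weights; train only the new head

model_tl = Sequential([
    base,
    Flatten(),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')  # assumes 10 target classes
])
model_tl.compile(optimizer='adam', loss='categorical_crossentropy',
                 metrics=['accuracy'])
# To fine-tune, optionally unfreeze the top of the base afterwards and
# retrain with a low learning rate.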

#### Applications

CNNs are widely used in various fields such as:

- Computer Vision: Image classification, object detection, facial recognition.

- Medical Imaging: Tumor detection, medical image segmentation.

- Autonomous Driving: Road sign recognition, obstacle detection.

- Augmented Reality: Gesture recognition, object tracking.

- Security: Surveillance, biometric authentication.

CNNs' ability to automatically learn hierarchical feature representations makes them highly effective for image-related tasks.

Best Data Science & Machine Learning Resources: https://topmate.io/coding/914624

ENJOY LEARNING 👍👍
