Introduction to Computer Vision: Plant Seedlings Classification¶

Problem Statement¶

Context¶

In recent times, the field of agriculture has been in urgent need of modernization, since checking whether plants are growing correctly still demands an extensive amount of manual work. Despite several advances in agricultural technology, people working in the industry still need to sort and recognize different plants and weeds by hand, which takes considerable time and effort in the long term. This trillion-dollar industry is ripe for technological innovations that cut down on the need for manual labor, and this is where Artificial Intelligence can genuinely benefit the workers in this field: the time and energy required to identify plant seedlings would be greatly reduced by the use of AI and Deep Learning. The ability to do so far more efficiently, and even more effectively, than experienced manual labor could lead to better crop yields, free up human involvement for higher-order agricultural decision making, and in the long term result in more sustainable environmental practices in agriculture as well.

Objective¶

The aim of this project is to build a Convolutional Neural Network to classify plant seedlings into their respective categories.

Data Dictionary¶

The Aarhus University Signal Processing group, in collaboration with the University of Southern Denmark, has recently released a dataset containing images of unique plants belonging to 12 different species.

  • The dataset can be downloaded from Olympus.

  • The data file names are:

    • images.npy
    • Labels.csv
  • Due to the large volume of data, the images were converted into the images.npy file and the labels were placed in Labels.csv, so that you can work on the data/project seamlessly without having to worry about the high data volume.

  • The goal of the project is to create a classifier capable of determining a plant's species from an image.

List of Species:

 #    No. of Images    Description
 1    263              Black-grass
 2    390              Charlock
 3    287              Cleavers
 4    611              Common Chickweed
 5    221              Common Wheat
 6    475              Fat Hen
 7    654              Loose Silky-bent
 8    221              Maize
 9    516              Scentless Mayweed
10    231              Shepherds Purse
11    496              Small-flowered Cranesbill
12    385              Sugar beet

Importing necessary libraries¶

In [ ]:
# Installing the libraries with the specified version.
# uncomment and run the following line if Google Colab is being used
!pip install tensorflow==2.15.0 scikit-learn==1.2.2 seaborn==0.13.1 matplotlib==3.7.1 numpy==1.25.2 pandas==1.5.3 opencv-python==4.8.0.76 -q --user

Note: After running the above cell, restart the notebook kernel and run all cells sequentially from the start again.

In [ ]:
# Essential Libraries
import os                                                                                        # Importing os
import numpy as np                                                                               # Importing numpy for Matrix Operations
import pandas as pd                                                                              # Importing pandas to read CSV files
import matplotlib.pyplot as plt                                                                  # Importing matplotlib for plotting and visualizing images
import math                                                                                      # Importing math module to perform mathematical operations
import cv2                                                                                       # Importing openCV for image processing
import seaborn as sns                                                                            # Importing seaborn to plot graphs

# Tensorflow modules
import tensorflow as tf                                                                          # Importing Tensorflow
from tensorflow.keras.preprocessing.image import ImageDataGenerator                              # Importing the ImageDataGenerator for data augmentation
from tensorflow.keras.models import Sequential                                                   # Importing the sequential module to define a sequential model
from tensorflow.keras.layers import Dense,Dropout,Flatten,Conv2D,MaxPooling2D,BatchNormalization # Defining all the layers to build our CNN Model
from tensorflow.keras.optimizers import Adam,SGD                                                 # Importing the optimizers which can be used in our model
from sklearn import preprocessing                                                                # Importing the preprocessing module to preprocess the data
from sklearn.model_selection import train_test_split                                             # Importing train_test_split function to split the data into train and test
from sklearn.metrics import confusion_matrix                                                     # Importing confusion_matrix to plot the confusion matrix
from sklearn.preprocessing import LabelBinarizer                                                 # Importing LabelBinarizer

# Display images using OpenCV
from google.colab.patches import cv2_imshow                                                      # Importing cv2_imshow from google.colab.patches to display images
from tensorflow.keras import backend                                                             # Importing backend from tensorflow.keras
from keras.callbacks import ReduceLROnPlateau                                                    # Importing ReduceLROnPlateau to reduce the learning rate when a metric has stopped improving
import random                                                                                    # Importing random to perform random operations

# Ignore warnings
import warnings
warnings.filterwarnings('ignore')
In [ ]:
# Additional keras functions
from tensorflow import keras
from keras import backend
from keras.models import Sequential
from keras.layers import Dense, Dropout
In [ ]:
# Verifying installed versions. - EXTRA CREDIT
print('\nLibraries imported for Exploratory Data Analysis \n(EDA - Univariate and Multivariate):\n')
print("For Numerical Operations and Statistical Analysis of Data:")
print(" . Pandas version:", pd.__version__)
print(" . NumPy version:", np.__version__)
print("\nFor Visualization of Data: ")
print(" . Matplotlib version:", plt.rcParams['figure.figsize'])
print(" . Seaborn version:", sns.__version__)
print("\nFor Machine Learning Models:", tf.__version__)
print(" . Tensorflow version:", tf.__version__)
print(" . Keras version:", keras.__version__)
Libraries imported for Exploratory Data Analysis 
(EDA - Univariate and Multivariate):

For Numerical Operations and Statistical Analysis of Data:
 . Pandas version: 2.2.2
 . NumPy version: 1.26.4

For Visualization of Data: 
 . Matplotlib version: [6.4, 4.8]
 . Seaborn version: 0.13.2

For Machine Learning Models: 2.17.0
 . Tensorflow version: 2.17.0
 . Keras version: 3.4.1
In [ ]:
# Identify GPU Processing Resources with current runtime
gpu_info = !nvidia-smi
gpu_info = '\n'.join(gpu_info)
if gpu_info.find('failed') >= 0:
  print('Not connected to a GPU')
else:
  print(gpu_info)
Thu Nov 14 01:00:15 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   74C    P0              33W /  70W |    111MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+

Loading the dataset¶

In [ ]:
from google.colab import drive
drive.mount('/content/drive')                # Mount Drive
Mounted at /content/drive

Check if the actual size matches the expected size

In [ ]:
# Load the IMAGE file of dataset
images = np.load('/content/images.npy')      # Load images.npy dataset array
print(images.shape)  # Print the shape of the array to verify loading

# Calculate the expected size based on the desired shape
expected_size = 4750 * 128 * 128 * 3

# Print the actual size and the expected size
print("Actual size of the array:", images.size)
print("Expected size of the array:", expected_size)
print("4750 arrays * 128 l * 128 w * 3 color channels\n")

# Check if the actual size matches the expected size
if images.size == expected_size:
    images = images.reshape(4750, 128, 128, 3)  # Reshape if sizes match
    print("images dataset shape:", images.shape, " <--- Plant image in Pixels Array representation")   # Check the images shape
else:
    print("Error: The actual size of the array does not match the expected size.")
    print("Please check the dimensions of your images and the specified shape.")

# Load the LABELS file of dataset
labels = pd.read_csv('/content/Labels.csv')  # Load Labels.csv dataset
print("labels dataset shape:", labels.shape, "<--- Plant Labels")   # Check the labels shape
(4750, 128, 128, 3)
Actual size of the array: 233472000
Expected size of the array: 233472000
4750 arrays * 128 l * 128 w * 3 color channels

images dataset shape: (4750, 128, 128, 3)  <--- Plant image in Pixels Array representation
labels dataset shape: (4750, 1) <--- Plant Labels

Observations:

  • There are 4,750 images with their corresponding labels.
  • The size of each image is 128 x 128 pixels (l x w) with a depth of 3 color channels.
  • There are 4,750 images in the images dataset and 4,750 labels in the labels dataset.
In [ ]:
def plot_images(images,labels):                                                   # Function to plot images with corresponding labels
  num_classes=12                                                                  # Number of classes in the dataset (not used below)
  categories=np.unique(labels)                                                    # Obtaining the unique class labels
  keys=dict(labels['Label'])                                                      # Mapping from image index to its label
  rows = 3                                                                        # Defining number of rows=3
  cols = 4                                                                        # Defining number of columns=4
  fig = plt.figure(figsize=(10, 8))                                               # Defining the figure size to 10x8
  for i in range(cols):
      for j in range(rows):
          random_index = np.random.randint(0, len(labels))                        # Generating random indices from the data and plotting the images
          ax = fig.add_subplot(rows, cols, i * rows + j + 1)                      # Adding subplots with 3 rows and 4 columns
          ax.imshow(images[random_index, :])                                      # Plotting the image
          ax.set_title(keys[random_index])                                        # Setting the title of the plot to the corresponding label
  plt.show()                                                                      # Showing the plot
In [ ]:
# images and labels alignments using the plot_images function above
# to plot the images with their corresponding labels:
print("\nPlant labels displayed first, present a grid with 3 rows by 4 columns with (12) plant images \nmadeup by (12) 128 x 128 (l x w) pixel arrays and 3 color (GBR) channels:\n")
plot_images(images, labels)
Plant labels displayed first; below is a grid of 3 rows by 4 columns with (12) plant images, 
each made up of a 128 x 128 (l x w) pixel array with 3 color (BGR) channels:

[Image output: 3 x 4 grid of randomly sampled plant images with their labels]

Data Overview¶

Shape of the datasets¶

In [ ]:
# Load the IMAGE file of dataset
images = np.load('/content/images.npy')      # Load images.npy dataset array
print("\033[1;32mimages\033[0m dataset shape:", images.shape, " <--- Plant image in 128x128 Pixels Array representation with 3 color channels.")   # Check the images shape
# Load the LABELS file of dataset
labels = pd.read_csv('/content/Labels.csv')  # Load Labels.csv dataset
print("\033[1;32mlabels\033[0m dataset shape:", labels.shape, "<--- Plant Labels, 1 column with 4,750 plant labels.")   # Check the labels shape
images dataset shape: (4750, 128, 128, 3)  <--- Plant image in 128x128 Pixels Array representation with 3 color channels.
labels dataset shape: (4750, 1) <--- Plant Labels, 1 column with 4,750 plant labels.

Exploratory Data Analysis¶

In [ ]:
# Extract the unique categories and their counts
category_counts = labels['Label'].value_counts()

# Calculate the percentage of each category and round to 2 decimal places
category_percentage = (category_counts / category_counts.sum() * 100).round(2)

# Find the smallest percentage
min_percentage = category_percentage.min()

# Calculate the factor for each category based on the smallest percentage
factor_based_on_min_percentage = (category_percentage / min_percentage).round(2)

# Combine counts, percentages, and factor into a single DataFrame
category_summary = pd.DataFrame({
    'Count': category_counts,
    'Percentage': category_percentage,
    'xFactor': factor_based_on_min_percentage
})

# Print the DataFrame
category_summary
Out[ ]:
Count Percentage xFactor
Label
Loose Silky-bent 654 13.77 2.96
Common Chickweed 611 12.86 2.77
Scentless Mayweed 516 10.86 2.34
Small-flowered Cranesbill 496 10.44 2.25
Fat Hen 475 10.00 2.15
Charlock 390 8.21 1.77
Sugar beet 385 8.11 1.74
Cleavers 287 6.04 1.30
Black-grass 263 5.54 1.19
Shepherds Purse 231 4.86 1.05
Common wheat 221 4.65 1.00
Maize 221 4.65 1.00

Observations:

  • There are 12 different categories (unique labels) of images.
  • Their category distribution ranges from 221 - 654 plants in this dataset.
  • The category distribution by percentages range from 4.65% to 13.77%.
  • The categories are not evenly distributed (somewhat imbalanced), and this could negatively affect the training of the model. Additional compensation will likely be needed.
In [ ]:
# Plotting the times factor for each category
plt.figure(figsize=(8, 4))
plt.barh(category_summary.index, category_summary['xFactor'], color='skyblue')
plt.xlabel('Factor (based on smallest percentage)')
plt.ylabel('Categories')
plt.title('xFactor of Each Category Relative to Smallest Percentage')
plt.gca().invert_yaxis()  # Invert y-axis for better readability
plt.show()
[Image output: horizontal bar chart of the xFactor for each category]

Observations:

  • The minority categories Maize and Common Wheat have roughly 3x fewer images than Loose Silky-bent and Common Chickweed, creating a potential imbalance.

Recommendations:

  • To better handle this imbalance of images, we could adjust class weighting in the loss function to emphasize the minority classes, apply transformations (rotations, color adjustments) to the minority classes, or even fine-tune a pre-trained VGG model on this dataset (a minimal code sketch follows the list of strategies below).

  • Additional strategies for handling imbalanced datasets:

    1. Resampling: Use oversampling or undersampling for balance; try SMOTE for synthetic samples.
    2. Class Weighting: Adjust loss weights or use focal loss to emphasize minority classes.
    3. Data Augmentation: Apply transformations to minority data to increase diversity.
    4. Balanced Batching: Ensure balanced class representation in mini-batches.
    5. Ensemble: Combine models trained on balanced subsets.
    6. Transfer Learning: Fine-tune pre-trained models to mitigate imbalance.
    7. Evaluation: Use F1, precision-recall, and ROC-AUC instead of accuracy.
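
A minimal, illustrative sketch of strategies 2 (class weighting) and 3 (data augmentation) is shown below. It assumes the labels DataFrame loaded above and the training arrays created later in the notebook; the parameter values and the name class_weight_dict are arbitrary illustrations, not the project's chosen settings.

In [ ]:
# Sketch only: class weighting and on-the-fly augmentation for the minority classes.
# Assumes the `labels` DataFrame from above; the fit() calls are left commented out.
import numpy as np
from sklearn.utils.class_weight import compute_class_weight
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# (2) Class weighting: weight each class inversely to its frequency so the loss
#     emphasizes minority classes such as Maize and Common Wheat.
class_names = np.unique(labels['Label'])
weights = compute_class_weight(class_weight='balanced', classes=class_names, y=labels['Label'])
class_weight_dict = dict(enumerate(weights))   # index i corresponds to the i-th class in sorted label order
# model.fit(..., class_weight=class_weight_dict)

# (3) Data augmentation: random rotations, shifts, and flips increase the
#     effective diversity of the under-represented classes during training.
augmenter = ImageDataGenerator(rotation_range=20,
                               width_shift_range=0.1,
                               height_shift_range=0.1,
                               horizontal_flip=True)
# model.fit(augmenter.flow(X_train_normalized, y_train_encoded, batch_size=32), ...)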

Plotting random images from each of the classes¶

In [ ]:
def plot_images(images,labels):
  num_classes=12                                                                  # Number of classes in the dataset (not used below)
  categories=np.unique(labels)                                                    # Obtaining the unique class labels
  keys=dict(labels['Label'])                                                      # Mapping from image index to its label
  rows = 3                                                                        # Defining number of rows=3
  cols = 4                                                                        # Defining number of columns=4
  fig = plt.figure(figsize=(10, 8))                                               # Defining the figure size to 10x8
  for i in range(cols):
      for j in range(rows):
          random_index = np.random.randint(0, len(labels))                        # Generating random indices from the data and plotting the images
          ax = fig.add_subplot(rows, cols, i * rows + j + 1)                      # Adding subplots with 3 rows and 4 columns
          ax.imshow(images[random_index, :])                                      # Plotting the image
          ax.set_title(keys[random_index])
  plt.show()
In [ ]:
# images and labels alignments using the plot_images function
# plot the images with their corresponding labels
print("\nPlant labels displayed first, (12) plant images madeup by (12) 128 in height x 128 in width \npixel arrays and (BRG) 3 color channels:\n")
plot_images(images, labels)
Plant labels displayed first; (12) plant images, each made up of a 128 (height) x 128 (width) 
pixel array with 3 color (BGR) channels:

[Image output: 3 x 4 grid of randomly sampled plant images with their labels]

Observing the distribution of the image categories (target variable to predict)¶

In [ ]:
# Sort labels in descending order by count
sorted_labels = labels['Label'].value_counts().index

# Check for data imbalance with labels on y-axis in descending order
ax = sns.countplot(y=labels['Label'], order=sorted_labels)
total = len(labels)

# Annotate each bar with the count and relative percentage
for container in ax.containers:
    for bar in container:
        width = bar.get_width()
        count = int(width)
        percentage = f'{(width / total) * 100:.1f}%'
        ax.annotate(f'{count} ({percentage})',
                    xy=(width, bar.get_y() + bar.get_height() / 2),
                    xytext=(5, 0),  # Offset the text slightly to the right of the bar
                    textcoords="offset points",
                    ha='left', va='center')

plt.xlabel('Count-Images')
plt.ylabel('Label-Images')
plt.show()
[Image output: count plot of the label distribution, annotated with counts and percentages]

Class balance check: each class should have roughly 80-120% of the number of samples of the other classes:

In [ ]:
def check_class_balance(labels, label_column='Label'):
    """
    Identifies the level of class balance in a dataset based on predefined guidelines.

    Parameters:
    - labels (DataFrame or Series): Dataset containing the class labels.
    - label_column (str): The name of the column containing class labels (if `labels` is a DataFrame).

    Returns:
    - str: The level of balance (balanced, mildly imbalanced, heavily imbalanced, or extremely imbalanced).
    """

    # Get class counts
    if isinstance(labels, pd.DataFrame):
        class_counts = labels[label_column].value_counts()
    elif isinstance(labels, pd.Series):
        class_counts = labels.value_counts()
    else:
        raise ValueError("`labels` should be a DataFrame or Series.")

    # Calculate the ratio between the max and min class counts
    max_count = class_counts.max()
    min_count = class_counts.min()
    ratio = max_count / min_count

    # Define thresholds based on guidelines
    if ratio <= 1.2:
        balance_status = "Balanced"
    elif 1.2 < ratio <= 3:
        balance_status = "Mildly Imbalanced"
    elif 3 < ratio <= 5:
        balance_status = "Heavily Imbalanced"
    else:
        balance_status = "Extremely Imbalanced"

    # Print the summary and return the balance status
    # print(f"Class counts:\n{class_counts}\n")
    print(f"Max class count: {max_count}")
    print(f"Min class count: {min_count}")
    print(f"Class ratio (max/min): {ratio:.2f}")
    print(f"Balance status: \033[33m{balance_status}\033[0m")

    return balance_status

# Example usage:
# Assuming `labels` is a DataFrame with a column 'Label' that contains the class labels
balance_status = check_class_balance(labels, label_column='Label')

# Chosen criteria with ANSI formatting for yellow text
print("\nChosen Criteria Threshold:")
print("\033[33meach class should have roughly between 80-120% of the number of samples as the other classes.\033[0m")
Max class count: 654
Min class count: 221
Class ratio (max/min): 2.96
Balance status: Mildly Imbalanced

Chosen Criteria Threshold:
each class should have roughly between 80-120% of the number of samples as the other classes.

Observations:

  • There are 12 categories of images.
  • There are 4,750 images in the dataset.
  • The dataset shows a slight imbalance of categories, with the minority classes (Maize & Common Wheat) having 221 samples each and the majority class (Loose Silky-bent) having 654 images.

NOTE: A common guidance is that each class should have roughly 80-120% of the sample size of the others.

To address a slight imbalance, during model building we could include the following:

  • Performance metrics: Use F1 score, precision, recall, and AUC-PR to evaluate how well the model handles each class.
  • Balance evaluation: Apply (1) oversampling, (2) undersampling, or (3) class weights to see if model performance improves with more balanced data.

While slight imbalances are often acceptable, if the minority class is critical, such as identifying a plant that carries a rare toxin, higher accuracy for the minority class becomes essential and the model should be built accordingly. The use case dictates the direction and focus; a sketch of such class-aware evaluation is shown below.
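
The following is a minimal sketch of that class-aware evaluation. It assumes the trained model1, the normalized test set X_test_normalized, the one-hot encoded y_test_encoded, and the fitted LabelBinarizer enc that are created later in this notebook; it is illustrative and not part of the recorded runs.

In [ ]:
# Sketch of class-aware evaluation; relies on objects (model1, X_test_normalized,
# y_test_encoded, enc) that are defined further down in this notebook.
import numpy as np
from sklearn.metrics import classification_report

y_pred_probs = model1.predict(X_test_normalized)   # predicted probabilities, shape (n_samples, 12)
y_pred = np.argmax(y_pred_probs, axis=1)           # predicted class indices
y_true = np.argmax(y_test_encoded, axis=1)         # true class indices from the one-hot labels

# Per-class precision, recall, and F1 expose how the minority classes fare,
# which overall accuracy alone can hide.
print(classification_report(y_true, y_pred, target_names=enc.classes_))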

Data Pre-Processing¶

Convert the BGR images to RGB images¶

In [ ]:
# Iterates through and converts each image from BGR to RGB using cvtColor function of OpenCV.

import cv2

images_rgb = []
for i in range(len(images)):
    image_rgb = cv2.cvtColor(images[i], cv2.COLOR_BGR2RGB)
    images_rgb.append(image_rgb)

images_rgb = np.array(images_rgb) # Convert back to NumPy array
# Now 'images_rgb' contains the images in RGB format.  We can use this
# array in the rest of the code instead of the original 'images' BGR array.
print("\nImages converted from BGR to RGB still with their original dimensions 128 x 128 (\033[33mimages_rgb\033[0m)")
Images converted from BGR to RGB still with their original dimensions 128 x 128 (images_rgb)
In [ ]:
# Print one image in the original BGR and its corresponding image in RGB, with their labels, adjacent to each other

import matplotlib.pyplot as plt
import numpy as np
import cv2

# Function to display BGR original image vs RGB
def plot_original_and_rgb(images, images_rgb, labels):
    # Select a random index
    random_index = np.random.randint(0, len(labels))

    # Plot original image (BGR)
    plt.figure(figsize=(9, 4))
    plt.subplot(1, 2, 1)
    plt.imshow(images[random_index])  # Displayed as-is (BGR), so the red and blue channels appear swapped
    plt.title(f"Original (BGR) - {labels['Label'][random_index]}")


    # Plot converted image (RGB)
    plt.subplot(1, 2, 2)
    plt.imshow(images_rgb[random_index])
    plt.title(f"Converted (RGB) - {labels['Label'][random_index]}")

    plt.show()

# Show one
plot_original_and_rgb(images, images_rgb, labels)
[Image output: original (BGR) and converted (RGB) versions of a random image, side by side]

Resizing images without losing image details from 128x128 down to 64x64¶

Considering the size of the images, it may be computationally expensive to train on the full 128 x 128 images. We can reduce the image size by half, from 128 x 128 down to 64 x 64, and verify that we do not lose the critical image details needed to predict their labels.

In [ ]:
# Resize images_rgb to 64x64 and view their original images_rgb

import matplotlib.pyplot as plt
import numpy as np
import cv2

def display_resized_images_corner(images_rgb, labels):
    resized_images = []
    for img in images_rgb:
        resized_img = cv2.resize(img, (64, 64))
        resized_images.append(resized_img)
    resized_images = np.array(resized_images)

    fig, axes = plt.subplots(3, 3, figsize=(15, 15))
    for i in range(3):
        for j in range(3):
            index = np.random.randint(0, len(images_rgb))  # Random image index
            original_image = images_rgb[index].copy()
            resized_image = resized_images[index]

            # Calculate the position to place the resized image in the top right corner
            x_offset = original_image.shape[1] - resized_image.shape[1]
            y_offset = 0  # Top corner

            # Overlay the resized image onto the original image
            original_image[y_offset:y_offset + resized_image.shape[0], x_offset:x_offset + resized_image.shape[1]] = resized_image

            axes[i, j].imshow(original_image)
            axes[i, j].set_title(f"Label: {labels['Label'][index]}", fontsize=10)
            axes[i, j].set_yticks(np.arange(0, 129, 16)) # y-axis ticks every 16 pixels up to 128

            # Adding x-axis ticks for the resized (64x64) region within the larger image
            axes[i, j].set_xticks(np.arange(x_offset, original_image.shape[1], 8)) # ticks every 8 pixels over the 64-pixel overlay

    plt.tight_layout()
    plt.show()

# Assuming images_rgb and labels are defined from the previous code
display_resized_images_corner(images_rgb, labels)
[Image output: 3 x 3 grid of 128x128 images with their 64x64 resized versions overlaid in the top-right corner]

NOTICE:

  • Smaller 64x64 images were placed inside the original image (128x128) at the upper right side corner for comparisons.

Examine each image above: there is no detectable loss of detail despite the reduction in size.

In [ ]:
print("\n Display another 12 images, original vs resized:\n")
display_resized_images_overlay(images_rgb, labels)
 Display another 12 images, original vs resized:

[Image output: grid of 128x128 images with their 64x64 resized versions overlaid in the top-right corner]
In [ ]:
def display_resized_images_overlay(images_rgb, labels):
    resized_images = []
    for img in images_rgb:
        resized_img = cv2.resize(img, (64, 64))
        resized_images.append(resized_img)
    resized_images = np.array(resized_images)

    fig, axes = plt.subplots(3, 3, figsize=(15, 15))
    for i in range(3):
        for j in range(3):
            index = i * 3 + j
            if index < len(images_rgb):
                # Overlay the resized image onto the original image
                overlay_image = images_rgb[index].copy()
                x_offset = images_rgb[index].shape[1] - 64
                y_offset = 0  # Place in the upper right corner
                overlay_image[y_offset:y_offset + 64, x_offset:x_offset + 64] = resized_images[index]

                axes[i, j].imshow(overlay_image)
                axes[i, j].set_title(f"128x128 Image w/64x64 - {labels['Label'][index]}", fontsize=10)
                axes[i, j].set_yticks(np.arange(0, 129, 16))

    plt.tight_layout()
    plt.show()

display_resized_images_overlay(images_rgb, labels)
[Image output: grid of 128x128 images with their 64x64 resized versions overlaid in the top-right corner]
In [ ]:
# Create the label_dict to store counts of each label
label_dict = labels['Label'].value_counts().to_dict()

# Function to plot images with labels and corresponding value counts
def plot_images_with_value_counts(images, labels, label_dict):
    rows = 3  # Number of rows in the plot
    cols = 4  # Number of columns in the plot
    fig = plt.figure(figsize=(10, 8))  # Set figure size

    # Loop through rows and columns to plot images
    for i in range(rows):
        for j in range(cols):
            random_index = np.random.randint(0, len(images))  # Randomly select an index
            ax = fig.add_subplot(rows, cols, i * cols + j + 1)  # Add subplot in a grid
            ax.imshow(images[random_index])  # Plot the image

            # Get the label and count for the randomly selected image
            label = labels.iloc[random_index, 0]  # Access the first column of the DataFrame
            label_count = label_dict.get(label, 0)  # Get the count for this label from label_dict

            # Set title with label and its count
            ax.set_title(f"{label}\n{label_count} Images")

            # Set x and y axis ticks every 16 pixels across the 128-pixel image
            ax.set_xticks(np.arange(0, 128, 16))
            ax.set_yticks(np.arange(0, 128, 16))
            ax.set_xticklabels(np.arange(0, 128, 16))
            ax.set_yticklabels(np.arange(0, 128, 16))

    plt.tight_layout()  # Ensure proper spacing between subplots
    plt.show()

# Display RGB images with labels and label counts
plot_images_with_value_counts(images_rgb, labels, label_dict)

# Print request for observation
print("\n                                  Notice the image axis decreased from 128 down to 64:\n")
[Image output: 3 x 4 grid of random images with their labels and per-label counts]
                                  Notice the image axis decreased from 128 down to 64:

Observations:

  • Some images seem somewhat blurred at 64, but their prominent characteristics remain.
  • You can observe the size of the array (shape of the matrix) is 64 x 64 pixels. (l x h)
  • 12 images get displayed randomly.

Compare some images before and after resizing¶

In [ ]:
import cv2
import numpy as np
import matplotlib.pyplot as plt

# Assuming 'images' and 'labels' are already loaded into your environment
# Replace 'images' and 'labels' with your actual image and label data

# Resize images from 128x128 to 64x64
images_resized = []
height = 64
width = 64
dimensions = (width, height)
for i in range(len(images)):
    resized_image = cv2.resize(images[i], dimensions, interpolation=cv2.INTER_LINEAR)
    images_resized.append(resized_image)

images_resized = np.array(images_resized)  # Convert to a NumPy array

# Convert both original and resized images from BGR to RGB
images_rgb = []
images_resized_rgb = []
for i in range(len(images)):
    image_rgb = cv2.cvtColor(images[i], cv2.COLOR_BGR2RGB)
    images_rgb.append(image_rgb)
    resized_image_rgb = cv2.cvtColor(images_resized[i], cv2.COLOR_BGR2RGB)
    images_resized_rgb.append(resized_image_rgb)

images_rgb = np.array(images_rgb)  # Original images in RGB format
images_resized_rgb = np.array(images_resized_rgb)  # Resized images in RGB format

# Count the value frequency of labels (number of occurrences of each label)
label_counts = np.unique(labels, return_counts=True)  # Get unique labels and their counts
label_dict = dict(zip(label_counts[0], label_counts[1]))  # Create a dictionary for label counts

# Function to plot both original and resized images with labels and corresponding value counts
def plot_images_with_value_counts(original_images, resized_images, labels):
    rows = 4  # Number of rows in the plot
    cols = 3  # Number of columns in the plot
    fig = plt.figure(figsize=(14,10))  # Increase figure size to accommodate side-by-side images

    # Loop through rows and columns to plot images
    for i in range(cols):
        for j in range(rows):
            random_index = np.random.randint(0, len(original_images))  # Randomly select an index
            # Add subplot for original image
            ax = fig.add_subplot(rows, cols * 2, (i * rows + j) * 2 + 1)
            ax.imshow(original_images[random_index])  # Plot the original image
            ax.axis('on')  # Turn on axes

            # Get the label and count for the randomly selected image
            label = labels['Label'].iloc[random_index]  # Look up the label string for the selected index (labels is a DataFrame)
            label_count = label_dict.get(label, 0)  # Get the count for this label from label_dict

            # Set title with label and its count for original image
            ax.set_title(f"Original\n {label}\n{label_count} Images")

            # Add subplot for resized image
            ax_resized = fig.add_subplot(rows, cols * 2, (i * rows + j) * 2 + 2)
            ax_resized.imshow(resized_images[random_index])  # Plot the resized image
            ax_resized.axis('on')  # Turn on axes

            # Set title with label and its count for resized image
            ax_resized.set_title(f"Resized\n {label}\n{label_count} Images")

            # Set ticks for the x and y axis from 0 to 64 (same for both)
            ax.set_xticks(np.arange(0, 65, 16))  # Set x-axis ticks at intervals of 16 (up to 64)
            ax.set_yticks(np.arange(0, 65, 16))  # Set y-axis ticks at intervals of 16 (up to 64)
            ax_resized.set_xticks(np.arange(0, 65, 16))
            ax_resized.set_yticks(np.arange(0, 65, 16))

            # Label the ticks (matches the tick positions)
            ax.set_xticklabels(np.arange(0, 65, 16))
            ax.set_yticklabels(np.arange(0, 65, 16))
            ax_resized.set_xticklabels(np.arange(0, 65, 16))
            ax_resized.set_yticklabels(np.arange(0, 65, 16))

    plt.tight_layout()  # Ensure proper spacing between subplots
    plt.show()

# Display both original and resized RGB images with labels and label counts
plot_images_with_value_counts(images_rgb, images_resized_rgb, labels)
[Image output: original and resized versions of random images shown side by side with labels and counts]
In [ ]:
import cv2
import numpy as np
import matplotlib.pyplot as plt

# Convert images from BGR to RGB and resize to 64x64
images_rgb_128 = []
images_rgb_64 = []

for i in range(len(images)):
    # Convert to RGB
    image_rgb = cv2.cvtColor(images[i], cv2.COLOR_BGR2RGB)
    images_rgb_128.append(image_rgb)  # Original 128x128 images

    # Resize to 64x64
    image_resized = cv2.resize(image_rgb, (64, 64))
    images_rgb_64.append(image_resized)  # Resized 64x64 images

# Convert lists back to NumPy arrays
images_rgb_128 = np.array(images_rgb_128)
images_rgb_64 = np.array(images_rgb_64)

# Function to plot a grid of 16 images
def plot_images(images, title, size=(128, 128)):
    plt.figure(figsize=(10, 10))
    for i in range(16):
        plt.subplot(4, 4, i + 1)
        plt.imshow(images[i] if size == (128, 128) else cv2.resize(images[i], (128, 128)))
        plt.axis('off')
    plt.suptitle(title, fontsize=16)
    plt.show()

# Display the original 128x128 images
plot_images(images_rgb_128, "Original 128x128 Images", size=(128, 128))

# Display the resized 64x64 images (scaled back up to 128x128 for easy comparison)
plot_images(images_rgb_64, "Resized 64x64 Images (Displayed at 128x128)", size=(64, 64))
[Image output: grid of the original 128x128 images]
[Image output: grid of the resized 64x64 images, displayed at 128x128]

Taking a closer look at a couple of images before resizing and after:¶

In [ ]:
print("\n                                                             NOTICE (axes tick marks):")
print("                                      Pixel size (l x w) was cut in half but essential characteristics remain:\n ")

# Display both images side-by-side
plt.figure(figsize=(16,8))

# Show the original image
plt.subplot(1, 2, 1)
plt.title(f"RGB Image BEFORE resizing\nLabel:{labels[7]}")
plt.imshow(images_rgb[7])

# Show the resized image
plt.subplot(1, 2, 2)
plt.title("Image after resizing")
plt.title(f"RGB Image AFTER resizing\nLabel:{labels[7]}")
plt.imshow(images_resized_rgb[7])

plt.show()
                                                             NOTICE (axes tick marks):
.                                      Pixel size (l x w) was cut in half but essential characteristics remain:
 
[Image output: image 7 before and after resizing, side by side]
In [ ]:
print("\n                                                             NOTICE (axes tick marks):")
print(".                                      Pixel size (l x w) was cut in half but essential characteristics remain:\n ")

# Display both images side-by-side
plt.figure(figsize=(16, 8))

# Show the original image
plt.subplot(1, 2, 1)
plt.title(f"RGB Image BEFORE resizing\nLabel: {labels[500]}")
plt.imshow(images_rgb[500])

# Show the resized image
plt.subplot(1, 2, 2)
plt.title(f"RGB Image AFTER resizing\nLabel: {labels[500]}")  # Set title with the corresponding label
plt.imshow(images_resized_rgb[500])

plt.show()
                                                             NOTICE (axes tick marks):
.                                      Pixel size (l x w) was cut in half but essential characteristics remain:
 
[Image output: image 500 before and after resizing, side by side]

Data Preparation for Modeling¶

  • We will use 10% of our data for testing, 10% of our data for validation and 80% of our data for training.
  • We are using the train_test_split() function from scikit-learn to split the dataset into three datasets, training, validation and testing.
  • The performance will be measured on the validation and eventually the test datasets.
In [ ]:
# We split the dataset into three parts, Training, Validating and Testing datasets.
from sklearn.model_selection import train_test_split

# images_rgb_64 - numpy array
X_train, X_test, y_train, y_test = train_test_split(images_rgb_64, labels, test_size=0.1, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.111, random_state=42) # 0.111 x 0.9 = 0.1
# Now we have:
# - X_train, y_train: Training data and labels (80% of the original data)
# - X_val, y_val: Validation data and labels (10% of the original data)
# - X_test, y_test: Testing data and labels (10% of the original data)
In [ ]:
# Check the shape of train, validation and test data
print("\nShape of training dataset:")
print(np.array(X_train).shape,np.array(y_train).shape) # Convert X_train and y_train to NumPy arrays before accessing shape
print("\nShape of validation dataset:")
print(np.array(X_val).shape,np.array(y_val).shape)   # Convert X_val and y_val to NumPy arrays before accessing shape
print("\nShape of testing dataset:")
print(np.array(X_test).shape,np.array(y_test).shape)  # Convert X_test and y_test to NumPy arrays before accessing shape
Shape of training dataset:
(3800, 64, 64, 3) (3800, 1)

Shape of validation dataset:
(475, 64, 64, 3) (475, 1)

Shape of testing dataset:
(475, 64, 64, 3) (475, 1)

Encoding the target labels¶

In [ ]:
# Convert labels from names to one-hot vectors.
# We have already used encoding methods such as
# - OneHotEncoder
# - LabelEncoder
# Now we will use another encoding method: LabelBinarizer.
# LabelBinarizer works similarly to OneHotEncoder.

print("\nWe will use encoding method: LabelBinarizer")
print("LabelBinarizer works similarly to OneHotEncoder")

from sklearn.preprocessing import LabelBinarizer

enc = LabelBinarizer()                              # Initialize the LabelBinarizer
y_train_encoded = enc.fit_transform(y_train)        # Fit on y_train and transform it to one-hot vectors
y_val_encoded = enc.transform(y_val)                # Transform y_val with the fitted encoder
y_test_encoded = enc.transform(y_test)              # Transform y_test with the fitted encoder
We will use encoding method: LabelBinarizer
LabelBinarizer works similarly to OneHotEncoder
In [ ]:
print("Check the shape of train, validation and test data:")
print(y_train_encoded.shape,y_val_encoded.shape,y_test_encoded.shape) # Check the shape of train, validation and test data
print("\nShape of training dataset:",np.array(y_train_encoded).shape) # Convert X_train and y_train to NumPy arrays before accessing shape
print("\nShape of validation dataset:",np.array(y_val_encoded).shape) # Convert X_val and y_val to NumPy arrays before accessing shape
print("\nShape of testing dataset:",np.array(y_test_encoded).shape)   # Convert X_test and y_test to NumPy arrays before accessing shape
Check the shape of train, validation and test data:
(3800, 12) (475, 12) (475, 12)

Shape of training dataset: (3800, 12)

Shape of validation dataset: (475, 12)

Shape of testing dataset: (475, 12)

Data Normalization¶

Since the image pixel values range from 0-255, our method of normalization here will be scaling:

  • divide all the pixel values by 255 so that the images have values between 0 and 1.
In [ ]:
# Since the image pixel values range from 0-255, our method of normalization here will be scaling - we shall divide all the pixel values by 255 to standardize the images to have values between 0-1.

# Normalize the image pixels of train, test and validation data
X_train_normalized = np.array(X_train) / 255.0
X_val_normalized = np.array(X_val) / 255.0
X_test_normalized = np.array(X_test) / 255.0

Model Building¶

In [ ]:
# Clearing backend:
backend.clear_session()
In [ ]:
# Fixing the seed for random number generators:
np.random.seed(42)
random.seed(42)
tf.random.set_seed(42)
In [ ]:
# Purpose: Initializes a Sequential model, which allows us to stack layers in a linear order, from input to output.
#
#      This model setup combines 3 convolutional layers, 3 corresponding pooling layers, and fully connected layers with
#      an optimizer compiling with a loss function suitable for multi-class classification to create
#      a sequential structure that can learn hierarchical features from images for classification.

model1 = Sequential()  # Initialize the sequential model

# ------------------------------ 1st --------------------------------------|
# Add first convolutional layer with 128 filters, kernel size 3x3, and 'same' padding:
# The input_shape defines the dimensions of the input image that the model expects.
# This input shape is independent of the number of filters in the convolutional layer.
# Depth changes -only- with convolutional layers based on the number of filters.
# >>  Input Shape: (64, 64, 3) (height, width, color channels). Here, 64x64 is the image dimension, and 3 channels represent RGB color.
#      •	Filters: 128, meaning the layer will learn 128 different features of the image using the following configuration.
#      •	Kernel Size: (3, 3), which determines that each filter will slide over a 3x3 region at a time.
#      •	Padding: "same" ensures that the output size (l x w) remains the same as the input, so the output remains (64, 64, 128) and a depth = # filters.
#      •	Activation: ReLU applies a rectified linear activation to introduce non-linearity.
model1.add(Conv2D(128, (3, 3), activation='relu', padding="same", input_shape=(64, 64, 3)))
# >>  Output Shape: (64, 64, 128) - The height and width stay the same, but the depth changes from 3 (RGB) channels to 128 due to the number of filters.


# ------------------------------ Pooling Layer for 1st Convolutional Layer
# Pooling layers, such as max pooling or average pooling, are used primarily to reduce the spatial dimensions (height and width) of the feature maps.
# This dimensionality reduction helps make the model more computationally efficient, it reduces overfitting, and it allows it to learn more generalized features.
# Add a max pooling layer to reduce the first convolutionary layer array size in half:
#	     •	Pooling Size: (2, 2), which reduces the height and width by half by taking the maximum value in each 2x2 region.
#	     •	Padding: "same" keeps the dimensions from dropping unevenly.
model1.add(MaxPooling2D((2, 2), padding='same'))
# >>  Output Shape: (32, 32, 128) Height and width reduced by 1/2 (64 x 64 down to 32 x 32) while depth remains at 128 based on the previous convolution layer filters.


# ------------------------------ 2nd --------------------------------------|
# Add second convolutional layer with 64 filters and max pooling layer:
# >>  Input Shape: (32, 32, 128)
#	     •	Filters: 64, meaning this layer will learn 64 different features.
#	     •	Kernel Size: (3, 3), scanning each 3x3 region.
#	     •	Padding: "same" keeps height and width the same.
#	     •	Activation: ReLU introduces non-linearity.
model1.add(Conv2D(64, (3, 3), activation='relu', padding="same"))
# >>  Output Shape: (32, 32, 64) - The height and width stay at 32, while the depth is reduced to 64 due to the number of filters.


# ------------------------------ Pooling Layer for 2nd Convolutional Layer
# Add the max pooling layer to reduce the output size of the 2nd convolutional layer:
#      •	Pooling Size: (2, 2), reducing the height and width by half again.
model1.add(MaxPooling2D((2, 2), padding='same'))
# >>  Output Shape: (16, 16, 64) - Height and width are reduced to 16, while depth remains 64.


# ----------------------------- 3rd ---------------------------------------|
# Add the third convolutional layer with 32 filters and max pooling layer:
# >>  Input Shape: (16, 16, 64)
#	     •	Filters: 32, meaning this layer will learn 32 different features.
#	     •	Kernel Size: (3, 3)
#	     •	Padding: "same" keeps the dimensions unchanged.
#	     •	Activation: ReLU introduces non-linearity.
model1.add(Conv2D(32, (3, 3), activation='relu', padding="same"))
# >>  Output Shape: (16, 16, 32) - The height and width stay at 16, and depth is reduced to 32 due to the number of filters.


# ------------------------------ Pooling Layer for 3rd Convolutional Layer
#	     •	Pooling Size: (2, 2), reducing the height and width by half.
model1.add(MaxPooling2D((2, 2), padding='same'))
# >>  Output Shape: (8, 8, 32) - The height and width are now 8, while the depth stays 32.


# ----------------------------- Flattening --------------------------------|
# Flatten the output to prepare for the fully connected dense layers.
# Purpose: Converts the 3D output into a 1D vector for the dense layers.
# >>  Input Shape: (8, 8, 32)
model1.add(Flatten())
# >>  Output Shape: (8 * 8 * 32) = 2048, a flattened vector of size 2048.


# -------------------------- 1st Fully Connected NN ------------------------|
# Add a fully connected dense layer with 16 neurons:
#	     •	Purpose: Adds a fully connected layer with 16 neurons.
#	     •	Activation: ReLU introduces non-linearity.
#	     •	Input Shape: 2048
model1.add(Dense(16, activation='relu'))
# >>  Output Shape: 16, which represents 16 features.


# -------------------------- Dropout Layer
#	     •	Purpose: Drops 30% of neurons randomly during training, which helps prevent overfitting.
model1.add(Dropout(0.3))
# >>  Output Shape: Still 16 (only a regularization effect, no shape change).


# -------------------------- 2nd Fully Connected NN ------------------------|
# Add the output layer with 12 neurons and softmax activation for multi-class classification
#	Purpose: Outputs predictions for a 12-class classification problem.
# >>  Input Shape: 16
#	     •	Activation: Softmax to output probabilities for each of the 12 classes.
model1.add(Dense(12, activation='softmax'))
# >>  Output Shape: 12, representing the 12 class probabilities.


# -------------------------- NN Optimizer
# Initialize the Adam optimizer
#	•	Optimizer: Adam, an efficient and commonly used optimizer.
#	•	Loss Function: categorical_crossentropy for multi-class classification.
#	•	Metric: accuracy to evaluate model performance.
opt = Adam()


# -------------------------- NN Compile
# Compile model with the optimizer and loss function suitable for multi-class classification:
model1.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])


# -------------------------- Model Summary
# Generate the model summary
model1.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ conv2d (Conv2D)                      │ (None, 64, 64, 128)         │           3,584 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d (MaxPooling2D)         │ (None, 32, 32, 128)         │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2d_1 (Conv2D)                    │ (None, 32, 32, 64)          │          73,792 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_1 (MaxPooling2D)       │ (None, 16, 16, 64)          │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2d_2 (Conv2D)                    │ (None, 16, 16, 32)          │          18,464 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_2 (MaxPooling2D)       │ (None, 8, 8, 32)            │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ flatten (Flatten)                    │ (None, 2048)                │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense (Dense)                        │ (None, 16)                  │          32,784 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout (Dropout)                    │ (None, 16)                  │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 12)                  │             204 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 128,828 (503.23 KB)
 Trainable params: 128,828 (503.23 KB)
 Non-trainable params: 0 (0.00 B)

Architecture of model1 = Sequential() layers in a linear, "sequential" order, from input to output:¶

[Image: model1_.jpg, layer-by-layer architecture of model1]

The architecture schematic below was drawn with the NN-SVG online tool (https://alexlenail.me/NN-SVG/LeNet.html):

[Image: model1_network.jpg, NN-SVG schematic of the model1 network]

Fitting the model on the train data

In [ ]:
# Fit the model on the training data
history_1 = model1.fit(
    X_train_normalized,  # Input data for training
    y_train_encoded,     # Corresponding labels for training
    epochs=30,           # Number of epochs for training
    validation_data=(X_val_normalized, y_val_encoded),  # Validation data
    batch_size=32,       # Batch size for training
    verbose=2            # Verbosity mode
)
Epoch 1/30
119/119 - 16s - 138ms/step - accuracy: 0.1342 - loss: 2.4462 - val_accuracy: 0.2358 - val_loss: 2.3607
Epoch 2/30
119/119 - 1s - 9ms/step - accuracy: 0.2042 - loss: 2.2834 - val_accuracy: 0.2737 - val_loss: 2.0487
Epoch 3/30
119/119 - 1s - 9ms/step - accuracy: 0.2597 - loss: 2.0837 - val_accuracy: 0.3137 - val_loss: 1.9253
Epoch 4/30
119/119 - 1s - 9ms/step - accuracy: 0.2792 - loss: 1.9790 - val_accuracy: 0.3453 - val_loss: 1.8321
Epoch 5/30
119/119 - 1s - 9ms/step - accuracy: 0.3068 - loss: 1.9156 - val_accuracy: 0.3874 - val_loss: 1.7766
Epoch 6/30
119/119 - 1s - 9ms/step - accuracy: 0.3263 - loss: 1.8654 - val_accuracy: 0.4000 - val_loss: 1.6814
Epoch 7/30
119/119 - 1s - 9ms/step - accuracy: 0.3292 - loss: 1.8248 - val_accuracy: 0.4253 - val_loss: 1.6298
Epoch 8/30
119/119 - 1s - 9ms/step - accuracy: 0.3353 - loss: 1.7964 - val_accuracy: 0.4316 - val_loss: 1.5866
Epoch 9/30
119/119 - 1s - 9ms/step - accuracy: 0.3379 - loss: 1.7784 - val_accuracy: 0.4526 - val_loss: 1.6032
Epoch 10/30
119/119 - 1s - 9ms/step - accuracy: 0.3500 - loss: 1.7374 - val_accuracy: 0.4800 - val_loss: 1.5618
Epoch 11/30
119/119 - 1s - 9ms/step - accuracy: 0.3558 - loss: 1.7365 - val_accuracy: 0.4926 - val_loss: 1.5383
Epoch 12/30
119/119 - 1s - 9ms/step - accuracy: 0.3571 - loss: 1.7344 - val_accuracy: 0.4905 - val_loss: 1.4823
Epoch 13/30
119/119 - 1s - 9ms/step - accuracy: 0.3732 - loss: 1.6791 - val_accuracy: 0.5032 - val_loss: 1.4724
Epoch 14/30
119/119 - 1s - 9ms/step - accuracy: 0.3792 - loss: 1.6583 - val_accuracy: 0.5095 - val_loss: 1.4823
Epoch 15/30
119/119 - 1s - 9ms/step - accuracy: 0.3782 - loss: 1.6553 - val_accuracy: 0.4947 - val_loss: 1.4638
Epoch 16/30
119/119 - 1s - 9ms/step - accuracy: 0.3811 - loss: 1.6313 - val_accuracy: 0.5221 - val_loss: 1.4063
Epoch 17/30
119/119 - 1s - 9ms/step - accuracy: 0.3916 - loss: 1.6129 - val_accuracy: 0.5263 - val_loss: 1.4212
Epoch 18/30
119/119 - 1s - 9ms/step - accuracy: 0.3863 - loss: 1.6285 - val_accuracy: 0.4926 - val_loss: 1.4715
Epoch 19/30
119/119 - 1s - 9ms/step - accuracy: 0.3903 - loss: 1.5963 - val_accuracy: 0.5221 - val_loss: 1.4296
Epoch 20/30
119/119 - 1s - 9ms/step - accuracy: 0.3937 - loss: 1.5981 - val_accuracy: 0.5326 - val_loss: 1.3828
Epoch 21/30
119/119 - 1s - 9ms/step - accuracy: 0.3892 - loss: 1.5793 - val_accuracy: 0.5453 - val_loss: 1.3792
Epoch 22/30
119/119 - 1s - 9ms/step - accuracy: 0.3987 - loss: 1.5622 - val_accuracy: 0.5579 - val_loss: 1.3280
Epoch 23/30
119/119 - 1s - 9ms/step - accuracy: 0.4353 - loss: 1.4583 - val_accuracy: 0.5768 - val_loss: 1.2848
Epoch 24/30
119/119 - 1s - 9ms/step - accuracy: 0.4574 - loss: 1.4246 - val_accuracy: 0.5663 - val_loss: 1.3093
Epoch 25/30
119/119 - 1s - 9ms/step - accuracy: 0.4689 - loss: 1.3725 - val_accuracy: 0.6000 - val_loss: 1.2182
Epoch 26/30
119/119 - 1s - 9ms/step - accuracy: 0.4813 - loss: 1.3401 - val_accuracy: 0.5663 - val_loss: 1.3145
Epoch 27/30
119/119 - 1s - 9ms/step - accuracy: 0.4900 - loss: 1.3103 - val_accuracy: 0.5811 - val_loss: 1.2526
Epoch 28/30
119/119 - 1s - 9ms/step - accuracy: 0.4916 - loss: 1.2990 - val_accuracy: 0.5684 - val_loss: 1.2699
Epoch 29/30
119/119 - 1s - 9ms/step - accuracy: 0.4903 - loss: 1.2922 - val_accuracy: 0.6021 - val_loss: 1.2133
Epoch 30/30
119/119 - 1s - 9ms/step - accuracy: 0.5108 - loss: 1.2570 - val_accuracy: 0.6063 - val_loss: 1.2072

Model Evaluation

In [ ]:
plt.plot(history_1.history['accuracy'])
plt.plot(history_1.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
[Image output: training vs. validation accuracy over the 30 epochs]

Observations:

  • Notice the model is underfitting the training data and would benefit from adjustments and some fine-tuning.
  • Consider the following (a callback sketch is shown after this list):
    • Increase Model Complexity: Try a more complex model or add more features.
    • Reduce Regularization: If applicable, reduce regularization to allow the model to fit the training data better.
    • Train Longer: Allow more training epochs, especially if the model is still learning.
    • Clean Data: Ensure the training data quality is high and labels are correct.
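
As a minimal, illustrative sketch (not part of the recorded runs below), the continued training could also use callbacks: ReduceLROnPlateau is already imported at the top of the notebook, and EarlyStopping is an assumed extra from tensorflow.keras.callbacks. The parameter values are arbitrary starting points.

In [ ]:
# Sketch only: callbacks that could support the longer training run below.
# ReduceLROnPlateau was imported earlier; EarlyStopping is an additional, illustrative import.
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping

callbacks = [
    # Halve the learning rate when validation loss has not improved for 3 epochs
    ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=3, min_lr=1e-6),
    # Stop training and restore the best weights if validation loss stalls for 8 epochs
    EarlyStopping(monitor='val_loss', patience=8, restore_best_weights=True),
]

# Example usage (commented out so it does not alter the runs recorded below):
# history = model1.fit(X_train_normalized, y_train_encoded,
#                      validation_data=(X_val_normalized, y_val_encoded),
#                      epochs=50, batch_size=64, callbacks=callbacks, verbose=2)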

Same Model, more Epochs and larger Batch size:

  • The batch_size parameter specifies the number of samples processed before the model’s weights are updated. Here, batch_size=64 means the model will process 64 samples as opposed to 32 samples at a time and then update the weights, which can improve training efficiency and convergence stability.
In [ ]:
# Fit the model on the training data
history_2 = model1.fit(
    X_train_normalized,  # Input data for training
    y_train_encoded,     # Corresponding labels for training
    epochs=50,           # Number of epochs for training
    validation_data=(X_val_normalized, y_val_encoded),  # Validation data
    batch_size=64,       # Batch size for training
    verbose=2            # Verbosity mode
)
Epoch 1/50
60/60 - 1s - 18ms/step - accuracy: 0.8216 - loss: 0.4789 - val_accuracy: 0.7074 - val_loss: 3.3411
Epoch 2/50
60/60 - 1s - 17ms/step - accuracy: 0.8213 - loss: 0.4628 - val_accuracy: 0.6842 - val_loss: 3.7541
Epoch 3/50
60/60 - 1s - 17ms/step - accuracy: 0.8324 - loss: 0.4350 - val_accuracy: 0.6758 - val_loss: 3.6137
Epoch 4/50
60/60 - 1s - 17ms/step - accuracy: 0.8218 - loss: 0.4846 - val_accuracy: 0.7032 - val_loss: 3.4823
Epoch 5/50
60/60 - 1s - 17ms/step - accuracy: 0.8276 - loss: 0.4537 - val_accuracy: 0.6863 - val_loss: 3.6213
Epoch 6/50
60/60 - 1s - 17ms/step - accuracy: 0.8318 - loss: 0.4448 - val_accuracy: 0.7032 - val_loss: 3.3892
Epoch 7/50
60/60 - 1s - 17ms/step - accuracy: 0.8303 - loss: 0.4548 - val_accuracy: 0.6884 - val_loss: 3.5873
Epoch 8/50
60/60 - 1s - 17ms/step - accuracy: 0.8237 - loss: 0.4681 - val_accuracy: 0.6968 - val_loss: 3.4013
Epoch 9/50
60/60 - 1s - 18ms/step - accuracy: 0.8234 - loss: 0.4851 - val_accuracy: 0.6989 - val_loss: 3.3771
Epoch 10/50
60/60 - 1s - 19ms/step - accuracy: 0.8311 - loss: 0.4609 - val_accuracy: 0.6947 - val_loss: 3.3944
Epoch 11/50
60/60 - 1s - 18ms/step - accuracy: 0.8300 - loss: 0.4369 - val_accuracy: 0.6989 - val_loss: 3.3762
Epoch 12/50
60/60 - 1s - 18ms/step - accuracy: 0.8282 - loss: 0.4677 - val_accuracy: 0.6968 - val_loss: 3.4918
Epoch 13/50
60/60 - 1s - 18ms/step - accuracy: 0.8439 - loss: 0.4185 - val_accuracy: 0.6842 - val_loss: 3.6583
Epoch 14/50
60/60 - 1s - 18ms/step - accuracy: 0.8408 - loss: 0.4406 - val_accuracy: 0.6989 - val_loss: 3.6270
Epoch 15/50
60/60 - 1s - 17ms/step - accuracy: 0.8242 - loss: 0.4479 - val_accuracy: 0.6863 - val_loss: 3.8325
Epoch 16/50
60/60 - 1s - 17ms/step - accuracy: 0.8303 - loss: 0.4516 - val_accuracy: 0.7074 - val_loss: 3.2295
Epoch 17/50
60/60 - 1s - 17ms/step - accuracy: 0.8368 - loss: 0.4360 - val_accuracy: 0.6989 - val_loss: 3.4565
Epoch 18/50
60/60 - 1s - 17ms/step - accuracy: 0.8232 - loss: 0.4682 - val_accuracy: 0.6779 - val_loss: 3.8297
Epoch 19/50
60/60 - 1s - 17ms/step - accuracy: 0.8176 - loss: 0.5035 - val_accuracy: 0.6800 - val_loss: 4.0774
Epoch 20/50
60/60 - 1s - 17ms/step - accuracy: 0.8142 - loss: 0.4762 - val_accuracy: 0.6779 - val_loss: 4.1806
Epoch 21/50
60/60 - 1s - 17ms/step - accuracy: 0.8332 - loss: 0.4559 - val_accuracy: 0.6737 - val_loss: 3.8696
Epoch 22/50
60/60 - 1s - 17ms/step - accuracy: 0.8395 - loss: 0.4395 - val_accuracy: 0.6926 - val_loss: 3.9996
Epoch 23/50
60/60 - 1s - 17ms/step - accuracy: 0.8371 - loss: 0.4246 - val_accuracy: 0.6947 - val_loss: 3.8634
Epoch 24/50
60/60 - 1s - 17ms/step - accuracy: 0.8379 - loss: 0.4367 - val_accuracy: 0.6842 - val_loss: 3.5818
Epoch 25/50
60/60 - 1s - 17ms/step - accuracy: 0.8374 - loss: 0.4346 - val_accuracy: 0.6947 - val_loss: 4.2780
Epoch 26/50
60/60 - 1s - 17ms/step - accuracy: 0.8276 - loss: 0.4572 - val_accuracy: 0.6863 - val_loss: 3.4013
Epoch 27/50
60/60 - 1s - 17ms/step - accuracy: 0.8274 - loss: 0.4618 - val_accuracy: 0.6947 - val_loss: 3.9802
Epoch 28/50
60/60 - 1s - 17ms/step - accuracy: 0.8297 - loss: 0.4428 - val_accuracy: 0.6821 - val_loss: 4.3341
Epoch 29/50
60/60 - 1s - 17ms/step - accuracy: 0.8316 - loss: 0.4403 - val_accuracy: 0.6779 - val_loss: 4.2158
Epoch 30/50
60/60 - 1s - 17ms/step - accuracy: 0.8382 - loss: 0.4367 - val_accuracy: 0.6863 - val_loss: 3.8377
Epoch 31/50
60/60 - 1s - 17ms/step - accuracy: 0.8332 - loss: 0.4367 - val_accuracy: 0.6589 - val_loss: 3.7592
Epoch 32/50
60/60 - 1s - 17ms/step - accuracy: 0.8355 - loss: 0.4444 - val_accuracy: 0.6905 - val_loss: 3.3657
Epoch 33/50
60/60 - 1s - 16ms/step - accuracy: 0.8305 - loss: 0.4465 - val_accuracy: 0.6821 - val_loss: 3.9056
Epoch 34/50
60/60 - 1s - 16ms/step - accuracy: 0.8274 - loss: 0.4551 - val_accuracy: 0.6758 - val_loss: 4.0700
Epoch 35/50
60/60 - 1s - 17ms/step - accuracy: 0.8382 - loss: 0.4390 - val_accuracy: 0.6884 - val_loss: 3.6945
Epoch 36/50
60/60 - 1s - 16ms/step - accuracy: 0.8382 - loss: 0.4261 - val_accuracy: 0.6737 - val_loss: 3.1645
Epoch 37/50
60/60 - 1s - 16ms/step - accuracy: 0.8284 - loss: 0.4632 - val_accuracy: 0.6884 - val_loss: 3.3348
Epoch 38/50
60/60 - 1s - 16ms/step - accuracy: 0.8313 - loss: 0.4508 - val_accuracy: 0.6632 - val_loss: 3.9374
Epoch 39/50
60/60 - 1s - 16ms/step - accuracy: 0.8363 - loss: 0.4319 - val_accuracy: 0.6800 - val_loss: 3.6244
Epoch 40/50
60/60 - 1s - 17ms/step - accuracy: 0.8453 - loss: 0.4173 - val_accuracy: 0.6863 - val_loss: 3.7661
Epoch 41/50
60/60 - 1s - 16ms/step - accuracy: 0.8466 - loss: 0.4099 - val_accuracy: 0.6800 - val_loss: 4.3726
Epoch 42/50
60/60 - 1s - 16ms/step - accuracy: 0.8355 - loss: 0.4416 - val_accuracy: 0.6821 - val_loss: 3.9420
Epoch 43/50
60/60 - 1s - 16ms/step - accuracy: 0.8342 - loss: 0.4415 - val_accuracy: 0.6905 - val_loss: 3.9105
Epoch 44/50
60/60 - 1s - 16ms/step - accuracy: 0.8313 - loss: 0.4552 - val_accuracy: 0.6653 - val_loss: 3.4856
Epoch 45/50
60/60 - 1s - 16ms/step - accuracy: 0.8282 - loss: 0.4539 - val_accuracy: 0.6800 - val_loss: 3.5754
Epoch 46/50
60/60 - 1s - 16ms/step - accuracy: 0.8321 - loss: 0.4461 - val_accuracy: 0.6779 - val_loss: 4.0342
Epoch 47/50
60/60 - 1s - 16ms/step - accuracy: 0.8400 - loss: 0.4444 - val_accuracy: 0.6737 - val_loss: 4.0491
Epoch 48/50
60/60 - 1s - 16ms/step - accuracy: 0.8411 - loss: 0.4101 - val_accuracy: 0.6821 - val_loss: 4.0546
Epoch 49/50
60/60 - 1s - 16ms/step - accuracy: 0.8387 - loss: 0.4353 - val_accuracy: 0.6905 - val_loss: 3.6457
Epoch 50/50
60/60 - 1s - 16ms/step - accuracy: 0.8324 - loss: 0.4486 - val_accuracy: 0.6905 - val_loss: 3.7013

Model Evaluation after adjusting Epochs and Batch Size

In [ ]:
import matplotlib.pyplot as plt

plt.plot(history_2.history['accuracy'])
plt.plot(history_2.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
[Figure: Model Accuracy plot for the second run, training vs. validation accuracy per epoch]

Observations:

  • The training accuracy increased with the additional epochs and the larger batch size.
  • However, the gap between the training accuracy (~0.83) and the validation accuracy (~0.69), together with the large validation loss (above 3.0 throughout), now indicates overfitting.
  • This run performs better on the training data than the first run, but the overfitting is worrisome; the following sections address it with data augmentation, batch normalization, dropout, and learning-rate scheduling.

Evaluate the model on test data

In [ ]:
# Evaluate the model on test data
loss, accuracy = model1.evaluate(X_test_normalized, y_test_encoded, verbose=0)
print(f"Test Loss: {loss:.4f}")
print(f"Test Accuracy: {accuracy:.4f}")
Test Loss: 4.1427
Test Accuracy: 0.7053

Observations:

  • The test accuracy (~0.71) is in the same range as the validation accuracy (~0.69).

Plotting the Confusion Matrix

In [ ]:
# Confusion Matrix and all performance metrics on test dataset

import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import confusion_matrix, classification_report

# Predict on the test data
predictions = model1.predict(X_test_normalized)
predicted_labels = np.argmax(predictions, axis=1)
true_labels = np.argmax(y_test_encoded, axis=1)


# Confusion Matrix
cm = confusion_matrix(true_labels, predicted_labels)
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues", cbar=False)
plt.xlabel("Predicted Labels")
plt.ylabel("True Labels")
plt.title("Confusion Matrix")
plt.show()


# Performance Metrics
print("\n All Performance Metrics done on TEST dataset:\n")
print(classification_report(true_labels, predicted_labels))
15/15 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
[Figure: Confusion matrix heatmap for model1 on the test set]
 All Performance Metrics done on TEST dataset:

              precision    recall  f1-score   support

           0       0.00      0.00      0.00        28
           1       0.65      0.86      0.74        36
           2       0.59      0.67      0.63        24
           3       0.74      0.93      0.83        46
           4       0.00      0.00      0.00        16
           5       0.55      0.55      0.55        53
           6       0.58      0.90      0.71        69
           7       0.92      0.65      0.76        17
           8       0.72      0.69      0.71        59
           9       0.65      0.54      0.59        24
          10       0.89      0.89      0.89        55
          11       0.64      0.52      0.57        48

    accuracy                           0.67       475
   macro avg       0.58      0.60      0.58       475
weighted avg       0.62      0.67      0.64       475

In [ ]:
# Predict the output probabilities for each category
y_pred = model1.predict(X_test_normalized)
15/15 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step 
In [ ]:
# Obtaining the categorical values from y_test_encoded and y_pred
y_pred_arg=np.argmax(y_pred,axis=1)
y_test_arg=np.argmax(y_test_encoded,axis=1)
# Computing the confusion matrix with tf.math.confusion_matrix(), which is predefined in the tensorflow module
# (stored under a new name so it does not shadow sklearn's confusion_matrix imported above)
cm_tf = tf.math.confusion_matrix(y_test_arg, y_pred_arg)
f, ax = plt.subplots(figsize=(10, 10))
sns.heatmap(
    cm_tf,
    annot=True,
    linewidths=.4,
    fmt="d",
    square=True,
    ax=ax,
    cmap="Blues"
)
# Setting the labels to both the axes
ax.set_xlabel('PREDICTED labels');ax.set_ylabel('TRUE labels');
ax.set_title('Confusion Matrix with Labels');
ax.xaxis.set_ticklabels(list(enc.classes_),rotation=45)
ax.yaxis.set_ticklabels(list(enc.classes_),rotation=45)
plt.show()
[Figure: Labeled confusion matrix for model1 on the test set, with species names on both axes]

Printing a Classification Report

In [ ]:
cr = metrics.classification_report(true_labels, predicted_labels)     # Generate the classification report
print(cr)
              precision    recall  f1-score   support

           0       0.00      0.00      0.00        28
           1       0.65      0.86      0.74        36
           2       0.59      0.67      0.63        24
           3       0.74      0.93      0.83        46
           4       0.00      0.00      0.00        16
           5       0.55      0.55      0.55        53
           6       0.58      0.90      0.71        69
           7       0.92      0.65      0.76        17
           8       0.72      0.69      0.71        59
           9       0.65      0.54      0.59        24
          10       0.89      0.89      0.89        55
          11       0.64      0.52      0.57        48

    accuracy                           0.67       475
   macro avg       0.58      0.60      0.58       475
weighted avg       0.62      0.67      0.64       475
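
The rows 0-11 in this report are numeric class indices. If desired, the species names could be shown directly by passing the encoder's classes as target_names; a minimal sketch, assuming enc is the label encoder fitted earlier in the notebook:

# Hypothetical variant of the report above, with species names instead of class indices
print(metrics.classification_report(true_labels, predicted_labels,
                                    target_names=list(enc.classes_)))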

Model Performance Improvement¶

Reducing the Learning Rate:

  • ReduceLROnPlateau() is a callback that decreases the learning rate by a given factor when the monitored metric stops improving for a set number of epochs (the patience). Training may then resume making progress at the smaller learning rate; if it plateaus again, the reduction is applied repeatedly, down to min_lr. In the cell below the callback monitors val_accuracy and halves the learning rate (factor=0.5) after 3 stagnant epochs, so it can step from 0.001 to 0.0005, 0.00025, and so on, with a floor of 1e-05.
In [ ]:
# Code to monitor val_accuracy
learning_rate_reduction = ReduceLROnPlateau(monitor='val_accuracy',
                                            patience=3,
                                            verbose=1,
                                            factor=0.5,
                                            min_lr=0.00001)

Data Augmentation¶

In [ ]:
# Clearing backend
from tensorflow.keras import backend
backend.clear_session()

# Fixing the seed for random number generators
import random
np.random.seed(42)
random.seed(42)
tf.random.set_seed(42)

On the training set, we can experiment with data augmentation by including:

  • Image rotation
  • Brightness/darkness changes
  • Image flipping

The cell below enables only rotation; a sketch covering the other options follows it.
In [ ]:
# Set the rotation_range to 20
train_datagen = ImageDataGenerator(
                              rotation_range=20,
                              fill_mode='nearest'
                              )
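
The generator above applies only rotation. If we also wanted the flipping and brightness changes listed earlier, a configuration along these lines would be one option; this is an illustrative sketch only, and these settings were not used to train the models in this notebook:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Illustrative augmentation settings (assumed values, not the ones used above)
augmented_datagen = ImageDataGenerator(
    rotation_range=20,             # same rotation as the cell above
    horizontal_flip=True,          # seedling photos can usually be flipped safely
    vertical_flip=True,
    brightness_range=(0.8, 1.2),   # simulates lighting changes; check the output range
                                   # when combining this with pre-normalized inputs
    fill_mode='nearest'
)

Like the original generator, this object would be passed to fit via augmented_datagen.flow(X_train_normalized, y_train_encoded, batch_size=batch_size).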

2nd Model with Data Augmentation and Batch Normalization¶

In [ ]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, BatchNormalization
from tensorflow.keras.optimizers import Adam

# Initializing our sequential model
model2 = Sequential()

# Adding the first convolutional layer with 64 filters and kernel size 3x3
model2.add(Conv2D(64, (3, 3), activation='relu', padding="same", input_shape=(64, 64, 3)))

# Adding max pooling to reduce the size of output of the first conv layer
model2.add(MaxPooling2D((2, 2), padding='same'))

# Adding more convolutional layers
model2.add(Conv2D(32, (3, 3), activation='relu', padding="same"))
model2.add(MaxPooling2D((2, 2), padding='same'))
model2.add(BatchNormalization())

# Flattening the output of the conv layer after max pooling
model2.add(Flatten())

# Adding a fully connected dense layer with 16 neurons
model2.add(Dense(16, activation='relu'))

# Adding dropout with dropout_rate=0.3
model2.add(Dropout(0.3))

# Adding the output layer with 12 neurons and softmax activation function
model2.add(Dense(12, activation='softmax'))

# Initializing Adam Optimizer
opt = Adam()

# Compiling the model
model2.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])

# Generating the summary of the model
model2.summary()
Model: "sequential_2"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ conv2d_5 (Conv2D)                    │ (None, 64, 64, 64)          │           1,792 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_5 (MaxPooling2D)       │ (None, 32, 32, 64)          │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2d_6 (Conv2D)                    │ (None, 32, 32, 32)          │          18,464 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_6 (MaxPooling2D)       │ (None, 16, 16, 32)          │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ batch_normalization_3                │ (None, 16, 16, 32)          │             128 │
│ (BatchNormalization)                 │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ flatten_2 (Flatten)                  │ (None, 8192)                │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_4 (Dense)                      │ (None, 16)                  │         131,088 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout_2 (Dropout)                  │ (None, 16)                  │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_5 (Dense)                      │ (None, 12)                  │             204 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 151,676 (592.48 KB)
 Trainable params: 151,612 (592.23 KB)
 Non-trainable params: 64 (256.00 B)

3rd Model with More Convolution Layers

In [ ]:
# Initializing our sequential model
model3 = Sequential()

# Adding the first convolutional layer with 64 filters and kernel size 3x3
model3.add(Conv2D(64, (3, 3), activation='relu', padding="same", input_shape=(64, 64, 3)))

# Adding max pooling to reduce the size of output of the first conv layer
model3.add(MaxPooling2D((2, 2), padding='same'))

# Adding more convolutional layers
model3.add(Conv2D(32, (3, 3), activation='relu', padding="same"))
model3.add(MaxPooling2D((2, 2), padding='same'))
model3.add(BatchNormalization())

# Adding more convolutional layers
model3.add(Conv2D(32, (3, 3), activation='relu', padding="same"))
model3.add(MaxPooling2D((2, 2), padding='same'))
model3.add(BatchNormalization())

# Flattening the output of the conv layer after max pooling
model3.add(Flatten())

# Adding a fully connected dense layer with 16 neurons
model3.add(Dense(16, activation='relu'))

# Adding dropout with dropout_rate=0.3
model3.add(Dropout(0.3))

# Adding the output layer with 12 neurons and softmax activation function
model3.add(Dense(12, activation='softmax'))

# Initializing Adam Optimizer
opt = Adam()

# Compiling the model
model3.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])

# Generating the summary of the model
model3.summary()
Model: "sequential_3"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ conv2d_7 (Conv2D)                    │ (None, 64, 64, 64)          │           1,792 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_7 (MaxPooling2D)       │ (None, 32, 32, 64)          │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2d_8 (Conv2D)                    │ (None, 32, 32, 32)          │          18,464 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_8 (MaxPooling2D)       │ (None, 16, 16, 32)          │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ batch_normalization_4                │ (None, 16, 16, 32)          │             128 │
│ (BatchNormalization)                 │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2d_9 (Conv2D)                    │ (None, 16, 16, 32)          │           9,248 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_9 (MaxPooling2D)       │ (None, 8, 8, 32)            │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ batch_normalization_5                │ (None, 8, 8, 32)            │             128 │
│ (BatchNormalization)                 │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ flatten_3 (Flatten)                  │ (None, 2048)                │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_6 (Dense)                      │ (None, 16)                  │          32,784 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout_3 (Dropout)                  │ (None, 16)                  │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_7 (Dense)                      │ (None, 12)                  │             204 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 62,748 (245.11 KB)
 Trainable params: 62,620 (244.61 KB)
 Non-trainable params: 128 (512.00 B)

IMPORTANT:

When fitting a model in Keras, it is important to use the preprocessed data that has been normalized. In our case, we should fit the model on the normalized training data rather than the raw training data; this helps the model learn better and speeds up convergence.

Here’s how to adjust our fitting code accordingly:

Fitting the Model on Normalized Training Data

Make sure to use the normalized training data (X_train_normalized) and the corresponding one-hot encoded labels (y_train_encoded) when calling the fit method:

  • X_train_normalized: This is the training data with pixel values normalized to the range of 0-1, which is crucial for the model’s performance.
  • y_train_encoded: The one-hot encoded labels corresponding to the training data.
  • train_datagen.flow: This is used for data augmentation and to create batches of training data.
  • validation_data: We should also use the normalized validation data (X_val_normalized) and the encoded validation labels (y_val_encoded).

Summary:

Always ensure both the training and validation datasets are normalized before fitting your model, as this significantly impacts the learning process and the final performance of the model.
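
For reference, a minimal sketch of how the normalized arrays and one-hot labels used below might be produced. The notebook's earlier cells already create these variables; the label-encoding step there (based on enc) may differ from this illustration, and the raw array names here are assumptions:

# Assumed starting point: X_train/X_val/X_test as uint8 image arrays and
# y_train/y_val/y_test as integer class ids (illustrative names only)
X_train_normalized = X_train.astype('float32') / 255.0   # scale pixel values to the 0-1 range
X_val_normalized   = X_val.astype('float32') / 255.0
X_test_normalized  = X_test.astype('float32') / 255.0

from tensorflow.keras.utils import to_categorical
y_train_encoded = to_categorical(y_train, num_classes=12)  # one-hot vectors of length 12
y_val_encoded   = to_categorical(y_val, num_classes=12)
y_test_encoded  = to_categorical(y_test, num_classes=12)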


Fitting the model on the train data

In [ ]:
# Fit the model on the train data with batch_size=64 and epochs=60
# Epochs
epochs = 60
# Batch size
batch_size = 64

history = model2.fit(train_datagen.flow(X_train_normalized,y_train_encoded,
                                       batch_size=batch_size,
                                       shuffle=False),
                                       epochs=epochs,
                                       steps_per_epoch=X_train_normalized.shape[0] // batch_size,
                                       validation_data=(X_val_normalized,y_val_encoded),
                                       verbose=1,callbacks=[learning_rate_reduction])
Epoch 1/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 13s 161ms/step - accuracy: 0.1585 - loss: 2.3498 - val_accuracy: 0.2316 - val_loss: 2.4075 - learning_rate: 0.0010
Epoch 2/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.1719 - loss: 2.2942 - val_accuracy: 0.2232 - val_loss: 2.4123 - learning_rate: 0.0010
Epoch 3/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 62ms/step - accuracy: 0.3761 - loss: 1.8418 - val_accuracy: 0.3726 - val_loss: 2.3532 - learning_rate: 0.0010
Epoch 4/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.4375 - loss: 1.5496 - val_accuracy: 0.3642 - val_loss: 2.3388 - learning_rate: 0.0010
Epoch 5/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 62ms/step - accuracy: 0.4506 - loss: 1.6077 - val_accuracy: 0.3916 - val_loss: 2.2024 - learning_rate: 0.0010
Epoch 6/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.3750 - loss: 1.5535 - val_accuracy: 0.3853 - val_loss: 2.2103 - learning_rate: 0.0010
Epoch 7/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 65ms/step - accuracy: 0.4860 - loss: 1.4797 - val_accuracy: 0.4042 - val_loss: 2.0898 - learning_rate: 0.0010
Epoch 8/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.5156 - loss: 1.4049 - val_accuracy: 0.3958 - val_loss: 2.0665 - learning_rate: 0.0010
Epoch 9/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 63ms/step - accuracy: 0.5289 - loss: 1.3844 - val_accuracy: 0.4800 - val_loss: 1.9115 - learning_rate: 0.0010
Epoch 10/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6562 - loss: 1.2624 - val_accuracy: 0.4884 - val_loss: 1.9017 - learning_rate: 0.0010
Epoch 11/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 63ms/step - accuracy: 0.5583 - loss: 1.2914 - val_accuracy: 0.4589 - val_loss: 1.9296 - learning_rate: 0.0010
Epoch 12/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.4844 - loss: 1.4652 - val_accuracy: 0.4632 - val_loss: 1.9267 - learning_rate: 0.0010
Epoch 13/60
58/59 ━━━━━━━━━━━━━━━━━━━━ 0s 64ms/step - accuracy: 0.5738 - loss: 1.2123
Epoch 13: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 64ms/step - accuracy: 0.5739 - loss: 1.2124 - val_accuracy: 0.4674 - val_loss: 1.6988 - learning_rate: 0.0010
Epoch 14/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.5938 - loss: 1.1616 - val_accuracy: 0.4779 - val_loss: 1.6683 - learning_rate: 5.0000e-04
Epoch 15/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 62ms/step - accuracy: 0.6091 - loss: 1.1460 - val_accuracy: 0.4800 - val_loss: 1.5684 - learning_rate: 5.0000e-04
Epoch 16/60
 1/59 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - accuracy: 0.6250 - loss: 1.0521
Epoch 16: ReduceLROnPlateau reducing learning rate to 0.0002500000118743628.
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6250 - loss: 1.0521 - val_accuracy: 0.4821 - val_loss: 1.5754 - learning_rate: 5.0000e-04
Epoch 17/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 63ms/step - accuracy: 0.6203 - loss: 1.0619 - val_accuracy: 0.6105 - val_loss: 1.2692 - learning_rate: 2.5000e-04
Epoch 18/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6250 - loss: 1.0368 - val_accuracy: 0.6126 - val_loss: 1.2615 - learning_rate: 2.5000e-04
Epoch 19/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 64ms/step - accuracy: 0.6214 - loss: 1.0656 - val_accuracy: 0.6695 - val_loss: 1.1062 - learning_rate: 2.5000e-04
Epoch 20/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6406 - loss: 0.9833 - val_accuracy: 0.6737 - val_loss: 1.0981 - learning_rate: 2.5000e-04
Epoch 21/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 61ms/step - accuracy: 0.6377 - loss: 1.0153 - val_accuracy: 0.6926 - val_loss: 1.0782 - learning_rate: 2.5000e-04
Epoch 22/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.7188 - loss: 0.8630 - val_accuracy: 0.6884 - val_loss: 1.0738 - learning_rate: 2.5000e-04
Epoch 23/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 61ms/step - accuracy: 0.6376 - loss: 1.0300 - val_accuracy: 0.6779 - val_loss: 0.9975 - learning_rate: 2.5000e-04
Epoch 24/60
 1/59 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - accuracy: 0.6562 - loss: 0.8518
Epoch 24: ReduceLROnPlateau reducing learning rate to 0.0001250000059371814.
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6562 - loss: 0.8518 - val_accuracy: 0.6821 - val_loss: 1.0006 - learning_rate: 2.5000e-04
Epoch 25/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 64ms/step - accuracy: 0.6460 - loss: 1.0179 - val_accuracy: 0.7095 - val_loss: 0.9557 - learning_rate: 1.2500e-04
Epoch 26/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6562 - loss: 0.9321 - val_accuracy: 0.7095 - val_loss: 0.9558 - learning_rate: 1.2500e-04
Epoch 27/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 62ms/step - accuracy: 0.6291 - loss: 1.0144 - val_accuracy: 0.6695 - val_loss: 1.0097 - learning_rate: 1.2500e-04
Epoch 28/60
 1/59 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - accuracy: 0.7188 - loss: 0.8247
Epoch 28: ReduceLROnPlateau reducing learning rate to 6.25000029685907e-05.
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.7188 - loss: 0.8247 - val_accuracy: 0.6695 - val_loss: 1.0092 - learning_rate: 1.2500e-04
Epoch 29/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 61ms/step - accuracy: 0.6609 - loss: 0.9671 - val_accuracy: 0.6758 - val_loss: 0.9947 - learning_rate: 6.2500e-05
Epoch 30/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.7344 - loss: 0.7964 - val_accuracy: 0.6779 - val_loss: 0.9937 - learning_rate: 6.2500e-05
Epoch 31/60
58/59 ━━━━━━━━━━━━━━━━━━━━ 0s 63ms/step - accuracy: 0.6652 - loss: 0.9656
Epoch 31: ReduceLROnPlateau reducing learning rate to 3.125000148429535e-05.
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 63ms/step - accuracy: 0.6653 - loss: 0.9655 - val_accuracy: 0.6905 - val_loss: 0.9776 - learning_rate: 6.2500e-05
Epoch 32/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6875 - loss: 0.8881 - val_accuracy: 0.6905 - val_loss: 0.9778 - learning_rate: 3.1250e-05
Epoch 33/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 61ms/step - accuracy: 0.6619 - loss: 0.9569 - val_accuracy: 0.7095 - val_loss: 0.9443 - learning_rate: 3.1250e-05
Epoch 34/60
 1/59 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - accuracy: 0.6250 - loss: 0.9994
Epoch 34: ReduceLROnPlateau reducing learning rate to 1.5625000742147677e-05.
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6250 - loss: 0.9994 - val_accuracy: 0.7095 - val_loss: 0.9441 - learning_rate: 3.1250e-05
Epoch 35/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 61ms/step - accuracy: 0.6587 - loss: 0.9832 - val_accuracy: 0.6968 - val_loss: 0.9548 - learning_rate: 1.5625e-05
Epoch 36/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6562 - loss: 0.8819 - val_accuracy: 0.6947 - val_loss: 0.9546 - learning_rate: 1.5625e-05
Epoch 37/60
58/59 ━━━━━━━━━━━━━━━━━━━━ 0s 62ms/step - accuracy: 0.6570 - loss: 0.9546
Epoch 37: ReduceLROnPlateau reducing learning rate to 1e-05.
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 62ms/step - accuracy: 0.6572 - loss: 0.9542 - val_accuracy: 0.6947 - val_loss: 0.9500 - learning_rate: 1.5625e-05
Epoch 38/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.5938 - loss: 1.1048 - val_accuracy: 0.6947 - val_loss: 0.9500 - learning_rate: 1.0000e-05
Epoch 39/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 60ms/step - accuracy: 0.6693 - loss: 0.9455 - val_accuracy: 0.6926 - val_loss: 0.9519 - learning_rate: 1.0000e-05
Epoch 40/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6406 - loss: 0.9729 - val_accuracy: 0.6926 - val_loss: 0.9518 - learning_rate: 1.0000e-05
Epoch 41/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 62ms/step - accuracy: 0.6559 - loss: 0.9784 - val_accuracy: 0.6863 - val_loss: 0.9572 - learning_rate: 1.0000e-05
Epoch 42/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.5625 - loss: 1.1080 - val_accuracy: 0.6863 - val_loss: 0.9573 - learning_rate: 1.0000e-05
Epoch 43/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 62ms/step - accuracy: 0.6466 - loss: 0.9701 - val_accuracy: 0.6968 - val_loss: 0.9513 - learning_rate: 1.0000e-05
Epoch 44/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6094 - loss: 0.8782 - val_accuracy: 0.6968 - val_loss: 0.9511 - learning_rate: 1.0000e-05
Epoch 45/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 61ms/step - accuracy: 0.6739 - loss: 0.9449 - val_accuracy: 0.6989 - val_loss: 0.9498 - learning_rate: 1.0000e-05
Epoch 46/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6406 - loss: 1.0489 - val_accuracy: 0.6968 - val_loss: 0.9499 - learning_rate: 1.0000e-05
Epoch 47/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 61ms/step - accuracy: 0.6639 - loss: 0.9530 - val_accuracy: 0.6947 - val_loss: 0.9507 - learning_rate: 1.0000e-05
Epoch 48/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6406 - loss: 1.0257 - val_accuracy: 0.6947 - val_loss: 0.9507 - learning_rate: 1.0000e-05
Epoch 49/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 62ms/step - accuracy: 0.6636 - loss: 0.9547 - val_accuracy: 0.6968 - val_loss: 0.9533 - learning_rate: 1.0000e-05
Epoch 50/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6250 - loss: 1.0541 - val_accuracy: 0.6968 - val_loss: 0.9538 - learning_rate: 1.0000e-05
Epoch 51/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 61ms/step - accuracy: 0.6727 - loss: 0.9517 - val_accuracy: 0.6989 - val_loss: 0.9479 - learning_rate: 1.0000e-05
Epoch 52/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.5781 - loss: 1.0067 - val_accuracy: 0.6989 - val_loss: 0.9477 - learning_rate: 1.0000e-05
Epoch 53/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 61ms/step - accuracy: 0.6497 - loss: 0.9837 - val_accuracy: 0.6905 - val_loss: 0.9506 - learning_rate: 1.0000e-05
Epoch 54/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6250 - loss: 1.0871 - val_accuracy: 0.6905 - val_loss: 0.9509 - learning_rate: 1.0000e-05
Epoch 55/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 63ms/step - accuracy: 0.6543 - loss: 0.9615 - val_accuracy: 0.6905 - val_loss: 0.9508 - learning_rate: 1.0000e-05
Epoch 56/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.5469 - loss: 1.2539 - val_accuracy: 0.6926 - val_loss: 0.9505 - learning_rate: 1.0000e-05
Epoch 57/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 63ms/step - accuracy: 0.6586 - loss: 0.9468 - val_accuracy: 0.6968 - val_loss: 0.9535 - learning_rate: 1.0000e-05
Epoch 58/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6406 - loss: 1.0416 - val_accuracy: 0.6968 - val_loss: 0.9537 - learning_rate: 1.0000e-05
Epoch 59/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 63ms/step - accuracy: 0.6666 - loss: 0.9524 - val_accuracy: 0.6968 - val_loss: 0.9506 - learning_rate: 1.0000e-05
Epoch 60/60
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.7344 - loss: 0.7264 - val_accuracy: 0.6968 - val_loss: 0.9509 - learning_rate: 1.0000e-05

Model 2 (Normalization and Data Augmentation): Evaluation

In [ ]:
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
[Figure: Model Accuracy plot for model2, training vs. validation accuracy per epoch]

Observations:

Model 2, trained on normalized inputs with data augmentation, overfits far less than the first model: training and validation accuracy remain close together (roughly 0.66 and 0.70 at the end of training), which is a clear improvement.

Now we evaluate the model on the test data.

In [ ]:
loss, accuracy = model2.evaluate(X_test_normalized, y_test_encoded, verbose=2)  # Evaluate the model on test data
15/15 - 0s - 4ms/step - accuracy: 0.7116 - loss: 0.8976

Observation:

  • The test accuracy is in the same range as the validation accuracy.
  • The "unseen" test data is predicted with roughly 71% accuracy.

Plotting the Confusion Matrix

In [ ]:
# Output probabilities
y_pred=model2.predict(X_test_normalized)
15/15 ━━━━━━━━━━━━━━━━━━━━ 1s 19ms/step
In [ ]:
# Obtaining the categorical values from y_test_encoded and y_pred
y_pred_arg=np.argmax(y_pred,axis=1)
y_test_arg=np.argmax(y_test_encoded,axis=1)

# Computing the confusion matrix with tf.math.confusion_matrix(), which is predefined in the tensorflow module
cm_tf = tf.math.confusion_matrix(y_test_arg, y_pred_arg)     # Confusion matrix for model2 on the test set
f, ax = plt.subplots(figsize=(10, 9))
sns.heatmap(
    cm_tf,
    annot=True,
    linewidths=.4,
    fmt="d",
    square=True,
    ax=ax,
    cmap="Blues"
)
# Setting the labels to both the axes
ax.set_xlabel('Predicted labels');ax.set_ylabel('True labels');
ax.set_title('Confusion Matrix');
ax.xaxis.set_ticklabels(list(enc.classes_),rotation=45)
ax.yaxis.set_ticklabels(list(enc.classes_),rotation=45)
plt.show()
[Figure: Labeled confusion matrix for model2 on the test set]

Printing a Classification Report

In [ ]:
# Display the classification report

from sklearn.metrics import classification_report

# Predict on the test data
predictions = model2.predict(X_test_normalized)
predicted_labels = np.argmax(predictions, axis=1)
true_labels = np.argmax(y_test_encoded, axis=1)

# Performance Metrics
print("\n All Performance Metrics done on TEST dataset:\n")
print(classification_report(true_labels, predicted_labels))
15/15 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step 

 All Performance Metrics done on TEST dataset:

              precision    recall  f1-score   support

           0       0.41      0.32      0.36        28
           1       0.65      0.94      0.77        36
           2       0.86      0.79      0.83        24
           3       0.59      0.93      0.72        46
           4       0.88      0.44      0.58        16
           5       0.65      0.53      0.58        53
           6       0.70      0.72      0.71        69
           7       0.92      0.71      0.80        17
           8       0.76      0.81      0.79        59
           9       0.54      0.29      0.38        24
          10       0.90      0.82      0.86        55
          11       0.80      0.75      0.77        48

    accuracy                           0.71       475
   macro avg       0.72      0.67      0.68       475
weighted avg       0.72      0.71      0.70       475

A 4th Model to Consider¶

In [ ]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, BatchNormalization, Flatten, Dense, Dropout
from tensorflow.keras.optimizers import Adam

# Initializing a sequential model (note: this reuses the variable name model3, replacing the 3rd model defined above)
model3 = Sequential()

# Adding the first convolutional layer with 128 filters and kernel size 3x3
model3.add(Conv2D(128, (3, 3), activation='relu', padding="same", input_shape=(64, 64, 3)))
model3.add(BatchNormalization())
model3.add(MaxPooling2D((2, 2), padding='same'))

# Adding more convolutional layers
model3.add(Conv2D(64, (3, 3), activation='relu', padding="same"))
model3.add(BatchNormalization())
model3.add(MaxPooling2D((2, 2), padding='same'))

model3.add(Conv2D(64, (3, 3), activation='relu', padding="same"))
model3.add(BatchNormalization())
model3.add(MaxPooling2D((2, 2), padding='same'))

model3.add(Conv2D(32, (3, 3), activation='relu', padding="same"))
model3.add(BatchNormalization())
model3.add(MaxPooling2D((2, 2), padding='same'))

# Flattening the output of the conv layer after max pooling
model3.add(Flatten())

# Adding a fully connected dense layer with 64 neurons
model3.add(Dense(64, activation='relu'))
model3.add(Dropout(0.5))

# Adding another fully connected dense layer with 32 neurons
model3.add(Dense(32, activation='relu'))
model3.add(Dropout(0.3))

# Adding the output layer with 12 neurons and softmax activation function
model3.add(Dense(12, activation='softmax'))

# Initializing Adam Optimizer
opt = Adam()

# Compiling the model
model3.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])

# Generating the summary of the model
model3.summary()
Model: "sequential_4"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ conv2d_10 (Conv2D)                   │ (None, 64, 64, 128)         │           3,584 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ batch_normalization_6                │ (None, 64, 64, 128)         │             512 │
│ (BatchNormalization)                 │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_10 (MaxPooling2D)      │ (None, 32, 32, 128)         │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2d_11 (Conv2D)                   │ (None, 32, 32, 64)          │          73,792 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ batch_normalization_7                │ (None, 32, 32, 64)          │             256 │
│ (BatchNormalization)                 │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_11 (MaxPooling2D)      │ (None, 16, 16, 64)          │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2d_12 (Conv2D)                   │ (None, 16, 16, 64)          │          36,928 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ batch_normalization_8                │ (None, 16, 16, 64)          │             256 │
│ (BatchNormalization)                 │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_12 (MaxPooling2D)      │ (None, 8, 8, 64)            │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2d_13 (Conv2D)                   │ (None, 8, 8, 32)            │          18,464 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ batch_normalization_9                │ (None, 8, 8, 32)            │             128 │
│ (BatchNormalization)                 │                             │                 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_13 (MaxPooling2D)      │ (None, 4, 4, 32)            │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ flatten_4 (Flatten)                  │ (None, 512)                 │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_8 (Dense)                      │ (None, 64)                  │          32,832 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout_4 (Dropout)                  │ (None, 64)                  │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_9 (Dense)                      │ (None, 32)                  │           2,080 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout_5 (Dropout)                  │ (None, 32)                  │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_10 (Dense)                     │ (None, 12)                  │             396 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 169,228 (661.05 KB)
 Trainable params: 168,652 (658.80 KB)
 Non-trainable params: 576 (2.25 KB)

Fitting the 4th Model (stored in model3) on the Training Dataset

  • Using a larger number of epochs (50) than the original 30-epoch run.
In [ ]:
# Fit the model on train data with batch_size=64 and epochs GREATER than 30
# Epochs
epochs = 50
# Batch size
batch_size = 64

history_3 = model3.fit(train_datagen.flow(X_train_normalized,y_train_encoded,
                                       batch_size=batch_size,
                                       shuffle=False),
                                       epochs=epochs,
                                       steps_per_epoch=X_train_normalized.shape[0] // batch_size,
                                       validation_data=(X_val_normalized,y_val_encoded),
                                       verbose=1,callbacks=[learning_rate_reduction])
Epoch 1/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 64ms/step - accuracy: 0.6205 - loss: 1.1449 - val_accuracy: 0.7495 - val_loss: 0.8930 - learning_rate: 1.0000e-05
Epoch 2/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.5469 - loss: 1.4556 - val_accuracy: 0.7495 - val_loss: 0.8929 - learning_rate: 1.0000e-05
Epoch 3/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 64ms/step - accuracy: 0.6202 - loss: 1.1164 - val_accuracy: 0.7453 - val_loss: 0.8921 - learning_rate: 1.0000e-05
Epoch 4/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6250 - loss: 1.2441 - val_accuracy: 0.7453 - val_loss: 0.8921 - learning_rate: 1.0000e-05
Epoch 5/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 66ms/step - accuracy: 0.6050 - loss: 1.1281 - val_accuracy: 0.7474 - val_loss: 0.8880 - learning_rate: 1.0000e-05
Epoch 6/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.5156 - loss: 1.2399 - val_accuracy: 0.7474 - val_loss: 0.8879 - learning_rate: 1.0000e-05
Epoch 7/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 64ms/step - accuracy: 0.6213 - loss: 1.1340 - val_accuracy: 0.7432 - val_loss: 0.8814 - learning_rate: 1.0000e-05
Epoch 8/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.5938 - loss: 1.1737 - val_accuracy: 0.7432 - val_loss: 0.8816 - learning_rate: 1.0000e-05
Epoch 9/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 64ms/step - accuracy: 0.6264 - loss: 1.0903 - val_accuracy: 0.7474 - val_loss: 0.8770 - learning_rate: 1.0000e-05
Epoch 10/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.5156 - loss: 1.3563 - val_accuracy: 0.7474 - val_loss: 0.8770 - learning_rate: 1.0000e-05
Epoch 11/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 66ms/step - accuracy: 0.6098 - loss: 1.1317 - val_accuracy: 0.7516 - val_loss: 0.8848 - learning_rate: 1.0000e-05
Epoch 12/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.5469 - loss: 1.2917 - val_accuracy: 0.7516 - val_loss: 0.8848 - learning_rate: 1.0000e-05
Epoch 13/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 64ms/step - accuracy: 0.6267 - loss: 1.1048 - val_accuracy: 0.7474 - val_loss: 0.8789 - learning_rate: 1.0000e-05
Epoch 14/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6250 - loss: 1.0760 - val_accuracy: 0.7453 - val_loss: 0.8784 - learning_rate: 1.0000e-05
Epoch 15/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 65ms/step - accuracy: 0.6166 - loss: 1.1229 - val_accuracy: 0.7474 - val_loss: 0.8762 - learning_rate: 1.0000e-05
Epoch 16/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.5469 - loss: 1.3418 - val_accuracy: 0.7453 - val_loss: 0.8758 - learning_rate: 1.0000e-05
Epoch 17/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 64ms/step - accuracy: 0.6059 - loss: 1.1269 - val_accuracy: 0.7453 - val_loss: 0.8883 - learning_rate: 1.0000e-05
Epoch 18/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6562 - loss: 0.9993 - val_accuracy: 0.7453 - val_loss: 0.8881 - learning_rate: 1.0000e-05
Epoch 19/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 63ms/step - accuracy: 0.6240 - loss: 1.1260 - val_accuracy: 0.7537 - val_loss: 0.8850 - learning_rate: 1.0000e-05
Epoch 20/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.7344 - loss: 0.9926 - val_accuracy: 0.7558 - val_loss: 0.8851 - learning_rate: 1.0000e-05
Epoch 21/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 66ms/step - accuracy: 0.6285 - loss: 1.0827 - val_accuracy: 0.7495 - val_loss: 0.8793 - learning_rate: 1.0000e-05
Epoch 22/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6250 - loss: 1.0110 - val_accuracy: 0.7495 - val_loss: 0.8796 - learning_rate: 1.0000e-05
Epoch 23/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 63ms/step - accuracy: 0.6119 - loss: 1.1258 - val_accuracy: 0.7474 - val_loss: 0.8867 - learning_rate: 1.0000e-05
Epoch 24/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6875 - loss: 1.2198 - val_accuracy: 0.7474 - val_loss: 0.8862 - learning_rate: 1.0000e-05
Epoch 25/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 63ms/step - accuracy: 0.6382 - loss: 1.0874 - val_accuracy: 0.7453 - val_loss: 0.8874 - learning_rate: 1.0000e-05
Epoch 26/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6406 - loss: 1.0509 - val_accuracy: 0.7432 - val_loss: 0.8865 - learning_rate: 1.0000e-05
Epoch 27/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 67ms/step - accuracy: 0.6230 - loss: 1.1249 - val_accuracy: 0.7516 - val_loss: 0.8765 - learning_rate: 1.0000e-05
Epoch 28/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.4844 - loss: 1.2338 - val_accuracy: 0.7516 - val_loss: 0.8765 - learning_rate: 1.0000e-05
Epoch 29/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 63ms/step - accuracy: 0.6426 - loss: 1.0572 - val_accuracy: 0.7579 - val_loss: 0.8704 - learning_rate: 1.0000e-05
Epoch 30/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6719 - loss: 0.9796 - val_accuracy: 0.7579 - val_loss: 0.8700 - learning_rate: 1.0000e-05
Epoch 31/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 63ms/step - accuracy: 0.6267 - loss: 1.0868 - val_accuracy: 0.7516 - val_loss: 0.8677 - learning_rate: 1.0000e-05
Epoch 32/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6250 - loss: 1.0208 - val_accuracy: 0.7516 - val_loss: 0.8675 - learning_rate: 1.0000e-05
Epoch 33/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 65ms/step - accuracy: 0.6243 - loss: 1.0879 - val_accuracy: 0.7516 - val_loss: 0.8586 - learning_rate: 1.0000e-05
Epoch 34/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6250 - loss: 1.1664 - val_accuracy: 0.7516 - val_loss: 0.8585 - learning_rate: 1.0000e-05
Epoch 35/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 62ms/step - accuracy: 0.6282 - loss: 1.0966 - val_accuracy: 0.7558 - val_loss: 0.8552 - learning_rate: 1.0000e-05
Epoch 36/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.5938 - loss: 1.1636 - val_accuracy: 0.7537 - val_loss: 0.8551 - learning_rate: 1.0000e-05
Epoch 37/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 63ms/step - accuracy: 0.6392 - loss: 1.0627 - val_accuracy: 0.7579 - val_loss: 0.8531 - learning_rate: 1.0000e-05
Epoch 38/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.5625 - loss: 1.2617 - val_accuracy: 0.7558 - val_loss: 0.8531 - learning_rate: 1.0000e-05
Epoch 39/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 64ms/step - accuracy: 0.6468 - loss: 1.0494 - val_accuracy: 0.7537 - val_loss: 0.8586 - learning_rate: 1.0000e-05
Epoch 40/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.5312 - loss: 1.3756 - val_accuracy: 0.7537 - val_loss: 0.8582 - learning_rate: 1.0000e-05
Epoch 41/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 63ms/step - accuracy: 0.6281 - loss: 1.0734 - val_accuracy: 0.7453 - val_loss: 0.8578 - learning_rate: 1.0000e-05
Epoch 42/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6406 - loss: 1.0711 - val_accuracy: 0.7453 - val_loss: 0.8576 - learning_rate: 1.0000e-05
Epoch 43/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 64ms/step - accuracy: 0.6522 - loss: 1.0520 - val_accuracy: 0.7411 - val_loss: 0.8663 - learning_rate: 1.0000e-05
Epoch 44/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.5781 - loss: 1.1601 - val_accuracy: 0.7411 - val_loss: 0.8661 - learning_rate: 1.0000e-05
Epoch 45/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 63ms/step - accuracy: 0.6218 - loss: 1.1116 - val_accuracy: 0.7495 - val_loss: 0.8505 - learning_rate: 1.0000e-05
Epoch 46/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6094 - loss: 0.8948 - val_accuracy: 0.7495 - val_loss: 0.8498 - learning_rate: 1.0000e-05
Epoch 47/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 64ms/step - accuracy: 0.6409 - loss: 1.0584 - val_accuracy: 0.7537 - val_loss: 0.8558 - learning_rate: 1.0000e-05
Epoch 48/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.7500 - loss: 0.9177 - val_accuracy: 0.7537 - val_loss: 0.8552 - learning_rate: 1.0000e-05
Epoch 49/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 65ms/step - accuracy: 0.6310 - loss: 1.0924 - val_accuracy: 0.7474 - val_loss: 0.8453 - learning_rate: 1.0000e-05
Epoch 50/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.5312 - loss: 1.2822 - val_accuracy: 0.7453 - val_loss: 0.8455 - learning_rate: 1.0000e-05
In [ ]:
plt.plot(history_3.history['accuracy'])
plt.plot(history_3.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
[Figure: Model Accuracy plot for history_3, training vs. validation accuracy per epoch]

Observations:

  • The 4th model now shows underfitting behavior on the augmented training batches (training accuracy around 0.63 versus validation accuracy around 0.75), but the larger number of epochs made the difference on the validation set.
In [ ]:
loss, accuracy = model3.evaluate(X_test_normalized, y_test_encoded, verbose=2)  # Evaluate the model on test data
15/15 - 0s - 5ms/step - accuracy: 0.7158 - loss: 0.8613
In [ ]:
# Output probabilities
y_pred=model3.predict(X_test_normalized)
15/15 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step 
In [ ]:
# Obtaining the categorical values from y_test_encoded and y_pred
y_pred_arg=np.argmax(y_pred,axis=1)
y_test_arg=np.argmax(y_test_encoded,axis=1)

# Computing the confusion matrix with tf.math.confusion_matrix(), which is predefined in the tensorflow module
cm_tf = tf.math.confusion_matrix(y_test_arg, y_pred_arg)     # Confusion matrix for the 4th model on the test set
f, ax = plt.subplots(figsize=(10, 9))
sns.heatmap(
    cm_tf,
    annot=True,
    linewidths=.4,
    fmt="d",
    square=True,
    ax=ax,
    cmap="Blues"
)
# Setting the labels to both the axes
ax.set_xlabel('Predicted labels');ax.set_ylabel('True labels');
ax.set_title('Confusion Matrix');
ax.xaxis.set_ticklabels(list(enc.classes_),rotation=45)
ax.yaxis.set_ticklabels(list(enc.classes_),rotation=45)
plt.show()
[Figure: Labeled confusion matrix for the 4th model on the test set]
In [ ]:
# Display the classification report

from sklearn.metrics import classification_report

# Predict on the test data
predictions = model3.predict(X_test_normalized)
predicted_labels = np.argmax(predictions, axis=1)
true_labels = np.argmax(y_test_encoded, axis=1)

# Performance Metrics
print("\n All Performance Metrics done on TEST dataset:\n")
print(classification_report(true_labels, predicted_labels))
15/15 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step 

 All Performance Metrics done on TEST dataset:

              precision    recall  f1-score   support

           0       0.00      0.00      0.00        28
           1       0.82      0.92      0.87        36
           2       0.73      0.92      0.81        24
           3       0.58      0.96      0.72        46
           4       0.89      0.50      0.64        16
           5       0.73      0.57      0.64        53
           6       0.69      0.84      0.76        69
           7       0.85      0.65      0.73        17
           8       0.75      0.85      0.79        59
           9       0.00      0.00      0.00        24
          10       0.78      0.93      0.85        55
          11       0.72      0.69      0.70        48

    accuracy                           0.72       475
   macro avg       0.63      0.65      0.63       475
weighted avg       0.65      0.72      0.67       475


VGG16 Model

In [ ]:
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Flatten, Dense, Dropout, Input
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.models import Model

# Define the input shape
input_tensor = Input(shape=(64, 64, 3))

# Load the VGG16 model without the top layers, using the functional API
vgg16_base = VGG16(weights='imagenet', include_top=False, input_tensor=input_tensor)

# Freeze the VGG16 base layers
for layer in vgg16_base.layers:
    layer.trainable = False

# Add custom layers on top of the VGG16 base model
x = Flatten()(vgg16_base.output)
x = Dense(2048, activation='relu')(x)
x = Dropout(0.5)(x)
x = Dense(2048, activation='relu')(x)
x = Dropout(0.5)(x)
output = Dense(12, activation='softmax')(x)  # Output layer for 12 classes

# Define the complete model
model_vgg16 = Model(inputs=input_tensor, outputs=output)

# Compile the model
opt = Adam()
model_vgg16.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])

# Display the model summary
model_vgg16.summary()
Model: "functional_60"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ input_layer_8 (InputLayer)           │ (None, 64, 64, 3)           │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block1_conv1 (Conv2D)                │ (None, 64, 64, 64)          │           1,792 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block1_conv2 (Conv2D)                │ (None, 64, 64, 64)          │          36,928 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block1_pool (MaxPooling2D)           │ (None, 32, 32, 64)          │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block2_conv1 (Conv2D)                │ (None, 32, 32, 128)         │          73,856 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block2_conv2 (Conv2D)                │ (None, 32, 32, 128)         │         147,584 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block2_pool (MaxPooling2D)           │ (None, 16, 16, 128)         │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block3_conv1 (Conv2D)                │ (None, 16, 16, 256)         │         295,168 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block3_conv2 (Conv2D)                │ (None, 16, 16, 256)         │         590,080 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block3_conv3 (Conv2D)                │ (None, 16, 16, 256)         │         590,080 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block3_pool (MaxPooling2D)           │ (None, 8, 8, 256)           │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block4_conv1 (Conv2D)                │ (None, 8, 8, 512)           │       1,180,160 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block4_conv2 (Conv2D)                │ (None, 8, 8, 512)           │       2,359,808 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block4_conv3 (Conv2D)                │ (None, 8, 8, 512)           │       2,359,808 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block4_pool (MaxPooling2D)           │ (None, 4, 4, 512)           │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block5_conv1 (Conv2D)                │ (None, 4, 4, 512)           │       2,359,808 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block5_conv2 (Conv2D)                │ (None, 4, 4, 512)           │       2,359,808 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block5_conv3 (Conv2D)                │ (None, 4, 4, 512)           │       2,359,808 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block5_pool (MaxPooling2D)           │ (None, 2, 2, 512)           │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ flatten_8 (Flatten)                  │ (None, 2048)                │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_20 (Dense)                     │ (None, 2048)                │       4,196,352 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout_12 (Dropout)                 │ (None, 2048)                │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_21 (Dense)                     │ (None, 2048)                │       4,196,352 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout_13 (Dropout)                 │ (None, 2048)                │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_22 (Dense)                     │ (None, 12)                  │          24,588 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 23,131,980 (88.24 MB)
 Trainable params: 8,417,292 (32.11 MB)
 Non-trainable params: 14,714,688 (56.13 MB)

Explanation of Key Components with the VGG16 model:

    1. VGG16 Base Model: We load the VGG16 model pre-trained on ImageNet, excluding its top classification layers (include_top=False), so that it acts as a feature extractor.
    2. Freezing Layers: Freezing the VGG16 layers keeps the learned ImageNet weights intact, which is useful when the dataset is small.
    3. Custom Fully Connected Layers: We add two fully connected layers of 2,048 neurons each with Dropout for regularization, followed by a 12-neuron output layer with softmax for multiclass classification.
    4. Compilation: The model is compiled with the Adam optimizer and categorical cross-entropy loss.

This setup trains only the newly added layers, making it efficient while benefiting from VGG16's pre-trained features.

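As a quick sanity check on the freezing step, the trainable and non-trainable parameter counts can be recomputed directly from the model and compared with the summary above (a minimal sketch, assuming model_vgg16 is the compiled model summarized above):

import tensorflow as tf

# Only the new Dense head should contribute trainable parameters; the frozen
# VGG16 base accounts for all non-trainable parameters. The totals should
# match the summary above (~8.4M trainable vs ~14.7M non-trainable).
trainable = sum(tf.keras.backend.count_params(w) for w in model_vgg16.trainable_weights)
non_trainable = sum(tf.keras.backend.count_params(w) for w in model_vgg16.non_trainable_weights)
print(f"Trainable params:     {trainable:,}")
print(f"Non-trainable params: {non_trainable:,}")
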
In [ ]:
# Parameters for training
epochs = 50
batch_size = 64

# Fit the model using the specified parameters
history_vgg16 = model_vgg16.fit(
    train_datagen.flow(X_train_normalized, y_train_encoded, batch_size=batch_size, shuffle=False),
    epochs=epochs,
    steps_per_epoch=X_train_normalized.shape[0] // batch_size,
    validation_data=(X_val_normalized, y_val_encoded),
    verbose=1,
    callbacks=[learning_rate_reduction]
)
Epoch 1/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 19s 205ms/step - accuracy: 0.1961 - loss: 2.5828 - val_accuracy: 0.4021 - val_loss: 1.7457 - learning_rate: 0.0010
Epoch 2/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.3438 - loss: 1.6298 - val_accuracy: 0.4021 - val_loss: 1.7329 - learning_rate: 0.0010
Epoch 3/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 65ms/step - accuracy: 0.3957 - loss: 1.6730 - val_accuracy: 0.4632 - val_loss: 1.5070 - learning_rate: 0.0010
Epoch 4/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.3750 - loss: 1.7015 - val_accuracy: 0.4653 - val_loss: 1.5077 - learning_rate: 0.0010
Epoch 5/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 66ms/step - accuracy: 0.4673 - loss: 1.4903 - val_accuracy: 0.4989 - val_loss: 1.4321 - learning_rate: 0.0010
Epoch 6/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.5000 - loss: 1.3659 - val_accuracy: 0.4884 - val_loss: 1.4406 - learning_rate: 0.0010
Epoch 7/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 65ms/step - accuracy: 0.4708 - loss: 1.4298 - val_accuracy: 0.4632 - val_loss: 1.4341 - learning_rate: 0.0010
Epoch 8/50
 1/59 ━━━━━━━━━━━━━━━━━━━━ 1s 24ms/step - accuracy: 0.5156 - loss: 1.2276
Epoch 8: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.5156 - loss: 1.2276 - val_accuracy: 0.4842 - val_loss: 1.4006 - learning_rate: 0.0010
Epoch 9/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 67ms/step - accuracy: 0.5340 - loss: 1.3200 - val_accuracy: 0.5558 - val_loss: 1.3054 - learning_rate: 5.0000e-04
Epoch 10/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.5312 - loss: 1.3465 - val_accuracy: 0.5516 - val_loss: 1.3052 - learning_rate: 5.0000e-04
Epoch 11/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 66ms/step - accuracy: 0.5595 - loss: 1.2316 - val_accuracy: 0.5179 - val_loss: 1.3102 - learning_rate: 5.0000e-04
Epoch 12/50
 1/59 ━━━━━━━━━━━━━━━━━━━━ 1s 25ms/step - accuracy: 0.6094 - loss: 1.2772
Epoch 12: ReduceLROnPlateau reducing learning rate to 0.0002500000118743628.
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.6094 - loss: 1.2772 - val_accuracy: 0.5137 - val_loss: 1.3062 - learning_rate: 5.0000e-04
Epoch 13/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 65ms/step - accuracy: 0.5843 - loss: 1.1428 - val_accuracy: 0.5263 - val_loss: 1.2923 - learning_rate: 2.5000e-04
Epoch 14/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.6094 - loss: 1.1109 - val_accuracy: 0.5284 - val_loss: 1.2903 - learning_rate: 2.5000e-04
Epoch 15/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 68ms/step - accuracy: 0.6054 - loss: 1.1124 - val_accuracy: 0.5705 - val_loss: 1.2548 - learning_rate: 2.5000e-04
Epoch 16/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.5938 - loss: 1.1570 - val_accuracy: 0.5747 - val_loss: 1.2553 - learning_rate: 2.5000e-04
Epoch 17/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 66ms/step - accuracy: 0.6212 - loss: 1.0507 - val_accuracy: 0.5558 - val_loss: 1.2318 - learning_rate: 2.5000e-04
Epoch 18/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.6719 - loss: 0.8363 - val_accuracy: 0.5537 - val_loss: 1.2301 - learning_rate: 2.5000e-04
Epoch 19/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 62ms/step - accuracy: 0.6373 - loss: 1.0290
Epoch 19: ReduceLROnPlateau reducing learning rate to 0.0001250000059371814.
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 66ms/step - accuracy: 0.6371 - loss: 1.0291 - val_accuracy: 0.5516 - val_loss: 1.2411 - learning_rate: 2.5000e-04
Epoch 20/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.6094 - loss: 1.0653 - val_accuracy: 0.5537 - val_loss: 1.2408 - learning_rate: 1.2500e-04
Epoch 21/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 5s 70ms/step - accuracy: 0.6389 - loss: 1.0222 - val_accuracy: 0.5684 - val_loss: 1.2113 - learning_rate: 1.2500e-04
Epoch 22/50
 1/59 ━━━━━━━━━━━━━━━━━━━━ 1s 23ms/step - accuracy: 0.6406 - loss: 1.0421
Epoch 22: ReduceLROnPlateau reducing learning rate to 6.25000029685907e-05.
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.6406 - loss: 1.0421 - val_accuracy: 0.5684 - val_loss: 1.2125 - learning_rate: 1.2500e-04
Epoch 23/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 65ms/step - accuracy: 0.6473 - loss: 0.9984 - val_accuracy: 0.5663 - val_loss: 1.2044 - learning_rate: 6.2500e-05
Epoch 24/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.6250 - loss: 0.9819 - val_accuracy: 0.5642 - val_loss: 1.2046 - learning_rate: 6.2500e-05
Epoch 25/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 64ms/step - accuracy: 0.6418 - loss: 0.9780
Epoch 25: ReduceLROnPlateau reducing learning rate to 3.125000148429535e-05.
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 68ms/step - accuracy: 0.6419 - loss: 0.9779 - val_accuracy: 0.5705 - val_loss: 1.2071 - learning_rate: 6.2500e-05
Epoch 26/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7344 - loss: 0.8918 - val_accuracy: 0.5747 - val_loss: 1.2073 - learning_rate: 3.1250e-05
Epoch 27/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 66ms/step - accuracy: 0.6527 - loss: 0.9476 - val_accuracy: 0.5663 - val_loss: 1.1964 - learning_rate: 3.1250e-05
Epoch 28/50
 1/59 ━━━━━━━━━━━━━━━━━━━━ 1s 25ms/step - accuracy: 0.6875 - loss: 0.9719
Epoch 28: ReduceLROnPlateau reducing learning rate to 1.5625000742147677e-05.
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.6875 - loss: 0.9719 - val_accuracy: 0.5642 - val_loss: 1.1968 - learning_rate: 3.1250e-05
Epoch 29/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 66ms/step - accuracy: 0.6399 - loss: 0.9582 - val_accuracy: 0.5642 - val_loss: 1.1951 - learning_rate: 1.5625e-05
Epoch 30/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8333 - loss: 0.4822 - val_accuracy: 0.5642 - val_loss: 1.1949 - learning_rate: 1.5625e-05
Epoch 31/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 64ms/step - accuracy: 0.6501 - loss: 0.9625
Epoch 31: ReduceLROnPlateau reducing learning rate to 1e-05.
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 68ms/step - accuracy: 0.6502 - loss: 0.9624 - val_accuracy: 0.5705 - val_loss: 1.1944 - learning_rate: 1.5625e-05
Epoch 32/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.6562 - loss: 0.9838 - val_accuracy: 0.5726 - val_loss: 1.1942 - learning_rate: 1.0000e-05
Epoch 33/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 66ms/step - accuracy: 0.6494 - loss: 0.9777 - val_accuracy: 0.5768 - val_loss: 1.1932 - learning_rate: 1.0000e-05
Epoch 34/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.6719 - loss: 0.9171 - val_accuracy: 0.5768 - val_loss: 1.1932 - learning_rate: 1.0000e-05
Epoch 35/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 5s 70ms/step - accuracy: 0.6565 - loss: 0.9698 - val_accuracy: 0.5747 - val_loss: 1.1933 - learning_rate: 1.0000e-05
Epoch 36/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.6875 - loss: 0.9319 - val_accuracy: 0.5747 - val_loss: 1.1934 - learning_rate: 1.0000e-05
Epoch 37/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 66ms/step - accuracy: 0.6664 - loss: 0.9332 - val_accuracy: 0.5705 - val_loss: 1.1938 - learning_rate: 1.0000e-05
Epoch 38/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.5938 - loss: 1.0382 - val_accuracy: 0.5705 - val_loss: 1.1939 - learning_rate: 1.0000e-05
Epoch 39/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 64ms/step - accuracy: 0.6679 - loss: 0.9477 - val_accuracy: 0.5684 - val_loss: 1.1935 - learning_rate: 1.0000e-05
Epoch 40/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.6406 - loss: 0.9726 - val_accuracy: 0.5663 - val_loss: 1.1934 - learning_rate: 1.0000e-05
Epoch 41/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 68ms/step - accuracy: 0.6840 - loss: 0.9290 - val_accuracy: 0.5747 - val_loss: 1.1919 - learning_rate: 1.0000e-05
Epoch 42/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7188 - loss: 0.7773 - val_accuracy: 0.5747 - val_loss: 1.1918 - learning_rate: 1.0000e-05
Epoch 43/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 67ms/step - accuracy: 0.6619 - loss: 0.9463 - val_accuracy: 0.5705 - val_loss: 1.1936 - learning_rate: 1.0000e-05
Epoch 44/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.6562 - loss: 0.9171 - val_accuracy: 0.5726 - val_loss: 1.1934 - learning_rate: 1.0000e-05
Epoch 45/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 66ms/step - accuracy: 0.6583 - loss: 0.9427 - val_accuracy: 0.5747 - val_loss: 1.1908 - learning_rate: 1.0000e-05
Epoch 46/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.6406 - loss: 0.9428 - val_accuracy: 0.5747 - val_loss: 1.1908 - learning_rate: 1.0000e-05
Epoch 47/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 67ms/step - accuracy: 0.6605 - loss: 0.9457 - val_accuracy: 0.5768 - val_loss: 1.1909 - learning_rate: 1.0000e-05
Epoch 48/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7188 - loss: 0.7958 - val_accuracy: 0.5768 - val_loss: 1.1910 - learning_rate: 1.0000e-05
Epoch 49/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 67ms/step - accuracy: 0.6523 - loss: 0.9460 - val_accuracy: 0.5768 - val_loss: 1.1894 - learning_rate: 1.0000e-05
Epoch 50/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7031 - loss: 0.9027 - val_accuracy: 0.5768 - val_loss: 1.1893 - learning_rate: 1.0000e-05
In [ ]:
plt.plot(history_vgg16.history['accuracy'])
plt.plot(history_vgg16.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
[Plot: training vs. validation accuracy by epoch for model_vgg16]

Observations:

  • The model begins to overfit after roughly the 10th epoch, with validation accuracy plateauing around 55%.
  • Validation metrics become essentially flat after about epoch 15, so most of the remaining epochs cost GPU time without much benefit (a sketch using EarlyStopping follows below).

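Since training well past the plateau mainly burns GPU time, an EarlyStopping callback could cut the run short once val_loss stops improving. A minimal sketch (not run in this notebook), reusing the model, generator, callback, and hyperparameters defined above:

from tensorflow.keras.callbacks import EarlyStopping

# Stop training once val_loss has not improved for 10 epochs and
# roll back to the best weights seen so far
early_stop = EarlyStopping(monitor='val_loss', patience=10,
                           restore_best_weights=True, verbose=1)

history_vgg16 = model_vgg16.fit(
    train_datagen.flow(X_train_normalized, y_train_encoded, batch_size=batch_size),
    epochs=epochs,
    validation_data=(X_val_normalized, y_val_encoded),
    callbacks=[learning_rate_reduction, early_stop],
    verbose=1,
)
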
Use the Final Model¶

Below, each of the models built so far is evaluated on the test data; we then comment on the final model selected and use it in the subsequent code to visualize predictions on test images.

In [ ]:
print("Evaluate model1 on test data:")
loss, accuracy = model1.evaluate(X_test_normalized, y_test_encoded, verbose=0)
print(f"Test Loss: {loss:.4f}")
print(f"Test Accuracy: {accuracy:.4f}")
print("\nEvaluate model2 on test data:")
loss, accuracy = model2.evaluate(X_test_normalized, y_test_encoded, verbose=0)
print(f"Test Loss: {loss:.4f}")
print(f"Test Accuracy: {accuracy:.4f}")
print("\nEvaluate model3 on test data:")
loss, accuracy = model3.evaluate(X_test_normalized, y_test_encoded, verbose=0)
print(f"Test Loss: {loss:.4f}")
print(f"Test Accuracy: {accuracy:.4f}")
print("\nEvaluate model_vgg16 on test data:")
loss, accuracy = model_vgg16.evaluate(X_test_normalized, y_test_encoded, verbose=0)
print(f"Test Loss: {loss:.4f}")
print(f"Test Accuracy: {accuracy:.4f}")
Evaluate model1 on test data:
Test Loss: 4.1427
Test Accuracy: 0.7053

Evaluate model2 on test data:
Test Loss: 0.8976
Test Accuracy: 0.7116

Evaluate model3 on test data:
Test Loss: 0.8613
Test Accuracy: 0.7158

Evaluate model_vgg16 on test data:
Test Loss: 1.1478
Test Accuracy: 0.6316

GOAL: a lower test loss combined with a higher test accuracy.

SUMMARY:

Model         Test Loss   Test Accuracy   Observations
Model1        4.1427      0.7053          High loss suggests overfitting; moderate accuracy
Model2        0.8976      0.7116          Good balance of low loss and high accuracy
Model3        0.8613      0.7158          Best performance with highest accuracy and lowest loss
Model_vgg16   1.1478      0.6316          Underperforming; possible underfitting
  • It is noteworthy how normalization and data augmentation improved Model2's test accuracy (with accuracy reaching roughly 84% in some runs).
  • Recommendation: use normalization and data augmentation for all model training.
  1. Accuracy:

    • Model3 performs best with 0.7158 accuracy, followed by Model2 (0.7116).
    • Model1 is lower at 0.7053.
    • Model_vgg16 performs worst at 0.6316.

  2. Loss:

    • Model3 has the lowest loss (0.8613), showing more stable predictions.
    • Model2 follows with 0.8976.
    • Model1 has the highest loss (4.1427), indicating poor generalization.
    • Model_vgg16 is higher at 1.1478.

  3. Overfitting/Underfitting:

    • Model1 likely overfits, given its high loss despite moderate accuracy.
    • Model_vgg16 may be underfitting, as it has low accuracy and higher loss.

  4. Conclusion:

    • Model3 is currently the best choice, with Model2 close behind.
    • Model1 and Model_vgg16 need adjustments to perform well.
    • A compact, loop-based version of this comparison is sketched below.

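The comparison above can also be reproduced programmatically. A small sketch that evaluates all four candidates in a loop and reports the best one by test accuracy (assuming the models and the encoded test arrays are still in memory):

# Evaluate every candidate on the same held-out test set
candidates = {'model1': model1, 'model2': model2,
              'model3': model3, 'model_vgg16': model_vgg16}

results = {}
for name, m in candidates.items():
    loss, acc = m.evaluate(X_test_normalized, y_test_encoded, verbose=0)
    results[name] = (loss, acc)
    print(f"{name:<12} loss={loss:.4f}  accuracy={acc:.4f}")

# Pick the model with the highest test accuracy (model3 in this run)
best_name = max(results, key=lambda n: results[n][1])
print(f"\nBest model by test accuracy: {best_name}")
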
Opportunities for Improvement:

  • All models show room for improvement: test accuracies currently range from roughly 63% to 72%. Options to consider include the following (a simple sweep over two of these is sketched after this list):
  • Fine-tuning hyperparameters (learning rate, batch size, etc.).
  • Experimenting with different architectures (adding layers, changing activation functions).
  • Utilizing more advanced techniques (data augmentation, transfer learning).
  • Reviewing the data preprocessing steps to ensure optimal input to the models.
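
As a starting point for the first bullet, a simple manual sweep over learning rate and batch size could look like the sketch below. Note that build_model3 is a hypothetical helper that rebuilds the model3 architecture from scratch; it is not defined in this notebook:

from tensorflow.keras.optimizers import Adam

best = None
for lr in (1e-3, 1e-4):
    for bs in (32, 64):
        m = build_model3()   # hypothetical helper returning a fresh model3-style network
        m.compile(optimizer=Adam(learning_rate=lr),
                  loss='categorical_crossentropy', metrics=['accuracy'])
        h = m.fit(X_train_normalized, y_train_encoded, batch_size=bs, epochs=15,
                  validation_data=(X_val_normalized, y_val_encoded), verbose=0)
        val_acc = max(h.history['val_accuracy'])
        print(f"lr={lr}, batch_size={bs} -> best val_accuracy={val_acc:.4f}")
        if best is None or val_acc > best[0]:
            best = (val_acc, lr, bs)

print(f"Best setting: lr={best[1]}, batch_size={best[2]} (val_accuracy={best[0]:.4f})")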

Visualizing the prediction using the selected "Best" model: model3¶

In [ ]:
# Visualizing the predicted and correct label of images from test data
plt.figure(figsize=(4,4))
plt.imshow(X_test[2])
plt.show()
# Predict the label of the test image using the final model selected (model3)
print('Predicted Label', enc.inverse_transform(model3.predict((X_test_normalized[2].reshape(1,64,64,3)))))   # reshaping the input image as we are only trying to predict using a single image
print('True Label', enc.inverse_transform(y_test_encoded)[2])
[Image: test image at index 2 from X_test]
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 544ms/step
Predicted Label ['Small-flowered Cranesbill']
True Label Small-flowered Cranesbill

Let's observe how well the model predicts a few more test images, drawn from the 12 categories it was trained to recognize:

In [ ]:
# Now, display several images to compare predictions using our "Best" model (model3)
indices = [122, 33, 190, 200, 22, 75, 311, 14]  # Selected indices of images to display

# Loop over each image index and display the image with its prediction and true label
for idx in indices:
    # Visualizing the image
    plt.figure(figsize=(2, 2))
    plt.imshow(X_test[idx])
    plt.title(f"Image Selected: {idx}")
    plt.show()

    # Get predicted label for the image
    prediction = model3.predict(X_test_normalized[idx].reshape(1, 64, 64, 3))  # Model prediction

    # Convert prediction to a 1D array and apply argmax to get the predicted class index
    predicted_class_idx = np.argmax(prediction, axis=-1)  # Apply argmax to get class index

    # Inverse transform the predicted label index to get the label name
    predicted_label = enc.inverse_transform(predicted_class_idx.reshape(1, -1))[0]  # Reshape to pass as 2D array

    # Get the true label for the image
    true_label = enc.inverse_transform(y_test_encoded[idx].reshape(1, -1))[0]  # Reshape true label

    # Print predicted and true labels
    print(f"Image {idx} - Predicted Label: \033[1;33m{predicted_label}\033[0m, True Label: \033[1;32m{true_label}\033[0m")
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 19ms/step
Image 122 - Predicted Label: Black-grass, True Label: Sugar beet
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 18ms/step
Image 33 - Predicted Label: Black-grass, True Label: Scentless Mayweed
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 18ms/step
Image 190 - Predicted Label: Black-grass, True Label: Cleavers
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 19ms/step
Image 200 - Predicted Label: Black-grass, True Label: Common Chickweed
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 18ms/step
Image 22 - Predicted Label: Black-grass, True Label: Loose Silky-bent
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 18ms/step
Image 75 - Predicted Label: Black-grass, True Label: Common Chickweed
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 18ms/step
Image 311 - Predicted Label: Black-grass, True Label: Fat Hen
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 19ms/step
Image 14 - Predicted Label: Black-grass, True Label: Small-flowered Cranesbill
In [ ]:
import matplotlib.pyplot as plt
import numpy as np


# List of image indices to display and compare
indices = [200, 257, 459, 355, 252, 450, 351, 54, 253]  # 9 images

# Define the number of columns and rows
num_columns = 3
num_rows = 3  # 3 rows and 3 columns for 9 images

# Create a figure with subplots
fig, axes = plt.subplots(num_rows, num_columns, figsize=(12, 12))  # Adjust the figsize to fit 9 images

# Loop over each image index and display the image with its prediction and true label
for idx, ax in zip(indices, axes.flat):
    # Visualizing the image
    ax.imshow(X_test[idx])
    ax.set_title(f"Image {idx}")
    ax.axis('off')  # Hide axes for a cleaner display

    # Get predicted label for the image
    prediction = model_vgg16.predict(X_test_normalized[idx].reshape(1, 64, 64, 3))  # Model prediction
    predicted_class_idx = np.argmax(prediction, axis=-1)  # Apply argmax to get class index
    predicted_label = enc.inverse_transform(predicted_class_idx.reshape(1, -1))[0]  # Inverse transform to label

    # Get the true label for the image
    true_label = enc.inverse_transform(y_test_encoded[idx].reshape(1, -1))[0]  # Inverse transform to label

    # Display the predicted and true labels as text below the image
    ax.text(0.5, -0.1, f"Predicted: {predicted_label}\nTrue: {true_label}",
            ha='center', va='center', transform=ax.transAxes, fontsize=10)


print("\n\nUsing VGG16 Untrained - 'out of the box':\n")

# Adjust layout to prevent overlap
plt.tight_layout()
plt.show()
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 18ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 17ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 17ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 18ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 17ms/step


Using VGG16 with a frozen base (not fine-tuned on our images):

[Figure: 3×3 grid of test images with predicted and true labels from model_vgg16]
In [ ]:
import matplotlib.pyplot as plt
import numpy as np

# List of image indices to display and compare
indices = [2, 57, 59, 55, 52, 50, 51, 54, 53]  # 9 images

# Define the number of columns and rows
num_columns = 3
num_rows = 3  # 3 rows and 3 columns for 9 images

# Create a figure with subplots
fig, axes = plt.subplots(num_rows, num_columns, figsize=(12, 12))  # Adjust the figsize to fit 9 images

# Loop over each image index and display the image with its prediction and true label
for idx, ax in zip(indices, axes.flat):
    # Visualizing the image
    ax.imshow(X_test[idx])
    ax.set_title(f"Image {idx}")
    ax.axis('off')  # Hide axes for a cleaner display

    # Get predicted label for the image
    prediction = model2.predict(X_test_normalized[idx].reshape(1, 64, 64, 3))  # Model prediction
    predicted_class_idx = np.argmax(prediction, axis=-1)  # Apply argmax to get class index
    predicted_label = enc.inverse_transform(predicted_class_idx.reshape(1, -1))[0]  # Inverse transform to label

    # Get the true label for the image
    true_label = enc.inverse_transform(y_test_encoded[idx].reshape(1, -1))[0]  # Inverse transform to label

    # Display the predicted and true labels as text below the image
    ax.text(0.5, -0.1, f"Predicted: {predicted_label}\nTrue: {true_label}",
            ha='center', va='center', transform=ax.transAxes, fontsize=10)

print("Using Trained Model2 :")

# Adjust layout to prevent overlap
plt.tight_layout()
plt.show()
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 17ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 17ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 15ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 15ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 15ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 15ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step
Using Trained Model2 :
[Figure: 3×3 grid of test images with predicted and true labels from model2]

EXTRA CREDIT

In [ ]:
# (Frozen) VGG16 TRAINING WITH OUR SET OF IMAGES

import tensorflow as tf
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Input, Flatten, Dense, Dropout
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# Assuming X_train_normalized, y_train_encoded, X_val_normalized, y_val_encoded, and learning_rate_reduction are defined from previous code

# Define the input shape
input_tensor = Input(shape=(64, 64, 3))

# Load the VGG16 model without the top layers, using the functional API
vgg16_base = VGG16(weights='imagenet', include_top=False, input_tensor=input_tensor)

# Freeze the VGG16 base layers
for layer in vgg16_base.layers:
    layer.trainable = False

# Add custom layers on top of the VGG16 base model
x = Flatten()(vgg16_base.output)
x = Dense(2048, activation='relu')(x)
x = Dropout(0.5)(x)
x = Dense(2048, activation='relu')(x)
x = Dropout(0.5)(x)
output = Dense(12, activation='softmax')(x)  # Output layer for 12 classes

# Define the complete model
model_vgg16 = Model(inputs=input_tensor, outputs=output)

# Compile the model
opt = Adam()
model_vgg16.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])

# Display the model summary
model_vgg16.summary()

# Parameters for training
epochs = 50
batch_size = 64

# Assuming train_datagen is defined from previous code
history_vgg16 = model_vgg16.fit(
    train_datagen.flow(X_train_normalized, y_train_encoded, batch_size=batch_size, shuffle=False),
    epochs=epochs,
    steps_per_epoch=X_train_normalized.shape[0] // batch_size,
    validation_data=(X_val_normalized, y_val_encoded),
    verbose=1,
    callbacks=[learning_rate_reduction]
)
Model: "functional_61"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ input_layer_9 (InputLayer)           │ (None, 64, 64, 3)           │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block1_conv1 (Conv2D)                │ (None, 64, 64, 64)          │           1,792 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block1_conv2 (Conv2D)                │ (None, 64, 64, 64)          │          36,928 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block1_pool (MaxPooling2D)           │ (None, 32, 32, 64)          │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block2_conv1 (Conv2D)                │ (None, 32, 32, 128)         │          73,856 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block2_conv2 (Conv2D)                │ (None, 32, 32, 128)         │         147,584 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block2_pool (MaxPooling2D)           │ (None, 16, 16, 128)         │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block3_conv1 (Conv2D)                │ (None, 16, 16, 256)         │         295,168 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block3_conv2 (Conv2D)                │ (None, 16, 16, 256)         │         590,080 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block3_conv3 (Conv2D)                │ (None, 16, 16, 256)         │         590,080 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block3_pool (MaxPooling2D)           │ (None, 8, 8, 256)           │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block4_conv1 (Conv2D)                │ (None, 8, 8, 512)           │       1,180,160 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block4_conv2 (Conv2D)                │ (None, 8, 8, 512)           │       2,359,808 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block4_conv3 (Conv2D)                │ (None, 8, 8, 512)           │       2,359,808 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block4_pool (MaxPooling2D)           │ (None, 4, 4, 512)           │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block5_conv1 (Conv2D)                │ (None, 4, 4, 512)           │       2,359,808 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block5_conv2 (Conv2D)                │ (None, 4, 4, 512)           │       2,359,808 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block5_conv3 (Conv2D)                │ (None, 4, 4, 512)           │       2,359,808 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ block5_pool (MaxPooling2D)           │ (None, 2, 2, 512)           │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ flatten_9 (Flatten)                  │ (None, 2048)                │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_23 (Dense)                     │ (None, 2048)                │       4,196,352 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout_14 (Dropout)                 │ (None, 2048)                │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_24 (Dense)                     │ (None, 2048)                │       4,196,352 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout_15 (Dropout)                 │ (None, 2048)                │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_25 (Dense)                     │ (None, 12)                  │          24,588 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 23,131,980 (88.24 MB)
 Trainable params: 8,417,292 (32.11 MB)
 Non-trainable params: 14,714,688 (56.13 MB)
Epoch 1/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 9s 102ms/step - accuracy: 0.1954 - loss: 2.5865 - val_accuracy: 0.3726 - val_loss: 1.7239 - learning_rate: 0.0010
Epoch 2/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.2344 - loss: 2.0087 - val_accuracy: 0.4042 - val_loss: 1.7266 - learning_rate: 0.0010
Epoch 3/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 69ms/step - accuracy: 0.3934 - loss: 1.7049 - val_accuracy: 0.4505 - val_loss: 1.6094 - learning_rate: 0.0010
Epoch 4/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.3906 - loss: 1.8359 - val_accuracy: 0.4905 - val_loss: 1.5507 - learning_rate: 0.0010
Epoch 5/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 65ms/step - accuracy: 0.4487 - loss: 1.5540 - val_accuracy: 0.4758 - val_loss: 1.4968 - learning_rate: 0.0010
Epoch 6/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.3906 - loss: 1.5349 - val_accuracy: 0.4611 - val_loss: 1.4951 - learning_rate: 0.0010
Epoch 7/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 65ms/step - accuracy: 0.4935 - loss: 1.4241 - val_accuracy: 0.5453 - val_loss: 1.3738 - learning_rate: 0.0010
Epoch 8/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.5625 - loss: 1.3215 - val_accuracy: 0.5495 - val_loss: 1.3711 - learning_rate: 0.0010
Epoch 9/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 67ms/step - accuracy: 0.5162 - loss: 1.3587 - val_accuracy: 0.5474 - val_loss: 1.3142 - learning_rate: 0.0010
Epoch 10/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.5312 - loss: 1.2945 - val_accuracy: 0.5326 - val_loss: 1.3123 - learning_rate: 0.0010
Epoch 11/50
58/59 ━━━━━━━━━━━━━━━━━━━━ 0s 62ms/step - accuracy: 0.5551 - loss: 1.2740
Epoch 11: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 65ms/step - accuracy: 0.5550 - loss: 1.2737 - val_accuracy: 0.5411 - val_loss: 1.3078 - learning_rate: 0.0010
Epoch 12/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.5625 - loss: 1.1602 - val_accuracy: 0.5368 - val_loss: 1.3089 - learning_rate: 5.0000e-04
Epoch 13/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 65ms/step - accuracy: 0.5802 - loss: 1.1902 - val_accuracy: 0.5579 - val_loss: 1.2237 - learning_rate: 5.0000e-04
Epoch 14/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.5469 - loss: 1.2413 - val_accuracy: 0.5832 - val_loss: 1.2280 - learning_rate: 5.0000e-04
Epoch 15/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 66ms/step - accuracy: 0.5829 - loss: 1.1433 - val_accuracy: 0.5642 - val_loss: 1.2194 - learning_rate: 5.0000e-04
Epoch 16/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.5000 - loss: 1.2382 - val_accuracy: 0.5621 - val_loss: 1.2245 - learning_rate: 5.0000e-04
Epoch 17/50
58/59 ━━━━━━━━━━━━━━━━━━━━ 0s 63ms/step - accuracy: 0.6277 - loss: 1.0677
Epoch 17: ReduceLROnPlateau reducing learning rate to 0.0002500000118743628.
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 67ms/step - accuracy: 0.6272 - loss: 1.0686 - val_accuracy: 0.5705 - val_loss: 1.2277 - learning_rate: 5.0000e-04
Epoch 18/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.5781 - loss: 1.1288 - val_accuracy: 0.5726 - val_loss: 1.2399 - learning_rate: 2.5000e-04
Epoch 19/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 5s 69ms/step - accuracy: 0.6154 - loss: 1.0354 - val_accuracy: 0.5789 - val_loss: 1.2044 - learning_rate: 2.5000e-04
Epoch 20/50
 1/59 ━━━━━━━━━━━━━━━━━━━━ 1s 23ms/step - accuracy: 0.5312 - loss: 1.0995
Epoch 20: ReduceLROnPlateau reducing learning rate to 0.0001250000059371814.
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.5312 - loss: 1.0995 - val_accuracy: 0.5768 - val_loss: 1.2034 - learning_rate: 2.5000e-04
Epoch 21/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 65ms/step - accuracy: 0.6561 - loss: 0.9442 - val_accuracy: 0.5916 - val_loss: 1.1596 - learning_rate: 1.2500e-04
Epoch 22/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.6406 - loss: 0.9899 - val_accuracy: 0.5979 - val_loss: 1.1599 - learning_rate: 1.2500e-04
Epoch 23/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 65ms/step - accuracy: 0.6566 - loss: 0.9547 - val_accuracy: 0.5916 - val_loss: 1.1803 - learning_rate: 1.2500e-04
Epoch 24/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.5156 - loss: 1.0095 - val_accuracy: 0.5895 - val_loss: 1.1789 - learning_rate: 1.2500e-04
Epoch 25/50
58/59 ━━━━━━━━━━━━━━━━━━━━ 0s 63ms/step - accuracy: 0.6631 - loss: 0.9359
Epoch 25: ReduceLROnPlateau reducing learning rate to 6.25000029685907e-05.
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 66ms/step - accuracy: 0.6632 - loss: 0.9360 - val_accuracy: 0.5874 - val_loss: 1.1686 - learning_rate: 1.2500e-04
Epoch 26/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7656 - loss: 0.7716 - val_accuracy: 0.5895 - val_loss: 1.1684 - learning_rate: 6.2500e-05
Epoch 27/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 65ms/step - accuracy: 0.6719 - loss: 0.9129 - val_accuracy: 0.5895 - val_loss: 1.1620 - learning_rate: 6.2500e-05
Epoch 28/50
 1/59 ━━━━━━━━━━━━━━━━━━━━ 1s 25ms/step - accuracy: 0.6875 - loss: 0.9620
Epoch 28: ReduceLROnPlateau reducing learning rate to 3.125000148429535e-05.
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.6875 - loss: 0.9620 - val_accuracy: 0.5895 - val_loss: 1.1618 - learning_rate: 6.2500e-05
Epoch 29/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 67ms/step - accuracy: 0.6785 - loss: 0.8878 - val_accuracy: 0.5874 - val_loss: 1.1576 - learning_rate: 3.1250e-05
Epoch 30/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7031 - loss: 0.8622 - val_accuracy: 0.5895 - val_loss: 1.1576 - learning_rate: 3.1250e-05
Epoch 31/50
57/59 ━━━━━━━━━━━━━━━━━━━━ 0s 62ms/step - accuracy: 0.6732 - loss: 0.8967
Epoch 31: ReduceLROnPlateau reducing learning rate to 1.5625000742147677e-05.
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 65ms/step - accuracy: 0.6733 - loss: 0.8965 - val_accuracy: 0.5958 - val_loss: 1.1575 - learning_rate: 3.1250e-05
Epoch 32/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7812 - loss: 0.6112 - val_accuracy: 0.5979 - val_loss: 1.1575 - learning_rate: 1.5625e-05
Epoch 33/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 65ms/step - accuracy: 0.6804 - loss: 0.8843 - val_accuracy: 0.6042 - val_loss: 1.1518 - learning_rate: 1.5625e-05
Epoch 34/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.5938 - loss: 1.0226 - val_accuracy: 0.6021 - val_loss: 1.1519 - learning_rate: 1.5625e-05
Epoch 35/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 67ms/step - accuracy: 0.6802 - loss: 0.8886 - val_accuracy: 0.5979 - val_loss: 1.1502 - learning_rate: 1.5625e-05
Epoch 36/50
 1/59 ━━━━━━━━━━━━━━━━━━━━ 1s 24ms/step - accuracy: 0.5625 - loss: 0.9768
Epoch 36: ReduceLROnPlateau reducing learning rate to 1e-05.
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.5625 - loss: 0.9768 - val_accuracy: 0.5979 - val_loss: 1.1502 - learning_rate: 1.5625e-05
Epoch 37/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 64ms/step - accuracy: 0.6942 - loss: 0.8711 - val_accuracy: 0.6042 - val_loss: 1.1506 - learning_rate: 1.0000e-05
Epoch 38/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.6406 - loss: 0.9024 - val_accuracy: 0.6042 - val_loss: 1.1506 - learning_rate: 1.0000e-05
Epoch 39/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 66ms/step - accuracy: 0.6766 - loss: 0.8890 - val_accuracy: 0.5958 - val_loss: 1.1495 - learning_rate: 1.0000e-05
Epoch 40/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.6875 - loss: 0.8324 - val_accuracy: 0.5958 - val_loss: 1.1495 - learning_rate: 1.0000e-05
Epoch 41/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 5s 68ms/step - accuracy: 0.7005 - loss: 0.8696 - val_accuracy: 0.6042 - val_loss: 1.1494 - learning_rate: 1.0000e-05
Epoch 42/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7031 - loss: 0.7986 - val_accuracy: 0.6042 - val_loss: 1.1494 - learning_rate: 1.0000e-05
Epoch 43/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 64ms/step - accuracy: 0.6869 - loss: 0.8796 - val_accuracy: 0.6042 - val_loss: 1.1487 - learning_rate: 1.0000e-05
Epoch 44/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7031 - loss: 0.8407 - val_accuracy: 0.6063 - val_loss: 1.1489 - learning_rate: 1.0000e-05
Epoch 45/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 68ms/step - accuracy: 0.6771 - loss: 0.8926 - val_accuracy: 0.5958 - val_loss: 1.1487 - learning_rate: 1.0000e-05
Epoch 46/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7656 - loss: 0.8693 - val_accuracy: 0.5958 - val_loss: 1.1487 - learning_rate: 1.0000e-05
Epoch 47/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 64ms/step - accuracy: 0.6882 - loss: 0.8637 - val_accuracy: 0.5937 - val_loss: 1.1496 - learning_rate: 1.0000e-05
Epoch 48/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7344 - loss: 0.8341 - val_accuracy: 0.5958 - val_loss: 1.1495 - learning_rate: 1.0000e-05
Epoch 49/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 4s 64ms/step - accuracy: 0.6942 - loss: 0.8595 - val_accuracy: 0.6000 - val_loss: 1.1484 - learning_rate: 1.0000e-05
Epoch 50/50
59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.5938 - loss: 1.0127 - val_accuracy: 0.6000 - val_loss: 1.1484 - learning_rate: 1.0000e-05

Observations:

  • The VGG16 model with frozen convolutional layers did not outperform model3 in accuracy.
  • Unfreezing part of the base and training it on our (smaller) image dataset might improve its performance (a sketch of this appears right after this list).
  • We will therefore adjust hyperparameters on the best-performing model so far: model3.
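
For reference, the idea in the second bullet could be tested by unfreezing only the top convolutional block (block5_*) of the VGG16 base and continuing training at a very low learning rate. A sketch, not run in this notebook, that reuses vgg16_base, model_vgg16, and the generators defined above:

from tensorflow.keras.optimizers import Adam

# Unfreeze only the block5_* layers; keep the earlier, more generic filters frozen
for layer in vgg16_base.layers:
    layer.trainable = layer.name.startswith('block5_')

# Recompile with a much smaller learning rate so the pre-trained weights are
# only nudged, then continue training for a few more epochs
model_vgg16.compile(optimizer=Adam(learning_rate=1e-5),
                    loss='categorical_crossentropy', metrics=['accuracy'])

history_vgg16_ft = model_vgg16.fit(
    train_datagen.flow(X_train_normalized, y_train_encoded, batch_size=64),
    epochs=10,
    validation_data=(X_val_normalized, y_val_encoded),
    callbacks=[learning_rate_reduction],
    verbose=1,
)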

Fine-tuning Model3 for improved accuracy¶

In [ ]:
# Fine-tuning Model3 for improved accuracy

# Data Augmentation
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)

train_datagen.fit(X_train_normalized)


# Adjust hyperparameters
epochs = 75  # Increased epochs
batch_size = 32  # Reduced batch size
learning_rate = 0.0001 # Reduced learning rate

opt = Adam(learning_rate=learning_rate) #Update the optimizer
model3.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])

# Retrain the model
history_3 = model3.fit(
    train_datagen.flow(X_train_normalized, y_train_encoded, batch_size=batch_size),
    epochs=epochs,
    validation_data=(X_val_normalized, y_val_encoded),
    verbose=1,
    callbacks=[learning_rate_reduction] # Assuming learning_rate_reduction is defined
)

# Evaluate the fine-tuned model
loss, accuracy = model3.evaluate(X_test_normalized, y_test_encoded, verbose=0)
print(f"Fine-tuned Model3 Test Loss: {loss:.4f}")
print(f"Fine-tuned Model3 Test Accuracy: {accuracy:.4f}")

# Further analysis and potential improvements
# 1. Learning rate scheduling: Implement a more sophisticated learning rate schedule (e.g., ReduceLROnPlateau)
# 2. Regularization: Experiment with L1 or L2 regularization or different dropout rates.
# 3. Network architecture:  Consider adding more layers or changing the filter sizes in the convolutional layers.
# 4. Data preprocessing: Investigate other normalization/standardization techniques.
Epoch 1/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 15s 66ms/step - accuracy: 0.5067 - loss: 1.5207 - val_accuracy: 0.7326 - val_loss: 0.9064 - learning_rate: 1.0000e-04
Epoch 2/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.5241 - loss: 1.4642 - val_accuracy: 0.7432 - val_loss: 0.9072 - learning_rate: 1.0000e-04
Epoch 3/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.5562 - loss: 1.3728 - val_accuracy: 0.6716 - val_loss: 1.0823 - learning_rate: 1.0000e-04
Epoch 4/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.5272 - loss: 1.4310 - val_accuracy: 0.7326 - val_loss: 0.8816 - learning_rate: 1.0000e-04
Epoch 5/75
117/119 ━━━━━━━━━━━━━━━━━━━━ 0s 35ms/step - accuracy: 0.5502 - loss: 1.3666
Epoch 5: ReduceLROnPlateau reducing learning rate to 4.999999873689376e-05.
119/119 ━━━━━━━━━━━━━━━━━━━━ 4s 35ms/step - accuracy: 0.5505 - loss: 1.3661 - val_accuracy: 0.6484 - val_loss: 1.0311 - learning_rate: 1.0000e-04
Epoch 6/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.5630 - loss: 1.3023 - val_accuracy: 0.7095 - val_loss: 0.9028 - learning_rate: 5.0000e-05
Epoch 7/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.5687 - loss: 1.2783 - val_accuracy: 0.7368 - val_loss: 0.8327 - learning_rate: 5.0000e-05
Epoch 8/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.5703 - loss: 1.2782 - val_accuracy: 0.7516 - val_loss: 0.8603 - learning_rate: 5.0000e-05
Epoch 9/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 37ms/step - accuracy: 0.5773 - loss: 1.3048 - val_accuracy: 0.7621 - val_loss: 0.7932 - learning_rate: 5.0000e-05
Epoch 10/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.5642 - loss: 1.2872 - val_accuracy: 0.7411 - val_loss: 0.8310 - learning_rate: 5.0000e-05
Epoch 11/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 37ms/step - accuracy: 0.5904 - loss: 1.2315 - val_accuracy: 0.7116 - val_loss: 0.8782 - learning_rate: 5.0000e-05
Epoch 12/75
117/119 ━━━━━━━━━━━━━━━━━━━━ 0s 37ms/step - accuracy: 0.5889 - loss: 1.2436
Epoch 12: ReduceLROnPlateau reducing learning rate to 2.499999936844688e-05.
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 37ms/step - accuracy: 0.5890 - loss: 1.2432 - val_accuracy: 0.7389 - val_loss: 0.8718 - learning_rate: 5.0000e-05
Epoch 13/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.6107 - loss: 1.2026 - val_accuracy: 0.7600 - val_loss: 0.7868 - learning_rate: 2.5000e-05
Epoch 14/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.5821 - loss: 1.2490 - val_accuracy: 0.7516 - val_loss: 0.7899 - learning_rate: 2.5000e-05
Epoch 15/75
116/119 ━━━━━━━━━━━━━━━━━━━━ 0s 36ms/step - accuracy: 0.6097 - loss: 1.2077
Epoch 15: ReduceLROnPlateau reducing learning rate to 1.249999968422344e-05.
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.6096 - loss: 1.2077 - val_accuracy: 0.7453 - val_loss: 0.8010 - learning_rate: 2.5000e-05
Epoch 16/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.5935 - loss: 1.2079 - val_accuracy: 0.7453 - val_loss: 0.7937 - learning_rate: 1.2500e-05
Epoch 17/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 37ms/step - accuracy: 0.5981 - loss: 1.2007 - val_accuracy: 0.7579 - val_loss: 0.7725 - learning_rate: 1.2500e-05
Epoch 18/75
117/119 ━━━━━━━━━━━━━━━━━━━━ 0s 35ms/step - accuracy: 0.6141 - loss: 1.1676
Epoch 18: ReduceLROnPlateau reducing learning rate to 1e-05.
119/119 ━━━━━━━━━━━━━━━━━━━━ 4s 35ms/step - accuracy: 0.6139 - loss: 1.1679 - val_accuracy: 0.7411 - val_loss: 0.7853 - learning_rate: 1.2500e-05
Epoch 19/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.6138 - loss: 1.1993 - val_accuracy: 0.7389 - val_loss: 0.7916 - learning_rate: 1.0000e-05
Epoch 20/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.6275 - loss: 1.1784 - val_accuracy: 0.7432 - val_loss: 0.8040 - learning_rate: 1.0000e-05
Epoch 21/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.6302 - loss: 1.1429 - val_accuracy: 0.7642 - val_loss: 0.7699 - learning_rate: 1.0000e-05
Epoch 22/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 37ms/step - accuracy: 0.6220 - loss: 1.1944 - val_accuracy: 0.7600 - val_loss: 0.7657 - learning_rate: 1.0000e-05
Epoch 23/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.5983 - loss: 1.1948 - val_accuracy: 0.7600 - val_loss: 0.7646 - learning_rate: 1.0000e-05
Epoch 24/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.6178 - loss: 1.1493 - val_accuracy: 0.7642 - val_loss: 0.7655 - learning_rate: 1.0000e-05
Epoch 25/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 37ms/step - accuracy: 0.5989 - loss: 1.1987 - val_accuracy: 0.7516 - val_loss: 0.7755 - learning_rate: 1.0000e-05
Epoch 26/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 4s 35ms/step - accuracy: 0.6075 - loss: 1.1831 - val_accuracy: 0.7495 - val_loss: 0.7694 - learning_rate: 1.0000e-05
Epoch 27/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.6185 - loss: 1.1562 - val_accuracy: 0.7558 - val_loss: 0.7692 - learning_rate: 1.0000e-05
Epoch 28/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 37ms/step - accuracy: 0.6206 - loss: 1.1566 - val_accuracy: 0.7516 - val_loss: 0.7624 - learning_rate: 1.0000e-05
Epoch 29/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.6064 - loss: 1.1607 - val_accuracy: 0.7600 - val_loss: 0.7617 - learning_rate: 1.0000e-05
Epoch 30/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 37ms/step - accuracy: 0.6125 - loss: 1.1546 - val_accuracy: 0.7600 - val_loss: 0.7610 - learning_rate: 1.0000e-05
Epoch 31/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.6145 - loss: 1.1760 - val_accuracy: 0.7516 - val_loss: 0.7741 - learning_rate: 1.0000e-05
Epoch 32/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 4s 35ms/step - accuracy: 0.6252 - loss: 1.1619 - val_accuracy: 0.7663 - val_loss: 0.7614 - learning_rate: 1.0000e-05
Epoch 33/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 37ms/step - accuracy: 0.6145 - loss: 1.1913 - val_accuracy: 0.7495 - val_loss: 0.7736 - learning_rate: 1.0000e-05
Epoch 34/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.6291 - loss: 1.1273 - val_accuracy: 0.7474 - val_loss: 0.7740 - learning_rate: 1.0000e-05
Epoch 35/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.6097 - loss: 1.2096 - val_accuracy: 0.7474 - val_loss: 0.7738 - learning_rate: 1.0000e-05
Epoch 36/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.6269 - loss: 1.1638 - val_accuracy: 0.7537 - val_loss: 0.7720 - learning_rate: 1.0000e-05
Epoch 37/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.6068 - loss: 1.1418 - val_accuracy: 0.7516 - val_loss: 0.7568 - learning_rate: 1.0000e-05
Epoch 38/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 37ms/step - accuracy: 0.6261 - loss: 1.1302 - val_accuracy: 0.7537 - val_loss: 0.7643 - learning_rate: 1.0000e-05
Epoch 39/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.6463 - loss: 1.1428 - val_accuracy: 0.7642 - val_loss: 0.7602 - learning_rate: 1.0000e-05
Epoch 40/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 37ms/step - accuracy: 0.6280 - loss: 1.1070 - val_accuracy: 0.7663 - val_loss: 0.7418 - learning_rate: 1.0000e-05
Epoch 41/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 39ms/step - accuracy: 0.6199 - loss: 1.1567 - val_accuracy: 0.7621 - val_loss: 0.7695 - learning_rate: 1.0000e-05
Epoch 42/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.6380 - loss: 1.1498 - val_accuracy: 0.7495 - val_loss: 0.7682 - learning_rate: 1.0000e-05
Epoch 43/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.6180 - loss: 1.1674 - val_accuracy: 0.7642 - val_loss: 0.7641 - learning_rate: 1.0000e-05
Epoch 44/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 37ms/step - accuracy: 0.6435 - loss: 1.0651 - val_accuracy: 0.7516 - val_loss: 0.7642 - learning_rate: 1.0000e-05
Epoch 45/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.6266 - loss: 1.1431 - val_accuracy: 0.7579 - val_loss: 0.7528 - learning_rate: 1.0000e-05
Epoch 46/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 37ms/step - accuracy: 0.6311 - loss: 1.0980 - val_accuracy: 0.7537 - val_loss: 0.7682 - learning_rate: 1.0000e-05
Epoch 47/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.6353 - loss: 1.1379 - val_accuracy: 0.7453 - val_loss: 0.7798 - learning_rate: 1.0000e-05
Epoch 48/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 37ms/step - accuracy: 0.6185 - loss: 1.1474 - val_accuracy: 0.7474 - val_loss: 0.7538 - learning_rate: 1.0000e-05
Epoch 49/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 38ms/step - accuracy: 0.6128 - loss: 1.1896 - val_accuracy: 0.7600 - val_loss: 0.7634 - learning_rate: 1.0000e-05
Epoch 50/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.6373 - loss: 1.1194 - val_accuracy: 0.7684 - val_loss: 0.7538 - learning_rate: 1.0000e-05
Epoch 51/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.6356 - loss: 1.1222 - val_accuracy: 0.7768 - val_loss: 0.7284 - learning_rate: 1.0000e-05
Epoch 52/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.6324 - loss: 1.1090 - val_accuracy: 0.7663 - val_loss: 0.7387 - learning_rate: 1.0000e-05
Epoch 53/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.6124 - loss: 1.1603 - val_accuracy: 0.7726 - val_loss: 0.7432 - learning_rate: 1.0000e-05
Epoch 54/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 39ms/step - accuracy: 0.6421 - loss: 1.1011 - val_accuracy: 0.7621 - val_loss: 0.7415 - learning_rate: 1.0000e-05
Epoch 55/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 38ms/step - accuracy: 0.6213 - loss: 1.1348 - val_accuracy: 0.7642 - val_loss: 0.7383 - learning_rate: 1.0000e-05
Epoch 56/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 38ms/step - accuracy: 0.6329 - loss: 1.1105 - val_accuracy: 0.7621 - val_loss: 0.7526 - learning_rate: 1.0000e-05
Epoch 57/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 38ms/step - accuracy: 0.6338 - loss: 1.0994 - val_accuracy: 0.7621 - val_loss: 0.7560 - learning_rate: 1.0000e-05
Epoch 58/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 37ms/step - accuracy: 0.6427 - loss: 1.1130 - val_accuracy: 0.7579 - val_loss: 0.7568 - learning_rate: 1.0000e-05
Epoch 59/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 38ms/step - accuracy: 0.6310 - loss: 1.1270 - val_accuracy: 0.7537 - val_loss: 0.7715 - learning_rate: 1.0000e-05
Epoch 60/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 38ms/step - accuracy: 0.6402 - loss: 1.0943 - val_accuracy: 0.7600 - val_loss: 0.7528 - learning_rate: 1.0000e-05
Epoch 61/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 38ms/step - accuracy: 0.6521 - loss: 1.0892 - val_accuracy: 0.7600 - val_loss: 0.7550 - learning_rate: 1.0000e-05
Epoch 62/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 38ms/step - accuracy: 0.6451 - loss: 1.1049 - val_accuracy: 0.7600 - val_loss: 0.7640 - learning_rate: 1.0000e-05
Epoch 63/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 37ms/step - accuracy: 0.6460 - loss: 1.0849 - val_accuracy: 0.7684 - val_loss: 0.7505 - learning_rate: 1.0000e-05
Epoch 64/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 37ms/step - accuracy: 0.6400 - loss: 1.1231 - val_accuracy: 0.7789 - val_loss: 0.7333 - learning_rate: 1.0000e-05
Epoch 65/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 38ms/step - accuracy: 0.6193 - loss: 1.1679 - val_accuracy: 0.7789 - val_loss: 0.7218 - learning_rate: 1.0000e-05
Epoch 66/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 37ms/step - accuracy: 0.6428 - loss: 1.1004 - val_accuracy: 0.7684 - val_loss: 0.7340 - learning_rate: 1.0000e-05
Epoch 67/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 38ms/step - accuracy: 0.6383 - loss: 1.1244 - val_accuracy: 0.7726 - val_loss: 0.7244 - learning_rate: 1.0000e-05
Epoch 68/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 36ms/step - accuracy: 0.6405 - loss: 1.0659 - val_accuracy: 0.7642 - val_loss: 0.7394 - learning_rate: 1.0000e-05
Epoch 69/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 37ms/step - accuracy: 0.6380 - loss: 1.1054 - val_accuracy: 0.7789 - val_loss: 0.7166 - learning_rate: 1.0000e-05
Epoch 70/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 39ms/step - accuracy: 0.6352 - loss: 1.0991 - val_accuracy: 0.7768 - val_loss: 0.7322 - learning_rate: 1.0000e-05
Epoch 71/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 37ms/step - accuracy: 0.6372 - loss: 1.1098 - val_accuracy: 0.7768 - val_loss: 0.7361 - learning_rate: 1.0000e-05
Epoch 72/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 37ms/step - accuracy: 0.6334 - loss: 1.1248 - val_accuracy: 0.7726 - val_loss: 0.7295 - learning_rate: 1.0000e-05
Epoch 73/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 38ms/step - accuracy: 0.6522 - loss: 1.0630 - val_accuracy: 0.7705 - val_loss: 0.7243 - learning_rate: 1.0000e-05
Epoch 74/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 37ms/step - accuracy: 0.6448 - loss: 1.0540 - val_accuracy: 0.7705 - val_loss: 0.7177 - learning_rate: 1.0000e-05
Epoch 75/75
119/119 ━━━━━━━━━━━━━━━━━━━━ 5s 39ms/step - accuracy: 0.6452 - loss: 1.0847 - val_accuracy: 0.7726 - val_loss: 0.7272 - learning_rate: 1.0000e-05
Fine-tuned Model3 Test Loss: 0.7339
Fine-tuned Model3 Test Accuracy: 0.7684

Observations:

  • Fine-tuning brought only a modest improvement: model3's test accuracy rose from about 0.716 to 0.768 and its test loss fell from about 0.861 to 0.734.
  • Even so, this was a great learning exercise!

Actionable Insights and Business Recommendations¶

The overall objective of building Convolutional Neural Network Models for this project was:

  • To identify plant seedlings faster by the use of AI and Deep Learning.

SUMMARY & RECOMMENDATIONS:

  • The models' test accuracies ranged from roughly 63% to 77%, with the best results above 70%.
  • I explored 4 models and included a 5th based on VGG16.
  • Some models fared better than others, but their predictions were fairly close to one another.
  • The VGG16 model was included as EXTRA CREDIT :-)
  • One model (model3) was selected for its best overall accuracy and simpler topology (fewer layers), and it was further optimized and fine-tuned.
  • Even the "best" model still misclassified roughly 25-30% of the test images.
  • Classifying all 4,750 images into one of the 12 categories took only about 60-90 seconds, with most of the processing cost incurred while fitting the model.
  • However, when I selected 12 random test plants and compared the predictions with the true labels, all 12 were misidentified; a second attempt gave the same result. This leads me to believe we must improve model accuracy, either by adjusting the layers and number of neurons or, more importantly, by augmenting the samples of the minority classes to achieve better class balance.
  • A significant increase in the number of samples would help the models identify and predict the correct classes.
  • Additional strategies for improving image-model prediction accuracy:

    1. Data Enhancements (a class-weight sketch follows this list):
       • Add Samples: more labeled images improve generalization.
       • Data Augmentation: apply transformations to expand the dataset.
       • Balance Classes: use oversampling, undersampling, or synthetic data.
    2. Model Architecture Tweaks:
       • Add Neurons/Layers: increases capacity, but watch for overfitting.
       • Use Pretrained Models: fine-tune models like ResNet for stronger results.
       • Ensemble Models: combine models for more stable predictions.
    3. Training and Regularization:
       • Tune Hyperparameters: adjust learning rates, batch sizes, optimizers.
       • Regularization: add dropout and batch normalization to avoid overfitting.
       • Increase Epochs: train longer with early stopping to avoid overfitting.
    4. Advanced Techniques:
       • Add Attention Layers: focus on important image regions.
       • Fine-Tune Pretrained Layers: freeze lower layers, adjust top layers.
       • Self-Supervised Learning: pre-train with unlabeled data, then fine-tune.
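
As one concrete example of the class-balancing idea in item 1, per-class weights can be derived from the training labels and passed to fit. This is a hedged sketch (class weights are an alternative to oversampling and are not used elsewhere in this notebook); it assumes y_train_encoded is the one-hot label matrix used throughout:

from sklearn.utils.class_weight import compute_class_weight
import numpy as np

# Derive per-class weights from the one-hot training labels so that
# under-represented species contribute proportionally more to the loss
y_train_indices = np.argmax(y_train_encoded, axis=1)   # one-hot -> class index
weights = compute_class_weight(class_weight='balanced',
                               classes=np.unique(y_train_indices),
                               y=y_train_indices)
class_weights = dict(enumerate(weights))

model3.fit(
    train_datagen.flow(X_train_normalized, y_train_encoded, batch_size=32),
    epochs=20,
    validation_data=(X_val_normalized, y_val_encoded),
    class_weight=class_weights,
    verbose=1,
)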

                                                              END