Bank Churn Prediction

Neural Network model design to predict customer churn, optimizing for accuracy and computational efficiency.

Executive Summary --------------------------------------------------------------------- Section

Is it possible to predict which customers are highly likely to leave the bank within 6 months? We will design a ANN to answer the question.

Busines Needs Overview & Problem Statement -------------------------- Section

Context & Objective¶

BUSINESS:

  • Businesses like banks which provide service have to worry about problem of 'Customer Churn' i.e. customers leaving and joining another service provider. It is important to understand which aspects of the service influence a customer's decision in this regard. Management can concentrate efforts on improvement of service, keeping in mind these priorities.

ENGINEERING:

  • You as a Data scientist with the bank need to build a neural network based classifier that can determine whether a customer will leave the bank or not in the next 6 months.

Engineering ANN Design Strategy & General Approach ---------------------------------- Section

Art Zaragoza ANN 2024-30-28.jpg

Data Dictionary¶

#FeatureDescription
1CustomerIdUnique ID which is assigned to each customer.
2SurnameLast name of the customer.
3CreditScoreIt defines the credit history of the customer.
4GeographyA customer’s location.
5GenderIt defines the Gender of the customer.
6AgeAge of the customer
7TenureNumber of years for which the customer has been with the bank.
8NumOfProductsUumber of products that a customer has purchased through the bank.
9BalanceAccount balance
10HasCrCardIt is a categorical variable which decides whether the customer has credit card or not.
11EstimatedSalaryUEstimated salary
12IsActiveMemberIs a categorical variable where custmomer is using bank products regularly, making transactions etc )
13ExitedCustomer left the bank within six month. 0 = No or 1 = Yes ( Customer left the bank )

Importing necessary libraries¶

In [2]:
# Installing the libraries with the specified version.
!pip install tensorflow==2.15.0 scikit-learn==1.2.2 seaborn==0.13.1 matplotlib==3.7.1 numpy==1.25.2 pandas==2.0.3 imbalanced-learn==0.10.1 -q --user

Note: After running the above cell, please restart the notebook kernel/runtime (depending on whether you're using Jupyter Notebook or Google Colab) and then sequentially run all cells from the one below.

In [1]:
# Libraries for reading and manipulating data
import pandas as pd
import numpy as np

# Libaries for data visualization
import matplotlib.pyplot as plt
import seaborn as sns

# Library to split data
from sklearn.model_selection import train_test_split

# Library to standardize the data
from sklearn.preprocessing import StandardScaler, LabelEncoder

# Import functions to build models
import tensorflow as tf
from tensorflow import keras
from keras import backend
from keras.models import Sequential
from keras.layers import Dense, Dropout

# Importing SMOTE
from imblearn.over_sampling import SMOTE

# Importing metrics
from sklearn.metrics import confusion_matrix,roc_curve,classification_report,recall_score

import random

# Library to avoid warnings
import warnings
warnings.filterwarnings("ignore")
In [2]:
# Verifiyng installed versions. - EXTRA CREDIT
print('\nLibraries imported for Exploratory Data Analysis \n(EDA - Univariate and Multivariate):\n')
print("For Numerical Operations and Statistical Analysis of Data:")
print(" . Pandas version:", pd.__version__)
print(" . NumPy version:", np.__version__)
print("\nFor Visualization of Data: ")
print(" . Matplotlib version:", plt.rcParams['figure.figsize'])
print(" . Seaborn version:", sns.__version__)
print("\nFor Machine Learning Models:", tf.__version__)
print(" . Tensorflow version:", tf.__version__)
print(" . Keras version:", keras.__version__)
Libraries imported for Exploratory Data Analysis 
(EDA - Univariate and Multivariate):

For Numerical Operations and Statistical Analysis of Data:
 . Pandas version: 2.2.2
 . NumPy version: 1.26.4

For Visualization of Data: 
 . Matplotlib version: [6.4, 4.8]
 . Seaborn version: 0.13.2

For Machine Learning Models: 2.17.0
 . Tensorflow version: 2.17.0
 . Keras version: 3.4.1
In [3]:
# GPU Processing Resources
gpu_info = !nvidia-smi
gpu_info = '\n'.join(gpu_info)
if gpu_info.find('failed') >= 0:
  print('Not connected to a GPU')
else:
  print(gpu_info)
Sat Oct 12 20:13:17 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   45C    P8               9W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

Loading the dataset¶

In [4]:
from google.colab import drive
drive.mount('/content/drive') # Mount drive
Mounted at /content/drive
In [5]:
ds = pd.read_csv("Churn.csv")  # Load raw dataset

Data Overview¶

View the first and last 5 rows of the dataset. - - Data Overview¶

In [6]:
ds.head() # Top 5 rows of the dataset
Out[6]:
RowNumber CustomerId Surname CreditScore Geography Gender Age Tenure Balance NumOfProducts HasCrCard IsActiveMember EstimatedSalary Exited
0 1 15634602 Hargrave 619 France Female 42 2 0.00 1 1 1 101348.88 1
1 2 15647311 Hill 608 Spain Female 41 1 83807.86 1 0 1 112542.58 0
2 3 15619304 Onio 502 France Female 42 8 159660.80 3 1 0 113931.57 1
3 4 15701354 Boni 699 France Female 39 1 0.00 2 0 0 93826.63 0
4 5 15737888 Mitchell 850 Spain Female 43 2 125510.82 1 1 1 79084.10 0
In [7]:
ds.tail() # Last 5 rows of the dataset
Out[7]:
RowNumber CustomerId Surname CreditScore Geography Gender Age Tenure Balance NumOfProducts HasCrCard IsActiveMember EstimatedSalary Exited
9995 9996 15606229 Obijiaku 771 France Male 39 5 0.00 2 1 0 96270.64 0
9996 9997 15569892 Johnstone 516 France Male 35 10 57369.61 1 1 1 101699.77 0
9997 9998 15584532 Liu 709 France Female 36 7 0.00 1 0 1 42085.58 1
9998 9999 15682355 Sabbatini 772 Germany Male 42 3 75075.31 2 1 0 92888.52 1
9999 10000 15628319 Walker 792 France Female 28 4 130142.79 1 1 0 38190.78 0

View a random set of rows (seeded = reproducible). -- Data Overview¶

In [8]:
ds.sample(n=7, random_state=42)
Out[8]:
RowNumber CustomerId Surname CreditScore Geography Gender Age Tenure Balance NumOfProducts HasCrCard IsActiveMember EstimatedSalary Exited
6252 6253 15687492 Anderson 596 Germany Male 32 3 96709.07 2 0 0 41788.37 0
4684 4685 15736963 Herring 623 France Male 43 1 0.00 2 1 1 146379.30 0
1731 1732 15721730 Amechi 601 Spain Female 44 4 0.00 2 1 0 58561.31 0
4742 4743 15762134 Liang 506 Germany Male 59 8 119152.10 2 1 1 170679.74 0
4521 4522 15648898 Chuang 560 Spain Female 27 7 124995.98 1 1 1 114669.79 0
6340 6341 15659064 Salas 790 Spain Male 37 8 0.00 2 1 1 149418.41 0
576 577 15761986 Obialo 439 Spain Female 32 3 138901.61 1 1 0 75685.97 0

Shape of the dataset check. - - Data Overview¶

In [9]:
ds.shape # Number of rows and columns in the training data
Out[9]:
(10000, 14)

Observations:

  • There are 10k rows with 14 columns in the Churn.csv.

Data types check of the columns for the dataset. - - Data Overview¶

In [10]:
ds.info() # Data types
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 14 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   RowNumber        10000 non-null  int64  
 1   CustomerId       10000 non-null  int64  
 2   Surname          10000 non-null  object 
 3   CreditScore      10000 non-null  int64  
 4   Geography        10000 non-null  object 
 5   Gender           10000 non-null  object 
 6   Age              10000 non-null  int64  
 7   Tenure           10000 non-null  int64  
 8   Balance          10000 non-null  float64
 9   NumOfProducts    10000 non-null  int64  
 10  HasCrCard        10000 non-null  int64  
 11  IsActiveMember   10000 non-null  int64  
 12  EstimatedSalary  10000 non-null  float64
 13  Exited           10000 non-null  int64  
dtypes: float64(2), int64(9), object(3)
memory usage: 1.1+ MB

Observations:

  • There are 10k observations and 14 columns.
  • 3 Columns are object datatype.
  • 11 columns are numerical.
  • There appears to be no missing data in all the columns as they match the size of the dataset.

Statistical Summary Check. - - Data Overview¶

In [11]:
# EXTRA CREDIT

# Note: Statistical information is obtained for numerical datatypes only, as shown below. There are only 11: (2) float64, (9) int64.
# Using a lamda function for rounding off to 2 digits the minimum, the average, and the maximum.
# Note: RowNumber & CustomerId statistics are not relevant and do not make sense other than in combination with other variables.

ds.describe().applymap(lambda x: round(x, 2)).T
Out[11]:
count mean std min 25% 50% 75% max
RowNumber 10000.0 5000.50 2886.90 1.00 2500.75 5000.50 7500.25 10000.00
CustomerId 10000.0 15690940.57 71936.19 15565701.00 15628528.25 15690738.00 15753233.75 15815690.00
CreditScore 10000.0 650.53 96.65 350.00 584.00 652.00 718.00 850.00
Age 10000.0 38.92 10.49 18.00 32.00 37.00 44.00 92.00
Tenure 10000.0 5.01 2.89 0.00 3.00 5.00 7.00 10.00
Balance 10000.0 76485.89 62397.41 0.00 0.00 97198.54 127644.24 250898.09
NumOfProducts 10000.0 1.53 0.58 1.00 1.00 1.00 2.00 4.00
HasCrCard 10000.0 0.71 0.46 0.00 0.00 1.00 1.00 1.00
IsActiveMember 10000.0 0.52 0.50 0.00 0.00 1.00 1.00 1.00
EstimatedSalary 10000.0 100090.24 57510.49 11.58 51002.11 100193.91 149388.25 199992.48
Exited 10000.0 0.20 0.40 0.00 0.00 0.00 0.00 1.00

Observations:

  • Statistical information is obtained for numerical datatypes only, as shown below. There are only 11: (2) float64, (9) int64.
  • Using a lamda function for rounding off to 2 digits the minimum, the average, and the maximum.
  • RowNumber, CustomerId have numerical values but their statistics are not relevant and do not make sense other than in combination with other variables.
  • The average Age of all Customers is 39 years.
  • 75% of Customers have been with the bank for about 7 years with and average of 5 years.
  • The average Customer balance is 76.5k.
  • 70% of all Customers have Credit Cards.
  • The average salary of Customers is 100k.

Missing Values Check. - - Data Overview¶

In [12]:
ds.isnull().sum() #  Checking for missing values in the dataset.
Out[12]:
0
RowNumber 0
CustomerId 0
Surname 0
CreditScore 0
Geography 0
Gender 0
Age 0
Tenure 0
Balance 0
NumOfProducts 0
HasCrCard 0
IsActiveMember 0
EstimatedSalary 0
Exited 0

Observations:

  • There appears to be no missing data in all the columns as they match the size of the dataset. (Just like with datatype output info above.)
  • However, we must still check for `NaN` and `Not given` wherever these values are present, potentially affecting our calculations.
In [14]:
# EXTRA CREDIT: FUNCTIONS AND HANDLING FOR 'NaN' and 'Not given'
# Check for NaN values in the dataset
nan_values = ds.isna().sum()
print("\n'NaN' Values in each column:\n", nan_values)

# Check for "Not given" (as a string) in the dataset
not_given_values = (ds == "Not given").sum()
print("\n'Not given' values in each column:\n", not_given_values)

# Replace "Not given" with NaN, so we can handle all missing data uniformly
# ds.replace("Not given", pd.NA, inplace=True)

# Check again after replacing 'Not given' with NaN
#print("\nAfter replacing 'Not given' with NaN, NaN values in each column:\n", ds.isna().sum())

# Optionally, you can handle the NaN values by dropping them or filling them with a default value
# For example, filling missing values with the median of each column
# ds.fillna(ds.median(), inplace=True)

# Print the dataset to ensure missing values are handled
# print("\nDataset after handling missing values:\n", ds)
'NaN' Values in each column:
 RowNumber          0
CustomerId         0
Surname            0
CreditScore        0
Geography          0
Gender             0
Age                0
Tenure             0
Balance            0
NumOfProducts      0
HasCrCard          0
IsActiveMember     0
EstimatedSalary    0
Exited             0
dtype: int64

'Not given' values in each column:
 RowNumber          0
CustomerId         0
Surname            0
CreditScore        0
Geography          0
Gender             0
Age                0
Tenure             0
Balance            0
NumOfProducts      0
HasCrCard          0
IsActiveMember     0
EstimatedSalary    0
Exited             0
dtype: int64

Observations:

  • There are no NaN nor Not Given entries in our dataset.

Duplicates check. - - Data Overview¶

In [15]:
ds.duplicated().sum() # Check for duplicate values in the data
Out[15]:
0

Unique values check for each of the column. - - Data Overview¶

In [16]:
ds.nunique() # Number of Unique Values (Groups)
Out[16]:
0
RowNumber 10000
CustomerId 10000
Surname 2932
CreditScore 460
Geography 3
Gender 2
Age 70
Tenure 11
Balance 6382
NumOfProducts 4
HasCrCard 2
IsActiveMember 2
EstimatedSalary 9999
Exited 2

Observations:

  • These two numeric columns and this string column do not provide insights and could be candidates for dropping: RowNumber, CustomerId and Surname.
  • The following columns have large variations of values: CreditScore, Age, Tenure, Balance and EstimatedSalary.

Distribution of Targeted column (Exited = 1) - - Data Overview¶

In [17]:
ds["Exited"].value_counts(1)
Out[17]:
proportion
Exited
0 0.7963
1 0.2037

Observations:

  • Approximately 20% of all Customers have exited.
  • 20% is unacceptably high for any business.
In [18]:
ds['Geography'].value_counts()# Data frequency categories.
Out[18]:
count
Geography
France 5014
Germany 2509
Spain 2477

Observations:

  • 50% of Customers are from France and the remaining half are equally divided between Germany and Spain.

Recommendations:

  • Right off the bat, bank offerings must include services in three languages: French, German and Spanish.
  • There are twice as many French customers, therefore the marketing and customer support and customer retention budgets should reflect the same.
  • Bots or AI assistants should provide culture and language sensitivity to reflect customer's distribution.
  • Important: EU AI Act applies to all 3 countries and the bank must follow these regulations with strict compliance.
In [19]:
ds['Gender'].value_counts() # Data frequency categories.
Out[19]:
count
Gender
Male 5457
Female 4543

Observations:

  • The number of Males and Females are about the same, however there are about 1,000 more males than females or 10% more of all Customers are males.

Recommendations:

  • Great opportunity to increase adoption of services by targeting the female population to enroll in the bank.
  • It is crucial that the marketing advertising and customer services are gender sensitive to ensure respect and recruitment towards both genders, but more so towards females.
In [20]:
ds['Tenure'].value_counts() # Data frequency categories.
Out[20]:
count
Tenure
2 1048
1 1035
7 1028
8 1025
5 1012
3 1009
4 989
9 984
6 967
10 490
0 413

In [22]:
# Set the figure size and plot the bar plot
plt.figure(figsize=(4,3))
sns.countplot(x='Tenure', data=ds, palette='Blues')

# Adding labels and title
plt.title('Tenure Distribution', fontsize=16)
plt.xlabel('Tenure', fontsize=12)
plt.ylabel('Frequency', fontsize=12)

# Show the plot
plt.show()

Observations:

  • Customers are mostly distributed between 1 - 9 years of Tenure.
  • For each year of Tenure, the number of Customers floats around 900 - 1100.
In [ ]:
ds['NumOfProducts'].value_counts() # Data frequency categories.
Out[ ]:
count
NumOfProducts
1 5084
2 4590
3 266
4 60

Observations:

  • The majority of Customers use 1 or 2 bank products.
  • About half of Customers use one product and less than half use two.
  • An extremely small number uses 3 (266 Customers) and only 60 uses 4 services.
In [26]:
import matplotlib.pyplot as plt

# Get the frequency of each category in 'NumOfProducts'
product_counts = ds['NumOfProducts'].value_counts()

# Plot pie chart
plt.figure(figsize=(2, 2))  # Set the figure size
plt.pie(product_counts, labels=product_counts.index, autopct='%1.1f%%', startangle=90, colors=['lightblue', 'lightgreen', 'orange', 'lightcoral'])

# Adding title
plt.title('Percentage of Customers by Number of Products', fontsize=14)

# Show the pie chart
plt.show()

Observations:

  • 1/2 of all Customers are using one of the bank's offerings.
  • Less than the remaining half are using 2 offerings.
  • Very few use 3 and a handful uses all 4 offerings.

Recommendations:

  • Those that have one of the offerings are best target candidates to get 2 offerings due to familiarity and experience.
  • Customer Satisfaction research should be considered to prevent nearly 1/2 or 46% of all customers from dropping one of the 2 offerings. Retention is crucial for revenue stability.
In [ ]:
ds['HasCrCard'].value_counts() # Data frequency categories.
Out[ ]:
count
HasCrCard
1 7055
0 2945

Observations:

  • About a third of all Customers do not have a bank's credit card.
  • More precisely 2,945 out of 7,055 Customers are not using an important bank offerings.

Recommendations:

  • Target Customers without credit cards to increase conversion.
  • Increase incentives for Customers without credit card.
  • Increase Customer's credit card retention by customer satisfaction research.
  • Design offers with attractive balance transfers to those Customers with other institutions credit cards.
  • Determine which Customers with credit cards are at risk of loosing their credit card services to ensure they remain revenue-worthy. Ensure is a win-win for both the customer and the bank.
In [ ]:
ds['IsActiveMember'].value_counts() # Data frequency categories.
Out[ ]:
count
IsActiveMember
1 5151
0 4849

Observations:

  • Only about half of Customers are actively conducting transactions or using bank products.
  • This is catastrophic. Deeper research is needed here to understand the root causes.

Recommendations:

  • Provide a budget to research contributing factors for the lack of resonable transactions.
  • Is there an "average" baseline of ideal or optimum Customers average transactions meetrics? If not, one MUST be establish ASAP.
  • Once this baseline is established, it can be used as a treshold metric and an alarm or "signal" should be triggered for all Customers NOT crossing it.
In [ ]:
ds['Exited'].value_counts()  # Data frequency categories.
Out[ ]:
count
Exited
0 7963
1 2037

Observations:

  • A fairly large number of Customers are no longer active customers within the last 6 months.
  • More precisely 2037 out of 7963 (26.6%) or a quarter of all customers leave. This is not acceptable.

Recommendations:

  • A predictive NN model should be designed to anticipate Customer trends and provide opportunities to interject before is too late. (This project. :-) )
  • Daily runs in production should provide 'alerts' whenever patterns are detected or treshold metrics are crossed with production or unseen data.
  • The predictive model should be fine-tuned regularly to incorporate additional data values accounting for the preventative-actions and measure both their exit-recurrence effectiveness to the business and the efficiency of the model predictions.
  • Transparency of Customer's Privacy protections should be auditable and test-demonstrated frequently as this model is predictive and could cause incorrect, even catastrophic communications to Customers and Regulators when biased.
In [ ]:
(ds['Balance'] == 0).sum() # Number of Customers with 0 Balance in their accounts.
Out[ ]:
3617

Observations:

  • A fairly large number of Customers are keeping a 0 balance in their accounts.
  • More precisely 3,617 out of 10,000 (36.2%) or a third of all customers. Recommendations:
  • More research is required in this area. A budget for research should be added to understand contributing factors and corrective actions, RCCAs (Root Cause Corrective Actions).
  • Incentives should be studied to provide value-add to these customers as a win-win for them and for the bank.
In [28]:
ds = ds.drop(['RowNumber', 'CustomerId', 'Surname'], axis=1) # RowNumber, CustomerId and Surname are all unique therefore these columns can be dropped

Recommendation:

  • There's no need to keep RowNumber, CustomerID and Surname data for statistical analysis nor for our purposes. These can be dropped.
In [29]:
print("\nRowNumber, CustomerID and Surname data were dropped and Gender and Geography are Object datatypes.", "\nRemaining numeric data:\n")
ds.describe().applymap(lambda x: round(x, 2)).T # Gender & Geography not listed as they are Object datatype as the remaining columns not listed below.
RowNumber, CustomerID and Surname data were dropped and Gender and Geography are Object datatypes. 
Remaining numeric data:

Out[29]:
count mean std min 25% 50% 75% max
CreditScore 10000.0 650.53 96.65 350.00 584.00 652.00 718.00 850.00
Age 10000.0 38.92 10.49 18.00 32.00 37.00 44.00 92.00
Tenure 10000.0 5.01 2.89 0.00 3.00 5.00 7.00 10.00
Balance 10000.0 76485.89 62397.41 0.00 0.00 97198.54 127644.24 250898.09
NumOfProducts 10000.0 1.53 0.58 1.00 1.00 1.00 2.00 4.00
HasCrCard 10000.0 0.71 0.46 0.00 0.00 1.00 1.00 1.00
IsActiveMember 10000.0 0.52 0.50 0.00 0.00 1.00 1.00 1.00
EstimatedSalary 10000.0 100090.24 57510.49 11.58 51002.11 100193.91 149388.25 199992.48
Exited 10000.0 0.20 0.40 0.00 0.00 0.00 0.00 1.00

Observations:

  • Gender & Geography not listed as they are Object datatype.
  • Ave Credit Score of Customers is 650.
  • Customers range in age from 18 - 92 years of age.
  • Ave Tenure of Customer is 5 years.
  • Balance ranges from $0.0 - $251k with an ave of $76.4k
  • Ave number of bank products is slightly higher than one product out of 4 products available.
  • Less than a 1/4 of Customers do not have credit cards from the bank.
  • About 1/4 of all Customers are not actively engage with the bank offerings.
  • Ave Salarie of Customers is about 100k.

Exploratory Data Analysis - EDA --------------------------------------------- Section

Univariate Analysis - EDA¶

In [30]:
# Function to plot a boxplot and a histogram along the same scale.


def histogram_boxplot(data, feature, figsize=(12, 7), kde=False, bins=None):
    """
    Boxplot and histogram combined

    data: dataframe
    feature: dataframe column
    figsize: size of figure (default (12,7))
    kde: whether to show the density curve (default False)
    bins: number of bins for histogram (default None)
    """
    f2, (ax_box2, ax_hist2) = plt.subplots(
        nrows=2,  # Number of rows of the subplot grid= 2
        sharex=True,  # x-axis will be shared among all subplots
        gridspec_kw={"height_ratios": (0.25, 0.75)},
        figsize=figsize,
    )  # creating the 2 subplots
    sns.boxplot(
        data=data, x=feature, ax=ax_box2, showmeans=True, color="violet"
    )  # boxplot will be created and a star will indicate the mean value of the column
    sns.histplot(
        data=data, x=feature, kde=kde, ax=ax_hist2, bins=bins, palette="winter"
    ) if bins else sns.histplot(
        data=data, x=feature, kde=kde, ax=ax_hist2
    )  # For histogram
    ax_hist2.axvline(
        data[feature].mean(), color="green", linestyle="--"
    )  # Add mean to the histogram
    ax_hist2.axvline(
        data[feature].median(), color="black", linestyle="-"
    )  # Add median to the histogram
In [31]:
# Function to create labeled barplots


def labeled_barplot(data, feature, perc=False, n=None):
    """
    Barplot with percentage at the top

    data: dataframe
    feature: dataframe column
    perc: whether to display percentages instead of count (default is False)
    n: displays the top n category levels (default is None, i.e., display all levels)
    """

    total = len(data[feature])  # length of the column
    count = data[feature].nunique()
    if n is None:
        plt.figure(figsize=(count + 1, 5))
    else:
        plt.figure(figsize=(n + 1, 5))

    plt.xticks(rotation=90, fontsize=15)
    ax = sns.countplot(
        data=data,
        x=feature,
        palette="Paired",
        order=data[feature].value_counts().index[:n].sort_values(),
    )

    for p in ax.patches:
        if perc == True:
            label = "{:.1f}%".format(
                100 * p.get_height() / total
            )  # percentage of each class of the category
        else:
            label = p.get_height()  # count of each level of the category

        x = p.get_x() + p.get_width() / 2  # width of the plot
        y = p.get_height()  # height of the plot

        ax.annotate(
            label,
            (x, y),
            ha="center",
            va="center",
            size=12,
            xytext=(0, 5),
            textcoords="offset points",
        )  # annotate the percentage

    plt.show()  # show the plot

Observations on CreditScore¶

In [32]:
# EXTRA CREDIT - Function expanded to include Mean and Median to be displayed on plot.
def histogram_boxplot(data, feature, figsize=(12, 7), kde=False, bins=None):
    """
    Boxplot and histogram combined with the mean value displayed on the histogram.

    data: dataframe
    feature: dataframe column
    figsize: size of figure (default (12,7))
    kde: whether to show the density curve (default False)
    bins: number of bins for histogram (default None)
    """
    f2, (ax_box2, ax_hist2) = plt.subplots(
        nrows=2,  # Number of rows of the subplot grid= 2
        sharex=True,  # x-axis will be shared among all subplots
        gridspec_kw={"height_ratios": (0.25, 0.75)},
        figsize=figsize,
    )  # creating the 2 subplots

    # Create the boxplot
    sns.boxplot(
        data=data, x=feature, ax=ax_box2, showmeans=True, color="orange"
    )  # Boxplot with the mean indicated

    # Create the histogram
    sns.histplot(
        data=data, x=feature, kde=kde, ax=ax_hist2, bins=bins, color="skyblue"
    ) if bins else sns.histplot(
        data=data, x=feature, kde=kde, ax=ax_hist2
    )  # Histogram

    # Calculate mean and median
    mean_val = data[feature].mean()
    median_val = data[feature].median()

    # Add mean and median lines to the histogram
    ax_hist2.axvline(mean_val, color="green", linestyle="--", label=f'Mean: {mean_val:.2f}')
    ax_hist2.axvline(median_val, color="black", linestyle="-", label=f'Median: {median_val:.2f}')

    # Display the value of the mean on the plot
    ax_hist2.text(mean_val, ax_hist2.get_ylim()[1] * 0.8, f'Mean: {mean_val:.2f}',
                  color='green', fontsize=12, ha='center')

    # Optionally, display the value of the median on the plot
    ax_hist2.text(median_val, ax_hist2.get_ylim()[1] * 0.7, f'Median: {median_val:.2f}',
                  color='black', fontsize=12, ha='center')

    # Add a legend to the plot
    ax_hist2.legend()
In [33]:
histogram_boxplot(ds, 'CreditScore', figsize=(6, 2)) # Histogram_boxplot for Credit Score.

Observations:

  • Histogram displays a uniform distribution centering on a Median of 652.
  • A number of outliers on the lower scores and a spike on the highest score.

Recommendations:

  • Data preprocessing-handling candidates are both the outliers and Customers contributing to the band with highest credit score.

Observations on Age¶

In [ ]:
histogram_boxplot(ds, 'Age', figsize=(6, 2)) # Histogram_boxplot for Age.
In [79]:
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

# Define the age ranges and generation labels
generation_labels = ['Silent Generation', 'Baby Boomers', 'Generation X', 'Millennials', 'Generation Z', 'Generation Alpha']
bins = [0, 11, 27, 43, 59, 78, 96]

# Create a new column for the generations
ds['Generation'] = pd.cut(ds['Age'], bins=bins, labels=generation_labels, right=False)

# Count the number of customers in each generation
generation_counts = ds['Generation'].value_counts()

# Sort the counts according to the defined generation labels
generation_counts = generation_counts.reindex(generation_labels)

# Fill missing values with zero if any generations are not present in the data
generation_counts = generation_counts.fillna(0)

# Calculate total customers
total_customers = generation_counts.sum()

# Debug: Print generation counts to verify values
print(generation_counts)

# Create a horizontal bar plot
plt.figure(figsize=(5, 2.5))
sns.barplot(x=generation_counts.values, y=generation_counts.index, palette='Set1')

# Add labels and title
plt.xlabel('Number of Customers')
plt.ylabel('Generation')
plt.title("Bank's Customer Distribution by Generation")

# Set x-axis limits to ensure proper display of larger values
plt.xlim(0, generation_counts.max() + 500)  # Adjust limit based on your data

# Add actual numbers and percentages on top of the bars
for index, value in enumerate(generation_counts.values):
    percentage = (value / total_customers) * 100
    plt.text(value, index, f'{int(value)} ({percentage:.1f}%)', va='center', fontsize=10, fontweight='bold')

plt.tight_layout()
plt.show()
Generation
Silent Generation       0
Baby Boomers          811
Generation X         6295
Millennials          2306
Generation Z          564
Generation Alpha       24
Name: count, dtype: int64
GenerationBirth YearsAge Range (2024)Bank AttitudeCredit Card Attitude
Generation Alpha2013–20250–11Not yet developedNot yet developed
Generation Z1997–201212–27Prefer digital banking solutionsValue rewards and low fees
Millennials (Gen Y)1981–199628–43Favor banks with ethical practicesUse credit cards for convenience and rewards
Generation X1965–198044–59Traditional banking preferencesPrefer stability and reliability
Baby Boomers1946–196460–78Trust in established banksUse credit cards primarily for emergencies
Silent Generation1928–194579–96Conservative with financial choicesPrefer cash and debit cards

Observations:

  • Median Age is 37 years old.
  • Age is evenly distributed.
  • There's a discrepancy between the stats results showing a Max Customer age of 92 and this distribution showing -0- Customers at that age. It needs additional exploration but it is not statistically significant.

Important Notes on Age:

  • Although age is uniformy distributed, the bins or age ranges of Customers generational attitudes are well known to differ drastically. Each must be well understood within the bank multiple business verticals and their services offered accordingly.
  • Age distribution of Customers plays a present but yet a much stronger contribution to the bank's future revenues from the younger generations and should be planned accordingly.

Recommendations:

  • The number of "young" Customers is understandably very small. More efforts to attract future Customers and to retain them with services they value are required.
  • 811 Customers Generation Alpha (0-11) - Has not yet developed.
  • 564 Customers are Generation Z (12-27) - Prefer digital banking and value credit cards rewards and lower fees.
  • 2,305 Customers are Millenials (28-43) - Prefer ethical banking practices and use credit cards for conveneince and rewards.
  • 6,295 Customers are Generation X (44-59) - Prefer traditional banking services and credit card security, reliability and seek their financial stability. This segment is the majority of the generational groups of Customers.
  • 811 Customers are Baby Boomers (60-78) - Prefer trusted banks and use credit cards primarily for emergencies.
  • 0 Customers are Silent Generation (79-96) - Prefer the most conservative banking choices and still prefer cash over credit cards and debit cards.
  • Bank services and incentives, Marketing messaging and Customer Support communications should reflect these generational banking and credit card preferences as described.

Observations on Balance¶

In [ ]:
histogram_boxplot(ds, 'Balance', figsize=(6, 2)) # Histogram_boxplot for Balance

Observations:

  • The Mean Customer's Balance is 76,485.89
  • Balance is evenly distributed with the exception of a large segment of Customers with 0.00 Balance skewing the histogram.

Recommendations:

  • Customers with 0.00 Balance as a result of no longer conducting banking because they exited - should be removed from for proper evaluations of the current Customers with 0.00 Balance to understand contributing factors.

Observations on Estimated Salary¶

In [ ]:
histogram_boxplot(ds, 'EstimatedSalary', figsize=(6, 2)) # Histogram_boxplot for Estimated Salary
In [80]:
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

# Define the age ranges and generation labels
generation_labels = ['Silent Generation', 'Baby Boomers', 'Generation X', 'Millennials', 'Generation Z', 'Generation Alpha']
bins = [0, 11, 27, 43, 59, 78, 96]

# Create a new column for the generations
ds['Generation'] = pd.cut(ds['Age'], bins=bins, labels=generation_labels, right=False)

# Group by Generation and calculate the average EstimatedSalary
generation_salary_avg = ds.groupby('Generation')['EstimatedSalary'].mean()

# Debug: Print average salary for each generation
print(generation_salary_avg)

# Create a horizontal bar plot for the average EstimatedSalary per Generation
plt.figure(figsize=(5, 3))
sns.barplot(x=generation_salary_avg.values, y=generation_salary_avg.index, palette='Set1')

# Add labels and title
plt.xlabel('Average Estimated Salary')
plt.ylabel('Generation')
plt.title("Average Estimated Salary by Generation")

# Add actual salary values on top of the bars
for index, value in enumerate(generation_salary_avg.values):
    plt.text(value, index, f'{value:.2f}', va='center', fontsize=10, fontweight='bold')

plt.tight_layout()
plt.show()
WARNING:matplotlib.text:posx and posy should be finite values
WARNING:matplotlib.text:posx and posy should be finite values
Generation
Silent Generation              NaN
Baby Boomers         102179.128977
Generation X          99831.286157
Millennials          100637.390520
Generation Z          97120.271915
Generation Alpha     114646.789583
Name: EstimatedSalary, dtype: float64

Observations:

  • Customers average 100k in Salary.
  • It is almost the same average Salary for each of the generational groups of Customers.
  • Silent Generation has no Salary as expected.

Recommendations:

  • Each generational group category averages about the same income and therefore the younger groups should be targeted with loyalty rewards campaigns to protect their longevity as Customers. For example, Gen Zs (age 12-27) with the same Estimated Salary as their older counterparts should be more aggressively targeted with incentives for longer duration investments like CDs, Annuities, 401ks, and Bonds.
  • These artfacts produce fewer banking transactions overall, but increase Customer Longevity and should be taken into account as a long term portafolio of bank services.
In [135]:
# Drop the 'Generation' column from both X_train and X_val
ds = ds.drop(columns=['Generation'])  # Drop from your main dataset

Observations on Exited¶

In [ ]:
labeled_barplot(ds, "Exited", perc=True) # Barplot on Exited.

Observations:

  • This is our targeted feature.
  • We will measure the predicted exits to the actual exits to measure the Customer Risk of Exiting prediction accuracy of the model.
  • 20.4% of all Customers leave. That's 1/5th of the entire population. That is NOT acceptable.

Recommendations:

  • A comprehensive analysis is required to identify the key factors that contributed to the departure of nearly 2,000 customers.

Observations on Geography¶

In [ ]:
labeled_barplot(ds, "Geography", perc=True) # Barplot on Geography.

Observations:

  • There are mostly 3 European nations that contribute to all Customers.
  • Half of all Customers are from France.
  • The remaining half are from split evenly between Germany and Spain.

Recommendations:

  • Language communications for Customers should reflect their population's distribution of language and culture.
  • Additional recommendations for Customers by country are highlighed elsewhere.

Observations on Gender¶

In [ ]:
labeled_barplot(ds, "Gender", perc=True) # Barplot for Gender

Observations:

  • There are only about 1,000 more Male Customers than Female.
  • Otherwise the distribution by gender is fairly uniform

Recommendations:

  • Target for an increase of Female Customers to minimize the gap.

Observations on Tenure¶

In [ ]:
labeled_barplot(ds, "Tenure", perc=True) # Barplot for Tenure

Observations:

  • Previously addressed.

Observations on Number of Products¶

In [ ]:
labeled_barplot(ds, "NumOfProducts", perc=True) # Barplot for Number of products

Observations:

  • Previously addressed.

Observations on Has Credit Card¶

In [ ]:
labeled_barplot(ds, "HasCrCard", perc=True) # Barplot on credit card

Observations:

  • 60% of all Customers have a credit card.
  • 30% does not.

Recommendations:

  • Customer Loyalty programs for retention should be targeted to those with credit cards.
  • Incentives should be designed to target Customers without credit cards.
  • Balance Transfer on the bank's credit card should target the almost 3,000 Customers withpout a credit card.

Observations on Is Active Member¶

In [ ]:
labeled_barplot(ds, "IsActiveMember", perc=True) # Barplot on Is active member

Observations:

  • Active Customers are evenly distributed.

Recommendations: Establish an alert-trigger treshold for Customer's transactional or login activity.

  • Active Customers falling below should be targeted with incentives to increase banking activities.

Bivariate Analysis - EDA¶

In [81]:
# function to plot stacked bar chart


def stacked_barplot(data, predictor, target):
    """
    Print the category counts and plot a stacked bar chart

    data: dataframe
    predictor: independent variable
    target: target variable
    """
    count = data[predictor].nunique()
    sorter = data[target].value_counts().index[-1]
    tab1 = pd.crosstab(data[predictor], data[target], margins=True).sort_values(
        by=sorter, ascending=False
    )
    print(tab1)
    print("-" * 120)
    tab = pd.crosstab(data[predictor], data[target], normalize="index").sort_values(
        by=sorter, ascending=False
    )
    tab.plot(kind="bar", stacked=True, figsize=(count + 1, 5))
    plt.legend(
        loc="lower left",
        frameon=False,
    )
    plt.legend(loc="upper left", bbox_to_anchor=(1, 1))
    plt.show()

Correlation plot¶

In [82]:
# defining the list of numerical columns
cols_list = ["CreditScore","Age","Tenure","Balance","EstimatedSalary"]
# Original List of Columns.
In [83]:
#EXTRA CREDIT
cols_list = ["CreditScore","Age","Tenure","Balance","NumOfProducts","HasCrCard","IsActiveMember","EstimatedSalary","Exited"] # All Numerical columns (Gender and Geography are Object datatype)
In [84]:
plt.figure(figsize=(6, 3))
sns.heatmap(ds[cols_list].corr(), annot=True, vmin=-1, vmax=1, fmt=".2f", cmap="Spectral")
plt.show()

Observations:

  • The strongest correlations with Exited are: Age and Balance

Recommendations:

  • Further analysis to determine distributions of Exited Customers by Age and by Balance.

Exited Vs Geography¶

In [ ]:
stacked_barplot(ds, "Geography", "Exited" )
Exited        0     1    All
Geography                   
All        7963  2037  10000
Germany    1695   814   2509
France     4204   810   5014
Spain      2064   413   2477
------------------------------------------------------------------------------------------------------------------------

Observations:

  • Distribution by Customer's country is higher for those from Germany.
  • For Spain and France (France is twice as lasrge as Spain and Germany), the number of Customers that leave is higher than for those that stay.

Recommendations:

  • Contributing factors for the higher departure of Custmers from Germany needs further investigations.

Exited Vs Gender¶

In [ ]:
stacked_barplot(ds, "Exited","Gender" )
Gender  Female  Male    All
Exited                     
All       4543  5457  10000
0         3404  4559   7963
1         1139   898   2037
------------------------------------------------------------------------------------------------------------------------

Observations:

  • Distribution of Gender of Customers is higher for Females and those that leave.

Exited Vs Has Credit Card¶

In [ ]:
stacked_barplot(ds, "Exited","HasCrCard" )
HasCrCard     0     1    All
Exited                      
All        2945  7055  10000
0          2332  5631   7963
1           613  1424   2037
------------------------------------------------------------------------------------------------------------------------

Observations:

  • Distribution of Customers with credit cards are about the same for those that stay and those that leave.

Exited Vs Is active member¶

In [ ]:
stacked_barplot(ds, "Exited","IsActiveMember" )
IsActiveMember     0     1    All
Exited                           
All             4849  5151  10000
0               3547  4416   7963
1               1302   735   2037
------------------------------------------------------------------------------------------------------------------------

Exited Vs Credit Score¶

In [ ]:
plt.figure(figsize=(4,2))
sns.boxplot(y='CreditScore',x='Exited',data=ds)
plt.show()

Observations:

  • Distribution of Credit Score of Customers and the average are about the same for those that stay and those that leave.
  • There's a number of Outliers for those that leave.

Exited Vs Age¶

This Age feature had one of the two strongest correlation with Exiting Customers:

In [ ]:
plt.figure(figsize=(4,2))
sns.boxplot(y='Age',x='Exited',data=ds)
plt.show()

Observations:

  • The average Customer Age that left the bank is between 40-50.

Recommendations:

  • Investigate contributing factors.

Exited Vs Tenure¶

In [ ]:
plt.figure(figsize=(4,2))
sns.boxplot(y='Tenure',x='Exited',data=ds) # Boxplot for Exited and Tenure
plt.show()

Observations:

  • The average are about the same for those that stay and those that live.
  • However, the distribution is wider for those Customers that leave.

Exited Vs Balance¶

This Balance feature had one of the two strongest correlation with Exiting Customers after Age:

In [ ]:
plt.figure(figsize=(4,2))
sns.boxplot(y='Balance',x='Exited',data=ds)               ## Complete the code to plot the boxplot for Exited and Balance
plt.show()

Observations:

  • The Balance between those that exited and those that remained with the bank are statistically insignificant but yet was shown to represent the secong largest corrrelation with Exiting Customers.

Recommendations:

  • Further analysis is required here.

Exited Vs Number of Products¶

In [ ]:
plt.figure(figsize=(4,2))
sns.boxplot(y='NumOfProducts',x='Exited',data=ds)               ## Complete the code to plot the boxplot for Exited and Number of products
plt.show()

Observations:

  • Distribution of Number of Products used by Customers are about the same for those that stay and those that leave.

Exited Vs Estimated Salary¶

In [ ]:
plt.figure(figsize=(4,2))
sns.boxplot(y='EstimatedSalary',x='Exited',data=ds)               ## Complete the code to plot the boxplot for Exited and Estimated Salary
plt.show()

Observations:

  • Distribution of Estimated Salary of Customers and the average are about the same for those that stay and those that live.

Data Preprocessing --------------------------------------------------------------- Section

Categorical Encoding with Dummy Variable Creation - Data Preprocessing¶

In [137]:
# Identifying all categorical columns.
	# •	Convert them to (one-hot encoded) columns.
	# •	Dropping the first category from each categorical variable to avoid multicollinearity.
	# •	Encoded columns are of type float.
ds = pd.get_dummies(ds,columns=ds.select_dtypes(include=["object"]).columns.tolist(),drop_first=True,dtype=float)
print("Categorical Encoding done where needed (object datatypes).")
Categorical Encoding done where needed (object datatypes).

Train | Validation | Test (Splits) - Data Preprocessing¶

Separate features from results columns into two tables for handling prediction training and testing datasets:

In [138]:
X = ds.drop(['Exited'],axis=1) # Remove Exited column from Credit Score through Estimated Salary on X table.
y = ds['Exited'] # Assign our Target feature Exited to y table.
In [139]:
print("Features Table or X_large = ", X.shape, "\nResults Table or y_large = ", y.shape) # Confirm features table versus results table.
Features Table or X_large =  (10000, 11) 
Results Table or y_large =  (10000,)

1st Split - Temporary X_large dataset (for Train/Validation datasets)the and he Test dataset Split:

In [140]:
# Splitting the entire dataset into the Training and Testing data sets.
# The Test data set is not used until the end of the process to evaluate the final performance of the model on completely unseen data.
X_large, X_test, y_large, y_test = train_test_split(X, y, test_size = 0.20, random_state = 42,stratify=y,shuffle = True)

This will leave 20% of the available data for the Test dataset and 80% of the data for the train and validation data sets combined.

  • The full allocation is 80% of the data (64% of the original data) for training, and 20% of the 80% data (16% of the original data) for validation.
In [141]:
print("\nTrain/Validation features data (X_large) = ",X_large.shape, "with corresponding Results (y_large) = ",y_large.shape,"\nTest features data (x_test) = ", X_test.shape, "with corresponding Test results data",y_test.shape) # Check splits allocations.
Train/Validation features data (X_large) =  (8000, 11) with corresponding Results (y_large) =  (8000,) 
Test features data (x_test) =  (2000, 11) with corresponding Test results data (2000,)

2nd Split - Temporary X_large dataset split into Train and Validation datasets:

In [142]:
# Splitting the temporaty X_large dataset into the Training and Validation data sets.
X_train, X_val, y_train, y_val = train_test_split(X_large, y_large, test_size = 0.20, random_state = 42,stratify=y_large, shuffle = True) # The validation set is used during the training process to fine-tune model parameters and to avoid overfitting. It helps to evaluate how well the model is generalizing *without* touching the test set.
In [143]:
# After split checks.
print("\nTrain/Validation features data (X_large) = ",X_large.shape, "with corresponding Results (y_large) = ",y_large.shape)
print("Above was split as follows:")
print("\nTraining features data (X_train) = ", X_train.shape, "with corresponding Results (y_large) = ",y_train.shape,"\nValidation features data (X_val) = ", X_val.shape, "with corresponding Validation Results (y_val) = ",y_val.shape)
print("\n - - - - - - - - Test dataset will be used to verify training model on unseen data (X_test) = ", X_test.shape)
Train/Validation features data (X_large) =  (8000, 11) with corresponding Results (y_large) =  (8000,)
Above was split as follows:

Training features data (X_train) =  (6400, 11) with corresponding Results (y_large) =  (6400,) 
Validation features data (X_val) =  (1600, 11) with corresponding Validation Results (y_val) =  (1600,)

 - - - - - - - - Test dataset will be used to verify training model on unseen data (X_test) =  (2000, 11)
In [144]:
# EXTRA CREDIT (Visualize Splits)
# Check the shapes of the training, validation, and test sets adjacent to each other with colored text
print("\nX_Large =", X_large.shape,"is further split into training and validation. Presented here for easier visualizations:")
print(f"\033[33mTrain set shape: {X_train.shape}\033[0m", end=' | ')  # Orange color for train
print(f"\033[32mValidation set shape: {X_val.shape}\033[0m", end=' | ')  # Green color for validation
print(f"\033[35mTest set shape: {X_test.shape}\033[0m")  # Violet color for test

print("\n")
print(f"1. Original set shape: {X.shape}")
print("|" * 100)  # Print a line separator.
# X is your feature data, and y is your target data


# First split: Splits data into 80% training+validation (X_large, y_large) and 20% test (X_test, y_test)
print("\n\n2. First split: Splits the data into 80% \033[1;4mtemporay (temp)\033[0m (X_large, y_large) and \033[35m20% test (X_test, y_test).\033[0m")
print("|" * 80, "\033[35m|\033[0m" * 20)  # Line separator to show split visually
print(f"Temporary (temp) set shape: {X_large.shape}")
print(f"\033[35mTest set shape: {X_test.shape}\033[0m\n")

# Second split: Subsplit X_large and y_large into 80% training and 20% validation
print("\n3. Second split: Subsplits the (temp) dataset into the \033[1;4mtraining (64% of total) + validation set (16% of total),\033[0m \nor (X_Large, y_large) = (X_train, y_train) + (X_val, y_val).")
print("\033[1;33mt\033[0m" * 64, "\033[1;32mv\033[0m" * 16, "\033[1;35mT\033[0m" * 20)  # Line separator with t for training and v for validation
print(f"\033[1;33mTraining set shape (t): {X_train.shape}\033[0m") # Training set
print(f"\033[1;32mValidation set shape (v): {X_val.shape}\033[0m") # Validation set
print(f"\033[35mTest set shape (T): {X_test.shape}\033[0m") # Test set
X_Large = (8000, 11) is further split into training and validation. Presented here for easier visualizations:
Train set shape: (6400, 11) | Validation set shape: (1600, 11) | Test set shape: (2000, 11)


1. Original set shape: (10000, 11)
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||


2. First split: Splits the data into 80% temporay (temp) (X_large, y_large) and 20% test (X_test, y_test).
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||
Temporary (temp) set shape: (8000, 11)
Test set shape: (2000, 11)


3. Second split: Subsplits the (temp) dataset into the training (64% of total) + validation set (16% of total), 
or (X_Large, y_large) = (X_train, y_train) + (X_val, y_val).
tttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttt vvvvvvvvvvvvvvvv TTTTTTTTTTTTTTTTTTTT
Training set shape (t): (6400, 11)
Validation set shape (v): (1600, 11)
Test set shape (T): (2000, 11)

Data Normalization - Data Preprocessing¶

Since all the numerical values are on a different scale, we will scale them. To normalize them to the same scale:

In [145]:
# Define cols_list as all columns except 'Exited'
#cols_list = [col for col in X_train.columns if col != 'Exited']  # Replace X_train with your feature dataframe name if different
cols_list = [col for col in cols_list if col != 'Exited'] # Target Variable separated from the the feature variables.

# creating an instance of the standard scaler
sc = StandardScaler()

# Apply scaling to the specified columns
X_train[cols_list] = sc.fit_transform(X_train[cols_list])  # Scaling for training set
X_val[cols_list] = sc.transform(X_val[cols_list])          # Scaling for validation set
X_test[cols_list] = sc.transform(X_test[cols_list])        # Scaling for test set

Why using StandardScaler() is important, specially with Neural Networks:

  1. Feature Scaling: Many machine learning algorithms assume that the features in the dataset are centered around 0 and have similar scales. If the features vary widely in terms of scale (e.g., one feature is measured in dollars while another is measured in grams), it can cause the model to perform poorly because larger scaled features may dominate the learning process. StandardScaler standardizes features by subtracting the mean and dividing by the standard deviation.
  2. Avoid Bias Toward Larger-Scaled Features: Without scaling, features with larger numerical values will have a greater effect on the model’s behavior, even if they aren’t inherently more important. For example, a feature with values in the thousands will dominate over a feature with values in the range of 1 to 10.
  3. Improved Model Convergence: Many optimization algorithms, like gradient descent, work faster and converge more efficiently when the data is scaled. If the features are on different scales, the optimization process can take longer and be less effective.
  4. Neural Networks: In neural networks, scaling input features is particularly important because `it ensures that gradients flow more smoothly` through the network, preventing issues like vanishing or exploding gradients.

Model Building ---------------------------------------------------------------------- Section

Model Evaluation Rationale - Model Building

Logic for choosing the metric that would be the best metric for this business scenario.

  • Observations from available dataset: Within the last 6 months a 1/4 of all customers left resulting in a significant permanent loss of future revenue.
  • Our target is to predict which customers are likely to abandon bank services to preempt and avoid their departure well in advance.
  • The worst situation is predicting no departure and yet, customer does exit. False Negative
  • Our NN model should be capable of predicting customers exiting matching actual exiting as accurtely as possible. False Negatives should get as small as possible.
  • Therefore Recall = TP/ (TP+FN) is our best measure.
  • We will monitor for the highest Recall value (smallest False Positives)

Let's create a function for plotting the confusion matrix

In [146]:
def make_confusion_matrix(actual_targets, predicted_targets):
    """
    To plot the confusion_matrix with percentages

    actual_targets: actual target (dependent) variable values
    predicted_targets: predicted target (dependent) variable values
    """
    cm = confusion_matrix(actual_targets, predicted_targets)
    labels = np.asarray(
        [
            ["{0:0.0f}".format(item) + "\n{0:.2%}".format(item / cm.flatten().sum())]
            for item in cm.flatten()
        ]
    ).reshape(cm.shape[0], cm.shape[1])

    plt.figure(figsize=(3, 2))
    sns.heatmap(cm, annot=labels, fmt="")
    plt.ylabel("True label")
    plt.xlabel("Predicted label")

Let's create two blank dataframes that will store the recall values for all the models we build.

In [147]:
train_metric_df = pd.DataFrame(columns=["recall"])
valid_metric_df = pd.DataFrame(columns=["recall"])

Neural Network with SGD Optimizer - Model Building

Stochastic Gradient Descent keras optimizer:

  • PROs

    • Faster for large datasets.
    • Requires computing gradients and weight calculations for the entire data.
    • Because its stochastic nature, SGD can escape local minima and potentially find better solutions.
  • CONs

    • Nosy Updates making it less stable leading to oscillations.
    • Requires Tuning of learning rates and other hyperparameters.
    • Slower Convergance compared to Adam (Adaptive)
In [148]:
backend.clear_session() # Clear any previous models or computational graphs in memory
print("Previous models or computational graphs in memory have been cleared.")

# By setting a “seed,” we control the random number generators so that they produce the same random numbers every time.
print("\nNow, we set the seed for NumPy, Python, and TensorFlow random generators ...\n")

np.random.seed(2) # Fixes the seed for NumPy’s random number generator.
random.seed(2) # Fixes the seed for Python’s built-in random number generator.
tf.random.set_seed(2) # Fixes the seed for TensorFlow’s random number generator.
print("Why? Because some parts of machine learning (like initializing weights or shuffling data) involve randomness.")
print("We need to produce the same random numbers every time producing consisten results for testing machine learning models.")
Previous models or computational graphs in memory have been cleared.

Now, we set the seed for NumPy, Python, and TensorFlow random generators ...

Why? Because some parts of machine learning (like initializing weights or shuffling data) involve randomness.
We need to produce the same random numbers every time producing consisten results for testing machine learning models.
In [149]:
#Initializing the neural network
model_0 = Sequential()
# Adding the input layer with 64 neurons and relu as activation function
model_0.add(Dense(64, activation='relu', input_dim = X_train.shape[1]))
# Complete the code to add a hidden layer (specify the # of neurons and the activation function)
model_0.add(Dense(32, activation='relu'))
# Complete the code to add the output layer with the number of neurons required.
model_0.add(Dense(1, activation = 'sigmoid'))

print("Neural Network initialized...\n")
print("Hidden Layer: 32 neurons with the ReLU activation function.")
print("Output Layer: 1 neuron with the sigmoid activation function.")
print("Note: For binary classification tasks, using a single neuron with the sigmoid function is common because it outputs probabilities between 0 and 1.\n")
Neural Network initialized...

Hidden Layer: 32 neurons with the ReLU activation function.
Output Layer: 1 neuron with the sigmoid activation function.
Note: For binary classification tasks, using a single neuron with the sigmoid function is common because it outputs probabilities between 0 and 1.

In [150]:
optimizer = tf.keras.optimizers.SGD(learning_rate=0.001) # SGD(learning_rate=0.001)SGD:
# This is the stochastic gradient descent optimizer, which controls the step size for each iteration during the optimization process.
print("\nStochastic Gradient Descent optimizer (SDG) set up. \nIt controls step size for each iteration during optimization process.\n")
# uncomment one of the following lines to define the metric to be used
# metric = 'accuracy'
metric = keras.metrics.Recall()
# metric = keras.metrics.Precision()
# metric = keras.metrics.F1Score()
print("Recall metric setup.\n")
Stochastic Gradient Descent optimizer (SDG) set up. 
It controls step size for each iteration during optimization process.

Recall metric setup.

In [151]:
# Binary cross entropy as loss function and Recall metric setup.
model_0.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=[keras.metrics.Recall()])
print("\nloss = 'binary_crossentropy'setup: This is the appropriate loss function for binary classification tasks, measuring the difference between the Predicted and Actual values.\n")
print("metrics = [keras.metrics.Recall()] setup: This specifies that recall will be used as a metric to evaluate the model’s performance during training and testing.\n")
loss = 'binary_crossentropy'setup: This is the appropriate loss function for binary classification tasks, measuring the difference between the Predicted and Actual values.

metrics = [keras.metrics.Recall()] setup: This specifies that recall will be used as a metric to evaluate the model’s performance during training and testing.

In [152]:
model_0.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ dense (Dense)                        │ (None, 64)                  │             768 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 32)                  │           2,080 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 1)                   │              33 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 2,881 (11.25 KB)
 Trainable params: 2,881 (11.25 KB)
 Non-trainable params: 0 (0.00 B)
In [100]:
# Visualize weights from the first layer
import matplotlib.pyplot as plt

weights = model_0.layers[0].get_weights()[0]
plt.imshow(weights, aspect='auto', cmap='viridis')
plt.colorbar()
plt.title('Weights of Layer {}'.format(0))
plt.show()
In [ ]:
# List weights and biases of the first layer
weights_layer_0 = model_0.layers[0].get_weights()
print(weights_layer_0[0])  # Weights
print(weights_layer_0[1])  # Biases
[[ 1.75325200e-01 -2.61551142e-01  5.55486642e-02  1.14529379e-01
   2.23805144e-01  2.67600328e-01  2.32992500e-01 -1.18666239e-01
  -1.29189283e-01 -7.48853087e-02  1.58950016e-02 -2.20118091e-01
   2.73295999e-01  6.38830811e-02 -2.60986596e-01  9.30325985e-02
   4.30375934e-02 -1.56150654e-01 -1.29977688e-01 -1.46740317e-01
  -2.03299910e-01 -1.45411894e-01  1.81421742e-01  9.27374363e-02
   4.78739664e-02  2.10710168e-01 -3.00635584e-03  2.97922760e-01
   2.26749063e-01 -3.96726131e-02 -1.44787222e-01  2.82083094e-01
  -2.16455773e-01  6.40020296e-02  6.17788211e-02 -1.15042366e-01
   1.19284868e-01 -2.07410809e-02  2.04781219e-01 -1.22410670e-01
  -1.50430277e-01 -2.49734640e-01 -1.02991961e-01 -8.97317827e-02
  -7.60480911e-02 -1.61512405e-01  1.81536809e-01 -1.85490489e-01
  -7.43354186e-02 -1.79495990e-01  2.75027603e-01 -2.56038934e-01
   4.68956940e-02 -2.19024733e-01  1.38982072e-01  2.61363775e-01
   2.48185068e-01  5.94121180e-02 -8.73972476e-02 -1.44774411e-02
  -1.73507541e-01  1.12357505e-01  2.26097573e-02 -1.30013153e-01]
 [-2.77875006e-01 -1.92843392e-01  1.47020996e-01  1.27795339e-01
  -1.56242982e-01  5.46662733e-02 -1.98231302e-02 -3.82666439e-01
   1.53770939e-01 -2.85252362e-01  1.73556954e-01  5.43470234e-02
   1.00944586e-01 -3.59486997e-01 -1.16274819e-01 -1.90961525e-01
   2.37043425e-01 -5.45796454e-02 -1.43584400e-01 -2.02470630e-01
   3.43751535e-02  1.63169950e-01 -3.39919835e-01  2.22971708e-01
  -2.09099427e-01 -1.61077514e-01 -9.38678980e-02  1.56588927e-01
  -4.05210257e-01  1.00672156e-01 -8.98180306e-02  2.64094263e-01
   1.90001622e-01  1.96987361e-01 -2.06921980e-01 -2.91155428e-01
  -1.31536826e-01 -3.83311212e-01 -2.75842547e-01  5.97987659e-02
   1.66453376e-01 -2.53036618e-01 -2.19846070e-01 -1.58340722e-01
   3.21604833e-02 -1.59297377e-01  1.93217143e-01  2.88794547e-01
  -2.93830514e-01  1.19487248e-01  2.32113734e-01  5.67215495e-02
   8.39634538e-02 -3.02175075e-01 -1.65743567e-02 -1.92290395e-01
  -1.35632470e-01  3.06484085e-02 -3.59215200e-01 -1.99775353e-01
   1.58487022e-01  2.29762822e-01  2.82133639e-01 -1.89796135e-01]
 [-2.38366053e-01  1.51489884e-01  8.71298313e-02  1.50488928e-01
   1.92815691e-01 -1.62413195e-01  2.21276373e-01 -2.18579441e-01
   2.34985277e-01  4.22470532e-02 -4.62810230e-03 -3.84170674e-02
   1.42072886e-01 -2.52307445e-01  2.12509513e-01 -1.85181990e-01
   1.65045291e-01 -1.24946892e-01  6.23911358e-02 -1.17576122e-01
  -3.29313762e-02  3.34211327e-02  1.62450701e-01 -1.90048903e-01
  -6.58534234e-04  1.31855786e-01  2.62132525e-01  1.75664157e-01
   1.95188105e-01  8.27219337e-03 -4.66910414e-02  2.81070381e-01
   6.08831756e-02 -2.77243722e-02 -1.97202653e-01  2.06788823e-01
  -1.00539438e-01 -1.43210396e-01  1.13481894e-01  1.13895155e-01
  -6.09865710e-02  1.23023866e-02  1.53081223e-01 -9.39754993e-02
  -4.79836687e-02 -1.29831024e-02  2.59787172e-01 -7.37508237e-02
   1.04278706e-01 -2.38064468e-01 -7.11286962e-02 -1.28303915e-01
  -9.88961160e-02  1.67083398e-01  2.04120457e-01  1.42105386e-01
  -1.01695187e-01  1.90436989e-01 -1.41859520e-02  1.90842956e-01
   2.39000484e-01  3.19318771e-02 -2.05689400e-01 -2.82409545e-02]
 [ 4.62489873e-02 -1.70567602e-01 -1.03737317e-01  2.19820559e-01
   3.75496894e-02  1.54966891e-01  1.99934781e-01 -5.52939735e-02
   8.69238079e-02  1.93283916e-01 -1.48417875e-01 -2.32818589e-01
   2.49768227e-01  4.21912707e-02 -5.28559573e-02 -1.07855789e-01
   2.38311216e-02 -3.33375096e-01 -2.04199970e-01  2.47138008e-01
  -1.63914159e-01  1.04485780e-01  4.18922044e-02 -2.08765581e-01
   9.25828516e-02 -2.32824326e-01  1.94780797e-01 -3.22697878e-01
  -1.37628391e-01  1.56032667e-01 -6.60693049e-02 -3.23025510e-02
  -4.81197871e-02 -2.75612503e-01  6.59848237e-03 -1.90837383e-01
   1.95719391e-01  4.43976419e-03 -1.67337194e-01 -9.83253643e-02
   1.50849417e-01  2.55279127e-03  2.44578391e-01  1.42143786e-01
  -6.35282099e-02 -1.24812946e-01  5.68138361e-02 -3.67664397e-02
   1.44774793e-02 -2.45728329e-01  1.20210655e-01 -2.40016401e-01
   1.63166597e-01 -2.28928864e-01  2.49116749e-01 -1.27659976e-01
   2.26858705e-01  1.55054912e-01 -9.76372138e-02 -2.03144342e-01
  -7.28275254e-02 -4.15479988e-02 -4.41395678e-02 -2.51601607e-01]
 [ 1.51939660e-01 -8.20504129e-02 -5.31573445e-02 -2.86829174e-01
  -3.10727619e-02  2.35287473e-01 -7.49351643e-03 -9.76649579e-03
  -1.81044161e-01 -1.75131440e-01  2.61306345e-01  1.98987573e-02
   1.86608627e-01  1.75435007e-01  2.01990805e-03  1.16104439e-01
  -4.83771898e-02  2.48490885e-01  2.85073649e-03 -1.68678969e-01
  -8.53991508e-02 -1.03008002e-02 -4.16605361e-02 -2.26272687e-01
  -2.31283754e-01  5.48926108e-02 -1.41371727e-01 -9.45099592e-02
   1.66713268e-01  2.31513098e-01  7.69176846e-03 -1.95819605e-02
   4.57391217e-02  1.20872371e-02 -1.49078593e-01 -7.61611462e-02
  -3.96639258e-02 -1.55569632e-02 -7.27886930e-02  3.25799853e-01
   3.37594092e-01 -6.63924403e-03 -2.64793515e-01  2.84997821e-01
  -2.65040606e-01 -3.50577652e-01  2.49056652e-01 -2.41529435e-01
  -1.20639980e-01  1.32866710e-01 -1.30399428e-02  2.13928252e-01
  -6.36447370e-02  5.51327653e-02 -2.87435949e-01 -2.72118032e-01
   1.50022596e-01  2.49396950e-01 -1.11696579e-01  3.54541123e-01
  -1.96084931e-01  2.95047849e-01 -3.10304791e-01  4.39487733e-02]
 [ 1.37623832e-01  6.28702492e-02 -1.61772698e-01  3.31501029e-02
  -1.39655337e-01 -1.34315982e-01  2.62443215e-01 -7.85615966e-02
  -1.85934842e-01  1.17080852e-01 -2.27191895e-01 -4.01922874e-02
  -1.97408870e-01 -2.19026908e-01  2.25613162e-01 -1.62872642e-01
   2.81272650e-01 -5.18936813e-02 -1.96548134e-01  2.81764530e-02
   2.94271588e-01  2.50264794e-01  6.29112571e-02 -9.39797312e-02
  -4.58701625e-02  1.16776824e-01 -2.35369131e-01  1.61439329e-01
  -1.32945046e-01 -2.50164449e-01  1.33126304e-01  1.81132138e-01
  -1.59631595e-01  4.30756100e-02  2.57264227e-02  4.06874605e-02
  -5.22023030e-02 -5.72340228e-02 -2.27592468e-01  6.01546764e-02
  -1.18586726e-01  1.54675961e-01 -2.60325760e-01  1.28113046e-01
   5.55787273e-02  1.38112754e-01 -6.86111823e-02 -1.00029990e-01
   1.80657029e-01  2.06517234e-01  2.18129098e-01 -1.45164952e-01
   2.76825190e-01  2.30478406e-01 -9.37128514e-02 -2.04631910e-01
  -7.39591420e-02 -3.26465629e-02 -9.48567912e-02  2.41525359e-02
   2.58463621e-01  2.80450881e-02 -1.79361448e-01 -2.51954913e-01]
 [-3.64561416e-02  1.21472612e-01 -1.89491779e-01 -2.47748550e-02
   1.11062482e-01  6.09869473e-02 -1.45133838e-01  1.10828452e-01
   3.23038012e-01 -3.85914519e-02  1.27997577e-01  2.63215929e-01
  -1.19324319e-01 -9.91338491e-02 -1.97844449e-02  8.88772979e-02
   5.77602722e-02  2.21770350e-02  1.75589502e-01  4.72988654e-03
   1.96549833e-01  2.98000097e-01  2.60328442e-01 -2.32695207e-01
  -2.77645979e-02  4.32125013e-03  1.63463473e-01  3.21995646e-01
  -2.01237902e-01  2.20914468e-01  7.98635259e-02 -4.27578166e-02
  -1.52154118e-01 -3.03812474e-01  2.24673599e-01  4.60594818e-02
   1.99594945e-01 -7.14858994e-02  1.91106752e-01 -8.64756256e-02
  -8.85617584e-02  1.10019341e-01  6.87181130e-02  9.13925469e-02
  -1.98685765e-01  1.92627922e-01 -1.15055889e-01 -3.26412052e-01
  -1.71863213e-01 -1.30129769e-01 -1.93348780e-01  2.22303286e-01
   2.02146545e-01 -1.39936209e-01 -2.27206513e-01  1.25746533e-01
   1.44074455e-01  1.29723758e-01 -2.94913262e-01  1.36798292e-01
  -2.25052789e-01  1.16947278e-01 -2.00633958e-01  4.69887592e-02]
 [ 9.54004079e-02  2.08809655e-02  2.37760842e-01 -5.92522807e-02
   1.66359153e-02 -2.86506474e-01 -2.12337196e-01  1.66981637e-01
  -1.20735012e-01  3.27185877e-02  2.80765444e-01 -6.20426945e-02
   2.35230803e-01 -3.30364145e-02 -3.80063504e-02  1.64424747e-01
   2.03090012e-01 -9.14539248e-02  2.43541449e-01 -1.62929982e-01
   5.31978831e-02 -1.98203921e-01  2.22846732e-01 -5.05306348e-02
   1.18639424e-01  1.58717439e-01  9.56787392e-02 -1.08770691e-01
   1.98317617e-01 -1.81222677e-01  5.00074413e-04 -1.59596443e-01
   1.15597591e-01 -1.86131448e-01  8.36059973e-02 -2.48968437e-01
   2.11479083e-01 -2.23015830e-01  2.03817353e-01  9.64586064e-02
  -4.03333493e-02  1.43120766e-01 -3.66301984e-02  2.37170956e-03
  -2.19972700e-01  2.06317887e-01 -1.73651636e-01  1.11358739e-01
   2.71913499e-01 -1.86812446e-01 -1.79158404e-01  1.21753410e-01
   9.78987068e-02  1.61031008e-01  2.28393711e-02  1.17111094e-02
   1.59756124e-01  1.19008459e-01  2.19947502e-01 -1.17699102e-01
   6.84763025e-03 -1.29870892e-01 -1.80716246e-01 -7.75544047e-02]
 [ 6.72440007e-02  1.66921318e-01  1.12760300e-03 -6.81924373e-02
  -2.25733176e-01 -1.41446903e-01 -2.52844840e-01  1.01140961e-01
  -2.19501913e-01 -8.10845643e-02  2.38594948e-03  3.96670066e-02
   4.16637138e-02  2.02749997e-01 -2.74235725e-01  1.86112180e-01
  -2.53965288e-01 -2.59736657e-01  8.42113420e-02 -1.72451854e-01
  -1.58557892e-01 -1.06056094e-01  1.06731318e-01  4.29992303e-02
  -1.80741578e-01 -1.63737327e-01  3.44591984e-03 -2.62685120e-04
   8.51374939e-02 -3.09739094e-02 -2.16917992e-01 -1.40896246e-01
   2.01472178e-01 -2.39217281e-01  1.06686853e-01 -1.68015853e-01
   1.05814017e-01  1.44873917e-01  2.09050953e-01  2.73579836e-01
   2.27043390e-01  8.06075707e-02 -1.32811055e-01 -1.18406169e-01
   2.33691573e-01  6.01055361e-02 -1.43020779e-01 -1.43981948e-01
  -1.81902453e-01 -8.42926502e-02 -1.66288272e-01 -3.77338901e-02
   2.10194781e-01  4.61993702e-02 -3.90777439e-02 -2.16398656e-01
   9.57676172e-02 -2.56552786e-01  2.30182454e-01  1.47020603e-02
   2.60159910e-01  2.56889999e-01  2.97992438e-01  5.71855642e-02]
 [ 5.58880791e-02 -5.72017133e-02  3.13935280e-02  1.05731733e-01
  -1.21514052e-01  2.59490311e-01 -5.93201779e-02 -2.41879165e-01
   2.59593576e-01 -2.43431240e-01 -1.48946606e-02  1.94095001e-01
   2.41133749e-01 -1.56028703e-01 -2.51776785e-01  2.69725323e-01
   2.16350198e-01  4.75136824e-02  2.48787720e-02  2.98204035e-01
  -2.49910071e-01  4.43263836e-02 -2.01922148e-01  2.59866148e-01
  -1.82127267e-01 -2.77746469e-01 -6.13836646e-02  3.20007294e-01
   4.46311384e-02  1.49207115e-01  1.95203319e-01 -9.31701362e-02
  -2.08862543e-01 -2.15315223e-01  1.15041777e-01  2.31536940e-01
   9.03516561e-02 -6.08399399e-02  1.12603404e-01 -1.30238786e-01
  -2.20546246e-01  1.23643398e-01  1.24022782e-01  4.73669544e-02
   2.37502009e-01  1.17760129e-01  2.42959380e-01  7.54257515e-02
   2.89525747e-01 -1.13039851e-01 -5.85733801e-02 -1.57140940e-01
   2.55377442e-01  2.43842881e-02 -2.74751693e-01  2.52303481e-01
  -1.72254995e-01  1.87096134e-01  2.60789424e-01  9.54534635e-02
  -6.46406561e-02 -2.56804407e-01 -2.15239748e-01  1.10967651e-01]
 [-1.68431222e-01 -1.57030046e-01 -2.88781896e-02  5.34576327e-02
   1.66545153e-01  6.41584173e-02 -1.47536784e-01  1.29698500e-01
  -2.74771061e-02  2.48271435e-01 -1.69546008e-01  1.25211984e-01
  -2.28666827e-01  1.84660748e-01 -4.66840714e-02  2.05147654e-01
   8.81307647e-02  3.96724828e-02 -1.54377580e-01 -7.25148479e-03
   2.73410589e-01  3.33248377e-02  2.25307822e-01 -4.05469798e-02
  -1.05280824e-01  2.10240051e-01  2.04325691e-01  3.35525781e-01
   1.78731009e-01 -1.52902484e-01 -2.30175376e-01 -9.55596790e-02
  -8.64481330e-02  7.06351828e-03  2.64078408e-01  1.05550423e-01
  -2.42634863e-03  3.18281174e-01 -1.36565939e-01  2.77415365e-01
  -1.24134785e-02  3.31785940e-02  1.43344954e-01 -2.42752016e-01
   2.14808360e-01  9.64721143e-02 -2.40750730e-01 -2.26175308e-01
  -1.17121324e-01  5.88146299e-02 -1.63718581e-03 -8.13354552e-02
   2.67996103e-01  1.42345130e-01  6.87743127e-02  1.70349777e-01
   6.79202452e-02 -2.56186668e-02  1.13222562e-01 -2.04304159e-01
   2.42569819e-02 -2.28297621e-01  1.05461925e-02 -5.95302247e-02]]
[ 0.00489177 -0.01184957 -0.02791515 -0.01597887  0.04823732  0.0439708
  0.00023827  0.05457535  0.03710898  0.01812254  0.01339818  0.01526786
  0.03327889  0.06471291 -0.02218734 -0.01821242 -0.00484542  0.01713948
  0.01162239  0.04018354  0.03797736  0.0412976   0.04351616  0.01772772
  0.01275408  0.00064005  0.02537233  0.0905596   0.07590915  0.00365814
 -0.02088384  0.01262987 -0.00364384  0.03012564 -0.00775362  0.09214671
  0.01910074  0.10667358  0.00057508  0.00345656 -0.01557318  0.00607846
 -0.00101948 -0.03508409 -0.03384737 -0.04148693 -0.0052371   0.00632306
  0.0312261   0.00036373  0.01324673  0.02238023 -0.01131031  0.01054683
 -0.00817408 -0.03099848 -0.00405851 -0.03860778  0.04913009 -0.02773636
  0.02189949 -0.00915176  0.01311983  0.03112351]
In [153]:
# Datasets
print("\nX_Large =", X_large.shape,"is further split into training and validation. Presented here for easier visualizations:")
print(f"\033[33mTrain set shape: {X_train.shape}\033[0m", end=' | ')  # Orange color for train
print(f"\033[32mValidation set shape: {X_val.shape}\033[0m", end=' | ')  # Green color for validation
print(f"\033[35mTest set shape: {X_test.shape}\033[0m")  # Violet color for test

# Objectives
print("_" * 100)
print(f"Main Objectives: As the model trains over more epochs ...")
print(f"We look for a LOSS decrease \u25BC and a RECALL increase \u25B2 for training and also validation sets:")
print("  • Training Loss and Validation Loss: How the loss changes over time.")
print("-" * 100)
print("\nBatch Size: We will set to 16, 32, 64, or 128.")
print("A smaller batch size can provide more updates to the NN model weights during training but may take longer,")
print("while a larger batch size may speed up training but provide fewer updates.")
print("\nEpochs: The number of epochs should depend on your specific dataset and NN model.")
print("We will start with values 10, 50, or 100, and monitor the training and validation loss/metrics")
print("to see when the NN model converges or starts to overfit.")
print("  • Recall and Validation Recall: How the ability of the model to detect true positives changes during training.\n")
X_Large = (8000, 11) is further split into training and validation. Presented here for easier visualizations:
Train set shape: (6400, 11) | Validation set shape: (1600, 11) | Test set shape: (2000, 11)
____________________________________________________________________________________________________
Main Objectives: As the model trains over more epochs ...
We look for a LOSS decrease ▼ and a RECALL increase ▲ for training and also validation sets:
  • Training Loss and Validation Loss: How the loss changes over time.
----------------------------------------------------------------------------------------------------

Batch Size: We will set to 16, 32, 64, or 128.
A smaller batch size can provide more updates to the NN model weights during training but may take longer,
while a larger batch size may speed up training but provide fewer updates.

Epochs: The number of epochs should depend on your specific dataset and NN model.
We will start with values 10, 50, or 100, and monitor the training and validation loss/metrics
to see when the NN model converges or starts to overfit.
  • Recall and Validation Recall: How the ability of the model to detect true positives changes during training.

In [158]:
print(X_train.dtypes) # Check for datatypes ro ensure no Object/Category remaain unprocessed.
print(y_train.dtypes) # Check for datatypes ro ensure no Object/Category remaain unprocessed.
CreditScore          float64
Age                  float64
Tenure               float64
Balance              float64
NumOfProducts        float64
HasCrCard            float64
IsActiveMember       float64
EstimatedSalary      float64
Geography_Germany    float64
Geography_Spain      float64
Gender_Male          float64
dtype: object
int64
In [159]:
# Forward Propagation Illustration - Utility Function

import time
import sys

# ANSI color codes
RED = "\033[31m"
GREEN = "\033[32m"
ORANGE = "\033[33m"
RESET = "\033[0m"

# Forward Propagation Illustration
print("\nIllustration of -Forward Propagation- by Activation Functions at each neuron based on the strenght of importance or weights between nodes.")

# Function to animate dashes with color choice, moving towards a fixed [*] position, and stop after a timeout
def animate_dashes_with_fixed_asterisk(total_time=10, speed=0.1, color="RED"):
    # Choose color based on user input
    if color == "RED":
        dash_color = RED
    elif color == "GREEN":
        dash_color = GREEN
    elif color == "ORANGE":
        dash_color = ORANGE
    else:
        dash_color = RESET  # Default to no color if an invalid option is given

    width = 10  # Initial width of dashes
    fixed_pos = 45  # Fixed position for the asterisk [*]
    start_time = time.time()  # Record the starting time

    while time.time() - start_time < total_time:
        # Calculate remaining space before the asterisk [*]
        remaining_space = fixed_pos - width - 3  # 3 accounts for the length of [*]

        if remaining_space < 0:
            remaining_space = 0  # Stop growing dashes once they reach [*]

        # Use carriage return '\r' to return to the start of the line and overwrite the previous output
        sys.stdout.write(f"\r{GREEN}[* neuron layer_i *] {dash_color}{'-' * width}{' ' * remaining_space}{ORANGE}[* neuron layer_i+1 *]{RESET}")
        sys.stdout.flush()  # Force output to appear immediately
        time.sleep(speed)   # Control the speed of the animation

        if remaining_space == 0:
            width = 10  # Reset width after dashes meet [*] to restart animation
        else:
            width += 1  # Increase the number of dashes for movement effect

    # After the timeout ends, clear the line and stop
    sys.stdout.write("\r" + " " * (fixed_pos + 3) + "\r")
    sys.stdout.flush()

# Run the animation (set timeout in seconds and specify color: "RED", "GREEN", or "ORANGE")
animate_dashes_with_fixed_asterisk(total_time=10, speed=0.08, color="GREEN")

# Leave the illustration completed without animation
print(f"\r{GREEN}[* neuron layer_i *]{GREEN} ----------------------------------------- {ORANGE}[* neuron layer_i+1 *]")
Illustration of -Forward Propagation- by Activation Functions at each neuron based on the strenght of importance or weights between nodes.
[* neuron layer_i *] ----------------------------------------- [* neuron layer_i+1 *]

Now, drum roll please ...

In [155]:
# Fitting the Artificial Neural Network (ANN) Model: ORIGINAL
print("\nFitting the Artificial Neural Network (ANN) Model: SGD - our baseline ...\n")
history_0 = model_0.fit(
    X_train, y_train, # Train model on the training dataset
    batch_size=32,    # Specify the batch size to 32
    validation_data=(X_val, y_val), # Use validation dataset for evaluation of this training model. <------ Monitor
    epochs=50,    # Specify the number of epochs to 50. The model will be trained this total cycles.
    verbose=1
)
Fitting the Artificial Neural Network (ANN) Model: SGD - our baseline ...

Epoch 1/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - loss: 0.6602 - recall_1: 0.3081 - val_loss: 0.6005 - val_recall_1: 0.0583
Epoch 2/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.5963 - recall_1: 0.0376 - val_loss: 0.5584 - val_recall_1: 0.0061
Epoch 3/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.5575 - recall_1: 0.0000e+00 - val_loss: 0.5325 - val_recall_1: 0.0000e+00
Epoch 4/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.5332 - recall_1: 0.0000e+00 - val_loss: 0.5160 - val_recall_1: 0.0000e+00
Epoch 5/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.5174 - recall_1: 0.0000e+00 - val_loss: 0.5050 - val_recall_1: 0.0000e+00
Epoch 6/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.5065 - recall_1: 0.0000e+00 - val_loss: 0.4972 - val_recall_1: 0.0000e+00
Epoch 7/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4986 - recall_1: 0.0000e+00 - val_loss: 0.4914 - val_recall_1: 0.0000e+00
Epoch 8/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4924 - recall_1: 0.0000e+00 - val_loss: 0.4868 - val_recall_1: 0.0000e+00
Epoch 9/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4874 - recall_1: 0.0000e+00 - val_loss: 0.4829 - val_recall_1: 0.0000e+00
Epoch 10/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4831 - recall_1: 0.0000e+00 - val_loss: 0.4796 - val_recall_1: 0.0000e+00
Epoch 11/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4792 - recall_1: 0.0000e+00 - val_loss: 0.4766 - val_recall_1: 0.0000e+00
Epoch 12/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4757 - recall_1: 0.0000e+00 - val_loss: 0.4738 - val_recall_1: 0.0000e+00
Epoch 13/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4725 - recall_1: 7.6306e-06 - val_loss: 0.4712 - val_recall_1: 0.0000e+00
Epoch 14/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4695 - recall_1: 7.6306e-06 - val_loss: 0.4689 - val_recall_1: 0.0000e+00
Epoch 15/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4666 - recall_1: 7.6306e-06 - val_loss: 0.4666 - val_recall_1: 0.0031
Epoch 16/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4640 - recall_1: 1.3905e-04 - val_loss: 0.4645 - val_recall_1: 0.0031
Epoch 17/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4614 - recall_1: 0.0030 - val_loss: 0.4626 - val_recall_1: 0.0031
Epoch 18/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4591 - recall_1: 0.0044 - val_loss: 0.4607 - val_recall_1: 0.0123
Epoch 19/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4568 - recall_1: 0.0131 - val_loss: 0.4589 - val_recall_1: 0.0123
Epoch 20/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4546 - recall_1: 0.0139 - val_loss: 0.4573 - val_recall_1: 0.0123
Epoch 21/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4526 - recall_1: 0.0184 - val_loss: 0.4557 - val_recall_1: 0.0153
Epoch 22/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4506 - recall_1: 0.0240 - val_loss: 0.4542 - val_recall_1: 0.0184
Epoch 23/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4488 - recall_1: 0.0259 - val_loss: 0.4528 - val_recall_1: 0.0184
Epoch 24/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4470 - recall_1: 0.0306 - val_loss: 0.4514 - val_recall_1: 0.0215
Epoch 25/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4453 - recall_1: 0.0338 - val_loss: 0.4501 - val_recall_1: 0.0307
Epoch 26/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4436 - recall_1: 0.0397 - val_loss: 0.4489 - val_recall_1: 0.0429
Epoch 27/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4421 - recall_1: 0.0424 - val_loss: 0.4477 - val_recall_1: 0.0429
Epoch 28/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4406 - recall_1: 0.0525 - val_loss: 0.4466 - val_recall_1: 0.0491
Epoch 29/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4392 - recall_1: 0.0606 - val_loss: 0.4455 - val_recall_1: 0.0521
Epoch 30/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4378 - recall_1: 0.0677 - val_loss: 0.4445 - val_recall_1: 0.0583
Epoch 31/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4365 - recall_1: 0.0791 - val_loss: 0.4435 - val_recall_1: 0.0583
Epoch 32/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4353 - recall_1: 0.0903 - val_loss: 0.4425 - val_recall_1: 0.0706
Epoch 33/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4341 - recall_1: 0.0963 - val_loss: 0.4416 - val_recall_1: 0.0828
Epoch 34/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4329 - recall_1: 0.1129 - val_loss: 0.4408 - val_recall_1: 0.0890
Epoch 35/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4318 - recall_1: 0.1187 - val_loss: 0.4399 - val_recall_1: 0.1012
Epoch 36/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4307 - recall_1: 0.1224 - val_loss: 0.4391 - val_recall_1: 0.1043
Epoch 37/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4297 - recall_1: 0.1269 - val_loss: 0.4384 - val_recall_1: 0.1104
Epoch 38/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4287 - recall_1: 0.1314 - val_loss: 0.4376 - val_recall_1: 0.1135
Epoch 39/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4277 - recall_1: 0.1398 - val_loss: 0.4369 - val_recall_1: 0.1166
Epoch 40/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4267 - recall_1: 0.1470 - val_loss: 0.4362 - val_recall_1: 0.1258
Epoch 41/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4258 - recall_1: 0.1567 - val_loss: 0.4355 - val_recall_1: 0.1350
Epoch 42/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4249 - recall_1: 0.1616 - val_loss: 0.4349 - val_recall_1: 0.1411
Epoch 43/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4241 - recall_1: 0.1639 - val_loss: 0.4342 - val_recall_1: 0.1442
Epoch 44/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4232 - recall_1: 0.1732 - val_loss: 0.4336 - val_recall_1: 0.1564
Epoch 45/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4224 - recall_1: 0.1846 - val_loss: 0.4330 - val_recall_1: 0.1656
Epoch 46/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4216 - recall_1: 0.1858 - val_loss: 0.4324 - val_recall_1: 0.1718
Epoch 47/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4208 - recall_1: 0.1875 - val_loss: 0.4319 - val_recall_1: 0.1718
Epoch 48/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4201 - recall_1: 0.1935 - val_loss: 0.4313 - val_recall_1: 0.1718
Epoch 49/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4193 - recall_1: 0.1993 - val_loss: 0.4307 - val_recall_1: 0.1718
Epoch 50/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4186 - recall_1: 0.2033 - val_loss: 0.4302 - val_recall_1: 0.1810

Observe weights

In [156]:
# Use Keras callbacks to monitor the training process and visualize weights and gradients at different epochs, which may help identify weights contributing to errors during training.
class WeightsLogger(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        for layer in self.model.layers:
            if hasattr(layer, 'kernel'):
                weights = layer.get_weights()[0]
                # Log weights or visualize here
                print(f' Layer {layer.name} weights: {weights}')

model_0.fit(X_train, y_train, callbacks=[WeightsLogger()])
162/200 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.4180 - recall_1: 0.2117 Layer dense weights: [[ 0.1698001  -0.25886297  0.05786121  0.11512227  0.2185205   0.26015487
   0.23013912 -0.13763925 -0.13422856 -0.0735462   0.02299409 -0.2315455
   0.27705607  0.05143552 -0.26549038  0.09346051  0.04050122 -0.17588831
  -0.1400347  -0.14817609 -0.2118244  -0.16145441  0.16697414  0.09685919
   0.04901797  0.21092239 -0.0053688   0.29612353  0.23606798 -0.0383767
  -0.14017221  0.2832754  -0.21365534  0.06505312  0.06147418 -0.11140762
   0.1182932  -0.0288341   0.2021804  -0.117778   -0.15327527 -0.2606982
  -0.10170047 -0.08935267 -0.08486737 -0.15560734  0.18666686 -0.19296993
  -0.07010784 -0.19054925  0.27651107 -0.25636235  0.04391531 -0.21905623
   0.12776063  0.26278168  0.25719118  0.06097    -0.08899504 -0.01598633
  -0.17294253  0.11381213  0.02197479 -0.15324774]
 [-0.2724054  -0.20001796  0.16159725  0.14414361 -0.15217759  0.03910734
  -0.0287217  -0.36214456  0.09846065 -0.2816601   0.16233344  0.05165628
   0.09356121 -0.34000483 -0.11413509 -0.18437828  0.24018356 -0.04988728
  -0.14675927 -0.18477339  0.02653232  0.12362278 -0.3247004   0.22635524
  -0.20072784 -0.14686318 -0.10047477  0.11190627 -0.3675978   0.10246873
  -0.09085575  0.24679601  0.20011906  0.19778116 -0.21389875 -0.26218167
  -0.13591573 -0.33509788 -0.27329853  0.06700288  0.20199417 -0.24272051
  -0.21182287 -0.18269324  0.03125214 -0.16905282  0.21374527  0.31005067
  -0.28512248  0.12652333  0.2328636   0.05084969  0.07037014 -0.272253
  -0.02066905 -0.19533634 -0.13965334  0.06447023 -0.3410914  -0.20820603
   0.15348396  0.24857293  0.29150513 -0.17728001]
 [-0.23520945  0.15602402  0.08316176  0.1474197   0.20039478 -0.16411345
   0.22117762 -0.22256547  0.24204217  0.04449666 -0.00209435 -0.04554974
   0.14986113 -0.25531116  0.21111749 -0.18804914  0.16733178 -0.12739937
   0.06258377 -0.11445496 -0.01719498  0.03821147  0.17777091 -0.187746
   0.00496679  0.1435656   0.26538655  0.18959662  0.21624787  0.009065
  -0.04557622  0.28224543  0.05649941 -0.02412524 -0.19265822  0.20459816
  -0.10144102 -0.14868864  0.11377929  0.11718796 -0.05777418  0.01327093
   0.15628706 -0.10039666 -0.05959534 -0.01612666  0.26056105 -0.07880863
   0.11079801 -0.24273358 -0.07312241 -0.12603405 -0.10368913  0.16787258
   0.19364725  0.14155522 -0.10888953  0.20078321 -0.00388886  0.20386632
   0.24211283  0.03308355 -0.22914848 -0.0391342 ]
 [ 0.04280342 -0.16806695 -0.10548043  0.23361939  0.03402813  0.16066995
   0.20121902 -0.05559179  0.07184225  0.18177344 -0.16013621 -0.22704944
   0.24303687  0.05240402 -0.05279581 -0.11736936  0.02504604 -0.31943932
  -0.20365126  0.21101938 -0.17303765  0.10245851  0.04873113 -0.1955208
   0.08567688 -0.22929971  0.1853039  -0.3162566  -0.11753504  0.15949285
  -0.06176531 -0.03216173 -0.04726392 -0.26966342  0.00648859 -0.20734197
   0.18655953 -0.00186729 -0.16843835 -0.11096124  0.12358333 -0.00337499
   0.23806816  0.14458163 -0.03199968 -0.09243913  0.04609194  0.00256346
  -0.00222392 -0.22532563  0.11340954 -0.23550232  0.16182157 -0.2385639
   0.25687155 -0.12470431  0.22822738  0.14925075 -0.1010528  -0.22072989
  -0.07112771 -0.04667087 -0.03714759 -0.2389788 ]
 [ 0.15621305 -0.07757792 -0.02840948 -0.2756145  -0.02691017  0.24525476
  -0.00329656 -0.00056671 -0.2212606  -0.18315442  0.21438003  0.04216012
   0.14108515  0.20910452 -0.00598517  0.09297341 -0.05313625  0.24258299
  -0.00978259 -0.20360269 -0.11407258 -0.01528985 -0.03473942 -0.17076516
  -0.23513359  0.06690317 -0.14548378 -0.13456926  0.20508197  0.21677063
   0.00723044 -0.02218245  0.03195985  0.02265642 -0.1574026  -0.11547738
  -0.03982496 -0.0459046  -0.06272758  0.28168654  0.2726684   0.00107193
  -0.272881    0.19330475 -0.2036685  -0.278281    0.20417689 -0.16538395
  -0.13866396  0.1727991  -0.0168461   0.21416453 -0.06308173  0.06013976
  -0.26764405 -0.24669226  0.09896494  0.18570614 -0.14982678  0.30727375
  -0.17919663  0.25873196 -0.24796137  0.03347048]
 [ 0.1395441   0.06229867 -0.15919954  0.02867591 -0.13690045 -0.12451708
   0.26224953 -0.09412176 -0.18374181  0.11130448 -0.23801214 -0.03448129
  -0.20931    -0.2256857   0.22692172 -0.17243865  0.2794772  -0.05032349
  -0.20016742  0.03268497  0.27783114  0.25225985  0.05636144 -0.09496292
  -0.04982477  0.12225047 -0.23468818  0.15669769 -0.11711328 -0.25555968
   0.13418758  0.18294619 -0.17901088  0.0532695   0.01393676  0.04259022
  -0.06474873 -0.05011928 -0.22222471  0.04690555 -0.14289652  0.14528398
  -0.261204    0.14449333  0.05324759  0.1686337  -0.08189588 -0.12428286
   0.17829138  0.22477524  0.21350959 -0.14077207  0.2835214   0.2345963
  -0.09022728 -0.20785184 -0.08548743 -0.04172724 -0.0940035   0.02441803
   0.2602915   0.01670656 -0.19516079 -0.24712291]
 [-0.03465037  0.11748839 -0.1946489  -0.02427714  0.11840568  0.04338459
  -0.14067979  0.13566063  0.30511326 -0.02036879  0.12172617  0.26815066
  -0.11474405 -0.05415222 -0.02162901  0.09064347  0.05887872  0.03277064
   0.17862384  0.03428796  0.19693944  0.2732702   0.26461676 -0.21756229
  -0.02412023  0.00161708  0.16845384  0.30721077 -0.1705463   0.2276952
   0.07915314 -0.05241114 -0.16910236 -0.2905214   0.22566435  0.07037908
   0.20506224 -0.01048965  0.1922396  -0.08423822 -0.08753476  0.12322685
   0.07746991  0.09134194 -0.19967523  0.200105   -0.11088245 -0.30461705
  -0.16595115 -0.13361824 -0.18220708  0.22776608  0.1851422  -0.13466688
  -0.22512484  0.12220863  0.14688343  0.14807433 -0.26166123  0.13319439
  -0.21334383  0.13010684 -0.21073933  0.07808433]
 [ 0.08965355  0.01499732  0.2297157  -0.06015449  0.02262225 -0.29216883
  -0.21577081  0.17650403 -0.11803572  0.04091308  0.2839711  -0.05628192
   0.22676095 -0.01382109 -0.03544207  0.16564241  0.20304511 -0.09097338
   0.25287554 -0.1797756   0.0547785  -0.21374302  0.22729184 -0.04974876
   0.1293657   0.16936328  0.09568378 -0.11460824  0.22436458 -0.17885384
   0.004464   -0.16217063  0.10395523 -0.18053852  0.0874898  -0.24386099
   0.21620707 -0.22682387  0.2019654   0.09075411 -0.04918823  0.14552397
  -0.03271517 -0.01195329 -0.2283112   0.21722317 -0.18341637  0.10463355
   0.27448237 -0.17889975 -0.18258621  0.12445598  0.09598092  0.17665401
   0.02467845  0.00592865  0.15735707  0.12539749  0.24150845 -0.1261313
   0.00780406 -0.13626607 -0.19135186 -0.06139433]
 [ 0.06694377  0.1657397  -0.00611886 -0.07514691 -0.21738929 -0.1351174
  -0.25472578  0.12905593 -0.20847875 -0.07910483  0.00092226  0.04877701
   0.0370545   0.22108443 -0.27488354  0.18996663 -0.25548485 -0.25539854
   0.09275749 -0.15113518 -0.1482693  -0.08876044  0.11782277  0.02151915
  -0.17437364 -0.15622732  0.00691497  0.01856008  0.09961864 -0.0341
  -0.21809833 -0.13739806  0.19288833 -0.24016182  0.10900763 -0.15938407
   0.11996815  0.1742933   0.20802166  0.2709216   0.21490511  0.08915096
  -0.12727879 -0.12009124  0.21507055  0.02807207 -0.14651914 -0.16075222
  -0.17449577 -0.0714568  -0.16881359 -0.03759068  0.22940421  0.05468367
  -0.06206057 -0.2232159   0.09101291 -0.26039398  0.25236696  0.01140631
   0.24667428  0.25598586  0.2646931   0.06749443]
 [ 0.05538818 -0.05624257  0.03626699  0.10823164 -0.1261808   0.25546393
  -0.05802579 -0.23817156  0.25527546 -0.24512313 -0.01683146  0.1923662
   0.23948534 -0.15875888 -0.25238064  0.2714735   0.21675564  0.05195682
   0.02638216  0.29675785 -0.25138235  0.03519762 -0.20681207  0.2597533
  -0.18316272 -0.27580485 -0.06233062  0.30913147  0.04258275  0.15169401
   0.19597453 -0.09310547 -0.20868902 -0.21798563  0.11408696  0.22551373
   0.08772748 -0.07008294  0.11200325 -0.12943716 -0.21865298  0.12238237
   0.12455092  0.04789633  0.24061522  0.12018529  0.24479666  0.07736884
   0.2883136  -0.11161707 -0.06043011 -0.15858099  0.25285637  0.02893479
  -0.2701709   0.252679   -0.1730844   0.19128117  0.25645736  0.09684654
  -0.06222756 -0.25760648 -0.21527384  0.11553407]
 [-0.17216747 -0.15793833 -0.01846462  0.05576959  0.1593902   0.0544066
  -0.14790085  0.13177048 -0.03671626  0.24489722 -0.17421623  0.1241491
  -0.23255399  0.17412551 -0.04433131  0.20629938  0.08893118  0.03968006
  -0.15385972 -0.01252692  0.27420276  0.03240899  0.22684015 -0.04893707
  -0.10703649  0.21332406  0.20125821  0.32323837  0.16492867 -0.15137084
  -0.23195043 -0.09846766 -0.08609594  0.00228436  0.26148173  0.1010954
  -0.00355629  0.3145682  -0.13767362  0.28261596 -0.00552887  0.03560481
   0.14195384 -0.2495072   0.21224184  0.08542047 -0.23979764 -0.21939132
  -0.12151437  0.0587079  -0.00408112 -0.08260962  0.2697938   0.14863338
   0.07085244  0.17378527  0.06592158 -0.01215199  0.10914806 -0.20651525
   0.01932139 -0.22951072  0.00096816 -0.05802479]]
 Layer dense_1 weights: [[ 0.18265994 -0.24676082  0.15317412 ... -0.18027642 -0.01487564
  -0.07407705]
 [-0.19370076  0.18582045 -0.23008677 ... -0.05755118 -0.21868934
   0.07031552]
 [ 0.21035707  0.04885225 -0.19932055 ... -0.03898479  0.04992932
   0.15947986]
 ...
 [ 0.02724968  0.18212911  0.15646768 ... -0.01757971 -0.11695193
   0.17892222]
 [ 0.26124173 -0.15221736 -0.18164262 ... -0.06602552  0.00648686
  -0.18105932]
 [ 0.00245617  0.20854373  0.07280722 ...  0.10383212 -0.23099771
   0.20263927]]
 Layer dense_2 weights: [[ 0.15971956]
 [-0.00105832]
 [ 0.18053086]
 [-0.25851646]
 [ 0.15258646]
 [ 0.20507765]
 [ 0.02906602]
 [-0.44394156]
 [-0.6558382 ]
 [-0.36490506]
 [-0.14576703]
 [-0.0406099 ]
 [ 0.05374753]
 [ 0.4768776 ]
 [-0.04878689]
 [ 0.13433182]
 [ 0.28526762]
 [ 0.25335392]
 [-0.10293828]
 [ 0.28332803]
 [ 0.37426648]
 [ 0.36922568]
 [ 0.2478052 ]
 [-0.5153534 ]
 [-0.3144344 ]
 [-0.4101126 ]
 [ 0.1708249 ]
 [ 0.28835422]
 [ 0.17798896]
 [-0.33901557]
 [ 0.2123638 ]
 [-0.39389083]]
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.4179 - recall_1: 0.2104
Out[156]:
<keras.src.callbacks.history.History at 0x7e81013e81c0>
In [157]:
print("\nModel Training Complete")
print("-" * 100)
print("For each subsequent epoch, these values get updated:")
print("•	Training Loss and Validation Loss: How the loss changes over time.")
print("•	Training Recall and Validation Recall: How the ability of the model to detect true positives changes during training.")
print("-" * 100)
Model Training Complete
----------------------------------------------------------------------------------------------------
For each subsequent epoch, these values get updated:
•	Training Loss and Validation Loss: How the loss changes over time.
•	Training Recall and Validation Recall: How the ability of the model to detect true positives changes during training.
----------------------------------------------------------------------------------------------------

History Keys in Keras Metrics

In [160]:
print(history_0.history.keys()) # Available keys in the history object:
dict_keys(['loss', 'recall_1', 'val_loss', 'val_recall_1'])

Loss function

In [ ]:
#Plotting Train Loss vs Validation Loss
plt.plot(history_0.history['loss'])
plt.plot(history_0.history['val_loss'])
plt.title('Model Loss with SGD Optimizer')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()

Recall ... yay!

In [161]:
#Plotting Train recall vs Validation recall
plt.plot(history_0.history['recall_1'])
plt.plot(history_0.history['val_recall_1'])
plt.title('Model Recall - SGD Optimizer')
plt.ylabel('Recall')
plt.xlabel('Epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()
In [162]:
# Predicting the results using best as a threshold

y_train_pred = model_0.predict(X_train) # This line uses the trained model model_0 to make predictions on the training data (X_train).
# The predict function outputs a probability score for each sample, typically between 0 and 1,
# since the output layer has a sigmoid activation function (which outputs a probability for binary classification).

# Activation Thresholding for Output layer
y_train_pred = (y_train_pred > 0.5) # Since the model’s predictions are probabilities, this line converts the probabilities into binary classifications.
# If the probability is greater than 0.5, it is classified as True (1), otherwise, it is classified as False (0). Sigmoid

y_train_pred # After thresholding, y_train_pred will contain a series of True and False values that correspond to the model’s classification of each instance in the training set (X_train).
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 993us/step
Out[162]:
array([[ True],
       [False],
       [False],
       ...,
       [False],
       [False],
       [False]])
In [163]:
print("\nmodel_0.predict(X_train) gives you a probability score for each training sample.")
print(" • (y_train_pred > 0.5) converts these probabilities into binary predictions using a threshold of 0.5,")
print("    where values greater than 0.5 are classified as True (or 1) and those less than or equal to 0.5 as False (or 0).")
model_0.predict(X_train) gives you a probability score for each training sample.
 • (y_train_pred > 0.5) converts these probabilities into binary predictions using a threshold of 0.5,
    where values greater than 0.5 are classified as True (or 1) and those less than or equal to 0.5 as False (or 0).
In [ ]:
# Predicting on the validation set
y_val_pred = model_0.predict(X_val)
y_val_pred = (y_val_pred > 0.5)

# Now you can store the recall scores
train_metric_df.loc[model_name] = recall_score(y_train, y_train_pred)
valid_metric_df.loc[model_name] = recall_score(y_val, y_val_pred)
50/50 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
In [ ]:
model_name = "Neutral Network with SGD"

train_metric_df.loc[model_name] = recall_score(y_train, y_train_pred)
valid_metric_df.loc[model_name] = recall_score(y_val, y_val_pred)

Classification report

In [ ]:
# Classification report
print(f"\033[33mClassification report for the training set:\n")
cr = classification_report(y_train, y_train_pred)
print(cr)
Classification report for the training set:

              precision    recall  f1-score   support

           0       0.88      0.96      0.92      5096
           1       0.77      0.46      0.58      1304

    accuracy                           0.86      6400
   macro avg       0.82      0.71      0.75      6400
weighted avg       0.85      0.86      0.85      6400

In [ ]:
# Classification report
print(f"\033[32mClassification report for the validation set:\n")
cr = classification_report(y_val, y_val_pred)
print(cr)
Classification report for the validation set:

              precision    recall  f1-score   support

           0       0.92      0.83      0.87      1274
           1       0.52      0.70      0.60       326

    accuracy                           0.81      1600
   macro avg       0.72      0.77      0.73      1600
weighted avg       0.83      0.81      0.82      1600

Confusion matrix

In [ ]:
print(f"\033[33mTraining Confusion Matrix Predictions with the 6,400 (Training) dataset:\033[0m\n")
make_confusion_matrix(y_train, y_train_pred)
Training Confusion Matrix Predictions with the 6,400 (Training) dataset:

In [ ]:
print(f"\033[32mValidation Confusion Matrix Predictions with the 1,600 (Validation) dataset:\033[0m\n")
make_confusion_matrix(y_val, y_val_pred)
Validation Confusion Matrix Predictions with the 1,600 (Validation) dataset:

In [ ]:
result_df = train_metric_df - valid_metric_df # Calculates the Recall difference between the training and the validation.
print(result_df.head(1))  # Displays the first row
# Display just the last value in the first row as percentage
last_value_first_row = result_df.iloc[0, -1]  # Using iloc to access the first row and last column
print(f"\033[34m\nDelta between Training and Validation Recall values is: {last_value_first_row * 100:.2f}%\033[0m")  # Prints the value in the last cell of the first row
                            recall
Neutral Network with SGD  0.034509

Delta between Training and Validation Recall values is: 3.45%
In [170]:
# Back Propagation Illustration - Utility Function

import time
import sys

# ANSI color codes
RED = "\033[31m"
GREEN = "\033[32m"
ORANGE = "\033[33m"
RESET = "\033[0m"

print("\nIllustration of -Back Propagation- to nodes with stronger influence after wights are updated with a keras Optimizer")

# Function to animate dashes moving from right to left, with "updated weight" in the middle
def animate_dashes_with_weight_in_middle(total_time=10, speed=0.1, asterisk_color="GREEN", dash_color="RED"):
    # Choose dash color based on user input
    if dash_color == "RED":
        dash_color = RED
    elif dash_color == "GREEN":
        dash_color = GREEN
    elif dash_color == "ORANGE":
        dash_color = ORANGE
    else:
        dash_color = RESET  # Default to no color if an invalid option is given

    # Choose asterisk color
    if asterisk_color == "RED":
        asterisk_color = RED
    elif asterisk_color == "GREEN":
        asterisk_color = GREEN
    elif asterisk_color == "ORANGE":
        asterisk_color = ORANGE
    else:
        asterisk_color = RESET  # Default to no color if an invalid option is given

    dash_width = 46  # Starting width of the dashes
    fixed_pos = 46  # Position where the asterisk [*] will stay fixed
    word = " updated weight "
    word_length = len(word)
    start_time = time.time()  # Record the starting time

    while time.time() - start_time < total_time:
        # Calculate the remaining space for the dashes
        remaining_space = fixed_pos - dash_width - 3  # 3 accounts for the length of [*]

        # Calculate the position to place the word in the middle
        total_length = dash_width + len(word) + 3  # Total length including dashes and word
        word_position = (total_length // 2) - (word_length // 2)  # Center the word

        # Create the dashes string
        dashes = '-' * dash_width

        # Insert the word into the dashes at the calculated position
        if word_position < dash_width:
            dashes = dashes[:word_position] + word + dashes[word_position + word_length:]

        # Print the line with dashes shrinking toward the asterisk [*]
        sys.stdout.write(f"\r{asterisk_color}[* neuron layer_i *] {dash_color}{dashes}{' ' * remaining_space}{RESET}")
        sys.stdout.flush()  # Force the output to appear immediately
        time.sleep(speed)   # Control the speed of the animation

        if dash_width == 0:
            dash_width = 45  # Reset dash width after dashes shrink to 0
        else:
            dash_width -= 1  # Decrease the number of dashes for the leftward effect

    # After timeout ends, clear the line and stop
    sys.stdout.write("\r" + " " * (fixed_pos + 3) + "\r")
    sys.stdout.flush()

# Run the animation (set timeout in seconds and specify colors)
animate_dashes_with_weight_in_middle(total_time=10, speed=0.08, asterisk_color="GREEN", dash_color="RED")

# Leave the illustration completed without animation
print(f"\r{GREEN}[* neuron layer_i *]{RED} <---------------------------------------- {ORANGE}[* neuron layer_i+1 *]")
Illustration of -Back Propagation- to nodes with stronger influence after wights are updated with a keras Optimizer
[* neuron layer_i *] <---------------------------------------- [* neuron layer_i+1 *]
In [171]:
message = "Essentially:\nHow fast the global minimum is found (by Error Loss) is a function of the chosen Keras optimizers \n" \
          "during the backflow propagation with updated weights and strengths of signals between neurons."

# Run the animation (set timeout in seconds and specify color: "RED", "GREEN", or "ORANGE")
animate_dashes_with_fixed_asterisk(total_time=5, speed=0.08, color="GREEN")
print(f"\r{GREEN}[* neuron layer_i *]{GREEN} ----------------------------------------- {ORANGE}[* neuron layer_i+1 *]")

# Run the animation (set timeout in seconds and specify colors)
animate_dashes_with_weight_in_middle(total_time=5, speed=0.08, asterisk_color="GREEN", dash_color="RED")
print(f"\r{GREEN}[* neuron later_i *]{RED} <---------------------------------------- {ORANGE}[* neuron layer_i+1 *]{RESET}")

print(message)
[* neuron layer_i *] ----------------------------------------- [* neuron layer_i+1 *]
[* neuron later_i *] <---------------------------------------- [* neuron layer_i+1 *]
Essentially:
How fast the global minimum is found (by Error Loss) is a function of the chosen Keras optimizers 
during the backflow propagation with updated weights and strengths of signals between neurons.

Model Performance Improvement -------------------------------------- Section

This is where the computational resources versus the prediction efficiency get tweaked for fun and profit ;-) ...

Models Optimization Overview:

#ModelOptimizerRegularizationBatch Normalization
1NNAdamn/an/a
2NNAdamDropoutn/a
3NNSGDn/aBalanced
4NNAdamn/aBalanced
5NNAdamDropoutBalanced
  • Balanced Data is a data preprocessing step that ensures fair representation of classes before training even begins.
  • Balanced Data ensures that the model learns equally well across all classes, avoiding class bias, particularly in classification tasks.
  • Batch Normalization helps the model train more efficiently and reduces issues like vanishing or exploding gradients.
  • Batch Normalization operates at the model level during training, normalizing activations layer by layer.
  • Dropout introduces noise by dropping random neurons, forcing the model to learn more robust features.
  • Dropout and Batch Normalization both help prevent overfitting and improve generalization during training, while Balanced Data ensures the data used for training is representative and fair, preventing bias in the model.
  • Dropout is a form of regularization because it adds randomness and noise to the training process, which prevents the model from relying too much on specific neurons and memorizing training data. Like other regularization techniques (e.g., L1/L2 regularization), Dropout helps the model generalize better to unseen data, reducing overfitting and improving performance on test data.

Other Regularization Techniques asides from Dropout:

  • L2 Regularization (Ridge): Penalizes the model for having large weights, discouraging complex models.
  • L1 Regularization (Lasso): Encourages sparsity in weights, leading to feature selection.

keras Optimizers:

  • Backpropagation computes the gradientst to locate the local/global minimum.

  • SGD, Adam, RMSprop and AdaGrad Optimizers from keras.optimizers use these gradients to update the weights across all layers.

(1) Neural Network with Adam Optimizer - - Model Performance Improvement

Adam Optimizer: The Adam (Adaptive Moment Estimation) optimizer is a popular gradient-based optimization algorithm that combines the best properties of two other methods: AdaGrad (adaptive learning rates) and RMSProp (root mean square propagation). It adjusts the learning rate dynamically for each parameter and uses momentum to accelerate convergence.

Advantages:

  • Adaptive learning rate: Adam automatically adjusts the learning rates for each parameter during training, making it efficient and effective across a wide range of problems.
  • Momentum: It uses momentum to speed up convergence by considering past gradients, which helps avoid oscillations.
  • Efficient: Works well for large datasets and with high-dimensional data.
  • Less tuning required: Works well with little parameter tuning (learning rate, batch size, etc.), reducing the need for extensive experimentation.

Disadvantages:

  • Memory intensive: Adam stores additional parameters (momentum and squared gradient averages), which can increase memory usage.
  • Overfitting: Sometimes, it can lead to overfitting as it adapts too aggressively, especially on complex models.
  • May not generalize well: While Adam is good for fast convergence, it may not always provide the best final solution compared to simpler optimizers like SGD.
In [164]:
backend.clear_session()
#Fixing the seed for random number generators so that we can ensure we receive the same output everytime
np.random.seed(2)
random.seed(2)
tf.random.set_seed(2)
In [165]:
# Initializing the neural network
model_1 = Sequential()

# Adding the input layer with 64 neurons and ReLU as the activation function
model_1.add(Dense(64, activation='relu', input_dim=X_train.shape[1]))

# Adding a hidden layer with 32 neurons and ReLU as the activation function
model_1.add(Dense(32, activation='relu'))

# Adding the output layer with 1 neuron and sigmoid as the activation function
model_1.add(Dense(1, activation='sigmoid'))
In [166]:
#Complete the code to use Adam as the optimizer.
optimizer = tf.keras.optimizers.Adam() # Adam is a popular optimization algorithm that combines the advantages of both AdaGrad (adaptive learning rate) and RMSProp (root mean square propagation), making it a good choice for most deep learning models.

# uncomment one of the following lines to define the metric to be used
# metric = 'accuracy'
metric = keras.metrics.Recall() # Measures how many true positives were identified correctly (useful for imbalanced datasets).
# metric = keras.metrics.Precision()
# metric = keras.metrics.F1Score()
In [167]:
# Complete the code to compile the model with binary cross entropy as the loss function and recall as the metric
model_1.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=[keras.metrics.Recall()])
In [ ]:
model_1.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ dense (Dense)                        │ (None, 64)                  │             768 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 32)                  │           2,080 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 1)                   │              33 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 2,881 (11.25 KB)
 Trainable params: 2,881 (11.25 KB)
 Non-trainable params: 0 (0.00 B)

More Drum rolls please ...

In [169]:
# Fitting the ANN
print("\nFitting the Artificial Neural Network (ANN) Model: Adam Optimizer...\n")
history_1 = model_1.fit(
    X_train, y_train,
    batch_size=32,  # Specify the batch size to 32
    validation_data=(X_val, y_val),
    epochs=50,  # Specify the number of epochs to 50
    verbose=1
)
Fitting the Artificial Neural Network (ANN) Model: Adam Optimizer...

Epoch 1/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - loss: 0.4976 - recall_1: 0.0810 - val_loss: 0.4180 - val_recall_1: 0.2638
Epoch 2/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3900 - recall_1: 0.3592 - val_loss: 0.3830 - val_recall_1: 0.3374
Epoch 3/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3597 - recall_1: 0.4182 - val_loss: 0.3689 - val_recall_1: 0.3804
Epoch 4/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3479 - recall_1: 0.4374 - val_loss: 0.3640 - val_recall_1: 0.3926
Epoch 5/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3422 - recall_1: 0.4499 - val_loss: 0.3612 - val_recall_1: 0.3926
Epoch 6/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3383 - recall_1: 0.4579 - val_loss: 0.3596 - val_recall_1: 0.3926
Epoch 7/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3353 - recall_1: 0.4607 - val_loss: 0.3583 - val_recall_1: 0.3926
Epoch 8/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3325 - recall_1: 0.4699 - val_loss: 0.3568 - val_recall_1: 0.4080
Epoch 9/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3300 - recall_1: 0.4753 - val_loss: 0.3560 - val_recall_1: 0.4080
Epoch 10/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3278 - recall_1: 0.4763 - val_loss: 0.3552 - val_recall_1: 0.4080
Epoch 11/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3257 - recall_1: 0.4802 - val_loss: 0.3549 - val_recall_1: 0.4080
Epoch 12/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3238 - recall_1: 0.4872 - val_loss: 0.3542 - val_recall_1: 0.4080
Epoch 13/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3220 - recall_1: 0.4905 - val_loss: 0.3540 - val_recall_1: 0.4080
Epoch 14/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3202 - recall_1: 0.4937 - val_loss: 0.3539 - val_recall_1: 0.4110
Epoch 15/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3185 - recall_1: 0.4942 - val_loss: 0.3538 - val_recall_1: 0.4172
Epoch 16/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3169 - recall_1: 0.4937 - val_loss: 0.3538 - val_recall_1: 0.4172
Epoch 17/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3153 - recall_1: 0.4964 - val_loss: 0.3542 - val_recall_1: 0.4172
Epoch 18/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3138 - recall_1: 0.4973 - val_loss: 0.3541 - val_recall_1: 0.4141
Epoch 19/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3124 - recall_1: 0.5001 - val_loss: 0.3544 - val_recall_1: 0.4141
Epoch 20/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3109 - recall_1: 0.5014 - val_loss: 0.3548 - val_recall_1: 0.4172
Epoch 21/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3094 - recall_1: 0.5014 - val_loss: 0.3552 - val_recall_1: 0.4202
Epoch 22/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3079 - recall_1: 0.5041 - val_loss: 0.3558 - val_recall_1: 0.4172
Epoch 23/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3068 - recall_1: 0.5073 - val_loss: 0.3563 - val_recall_1: 0.4202
Epoch 24/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3055 - recall_1: 0.5075 - val_loss: 0.3571 - val_recall_1: 0.4172
Epoch 25/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3043 - recall_1: 0.5082 - val_loss: 0.3578 - val_recall_1: 0.4202
Epoch 26/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3030 - recall_1: 0.5165 - val_loss: 0.3587 - val_recall_1: 0.4172
Epoch 27/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3017 - recall_1: 0.5191 - val_loss: 0.3596 - val_recall_1: 0.4080
Epoch 28/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3004 - recall_1: 0.5279 - val_loss: 0.3605 - val_recall_1: 0.4110
Epoch 29/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2991 - recall_1: 0.5384 - val_loss: 0.3616 - val_recall_1: 0.4110
Epoch 30/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2980 - recall_1: 0.5431 - val_loss: 0.3625 - val_recall_1: 0.4141
Epoch 31/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2968 - recall_1: 0.5447 - val_loss: 0.3635 - val_recall_1: 0.4141
Epoch 32/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2956 - recall_1: 0.5501 - val_loss: 0.3645 - val_recall_1: 0.4172
Epoch 33/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2946 - recall_1: 0.5508 - val_loss: 0.3655 - val_recall_1: 0.4172
Epoch 34/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2935 - recall_1: 0.5536 - val_loss: 0.3666 - val_recall_1: 0.4202
Epoch 35/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2924 - recall_1: 0.5611 - val_loss: 0.3676 - val_recall_1: 0.4202
Epoch 36/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2913 - recall_1: 0.5635 - val_loss: 0.3684 - val_recall_1: 0.4202
Epoch 37/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2903 - recall_1: 0.5675 - val_loss: 0.3694 - val_recall_1: 0.4202
Epoch 38/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2893 - recall_1: 0.5656 - val_loss: 0.3703 - val_recall_1: 0.4233
Epoch 39/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2882 - recall_1: 0.5674 - val_loss: 0.3713 - val_recall_1: 0.4233
Epoch 40/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2872 - recall_1: 0.5684 - val_loss: 0.3722 - val_recall_1: 0.4294
Epoch 41/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2862 - recall_1: 0.5701 - val_loss: 0.3731 - val_recall_1: 0.4325
Epoch 42/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2852 - recall_1: 0.5691 - val_loss: 0.3743 - val_recall_1: 0.4294
Epoch 43/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2842 - recall_1: 0.5712 - val_loss: 0.3754 - val_recall_1: 0.4325
Epoch 44/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2833 - recall_1: 0.5742 - val_loss: 0.3763 - val_recall_1: 0.4294
Epoch 45/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2821 - recall_1: 0.5767 - val_loss: 0.3773 - val_recall_1: 0.4325
Epoch 46/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2811 - recall_1: 0.5787 - val_loss: 0.3783 - val_recall_1: 0.4294
Epoch 47/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2803 - recall_1: 0.5785 - val_loss: 0.3793 - val_recall_1: 0.4294
Epoch 48/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2793 - recall_1: 0.5806 - val_loss: 0.3805 - val_recall_1: 0.4264
Epoch 49/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2784 - recall_1: 0.5798 - val_loss: 0.3813 - val_recall_1: 0.4325
Epoch 50/50
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2775 - recall_1: 0.5832 - val_loss: 0.3824 - val_recall_1: 0.4356

History Keys in Keras Metrics

In [ ]:
print(history_1.history.keys()) # Available keys in the history object for NN with Adam Optimizar:
dict_keys(['loss', 'recall_2', 'val_loss', 'val_recall_2'])

Loss function

In [ ]:
#Plotting Train Loss vs Validation Loss
plt.plot(history_1.history['loss'])
plt.plot(history_1.history['val_loss'])
plt.title('Model #1 Loss - Adam Optimizer')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()

Recall

In [ ]:
#Plotting Train recall vs Validation recall (dict_keys(['loss', 'recall_1', 'val_loss', 'val_recall_1']))
plt.plot(history_1.history['recall_2'])
plt.plot(history_1.history['val_recall_2'])
plt.title('Model #1 Recall - Adam Optimizer')
plt.ylabel('Recall')
plt.xlabel('Epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()
In [ ]:
#Predicting the results using 0.5 as the threshold
y_train_pred = model_1.predict(X_train)
y_train_pred = (y_train_pred > 0.5)
y_train_pred
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
Out[ ]:
array([[False],
       [False],
       [False],
       ...,
       [False],
       [ True],
       [False]])
In [ ]:
model_name = "NN with Adam"

train_metric_df.loc[model_name] = recall_score(y_train,y_train_pred)
valid_metric_df.loc[model_name] = recall_score(y_val,y_val_pred)

Classification report

In [ ]:
#lassification report
cr=classification_report(y_train,y_train_pred)
print(cr)
              precision    recall  f1-score   support

           0       0.89      0.97      0.93      5096
           1       0.84      0.55      0.66      1304

    accuracy                           0.89      6400
   macro avg       0.86      0.76      0.80      6400
weighted avg       0.88      0.89      0.88      6400

In [ ]:
#lassification report
cr=classification_report(y_val,y_val_pred)
print(cr)
              precision    recall  f1-score   support

           0       0.92      0.83      0.87      1274
           1       0.52      0.70      0.60       326

    accuracy                           0.81      1600
   macro avg       0.72      0.77      0.73      1600
weighted avg       0.83      0.81      0.82      1600

Confusion matrix

In [ ]:
#Calculating the confusion matrix
make_confusion_matrix(y_train, y_train_pred)
In [ ]:
#Calculating the confusion matrix
make_confusion_matrix(y_val, y_val_pred)

Observations:

  • Notice the increase of True Positives (TP) from Training set to validation set.
  • Also notice the decrease of False Negatives (FN) from Training set to Validation set!

(2) Neural Network with Adam Optimizer and Dropout - - Model Performance Improvement

Dropout is a regularization technique to prevent overfitting:

  • Randomly “drops” neurons during training: At each training step, a random subset of neurons (and their connections) is “dropped” or temporarily deactivated. This forces the network to learn more robust patterns that don’t rely on specific neurons.
  • Prevents overfitting: By randomly dropping neurons, the model cannot over-rely on any one feature or pattern, which helps it generalize better to unseen data, reducing overfitting.
  • Improves performance: Dropout encourages the network to learn multiple independent features, improving overall model performance when deployed.
In [ ]:
backend.clear_session()
#Fixing the seed for random number generators so that we can ensure we receive the same output everytime
np.random.seed(2)
random.seed(2)
tf.random.set_seed(2)
In [ ]:
# Initializing the Neural Network with Adam Optimizer and Dropout
model_2 = Sequential()

# Adding the input layer with 32 neurons and relu as activation function
model_2.add(Dense(32, activation='relu', input_dim=X_train.shape[1]))

# Complete the code to add dropout with a ratio of 0.2
model_2.add(Dropout(0.2))

# Complete the code to add a hidden layer (specify the # of neurons and the activation function)
model_2.add(Dense(64, activation='relu'))  # Example: 64 neurons with relu activation

# Complete the code to add another hidden layer (specify the # of neurons and the activation function)
model_2.add(Dense(64, activation='relu'))  # Example: Another 64 neurons with relu activation

# Complete the code to add dropout with a ratio of 0.1
model_2.add(Dropout(0.1))

# Complete the code to add a hidden layer (specify the # of neurons and the activation function)
model_2.add(Dense(32, activation='relu'))  # Example: 32 neurons with relu activation

# Complete the code to add the number of neurons required in the output layer.
model_2.add(Dense(1, activation='sigmoid'))  # Example: 1 neuron for binary classification
In [ ]:
#Complete the code to use Adam as the optimizer.
optimizer = tf.keras.optimizers.Adam()

# uncomment one of the following lines to define the metric to be used
# metric = 'accuracy'
metric = keras.metrics.Recall()
# metric = keras.metrics.Precision()
# metric = keras.metrics.F1Score()
In [ ]:
# Complete the code to compile the model with binary cross entropy as loss function and recall as the metric.
model_2.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=[metric])
In [ ]:
# Summary of the model
model_2.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ dense (Dense)                        │ (None, 32)                  │             384 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout (Dropout)                    │ (None, 32)                  │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 64)                  │           2,112 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 64)                  │           4,160 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout_1 (Dropout)                  │ (None, 64)                  │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_3 (Dense)                      │ (None, 32)                  │           2,080 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_4 (Dense)                      │ (None, 1)                   │              33 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 8,769 (34.25 KB)
 Trainable params: 8,769 (34.25 KB)
 Non-trainable params: 0 (0.00 B)
In [ ]:
# Fitting the ANN with batch_size = 32 and 100 epochs
history_2 = model_2.fit(
    X_train, y_train,
    batch_size=32,  # Specifying the batch size as 32
    epochs=100,  # Specifying the number of epochs as 100
    verbose=1,
    validation_data=(X_val, y_val)
)
Epoch 1/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 4s 5ms/step - loss: 0.5021 - recall: 0.0325 - val_loss: 0.4224 - val_recall: 0.1227
Epoch 2/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4207 - recall: 0.2949 - val_loss: 0.3781 - val_recall: 0.3129
Epoch 3/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3894 - recall: 0.3801 - val_loss: 0.3680 - val_recall: 0.3497
Epoch 4/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3774 - recall: 0.3914 - val_loss: 0.3653 - val_recall: 0.3558
Epoch 5/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3698 - recall: 0.4001 - val_loss: 0.3625 - val_recall: 0.3558
Epoch 6/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3622 - recall: 0.4099 - val_loss: 0.3605 - val_recall: 0.3834
Epoch 7/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3591 - recall: 0.4113 - val_loss: 0.3622 - val_recall: 0.3804
Epoch 8/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3610 - recall: 0.4424 - val_loss: 0.3590 - val_recall: 0.3896
Epoch 9/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3539 - recall: 0.4408 - val_loss: 0.3538 - val_recall: 0.4202
Epoch 10/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3498 - recall: 0.4324 - val_loss: 0.3543 - val_recall: 0.4049
Epoch 11/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3441 - recall: 0.4513 - val_loss: 0.3525 - val_recall: 0.4172
Epoch 12/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3520 - recall: 0.4514 - val_loss: 0.3534 - val_recall: 0.3988
Epoch 13/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3447 - recall: 0.4516 - val_loss: 0.3517 - val_recall: 0.3957
Epoch 14/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3435 - recall: 0.4510 - val_loss: 0.3503 - val_recall: 0.4387
Epoch 15/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3398 - recall: 0.4841 - val_loss: 0.3500 - val_recall: 0.4417
Epoch 16/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3376 - recall: 0.4700 - val_loss: 0.3534 - val_recall: 0.3865
Epoch 17/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3401 - recall: 0.4588 - val_loss: 0.3532 - val_recall: 0.4049
Epoch 18/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3369 - recall: 0.4872 - val_loss: 0.3508 - val_recall: 0.4049
Epoch 19/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3330 - recall: 0.4930 - val_loss: 0.3554 - val_recall: 0.4141
Epoch 20/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3424 - recall: 0.4717 - val_loss: 0.3507 - val_recall: 0.3957
Epoch 21/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3398 - recall: 0.4619 - val_loss: 0.3511 - val_recall: 0.4325
Epoch 22/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3278 - recall: 0.4908 - val_loss: 0.3548 - val_recall: 0.3957
Epoch 23/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3307 - recall: 0.4961 - val_loss: 0.3512 - val_recall: 0.3988
Epoch 24/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3262 - recall: 0.4995 - val_loss: 0.3516 - val_recall: 0.3957
Epoch 25/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3301 - recall: 0.4835 - val_loss: 0.3516 - val_recall: 0.4172
Epoch 26/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3273 - recall: 0.4920 - val_loss: 0.3525 - val_recall: 0.3957
Epoch 27/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3327 - recall: 0.4818 - val_loss: 0.3489 - val_recall: 0.4387
Epoch 28/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3263 - recall: 0.4824 - val_loss: 0.3509 - val_recall: 0.4141
Epoch 29/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3312 - recall: 0.4682 - val_loss: 0.3540 - val_recall: 0.4018
Epoch 30/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3272 - recall: 0.4827 - val_loss: 0.3501 - val_recall: 0.4264
Epoch 31/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3247 - recall: 0.4831 - val_loss: 0.3542 - val_recall: 0.4110
Epoch 32/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3266 - recall: 0.4856 - val_loss: 0.3533 - val_recall: 0.4080
Epoch 33/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3261 - recall: 0.4828 - val_loss: 0.3511 - val_recall: 0.4080
Epoch 34/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3276 - recall: 0.4856 - val_loss: 0.3518 - val_recall: 0.3988
Epoch 35/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3193 - recall: 0.5050 - val_loss: 0.3522 - val_recall: 0.4141
Epoch 36/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3233 - recall: 0.5064 - val_loss: 0.3526 - val_recall: 0.4049
Epoch 37/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3253 - recall: 0.4906 - val_loss: 0.3532 - val_recall: 0.3773
Epoch 38/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3210 - recall: 0.5106 - val_loss: 0.3523 - val_recall: 0.4356
Epoch 39/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3170 - recall: 0.5022 - val_loss: 0.3533 - val_recall: 0.4417
Epoch 40/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3230 - recall: 0.4781 - val_loss: 0.3502 - val_recall: 0.4325
Epoch 41/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3163 - recall: 0.4922 - val_loss: 0.3511 - val_recall: 0.4294
Epoch 42/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3154 - recall: 0.5081 - val_loss: 0.3538 - val_recall: 0.4479
Epoch 43/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3225 - recall: 0.5050 - val_loss: 0.3552 - val_recall: 0.4356
Epoch 44/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3121 - recall: 0.5011 - val_loss: 0.3557 - val_recall: 0.4601
Epoch 45/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3125 - recall: 0.5150 - val_loss: 0.3587 - val_recall: 0.4202
Epoch 46/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3110 - recall: 0.5192 - val_loss: 0.3553 - val_recall: 0.4325
Epoch 47/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3127 - recall: 0.5097 - val_loss: 0.3538 - val_recall: 0.4417
Epoch 48/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3160 - recall: 0.4986 - val_loss: 0.3535 - val_recall: 0.4172
Epoch 49/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3139 - recall: 0.5023 - val_loss: 0.3564 - val_recall: 0.4417
Epoch 50/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3123 - recall: 0.5161 - val_loss: 0.3599 - val_recall: 0.4448
Epoch 51/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3095 - recall: 0.5026 - val_loss: 0.3566 - val_recall: 0.4264
Epoch 52/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3173 - recall: 0.4996 - val_loss: 0.3559 - val_recall: 0.4049
Epoch 53/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3124 - recall: 0.5083 - val_loss: 0.3544 - val_recall: 0.4325
Epoch 54/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3077 - recall: 0.5171 - val_loss: 0.3585 - val_recall: 0.4325
Epoch 55/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3099 - recall: 0.5112 - val_loss: 0.3580 - val_recall: 0.4264
Epoch 56/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3093 - recall: 0.5005 - val_loss: 0.3570 - val_recall: 0.4264
Epoch 57/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3078 - recall: 0.5118 - val_loss: 0.3598 - val_recall: 0.4018
Epoch 58/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3050 - recall: 0.5047 - val_loss: 0.3562 - val_recall: 0.4202
Epoch 59/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3044 - recall: 0.5228 - val_loss: 0.3594 - val_recall: 0.4264
Epoch 60/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3089 - recall: 0.5132 - val_loss: 0.3531 - val_recall: 0.4387
Epoch 61/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3080 - recall: 0.5086 - val_loss: 0.3578 - val_recall: 0.4417
Epoch 62/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3096 - recall: 0.5239 - val_loss: 0.3599 - val_recall: 0.4479
Epoch 63/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3021 - recall: 0.5222 - val_loss: 0.3589 - val_recall: 0.4417
Epoch 64/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3068 - recall: 0.5231 - val_loss: 0.3587 - val_recall: 0.4141
Epoch 65/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3003 - recall: 0.5341 - val_loss: 0.3581 - val_recall: 0.4448
Epoch 66/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3009 - recall: 0.5325 - val_loss: 0.3589 - val_recall: 0.4387
Epoch 67/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2966 - recall: 0.5417 - val_loss: 0.3592 - val_recall: 0.4509
Epoch 68/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3010 - recall: 0.5312 - val_loss: 0.3611 - val_recall: 0.4294
Epoch 69/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2970 - recall: 0.5335 - val_loss: 0.3609 - val_recall: 0.4264
Epoch 70/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3053 - recall: 0.5245 - val_loss: 0.3577 - val_recall: 0.4724
Epoch 71/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3024 - recall: 0.5449 - val_loss: 0.3609 - val_recall: 0.4202
Epoch 72/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3034 - recall: 0.5271 - val_loss: 0.3657 - val_recall: 0.4877
Epoch 73/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2988 - recall: 0.5381 - val_loss: 0.3618 - val_recall: 0.4294
Epoch 74/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2987 - recall: 0.5398 - val_loss: 0.3679 - val_recall: 0.4632
Epoch 75/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2939 - recall: 0.5335 - val_loss: 0.3657 - val_recall: 0.4479
Epoch 76/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2958 - recall: 0.5389 - val_loss: 0.3671 - val_recall: 0.4202
Epoch 77/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2991 - recall: 0.5495 - val_loss: 0.3625 - val_recall: 0.4540
Epoch 78/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2918 - recall: 0.5587 - val_loss: 0.3652 - val_recall: 0.4417
Epoch 79/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2988 - recall: 0.5327 - val_loss: 0.3683 - val_recall: 0.4479
Epoch 80/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2970 - recall: 0.5504 - val_loss: 0.3650 - val_recall: 0.4356
Epoch 81/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3018 - recall: 0.5463 - val_loss: 0.3649 - val_recall: 0.4663
Epoch 82/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2881 - recall: 0.5573 - val_loss: 0.3691 - val_recall: 0.4417
Epoch 83/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2969 - recall: 0.5376 - val_loss: 0.3710 - val_recall: 0.4264
Epoch 84/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2956 - recall: 0.5466 - val_loss: 0.3705 - val_recall: 0.4448
Epoch 85/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2907 - recall: 0.5286 - val_loss: 0.3681 - val_recall: 0.4417
Epoch 86/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2947 - recall: 0.5577 - val_loss: 0.3734 - val_recall: 0.4141
Epoch 87/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2906 - recall: 0.5534 - val_loss: 0.3698 - val_recall: 0.4356
Epoch 88/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2945 - recall: 0.5465 - val_loss: 0.3713 - val_recall: 0.4571
Epoch 89/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2957 - recall: 0.5363 - val_loss: 0.3686 - val_recall: 0.4663
Epoch 90/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2882 - recall: 0.5625 - val_loss: 0.3752 - val_recall: 0.4601
Epoch 91/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2892 - recall: 0.5602 - val_loss: 0.3786 - val_recall: 0.4479
Epoch 92/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2841 - recall: 0.5752 - val_loss: 0.3802 - val_recall: 0.4448
Epoch 93/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2872 - recall: 0.5621 - val_loss: 0.3725 - val_recall: 0.4448
Epoch 94/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2913 - recall: 0.5773 - val_loss: 0.3746 - val_recall: 0.4908
Epoch 95/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2898 - recall: 0.5600 - val_loss: 0.3747 - val_recall: 0.4448
Epoch 96/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2884 - recall: 0.5410 - val_loss: 0.3771 - val_recall: 0.4509
Epoch 97/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2904 - recall: 0.5547 - val_loss: 0.3796 - val_recall: 0.4663
Epoch 98/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2877 - recall: 0.5522 - val_loss: 0.3782 - val_recall: 0.4387
Epoch 99/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2908 - recall: 0.5528 - val_loss: 0.3777 - val_recall: 0.4294
Epoch 100/100
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2874 - recall: 0.5413 - val_loss: 0.3850 - val_recall: 0.4417
In [ ]:
print(history_2.history.keys()) # Available keys in the history object for NN with Adam Optimizar:
dict_keys(['loss', 'recall', 'val_loss', 'val_recall'])

Loss function

In [ ]:
#Plotting Train Loss vs Validation Loss
plt.plot(history_2.history['loss'])
plt.plot(history_2.history['val_loss'])
plt.title('Model #2 Loss - Adam Optimizer + Dropout')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()

Observations:

  • From the above plot, we can observe that the train and validation curves are having smooth decend lines. Reducing the number of neurons and adding dropouts to the model worked well, and the problem of overfitting was solved.
In [ ]:
#Plotting Train recall vs Validation recall
plt.plot(history_2.history['recall'])
plt.plot(history_2.history['val_recall'])
plt.title('Model #2 Recall - Adam Optimizer + Dropout')
plt.ylabel('Recall')
plt.xlabel('Epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()

Observations: The Recall metric for the validation dataset is slightly under the training set but generally follows the same curve/noise.

In [ ]:
#Predicting the results using best as a threshold
y_train_pred = model_2.predict(X_train)
y_train_pred = (y_train_pred > 0.5)
y_train_pred
200/200 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
Out[ ]:
array([[False],
       [False],
       [False],
       ...,
       [False],
       [ True],
       [False]])
In [ ]:
#Predicting the results using 0.5 as the threshold.
y_val_pred = model_2.predict(X_val)
y_val_pred = (y_val_pred > 0.5)
y_val_pred
50/50 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
Out[ ]:
array([[False],
       [False],
       [False],
       ...,
       [False],
       [ True],
       [ True]])
In [ ]:
model_name = "NN with Adam & Dropout"

train_metric_df.loc[model_name] = recall_score(y_train,y_train_pred)
valid_metric_df.loc[model_name] = recall_score(y_val,y_val_pred)

Classification report

In [ ]:
# Training set classification report
cr=classification_report(y_train,y_train_pred)
print(cr)
              precision    recall  f1-score   support

           0       0.91      0.97      0.94      5096
           1       0.85      0.61      0.71      1304

    accuracy                           0.90      6400
   macro avg       0.88      0.79      0.82      6400
weighted avg       0.89      0.90      0.89      6400

In [ ]:
# Predicting on the validation set
y_val_pred = model_2.predict(X_val)
y_val_pred = (y_val_pred > 0.5)  # Thresholding the predictions
50/50 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
In [ ]:
# Validation dataset classification report
from sklearn.metrics import classification_report
cr = classification_report(y_val, y_val_pred)  # y_val is the true labels, y_val_pred is the predicted labels
print(cr)
              precision    recall  f1-score   support

           0       0.87      0.95      0.91      1274
           1       0.68      0.44      0.54       326

    accuracy                           0.84      1600
   macro avg       0.77      0.69      0.72      1600
weighted avg       0.83      0.84      0.83      1600

Confusion matrix

In [ ]:
#Calculating the confusion matrix
make_confusion_matrix(y_train, y_train_pred)
In [ ]:
#Calculating the confusion matrix
make_confusion_matrix(y_val, y_val_pred)

Observations:

  • Classification predictions on the training set loss and Recall improved.
  • Confusion matrix indicate both Predicted False Negatives declined and True Positives increased.
  • Recall = TP/(TP+FN)
  • This was a nice exercise!!

(3) Neural Network with Balanced Data (by applying SMOTE) and SGD Optimizer - - Model Performance Improvement

Apply SMOTE to balance the dataset and then again apply hyperparamter tuning accordingly.

In [ ]:
sm = SMOTE(random_state=42)
# Fitting SMOTE on the training data
X_train_smote, y_train_smote = sm.fit_resample(X_train, y_train)

# Printing the new shapes after upsampling
print('After UpSampling, the shape of X_train: {}'.format(X_train_smote.shape))
print('After UpSampling, the shape of y_train: {} \n'.format(y_train_smote.shape))
After UpSampling, the shape of X_train: (10192, 11)
After UpSampling, the shape of y_train: (10192,) 

Let's build a model with the balanced dataset

In [ ]:
backend.clear_session()
#Fixing the seed for random number generators so that we can ensure we receive the same output everytime
np.random.seed(2)
random.seed(2)
tf.random.set_seed(2)
In [ ]:
# Initializing the model
model_3 = Sequential()

# Adding the input layer with 32 neurons and relu activation function
model_3.add(Dense(32, activation='relu', input_dim=X_train_smote.shape[1]))

# Adding a hidden layer with 16 neurons and relu activation function
model_3.add(Dense(16, activation='relu'))

# Adding a hidden layer with 8 neurons and relu activation function
model_3.add(Dense(8, activation='relu'))

# Adding the output layer with 1 neuron and sigmoid activation function for binary classification
model_3.add(Dense(1, activation='sigmoid'))
In [ ]:
#Complete the code to use SGD as the optimizer.
optimizer = tf.keras.optimizers.SGD(0.001)

# uncomment one of the following lines to define the metric to be used
# metric = 'accuracy'
metric = keras.metrics.Recall()
# metric = keras.metrics.Precision()
# metric = keras.metrics.F1Score()
In [ ]:
# Complete the code to compile the model with binary cross entropy as loss function and recall as the metric
model_3.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=[metric])
In [ ]:
model_3.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ dense (Dense)                        │ (None, 32)                  │             384 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 16)                  │             528 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 8)                   │             136 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_3 (Dense)                      │ (None, 1)                   │               9 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 1,057 (4.13 KB)
 Trainable params: 1,057 (4.13 KB)
 Non-trainable params: 0 (0.00 B)
In [ ]:
# Fitting the ANN
history_3 = model_3.fit(
    X_train_smote,
    y_train_smote,
    batch_size=32,  # Specify the batch size, e.g., 32
    epochs=100,     # Specify the number of epochs, e.g., 100
    verbose=1,
    validation_data=(X_val, y_val)
)
Epoch 1/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 3s 6ms/step - loss: 0.6979 - recall: 0.8124 - val_loss: 0.7129 - val_recall: 0.8190
Epoch 2/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6953 - recall: 0.8224 - val_loss: 0.7083 - val_recall: 0.8129
Epoch 3/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6928 - recall: 0.8207 - val_loss: 0.7036 - val_recall: 0.7883
Epoch 4/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6903 - recall: 0.8077 - val_loss: 0.6989 - val_recall: 0.7577
Epoch 5/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6878 - recall: 0.7895 - val_loss: 0.6942 - val_recall: 0.7454
Epoch 6/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6852 - recall: 0.7692 - val_loss: 0.6895 - val_recall: 0.7209
Epoch 7/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6825 - recall: 0.7516 - val_loss: 0.6846 - val_recall: 0.6933
Epoch 8/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6797 - recall: 0.7333 - val_loss: 0.6800 - val_recall: 0.6810
Epoch 9/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6769 - recall: 0.7194 - val_loss: 0.6756 - val_recall: 0.6656
Epoch 10/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6741 - recall: 0.7128 - val_loss: 0.6715 - val_recall: 0.6595
Epoch 11/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6712 - recall: 0.7097 - val_loss: 0.6676 - val_recall: 0.6687
Epoch 12/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6684 - recall: 0.7073 - val_loss: 0.6639 - val_recall: 0.6595
Epoch 13/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6656 - recall: 0.7086 - val_loss: 0.6602 - val_recall: 0.6687
Epoch 14/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6626 - recall: 0.7136 - val_loss: 0.6567 - val_recall: 0.6595
Epoch 15/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6596 - recall: 0.7167 - val_loss: 0.6532 - val_recall: 0.6534
Epoch 16/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6565 - recall: 0.7178 - val_loss: 0.6498 - val_recall: 0.6564
Epoch 17/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6533 - recall: 0.7206 - val_loss: 0.6465 - val_recall: 0.6595
Epoch 18/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6501 - recall: 0.7231 - val_loss: 0.6431 - val_recall: 0.6595
Epoch 19/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6467 - recall: 0.7247 - val_loss: 0.6399 - val_recall: 0.6687
Epoch 20/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6433 - recall: 0.7286 - val_loss: 0.6368 - val_recall: 0.6748
Epoch 21/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6398 - recall: 0.7328 - val_loss: 0.6337 - val_recall: 0.6718
Epoch 22/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6363 - recall: 0.7363 - val_loss: 0.6308 - val_recall: 0.6779
Epoch 23/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6326 - recall: 0.7401 - val_loss: 0.6278 - val_recall: 0.6779
Epoch 24/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6290 - recall: 0.7446 - val_loss: 0.6249 - val_recall: 0.6810
Epoch 25/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6252 - recall: 0.7455 - val_loss: 0.6221 - val_recall: 0.6871
Epoch 26/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6215 - recall: 0.7495 - val_loss: 0.6194 - val_recall: 0.6902
Epoch 27/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6178 - recall: 0.7505 - val_loss: 0.6167 - val_recall: 0.6840
Epoch 28/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6140 - recall: 0.7529 - val_loss: 0.6140 - val_recall: 0.6902
Epoch 29/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6103 - recall: 0.7536 - val_loss: 0.6113 - val_recall: 0.6963
Epoch 30/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6066 - recall: 0.7538 - val_loss: 0.6088 - val_recall: 0.6994
Epoch 31/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6029 - recall: 0.7575 - val_loss: 0.6062 - val_recall: 0.6994
Epoch 32/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5993 - recall: 0.7595 - val_loss: 0.6038 - val_recall: 0.6963
Epoch 33/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5957 - recall: 0.7636 - val_loss: 0.6014 - val_recall: 0.6871
Epoch 34/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5922 - recall: 0.7646 - val_loss: 0.5990 - val_recall: 0.6902
Epoch 35/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5888 - recall: 0.7660 - val_loss: 0.5966 - val_recall: 0.6933
Epoch 36/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5855 - recall: 0.7651 - val_loss: 0.5942 - val_recall: 0.6933
Epoch 37/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5823 - recall: 0.7645 - val_loss: 0.5919 - val_recall: 0.6994
Epoch 38/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5792 - recall: 0.7641 - val_loss: 0.5896 - val_recall: 0.6994
Epoch 39/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5762 - recall: 0.7659 - val_loss: 0.5872 - val_recall: 0.6871
Epoch 40/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5733 - recall: 0.7629 - val_loss: 0.5848 - val_recall: 0.6871
Epoch 41/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5705 - recall: 0.7605 - val_loss: 0.5825 - val_recall: 0.6902
Epoch 42/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5677 - recall: 0.7598 - val_loss: 0.5804 - val_recall: 0.6902
Epoch 43/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5651 - recall: 0.7600 - val_loss: 0.5782 - val_recall: 0.6871
Epoch 44/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5625 - recall: 0.7599 - val_loss: 0.5760 - val_recall: 0.6933
Epoch 45/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5599 - recall: 0.7600 - val_loss: 0.5740 - val_recall: 0.6963
Epoch 46/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5574 - recall: 0.7591 - val_loss: 0.5720 - val_recall: 0.6963
Epoch 47/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5550 - recall: 0.7588 - val_loss: 0.5699 - val_recall: 0.6933
Epoch 48/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5527 - recall: 0.7594 - val_loss: 0.5680 - val_recall: 0.6933
Epoch 49/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5504 - recall: 0.7577 - val_loss: 0.5660 - val_recall: 0.6994
Epoch 50/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5481 - recall: 0.7613 - val_loss: 0.5639 - val_recall: 0.7055
Epoch 51/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5459 - recall: 0.7614 - val_loss: 0.5619 - val_recall: 0.7055
Epoch 52/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5438 - recall: 0.7624 - val_loss: 0.5600 - val_recall: 0.7086
Epoch 53/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5416 - recall: 0.7646 - val_loss: 0.5582 - val_recall: 0.7086
Epoch 54/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5395 - recall: 0.7647 - val_loss: 0.5564 - val_recall: 0.7086
Epoch 55/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5375 - recall: 0.7644 - val_loss: 0.5546 - val_recall: 0.7086
Epoch 56/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5354 - recall: 0.7644 - val_loss: 0.5527 - val_recall: 0.7117
Epoch 57/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5334 - recall: 0.7647 - val_loss: 0.5510 - val_recall: 0.7178
Epoch 58/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5315 - recall: 0.7645 - val_loss: 0.5493 - val_recall: 0.7147
Epoch 59/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5296 - recall: 0.7654 - val_loss: 0.5475 - val_recall: 0.7147
Epoch 60/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5277 - recall: 0.7651 - val_loss: 0.5459 - val_recall: 0.7147
Epoch 61/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5258 - recall: 0.7662 - val_loss: 0.5443 - val_recall: 0.7178
Epoch 62/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5240 - recall: 0.7670 - val_loss: 0.5426 - val_recall: 0.7178
Epoch 63/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5222 - recall: 0.7674 - val_loss: 0.5411 - val_recall: 0.7178
Epoch 64/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5204 - recall: 0.7674 - val_loss: 0.5395 - val_recall: 0.7239
Epoch 65/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5186 - recall: 0.7677 - val_loss: 0.5380 - val_recall: 0.7270
Epoch 66/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5168 - recall: 0.7675 - val_loss: 0.5363 - val_recall: 0.7270
Epoch 67/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5151 - recall: 0.7677 - val_loss: 0.5348 - val_recall: 0.7239
Epoch 68/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5134 - recall: 0.7687 - val_loss: 0.5332 - val_recall: 0.7239
Epoch 69/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5117 - recall: 0.7692 - val_loss: 0.5317 - val_recall: 0.7270
Epoch 70/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5101 - recall: 0.7708 - val_loss: 0.5304 - val_recall: 0.7301
Epoch 71/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5085 - recall: 0.7720 - val_loss: 0.5291 - val_recall: 0.7301
Epoch 72/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5069 - recall: 0.7719 - val_loss: 0.5277 - val_recall: 0.7301
Epoch 73/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5054 - recall: 0.7738 - val_loss: 0.5263 - val_recall: 0.7362
Epoch 74/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5038 - recall: 0.7733 - val_loss: 0.5250 - val_recall: 0.7331
Epoch 75/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5023 - recall: 0.7748 - val_loss: 0.5236 - val_recall: 0.7362
Epoch 76/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5009 - recall: 0.7739 - val_loss: 0.5223 - val_recall: 0.7362
Epoch 77/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4994 - recall: 0.7743 - val_loss: 0.5211 - val_recall: 0.7362
Epoch 78/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4980 - recall: 0.7737 - val_loss: 0.5198 - val_recall: 0.7423
Epoch 79/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4966 - recall: 0.7742 - val_loss: 0.5186 - val_recall: 0.7423
Epoch 80/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4952 - recall: 0.7757 - val_loss: 0.5173 - val_recall: 0.7454
Epoch 81/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4939 - recall: 0.7763 - val_loss: 0.5161 - val_recall: 0.7454
Epoch 82/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4926 - recall: 0.7771 - val_loss: 0.5149 - val_recall: 0.7454
Epoch 83/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4913 - recall: 0.7784 - val_loss: 0.5138 - val_recall: 0.7485
Epoch 84/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4901 - recall: 0.7781 - val_loss: 0.5126 - val_recall: 0.7485
Epoch 85/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4889 - recall: 0.7792 - val_loss: 0.5116 - val_recall: 0.7546
Epoch 86/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4877 - recall: 0.7798 - val_loss: 0.5105 - val_recall: 0.7607
Epoch 87/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4866 - recall: 0.7795 - val_loss: 0.5096 - val_recall: 0.7607
Epoch 88/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4855 - recall: 0.7793 - val_loss: 0.5086 - val_recall: 0.7577
Epoch 89/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4844 - recall: 0.7804 - val_loss: 0.5076 - val_recall: 0.7546
Epoch 90/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4833 - recall: 0.7814 - val_loss: 0.5065 - val_recall: 0.7546
Epoch 91/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4822 - recall: 0.7827 - val_loss: 0.5055 - val_recall: 0.7577
Epoch 92/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4812 - recall: 0.7830 - val_loss: 0.5046 - val_recall: 0.7577
Epoch 93/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4802 - recall: 0.7824 - val_loss: 0.5037 - val_recall: 0.7577
Epoch 94/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4792 - recall: 0.7827 - val_loss: 0.5026 - val_recall: 0.7546
Epoch 95/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4782 - recall: 0.7835 - val_loss: 0.5017 - val_recall: 0.7577
Epoch 96/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4772 - recall: 0.7835 - val_loss: 0.5007 - val_recall: 0.7577
Epoch 97/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4763 - recall: 0.7831 - val_loss: 0.4997 - val_recall: 0.7577
Epoch 98/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4754 - recall: 0.7828 - val_loss: 0.4988 - val_recall: 0.7577
Epoch 99/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4745 - recall: 0.7839 - val_loss: 0.4979 - val_recall: 0.7607
Epoch 100/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4736 - recall: 0.7841 - val_loss: 0.4970 - val_recall: 0.7638

Loss function

In [ ]:
#Plotting Train Loss vs Validation Loss
plt.plot(history_3.history['loss'])
plt.plot(history_3.history['val_loss'])
plt.title('Model #3 Loss - SGD Optimizer + Balanced')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()

Observations:

  • Initially as the epoch runs executed the loss on the validation dataset was better with a smoot decline. However, the loss increased after the 20th epoch and ended far higher than with the training dataset.
In [ ]:
#Plotting Train recall vs Validation recall
plt.plot(history_3.history['recall'])
plt.plot(history_3.history['val_recall'])
plt.title('Model #3 Recall - SGD Optimizer + Balanced')
plt.ylabel('Recall')
plt.xlabel('Epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()

Observations:

  • Stochastic Gradient Descent with a Balanced data at starting minimized the Recall, however initially, the drop under the first 10 epochs was the most drastic probably as a result of applying the SMOTE to the starting data as only around the 15 epoch the Recall began to favorably increase albeit with noise on the validation set.
  • A higher recall means fewer (FN) false negatives!!
  • The Recall on the validation set was under but followed pretty close to the training set. --That's impressive!
In [ ]:
y_train_pred = model_3.predict(X_train_smote)
#Predicting the results using 0.5 as the threshold
y_train_pred = (y_train_pred > 0.5)
y_train_pred
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step
Out[ ]:
array([[ True],
       [False],
       [False],
       ...,
       [ True],
       [ True],
       [ True]])
In [ ]:
y_val_pred = model_3.predict(X_val)
#Predicting the results using 0.5 as the threshold
y_val_pred = (y_val_pred > 0.5)
y_val_pred
50/50 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
Out[ ]:
array([[ True],
       [False],
       [False],
       ...,
       [False],
       [ True],
       [ True]])
In [ ]:
model_name = "NN with SMOTE & SGD"

train_metric_df.loc[model_name] = recall_score(y_train_smote,y_train_pred)
valid_metric_df.loc[model_name] = recall_score(y_val,y_val_pred)

Classification report

In [ ]:
cr=classification_report(y_train_smote,y_train_pred)
print(cr)
              precision    recall  f1-score   support

           0       0.78      0.76      0.77      5096
           1       0.77      0.79      0.78      5096

    accuracy                           0.78     10192
   macro avg       0.78      0.78      0.77     10192
weighted avg       0.78      0.78      0.77     10192

In [ ]:
cr=classification_report(y_val,y_val_pred) ## Complete the code to check the model's performance on the validation set
print(cr)
              precision    recall  f1-score   support

           0       0.92      0.74      0.82      1274
           1       0.42      0.76      0.55       326

    accuracy                           0.74      1600
   macro avg       0.67      0.75      0.68      1600
weighted avg       0.82      0.74      0.76      1600

Confusion matrix

In [ ]:
#Calculating the confusion matrix
make_confusion_matrix(y_train_smote, y_train_pred)
In [ ]:
#Calculating the confusion matrix

make_confusion_matrix(y_val,y_val_pred) ## Complete the code to check the model's performance on the validation set

Observations:

  • Predicted True Positives (TP) successfully increased.
  • Predicted False Negatives (FN) successfully decreased significantly with SMOTE balanced or a more uniformed starting data.
  • Recall measures how many of the actual positive cases were correctly predicted, and false negatives represent the cases the model missed (failed to predict as positive). A higher recall means fewer false negatives.

(4) Neural Network with Balanced Data (by applying SMOTE) and Adam Optimizer - - Model Performance Improvement

Let's build a model with the balanced dataset

In [ ]:
backend.clear_session()
#Fixing the seed for random number generators so that we can ensure we receive the same output everytime
np.random.seed(2)
random.seed(2)
tf.random.set_seed(2)
In [ ]:
# Initializing the model
model_4 = Sequential()

# Adding an input layer with 64 neurons and 'relu' as the activation function
model_4.add(Dense(64, activation='relu', input_dim=X_train_smote.shape[1]))

# Adding a hidden layer with 32 neurons and 'relu' as the activation function
model_4.add(Dense(32, activation='relu'))

# Adding another hidden layer with 16 neurons and 'relu' as the activation function
model_4.add(Dense(16, activation='relu'))

# Adding the output layer with 1 neuron and 'sigmoid' as the activation function
model_4.add(Dense(1, activation='sigmoid'))
In [ ]:
# Use Adam as the optimizer.
optimizer = tf.keras.optimizers.Adam()

# Uncomment one of the following lines to define the metric to be used
# metric = 'accuracy'
metric = keras.metrics.Recall() # TP/(TP+FN) therefore we want an increase of TPs and a decrease of FNs
# metric = keras.metrics.Precision()
# metric = keras.metrics.F1Score()
In [ ]:
# Complete the code to compile the model with binary cross entropy as loss function and recall as the metric
model_4.compile(loss='binary_crossentropy',optimizer=optimizer,metrics=[metric])
In [ ]:
model_4.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ dense (Dense)                        │ (None, 64)                  │             768 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 32)                  │           2,080 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 16)                  │             528 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_3 (Dense)                      │ (None, 1)                   │              17 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 3,393 (13.25 KB)
 Trainable params: 3,393 (13.25 KB)
 Non-trainable params: 0 (0.00 B)
In [ ]:
# Fitting the ANN SMOTE and Adam Optimizer
history_4 = model_4.fit(
    X_train_smote, y_train_smote,
    batch_size=32,  # You can specify the batch size, e.g., 32
    epochs=100,     # You can specify the number of epochs, e.g., 100
    verbose=1,
    validation_data=(X_val, y_val)
)
Epoch 1/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - loss: 0.5842 - recall: 0.7138 - val_loss: 0.5235 - val_recall: 0.7883
Epoch 2/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4544 - recall: 0.7830 - val_loss: 0.5017 - val_recall: 0.7669
Epoch 3/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4337 - recall: 0.7951 - val_loss: 0.4945 - val_recall: 0.7485
Epoch 4/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4217 - recall: 0.8005 - val_loss: 0.4897 - val_recall: 0.7423
Epoch 5/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4119 - recall: 0.8077 - val_loss: 0.4895 - val_recall: 0.7362
Epoch 6/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4030 - recall: 0.8118 - val_loss: 0.4897 - val_recall: 0.7270
Epoch 7/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.3951 - recall: 0.8175 - val_loss: 0.4876 - val_recall: 0.7270
Epoch 8/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.3878 - recall: 0.8214 - val_loss: 0.4920 - val_recall: 0.7270
Epoch 9/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.3807 - recall: 0.8279 - val_loss: 0.4932 - val_recall: 0.7209
Epoch 10/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.3737 - recall: 0.8303 - val_loss: 0.4933 - val_recall: 0.7178
Epoch 11/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.3674 - recall: 0.8363 - val_loss: 0.4967 - val_recall: 0.7270
Epoch 12/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.3608 - recall: 0.8413 - val_loss: 0.4998 - val_recall: 0.7239
Epoch 13/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.3542 - recall: 0.8467 - val_loss: 0.4977 - val_recall: 0.7331
Epoch 14/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.3482 - recall: 0.8521 - val_loss: 0.5016 - val_recall: 0.7362
Epoch 15/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.3419 - recall: 0.8551 - val_loss: 0.5054 - val_recall: 0.7270
Epoch 16/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.3363 - recall: 0.8611 - val_loss: 0.5058 - val_recall: 0.7209
Epoch 17/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.3302 - recall: 0.8618 - val_loss: 0.5090 - val_recall: 0.7178
Epoch 18/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.3247 - recall: 0.8649 - val_loss: 0.5132 - val_recall: 0.7117
Epoch 19/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.3195 - recall: 0.8672 - val_loss: 0.5183 - val_recall: 0.7209
Epoch 20/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.3147 - recall: 0.8738 - val_loss: 0.5218 - val_recall: 0.7025
Epoch 21/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.3091 - recall: 0.8774 - val_loss: 0.5291 - val_recall: 0.7147
Epoch 22/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.3040 - recall: 0.8837 - val_loss: 0.5311 - val_recall: 0.7147
Epoch 23/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2992 - recall: 0.8850 - val_loss: 0.5424 - val_recall: 0.7086
Epoch 24/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2940 - recall: 0.8867 - val_loss: 0.5439 - val_recall: 0.7117
Epoch 25/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2893 - recall: 0.8881 - val_loss: 0.5519 - val_recall: 0.7025
Epoch 26/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2848 - recall: 0.8899 - val_loss: 0.5533 - val_recall: 0.6933
Epoch 27/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2805 - recall: 0.8905 - val_loss: 0.5657 - val_recall: 0.6902
Epoch 28/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2761 - recall: 0.8916 - val_loss: 0.5655 - val_recall: 0.6810
Epoch 29/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2716 - recall: 0.8948 - val_loss: 0.5707 - val_recall: 0.6779
Epoch 30/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2674 - recall: 0.8968 - val_loss: 0.5768 - val_recall: 0.6718
Epoch 31/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2630 - recall: 0.8988 - val_loss: 0.5814 - val_recall: 0.6656
Epoch 32/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2600 - recall: 0.9019 - val_loss: 0.5894 - val_recall: 0.6718
Epoch 33/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2558 - recall: 0.9051 - val_loss: 0.5893 - val_recall: 0.6626
Epoch 34/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2524 - recall: 0.9072 - val_loss: 0.5944 - val_recall: 0.6595
Epoch 35/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2488 - recall: 0.9085 - val_loss: 0.6012 - val_recall: 0.6564
Epoch 36/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2455 - recall: 0.9121 - val_loss: 0.6037 - val_recall: 0.6534
Epoch 37/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2422 - recall: 0.9135 - val_loss: 0.6038 - val_recall: 0.6564
Epoch 38/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2388 - recall: 0.9157 - val_loss: 0.6111 - val_recall: 0.6595
Epoch 39/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2359 - recall: 0.9166 - val_loss: 0.6125 - val_recall: 0.6503
Epoch 40/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2322 - recall: 0.9191 - val_loss: 0.6149 - val_recall: 0.6472
Epoch 41/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2293 - recall: 0.9208 - val_loss: 0.6191 - val_recall: 0.6503
Epoch 42/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2262 - recall: 0.9208 - val_loss: 0.6259 - val_recall: 0.6442
Epoch 43/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2237 - recall: 0.9259 - val_loss: 0.6282 - val_recall: 0.6472
Epoch 44/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2210 - recall: 0.9274 - val_loss: 0.6308 - val_recall: 0.6472
Epoch 45/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2176 - recall: 0.9265 - val_loss: 0.6346 - val_recall: 0.6411
Epoch 46/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2149 - recall: 0.9261 - val_loss: 0.6470 - val_recall: 0.6442
Epoch 47/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2125 - recall: 0.9276 - val_loss: 0.6489 - val_recall: 0.6442
Epoch 48/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2099 - recall: 0.9305 - val_loss: 0.6531 - val_recall: 0.6411
Epoch 49/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2069 - recall: 0.9289 - val_loss: 0.6575 - val_recall: 0.6288
Epoch 50/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2048 - recall: 0.9308 - val_loss: 0.6691 - val_recall: 0.6288
Epoch 51/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.2024 - recall: 0.9328 - val_loss: 0.6745 - val_recall: 0.6258
Epoch 52/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1991 - recall: 0.9343 - val_loss: 0.6773 - val_recall: 0.6258
Epoch 53/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1970 - recall: 0.9368 - val_loss: 0.6908 - val_recall: 0.6288
Epoch 54/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1953 - recall: 0.9370 - val_loss: 0.6960 - val_recall: 0.6196
Epoch 55/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1931 - recall: 0.9361 - val_loss: 0.7031 - val_recall: 0.6288
Epoch 56/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1905 - recall: 0.9367 - val_loss: 0.7120 - val_recall: 0.6288
Epoch 57/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1884 - recall: 0.9366 - val_loss: 0.7206 - val_recall: 0.6288
Epoch 58/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1857 - recall: 0.9371 - val_loss: 0.7290 - val_recall: 0.6258
Epoch 59/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1845 - recall: 0.9382 - val_loss: 0.7336 - val_recall: 0.6258
Epoch 60/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1823 - recall: 0.9361 - val_loss: 0.7520 - val_recall: 0.6350
Epoch 61/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1797 - recall: 0.9360 - val_loss: 0.7468 - val_recall: 0.6288
Epoch 62/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1790 - recall: 0.9368 - val_loss: 0.7535 - val_recall: 0.6319
Epoch 63/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1770 - recall: 0.9388 - val_loss: 0.7624 - val_recall: 0.6227
Epoch 64/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1749 - recall: 0.9380 - val_loss: 0.7687 - val_recall: 0.6227
Epoch 65/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1726 - recall: 0.9442 - val_loss: 0.7704 - val_recall: 0.6196
Epoch 66/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1713 - recall: 0.9399 - val_loss: 0.7801 - val_recall: 0.6227
Epoch 67/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1698 - recall: 0.9441 - val_loss: 0.7829 - val_recall: 0.6227
Epoch 68/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1675 - recall: 0.9432 - val_loss: 0.7895 - val_recall: 0.6288
Epoch 69/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1664 - recall: 0.9434 - val_loss: 0.7936 - val_recall: 0.6258
Epoch 70/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1645 - recall: 0.9451 - val_loss: 0.7979 - val_recall: 0.6166
Epoch 71/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1637 - recall: 0.9459 - val_loss: 0.8073 - val_recall: 0.6258
Epoch 72/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1616 - recall: 0.9484 - val_loss: 0.8000 - val_recall: 0.6135
Epoch 73/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1602 - recall: 0.9474 - val_loss: 0.8138 - val_recall: 0.6288
Epoch 74/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1590 - recall: 0.9471 - val_loss: 0.8180 - val_recall: 0.6227
Epoch 75/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1572 - recall: 0.9485 - val_loss: 0.8239 - val_recall: 0.6135
Epoch 76/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1564 - recall: 0.9502 - val_loss: 0.8311 - val_recall: 0.6196
Epoch 77/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1535 - recall: 0.9487 - val_loss: 0.8327 - val_recall: 0.6258
Epoch 78/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1519 - recall: 0.9521 - val_loss: 0.8505 - val_recall: 0.6166
Epoch 79/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1509 - recall: 0.9524 - val_loss: 0.8512 - val_recall: 0.6074
Epoch 80/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1493 - recall: 0.9506 - val_loss: 0.8498 - val_recall: 0.6074
Epoch 81/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1496 - recall: 0.9509 - val_loss: 0.8508 - val_recall: 0.6074
Epoch 82/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1475 - recall: 0.9500 - val_loss: 0.8543 - val_recall: 0.6012
Epoch 83/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1450 - recall: 0.9537 - val_loss: 0.8668 - val_recall: 0.5982
Epoch 84/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1455 - recall: 0.9539 - val_loss: 0.8646 - val_recall: 0.5920
Epoch 85/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1430 - recall: 0.9533 - val_loss: 0.8639 - val_recall: 0.5798
Epoch 86/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1425 - recall: 0.9529 - val_loss: 0.8739 - val_recall: 0.5890
Epoch 87/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1403 - recall: 0.9527 - val_loss: 0.8795 - val_recall: 0.5828
Epoch 88/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1391 - recall: 0.9541 - val_loss: 0.8882 - val_recall: 0.5859
Epoch 89/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1387 - recall: 0.9564 - val_loss: 0.8825 - val_recall: 0.5767
Epoch 90/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1379 - recall: 0.9574 - val_loss: 0.8980 - val_recall: 0.5736
Epoch 91/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1361 - recall: 0.9577 - val_loss: 0.8994 - val_recall: 0.5736
Epoch 92/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1345 - recall: 0.9576 - val_loss: 0.9040 - val_recall: 0.5736
Epoch 93/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1337 - recall: 0.9578 - val_loss: 0.9044 - val_recall: 0.5675
Epoch 94/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1324 - recall: 0.9576 - val_loss: 0.9136 - val_recall: 0.5706
Epoch 95/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1311 - recall: 0.9596 - val_loss: 0.9065 - val_recall: 0.5583
Epoch 96/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1299 - recall: 0.9592 - val_loss: 0.9147 - val_recall: 0.5644
Epoch 97/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1291 - recall: 0.9599 - val_loss: 0.9320 - val_recall: 0.5613
Epoch 98/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1291 - recall: 0.9603 - val_loss: 0.9371 - val_recall: 0.5675
Epoch 99/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1274 - recall: 0.9602 - val_loss: 0.9402 - val_recall: 0.5736
Epoch 100/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.1260 - recall: 0.9613 - val_loss: 0.9471 - val_recall: 0.5583

Loss function

In [ ]:
#Plotting Train Loss vs Validation Loss
plt.plot(history_4.history['loss'])
plt.plot(history_4.history['val_loss'])
plt.title('Model #4 Loss - Adam Optimizer + Balanced')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()

Observations:

  • Interesting loss function here, it decreased in the Training set but increased with the Validation set...why?
  • The loss curves do not follow each other.
  • This loss is difficult to interpret for me.
In [ ]:
#Plotting Train recall vs Validation recall
plt.plot(history_4.history['recall'])
plt.plot(history_4.history['val_recall'])
plt.title('Model #4 Recall - Adam Optimizer + Balanced')
plt.ylabel('Recall')
plt.xlabel('Epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()

Observations:

  • What is going on with the Recall? It increased nicely during the training as expected, however with the validation --it was the opposite.
In [ ]:
y_train_pred = model_4.predict(X_train_smote)
#Predicting the results using 0.5 as the threshold
y_train_pred = (y_train_pred > 0.5)
y_train_pred
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step
Out[ ]:
array([[ True],
       [False],
       [False],
       ...,
       [ True],
       [ True],
       [ True]])
In [ ]:
y_val_pred = model_4.predict(X_val)
#Predicting the results using 0.5 as the threshold
y_val_pred = (y_val_pred > 0.5)
y_val_pred
50/50 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
Out[ ]:
array([[False],
       [False],
       [False],
       ...,
       [False],
       [ True],
       [ True]])
In [ ]:
model_name = "NN with SMOTE & Adam"

train_metric_df.loc[model_name] = recall_score(y_train_smote,y_train_pred)
valid_metric_df.loc[model_name] = recall_score(y_val,y_val_pred)

Classification report

In [ ]:
cr=classification_report(y_train_smote,y_train_pred)
print(cr)
              precision    recall  f1-score   support

           0       0.97      0.91      0.94      5096
           1       0.91      0.97      0.94      5096

    accuracy                           0.94     10192
   macro avg       0.94      0.94      0.94     10192
weighted avg       0.94      0.94      0.94     10192

In [ ]:
cr=classification_report(y_val,y_val_pred) ## Complete the code to check the model's performance on the validation set
print(cr)
              precision    recall  f1-score   support

           0       0.88      0.83      0.85      1274
           1       0.45      0.56      0.50       326

    accuracy                           0.77      1600
   macro avg       0.67      0.69      0.68      1600
weighted avg       0.79      0.77      0.78      1600

Confusion matrix

In [ ]:
#Calculating the confusion matrix
make_confusion_matrix(y_train_smote, y_train_pred)
In [ ]:
#Calculating the confusion matrix
make_confusion_matrix(y_val,y_val_pred)  ## Complete the code to check the model's performance on the validation set

Observations:

  • Predicted True Positives (TP) did not increase.
  • Predicted False Negatives (FN) did not decrease with SMOTE balanced or a more uniformed starting data and the adaptive optimizer Adam.
  • A higher recall means fewer false negatives. A decreasing Recall here with the validation set means the prediction wasn't successful.

(5) Neural Network with Balanced Data (by applying SMOTE), Adam Optimizer, and Dropout - - Model Performance Improvement

Dropout is a regularization technique used in neural networks to prevent overfitting. Here’s a summary of what it does:

  • Randomly “drops” neurons during training: At each training step, a random subset of neurons (and their connections) is “dropped” or temporarily deactivated. This forces the network to learn more robust patterns that don’t rely on specific neurons.
  • Prevents overfitting: By randomly dropping neurons, the model cannot over-rely on any one feature or pattern, which helps it generalize better to unseen data, reducing overfitting.
  • Improves performance: Dropout encourages the network to learn multiple independent features, improving overall model performance when deployed.
In [ ]:
backend.clear_session()
#Fixing the seed for random number generators so that we can ensure we receive the same output everytime
np.random.seed(2)
random.seed(2)
tf.random.set_seed(2)
In [ ]:
# Initializing the model
model_5 = Sequential()
# Adding input layer with 32 neurons and relu as activation function
model_5.add(Dense(32, activation='relu', input_dim=X_train_smote.shape[1]))
# Adding dropout layer with a dropout rate of 0.2
model_5.add(Dropout(0.2))
# Adding hidden layer with 16 neurons and relu as activation function
model_5.add(Dense(16, activation='relu'))
# Adding dropout layer with a dropout rate of 0.2
model_5.add(Dropout(0.2))
# Adding hidden layer with 8 neurons with relu as activation function
model_5.add(Dense(8, activation='relu'))
# Adding the output layer with 1 neuron and sigmoid activation function
model_5.add(Dense(1, activation='sigmoid'))
In [ ]:
#Complete the code to use Adam as the optimizer.
optimizer = tf.keras.optimizers.Adam()

# uncomment one of the following lines to define the metric to be used
# metric = 'accuracy'
metric = keras.metrics.Recall()
# metric = keras.metrics.Precision()
# metric = keras.metrics.F1Score()
In [ ]:
# Complete the code to compile the model with binary cross entropy as loss function and recall as the metric
model_5.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=[metric])
In [ ]:
model_5.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ dense (Dense)                        │ (None, 32)                  │             384 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout (Dropout)                    │ (None, 32)                  │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 16)                  │             528 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout_1 (Dropout)                  │ (None, 16)                  │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 8)                   │             136 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_3 (Dense)                      │ (None, 1)                   │               9 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 1,057 (4.13 KB)
 Trainable params: 1,057 (4.13 KB)
 Non-trainable params: 0 (0.00 B)
In [ ]:
history_5 = model_5.fit(
    X_train_smote, y_train_smote,
    batch_size=32,  ## Specify the batch size to use (for example, 32)
    epochs=100,     ## Specify the number of epochs (for example, 100)
    verbose=1,
    validation_data=(X_val, y_val)
)
Epoch 1/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 5s 8ms/step - loss: 0.6356 - recall: 0.6616 - val_loss: 0.5559 - val_recall: 0.7423
Epoch 2/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5408 - recall: 0.7611 - val_loss: 0.5131 - val_recall: 0.7761
Epoch 3/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.5135 - recall: 0.7510 - val_loss: 0.4854 - val_recall: 0.7607
Epoch 4/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4979 - recall: 0.7595 - val_loss: 0.4778 - val_recall: 0.7730
Epoch 5/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4838 - recall: 0.7709 - val_loss: 0.4710 - val_recall: 0.7393
Epoch 6/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4804 - recall: 0.7512 - val_loss: 0.4664 - val_recall: 0.7393
Epoch 7/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4667 - recall: 0.7729 - val_loss: 0.4639 - val_recall: 0.7393
Epoch 8/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4702 - recall: 0.7658 - val_loss: 0.4614 - val_recall: 0.7362
Epoch 9/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4559 - recall: 0.7738 - val_loss: 0.4473 - val_recall: 0.7209
Epoch 10/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4619 - recall: 0.7698 - val_loss: 0.4461 - val_recall: 0.7209
Epoch 11/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4591 - recall: 0.7791 - val_loss: 0.4450 - val_recall: 0.7055
Epoch 12/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4527 - recall: 0.7797 - val_loss: 0.4474 - val_recall: 0.7270
Epoch 13/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4478 - recall: 0.7760 - val_loss: 0.4430 - val_recall: 0.7055
Epoch 14/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4477 - recall: 0.7866 - val_loss: 0.4468 - val_recall: 0.7178
Epoch 15/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4450 - recall: 0.7981 - val_loss: 0.4472 - val_recall: 0.7301
Epoch 16/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4477 - recall: 0.7932 - val_loss: 0.4381 - val_recall: 0.7239
Epoch 17/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4432 - recall: 0.7932 - val_loss: 0.4393 - val_recall: 0.7117
Epoch 18/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4390 - recall: 0.7898 - val_loss: 0.4474 - val_recall: 0.7239
Epoch 19/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4431 - recall: 0.7907 - val_loss: 0.4359 - val_recall: 0.7117
Epoch 20/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4369 - recall: 0.7942 - val_loss: 0.4472 - val_recall: 0.7055
Epoch 21/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4371 - recall: 0.7936 - val_loss: 0.4486 - val_recall: 0.7239
Epoch 22/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4307 - recall: 0.8075 - val_loss: 0.4326 - val_recall: 0.7055
Epoch 23/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4407 - recall: 0.7917 - val_loss: 0.4397 - val_recall: 0.7025
Epoch 24/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4280 - recall: 0.8042 - val_loss: 0.4375 - val_recall: 0.7147
Epoch 25/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4327 - recall: 0.7999 - val_loss: 0.4439 - val_recall: 0.7331
Epoch 26/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4309 - recall: 0.8083 - val_loss: 0.4430 - val_recall: 0.7362
Epoch 27/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4340 - recall: 0.8012 - val_loss: 0.4442 - val_recall: 0.7239
Epoch 28/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4313 - recall: 0.8120 - val_loss: 0.4398 - val_recall: 0.7117
Epoch 29/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4268 - recall: 0.8121 - val_loss: 0.4351 - val_recall: 0.7270
Epoch 30/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4294 - recall: 0.8044 - val_loss: 0.4347 - val_recall: 0.7025
Epoch 31/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4350 - recall: 0.8041 - val_loss: 0.4420 - val_recall: 0.7331
Epoch 32/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4325 - recall: 0.8029 - val_loss: 0.4321 - val_recall: 0.6994
Epoch 33/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4344 - recall: 0.7983 - val_loss: 0.4363 - val_recall: 0.7055
Epoch 34/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4237 - recall: 0.8071 - val_loss: 0.4317 - val_recall: 0.7055
Epoch 35/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4220 - recall: 0.8083 - val_loss: 0.4350 - val_recall: 0.7025
Epoch 36/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4218 - recall: 0.8043 - val_loss: 0.4329 - val_recall: 0.7209
Epoch 37/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4257 - recall: 0.8053 - val_loss: 0.4356 - val_recall: 0.7055
Epoch 38/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4271 - recall: 0.7996 - val_loss: 0.4367 - val_recall: 0.6994
Epoch 39/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4284 - recall: 0.7995 - val_loss: 0.4405 - val_recall: 0.7209
Epoch 40/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4247 - recall: 0.8076 - val_loss: 0.4333 - val_recall: 0.7025
Epoch 41/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4209 - recall: 0.8113 - val_loss: 0.4333 - val_recall: 0.6963
Epoch 42/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4228 - recall: 0.8093 - val_loss: 0.4384 - val_recall: 0.7055
Epoch 43/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4208 - recall: 0.8144 - val_loss: 0.4331 - val_recall: 0.6902
Epoch 44/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4220 - recall: 0.8099 - val_loss: 0.4382 - val_recall: 0.7025
Epoch 45/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4188 - recall: 0.8162 - val_loss: 0.4356 - val_recall: 0.7025
Epoch 46/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4199 - recall: 0.8040 - val_loss: 0.4336 - val_recall: 0.6902
Epoch 47/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4185 - recall: 0.8091 - val_loss: 0.4282 - val_recall: 0.6994
Epoch 48/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4177 - recall: 0.8098 - val_loss: 0.4320 - val_recall: 0.6963
Epoch 49/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4123 - recall: 0.8058 - val_loss: 0.4351 - val_recall: 0.7117
Epoch 50/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4122 - recall: 0.8157 - val_loss: 0.4296 - val_recall: 0.6933
Epoch 51/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4133 - recall: 0.8042 - val_loss: 0.4327 - val_recall: 0.7025
Epoch 52/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4139 - recall: 0.8176 - val_loss: 0.4256 - val_recall: 0.7025
Epoch 53/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4131 - recall: 0.8132 - val_loss: 0.4354 - val_recall: 0.7025
Epoch 54/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4140 - recall: 0.8143 - val_loss: 0.4338 - val_recall: 0.7117
Epoch 55/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4140 - recall: 0.8083 - val_loss: 0.4252 - val_recall: 0.6963
Epoch 56/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4160 - recall: 0.8115 - val_loss: 0.4352 - val_recall: 0.7055
Epoch 57/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4132 - recall: 0.8177 - val_loss: 0.4366 - val_recall: 0.7147
Epoch 58/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4147 - recall: 0.8111 - val_loss: 0.4322 - val_recall: 0.6840
Epoch 59/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4172 - recall: 0.8142 - val_loss: 0.4317 - val_recall: 0.6963
Epoch 60/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4128 - recall: 0.8108 - val_loss: 0.4318 - val_recall: 0.6994
Epoch 61/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4169 - recall: 0.8089 - val_loss: 0.4287 - val_recall: 0.6933
Epoch 62/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4092 - recall: 0.8224 - val_loss: 0.4331 - val_recall: 0.6994
Epoch 63/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4102 - recall: 0.8157 - val_loss: 0.4272 - val_recall: 0.6994
Epoch 64/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4078 - recall: 0.8133 - val_loss: 0.4339 - val_recall: 0.7117
Epoch 65/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4116 - recall: 0.8221 - val_loss: 0.4324 - val_recall: 0.7086
Epoch 66/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4085 - recall: 0.8117 - val_loss: 0.4244 - val_recall: 0.6902
Epoch 67/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4086 - recall: 0.8159 - val_loss: 0.4318 - val_recall: 0.7086
Epoch 68/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4072 - recall: 0.8185 - val_loss: 0.4335 - val_recall: 0.7086
Epoch 69/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4088 - recall: 0.8132 - val_loss: 0.4361 - val_recall: 0.7086
Epoch 70/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4137 - recall: 0.8235 - val_loss: 0.4299 - val_recall: 0.7086
Epoch 71/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4048 - recall: 0.8260 - val_loss: 0.4256 - val_recall: 0.6933
Epoch 72/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4089 - recall: 0.8104 - val_loss: 0.4294 - val_recall: 0.6963
Epoch 73/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4094 - recall: 0.8138 - val_loss: 0.4324 - val_recall: 0.6871
Epoch 74/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4036 - recall: 0.8167 - val_loss: 0.4286 - val_recall: 0.6994
Epoch 75/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4084 - recall: 0.8128 - val_loss: 0.4282 - val_recall: 0.6810
Epoch 76/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4039 - recall: 0.8175 - val_loss: 0.4334 - val_recall: 0.7055
Epoch 77/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4122 - recall: 0.8115 - val_loss: 0.4285 - val_recall: 0.7025
Epoch 78/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4012 - recall: 0.8214 - val_loss: 0.4303 - val_recall: 0.7086
Epoch 79/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4051 - recall: 0.8088 - val_loss: 0.4315 - val_recall: 0.7025
Epoch 80/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4050 - recall: 0.8178 - val_loss: 0.4337 - val_recall: 0.7025
Epoch 81/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4033 - recall: 0.8152 - val_loss: 0.4280 - val_recall: 0.7117
Epoch 82/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4078 - recall: 0.8191 - val_loss: 0.4305 - val_recall: 0.7086
Epoch 83/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.3980 - recall: 0.8274 - val_loss: 0.4315 - val_recall: 0.7117
Epoch 84/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4033 - recall: 0.8115 - val_loss: 0.4245 - val_recall: 0.7055
Epoch 85/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4035 - recall: 0.8216 - val_loss: 0.4278 - val_recall: 0.7086
Epoch 86/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4039 - recall: 0.8173 - val_loss: 0.4299 - val_recall: 0.7117
Epoch 87/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4046 - recall: 0.8164 - val_loss: 0.4259 - val_recall: 0.6994
Epoch 88/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4039 - recall: 0.8190 - val_loss: 0.4257 - val_recall: 0.7117
Epoch 89/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.3985 - recall: 0.8242 - val_loss: 0.4275 - val_recall: 0.7086
Epoch 90/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4068 - recall: 0.8210 - val_loss: 0.4248 - val_recall: 0.6994
Epoch 91/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4032 - recall: 0.8143 - val_loss: 0.4275 - val_recall: 0.7086
Epoch 92/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4041 - recall: 0.8164 - val_loss: 0.4264 - val_recall: 0.7147
Epoch 93/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4042 - recall: 0.8136 - val_loss: 0.4280 - val_recall: 0.7117
Epoch 94/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4059 - recall: 0.8164 - val_loss: 0.4280 - val_recall: 0.7055
Epoch 95/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4015 - recall: 0.8157 - val_loss: 0.4334 - val_recall: 0.7209
Epoch 96/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - loss: 0.4055 - recall: 0.8259 - val_loss: 0.4306 - val_recall: 0.6963
Epoch 97/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.3965 - recall: 0.8203 - val_loss: 0.4312 - val_recall: 0.7025
Epoch 98/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4011 - recall: 0.8207 - val_loss: 0.4270 - val_recall: 0.6994
Epoch 99/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.3994 - recall: 0.8248 - val_loss: 0.4304 - val_recall: 0.7117
Epoch 100/100
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.4004 - recall: 0.8224 - val_loss: 0.4295 - val_recall: 0.6994

Loss function

In [ ]:
#Plotting Train Loss vs Validation Loss
plt.plot(history_5.history['loss'])
plt.plot(history_5.history['val_loss'])
plt.title('Model #5 Loss - Adam Opt + Balanced + Dropout')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()

Observations:

  • Outstanding loss curve for both the Training and Validation sets.
In [ ]:
#Plotting Train recall vs Validation recall
plt.plot(history_5.history['recall'])
plt.plot(history_5.history['val_recall'])
plt.title('Model #5 Recall - Adam Opt + Balanced + Dropout')
plt.ylabel('Recall')
plt.xlabel('Epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()

Observations:

  • Recall does not look that great with the validation set.
  • Adam optimizer with a Balanced data at starting and Droping some Neurons to improve the learning owith the training set was successful. However it minimized the Recall on the validation set again! Not good!
  • A higher recall means fewer (FN) false negatives with the Training set!!
  • The Recall on the validation set was poor and noisy.
In [ ]:
y_train_pred = model_5.predict(X_train_smote)
#Predicting the results using 0.5 as the threshold
y_train_pred = (y_train_pred > 0.5)
y_train_pred
319/319 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step
Out[ ]:
array([[ True],
       [False],
       [False],
       ...,
       [ True],
       [ True],
       [ True]])
In [ ]:
y_val_pred = model_5.predict(X_val)
#Predicting the results using 0.5 as the threshold
y_val_pred = (y_val_pred > 0.5)
y_val_pred
50/50 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
Out[ ]:
array([[ True],
       [False],
       [False],
       ...,
       [False],
       [ True],
       [ True]])
In [ ]:
model_name = "NN with SMOTE,Adam & Dropout"

train_metric_df.loc[model_name] = recall_score(y_train_smote,y_train_pred)
valid_metric_df.loc[model_name] = recall_score(y_val,y_val_pred)

Classification report

In [ ]:
cr=classification_report(y_train_smote,y_train_pred)
print(cr)
              precision    recall  f1-score   support

           0       0.84      0.84      0.84      5096
           1       0.84      0.84      0.84      5096

    accuracy                           0.84     10192
   macro avg       0.84      0.84      0.84     10192
weighted avg       0.84      0.84      0.84     10192

In [ ]:
#classification report
cr=classification_report(y_val,y_val_pred)  ## Complete the code to check the model's performance on the validation set
print(cr)
              precision    recall  f1-score   support

           0       0.92      0.83      0.87      1274
           1       0.52      0.70      0.60       326

    accuracy                           0.81      1600
   macro avg       0.72      0.77      0.73      1600
weighted avg       0.83      0.81      0.82      1600

Confusion matrix

In [ ]:
#Calculating the confusion matrix
make_confusion_matrix(y_train_smote, y_train_pred)
In [ ]:
#Calculating the confusion matrix
make_confusion_matrix(y_val,y_val_pred)  ## Complete the code to check the model's performance on the validation set

Observations:

  • A lower (FN) false negatives with the Validation set!!
  • Recall was decreased overall.
  • Training set look optimistically agreeable as a valid approach.
  • However the Validation set proved us wrong.

Model Performance Comparison and Final Model Selection --------------- Section

In [ ]:
print("\nTraining performance comparison:")
train_metric_df
Training performance comparison
Out[ ]:
recall
Neutral Network with SGD 0.325920
NN with Adam 0.551380
NN with Adam & Dropout 0.605061
NN with SMOTE & SGD 0.791601
NN with SMOTE & Adam 0.971154
NN with SMOTE,Adam & Dropout 0.838501

Observations:

  • The highest Recall (0.97) with the training predictions is with Model #5 (Balanced Data) SMOTE + Adam Optimizer. Great effort but NO COOKIE!! ;-)
In [ ]:
print("\nValidation set performance comparison:")
valid_metric_df
Validation set performance comparison
Out[ ]:
recall
Neutral Network with SGD 0.291411
NN with Adam 0.441718
NN with Adam & Dropout 0.441718
NN with SMOTE & SGD 0.763804
NN with SMOTE & Adam 0.558282
NN with SMOTE,Adam & Dropout 0.699387

Observations:

  • The highest Recall value with the validation predictions is with Model #3 SMOTE + SGD at 0.763
  • Slightly lower than with the highest Recall value (0.97) with the training predictions with Model #5 SMOTE + Adam.
  • Depending on time and resources, additional models could be explored using the selected as the top most efficient model as the starting baseline. For example Bayesian Networks and Principal Component Analysis (PCA) could be trained and compared against the baseline model.
  • Additional hidden layers could be added to the already trained models to determine if better performance can be achieved.
In [ ]:
train_metric_df - valid_metric_df # Delta between training and validation Recall metric.
Out[ ]:
recall
Neutral Network with SGD 0.034509
NN with Adam 0.109663
NN with Adam & Dropout 0.163344
NN with SMOTE & SGD 0.027798
NN with SMOTE & Adam 0.412872
NN with SMOTE,Adam & Dropout 0.139114

Observations:

  • The smallest Recall Delta between the training and validation predictions is also with Model 3 SMOTE + SGD at 0.0277</font>
In [ ]:
print("Best NN Model is Model #3 NN + Balanced Data + SGD Optimizer")
y_test_pred = model_3.predict(X_test)    ## Complete the code to specify the best model "Model_0"
y_test_pred = (y_test_pred > 0.5)
print(y_test_pred)
63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
[[False]
 [False]
 [False]
 ...
 [ True]
 [False]
 [False]]
In [ ]:
# Classification Report Metrics on Unseen Data.
print("\nClassification Report Metrics on Unseen Data: \n")
cr=classification_report(y_test,y_test_pred)
print(cr)
              precision    recall  f1-score   support

           0       0.93      0.74      0.82      1593
           1       0.43      0.77      0.55       407

    accuracy                           0.75      2000
   macro avg       0.68      0.76      0.69      2000
weighted avg       0.83      0.75      0.77      2000

A final Verification - Using the "winning NN model" predictions against the Test dataset: ... Drum Roll please ... ;-)

In [ ]:
#Calculating the confusion matrix
make_confusion_matrix(y_test,y_test_pred)

Observations:

  • On unseen data (Test dataset) the predicted values matched the true data by 95.3% after training the model and reducing the number of False Positives down to 4.7%
  • The model trained best when optimized was Model 3 Balanced + SGD.

Actionable Insights, Engineering and Business Recommendations ------------------------- Section

Business Recommendations:

  • Use Model #3 A Neural Network with Balanced Data and SDG Optimizer to predict Customers likely to depart and exit the bank without saying proper goodbyes nor hugs. But seriously, this model was tested as as the best model to anticipate exits.
  • Since most of the Customers if not all are based in Europe, the EU AI Act is required and must be followed.
  • Once Corrective Actions are put in place, the model should be re-tested and reverified for efficiency in predicting Customer behavious that affect the bank. This should be done regularly as the data is likely to constantly grow.
  • Engineering needs to provide with a User Acceptance Plan to ensure all business units can properly use the predictions properly and all Use Cases are well understood before going to production.
  • IT must ensure data recovery after introducing the model into production.
  • A Program Manager should capture all the recommendations from this report shown elsewhere to ensure a proper deployment, a cost analysis and the very importan resource allocations of computing resources.
  • HR should be invited to review recommendations and insides from this report as they can provide invaluable help in assesing impact to current employees handling sensitive information.
  • The bank should prioritize gender diversity initiatives with its vendors and coordinate all marketing similarly in its advertizing strategies, truth in lending, fairness and economic access to create a more balanced and inclusive clientele which can ultimately benefit the company's performance.
  • Customer Support should analyze customer exits as they are likely to have experienced unfavorable transactions or help while trying to resolve issues.
  • A Root Cause Corrective Action Committe made up of multiple function should review periodically all recommendations and test preventative actions to limit and decrease Customer exits.
  • Training and Employee Development should consider offering upskilling or reskilling programs to preperly understand predictive models business process changes, current shortcomings and data impacts to increase business adaptation to these tools as this report recommends.

Engineering Recommendations:

  1. Release to a staging environment for functional acceptance testing, end-user acceptance testing, governance and regulatory compliance audits before going to production with the optimized Model #3 with Balanced + SGD Optimizer.
  2. Preprocessing New Data: (New Customers without having the the true target values.)
    • Ensure that the new customer data is preprocessed in the same way as the training data. This includes handling missing values, normalizing or scaling features (e.g., using the same StandardScaler instance), and encoding categorical variables whenever applicable.
  3. Using the Model for Predictions:
    • Use the trained model to predict outcomes for the new data. Call the predict() method on the model with the preprocessed new data.
      new_customer_predictions = model.predict(new_customer_data)
  4. Applying Threshold:
    • For binary classification tasks, we typically can apply a threshold to the predicted probabilities to determine class labels. Commonly, a threshold of 0.5 is used, but this can be adjusted based on business needs or the specific context of the problem.
      new_customer_labels = (new_customer_predictions > 0.5).astype(int)
  5. Interpreting the Predictions:
    • The resulting new_customer_labels can provide with the predicted class for each new customer. For example, in a churn prediction context, a label of 1 could indicate that a customer is likely to churn, while a label of 0 indicates they are likely to stay.
  6. Further Evaluation (if possible):
    • When we eventually obtain the true target values for the new customers (e.g., after some time), we can evaluate the performance of our model on this new customer data using metrics like recall, precision, or F1-score as well.
  7. Example PseudoCode:

    - Preprocessing new data
    new_customer_data = preprocess_new_data(raw_new_customer_data)  
    
    - Predicting with the model
    new_customer_predictions = model.predict(new_customer_data)
    
    - Applying a threshold to get class labels
    new_customer_labels = (new_customer_predictions > 0.5).astype(int)
    
    - Displaying predictions
    print("Predicted labels for new customers:", new_customer_labels)
  8. Summary:

    • To preprocess the new data in a Staging or Production environments:
    • Use the trained model to predict probabilities,
    • Apply a threshold to classify the outcomes,
    • Interpret these predicted labels for further analysis or action.
    • This process allows making informed decisions regarding new customers, even in the absence of true target values not yet materialized.

Refer to APPENDIX I to determine how to best use this model in production.

APPENDIX I¶

How to use the model to predict outcome with new incoming data?

When we have trained and validated a neural network (NN) model, and we want to predict values for new data (like new customers) without having the true target values, we follow these steps:

  1. Preprocessing New Data:

    • Ensure that the new customer data is preprocessed in the same way as the training data. This includes handling missing values, normalizing or scaling features (e.g., using the same StandardScaler instance), and encoding categorical variables if applicable.
  2. Using the Model for Predictions:

    • Use the trained model to predict outcomes for the new data. You can call the predict() method on the model with the preprocessed new data.
In [ ]:
# new_customer_predictions = model.predict(new_customer_data)
  1. Applying Threshold:
    • For binary classification tasks, you will typically apply a threshold to the predicted probabilities to determine class labels. Commonly, a threshold of 0.5 is used, but this can be adjusted based on business needs or the specific context of the problem.
In [ ]:
# new_customer_labels = (new_customer_predictions > 0.5).astype(int)
  1. Interpreting the Predictions:
    • The resulting new_customer_labels will provide you with the predicted class for each new customer. For example, in a churn prediction context, a label of 1 could indicate that a customer is likely to churn, while a label of 0 indicates they are likely to stay.
  2. Further Evaluation (if possible):
    • If you eventually obtain the true target values for the new customers (e.g., after some time), you can evaluate the performance of your model on this new data using metrics like recall, precision, or F1-score.

Example Code:

In [ ]:
# Preprocessing new data
new_customer_data = preprocess_new_data(raw_new_customer_data)  # Apply necessary preprocessing steps

# Predicting with the model
new_customer_predictions = model.predict(new_customer_data)

# Applying a threshold to get class labels
new_customer_labels = (new_customer_predictions > 0.5).astype(int)

# Displaying predictions
print("Predicted labels for new customers:", new_customer_labels)

Summary:

We preprocess the new data, use the trained model to predict probabilities, apply a threshold to classify the outcomes, and then interpret these predicted labels for further analysis or action. This process allows making informed decisions regarding customers, even in the absence of true target values not yet materialized.


NICE COLORS REFERENCE¶

Here's a reorganized list of the 105 color variations, grouped by similar color hues:

Blues:¶

Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead

Reds:¶

Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead

Purples:¶

Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead

Greens:¶

Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead

Yellows/Oranges:¶

Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead

Pinks/Magentas:¶

Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead

Grays/Neutral Tones:¶

Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead

Browns:¶

Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead Power Ahead ```

This categorization organizes colors by their general hue families. You can now use these groupings to display colors more cohesively!

APPENDIX II¶

Building an Artificial Neural Network (ANN)

  1. Define the Problem and Architecture Problem: Clearly understand the task (classification, regression, etc.) and the input/output data. Architecture: Choose the number of layers, neurons per layer, and activation functions. Consider factors like complexity and computational resources.
  2. Initialize Weights and Biases Random initialization: Assign random values to the weights and biases. Common methods include Xavier initialization or He initialization.
  3. Forward Propagation Input layer: Feed the input data to the network. Hidden layers: Calculate the weighted sum of inputs and biases for each neuron, then apply the activation function. Output layer: Compute the final output using the same process.
  4. Calculate the Loss Loss function: Choose a suitable loss function based on the problem (e.g., mean squared error for regression, cross-entropy for classification). Compute loss: Calculate the difference between the predicted output and the actual target.
  5. Backpropagation Chain rule: Use the chain rule to calculate the gradients of the loss with respect to the weights and biases. Update weights and biases: Adjust the weights and biases in the direction that minimizes the loss.
  6. Optimization Optimizer: Select an optimization algorithm (e.g., gradient descent, Adam, RMSprop) to update the weights and biases efficiently. Learning rate: Set the learning rate to control the step size during updates.
  7. Training Epochs: Iterate through the entire dataset multiple times (epochs). Mini-batches: Divide the dataset into smaller batches to reduce computational cost and improve generalization. Forward pass, loss calculation, backpropagation, and optimization: Repeat these steps for each mini-batch.
  8. Evaluation Validation set: Use a separate validation set to assess the model's performance during training. Metrics: Choose appropriate metrics (e.g., accuracy, precision, recall, F1-score) based on the problem. Overfitting: Monitor for overfitting (when the model performs well on the training set but poorly on the validation set). Example: Building a Multi-Layer Perceptron (MLP) for Image Classification import tensorflow as tf

Define the model¶

model = tf.keras.Sequential([ tf.keras.layers.Flatten(input_shape=(28, 28)), # Flatten the input image tf.keras.layers.Dense(128, activation='relu'), # Hidden layer with 128 neurons tf.keras.layers.Dense(10, activation='softmax') # Output layer with 10 classes (for MNIST) ])

Compile the model¶

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

Train the model¶

model.fit(x_train, y_train, epochs=10)

Evaluate the model¶

test_loss, test_acc = model.evaluate(x_test, y_test) print('Test accuracy:', test_acc) Key Points:

Activation functions: Choose appropriate activation functions (e.g., ReLU, sigmoid, tanh) based on the layer and problem. Multi-layers: Adding more layers can improve model complexity and learning capacity. Loss functions: Select a loss function that aligns with the problem's objective. Training and optimization: Experiment with different optimizers and learning rates to find the best configuration. Backpropagation: Efficiently calculate gradients to update the model's parameters. Evaluation: Use appropriate metrics to assess the model's performance and identify potential issues.

APPENDIX III¶

Utility Functions:

In [ ]:
#Printing the summary.
model.summary()

Model: "sequential"


Layer (type) | Output Shape | Param #

=================================================================

dense (Dense) | (None, 10) | 7850

=================================================================

  • Total params: 7850 (30.66 KB)
  • Trainable params: 7850 (30.66 KB)
  • Non-trainable params: 0 (0.00 Byte)

In [ ]:
batch_size = x_train.shape[0]
epochs = 10

# time.time() returns the time in seconds since Thu Jan 1 00:00:00 1970.
start = time.time()

# Fitting the model.
history = model.fit(
    x_train, y_train,
    validation_data=(x_val,y_val),
    batch_size=batch_size,
    epochs=epochs
)

# time.time() returns the time in seconds since Thu Jan 1 00:00:00 1970.
end = time.time()

Epoch 1/10 1/1 [==============================] - 2s 2s/step - loss: 2.3785 - accuracy: 0.0933 - val_loss: 2.3576 - val_accuracy: 0.1019 Epoch 2/10 1/1 [==============================] - 0s 269ms/step - loss: 2.3633 - accuracy: 0.0996 - val_loss: 2.3429 - val_accuracy: 0.1081

Epoch 3/10 1/1 [==============================] - 0s 303ms/step - loss: 2.3485 - accuracy: 0.1053 - val_loss: 2.3286 - val_accuracy: 0.1159

Epoch 4/10 1/1 [==============================] - 0s 279ms/step - loss: 2.3342 - accuracy: 0.1115 - val_loss: 2.3148 - val_accuracy: 0.1224

Epoch 5/10 1/1 [==============================] - 0s 196ms/step - loss: 2.3203 - accuracy: 0.1180 - val_loss: 2.3014 - val_accuracy: 0.1286

Epoch 6/10 1/1 [==============================] - 0s 227ms/step - loss: 2.3068 - accuracy: 0.1244 - val_loss: 2.2882 - val_accuracy: 0.1342

Epoch 7/10 1/1 [==============================] - 0s 306ms/step - loss: 2.2936 - accuracy: 0.1313 - val_loss: 2.2755 - val_accuracy: 0.1408

Epoch 8/10 1/1 [==============================] - 0s 177ms/step - loss: 2.2807 - accuracy: 0.1380 - val_loss: 2.2630 - val_accuracy: 0.1480

Epoch 9/10 1/1 [==============================] - 0s 201ms/step - loss: 2.2682 - accuracy: 0.1455 - val_loss: 2.2508 - val_accuracy: 0.1554

Epoch 10/10 1/1 [==============================] - 0s 193ms/step - loss: 2.2559 - accuracy: 0.1527 - val_loss: 2.2388 - val_accuracy: 0.1628

In [ ]:
print("Time taken in seconds ",end-start)

Time taken in seconds 4.701591730117798

In [ ]:
ef plot(history, name):
    """
    Function to plot loss/accuracy

    history: an object which stores the metrics and losses.
    name: can be one of Loss or Accuracy
    """
    fig, ax = plt.subplots() #Creating a subplot with figure and axes.
    plt.plot(history.history[name]) #Plotting the train accuracy or train loss
    plt.plot(history.history['val_'+name]) #Plotting the validation accuracy or validation loss

    plt.title('Model ' + name.capitalize()) #Defining the title of the plot.
    plt.ylabel(name.capitalize()) #Capitalizing the first letter.
    plt.xlabel('Epoch') #Defining the label for the x-axis.
    fig.legend(['Train', 'Validation'], loc="outside right upper") #Defining the legend, loc controls the position of the legend.
In [ ]:
plot(history,'loss')

Nice plot here

In [ ]:
plot(history,'accuracy')

Nice plot here

In [ ]:
results.loc[0] = [0,'-','-',10,50000,history.history["loss"][-1],history.history["val_loss"][-1],history.history["accuracy"][-1],history.history["val_accuracy"][-1],round(end-start,2)]

results
# hidden layers # neurons - hidden layer    activation function - hidden layer  # epochs    batch size  train loss  validation loss train accuracy  validation accuracy time (secs)

0 0 - - 10 50000 2.255868 2.238811 0.1527 0.1628 4.7

Importing Libraries Reference¶

In [ ]:
 Numerical and statistical libraries: -------------------------------------

# Importing the pandas library, which provides data structures
# and functions needed to manipulate structured data more easily.
# Pandas is useful for working with data in DataFrame format,
# which is similar to a table in a database or an Excel spreadsheet.
import pandas as pd

# Importing the numpy library, which adds support for large,
# multi-dimensional arrays and matrices, along with a
# collection of mathematical functions to operate on these arrays.
# Numpy is useful for numerical computations, manipulating data,
# performing complex mathematical operations, and conducting
# powerful statistical analyses.
import numpy as np

# Data visualization libraries: ------------------------------------------

# Importing the pyplot module from the matplotlib library,
# which is a plotting library used for creating static, interactive,
# and animated visualizations in Python.
# Matplotlib provides a wide range of plotting capabilities,
# including line plots, scatter plots, bar charts, histograms, and more.
# The pyplot module provides a MATLAB-like interface,
# making it easy to create and customize plots.
import matplotlib.pyplot as plt

# Importing the seaborn library, which is built on top of matplotlib
# and provides a high-level interface for drawing attractive
# and informative statistical graphics.
# Seaborn offers built-in themes and color palettes to make it easy to create
# visually appealing plots.
# It is particularly useful for visualizing complex datasets,
# with functions for creating heatmaps, time series plots,
# and more advanced statistical visualizations.
import seaborn as sns


# Removes the limit for the number of displayed columns
pd.set_option("display.max_columns", None)
# Sets the limit for the number of displayed rows
pd.set_option("display.max_rows", 200)

# ============================================================================
# Typical sequence to build a ML prediction model using Decision Trees:
#
# 1- Select an attribute of data and make all possible splits in data.
# 2- Calculate the Gini impurity after each split.
# 3- Repeat the steps for every attribute present in the data.
# 4- Decide the best split based on the lowest Gini impurity.
# 5- Repeat the complete process until the stopping criterion is reached or
#    the tree has achieved homogeneity in leaves.
#
# Decision Tree Classifier & Modelling Libraries: ----------------------------

# ML Library to split data
from sklearn.model_selection import train_test_split

# ML Algorythm to build model for prediction using Decision Tree Classifier
from sklearn.tree import DecisionTreeClassifier

# ML Algorythm to visualize Decision Tree
from sklearn.tree import plot_tree

# ML Algorythm function to get feature importance:
# It offers decision tree algorithms for both classification and regression tasks.
# Decision trees can be used to determine feature importance,
# which indicates how influential each feature is in making predictions.
from sklearn import tree

# ML Algorythm function to get feature importance Part II:
# Importing the permutation_importance function from the sklearn.inspection module.
# Function calculates feature importance by evaluating the decrease
# in a model's performance when the values of a feature are randomly shuffled.
# Permutation importance is a model-agnostic method, meaning it can be used
# with any trained model to assess the importance of features.
from sklearn.inspection import permutation_importance as pmp


# ML Decision Tree Algorythms Metric scores definitions:
'''
=========> Accuracy:
Description: The ratio of correctly predicted instances to the total instances.
Power: Measures the overall effectiveness of the model.

=========> Precision:
Description: The ratio of true positive predictions to the total predicted positives.
Power: Indicates how many of the predicted positive instances are actually positive.

=========> Recall (Sensitivity):
Description: The ratio of true positive predictions to the total actual positives.
Power: Measures the model's ability to identify all relevant instances.

=========> F1 Score:
Description: The harmonic mean of precision and recall.
Power: Provides a single metric that balances precision and recall,
       useful for imbalanced datasets.

=========> ROC-AUC (Receiver Operating Characteristic - Area Under the Curve):
Description: Represents the model's ability to distinguish between classes,
             plotted as true positive rate versus false positive rate.
Power: Evaluates the trade-off between sensitivity and specificity
             across different thresholds.

=========> Mean Squared Error (MSE):
Description: The average squared difference between predicted and actual values.
Power: Penalizes larger errors more than smaller ones,
       providing a clear indication of model accuracy.

=========> Root Mean Squared Error (RMSE):
Description: The square root of the average squared difference between
             predicted and actual values.
Power: Offers a measure of the model's prediction error in the same units
             as the target variable.

=========> Mean Absolute Error (MAE):
Description: The average absolute difference between predicted and actual values.
Power: Provides a straightforward measure of prediction accuracy w
       ithout emphasizing large errors.

=========> R-squared (Coefficient of Determination):
Description: The proportion of the variance in the dependent variable
             that is predictable from the independent variables.
Power: Indicates the goodness-of-fit of the model, showing how well the model
       explains the variability of the target variable.
'''


# ML Algorythm function to get these metric scores
from sklearn.metrics import (
    accuracy_score,
    recall_score,
    precision_score,
    f1_score,
    roc_auc_score,
    mean_squared_error,
    mean_absolute_error,
    r2_score,
    make_scorer,
    confusion_matrix
)

# ML Algorythm to finetune different models
from sklearn.model_selection import GridSearchCV

import warnings
warnings.filterwarnings("ignore")
In [ ]:
from google.colab import sheets
sheet = sheets.InteractiveSheet(df=data)

Excel Spreadsheet displayed

In [ ]:
## Statistical summary of the data. Round-off to two decimal spaces and transpose for easier reading:
data.describe().round(2).T
count   mean    std min 25% 50% 75% max

ID 5000.0 2500.50 1443.52 1.0 1250.75 2500.5 3750.25 5000.0 Age 5000.0 45.34 11.46 23.0 35.00 45.0 55.00 67.0 Experience 5000.0 20.10 11.47 -3.0 10.00 20.0 30.00 43.0 Income 5000.0 73.77 46.03 8.0 39.00 64.0 98.00 224.0 ZIPCode 5000.0 93169.26 1759.46 90005.0 91911.00 93437.0 94608.00 96651.0 Family 5000.0 2.40 1.15 1.0 1.00 2.0 3.00 4.0 CCAvg 5000.0 1.94 1.75 0.0 0.70 1.5 2.50 10.0 Education 5000.0 1.88 0.84 1.0 1.00 2.0 3.00 3.0 Mortgage 5000.0 56.50 101.71 0.0 0.00 0.0 101.00 635.0 Personal_Loan 5000.0 0.10 0.29 0.0 0.00 0.0 0.00 1.0 Securities_Account 5000.0 0.10 0.31 0.0 0.00 0.0 0.00 1.0 CD_Account 5000.0 0.06 0.24 0.0 0.00 0.0 0.00 1.0 Online 5000.0 0.60 0.49 0.0 0.00 1.0 1.00 1.0 CreditCard 5000.0 0.29 0.46 0.0 0.00 0.0 1.00 1.0

In [ ]:
# Improved

# Specify the column name
column_name = 'Age'

# ANSI escape codes for bold and orange
bold_orange = "\033[1;33m"  # Bold + Orange
reset = "\033[0m"  # Reset to normal

# Print the column name, unique values, and data type with formatting
print(f"Column: {bold_orange}{column_name}{reset}")
print(f"Unique values in '{column_name}' column: {data[column_name].unique()}") # Non-repeating values listed:
print(f"Data type of '{column_name}' column: {bold_orange}{data[column_name].dtype}{reset}")

Column: Age YELLOW Unique values in 'Age' column: [25 45 39 35 37 53 50 34 65 29 48 59 67 60 38 42 46 55 56 57 44 36 43 40 30 31 51 32 61 41 28 49 47 62 58 54 33 27 66 24 52 26 64 63 23] Data type of 'Age' column: int64 YELLOW

In [ ]:
# pip install colorama
In [ ]:
from IPython.display import display, HTML

u_tri = '\u25B2'  # (▲)
d_tri = '\u25BC'  # (▼)
l_tri = '\u25C0' # (◀)
r_tri = '\u25B6' # (▶)

display(HTML(f'We look for a loss decrease <span style="color: red;">{d_tri}</span> and recall increase <span style="color: green;">{u_tri}</span> both for training and validation sets:'))
We look for a loss decrease ▼ and recall increase ▲ both for training and validation sets:

APPENDIX IV¶

ILLUSTRATING FORWARD PROPAGATION AND BACK PROPAGATION ON ANN :-)

In [2]:
# Forward Propagation Illustration - Utility Function

import time
import sys

# ANSI color codes
RED = "\033[31m"
GREEN = "\033[32m"
ORANGE = "\033[33m"
RESET = "\033[0m"

# Forward Propagation Illustration
print("\nIllustration of -Forward Propagation- by Activation Functions at each neuron based on the strenght of importance or weights between nodes.")

# Function to animate dashes with color choice, moving towards a fixed [*] position, and stop after a timeout
def animate_dashes_with_fixed_asterisk(total_time=10, speed=0.1, color="RED"):
    # Choose color based on user input
    if color == "RED":
        dash_color = RED
    elif color == "GREEN":
        dash_color = GREEN
    elif color == "ORANGE":
        dash_color = ORANGE
    else:
        dash_color = RESET  # Default to no color if an invalid option is given

    width = 10  # Initial width of dashes
    fixed_pos = 45  # Fixed position for the asterisk [*]
    start_time = time.time()  # Record the starting time

    while time.time() - start_time < total_time:
        # Calculate remaining space before the asterisk [*]
        remaining_space = fixed_pos - width - 3  # 3 accounts for the length of [*]

        if remaining_space < 0:
            remaining_space = 0  # Stop growing dashes once they reach [*]

        # Use carriage return '\r' to return to the start of the line and overwrite the previous output
        sys.stdout.write(f"\r{GREEN}[* neuron layer_i *] {dash_color}{'-' * width}{' ' * remaining_space}{ORANGE}[* neuron layer_i+1 *]{RESET}")
        sys.stdout.flush()  # Force output to appear immediately
        time.sleep(speed)   # Control the speed of the animation

        if remaining_space == 0:
            width = 10  # Reset width after dashes meet [*] to restart animation
        else:
            width += 1  # Increase the number of dashes for movement effect

    # After the timeout ends, clear the line and stop
    sys.stdout.write("\r" + " " * (fixed_pos + 3) + "\r")
    sys.stdout.flush()

# Run the animation (set timeout in seconds and specify color: "RED", "GREEN", or "ORANGE")
animate_dashes_with_fixed_asterisk(total_time=10, speed=0.08, color="GREEN")

# Leave the illustration completed without animation
print(f"\r{GREEN}[* neuron layer_i *]{GREEN} ----------------------------------------- {ORANGE}[* neuron layer_i+1 *]")
Illustration of -Forward Propagation- by Activation Functions at each neuron based on the strenght of importance or weights between nodes.
[* neuron layer_i *] ----------------------------------------- [* neuron layer_i+1 *]
In [3]:
# Back Propagation Illustration - Utility Function

import time
import sys

# ANSI color codes
RED = "\033[31m"
GREEN = "\033[32m"
ORANGE = "\033[33m"
RESET = "\033[0m"

print("\nIllustration of -Back Propagation- to nodes with stronger influence after wights are updated with a keras Optimizer")

# Function to animate dashes moving from right to left, with "updated weight" in the middle
def animate_dashes_with_weight_in_middle(total_time=10, speed=0.1, asterisk_color="GREEN", dash_color="RED"):
    # Choose dash color based on user input
    if dash_color == "RED":
        dash_color = RED
    elif dash_color == "GREEN":
        dash_color = GREEN
    elif dash_color == "ORANGE":
        dash_color = ORANGE
    else:
        dash_color = RESET  # Default to no color if an invalid option is given

    # Choose asterisk color
    if asterisk_color == "RED":
        asterisk_color = RED
    elif asterisk_color == "GREEN":
        asterisk_color = GREEN
    elif asterisk_color == "ORANGE":
        asterisk_color = ORANGE
    else:
        asterisk_color = RESET  # Default to no color if an invalid option is given

    dash_width = 46  # Starting width of the dashes
    fixed_pos = 46  # Position where the asterisk [*] will stay fixed
    word = " updated weight "
    word_length = len(word)
    start_time = time.time()  # Record the starting time

    while time.time() - start_time < total_time:
        # Calculate the remaining space for the dashes
        remaining_space = fixed_pos - dash_width - 3  # 3 accounts for the length of [*]

        # Calculate the position to place the word in the middle
        total_length = dash_width + len(word) + 3  # Total length including dashes and word
        word_position = (total_length // 2) - (word_length // 2)  # Center the word

        # Create the dashes string
        dashes = '-' * dash_width

        # Insert the word into the dashes at the calculated position
        if word_position < dash_width:
            dashes = dashes[:word_position] + word + dashes[word_position + word_length:]

        # Print the line with dashes shrinking toward the asterisk [*]
        sys.stdout.write(f"\r{asterisk_color}[* neuron layer_i *] {dash_color}{dashes}{' ' * remaining_space}{RESET}")
        sys.stdout.flush()  # Force the output to appear immediately
        time.sleep(speed)   # Control the speed of the animation

        if dash_width == 0:
            dash_width = 45  # Reset dash width after dashes shrink to 0
        else:
            dash_width -= 1  # Decrease the number of dashes for the leftward effect

    # After timeout ends, clear the line and stop
    sys.stdout.write("\r" + " " * (fixed_pos + 3) + "\r")
    sys.stdout.flush()

# Run the animation (set timeout in seconds and specify colors)
animate_dashes_with_weight_in_middle(total_time=10, speed=0.08, asterisk_color="GREEN", dash_color="RED")

# Leave the illustration completed without animation
print(f"\r{GREEN}[* neuron layer_i *]{RED} <---------------------------------------- {ORANGE}[* neuron layer_i+1 *]")
Illustration of -Back Propagation- to nodes with stronger influence after wights are updated with a keras Optimizer
[* neuron layer_i *] <---------------------------------------- [* neuron layer_i+1 *]
In [19]:
message = "Essentially:\nHow fast the global minimum is found (by Error Loss) is a function of the chosen Keras optimizers \n" \
          "during the backflow propagation with updated weights and strengths of signals between neurons."

# Run the animation (set timeout in seconds and specify color: "RED", "GREEN", or "ORANGE")
animate_dashes_with_fixed_asterisk(total_time=5, speed=0.08, color="GREEN")
print(f"\r{GREEN}[* neuron layer_i *]{GREEN} ----------------------------------------- {ORANGE}[* neuron layer_i+1 *]")

# Run the animation (set timeout in seconds and specify colors)
animate_dashes_with_weight_in_middle(total_time=5, speed=0.08, asterisk_color="GREEN", dash_color="RED")
print(f"\r{GREEN}[* neuron later_i *]{RED} <---------------------------------------- {ORANGE}[* neuron layer_i+1 *]{RESET}")

print(message)
[* neuron layer_i *] ----------------------------------------- [* neuron layer_i+1 *]
[* neuron later_i *] <---------------------------------------- [* neuron layer_i+1 *]
Essentially:
How fast the global minimum is found (by Error Loss) is a function of the chosen Keras optimizers 
during the backflow propagation with updated weights and strengths of signals between neurons.