Neural Networks Multi-Class Classification in Python

Shaun Enslin
January 13, 2025
Photo by author: Avoid the pitfalls of retraining your data

This complete guide to multi-class neural networks walks through transforming our data, defining the model, evaluating it with k-fold cross-validation, then compiling, training and saving the model for later use. Afterwards, we reload the saved model to make predictions without the need to retrain.

Introduction

This is a multi-class classification problem, meaning that there are more than two classes to be predicted. In our case, there are 7 categories.

  • You can download the source code from GitHub
  • If you would like to see how to code a neural network from scratch, check this article
  • Download the dataset we will be using from Kaggle
  • A very good article on multi class concepts which I reference below

This article will focus on:

1. Import the classes and functions
2. Train and save the model
  2.1 Load our data
  2.2 Prepare our features
  2.3 Split train and test data
  2.4 Hot Encoding Y
  2.5 Define The Neural Network Model
  2.6 Evaluate The Model with k-Fold Cross Validation
  2.7 Compile and evaluate model on training data
  2.8 Plot the learning curve
  2.9 Save the model
3. Reload models from disk and predict
  3.1 Look at our files
  3.2 Reload the model
  3.4 Reload 5% random data
  3.5 Transform features
  3.6 Predict and check for accuracy
4. Conclusion

1. Import Classes and functions

We can begin by importing all of the classes and functions we will need in this tutorial.

from keras.models import Sequential
from keras.layers import Dense
# Note: keras.wrappers.scikit_learn and keras.utils.np_utils are only
# available in older Keras releases; newer releases have dropped them.
from keras.wrappers.scikit_learn import KerasClassifier
from keras.utils import np_utils
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import LabelEncoder
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from joblib import dump, load
import pandas as pd
import numpy as np

2. Train and save model

2.1 Load our data

Let's load our data into a dataframe.

df = pd.read_csv('data/customertrain.csv')
df = df.dropna()
df = df.drop(['Segmentation', 'ID'], axis=1)  # not needed
df.head()
Dataframe results

Figure 1: Results of loading the data

We can see that we have 8068 training examples, but we do have some things to sort out:

  • We will need to encode the categories from Y
  • Use dump to save the encoder for later use

from sklearn.preprocessing import LabelEncoder
from joblib import dump

def prepareY(df):
    # extract Y from the dataframe
    Y = df['Var_1']
    # encode class values as integers
    yencoder = LabelEncoder()
    yencoder.fit(Y)
    # save the fitted encoder so the same mapping can be reused later
    dump(yencoder, 'models/yencoder.joblib')
    return yencoder.transform(Y)

y = prepareY(df)
df = df.drop(['Var_1'], axis=1)  # drop Y from the features
pd.DataFrame(y).head()
Encoded Y data

Figure 2: Y has been encoded
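
To make the encoding step a little more concrete, here is a minimal NumPy sketch of the idea behind LabelEncoder: the distinct classes are sorted and each label is replaced by its position in that sorted list. The label values below are hypothetical stand-ins for the Var_1 column, not rows from the dataset.

```python
import numpy as np

# Hypothetical labels standing in for the Var_1 column.
labels = np.array(["Cat_6", "Cat_4", "Cat_6", "Cat_1", "Cat_4"])

# np.unique returns the sorted distinct classes and, with return_inverse=True,
# the index of each label within that sorted array.
classes, encoded = np.unique(labels, return_inverse=True)

# Decoding is simply a lookup back into the sorted class array.
decoded = classes[encoded]

print(classes)
print(encoded)
```

Because the mapping depends on the sorted set of classes seen during fitting, the same fitted encoder has to be reused at prediction time, which is why we save it with dump.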

2.2 Prepare our features

We need to do a few things to our features, so we can work with them a little easier.

  • Let's convert our string fields to numbers using OrdinalEncoder
  • Use MinMaxScaler to scale our numeric fields so that they fall in the range -1 to 1.
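
As a rough illustration of what MinMaxScaler does with feature_range=(-1, 1), the sketch below rescales a single column by hand. The helper name min_max_scale and the sample values are assumptions made for this example, and the sketch assumes the column is not constant (otherwise the division would be by zero).

```python
import numpy as np

def min_max_scale(column, low=-1.0, high=1.0):
    """Linearly rescale a 1-D array so its min maps to `low` and its max to `high`."""
    column = np.asarray(column, dtype=float)
    col_min, col_max = column.min(), column.max()
    unit = (column - col_min) / (col_max - col_min)  # values now lie in [0, 1]
    return unit * (high - low) + low                 # stretch and shift to [low, high]

ages = np.array([18.0, 30.0, 54.0, 90.0])  # hypothetical numeric column
scaled = min_max_scale(ages)
print(scaled)
```

Note that this kind of scaling fixes the minimum and maximum of the column. It does not centre the data on a mean of zero, which is what a standard scaler would do.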

Get a list of our string and numeric columns.

numerical_ix = df.select_dtypes(include=['int64', 'float64']).columns
categorical_ix = df.select_dtypes(include=['object', 'bool']).columns

Use ColumnTransformer to encode our string columns and then scale the numeric columns. We will use dump to save the fitted column_trans object for later use.

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OrdinalEncoder
from sklearn.preprocessing import MinMaxScaler

column_trans = ColumnTransformer([
    ('cat', OrdinalEncoder(), categorical_ix),
    ('num', MinMaxScaler(feature_range=(-1, 1)), numerical_ix)],
    remainder='drop')
column_trans.fit(df)
dump(column_trans, "models/column_trans.joblib")

X = column_trans.transform(df)
pd.DataFrame(X).head()
Transformed features

Figure 3: Results after running column transformation

2.3 Split train and test data

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
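
For readers who want to see what a shuffled 80/20 split amounts to, here is a small NumPy sketch. The helper split_indices is hypothetical and will not reproduce the exact rows that train_test_split picks for random_state=42; it only illustrates the idea of shuffling once and cutting the indices in two.

```python
import numpy as np

def split_indices(n_samples, test_size=0.2, seed=42):
    """Shuffle the row indices once, then cut them into test and train parts."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)
    n_test = int(np.ceil(n_samples * test_size))
    return shuffled[n_test:], shuffled[:n_test]

# 1000 is an arbitrary row count chosen for the example.
train_idx, test_idx = split_indices(1000)
print(len(train_idx), len(test_idx))
```

Every row ends up in exactly one of the two sets, so the test rows are never seen during training.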

2.4 Hot Encoding Y

The output variable contains seven different string values.

When modeling multi-class classification problems using neural networks, it is good practice to reshape the output attribute from a vector holding one class value per instance into a matrix with one boolean column per class, indicating whether or not a given instance belongs to that class.

This is called `one hot encoding` or creating dummy variables from a categorical variable.

For example, in this problem the seven class values are encoded as [0,1,2,3,4,5,6]. We can turn this into a one-hot encoded binary matrix, with one row for each data instance, that would look as follows:

Hot encoded data

Figure 4: Results of hot encoding Y

yhot = np_utils.to_categorical(y)
yhot_train = np_utils.to_categorical(y_train)
yhot_test = np_utils.to_categorical(y_test)
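
If you are curious what the one-hot step produces, the following NumPy sketch builds the same kind of matrix by indexing into an identity matrix. The values in y_example are hypothetical encoded labels, not taken from the dataset.

```python
import numpy as np

num_classes = 7
y_example = np.array([0, 3, 6, 3])  # hypothetical integer-encoded labels

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing with the labels picks one such row per instance.
one_hot = np.eye(num_classes, dtype="float32")[y_example]

print(one_hot)
```

Each row contains a single 1 in the column of its class, and taking the argmax of a row recovers the original integer label.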

2.5 Define The Neural Network Model

So, now you may be asking: "What are reasonable sizes for each layer?"

  • Input layer = set to the size of the features, i.e. 8
  • Hidden layers = set to input_layer * 2 (i.e. 16)
  • Output layer = set to the size of the labels of Y. In our case, this is 7 categories

The network topology of this neural network, which has two hidden layers, can be summarized as:

8 inputs -> [16 hidden nodes] -> [16 hidden nodes] -> 7 outputs

Now create our model inside a function so we can use it in the KerasClassifier as well as later when we compile our model.

# define baseline model
def baseline_model():
    # create model
    model = Sequential()
    # Rectified Linear Unit Activation Function
    model.add(Dense(16, input_dim=8, activation='relu'))
    model.add(Dense(16, activation='relu'))
    # Softmax for multi-class classification
    model.add(Dense(7, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
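
As a quick sanity check on the size of this network, we can count its trainable parameters by hand. The sketch below assumes the layer sizes used in baseline_model above (8 inputs, two Dense layers of 16 units, 7 outputs); the helper dense_params is a hypothetical name for this example.

```python
def dense_params(n_in, n_out):
    """A fully connected layer has one weight per input/output pair plus one bias per output."""
    return n_in * n_out + n_out

layer_sizes = [8, 16, 16, 7]  # inputs, hidden, hidden, outputs
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

print(per_layer, total)
```

The totals should match what model.summary() reports for the same architecture, which is a handy way to confirm that the layers were wired up as intended.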

We can now create our KerasClassifier for use in scikit-learn. We use mini-batches, as this tends to be the fastest way to train.

cmodel = KerasClassifier(build_fn=baseline_model, epochs=200, batch_size=100, verbose=0)

2.6 Evaluate The Model with k-Fold Cross Validation

Now, let's evaluate the neural network model on all our data. First, we define the model evaluation procedure. Here, we set the number of folds to 10 and shuffle the data before partitioning it.

kfold = KFold(n_splits=10, shuffle=True)

Now we can evaluate our model on our dataset (X and yhot) using a 10-fold cross-validation procedure (kfold).

result = cross_val_score(cmodel, X, yhot, cv=kfold)
print('Result: %.2f%% (%.2f%%)' % (result.mean()*100, result.std()*100))

After running above, you should see a result of around 67.64%.
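
To see what the 10-fold procedure does with our rows, here is a minimal NumPy sketch of shuffled k-fold splitting. The generator kfold_indices is a hypothetical helper written for this example and is not the scikit-learn implementation; the sample count of 105 is arbitrary.

```python
import numpy as np

def kfold_indices(n_samples, n_splits=10, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is used for validation exactly once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), n_splits)
    for i, val_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train_idx, val_idx

splits = list(kfold_indices(105))
print([len(val) for _, val in splits])
```

The model is trained from scratch once per fold, and the reported result is the mean and standard deviation of the ten validation scores.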

Great, k-fold has done its job: it gives us a more reliable estimate of the accuracy we can expect from this model on this dataset than a single train/test split would.

2.7 Compile and evaluate model on training data

Now that we are happy with our epochs and batch size, let's compile and train a model we can use later.

model = baseline_model()
# baseline_model() has already compiled the model; compiling again with the same settings is harmless
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X_train, yhot_train, validation_split=0.33, epochs=200, batch_size=100, verbose=0)

2.8 Plot the learning curve

The plots are provided below. The history for the validation dataset is labeled test by convention as it is indeed a test dataset for the model.

We can also see that the model has not yet over-learned the training dataset, showing comparable accuracy on both datasets.

import matplotlib.pyplot as plt

# list all data in history
print(history.history.keys())

# summarize history for accuracy
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()

# summarize history for loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
Learning curves

Figure 5: Our learning curve is looking good. Could even reduce the epochs
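
If you want to back up a reading of the curves with numbers, one option is to look for the epoch with the lowest validation loss. The sketch below works on a plain dictionary shaped like history.history; the loss values are made up for illustration and best_epoch is a hypothetical helper.

```python
def best_epoch(history_dict, key="val_loss"):
    """Return the 0-based epoch index at which the given metric is lowest."""
    values = history_dict[key]
    return min(range(len(values)), key=values.__getitem__)

# Made-up numbers shaped like history.history, not results from this model.
hypothetical_history = {
    "loss":     [1.60, 1.20, 1.00, 0.90, 0.85, 0.80],
    "val_loss": [1.62, 1.25, 1.05, 0.98, 1.01, 1.07],
}

epoch = best_epoch(hypothetical_history)
gap = hypothetical_history["val_loss"][-1] - hypothetical_history["loss"][-1]
print(epoch, round(gap, 2))
```

A validation loss that bottoms out early while the training loss keeps falling is the usual sign that fewer epochs would do.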

Let's run an evaluation on our test set and see how we hold up with new data. You should end up with an accuracy of 69.09%

# evaluate the keras model
_, accuracy = model.evaluate(X_test, yhot_test)
print('Accuracy from evaluate: %.2f' % (accuracy*100))

Finally, for fun, let's make predictions on our test data and compute the accuracy ourselves. Again, you should end up with an accuracy of around 69%.

predict_x = model.predict(X_test)
pred = np.argmax(predict_x, axis=1)
print(f'Prediction Accuracy: {(pred == y_test).mean() * 100:f}')
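
The prediction step relies on the softmax output being one probability per class, with argmax picking the most likely one. Here is a small NumPy sketch of that idea using made-up scores for a three-class problem; nothing here comes from the trained model.

```python
import numpy as np

def softmax(logits):
    """Turn each row of raw scores into probabilities that sum to 1."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Made-up raw scores for three instances and three classes.
logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 0.1, 3.0],
                   [1.0, 1.5, 0.3]])
probs = softmax(logits)

pred = np.argmax(probs, axis=1)   # most likely class per instance
y_true = np.array([0, 2, 0])      # hypothetical integer labels
accuracy = (pred == y_true).mean()
print(pred, accuracy)
```

Note that the comparison is made against the integer labels, not the one-hot matrix, which is why the tutorial compares pred with y_test rather than yhot_test.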

2.9 Save the model

Now, let's save the model, so that later we can reload it and make predictions without the need to retrain. The model architecture is converted to JSON format and written to models/customermodel.json. The network weights are written to model.h5 in the local directory.

# serialize model architecture to JSON
model_json = model.to_json()
with open("models/customermodel.json", "w") as json_file:
    json_file.write(model_json)
# serialize weights to HDF5
model.save_weights("model.h5")
print("Saved model to disk")
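
One practical detail: writing to models/customermodel.json only works if the models directory already exists. A minimal sketch of creating the folder on demand is shown below. It uses a temporary directory and a tiny JSON string as a stand-in for model.to_json(), and save_text is a hypothetical helper name.

```python
import json
import tempfile
from pathlib import Path

def save_text(path, text):
    """Write text to path, creating any missing parent folders first."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text)
    return path

with tempfile.TemporaryDirectory() as tmp:
    # The JSON payload is a placeholder, not a real Keras model description.
    target = save_text(Path(tmp) / "models" / "customermodel.json",
                       json.dumps({"layers": 3}))
    reloaded = json.loads(target.read_text())

print(reloaded)
```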

3. Reload models from disk and predict

3.1 Look at our files

The model and weight data is loaded from the saved files and a new model is created. It is important to compile the loaded model before it is used. This is so that predictions made using the model can use the appropriate efficient computation from the Keras backend.

The model is evaluated in the same way printing the same evaluation score.

ls -lt models
Saved model files

Figure 6: Saving our model, transformer and encoder

3.2 Reload the models

We will reload our models, simulating the case where we want to run a prediction a day or two later.

from keras.models import model_from_json

# load json and create model
with open('models/customermodel.json', 'r') as json_file:
    loaded_model_json = json_file.read()
loaded_model = model_from_json(loaded_model_json)
# load weights into new model
loaded_model.load_weights('model.h5')
print('Loaded model from disk')
# compile the loaded model before using it
loaded_model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

Now, let's reload our transformer.

column_trans = load('models/column_trans.joblib')

3.4 Reload 5% random data

Reload our training data, but take a 5% random sample.

df = pd.read_csv('data/customertrain.csv')
df = df.sample(frac=0.05)
df.dropna(inplace=True)
df = df.drop(['Segmentation', 'ID'], axis=1)  # not needed
df.info()

Now, when we reload Y, we first want to load our original encoder. Naturally, the data cannot contain categories the encoder has not seen before, else we will get an error at this point.

def prepareYreload(df):
    # reuse the encoder fitted during training
    yencoder = load("models/yencoder.joblib")
    return yencoder.transform(df["Var_1"])

y = prepareYreload(df)
df = df.drop(["Var_1"], axis=1)
pd.DataFrame(y).head()
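
If you would rather catch unknown categories before calling transform, one option is to compare the incoming labels with the classes the encoder was fitted on (available as yencoder.classes_). The sketch below uses plain arrays as stand-ins; the helper unseen_labels and the category values are assumptions for illustration.

```python
import numpy as np

def unseen_labels(known_classes, new_labels):
    """Return the sorted labels in new_labels that are absent from known_classes."""
    known = set(np.asarray(known_classes).tolist())
    return sorted(set(np.asarray(new_labels).tolist()) - known)

known = np.array(["Cat_1", "Cat_2", "Cat_3"])     # stands in for yencoder.classes_
incoming = np.array(["Cat_2", "Cat_9", "Cat_1"])  # hypothetical new data

missing = unseen_labels(known, incoming)
print(missing)
```

An empty list means the new data is safe to encode; anything else needs to be dropped or handled before prediction.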

3.5 Transform features

OK, let's have a look at our data before we transform it.

df.head()
Features before transform

Figure 7: Features before we transform

column_trans = load("models/column_trans.joblib")
X = column_trans.transform(df)
pd.DataFrame(X).head()
Features after transform

Figure 8: Features after we transform

3.6 Predict and check for accuracy

Now we predict on our 5% random sample using the reloaded model. You should again end up with an accuracy of around 69%.

predict_x = loaded_model.predict(X)
pred = np.argmax(predict_x, axis=1)
print(f'Prediction Accuracy: {(pred == y).mean() * 100:f}')
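
A single accuracy figure hides which classes the model gets wrong. As an optional extra, here is a NumPy sketch of a confusion matrix built from integer labels and predictions. The arrays are made-up three-class examples, and this confusion_matrix function is a hand-rolled illustration rather than the scikit-learn one.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Count how often each true class (rows) is predicted as each class (columns)."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(matrix, (y_true, y_pred), 1)  # accumulate one count per (true, predicted) pair
    return matrix

# Made-up labels and predictions for a three-class problem.
y_true = np.array([0, 1, 2, 2, 1, 0])
y_pred = np.array([0, 2, 2, 2, 1, 1])

cm = confusion_matrix(y_true, y_pred, num_classes=3)
per_class_recall = cm.diagonal() / cm.sum(axis=1)
overall_accuracy = cm.trace() / cm.sum()
print(cm)
```

The diagonal holds the correct predictions, so the overall accuracy is the diagonal sum divided by the number of instances.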

4. Conclusion

In this article you discovered how to develop and evaluate a neural network using the Keras Python library for deep learning.

You learned:

  • How to load data and make it available to Keras.
  • How to prepare multi-class classification data for modeling using one hot encoding.
  • How to use Keras neural network models with scikit-learn.
  • How to define a neural network using Keras for multi-class classification.
  • How to evaluate a Keras neural network model using scikit-learn with k-fold cross validation

5. Sources

In writing this article I found https://machinelearningmastery.com very helpful, with a lot of concepts easily explained.