
A Step-by-Step Tutorial to Develop a Multi-Output Model in TensorFlow | by Rashida Nasrin Sucky | Oct, 2022



Photo by Pavel Neznanov on Unsplash

With complete code

I have written several tutorials on TensorFlow before, covering models built with the Sequential and Functional APIs, Convolutional Neural Networks, Recurrent Neural Networks, and more. In this article, we will build a model using the Functional API that predicts two outputs with a single network.

If you already know how the Functional API works, this should be simple for you. If you need a tutorial or a refresher on the Functional API, my earlier article on it should help.

Let’s dive into the tutorial. First, import the necessary packages:

import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Input
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
import itertools

I am using a free public dataset named auto_clean. Please feel free to download it to follow along.

First, create a pandas DataFrame with auto_clean.csv data:

df = pd.read_csv('auto_clean.csv')

The dataset has 201 rows and 29 columns. These are the columns:

df.columns

Output:

Index(['symboling', 'normalized-losses', 'make', 'aspiration', 'num-of-doors', 'body-style', 'drive-wheels', 'engine-location', 'wheel-base', 'length', 'width', 'height', 'curb-weight', 'engine-type', 'num-of-cylinders', 'engine-size', 'fuel-system', 'bore', 'stroke', 'compression-ratio', 'horsepower', 'peak-rpm', 'city-mpg', 'highway-mpg', 'price', 'city-L/100km', 'horsepower-binned', 'diesel', 'gas'], dtype='object')

The dataset has a few null values, and for the purposes of this tutorial, I will simply delete the rows containing them. There are several other ways to deal with null values; please feel free to try those yourself.

df = df.dropna()
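If you would rather keep all 201 rows, imputation is one alternative. Here is a minimal sketch (not what this tutorial uses) that fills numeric columns with their medians and the remaining columns with their most frequent value:

# Alternative to dropna() (not used in this tutorial): impute missing values
num_cols = df.select_dtypes(include=np.number).columns
df[num_cols] = df[num_cols].fillna(df[num_cols].median())

cat_cols = df.columns.difference(num_cols)
for col in cat_cols:
    df[col] = df[col].fillna(df[col].mode()[0])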

Now the dataset has 196 rows, which is not a lot, but let’s see what we can do with it.

As this is a multi-output model, I chose num-of-cylinders and price as the target variables. Here, num-of-cylinders is a categorical variable and price is a continuous one. You could just as well pick two categorical or two continuous variables.

Data Preparation

For data preparation, we first need to convert the categorical variables to numeric values. Here is the procedure I followed.

Start by finding the numeric columns in the DataFrame:

num_columns = df.select_dtypes(include=np.number).columns
num_columns

Output:

Index(['symboling', 'normalized-losses', 'wheel-base', 'length', 'width', 'height', 'curb-weight', 'engine-size', 'bore', 'stroke', 'compression-ratio', 'horsepower', 'peak-rpm', 'city-mpg', 'highway-mpg', 'price', 'city-L/100km', 'diesel', 'gas'], dtype='object')

The output above shows the numeric column names. We need to convert the rest of the columns to numeric.

cat_columns = []
for col in df.columns:
    if col not in num_columns:
        cat_columns.append(col)
cat_columns

Output:

['make',
'aspiration',
'num-of-doors',
'body-style',
'drive-wheels',
'engine-location',
'engine-type',
'num-of-cylinders',
'fuel-system',
'horsepower-binned']
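
By the way, pandas can produce the same list in one line; this is just a convenience, and the result is identical:

# One-line equivalent: every column that is not numeric
cat_columns = df.select_dtypes(exclude=np.number).columns.tolist()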

Here is how I converted these columns to numeric:

for cc in cat_columns:
    df[cc] = pd.Categorical(df[cc])
    df[cc] = df[cc].cat.codes
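
One caveat: cat.codes overwrites the original labels. If you want to keep the label-to-code mapping for interpreting predictions later, you can capture it before running the loop above. A small sketch:

# Run this before the conversion loop to keep the label-to-code mappings
mappings = {}
for cc in cat_columns:
    cats = pd.Categorical(df[cc])
    mappings[cc] = dict(enumerate(cats.categories))
# e.g. mappings['num-of-cylinders'] maps each integer code back to a label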

That’s how the data preparation is done for this project.

Data Splitting

For model training, we will not use all of the data. 20% of the dataset is held out for testing, and 20% of the remainder is reserved for validation. I used the train_test_split method from the scikit-learn library for this:

train, test = train_test_split(df, test_size=0.2, random_state=2)
train, val = train_test_split(train, test_size=0.2, random_state=23)
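
A quick check of the split sizes; with 196 rows, the two 20% splits should leave about 124 training, 32 validation, and 40 test rows:

# Confirm the three split sizes (expect roughly 124 / 32 / 40 rows)
print(f'train: {train.shape}, val: {val.shape}, test: {test.shape}')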

Next, I will separate the two output variables and turn each into a NumPy array.

This function will do just that:

def output_form(data):
    # Read the two targets without removing them from the DataFrame
    # (they are dropped from the feature sets later)
    price = np.array(data['price'])
    noc = np.array(data['num-of-cylinders'])
    return (price, noc)

Let’s apply this function to the train, test, and validation data:

train_y = output_form(train)
test_y = output_form(test)
val_y = output_form(val)

It is good practice to standardize the data, because different variables can live on very different ranges. I will use the describe function, which gives the count, mean, std, min, 25th, 50th, and 75th percentiles, and max for every variable. The mean and std from the training data are then used to standardize all three splits:

train_stats = train.describe()
train_stats = train_stats.transpose()

def norm(x):
    return (x - train_stats['mean']) / train_stats['std']

We have the function ‘norm’ to standardize the data.

train_x = norm(train)
test_x = norm(test)
val_x = norm(val)
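
As an optional sanity check, every normalized training column should now have a mean close to 0 and a standard deviation close to 1:

# Standardized training features should be roughly mean 0, std 1
print(train_x.mean().round(2))
print(train_x.std().round(2))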

We should drop the target variables from the input features. We could have done this earlier, but let’s do it now:

train_x = train_x.drop(columns=['price', 'num-of-cylinders'])
test_x = test_x.drop(columns=['price', 'num-of-cylinders'])
val_x = val_x.drop(columns=['price', 'num-of-cylinders'])
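
After the drop, each feature set should have 27 columns (the 29 originals minus the two targets), which matches the input shape we define in the next section:

# 29 columns minus the 2 targets leaves 27 input features
print(train_x.shape[1], val_x.shape[1], test_x.shape[1])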

We have features and output variables ready for the model.

Model Development

We will use two functions for model development. The base model will only define the dense layers, and the final model will add the output layers to the base model.

def base_model(inputs):
    x = Dense(500, activation='tanh')(inputs)
    x = Dense(500, activation='tanh')(x)
    x = Dense(300, activation='tanh')(x)
    x = Dense(300, activation='tanh')(x)
    x = Dense(300, activation='tanh')(x)
    x = Dense(300, activation='tanh')(x)
    x = Dense(150, activation='tanh')(x)
    x = Dense(150, activation='tanh')(x)
    return x

def final_model(inputs):
    x = base_model(inputs)

    # Regression head: one linear unit for the continuous price
    price = Dense(units=1, name='price')(x)

    # Classification head: one unit per cylinder-count class in this data
    # (softmax is the more conventional choice here, but sigmoid also trains)
    noc = Dense(units=5, activation='sigmoid', name='noc')(x)

    model = Model(inputs=inputs, outputs=[price, noc])

    return model

That’s our model. Now comes training and, of course, testing.

Training and Testing

For training, the inputs and the optimizer need to be defined. I will use the Adam optimizer with its default learning rate. Please feel free to try other optimizers and different learning rates.

inputs = tf.keras.layers.Input(shape=(27,))

Now, pass this input to the model:

model = final_model(inputs)
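
Before compiling, it is worth printing the architecture; the layer list should end with the two heads, 'price' (1 unit) and 'noc' (5 units):

# Verify the two output heads at the end of the layer list
model.summary()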

For model compilation, each of the two outputs gets its own loss function and metric: mean squared error with a root mean squared error metric for the continuous price output, and sparse categorical crossentropy with accuracy for the categorical ‘noc’ output. Here the name ‘noc’ refers to ‘num-of-cylinders’.

model.compile(optimizer='adam',
              loss={'price': 'mse',
                    'noc': 'sparse_categorical_crossentropy'},
              metrics={'price': tf.keras.metrics.RootMeanSquaredError(),
                       'noc': 'accuracy'})
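
One thing to keep in mind: the two losses live on very different scales (the price MSE runs into the hundreds of thousands, while the noc loss stays in single digits), so the combined loss is dominated by price. Keras can rebalance this through the loss_weights argument; the weights below are illustrative placeholders, not tuned values:

# Optional: rebalance the two objectives so price does not dominate
# (the weight values here are illustrative, not tuned)
model.compile(optimizer='adam',
              loss={'price': 'mse',
                    'noc': 'sparse_categorical_crossentropy'},
              loss_weights={'price': 1e-5, 'noc': 1.0},
              metrics={'price': tf.keras.metrics.RootMeanSquaredError(),
                       'noc': 'accuracy'})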

Everything is ready to train the model. I trained it for 400 epochs:

history = model.fit(train_x, train_y,
                    epochs=400, validation_data=(val_x, val_y))
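
The returned history object records every metric per epoch, which makes it easy to plot the learning curves. For example, for the cylinder-count accuracy:

# Plot training vs. validation accuracy for the 'noc' head
plt.plot(history.history['noc_accuracy'], label='train')
plt.plot(history.history['val_noc_accuracy'], label='validation')
plt.xlabel('epoch')
plt.ylabel('noc accuracy')
plt.legend()
plt.show()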

Here are the results of the last three epochs:

Epoch 398/400
4/4 [==============================] - 0s 11ms/step - loss: 390353.6250 - price_loss: 390342.9688 - noc_loss: 10.6905 - price_root_mean_squared_error: 624.7744 - noc_accuracy: 0.7097 - val_loss: 8178957.5000 - val_price_loss: 8178956.0000 - val_noc_loss: 1.6701 - val_price_root_mean_squared_error: 2859.8875 - val_noc_accuracy: 0.9062
Epoch 399/400
4/4 [==============================] - 0s 12ms/step - loss: 424782.6250 - price_loss: 424775.5625 - noc_loss: 7.0919 - price_root_mean_squared_error: 651.7481 - noc_accuracy: 0.6935 - val_loss: 8497714.0000 - val_price_loss: 8497707.0000 - val_noc_loss: 7.1780 - val_price_root_mean_squared_error: 2915.0828 - val_noc_accuracy: 0.8125
Epoch 400/400
4/4 [==============================] - 0s 11ms/step - loss: 351160.1875 - price_loss: 351145.4062 - noc_loss: 14.7626 - price_root_mean_squared_error: 592.5753 - noc_accuracy: 0.7258 - val_loss: 8427407.0000 - val_price_loss: 8427401.0000 - val_noc_loss: 5.7305 - val_price_root_mean_squared_error: 2902.9985 - val_noc_accuracy: 0.9062

From the results above, you can see that the training accuracy for ‘num-of-cylinders’ after the last epoch was 72.58% and the validation accuracy was 90.62%.

While it may look odd that the validation accuracy is much higher than the training accuracy, remember that the dataset is very small: the validation split holds only 32 rows, so each example moves the accuracy by about 3 percentage points.

Here I am printing the final losses and metrics on the validation data:

loss, price_loss, noc_loss, price_root_mean_squared_error, noc_accuracy = model.evaluate(x=val_x, y=val_y)
print()
print(f'loss: {loss}')
print(f'price_loss: {price_loss}')
print(f'noc_loss: {noc_loss}')
print(f'price_root_mean_squared_error: {price_root_mean_squared_error}')
print(f'noc_accuracy: {noc_accuracy}')

Output:

1/1 [==============================] - 0s 18ms/step - loss: 8427407.0000 - price_loss: 8427401.0000 - noc_loss: 5.7305 - price_root_mean_squared_error: 2902.9985 - noc_accuracy: 0.9062

loss: 8427407.0
price_loss: 8427401.0
noc_loss: 5.730476379394531
price_root_mean_squared_error: 2902.99853515625
noc_accuracy: 0.90625

Evaluation

We used the training and validation data to train the model; it has never seen the test dataset. So, we will use the test data for evaluation. The predict function generates outputs for any input data:

predictions = model.predict(test_x)

Because the model has two outputs, we can access the predictions for price and ‘num-of-cylinders’ from predictions like this:

price_pred = predictions[0]
noc_pred = predictions[1]
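
The noc head returns five per-class scores for each row, so taking the argmax gives the predicted class. With that, the confusion_matrix we imported at the top can summarize the test-set classification:

# Turn per-class scores into class labels, then tabulate against the truth
noc_classes = np.argmax(noc_pred, axis=1)
print(confusion_matrix(test_y[1], noc_classes))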

The accuracy rate for ‘num-of-cylinders’ is straightforward to read, but price is a continuous variable, so there is no accuracy rate for it. The price root mean squared error looks reasonable; a visual check is more telling.

The plot below shows the actual and predicted prices together:

plt.figure(figsize=(8, 6))
plt.scatter(range(len(price_pred)), price_pred.flatten(), color='green', label='predicted')
plt.scatter(range(len(price_pred)), test_y[0], color='red', label='actual')
plt.legend()
plt.title("Comparison of Actual and Predicted Prices", fontsize=18)
plt.show()

I believe the predictions are reasonably close to the actual values. Please feel free to apply other evaluation methods as well; my focus here was a tutorial on the multi-output model.

Conclusion

I hope this tutorial was helpful and that you will be able to use it in your work or academic projects. I used two output variables in this model; if you have a more complex dataset, please feel free to try this method with more than two.

Feel free to follow me on Twitter and like my Facebook page.
