
How to Define Custom Layer, Activation Function, and Loss Function in TensorFlow | by Rashida Nasrin Sucky | Nov, 2022



Photo by Adrien Converse on Unsplash

Step-by-step explanation and examples with complete code

I have several tutorials on TensorFlow where only the built-in loss functions and layers were used. But TensorFlow is a lot more flexible than that. It allows us to write our own custom loss functions and create our own custom layers. So, there are many ways to build highly efficient models in TensorFlow.

The best way to learn is by doing. So, we will learn with exercises using a free public dataset, the same one I used in my last tutorial on the multi-output model.

I am assuming that you already know the basics of data analysis, data cleaning, and TensorFlow. So, we will move a bit fast in the beginning.

Data Processing

The open public dataset I will use in this tutorial is fairly clean. But still, a little bit of cleaning is necessary.

Here is the link to the dataset:

I already cleaned up the dataset as necessary. Please feel free to download the clean dataset from here to follow along:

Let’s start.

First import all the necessary packages here:

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Layer

Here is the dataset:

df = pd.read_csv("auto_price.csv")

Though I said it is a clean dataset, it still has two unnecessary columns that need to be dropped:

df = df.drop(columns=['Unnamed: 0', 'symboling'])

We will divide the dataset into three portions: one for training, one for testing, and one for validation.

from sklearn.model_selection import train_test_split
train, test = train_test_split(df, test_size=0.2, random_state=2)
train, val = train_test_split(train, test_size=0.2, random_state=23)

To normalize the training data using the z-score method, we need to know the mean and standard deviation of all the training features. Here is how I got it:

train_stats = train.describe()
train_stats = train_stats.transpose()
train_stats

There is extra information as well, but the mean and standard deviation are also there.

This norm function takes data and normalizes it using the mean and standard deviation we computed in the previous step:

def norm(x):
    return (x - train_stats['mean']) / train_stats['std']

Let’s normalize train, test, and validation data:

train_x = norm(train)
test_x = norm(test)
val_x = norm(val)

For this exercise, the price of the automobile will be used as the target variable and the rest of the variables as the training features.

train_x = train_x.drop(columns='price')
test_x = test_x.drop(columns='price')
val_x = val_x.drop(columns='price')
train_y = train['price']
test_y = test['price']
val_y = val['price']

Training and target variables are ready.

Custom Loss and Custom Layer

Let’s start with a loss function we all know: the root mean squared error. We will define it as a function and pass that function while compiling the model.

def rmse(y_true, y_pred):
    return K.sqrt(K.mean(K.square(y_pred - y_true)))

Looks very familiar, right? Let’s keep this function in hand to use later. There are many other kinds of loss functions you can try.

Now, moving on to the custom layer. For this, we will use the simple linear formula Y = WX + B. This formula requires weights, which are the coefficients of X, and a bias (denoted as ‘B’ in the formula). I will explain in more detail after you see the code:

class SimpleLinear(Layer):

    def __init__(self, units=64, activation=None):
        super(SimpleLinear, self).__init__()
        self.units = units
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        # Keras calls build automatically with the input shape
        # the first time the layer sees data.
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(name="kernel",
                             initial_value=w_init(shape=(input_shape[-1], self.units),
                                                  dtype='float32'),
                             trainable=True)
        b_init = tf.zeros_initializer()
        self.b = tf.Variable(name="bias",
                             initial_value=b_init(shape=(self.units,), dtype='float32'),
                             trainable=True)

    def call(self, inputs):
        # y = wx + b, optionally passed through the activation
        return self.activation(tf.matmul(inputs, self.w) + self.b)

In the code above, we start by passing units and activation as parameters. The default for units is 64, which means 64 neurons; we will specify different numbers of neurons in the model. The default activation is None, which Keras resolves to a linear, pass-through activation; we will pass an activation in the model as well.

In the build method above, we initialize the weights and biases: weights are initialized as random numbers and biases as zeros. Note that this method must be named build so that Keras calls it automatically with the input shape.

In the call function, we multiply our inputs and weights using matrix multiplication (the matmul method does matrix multiplication) and add the bias to it (remember the formula wx + b).

This is the most basic one. Please feel free to try some nonlinear layers, maybe a quadratic or cubic formula, as sketched below.

Model Development

Model development is the simpler part. We have 24 variables as training features, so the input shape is (24,). Here is the complete model:

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(24,)),
    SimpleLinear(512, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    SimpleLinear(256, activation='relu'),
    SimpleLinear(128, activation='relu'),
    tf.keras.layers.Dense(1, activation='relu')
])

As you can see, we simply used the SimpleLinear layer we defined earlier as the hidden layers. 512, 256, and 128 are the units, and the activation is ‘relu’.

It is also possible to use a custom activation method, which we will cover in the next part.

Let’s compile the model and use the loss function ‘rmse’ we defined earlier:

model.compile(optimizer='adam',
              loss=rmse,
              metrics=[tf.keras.metrics.RootMeanSquaredError()])
h = model.fit(train_x, train_y, epochs=3)
model.evaluate(val_x, val_y)

Output:

Epoch 1/3
4/4 [==============================] - 0s 3ms/step - loss: 13684.0762 - root_mean_squared_error: 13726.8496
Epoch 2/3
4/4 [==============================] - 0s 3ms/step - loss: 13669.2314 - root_mean_squared_error: 13726.8496
Epoch 3/3
4/4 [==============================] - 0s 3ms/step - loss: 13537.3682 - root_mean_squared_error: 13726.8496

In the next part, we will experiment with some custom activation functions.

Custom Activation Function

I will explain two ways to use a custom activation function here. The first one is to use a Lambda layer, which defines the function right in the layer.

For example, in the following model, the Lambda layer takes the output from the SimpleLinear layer and applies the absolute value to it, so we do not get any negatives.

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(24,)),
    SimpleLinear(512),
    tf.keras.layers.Lambda(lambda x: tf.abs(x)),
    tf.keras.layers.Dropout(0.2),
    SimpleLinear(256),
    tf.keras.layers.Lambda(lambda x: tf.abs(x)),
    tf.keras.layers.Dense(1),
    tf.keras.layers.Lambda(lambda x: tf.abs(x)),
])

Please feel free to try other kinds of operations in the Lambda layer.

You do not have to define the operation in the Lambda layer itself. It can be defined in a function and passed to the Lambda layer.

Here is a function that takes data and squares it:

def active1(x):
    return x**2

Now, this function can be simply passed into the lambda layer like this:

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(24,)),
    SimpleLinear(512),
    tf.keras.layers.Lambda(active1),
    tf.keras.layers.Dropout(0.2),
    SimpleLinear(256),
    tf.keras.layers.Lambda(active1),
    tf.keras.layers.Dense(1),
    tf.keras.layers.Lambda(active1),
])

There are many other functions that can be used, based on your project and your needs.

Conclusion

TensorFlow can be very dynamic to use. There are so many different ways it can be adapted. In this article, I wanted to share some of the methods to make TensorFlow more flexible for you. I hope it is helpful and that you try them in your own projects.

Feel free to follow me on Twitter and like my Facebook page.
