Building models in Keras is straightforward. If you’re reading this, you’re likely familiar with the Sequential model and with stacking layers together to form simple models. But what if you want to do something more complicated?

Enter the functional API. For complex models the functional API is really the only way to go – it can do things that just aren’t possible with the Sequential model. Models with multiple inputs and outputs, models with shared layers – once you start designing architectures that need these things, you will have to use the functional API to build your model.

But using the functional API is tricky – at least, I found it tricky. What I found helpful was sitting down and coding up some examples using the functional API – just simple examples, but enough to get going.

In this post we’ll run through five of these examples. The post assumes some exposure to Keras and, ideally, some prior exposure to the functional API itself. There are better resources than this for the basics of the functional API – what follows is just examples.

Creating some sample data

First we’ll need to set up some data to use for our examples. We’ll use numpy to help us with this. Normally I like to use pandas for these kinds of tasks, but in my experience pandas DataFrames don’t integrate well with Keras and you can get some strange errors.

We’ll create two datasets: a training dataset and a test dataset. Normally we’d hold out a validation set as well, but for example purposes it’s okay to just have a test set.

The toy data will have three predictor variables (x_1, x_2 and x_3) and two response variables (y_{classifier} and y_{cts}), where cts stands for continuous. We want two response variables because we’ll be building a number of models, some that predict discrete outcomes (like logistic regression) and some that predict continuous outcomes (like linear regression).

For our classifier variable y_{classifier}, we’ll set it equal to 1 when x_1 + x_2 + \frac{x_3}{3} + \epsilon > 1, where \epsilon \sim N(0,1), and y_{classifier} = 0 otherwise.

For our continuous variable y_{cts}, we’ll set y_{cts} = x_1 + x_2 + \frac{x_3}{3} + \epsilon. This way both variables will look similar, and in the end we’ll have some data that isn’t linearly separable, but is good enough to train models on.
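Here is a rough sketch of how data like this could be generated with numpy. The sample size, the uniform distribution of the predictors, the fixed seed, the variable names, and the choice to share one noise draw between the two responses are all my assumptions for illustration; only the formulas for y_{classifier} and y_{cts} come from the description above.

```python
import numpy as np

# Assumptions for illustration: 1,000 rows, predictors drawn uniformly
# from [0, 1), and a fixed seed so the example is repeatable.
rng = np.random.default_rng(0)
n = 1000

X = rng.uniform(0.0, 1.0, size=(n, 3))   # columns are x_1, x_2, x_3
eps = rng.normal(0.0, 1.0, size=n)       # epsilon ~ N(0, 1)

# Noisy linear signal: x_1 + x_2 + x_3 / 3 + epsilon
signal = X[:, 0] + X[:, 1] + X[:, 2] / 3.0 + eps

y_cts = signal                                   # continuous response
y_classifier = (signal > 1).astype("float32")    # 1 where the signal exceeds 1, else 0
```

Because the noise has the same scale as the signal itself, the two classes overlap heavily, which is why no model will separate them perfectly.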

Let’s take a quick look at our data. Here are the first two dimensions plotted, with colour indicating the value of y_{classifier}:

Here’s the same graph for y_{cts}:

 

Last thing we’re going to do is split up our data into a training and test set so that we can check model performance as we train it:
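A minimal sketch of one way to do the split with plain numpy. The 80/20 ratio, the seed, and all variable names are assumptions, and the data is regenerated inline only so that the snippet runs on its own.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in data with the same layout as the toy dataset
# (three predictors, one binary response, one continuous response).
n = 1000
X = rng.uniform(size=(n, 3))
signal = X[:, 0] + X[:, 1] + X[:, 2] / 3.0 + rng.normal(size=n)
y_classifier = (signal > 1).astype("float32")
y_cts = signal

# Shuffle the row indices, then hold out the last 20% as the test set.
idx = rng.permutation(n)
n_train = int(0.8 * n)
train_idx, test_idx = idx[:n_train], idx[n_train:]

X_train, X_test = X[train_idx], X[test_idx]
y_classifier_train, y_classifier_test = y_classifier[train_idx], y_classifier[test_idx]
y_cts_train, y_cts_test = y_cts[train_idx], y_cts[test_idx]
```

Indexing every array with the same train_idx and test_idx keeps the predictors and both responses aligned row for row.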

Our data is ready – let’s train some Keras models.

Example 1 – Logistic Regression

Our first example is building logistic regression using the Keras functional API. It’s quite easy and straightforward once you know some key frustration points:

  • The input layer needs to have shape (p,) where p is the number of columns in your training matrix. In our case we have three columns (x_1, x_2, x_3) so we set the shape to (3,)
  • The final dense layer needs as many neurons as the output has dimensions. In our case we’re predicting a single binary value (0 or 1) for each row, which has 1 dimension, so our dense layer needs to have one neuron.
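Putting those two points together, a logistic regression in the functional API might look roughly like the sketch below. The optimizer, the number of epochs, the batch size, and the variable names are assumptions, and the data is a stand-in generated inline so the snippet runs by itself.

```python
import numpy as np
from tensorflow import keras

# Stand-in training data (assumed layout: three predictors, binary response).
rng = np.random.default_rng(0)
X_train = rng.uniform(size=(800, 3)).astype("float32")
signal = X_train[:, 0] + X_train[:, 1] + X_train[:, 2] / 3.0 + rng.normal(size=800)
y_train = (signal > 1).astype("float32")

# Input shape is (p,) with p = 3 columns; the output is one sigmoid neuron.
inputs = keras.Input(shape=(3,))
outputs = keras.layers.Dense(1, activation="sigmoid")(inputs)
logistic_model = keras.Model(inputs=inputs, outputs=outputs)

logistic_model.compile(optimizer="adam",
                       loss="binary_crossentropy",
                       metrics=["accuracy"])
logistic_model.fit(X_train, y_train, epochs=5, batch_size=32, verbose=0)

# The fitted coefficients live in the dense layer: a (3, 1) kernel and a (1,) bias.
weights, bias = logistic_model.layers[-1].get_weights()
```

A single dense layer with a sigmoid activation is exactly a logistic regression, so the kernel plays the role of the regression coefficients and the bias plays the role of the intercept.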

We’ll use our trick from before to see the rough figure for what this eventually converges to:

If we look at our weights we can see that our logistic model has roughly the right idea about what they should be:

It’s a bit much to expect perfect performance given the amount of noise in the data. This is a pretty good result.

Example 2 – Linear Regression

Here we’ll construct a basic linear regression with y_{cts} as the dependent variable and x_1, x_2 and x_3 as the predictor variables (remember, these are stored inside the numpy array dat). You’ll see that the code for linear regression looks a lot like that for logistic regression above – we just use a linear activation function instead of a sigmoid one.
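As a sketch of what that might look like, the snippet below fits a one-neuron linear model and computes the residuals. The mean squared error loss, the optimizer, the epoch count, and the names are assumptions, and the data is a stand-in generated inline rather than the dat array from this post.

```python
import numpy as np
from tensorflow import keras

# Stand-in training data (assumed layout: three predictors, continuous response).
rng = np.random.default_rng(0)
X_train = rng.uniform(size=(800, 3)).astype("float32")
y_train = (X_train[:, 0] + X_train[:, 1] + X_train[:, 2] / 3.0
           + rng.normal(size=800)).astype("float32")

# Same structure as a logistic regression, but with a linear activation
# and a regression loss.
inputs = keras.Input(shape=(3,))
outputs = keras.layers.Dense(1, activation="linear")(inputs)
linear_model = keras.Model(inputs=inputs, outputs=outputs)

linear_model.compile(optimizer="adam", loss="mse")
linear_model.fit(X_train, y_train, epochs=5, batch_size=32, verbose=0)

# Residuals (actual minus predicted) are what we would inspect for patterns.
preds = linear_model.predict(X_train, verbose=0).flatten()
residuals = y_train - preds
```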

One way to see if our model is a good fit or not is by looking at some of the errors and seeing if there is a pattern.

Doesn’t look like it to me!

Example 3 – Simple Neural Network

Below is how to implement a simple neural network with one hidden layer. Again, it’s pretty straightforward.
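Here is a minimal sketch of such a network. The hidden layer width of 8, the ReLU activation, the training settings, and the make_data helper are all assumptions for illustration, and the data is a stand-in generated inline.

```python
import numpy as np
from tensorflow import keras

def make_data(n, rng):
    """Hypothetical helper: stand-in predictors and binary response."""
    X = rng.uniform(size=(n, 3)).astype("float32")
    signal = X[:, 0] + X[:, 1] + X[:, 2] / 3.0 + rng.normal(size=n)
    return X, (signal > 1).astype("float32")

rng = np.random.default_rng(0)
X_train, y_train = make_data(800, rng)
X_test, y_test = make_data(200, rng)

# One hidden layer sits between the input and the sigmoid output.
inputs = keras.Input(shape=(3,))
hidden = keras.layers.Dense(8, activation="relu")(inputs)
outputs = keras.layers.Dense(1, activation="sigmoid")(hidden)
nn_model = keras.Model(inputs=inputs, outputs=outputs)

nn_model.compile(optimizer="adam",
                 loss="binary_crossentropy",
                 metrics=["accuracy"])
nn_model.fit(X_train, y_train, epochs=5, batch_size=32, verbose=0)

# evaluate returns the loss followed by each compiled metric.
test_loss, test_accuracy = nn_model.evaluate(X_test, y_test, verbose=0)
```

Each layer is called on the tensor produced by the previous one, which is the core pattern of the functional API.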

We get up to about 84% accuracy on the test set with this model.

Example 4 – Deep Neural Network

Below we train a neural network with a large number of hidden layers. We also add Dropout to the layers to reduce overfitting.

It can sometimes be useful to use for loops and if statements when using the functional API, particularly for complicated models.
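The sketch below shows one way a for loop and an if statement can be used to stack layers. The number of hidden layers, their width, the dropout rate, and the decision to skip dropout after the last hidden layer are all assumptions chosen for illustration; the data is a stand-in generated inline.

```python
import numpy as np
from tensorflow import keras

# Stand-in training data (assumed layout: three predictors, binary response).
rng = np.random.default_rng(0)
X_train = rng.uniform(size=(800, 3)).astype("float32")
signal = X_train[:, 0] + X_train[:, 1] + X_train[:, 2] / 3.0 + rng.normal(size=800)
y_train = (signal > 1).astype("float32")

n_hidden_layers = 5   # assumed depth
dropout_rate = 0.2    # assumed dropout rate

inputs = keras.Input(shape=(3,))
x = inputs
for i in range(n_hidden_layers):
    # Each pass through the loop adds one more hidden layer on top of x.
    x = keras.layers.Dense(16, activation="relu")(x)
    # Add dropout after every hidden layer except the last one.
    if i < n_hidden_layers - 1:
        x = keras.layers.Dropout(dropout_rate)(x)
outputs = keras.layers.Dense(1, activation="sigmoid")(x)

deep_model = keras.Model(inputs=inputs, outputs=outputs)
deep_model.compile(optimizer="adam",
                   loss="binary_crossentropy",
                   metrics=["accuracy"])
deep_model.fit(X_train, y_train, epochs=5, batch_size=32, verbose=0)
```

Because layers are just functions applied to tensors, ordinary Python control flow is enough to change the depth or the dropout pattern without rewriting the model.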

This model does about as well as the previous neural network. It could probably do better by tuning the hyperparameters, like the amount of dropout or the number of neural network layers.

Example 5 – Simple Neural Network + Metadata

One use case for the functional API is where you have multiple data sources that you want to pull together into one model.

For example, if your task is image classification you could use the Sequential model to build a convolutional neural network that would run over the images. If you decide to use the functional API instead of the Sequential model, you can also include metadata about each image in your model: perhaps its size, date created, or its tagged location.

We’ll demonstrate this technique of combining multiple data sources by creating two vectors of “metadata” for use with our contrived dataset. It would be nice if they were helpful, so we’ll ‘cheat’ and make each of them equal to y_{classifier} plus some random noise.

Our model architecture below will have two hidden layers with the metadata added after the first of them.

We’ll need the concatenate layer to merge the two data sources together.

Let’s build the model now. Note how we have two input layers: one for the original data and one for the metadata. Each set of data goes through its own dense layer and dropout layer. The two branches are then combined with the concatenate layer and pass through another dense and dropout layer before a final dense layer gives the output values.

When fitting this model remember to supply two sets of data, not just one.
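A minimal sketch of the whole thing, from the two inputs through to fitting, is below. The layer widths, the dropout rate, the noise scale on the metadata, the input names, and the training settings are assumptions; the data is a stand-in generated inline.

```python
import numpy as np
from tensorflow import keras

# Stand-in training data (assumed layout: three predictors, binary response).
rng = np.random.default_rng(0)
n = 800
X_train = rng.uniform(size=(n, 3)).astype("float32")
signal = X_train[:, 0] + X_train[:, 1] + X_train[:, 2] / 3.0 + rng.normal(size=n)
y_train = (signal > 1).astype("float32")

# Two "metadata" columns: the response plus noise (assumed noise scale 0.5).
metadata_train = (y_train[:, None]
                  + rng.normal(scale=0.5, size=(n, 2))).astype("float32")

# One input layer per data source.
data_input = keras.Input(shape=(3,), name="data")
meta_input = keras.Input(shape=(2,), name="metadata")

# Each source gets its own dense + dropout branch.
x = keras.layers.Dense(16, activation="relu")(data_input)
x = keras.layers.Dropout(0.2)(x)
m = keras.layers.Dense(4, activation="relu")(meta_input)
m = keras.layers.Dropout(0.2)(m)

# Merge the branches, then continue as a single network.
merged = keras.layers.concatenate([x, m])
merged = keras.layers.Dense(16, activation="relu")(merged)
merged = keras.layers.Dropout(0.2)(merged)
outputs = keras.layers.Dense(1, activation="sigmoid")(merged)

meta_model = keras.Model(inputs=[data_input, meta_input], outputs=outputs)
meta_model.compile(optimizer="adam",
                   loss="binary_crossentropy",
                   metrics=["accuracy"])

# Both fit and predict take a list with one array per input, in input order.
meta_model.fit([X_train, metadata_train], y_train,
               epochs=5, batch_size=32, verbose=0)
preds = meta_model.predict([X_train[:4], metadata_train[:4]], verbose=0)
```

The order of the arrays in the list has to match the order of the inputs passed to keras.Model, so here the original data comes first and the metadata second.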

This model gives 95% accuracy on the test set! It’s clear that adding information here from another source helped our model to make good predictions.

Summary

At this point you should hopefully know enough about the functional API to be able to start to decipher the complicated models that exist out there. The functional API can also do much more than what is described in this post.

If you read this and then make something cool, I’d love it if you dropped me a comment and showed it off! Thanks for reading.
