Time Series Prediction with deep learning in Keras

Time series prediction is a tough problem both to frame and to tackle within machine learning.

In this blog article by AICorespot, you will find out how to develop neural network models for time series prediction in Python leveraging the Keras deep learning library.

After going through this post, you will know:

  • The airline passengers univariate time series prediction problem.
  • How to phrase time series prediction as a regression problem and develop a neural network model for it.
  • How to frame time series prediction with a time lag and develop a neural network model for it.

Problem Description

The problem we are taking a look at in this post is the international airline passengers prediction problem.

Given a year and a month, the task is to forecast the number of international airline passengers in units of 1,000. The data ranges from January 1949 to December 1960, or 12 years, with 144 observations.

Download the dataset

Below is a sample of the first few lines of the file.

"Month","Passengers"
"1949-01",112
"1949-02",118
"1949-03",132
"1949-04",129

We can load this dataset easily with the Pandas library. We are not concerned with the date, since every observation is separated by the same one-month interval. Thus, when we load the dataset we can exclude the first column.

Once loaded, we can easily plot the entire dataset. The code to load and plot the dataset is detailed below.

import pandas
import matplotlib.pyplot as plt
# usecols=[1] keeps only the passenger counts and skips the Month column
dataset = pandas.read_csv('airline-passengers.csv', usecols=[1], engine='python')
plt.plot(dataset)
plt.show()

You can observe an upward trend in the plot.

You can additionally observe some periodicity to the dataset that likely corresponds to the northern hemisphere summer holiday period.

We are going to keep things simple and operate with the data as-is.

Typically, it is a good idea to look into several data prep strategies to rescale the data and make it stationary.
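As a rough illustration of the kind of data preparation meant here (it is not used in the rest of this post), the sketch below applies first-order differencing and min-max scaling to the first eight observations shown in this post. The helper names difference and min_max_scale are made up for this example.

```python
# Illustrative only: two common preparation steps, not part of this post's pipeline.
# The helper names difference and min_max_scale are hypothetical.

def difference(series, lag=1):
    """Return series[t] - series[t - lag]; a steady trend becomes a roughly constant level."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

def min_max_scale(series):
    """Rescale values linearly so the minimum maps to 0.0 and the maximum to 1.0."""
    lo, hi = min(series), max(series)
    return [(v - lo) / (hi - lo) for v in series]

passengers = [112, 118, 132, 129, 121, 135, 148, 148]  # first 8 monthly values

diffed = difference(passengers)
scaled = min_max_scale(passengers)
print(diffed)
print([round(v, 3) for v in scaled])
```

If a model is trained on differenced or scaled values, its forecasts have to be transformed back before errors can be reported in thousands of passengers.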

Multilayer Perceptron Regression

We wish to phrase the time series prediction problem as a regression problem.

That is, given the number of passengers (in thousands) this month, what is the number of passengers next month?

We can write a simple function to convert our single column of data into a two-column dataset: the first column contains this month’s (t) passenger count and the second column contains next month’s (t+1) passenger count, to be forecasted.

Before getting started, let’s import all of the functions and classes we intend to use. This assumes a working SciPy environment with the Keras deep learning library installed.

import math
import numpy
import matplotlib.pyplot as plt
import pandas
from keras.models import Sequential
from keras.layers import Dense

We can also use the code from the prior section to load the dataset as a Pandas dataframe. We can then extract the NumPy array from the dataframe and convert the integer values to floating point values, which are more suitable for modelling with a neural network.

# load the dataset
dataframe = pandas.read_csv('airline-passengers.csv', usecols=[1], engine='python')
dataset = dataframe.values
dataset = dataset.astype('float32')

After we fit a model and estimate its skill on the training dataset, we need an idea of the skill of the model on new, unseen data. For a normal classification or regression problem we would do this using cross validation.

With time series data, the sequence of values is critical. A simple strategy is to split the ordered dataset into train and test datasets. The code below calculates the index of the split point and separates the data into a training dataset with 67% of the observations, which we can use to train our model, leaving the remaining 33% for evaluating the model.

# split into train and test sets
train_size = int(len(dataset) * 0.67)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size,:], dataset[train_size:len(dataset),:]
print(len(train), len(test))
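For the 144 observations in this dataset, the split arithmetic can be checked without loading the file. The sketch below uses a plain list of 144 placeholder values (an assumption made here so the example is self-contained) in place of the real array.

```python
# Stand-in for the 144 monthly observations; only the length and order matter here.
observations = list(range(144))

train_size = int(len(observations) * 0.67)   # int() truncates 96.48 down to 96
test_size = len(observations) - train_size

# Slicing keeps the original order: the earliest part trains, the rest tests.
train, test = observations[:train_size], observations[train_size:]
print(len(train), len(test))
```

Because the split is by position rather than at random, every test observation comes later in time than every training observation.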

 

Now we can define a function to create a new dataset as described above. The function takes two arguments: the dataset, which is a NumPy array that we wish to convert into a dataset, and look_back, which is the number of prior time steps to use as input variables to forecast the next time period, in this case defaulted to 1.

This default will create a dataset where X is the number of passengers at a given time (t) and Y is the number of passengers at the next time (t+1).

It can be configured, and we will look at constructing a differently shaped dataset in the next section.

# convert an array of values into a dataset matrix
def create_dataset(dataset, look_back=1):
    dataX, dataY = [], []
    # note: the extra -1 leaves the final value of the array unused as a target
    for i in range(len(dataset)-look_back-1):
        a = dataset[i:(i+look_back), 0]   # look_back inputs ending at time t
        dataX.append(a)
        dataY.append(dataset[i + look_back, 0])   # value at time t+1
    return numpy.array(dataX), numpy.array(dataY)

Let’s take a peek at the impact of this function on the first few rows of the dataset.

X       Y
112     118
118     132
132     129
129     121
121     135

If you contrast these first five rows to the original dataset sample listed in the prior section, you can observe the X=t and Y=t+1 pattern in the numbers.
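The table can be reproduced without the CSV file by running the same create_dataset() logic on the first seven passenger counts that appear in this post. The sample array below is typed in by hand for this sketch and shaped as a single column, like dataframe.values.

```python
import numpy

# Same create_dataset logic as in this post.
def create_dataset(dataset, look_back=1):
    dataX, dataY = [], []
    for i in range(len(dataset) - look_back - 1):
        dataX.append(dataset[i:(i + look_back), 0])
        dataY.append(dataset[i + look_back, 0])
    return numpy.array(dataX), numpy.array(dataY)

# First seven passenger counts as one column (hand-typed sample, not read from the file).
sample = numpy.array([112, 118, 132, 129, 121, 135, 148], dtype="float32").reshape(-1, 1)

X, Y = create_dataset(sample, look_back=1)
for x, y in zip(X[:, 0], Y):
    print(int(x), int(y))
```

Seven values give five rows rather than six, because the extra -1 in the loop range stops one step before the end of the array.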

Let’s use this function to prepare the train and test datasets for modelling.

# reshape into X=t and Y=t+1
look_back = 1
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)

We can now fit a Multilayer Perceptron model to the training data.

We use a simple network with 1 input, 1 hidden layer with 8 neurons, and an output layer. The model is fitted using mean squared error, whose square root gives us an error score in the units of the dataset.

We tried a few rough parameters and settled on the configuration below, but the network is by no means optimized.

# create and fit Multilayer Perceptron model
model = Sequential()
model.add(Dense(8, input_dim=look_back, activation='relu'))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=200, batch_size=2, verbose=2)
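To make the shape of this network concrete, the following NumPy sketch computes what a 1-8-1 network of this kind does for a small batch of inputs and counts its parameters. The weights are random placeholders rather than trained Keras weights, and the function name forward is made up for this example.

```python
import numpy

rng = numpy.random.default_rng(0)

# Random placeholder weights with the shapes of Dense(8) and Dense(1) when look_back=1.
W1, b1 = rng.normal(size=(1, 8)), numpy.zeros(8)   # input -> hidden layer
W2, b2 = rng.normal(size=(8, 1)), numpy.zeros(1)   # hidden layer -> output

def forward(x):
    """x has shape (samples, 1); returns predictions with shape (samples, 1)."""
    hidden = numpy.maximum(0.0, x @ W1 + b1)   # ReLU activation
    return hidden @ W2 + b2                    # linear output, as used for regression

batch = numpy.array([[112.0], [118.0], [132.0]])
predictions = forward(batch)
n_params = W1.size + b1.size + W2.size + b2.size
print(predictions.shape, n_params)
```

The count comes to 25 trainable values (8 weights and 8 biases in the hidden layer, 8 weights and 1 bias in the output), which is a very small model by deep learning standards.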

 

After the model is fitted, we can estimate its performance on the train and test datasets. This will give us a point of comparison for new models.

# Estimate model performance (math.sqrt needs `import math` alongside the other imports)
trainScore = model.evaluate(trainX, trainY, verbose=0)
print('Train Score: %.2f MSE (%.2f RMSE)' % (trainScore, math.sqrt(trainScore)))
testScore = model.evaluate(testX, testY, verbose=0)
print('Test Score: %.2f MSE (%.2f RMSE)' % (testScore, math.sqrt(testScore)))

Lastly, we can generate forecasts using the model for both the train and test datasets to get a visual indication of the skill of the model.

Because of how the dataset was prepared, we must shift the forecasts so that they align on the x-axis with the original dataset. Once prepared, the data is plotted, showing the original dataset in blue, the forecasts for the train dataset in green, and the predictions on the unseen test dataset in red (with Matplotlib 2.0 and later, the default colours are blue, orange, and green instead).

# generate predictions for training
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)
# shift train predictions for plotting
trainPredictPlot = numpy.empty_like(dataset)
trainPredictPlot[:, :] = numpy.nan
trainPredictPlot[look_back:len(trainPredict)+look_back, :] = trainPredict
# shift test predictions for plotting
testPredictPlot = numpy.empty_like(dataset)
testPredictPlot[:, :] = numpy.nan
testPredictPlot[len(trainPredict)+(look_back*2)+1:len(dataset)-1, :] = testPredict
# plot baseline and predictions
plt.plot(dataset)
plt.plot(trainPredictPlot)
plt.plot(testPredictPlot)
plt.show()
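The slice bounds used for shifting are easy to get wrong, so it can help to check them in isolation. The sketch below recomputes the same bounds from the dataset length, the split point, and look_back, and confirms that each slice is exactly as long as the predictions written into it. The function name plot_slices is made up for this example; the sizes 144 and 96 are the dataset length and train size from this post.

```python
def plot_slices(n_total, train_size, look_back):
    """Return (train_start, train_end, test_start, test_end) index bounds
    matching the slicing used for trainPredictPlot and testPredictPlot."""
    n_train_pred = train_size - look_back - 1              # rows from create_dataset(train)
    n_test_pred = (n_total - train_size) - look_back - 1   # rows from create_dataset(test)
    train_start, train_end = look_back, n_train_pred + look_back
    test_start, test_end = n_train_pred + (look_back * 2) + 1, n_total - 1
    # Each slice must be exactly as long as the predictions assigned to it.
    assert train_end - train_start == n_train_pred
    assert test_end - test_start == n_test_pred
    return train_start, train_end, test_start, test_end

print(plot_slices(144, 96, look_back=1))
print(plot_slices(144, 96, look_back=3))
```

In both cases the test slice starts at train_size + look_back, which is the position of the first value in the test set that has a full window of inputs before it.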

 

Connecting this all together, the complete example is listed below.

# Multilayer Perceptron to Predict International Airline Passengers (t+1, given t)
import numpy
import matplotlib.pyplot as plt
from pandas import read_csv
import math
from keras.models import Sequential
from keras.layers import Dense

# convert an array of values into a dataset matrix
def create_dataset(dataset, look_back=1):
    dataX, dataY = [], []
    for i in range(len(dataset)-look_back-1):
        a = dataset[i:(i+look_back), 0]
        dataX.append(a)
        dataY.append(dataset[i + look_back, 0])
    return numpy.array(dataX), numpy.array(dataY)

# load the dataset
dataframe = read_csv('airline-passengers.csv', usecols=[1], engine='python')
dataset = dataframe.values
dataset = dataset.astype('float32')
# split into train and test sets
train_size = int(len(dataset) * 0.67)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size,:], dataset[train_size:len(dataset),:]
# reshape into X=t and Y=t+1
look_back = 1
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)
# create and fit Multilayer Perceptron model
model = Sequential()
model.add(Dense(8, input_dim=look_back, activation='relu'))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=200, batch_size=2, verbose=2)
# Estimate model performance
trainScore = model.evaluate(trainX, trainY, verbose=0)
print('Train Score: %.2f MSE (%.2f RMSE)' % (trainScore, math.sqrt(trainScore)))
testScore = model.evaluate(testX, testY, verbose=0)
print('Test Score: %.2f MSE (%.2f RMSE)' % (testScore, math.sqrt(testScore)))
# generate predictions for training
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)
# shift train predictions for plotting
trainPredictPlot = numpy.empty_like(dataset)
trainPredictPlot[:, :] = numpy.nan
trainPredictPlot[look_back:len(trainPredict)+look_back, :] = trainPredict
# shift test predictions for plotting
testPredictPlot = numpy.empty_like(dataset)
testPredictPlot[:, :] = numpy.nan
testPredictPlot[len(trainPredict)+(look_back*2)+1:len(dataset)-1, :] = testPredict
# plot baseline and predictions
plt.plot(dataset)
plt.plot(trainPredictPlot)
plt.plot(testPredictPlot)
plt.show()

Running the example reports model performance.

Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.

Taking the square root of the performance estimates, we can see that the model has an average error of about 23 thousand passengers on the training dataset and about 48 thousand passengers on the test dataset.

Epoch 195/200
0s - loss: 535.3075
Epoch 196/200
0s - loss: 551.2694
Epoch 197/200
0s - loss: 543.7834
Epoch 198/200
0s - loss: 538.5886
Epoch 199/200
0s - loss: 539.1434
Epoch 200/200
0s - loss: 533.8347
Train Score: 531.71 MSE (23.06 RMSE)
Test Score: 2355.06 MSE (48.53 RMSE)
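The relationship between the two numbers on each score line is just a square root. As a quick check, the sketch below recomputes the RMSE values from the MSE values reported in the sample run above (the figures are copied from that output, not freshly computed from a model).

```python
import math

# MSE values copied from the sample run reported in this post.
train_mse = 531.71
test_mse = 2355.06

# RMSE is the square root of MSE, so it is expressed in thousands of passengers,
# the same units as the dataset.
train_rmse = math.sqrt(train_mse)
test_rmse = math.sqrt(test_mse)
print("Train RMSE: %.2f" % train_rmse)
print("Test RMSE: %.2f" % test_rmse)
```

Because squaring weights large errors more heavily, RMSE is at least as large as the mean absolute error, so it is best read as a typical error size rather than a strict average.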

 

From the plot, we can observe that the model did a pretty weak job of fitting both the training and the test datasets. It essentially forecasted the same input value as the output.
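A model that simply repeats its input is known as a persistence (or naive) forecast, and it makes a useful baseline to compare any trained model against. The sketch below scores such a forecast on the first eight observations shown in this post. The function name persistence_rmse is made up for this example, and the number it prints describes this tiny sample only, not the test set.

```python
import math

def persistence_rmse(series):
    """RMSE of predicting each value with the value one step earlier."""
    errors = [actual - previous for previous, actual in zip(series[:-1], series[1:])]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

passengers = [112, 118, 132, 129, 121, 135, 148, 148]  # first 8 monthly values
baseline = persistence_rmse(passengers)
print("Persistence RMSE on the sample: %.2f" % baseline)
```

Running the same function on the full test split would give a reference score: a network that cannot beat it has learned little more than to copy its input.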

Multilayer Perceptron Leveraging the Window Method

We can also phrase the problem so that several recent time steps can be leveraged to make the forecast for the next time step.

This is referred to as the window strategy, and the size of the window is a parameter that can be tuned for every problem.

For instance, given the current time (t), to forecast the value at the next time in the sequence (t+1) we can use the current time (t) as well as the two prior times (t-1 and t-2).

When phrased as a regression problem, the input variables are t-2, t-1, t and the output variable is t+1.

The create_dataset() function we wrote in the prior section allows us to create this formulation of the time series problem by increasing the look_back argument from 1 to 3.

A sample of the dataset with this formulation looks as follows:

X1      X2      X3      Y
112     118     132     129
118     132     129     121
132     129     121     135
129     121     135     148
121     135     148     148
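The same windowed layout can also be built with NumPy's sliding_window_view, shown here as an alternative sketch and not as the method used in this post. It assumes NumPy 1.20 or later and uses the first eight passenger counts from the tables above.

```python
import numpy
from numpy.lib.stride_tricks import sliding_window_view  # available from NumPy 1.20

series = numpy.array([112, 118, 132, 129, 121, 135, 148, 148], dtype="float32")
look_back = 3

# Each window holds look_back inputs followed by the value to predict.
windows = sliding_window_view(series, look_back + 1)
X, Y = windows[:, :-1], windows[:, -1]

for x, y in zip(X, Y):
    print([int(v) for v in x], int(y))
```

Unlike create_dataset(), this approach keeps the final available pair, so eight values give five rows here where create_dataset() with look_back=3 would return four.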

 

We can re-run the example from the prior section with the larger window size. We will also increase the network capacity to handle the additional information: the first hidden layer is increased to 12 neurons and a second hidden layer with 8 neurons is added. The number of epochs is also increased to 400.

The whole code listing with these changes is provided below for completeness.

# Multilayer Perceptron to Predict International Airline Passengers (t+1, given t, t-1, t-2)
import numpy
import matplotlib.pyplot as plt
from pandas import read_csv
import math
from keras.models import Sequential
from keras.layers import Dense

# convert an array of values into a dataset matrix
def create_dataset(dataset, look_back=1):
    dataX, dataY = [], []
    for i in range(len(dataset)-look_back-1):
        a = dataset[i:(i+look_back), 0]
        dataX.append(a)
        dataY.append(dataset[i + look_back, 0])
    return numpy.array(dataX), numpy.array(dataY)

# load the dataset
dataframe = read_csv('airline-passengers.csv', usecols=[1], engine='python')
dataset = dataframe.values
dataset = dataset.astype('float32')
# split into train and test sets
train_size = int(len(dataset) * 0.67)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size,:], dataset[train_size:len(dataset),:]
# reshape dataset
look_back = 3
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)
# create and fit Multilayer Perceptron model
model = Sequential()
model.add(Dense(12, input_dim=look_back, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=400, batch_size=2, verbose=2)
# Estimate model performance
trainScore = model.evaluate(trainX, trainY, verbose=0)
print('Train Score: %.2f MSE (%.2f RMSE)' % (trainScore, math.sqrt(trainScore)))
testScore = model.evaluate(testX, testY, verbose=0)
print('Test Score: %.2f MSE (%.2f RMSE)' % (testScore, math.sqrt(testScore)))
# generate predictions for training
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)
# shift train predictions for plotting
trainPredictPlot = numpy.empty_like(dataset)
trainPredictPlot[:, :] = numpy.nan
trainPredictPlot[look_back:len(trainPredict)+look_back, :] = trainPredict
# shift test predictions for plotting
testPredictPlot = numpy.empty_like(dataset)
testPredictPlot[:, :] = numpy.nan
testPredictPlot[len(trainPredict)+(look_back*2)+1:len(dataset)-1, :] = testPredict
# plot baseline and predictions
plt.plot(dataset)
plt.plot(trainPredictPlot)
plt.plot(testPredictPlot)
plt.show()

Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.

Running the example produces the following output.

Epoch 395/400
0s - loss: 485.3482
Epoch 396/400
0s - loss: 479.9485
Epoch 397/400
0s - loss: 497.2707
Epoch 398/400
0s - loss: 489.5670
Epoch 399/400
0s - loss: 490.8099
Epoch 400/400
0s - loss: 493.6544
Train Score: 564.03 MSE (23.75 RMSE)
Test Score: 2244.82 MSE (47.38 RMSE)

We can see that the error was not significantly reduced compared with that of the prior section.

Looking at the graph, we can see more structure in the predictions.

Again, the window size and the network architecture were not tuned; this is only a demonstration of how to frame a prediction problem.

Taking the square root of the performance scores, we can see that the average error on the training dataset was about 24 thousand passengers per month and the average error on the unseen test set was about 47 thousand passengers per month.

Conclusion

In this blog article, you discovered how to develop a neural network model for a time series forecasting problem using the Keras deep learning library.

After going through this guide, you now know:

  • The international airline passenger prediction time series dataset.
  • How to frame time series prediction problems as regression problems and develop a neural network model.
  • How to use the window strategy to frame a time series prediction problem and develop a neural network model.