
Developing a neural net for forecasting disturbances in the Ionosphere

Developing a neural network predictive model for a new dataset can be challenging.

One approach is to first inspect the dataset and develop ideas for which models might work, then explore the learning dynamics of simple models on the dataset, and finally develop and tune a model for the dataset with a robust test harness.

This process can be used to develop effective neural network models for classification and regression predictive modelling problems.

In this guide, you will discover how to develop a multilayer perceptron neural network model for the ionosphere binary classification dataset.

After completing this guide, you will know:

  • How to load and summarize the ionosphere dataset and use the results to suggest data preparations and model configurations.
  • How to explore the learning dynamics of simple MLP models on the dataset.
  • How to develop robust estimates of model performance, tune model performance, and make predictions on new data.

Tutorial Overview

This tutorial is divided into four parts:

1] Ionosphere Binary Classification Dataset

2] Neural Network Learning Dynamics

3] Evaluating and Tuning MLP Models

4] Final Model and Make Predictions

Ionosphere Binary Classification Dataset

The first step is to define and explore the dataset.

We will be working with the standard “Ionosphere” binary classification dataset.

This dataset involves predicting whether or not a structure is in the atmosphere, given radar returns.

The first few rows of the Ionosphere dataset are shown here.

1,0,0.99539,-0.05889,0.85243,0.02306,0.83398,-0.37708,1,0.03760,0.85243,-0.17755,0.59755,-0.44945,0.60536,-0.38223,0.84356,-0.38542,0.58212,-0.32192,0.56971,-0.29674,0.36946,-0.47357,0.56811,-0.51171,0.41078,-0.46168,0.21266,-0.34090,0.42267,-0.54487,0.18641,-0.45300,g
1,0,1,-0.18829,0.93035,-0.36156,-0.10868,-0.93597,1,-0.04549,0.50874,-0.67743,0.34432,-0.69707,-0.51685,-0.97515,0.05499,-0.62237,0.33109,-1,-0.13151,-0.45300,-0.18056,-0.35734,-0.20332,-0.26569,-0.20468,-0.18401,-0.19040,-0.11593,-0.16626,-0.06288,-0.13738,-0.02447,b
1,0,1,-0.03365,1,0.00485,1,-0.12062,0.88965,0.01198,0.73082,0.05346,0.85443,0.00827,0.54591,0.00299,0.83775,-0.13644,0.75535,-0.08540,0.70887,-0.27502,0.43385,-0.12062,0.57528,-0.40220,0.58984,-0.22145,0.43100,-0.17365,0.60436,-0.24180,0.56045,-0.38238,g
1,0,1,-0.45161,1,1,0.71216,-1,0,0,0,0,0,0,-1,0.14516,0.54094,-0.39330,-1,-0.54467,-0.69975,1,0,0,1,0.90695,0.51613,1,1,-0.20099,0.25682,1,-0.32382,1,b
1,0,1,-0.02401,0.94140,0.06531,0.92106,-0.23255,0.77152,-0.16399,0.52798,-0.20275,0.56409,-0.00712,0.34395,-0.27457,0.52940,-0.21780,0.45107,-0.17813,0.05982,-0.35575,0.02309,-0.52879,0.03286,-0.65158,0.13290,-0.53206,0.02431,-0.62197,-0.05707,-0.59573,-0.04608,-0.65697,g

 

We can see that the values are all numeric and appear to lie in the range [-1, 1]. This suggests that some type of scaling is probably not required.

We can also see that the label is a string (“g” and “b”), which means the labels will need to be encoded as 0 and 1 before fitting a model.
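These two observations can be checked by hand on a couple of rows before loading the whole file. The following is a minimal sketch that uses only the Python standard library; the two rows are copied from the sample above, and the helper name parse_row is hypothetical, introduced here only for illustration.

```python
import csv
import io

# two rows copied from the sample of the dataset shown above
sample = """1,0,0.99539,-0.05889,0.85243,0.02306,0.83398,-0.37708,1,0.03760,0.85243,-0.17755,0.59755,-0.44945,0.60536,-0.38223,0.84356,-0.38542,0.58212,-0.32192,0.56971,-0.29674,0.36946,-0.47357,0.56811,-0.51171,0.41078,-0.46168,0.21266,-0.34090,0.42267,-0.54487,0.18641,-0.45300,g
1,0,1,-0.18829,0.93035,-0.36156,-0.10868,-0.93597,1,-0.04549,0.50874,-0.67743,0.34432,-0.69707,-0.51685,-0.97515,0.05499,-0.62237,0.33109,-1,-0.13151,-0.45300,-0.18056,-0.35734,-0.20332,-0.26569,-0.20468,-0.18401,-0.19040,-0.11593,-0.16626,-0.06288,-0.13738,-0.02447,b
"""


def parse_row(fields):
    """Split one CSV record into float inputs and a string label (hypothetical helper)."""
    *inputs, label = fields
    return [float(value) for value in inputs], label


rows = [parse_row(fields) for fields in csv.reader(io.StringIO(sample))]

# report the number of inputs, whether they all lie in [-1, 1], and the label
for inputs, label in rows:
    in_range = all(-1.0 <= value <= 1.0 for value in inputs)
    print(len(inputs), in_range, label)

# collect the distinct string labels that will need encoding
labels = sorted({label for _, label in rows})
print(labels)
```

A check on two rows is only a spot check; the summary statistics later in this section cover every row.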

We can then load the dataset as a pandas DataFrame directly from the URL; for instance,

# load the ionosphere dataset and summarize the shape
from pandas import read_csv
# define the location of the dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/ionosphere.csv'
# load the dataset
df = read_csv(url, header=None)
# summarize shape
print(df.shape)

 

Running the example loads the dataset directly from the URL and reports the shape of the dataset.

In this case, we can see that the dataset has 35 variables (34 inputs and one output) and 351 rows of data.

This is not many rows of data for a neural network, which suggests that a small network, perhaps with regularization, would be appropriate.

It also suggests that k-fold cross-validation would be a good idea, given that it will give a more reliable estimate of model performance than a train/test split and that a single model will fit in seconds rather than the hours or days needed for the largest datasets.

(351, 35)

Next, we can learn more about the dataset by looking at summary statistics and a plot of the data.

# show summary statistics and plots of the ionosphere dataset
from pandas import read_csv
from matplotlib import pyplot
# define the location of the dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/ionosphere.csv'
# load the dataset
df = read_csv(url, header=None)
# show summary statistics
print(df.describe())
# plot histograms
df.hist()
pyplot.show()

 

Running the example first loads the data and then prints summary statistics for each variable.

We can see that the mean values for each variable are fractions of one, with values ranging between -1 and 1. This confirms that scaling the data is probably not required.

                0      1           2  …          31          32          33
count  351.000000  351.0  351.000000  …  351.000000  351.000000  351.000000
mean     0.891738    0.0    0.641342  …   -0.003794    0.349364    0.014480
std      0.311155    0.0    0.497708  …    0.513574    0.522663    0.468337
min      0.000000    0.0   -1.000000  …   -1.000000   -1.000000   -1.000000
25%      1.000000    0.0    0.472135  …   -0.242595    0.000000   -0.165350
50%      1.000000    0.0    0.871110  …    0.000000    0.409560    0.000000
75%      1.000000    0.0    1.000000  …    0.200120    0.813765    0.171660
max      1.000000    0.0    1.000000  …    1.000000    1.000000    1.000000

 

A histogram plot is then created for each variable.

We can see that many variables have a Gaussian or Gaussian-like distribution.

We may get some benefit from applying a power transform to each variable in order to make its probability distribution less skewed, which may improve model performance.
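The text does not say which power transform to use. As one possibility, and purely as an assumption on our part, the sketch below implements the Yeo-Johnson transform, which accepts the negative values present in this dataset. The lambda values are fixed by hand for illustration; in practice lambda is normally estimated from the data, which is what scikit-learn's PowerTransformer class does.

```python
import math


def yeo_johnson(x, lmbda):
    """Apply the Yeo-Johnson power transform to a single value."""
    if x >= 0:
        if lmbda == 0:
            return math.log(x + 1.0)
        return ((x + 1.0) ** lmbda - 1.0) / lmbda
    if lmbda == 2:
        return -math.log(-x + 1.0)
    return -(((-x + 1.0) ** (2.0 - lmbda)) - 1.0) / (2.0 - lmbda)


# values spanning the range seen in the dataset
values = [-1.0, -0.5, 0.0, 0.5, 1.0]

# hand-picked lambdas for illustration only; lambda = 1 leaves values unchanged
for lmbda in (0.0, 1.0, 2.0):
    print(lmbda, [round(yeo_johnson(v, lmbda), 4) for v in values])
```

Whether such a transform actually helps on this dataset would need to be tested with the cross-validation harness developed later.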

 

Now that we are familiar with the dataset, let’s explore how we might develop a neural network model.

Neural Network Learning Dynamics

We will develop a Multilayer Perceptron (MLP) model for the dataset using TensorFlow.

We cannot know which model architecture or learning hyperparameters would be good or best for this dataset, so we must experiment and discover what works well.

Given that the dataset is small, a small batch size is probably a good idea, for example, 16 or 32 rows. Using the Adam variant of stochastic gradient descent is a good idea when getting started, as it automatically adapts the learning rate and works well on most datasets.

Before we evaluate models in earnest, it is a good idea to review the learning dynamics and tune the model architecture and learning configuration until we have stable learning dynamics, then look at getting the most out of the model.

We can do this by using a simple train/test split of the data and reviewing plots of the learning curves. This will help us see whether we are overfitting or underfitting; then we can adapt the configuration accordingly.

First, we split the dataset into input and output columns, make sure all input variables are floating-point values, and encode the target label as the integer values 0 and 1.

# split into input and output columns
X, y = df.values[:, :-1], df.values[:, -1]
# ensure all data are floating point values
X = X.astype('float32')
# encode strings to integer
y = LabelEncoder().fit_transform(y)

Then, we can split the data into 67/33 train and test sets.

# split into train and test datasets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)

 

Next, we can define a minimal MLP model. In this case, we will use one hidden layer with 10 nodes and one output layer (chosen arbitrarily). We will use the ReLU activation function in the hidden layer together with the “he_normal” weight initialization, as the combination is a good practice.

The output of the model is a sigmoid activation for binary classification, and we will minimize binary cross-entropy loss.

# determine the number of input features
n_features = X.shape[1]
# define model
model = Sequential()
model.add(Dense(10, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
model.add(Dense(1, activation='sigmoid'))
# compile the model
model.compile(optimizer='adam', loss='binary_crossentropy')
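To make the output layer and the loss concrete, here is a small sketch in plain Python of what a sigmoid activation and binary cross-entropy compute. It is an illustration rather than the Keras implementation; the logits and labels are made-up values, and the clipping constant eps is an assumption added only to avoid taking the log of zero.

```python
import math


def sigmoid(z):
    """Squash a raw model output into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean negative log-likelihood of 0/1 labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # assumed clipping to keep log() finite
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# made-up raw outputs (logits) and matching class labels
logits = [2.0, -1.0, 0.0]
labels = [1, 0, 1]

probs = [sigmoid(z) for z in logits]
loss = binary_cross_entropy(labels, probs)
print(probs, loss)
```

Confident, correct predictions contribute a loss close to zero, while a prediction of 0.5 contributes ln(2), about 0.693, whatever the true label is.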

 

We will fit the model for 200 training epochs (chosen arbitrarily) with a batch size of 32, as it is a small dataset.

We are fitting the model on the raw data, which we think may be reasonable here, and in any case it is an important starting point.

# fit the model

history = model.fit(X_train, y_train, epochs=200, batch_size=32, verbose=0, validation_data=(X_test,y_test))

 

At the end of training, we will evaluate the model’s performance on the test dataset and report it as the classification accuracy.

# predict test set: threshold the sigmoid outputs at 0.5 to get class labels
# (predict_classes() was removed in TensorFlow 2.6)
yhat = (model.predict(X_test, verbose=0) > 0.5).astype('int32').ravel()
# evaluate predictions
score = accuracy_score(y_test, yhat)
print('Accuracy: %.3f' % score)
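The evaluation step amounts to two small operations: turning each predicted probability into a 0 or 1 class label, then counting the fraction of labels that match. A plain-Python sketch follows; the probabilities and true labels are made up for illustration, and the function names are hypothetical.

```python
def to_class_labels(probabilities, threshold=0.5):
    """Map each predicted probability to class 1 if above the threshold, else 0."""
    return [1 if p > threshold else 0 for p in probabilities]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# made-up sigmoid outputs and true labels
probs = [0.91, 0.12, 0.55, 0.49, 0.73]
y_true = [1, 0, 0, 0, 1]

y_pred = to_class_labels(probs)
print(y_pred, 'Accuracy: %.3f' % accuracy(y_true, y_pred))
```

The 0.5 threshold is the conventional choice for a sigmoid output; a different threshold could be used if one class were more costly to miss.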

 

Lastly, we will plot learning curves of the cross-entropy loss on the train and test sets during the course of training.

# plot learning curves
pyplot.title('Learning Curves')
pyplot.xlabel('Epoch')
pyplot.ylabel('Cross Entropy')
pyplot.plot(history.history['loss'], label='train')
pyplot.plot(history.history['val_loss'], label='val')
pyplot.legend()
pyplot.show()
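Alongside the plot, the same two loss series can be summarized numerically. The sketch below is one hypothetical way to do that: it finds the epoch with the lowest validation loss and checks whether validation loss rose afterwards, which is a common sign of overfitting. The loss values are made-up numbers, not results from this model; with a fitted model they would come from history.history['loss'] and history.history['val_loss'].

```python
def summarize_curves(train_loss, val_loss):
    """Summarize learning curves (hypothetical helper for illustration)."""
    best_index = min(range(len(val_loss)), key=val_loss.__getitem__)
    return {
        'best_epoch': best_index + 1,  # 1-based epoch number
        'best_val_loss': val_loss[best_index],
        'final_gap': val_loss[-1] - train_loss[-1],
        'val_rose_after_best': val_loss[-1] > val_loss[best_index],
    }


# made-up loss values: training loss keeps falling, validation loss turns upward
train_loss = [0.70, 0.50, 0.35, 0.25, 0.18, 0.12]
val_loss = [0.72, 0.55, 0.42, 0.40, 0.43, 0.47]

summary = summarize_curves(train_loss, val_loss)
print(summary)
```

A large final gap together with a rising validation loss points toward overfitting; two curves that are both still falling at the last epoch point toward underfitting or too few epochs.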

 

Tying all of this together, the complete example of evaluating our first MLP on the ionosphere dataset is listed here.

# fit a simple mlp model on the ionosphere and review learning curves
from pandas import read_csv
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from matplotlib import pyplot
# load the dataset
path = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/ionosphere.csv'
df = read_csv(path, header=None)
# split into input and output columns
X, y = df.values[:, :-1], df.values[:, -1]
# ensure all data are floating point values
X = X.astype('float32')
# encode strings to integer
y = LabelEncoder().fit_transform(y)
# split into train and test datasets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)
# determine the number of input features
n_features = X.shape[1]
# define model
model = Sequential()
model.add(Dense(10, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
model.add(Dense(1, activation='sigmoid'))
# compile the model
model.compile(optimizer='adam', loss='binary_crossentropy')
# fit the model
history = model.fit(X_train, y_train, epochs=200, batch_size=32, verbose=0, validation_data=(X_test, y_test))
# predict test set: threshold the sigmoid outputs at 0.5 to get class labels
# (predict_classes() was removed in TensorFlow 2.6)
yhat = (model.predict(X_test, verbose=0) > 0.5).astype('int32').ravel()
# evaluate predictions
score = accuracy_score(y_test, yhat)
print('Accuracy: %.3f' % score)
# plot learning curves
pyplot.title('Learning Curves')
pyplot.xlabel('Epoch')
pyplot.ylabel('Cross Entropy')
pyplot.plot(history.history['loss'], label='train')
pyplot.plot(history.history['val_loss'], label='val')
pyplot.legend()
pyplot.show()

 

Running the example first fits the model on the training dataset, then reports the classification accuracy on the test dataset.

Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.

In this case, we can see that the model achieved an accuracy of about 88 percent, which is a good baseline in performance that we may be able to improve upon.

Accuracy: 0.888

Line plots of the loss on the train and test sets are then created.

We can see that the model appears to converge but has overfit the training dataset.

Let’s try increasing the capacity of the model.

This will slow down learning for the same learning hyperparameters and may offer better accuracy.

We will add a second hidden layer with eight nodes, chosen arbitrarily.

# define model
model = Sequential()
model.add(Dense(10, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
model.add(Dense(8, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(1, activation='sigmoid'))

 

The complete example is listed here:

# fit a deeper mlp model on the ionosphere and review learning curves
from pandas import read_csv
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from matplotlib import pyplot
# load the dataset
path = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/ionosphere.csv'
df = read_csv(path, header=None)
# split into input and output columns
X, y = df.values[:, :-1], df.values[:, -1]
# ensure all data are floating point values
X = X.astype('float32')
# encode strings to integer
y = LabelEncoder().fit_transform(y)
# split into train and test datasets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)
# determine the number of input features
n_features = X.shape[1]
# define model
model = Sequential()
model.add(Dense(10, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
model.add(Dense(8, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(1, activation='sigmoid'))
# compile the model
model.compile(optimizer='adam', loss='binary_crossentropy')
# fit the model
history = model.fit(X_train, y_train, epochs=200, batch_size=32, verbose=0, validation_data=(X_test, y_test))
# predict test set: threshold the sigmoid outputs at 0.5 to get class labels
# (predict_classes() was removed in TensorFlow 2.6)
yhat = (model.predict(X_test, verbose=0) > 0.5).astype('int32').ravel()
# evaluate predictions
score = accuracy_score(y_test, yhat)
print('Accuracy: %.3f' % score)
# plot learning curves
pyplot.title('Learning Curves')
pyplot.xlabel('Epoch')
pyplot.ylabel('Cross Entropy')
pyplot.plot(history.history['loss'], label='train')
pyplot.plot(history.history['val_loss'], label='val')
pyplot.legend()
pyplot.show()

 

Running the example first fits the model on the training dataset, then reports the accuracy on the test dataset.

Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.

In this case, we can see a slight improvement in accuracy to about 93 percent, although the high variance of the train/test split means that this evaluation is not reliable.

Accuracy: 0.931

Learning curves for the loss on the train and test sets are then plotted. We can see that the model still appears to show overfitting behaviour.

Finally, we can try a wider network.

We will increase the number of nodes in the first hidden layer from 10 to 50, and in the second hidden layer from 8 to 10.

This will add more capacity to the model, slow down learning, and may further improve results.

# define model
model = Sequential()
model.add(Dense(50, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
model.add(Dense(10, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(1, activation='sigmoid'))
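One rough way to quantify "capacity" across the three architectures tried in this section is to count trainable parameters. The sketch below does this for fully connected layers, where each layer has one weight per input-output pair plus one bias per output. The function name is hypothetical, and parameter count is only a coarse proxy for capacity.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weights plus biases
    return total


n_features = 34  # number of input columns in the ionosphere dataset

# hidden layer sizes for the three models explored in this section
architectures = [
    ('one hidden layer', [10]),
    ('deeper', [10, 8]),
    ('wider', [50, 10]),
]

counts = {}
for name, hidden in architectures:
    sizes = [n_features] + hidden + [1]
    counts[name] = count_parameters(sizes)
    print(name, counts[name])
```

By this measure the wider network has several times as many parameters as either smaller model, while the training set holds only a few hundred rows, which is part of why regularization is worth exploring.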

 

We will also reduce the number of training epochs from 200 to 100.

# fit the model

history = model.fit(X_train, y_train, epochs=100, batch_size=32, verbose=0, validation_data=(X_test,y_test))

 

The complete example is detailed here:

# fit a wider mlp model on the ionosphere and review learning curves
from pandas import read_csv
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from matplotlib import pyplot
# load the dataset
path = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/ionosphere.csv'
df = read_csv(path, header=None)
# split into input and output columns
X, y = df.values[:, :-1], df.values[:, -1]
# ensure all data are floating point values
X = X.astype('float32')
# encode strings to integer
y = LabelEncoder().fit_transform(y)
# split into train and test datasets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)
# determine the number of input features
n_features = X.shape[1]
# define model
model = Sequential()
model.add(Dense(50, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
model.add(Dense(10, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(1, activation='sigmoid'))
# compile the model
model.compile(optimizer='adam', loss='binary_crossentropy')
# fit the model
history = model.fit(X_train, y_train, epochs=100, batch_size=32, verbose=0, validation_data=(X_test, y_test))
# predict test set: threshold the sigmoid outputs at 0.5 to get class labels
# (predict_classes() was removed in TensorFlow 2.6)
yhat = (model.predict(X_test, verbose=0) > 0.5).astype('int32').ravel()
# evaluate predictions
score = accuracy_score(y_test, yhat)
print('Accuracy: %.3f' % score)
# plot learning curves
pyplot.title('Learning Curves')
pyplot.xlabel('Epoch')
pyplot.ylabel('Cross Entropy')
pyplot.plot(history.history['loss'], label='train')
pyplot.plot(history.history['val_loss'], label='val')
pyplot.legend()
pyplot.show()

 

Running the example first fits the model on the training dataset, then reports the accuracy on the test dataset.

Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.

In this case, the model achieves a better accuracy score, with a value of about 94 percent. We will ignore model performance for now.

Accuracy: 0.940

Line plots of the learning curves are created, showing that the model achieved a reasonable fit and had more than enough time to converge.

Now that we have some idea of the learning dynamics for simple MLP models on the dataset, we can look at evaluating the performance of the models as well as tuning their configuration.

Evaluating and Tuning MLP Models

The k-fold cross-validation procedure can provide a more reliable estimate of MLP performance, although it can be very slow.

This is because k models must be fit and evaluated. This is not a problem when the dataset is small, as the ionosphere dataset is.

We can use the StratifiedKFold class and enumerate each fold manually, fit the model, evaluate it, and then report the mean of the evaluation scores at the end of the procedure.

# prepare cross validation
kfold = StratifiedKFold(10)
# enumerate splits
scores = list()
for train_ix, test_ix in kfold.split(X, y):
    # fit and evaluate the model, then append its score to scores...
# summarize all scores
print('Mean Accuracy: %.3f (%.3f)' % (mean(scores), std(scores)))

 

We can use this framework to develop a reliable estimate of MLP model performance with a range of different data preparations, model architectures, and learning configurations.
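To show what "stratified" means in this framework, the sketch below assigns row indices to folds so that each fold keeps roughly the same class mix as the whole dataset. It is a simplified stand-in written for illustration, not scikit-learn's actual StratifiedKFold algorithm, and the label counts are made up.

```python
from collections import Counter, defaultdict


def stratified_folds(labels, n_splits):
    """Assign each row index to a fold, spreading every class evenly (simplified)."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(n_splits)]
    position = 0
    for label in sorted(by_class):
        # deal the rows of each class out to the folds in turn
        for index in by_class[label]:
            folds[position % n_splits].append(index)
            position += 1
    return folds


# made-up labels for illustration: 12 "g" rows and 8 "b" rows
labels = ['g'] * 12 + ['b'] * 8
folds = stratified_folds(labels, 4)

# each fold serves once as the test set; the remaining folds form the training set
for test_ix in folds:
    train_ix = [i for fold in folds if fold is not test_ix for i in fold]
    print(len(train_ix), len(test_ix), Counter(labels[i] for i in test_ix))
```

Keeping the class mix stable across folds makes the per-fold accuracy scores more comparable, which matters on a dataset this small.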

It is important that we first developed an understanding of the learning dynamics of the model on the dataset in the previous section before using k-fold cross-validation to estimate performance. If we had started by tuning the model directly, we might have obtained good results, but if not, we might have had no idea why, for example, that the model was overfitting or underfitting.

If we make large changes to the model again, it is a good idea to go back and confirm that the model is converging appropriately.

The complete example of this framework to evaluate the base MLP model from the previous section is listed here.

# k-fold cross-validation of base model for the ionosphere dataset
from numpy import mean
from numpy import std
from pandas import read_csv
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
# load the dataset
path = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/ionosphere.csv'
df = read_csv(path, header=None)
# split into input and output columns
X, y = df.values[:, :-1], df.values[:, -1]
# ensure all data are floating point values
X = X.astype('float32')
# encode strings to integer
y = LabelEncoder().fit_transform(y)
# prepare cross validation
kfold = StratifiedKFold(10)
# enumerate splits
scores = list()
for train_ix, test_ix in kfold.split(X, y):
    # split data
    X_train, X_test, y_train, y_test = X[train_ix], X[test_ix], y[train_ix], y[test_ix]
    # determine the number of input features
    n_features = X.shape[1]
    # define model
    model = Sequential()
    model.add(Dense(50, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
    model.add(Dense(10, activation='relu', kernel_initializer='he_normal'))
    model.add(Dense(1, activation='sigmoid'))
    # compile the model
    model.compile(optimizer='adam', loss='binary_crossentropy')
    # fit the model
    model.fit(X_train, y_train, epochs=100, batch_size=32, verbose=0)
    # predict test set: threshold the sigmoid outputs at 0.5 to get class labels
    # (predict_classes() was removed in TensorFlow 2.6)
    yhat = (model.predict(X_test, verbose=0) > 0.5).astype('int32').ravel()
    # evaluate predictions
    score = accuracy_score(y_test, yhat)
    print('>%.3f' % score)
    scores.append(score)
# summarize all scores
print('Mean Accuracy: %.3f (%.3f)' % (mean(scores), std(scores)))

 

Running the example reports the model performance on each iteration of the evaluation procedure and reports the mean and standard deviation of classification accuracy at the end of the run.

Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.

In this case, we can see that the MLP model achieved a mean accuracy of about 93.4 percent.

We will use this result as our baseline to see if we can achieve better performance.

>0.972

>0.886

>0.943

>0.886

>0.914

>0.943

>0.943

>1.000

>0.971

>0.886

Mean Accuracy: 0.934 (0.039)
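The summary line can be reproduced from the ten per-fold scores above. The sketch below uses the standard library's statistics module; pstdev is the population standard deviation, which matches the default behaviour of the numpy std function used in the listing. Because the per-fold scores printed above are rounded to three decimals, the recomputed standard deviation may differ from the reported one in the last digit.

```python
from statistics import mean, pstdev

# per-fold accuracy scores copied from the output above (rounded to 3 decimals)
scores = [0.972, 0.886, 0.943, 0.886, 0.914, 0.943, 0.943, 1.000, 0.971, 0.886]

# mean and population standard deviation, formatted like the original summary line
print('Mean Accuracy: %.3f (%.3f)' % (mean(scores), pstdev(scores)))
```

The spread of roughly four percentage points between folds is a reminder that small differences between configurations should be interpreted with care.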

 

Next, let’s try adding regularization to reduce overfitting of the model.

In this case, we can add dropout layers between the hidden layers of the network. For example:

# define model
model = Sequential()
model.add(Dense(50, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
model.add(Dropout(0.4))
model.add(Dense(10, activation='relu', kernel_initializer='he_normal'))
model.add(Dropout(0.4))
model.add(Dense(1, activation='sigmoid'))
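To illustrate what a dropout layer with a rate of 0.4 does, here is a minimal plain-Python sketch of inverted dropout. During training, each activation is zeroed with probability equal to the rate and the survivors are scaled up by 1 / (1 - rate); at prediction time the activations pass through unchanged. The activation values and the random seed are made up for illustration.

```python
import random


def dropout(activations, rate, training, rng):
    """Inverted dropout: zero a fraction of activations and rescale the rest."""
    if not training:
        return list(activations)  # no change at prediction time
    keep = 1.0 - rate
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(1)  # arbitrary seed so the sketch is repeatable

# made-up hidden-layer activations
hidden = [0.5, 1.2, 0.0, 2.0, 0.7, 1.5, 0.3, 0.9, 1.1, 0.4]

print(dropout(hidden, 0.4, training=True, rng=rng))
print(dropout(hidden, 0.4, training=False, rng=rng))
```

Because a different random subset of nodes is silenced on every training step, the network cannot rely on any single node, which tends to reduce overfitting.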

 

The complete example of the MLP model with dropout is listed here.

# k-fold cross-validation of the MLP with dropout for the ionosphere dataset
from numpy import mean
from numpy import std
from pandas import read_csv
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Dropout
# load the dataset
path = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/ionosphere.csv'
df = read_csv(path, header=None)
# split into input and output columns
X, y = df.values[:, :-1], df.values[:, -1]
# ensure all data are floating point values
X = X.astype('float32')
# encode strings to integer
y = LabelEncoder().fit_transform(y)
# prepare cross validation
kfold = StratifiedKFold(10)
# enumerate splits
scores = list()
for train_ix, test_ix in kfold.split(X, y):
    # split data
    X_train, X_test, y_train, y_test = X[train_ix], X[test_ix], y[train_ix], y[test_ix]
    # determine the number of input features
    n_features = X.shape[1]
    # define model
    model = Sequential()
    model.add(Dense(50, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
    model.add(Dropout(0.4))
    model.add(Dense(10, activation='relu', kernel_initializer='he_normal'))
    model.add(Dropout(0.4))
    model.add(Dense(1, activation='sigmoid'))
    # compile the model
    model.compile(optimizer='adam', loss='binary_crossentropy')
    # fit the model
    model.fit(X_train, y_train, epochs=100, batch_size=32, verbose=0)
    # predict test set: threshold the sigmoid outputs at 0.5 to get class labels
    # (predict_classes() was removed in TensorFlow 2.6)
    yhat = (model.predict(X_test, verbose=0) > 0.5).astype('int32').ravel()
    # evaluate predictions
    score = accuracy_score(y_test, yhat)
    print('>%.3f' % score)
    scores.append(score)
# summarize all scores
print('Mean Accuracy: %.3f (%.3f)' % (mean(scores), std(scores)))

 

Running the example reports the mean and standard deviation of the classification accuracy at the end of the run.

Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.

In this case, we can see that the MLP model with dropout achieves better results, with an accuracy of about 94.6 percent compared to 93.4 percent without dropout.

Mean Accuracy: 0.946 (0.043)

Finally, we will try reducing the batch size from 32 down to 8.

This will result in noisier gradients and may also slow down the speed at which the model learns the problem.

# fit the model

model.fit(X_train, y_train, epochs=100, batch_size=8, verbose=0)
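A smaller batch size also changes how many weight updates happen in each epoch. The sketch below works this out with simple arithmetic. The training set size of 315 rows is an assumption on our part: roughly nine tenths of the 351 rows are used for training in each of the 10 folds, and the exact figure varies slightly from fold to fold.

```python
import math


def updates_per_epoch(n_rows, batch_size):
    """Number of weight updates in one pass over the training rows."""
    return math.ceil(n_rows / batch_size)


n_train = 315  # assumed: about nine tenths of the 351 rows in each fold
n_epochs = 100

for batch_size in (32, 8):
    per_epoch = updates_per_epoch(n_train, batch_size)
    print(batch_size, per_epoch, per_epoch * n_epochs)
```

So under this assumption, the smaller batch size performs about four times as many updates over the same 100 epochs, each one based on a noisier estimate of the gradient.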

 

The complete example is listed here.

# k-fold cross-validation of the MLP with dropout and a smaller batch size for the ionosphere dataset
from numpy import mean
from numpy import std
from pandas import read_csv
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Dropout
# load the dataset
path = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/ionosphere.csv'
df = read_csv(path, header=None)
# split into input and output columns
X, y = df.values[:, :-1], df.values[:, -1]
# ensure all data are floating point values
X = X.astype('float32')
# encode strings to integer
y = LabelEncoder().fit_transform(y)
# prepare cross validation
kfold = StratifiedKFold(10)
# enumerate splits
scores = list()
for train_ix, test_ix in kfold.split(X, y):
    # split data
    X_train, X_test, y_train, y_test = X[train_ix], X[test_ix], y[train_ix], y[test_ix]
    # determine the number of input features
    n_features = X.shape[1]
    # define model
    model = Sequential()
    model.add(Dense(50, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
    model.add(Dropout(0.4))
    model.add(Dense(10, activation='relu', kernel_initializer='he_normal'))
    model.add(Dropout(0.4))
    model.add(Dense(1, activation='sigmoid'))
    # compile the model
    model.compile(optimizer='adam', loss='binary_crossentropy')
    # fit the model
    model.fit(X_train, y_train, epochs=100, batch_size=8, verbose=0)
    # predict test set: threshold the sigmoid outputs at 0.5 to get class labels
    # (predict_classes() was removed in TensorFlow 2.6)
    yhat = (model.predict(X_test, verbose=0) > 0.5).astype('int32').ravel()
    # evaluate predictions
    score = accuracy_score(y_test, yhat)
    print('>%.3f' % score)
    scores.append(score)
# summarize all scores
print('Mean Accuracy: %.3f (%.3f)' % (mean(scores), std(scores)))

 

Running the example reports the mean and standard deviation of the classification accuracy at the end of the run.

Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.

In this case, we can see that the MLP model with dropout and a smaller batch size achieves slightly better results, with an accuracy of about 94.9 percent.

Mean Accuracy: 0.949 (0.042)

We will use this configuration as our final model.

We could continue to test alternative configurations of the model architecture (more or fewer nodes or layers), learning hyperparameters (larger or smaller batches), and data transforms.

Final Model and Make Predictions

Once we have chosen a model configuration, we can train a final model on all available data and use it to make predictions on new data.

In this case, we will use the model with dropout and a small batch size as the final model.

We can prepare the data and fit the model as before, although on the entire dataset rather than a training subset of the dataset.

# split into input and output columns
X, y = df.values[:, :-1], df.values[:, -1]
# ensure all data are floating point values
X = X.astype('float32')
# encode strings to integer
le = LabelEncoder()
y = le.fit_transform(y)
# determine the number of input features
n_features = X.shape[1]
# define model
model = Sequential()
model.add(Dense(50, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
model.add(Dropout(0.4))
model.add(Dense(10, activation='relu', kernel_initializer='he_normal'))
model.add(Dropout(0.4))
model.add(Dense(1, activation='sigmoid'))
# compile the model
model.compile(optimizer='adam', loss='binary_crossentropy')
# fit the model on the entire dataset
model.fit(X, y, epochs=100, batch_size=8, verbose=0)

 

We can then use this model to make predictions on new data.

First, we can define a row of new data.

# define a row of new data

row = [1,0,0.99539,-0.05889,0.85243,0.02306,0.83398,-0.37708,1,0.03760,0.85243,-0.17755,0.59755,-0.44945,

0.60536,-0.38223,0.84356,-0.38542,0.58212,-0.32192,0.56971,-0.29674,0.36946,-0.47357,0.56811,-0.51171,

0.41078,-0.46168,0.21266,-0.34090,0.42267,-0.54487,0.18641,-0.45300]

 

This row was taken from the first row of the dataset, and the expected label is “g”.

We can then make a prediction.

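# make prediction: threshold the sigmoid output at 0.5 to get the class label
# (predict_classes() was removed in TensorFlow 2.6; asarray comes from numpy)
yhat = (model.predict(asarray([row]), verbose=0) > 0.5).astype('int32').ravel()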

 

Then we invert the transform on the prediction, so that we can use or interpret the result as the correct label.

# invert transform to get label for class

yhat = le.inverse_transform(yhat)

 

In this case, we will simply report the prediction.

# report prediction
print('Predicted: %s' % (yhat[0]))
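The encode and decode steps used around the model can be pictured as a pair of lookup tables. The sketch below is a plain-Python illustration under the assumption, consistent with how LabelEncoder orders its classes, that labels are numbered in sorted order, so “b” becomes 0 and “g” becomes 1. The probability value is hypothetical and stands in for the model's sigmoid output.

```python
# made-up list of string labels as they appear in the last column of the dataset
labels = ['g', 'b', 'g', 'b', 'g']

# number the distinct labels in sorted order: "b" -> 0, "g" -> 1
classes = sorted(set(labels))
encode = {label: code for code, label in enumerate(classes)}
decode = {code: label for label, code in encode.items()}

# forward transform, applied to the target column before training
encoded = [encode[label] for label in labels]
print(encoded)

# inverse transform, applied to a thresholded prediction
probability = 0.93  # hypothetical sigmoid output for one row
predicted_code = 1 if probability > 0.5 else 0
print('Predicted: %s' % decode[predicted_code])
```

Keeping a reference to the fitted encoder, as the variable le does in the listing, is what makes the inverse step possible after training.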

 

Tying all of this together, the complete example of fitting a final model for the ionosphere dataset and using it to make a prediction on new data is listed here.

# fit a final model and make predictions on new data for the ionosphere dataset
from numpy import asarray
from pandas import read_csv
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Dropout
# load the dataset
path = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/ionosphere.csv'
df = read_csv(path, header=None)
# split into input and output columns
X, y = df.values[:, :-1], df.values[:, -1]
# ensure all data are floating point values
X = X.astype('float32')
# encode strings to integer
le = LabelEncoder()
y = le.fit_transform(y)
# determine the number of input features
n_features = X.shape[1]
# define model
model = Sequential()
model.add(Dense(50, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
model.add(Dropout(0.4))
model.add(Dense(10, activation='relu', kernel_initializer='he_normal'))
model.add(Dropout(0.4))
model.add(Dense(1, activation='sigmoid'))
# compile the model
model.compile(optimizer='adam', loss='binary_crossentropy')
# fit the model
model.fit(X, y, epochs=100, batch_size=8, verbose=0)
# define a row of new data
row = [1,0,0.99539,-0.05889,0.85243,0.02306,0.83398,-0.37708,1,0.03760,0.85243,-0.17755,0.59755,-0.44945,
       0.60536,-0.38223,0.84356,-0.38542,0.58212,-0.32192,0.56971,-0.29674,0.36946,-0.47357,0.56811,
       -0.51171,0.41078,-0.46168,0.21266,-0.34090,0.42267,-0.54487,0.18641,-0.45300]
# make prediction: threshold the sigmoid output at 0.5 to get the class label
# (predict_classes() was removed in TensorFlow 2.6)
yhat = (model.predict(asarray([row]), verbose=0) > 0.5).astype('int32').ravel()
# invert transform to get label for class
yhat = le.inverse_transform(yhat)
# report prediction
print('Predicted: %s' % (yhat[0]))

 

Running the example fits the model on the entire dataset and makes a prediction for a single row of new data.

Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.

In this case, we can see that the model predicted a “g” label for the input row.

Predicted: g

Conclusion

In this guide, you discovered how to develop a multilayer perceptron neural network model for the ionosphere binary classification dataset.

Specifically, you learned:

  • How to load and summarize the ionosphere dataset and use the results to suggest data preparations and model configurations.
  • How to explore the learning dynamics of simple MLP models on the dataset.
  • How to develop robust estimates of model performance, tune model performance, and make predictions on new data.