Applied Deep Learning in Python mini-course
Deep learning is a fascinating field of study, and its techniques are achieving world-class results on a range of challenging machine learning problems. Getting started can be tough, though: which library should you use, and which techniques should you focus on?
In this blog article, you will be introduced to a 14-part crash course in deep learning in Python with the easy-to-use and powerful Keras library.
This mini-course is intended for Python machine learning practitioners who are already comfortable with scikit-learn on the SciPy ecosystem for machine learning.
Who is this mini-course targeted at?
Before we begin, let's make sure you are in the right place. The list below provides some general guidelines as to who this course was designed for.
Don't panic if you don't match these points exactly; you might just need to brush up in one area or another to keep up.
- Developers who can write a little code: This means it is not a big deal for you to get things done with Python and to install the SciPy ecosystem on your workstation, which is a prerequisite. It does not mean you're an expert coder, but it does mean you're not afraid to install packages and write scripts.
- Developers who know a little machine learning: This means you understand the basics of machine learning, such as cross-validation, a few algorithms, and the bias-variance trade-off. It doesn't mean you have a machine learning PhD, just that you know the landmarks or know where to look them up.
This mini-course is not a textbook on Deep Learning.
It will take you from a developer who knows a little machine learning in Python to a developer who can get results and bring the power of deep learning to your own projects.
Mini-course overview (what to expect)
This mini-course is divided into 14 lessons.
Each lesson is designed to take the average developer about 30 minutes. You might finish some much faster; for others, you may choose to go deeper and spend more time.
You can complete each lesson as quickly or as slowly as you like. A comfortable schedule is one lesson per day over a two-week period. Highly recommended.
The topics you will cover over the next 14 lessons are:
1. Intro to Theano
2. Intro to TensorFlow
3. Intro to Keras
4. Crash course in multi-layer perceptrons
5. Develop your first neural network in Keras
6. Use Keras models with scikit-learn
7. Plot model training history
8. Save your best model during training with checkpointing
9. Minimize overfitting with dropout regularization
10. Lift performance with learning rate schedules
11. Crash course in convolutional neural networks
12. Handwritten digit recognition
13. Object recognition in small photographs
14. Enhance generalization with data augmentation
Lesson 01: Intro to Theano
Theano is a Python library for fast numerical computation that helps in the development of deep learning models.
At its core, Theano is a compiler for mathematical expressions in Python. It knows how to take your expressions and turn them into very efficient code that uses NumPy and efficient native libraries to run as fast as possible on CPUs or GPUs.
The syntax of Theano expressions is symbolic, which can be off-putting to beginners used to normal software development. Specifically, expressions are defined in the abstract sense, compiled, and only later actually used to make calculations.
In this lesson, your goal is to install Theano and write a small example that demonstrates the symbolic nature of Theano programs.
For example, you can install Theano using pip as follows:
sudo pip install Theano
A small example of a Theano program that you can use as a starting point is listed below.
import theano
from theano import tensor
# declare two symbolic floating-point scalars
a = tensor.dscalar()
b = tensor.dscalar()
# create a simple expression
c = a + b
# convert the expression into a callable object that takes (a,b)
# values as input and computes a value for c
f = theano.function([a, b], c)
# bind 1.5 to 'a', 2.5 to 'b', and evaluate 'c'
result = f(1.5, 2.5)
print(result)
Lesson 02: Intro to TensorFlow
TensorFlow is a Python library for fast numerical computing created and released by Google. Like Theano, TensorFlow is intended to be used to develop deep learning models.
With the backing of Google, probably used in some of its production systems, and used by the Google DeepMind research group, it is a platform that we can't ignore.
Unlike Theano, TensorFlow has more of a production focus, with the capability to run on CPUs, GPUs, and even very large clusters.
In this lesson, your goal is to install TensorFlow and become familiar with the syntax of the symbolic expressions used in TensorFlow programs.
For example, you can install TensorFlow using pip:
sudo pip install tensorflow
A small example of a TensorFlow program that you can use as a starting point is listed below:
# Example of the TensorFlow library (TensorFlow 1.x-style API)
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
# declare two symbolic floating-point scalars
a = tf.placeholder(tf.float32)
b = tf.placeholder(tf.float32)
# create a simple symbolic expression using the add function
add = tf.add(a, b)
# bind 1.5 to 'a', 2.5 to 'b', and evaluate the expression
sess = tf.Session()
binding = {a: 1.5, b: 2.5}
c = sess.run(add, feed_dict=binding)
print(c)
Learn more about TensorFlow on the TensorFlow homepage.
Lesson 03: Intro to Keras
A difficulty with both Theano and TensorFlow is that it can take a lot of code to create even very simple neural network models.
These libraries were designed primarily as platforms for research and development, more than for the practical concerns of applied deep learning.
The Keras library addresses these concerns by providing a wrapper for both Theano and TensorFlow. It provides a clean and simple API that allows you to define and evaluate deep learning models in just a few lines of code.
Because of its ease of use, and because it harnesses the power of Theano and TensorFlow, Keras is quickly becoming the go-to library for applied deep learning.
The focus of Keras is the concept of a model. The life-cycle of a model can be summarized as follows (a small sketch follows the list):
- Define your model: Create a Sequential model and add configured layers.
- Compile your model: Specify the loss function and optimizer, and call the compile() function on the model.
- Fit your model: Train the model on a sample of data by calling the fit() function on the model.
- Make predictions: Use the model to generate predictions on new data by calling functions such as evaluate() or predict() on the model.
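As a rough preview of Lesson 05, here is what that life-cycle can look like in code. This is a minimal sketch, using made-up stand-in data rather than a real dataset:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
# made-up stand-in data: 100 rows, 8 input features, binary target
X = np.random.rand(100, 8)
y = (X[:, 0] > 0.5).astype(int)
# 1. define the model
model = Sequential()
model.add(Dense(8, input_dim=8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# 2. compile the model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# 3. fit the model
model.fit(X, y, epochs=10, batch_size=8)
# 4. make predictions on new data
predictions = model.predict(X)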
Your goal for this lesson is to install Keras.
For example, you can install Keras using pip:
sudo pip install keras
Start familiarizing yourself with the Keras library, ready for the upcoming lessons where we will implement our first model.
Lesson 04: Crash Course in Multi-Layer Perceptrons
Artificial neural networks are a fascinating area of study, although they can be intimidating when you are just getting started.
The field of artificial neural networks is often just called neural networks or multi-layer perceptrons, after perhaps the most useful type of neural network.
The building blocks of neural networks are artificial neurons. These are simple computational units that take weighted input signals and produce an output signal using an activation function.
Neurons are arranged into networks of neurons. A row of neurons is called a layer, and one network can have multiple layers. The architecture of the neurons in the network is often referred to as the network topology.
Once configured, the neural network needs to be trained on your dataset. The classical and still preferred training algorithm for neural networks is called stochastic gradient descent.
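As a rough illustration of a single neuron (the numbers below are made up for the example), the output is just a weighted sum of the inputs passed through an activation function:
import numpy as np
# made-up inputs, weights and bias for a single artificial neuron
inputs = np.array([1.0, 0.5, -1.2])
weights = np.array([0.4, -0.6, 0.1])
bias = 0.05
# weighted sum of the input signals
weighted_sum = np.dot(inputs, weights) + bias
# sigmoid activation function produces the output signal
output = 1.0 / (1.0 + np.exp(-weighted_sum))
print(output)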
Your goal for this lesson is to become familiar with neural network terminology.
Dig a little deeper into terms such as neuron, weights, activation function, learning rate, and more.
Lesson 05: Develop your first neural network in Keras
Keras allows you to develop and evaluate deep learning models in very few lines of code.
In this lesson, your goal is to develop your first neural network using the Keras library.
Use a standard binary (two-class) classification dataset from the UCI Machine Learning Repository, such as the Pima Indians onset of diabetes dataset or the ionosphere dataset.
Put together code to accomplish the following:
- Load your dataset using NumPy or Pandas.
- Define your neural network model and compile it.
- Fit your model to the dataset.
- Evaluate the performance of your model on unseen data.
To give you a big head start, below is a complete working example that you can use as a starting point.
Download the dataset and place it in your current working directory.
import numpy
from keras.models import Sequential
from keras.layers import Dense
# Load the dataset
dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=",")
X = dataset[:,0:8]
Y = dataset[:,8]
# Define and Compile
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Fit the model
model.fit(X, Y, epochs=150, batch_size=10)
# Evaluate the model
scores = model.evaluate(X, Y)
print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
Now develop your own model on a different dataset, or adapt this example.
Lesson 06: Use Keras Models with Scikit-learn
The scikit-learn library is a general-purpose machine learning framework in Python, built on top of SciPy.
Scikit-learn excels at tasks such as evaluating model performance and optimizing model hyperparameters in just a few lines of code.
Keras provides a wrapper class that allows you to use your deep learning models with scikit-learn. For example, an instance of the KerasClassifier class can wrap your deep learning model and be used as an estimator in scikit-learn.
When using the KerasClassifier class, you must specify the name of a function that the class can use to define and compile your model. You can also pass additional arguments to the constructor of the KerasClassifier class; these will be passed on to the model.fit() call later, such as the number of epochs and the batch size.
In this lesson, your goal is to develop a deep learning model and evaluate it using k-fold cross-validation.
For example, you can define an instance of the KerasClassifier and the custom function to create your model as follows:
from keras.models import Sequential
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Function to create model, required for KerasClassifier
def create_model():
    # Create model
    model = Sequential()
    ...
    # Compile model
    model.compile(...)
    return model

# fix the random seed for reproducibility
seed = 7
# create classifier for use in scikit-learn
model = KerasClassifier(build_fn=create_model, epochs=150, batch_size=10)
# evaluate model using 10-fold cross validation in scikit-learn
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(model, X, Y, cv=kfold)
Learn more about using your Keras deep learning models with scikit-learn on the Wrappers for the Scikit-Learn API webpage.
Lesson 07: Plot Model Training History
You can learn a lot about neural networks and deep learning models by studying their performance over time during training.
Keras provides the ability to register callbacks when training a deep learning model.
One of the default callbacks registered when training all deep learning models is the History callback. It records the training metrics for each epoch. This includes the loss and the accuracy (for classification problems), as well as the loss and accuracy for the validation dataset, if one is set.
The history object is returned from calls to the fit() function used to train the model. Metrics are stored in a dictionary in the history member of the object returned.
Your goal for this lesson is to investigate the history object and create plots of model performance over the course of training.
For example, you can print the list of metrics collected by your history object as follows:
# list all data in history
history = model.fit(...)
print(history.history.keys())
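Building on that, here is a minimal sketch of how you might plot the recorded loss over epochs with matplotlib, assuming a validation split was passed to fit() so that 'val_loss' is also recorded:
from matplotlib import pyplot
# history = model.fit(..., validation_split=0.33)  # as above, with a validation split
pyplot.plot(history.history['loss'], label='train loss')
pyplot.plot(history.history['val_loss'], label='validation loss')
pyplot.xlabel('epoch')
pyplot.ylabel('loss')
pyplot.legend()
pyplot.show()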
Lesson 08: Save your best model during training with checkpointing
Application checkpointing is a fault-tolerance technique for long-running processes.
The Keras library provides a checkpointing capability via its callback API. The ModelCheckpoint callback class allows you to define where to checkpoint the model weights, how the file should be named, and under what circumstances to make a checkpoint of the model.
Checkpointing can be useful to keep track of the model weights in case your training run is stopped prematurely. It is also useful for keeping track of the best model observed during training.
In this lesson, your goal is to use the ModelCheckpoint callback in Keras to keep track of the best model observed during training.
You could define a ModelCheckpoint that saves the network weights to the same file each time an improvement is observed. For example:
from keras.callbacks import ModelCheckpoint
...
checkpoint = ModelCheckpoint('weights.best.hdf5', monitor='val_accuracy', save_best_only=True, mode='max')
callbacks_list = [checkpoint]
# Fit the model
model.fit(..., callbacks=callbacks_list)
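Later, after rebuilding and compiling the same network architecture, you can load the best saved weights back in before making predictions. A minimal sketch:
# assumes a model with the same architecture has been defined and compiled
model.load_weights('weights.best.hdf5')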
Lesson 09: Minimize overfitting with Dropout regularization
A big problem with neural networks is that they tend to overlearn your training dataset.
Dropout is a simple yet very effective technique for minimizing overfitting, and it has proven useful in large deep learning models.
Dropout is a technique where randomly selected neurons are ignored during training: they are dropped out at random. This means their contribution to the activation of downstream neurons is temporarily removed on the forward pass, and any weight updates are not applied to the neuron on the backward pass.
You can add a dropout layer to your deep learning model using the Dropout layer class.
In this lesson, your goal is to experiment with adding dropout at different points in your neural network and with different dropout probability values.
For example, you can create a dropout layer with a probability of 20% and add it to your model as follows:
from keras.layers import Dropout
...
model.add(Dropout(0.2))
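For a fuller picture, here is a minimal sketch of the Lesson 05 network with dropout inserted after each hidden layer (the layer sizes are illustrative, not tuned values):
from keras.models import Sequential
from keras.layers import Dense, Dropout
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dropout(0.2))  # randomly drop 20% of the first hidden layer's outputs
model.add(Dense(8, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1, activation='sigmoid'))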
Lesson 10: Lift Performance with Learning Rate Schedules
You can often get a boost in the performance of your model by using a learning rate schedule.
Often referred to as an adaptive learning rate or an annealed learning rate, this is a technique where the learning rate used by stochastic gradient descent changes while training your model.
Keras has a time-based learning rate schedule built into the implementation of the stochastic gradient descent algorithm in the SGD class.
When constructing the class, you can specify the decay, which is the amount by which your learning rate (also specified) will decrease each epoch. When using learning rate decay, you should bump up your initial learning rate and consider adding a large momentum value such as 0.8 or 0.9.
Your goal in this lesson is to experiment with the time-based learning rate schedule built into Keras.
For example, you can specify a learning rate schedule that starts at 0.1 and decays by 0.0001 as follows:
from keras.optimizers import SGD
...
sgd = SGD(lr=0.1, momentum=0.9, decay=0.0001, nesterov=False)
model.compile(..., optimizer=sgd)
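As a quick sanity check, here is a sketch of how the rate falls under the time-based decay formula lr_t = lr_0 / (1 + decay * t) used by the legacy SGD implementation (where t counts weight updates):
# learning rate after t updates under time-based decay
lr0, decay = 0.1, 0.0001
for t in [0, 100, 1000, 10000]:
    print(t, lr0 / (1.0 + decay * t))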
Lesson 11: Crash Course in Convolutional Neural Networks
Convolutional neural networks are a powerful artificial neural network technique.
They expect and preserve the spatial relationship between pixels in images by learning internal feature representations using small squares of input data.
Features are learned and used across the whole image, allowing the objects in your images to be shifted or translated in the scene while remaining detectable by the network. This is why this type of network is so useful for object recognition in photographs: picking out digits, faces, objects, and so on with varying orientation.
There are three types of layers in a convolutional neural network:
- Convolutional layers, comprised of filters and feature maps.
- Pooling layers, which downsample the activations from feature maps.
- Fully-connected layers, which plug onto the end of the model and can be used to make predictions.
In this lesson, your goal is to familiarize yourself with the terminology used when describing convolutional neural networks.
This may require a little research on your part.
Don't worry too much about how they work just yet; just learn the terminology and the configuration of the various layers used in this type of network.
Lesson 12: Handwritten Digit Recognition
Handwritten digit recognition is a difficult computer vision classification problem.
The MNIST dataset is a standard benchmark for evaluating algorithms on the problem of handwritten digit recognition. It consists of 60,000 images of digits that can be used to train a model, and 10,000 images that can be used to evaluate its performance.
State-of-the-art results can be achieved on the MNIST problem using convolutional neural networks. Keras makes loading the MNIST dataset dead easy.
In this lesson, your goal is to develop a very simple convolutional neural network for the MNIST problem, consisting of one convolutional layer, one max pooling layer, and one dense layer to make predictions.
For example, you can load the MNIST dataset in Keras as follows:
from keras.datasets import mnist
...
(X_train, y_train), (X_test, y_test) = mnist.load_data()
It may take a moment to download the files to your computer.
Pro tip: the Keras Conv2D layer that you will use as your first hidden layer expects image data in the format width x height x channels. The MNIST data has one channel, because the images are grayscale, and a width and height of 28 pixels. You can easily reshape the MNIST dataset as follows:
X_train = X_train.reshape((X_train.shape[0], 28, 28, 1))
X_test = X_test.reshape((X_test.shape[0], 28, 28, 1))
You will also need to one-hot encode the output class values, for which Keras provides a handy helper function.
from keras.utils import np_utils
...
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
As a final tip, here is a model definition that you can use as a starting point.
from keras.models import Sequential
from keras.layers import Dense, Flatten, Conv2D, MaxPooling2D
num_classes = y_test.shape[1]  # number of output classes (10 for MNIST)
model = Sequential()
model.add(Conv2D(32, (3, 3), padding='valid', input_shape=(28, 28, 1), activation='relu'))
model.add(MaxPooling2D())
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
Lesson 13: Object Recognition in Small Photographs
Object recognition is a problem where your model must indicate what is in a photograph.
Deep learning models achieve state-of-the-art results on this problem using deep convolutional neural networks.
A popular standard dataset for evaluating models on this type of problem is called CIFAR-10. It consists of 60,000 small photographs, each of one of 10 object classes, such as cat, ship, or airplane.
As with the MNIST dataset, Keras provides a convenient function that you can use to load the dataset, and it will download it to your machine the first time you try to load it. The dataset is 163 MB, so it may take a few minutes to download.
Your goal in this lesson is to develop a deep convolutional neural network for the CIFAR-10 dataset. We recommend a repeated pattern of convolution and pooling layers; a sketch follows the data-preparation code below. Consider experimenting with dropout and longer training times.
For example, you can load the CIFAR-10 dataset in Keras and prepare it for use with a convolutional neural network as follows.
from keras.datasets import cifar10
from keras.utils import np_utils
# load data
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
# normalize inputs from 0-255 to 0.0-1.0
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train = X_train / 255.0
X_test = X_test / 255.0
# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
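Here is a minimal sketch of the repeated convolution-and-pooling pattern suggested above, assuming the data preparation shown (the layer sizes are illustrative starting points, not tuned values):
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
model = Sequential()
# first block of convolution and pooling
model.add(Conv2D(32, (3, 3), padding='same', input_shape=(32, 32, 3), activation='relu'))
model.add(MaxPooling2D())
# second block of convolution and pooling
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D())
# fully-connected layers for classification
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])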
Lesson 14: Enhance generalization with Data Augmentation
Data preparation is needed when working with neural network and deep learning models.
Increasingly, data augmentation is also needed on more complex object recognition tasks. This is where the images in your dataset are modified with random flips, shifts, and similar transforms. This in effect makes your training dataset larger and helps your model generalize over the position and orientation of objects in images.
Keras provides an image augmentation API that creates modified versions of the images in your dataset just in time. The ImageDataGenerator class can be used to define the image augmentation operations to perform; it can be fit to a dataset and then used in place of your dataset during the training of your model.
Your goal in this lesson is to experiment with the Keras image augmentation API using a dataset you are already familiar with from a prior lesson, such as MNIST or CIFAR-10.
For example, the code below creates random rotations of up to 90 degrees of the images in the MNIST dataset.
# Random Rotations
from keras.datasets import mnist
from keras.preprocessing.image import ImageDataGenerator
from matplotlib import pyplot
# load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# reshape to be [samples][width][height][channels]
X_train = X_train.reshape((X_train.shape[0], 28, 28, 1))
X_test = X_test.reshape((X_test.shape[0], 28, 28, 1))
# convert from int to float
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
# define data preparation
datagen = ImageDataGenerator(rotation_range=90)
# fit parameters from data
datagen.fit(X_train)
# configure batch size and retrieve one batch of images
for X_batch, y_batch in datagen.flow(X_train, y_train, batch_size=9):
    # create a grid of 3x3 images
    for i in range(0, 9):
        pyplot.subplot(330 + 1 + i)
        pyplot.imshow(X_batch[i].reshape(28, 28), cmap=pyplot.get_cmap('gray'))
    # show the plot
    pyplot.show()
    break
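Once you are happy with the augmented images, you can train on batches drawn from the generator rather than on the raw arrays. A minimal sketch, assuming the model and data above (recent Keras versions accept a generator directly in fit(); older versions used fit_generator()):
# train using batches generated on the fly from the augmented data
model.fit(datagen.flow(X_train, y_train, batch_size=32),
          steps_per_epoch=len(X_train) // 32, epochs=10)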
Deep Learning Mini-Course Review
Well done, you made it to the end of the course.
Let's do a short retrospective and look at how far you have come:
- You discovered the deep learning libraries in Python, including the powerful numerical libraries Theano and TensorFlow and the easy-to-use Keras library for applied deep learning.
- You developed your first neural network using Keras, and learned how to use your deep learning models with scikit-learn and how to retrieve and plot the training history of your models.
- You learned about more advanced techniques such as dropout regularization and learning rate schedules, and how you can use these techniques in Keras.
- Lastly, you took the next step and developed convolutional neural networks for complex computer vision tasks, and learned about augmenting image data.
Don't make light of this: you have come a long way in a short amount of time. This is just the beginning of your journey with deep learning in Python. Keep practicing and developing your skills.