How to Use Nelder-Mead Optimization in Python
The Nelder-Mead optimization algorithm is a widely used approach for objective functions that are non-differentiable.
As such, it is generally referred to as a pattern search algorithm and is used as a local or global search procedure on challenging nonlinear, potentially noisy and multimodal function optimization problems.
In this guide, you will discover the Nelder-Mead optimization algorithm.
After going through this guide, you will know:
- The Nelder-Mead optimization algorithm is a type of pattern search that does not use function gradients.
- How to apply the Nelder-Mead algorithm for function optimization in Python.
- How to interpret the results of the Nelder-Mead algorithm on noisy and multimodal objective functions.
Tutorial Overview
This tutorial is divided into three parts; they are:
1. Nelder-Mead Algorithm
2. Nelder-Mead Example in Python
3. Nelder-Mead on Challenging Functions
- Noisy Optimization Problem
- Multimodal Optimization Problem
Nelder-Mead Algorithm
Nelder-Mead is an optimization algorithm named after its developers, John Nelder and Roger Mead.
The algorithm was described in their 1965 paper titled “A Simplex Method for Function Minimization” and has become a standard and widely used technique for function optimization.
It is appropriate for one-dimensional or multidimensional functions with numerical inputs.
Nelder-Mead is a pattern search optimization algorithm, which means it does not require or use function gradient information and is appropriate for optimization problems where the gradient of the function is unknown or cannot be reasonably computed.
It is often used for multidimensional nonlinear function optimization problems, although it can get stuck in local optima.
A starting point must be provided to the algorithm, which may be the endpoint of another global optimization algorithm or a random point drawn from the domain.
Given that the algorithm may get stuck, it may benefit from multiple restarts with different starting points (a simple restart loop is sketched later in this tutorial).
The algorithm works by using a shape structure (called a simplex) composed of n + 1 points (vertices), where n is the number of input dimensions of the function.
For example, on a two-dimensional problem that may be plotted as a surface, the shape structure would be composed of three points represented as a triangle.
The points of the shape structure are evaluated, and simple rules are used to decide how to move the points of the shape based on their relative evaluation. These include operations such as “reflection”, “expansion”, “contraction”, and “shrinkage” of the simplex shape on the surface of the objective function.
In a single iteration of the Nelder-Mead algorithm, we seek to remove the vertex with the worst function value and replace it with another point with a better value. The new point is obtained by reflecting, expanding, or contracting the simplex along the line joining the worst vertex with the centroid of the remaining vertices. If we cannot find a better point in this manner, we retain only the vertex with the best function value, and we shrink the simplex by moving all other vertices toward this point.
The search stops when the points converge on an optimum, when a minimal difference between evaluations is observed, or when a maximum number of function evaluations is performed.
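To make these operations concrete, below is a minimal sketch of a single Nelder-Mead iteration for minimization, written with the commonly cited coefficients (reflection 1.0, expansion 2.0, contraction 0.5, shrinkage 0.5) and a simplified contraction step; the function name and structure are illustrative assumptions, not the exact implementation used by any particular library.

# minimal sketch of one Nelder-Mead iteration (coefficients and structure are illustrative)
import numpy as np

def nelder_mead_step(simplex, f, alpha=1.0, gamma=2.0, rho=0.5, sigma=0.5):
    # sort vertices from best (lowest) to worst (highest) function value
    simplex = sorted(simplex, key=f)
    best, worst = simplex[0], simplex[-1]
    # centroid of all vertices except the worst
    centroid = np.mean(simplex[:-1], axis=0)
    # reflection: mirror the worst vertex through the centroid
    reflected = centroid + alpha * (centroid - worst)
    if f(best) <= f(reflected) < f(simplex[-2]):
        simplex[-1] = reflected
    elif f(reflected) < f(best):
        # expansion: push further along the reflected direction
        expanded = centroid + gamma * (reflected - centroid)
        simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
    else:
        # contraction: pull the worst vertex toward the centroid
        contracted = centroid + rho * (worst - centroid)
        if f(contracted) < f(worst):
            simplex[-1] = contracted
        else:
            # shrinkage: move every vertex toward the best vertex
            simplex = [best + sigma * (v - best) for v in simplex]
    return simplex

The simplex here is a list of n + 1 NumPy arrays; repeating this step until the vertices stop improving is, in essence, the whole algorithm.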
Now that we have a high-level idea of how the algorithm works, let’s look at how we might use it in practice.
Nelder-Mead Example in Python
The Nelder-Mead optimization algorithm can be used in Python via the scipy.optimize minimize() function.
This function requires the “method” argument to be set to “nelder-mead” to use the Nelder-Mead algorithm. It takes the objective function to be minimized and an initial point for the search.
...
# perform the search
result = minimize(objective, pt, method='nelder-mead')
The result is an OptimizeResult object that contains information about the outcome of the optimization, accessible via keys.
For example, the “success” boolean indicates whether or not the search completed successfully, the “message” provides a human-readable message about the success or failure of the search, and the “nfev” key indicates the number of function evaluations that were performed.
Importantly, the “x” key specifies the input values that represent the optima found by the search, if successful.
...
# summarize the result
print('Status : %s' % result['message'])
print('Total Evaluations: %d' % result['nfev'])
print('Solution: %s' % result['x'])
We can demonstrate the Nelder-Mead optimization algorithm on a well-behaved function to show that it can quickly and efficiently locate the optima without using any derivative information from the function.
In this case, we will use the x^2 function in two dimensions, defined over the range -5.0 to 5.0, with the known optima at [0.0, 0.0].
We can define the objective() function below:
# objective function
def objective(x):
    return x[0]**2.0 + x[1]**2.0
We will use a random point in the defined domain as the starting point for the search.
...
# define range for input
r_min, r_max = -5.0, 5.0
# define the starting point as a random sample from the domain
pt = r_min + rand(2) * (r_max - r_min)
The search can then be performed. We use the default maximum number of function evaluations, set via “maxiter” to N*200, where N is the number of input variables, which is two in this case, i.e. 400 evaluations.
...
# perform the search
result = minimize(objective, pt, method='nelder-mead')
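If you want more control over the search budget or stopping tolerances than the defaults provide, solver-specific settings can be passed via the “options” argument of minimize(); the sketch below uses the option names documented for SciPy’s Nelder-Mead solver (maxiter, xatol, fatol), and the specific values are arbitrary illustrations rather than recommendations.

...
# perform the search with an explicit evaluation budget and tolerances (values are illustrative)
result = minimize(objective, pt, method='nelder-mead',
                  options={'maxiter': 500, 'xatol': 1e-6, 'fatol': 1e-6})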
After the search has completed, we will report the total function evaluations used to find the optima and the success message of the search, which we expect to be positive in this case.
...
# summarize the result
print('Status : %s' % result['message'])
print('Total Evaluations: %d' % result['nfev'])
Finally, we will retrieve the input values for the located optima, evaluate them using the objective function, and report both in a human-readable form.
...
# evaluate solution
solution = result['x']
evaluation = objective(solution)
print('Solution: f(%s) = %.5f' % (solution, evaluation))
Tying this together, the complete example of using the Nelder-Mead optimization algorithm on a simple convex objective function is listed below.
# nelder-mead optimization of a convex function
from scipy.optimize import minimize
from numpy.random import rand

# objective function
def objective(x):
    return x[0]**2.0 + x[1]**2.0

# define range for input
r_min, r_max = -5.0, 5.0
# define the starting point as a random sample from the domain
pt = r_min + rand(2) * (r_max - r_min)
# perform the search
result = minimize(objective, pt, method='nelder-mead')
# summarize the result
print('Status : %s' % result['message'])
print('Total Evaluations: %d' % result['nfev'])
# evaluate solution
solution = result['x']
evaluation = objective(solution)
print('Solution: f(%s) = %.5f' % (solution, evaluation))
Running the example performs the optimization, then reports the results.
Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.
In this case, we can see that the search was successful, as we expected, and completed after 88 function evaluations.
We can see that the optima was located with inputs very close to [0, 0], which evaluate to the minimum objective value of 0.0.
Status: Optimization terminated successfully.
Total Evaluations: 88
Solution: f([ 2.25680716e-05 -3.87021351e-05]) = 0.00000
Now that we have seen how to use the Nelder-Mead optimization algorithm successfully, let’s look at some examples where it does not perform so well.
Nelder-Mead on Challenging Functions
The Nelder-Mead optimization algorithm works well for a range of challenging nonlinear and non-differentiable objective functions.
Nevertheless, it can get stuck on multimodal optimization problems and noisy problems.
To make this concrete, let’s look at an example of each.
Noisy Optimization Problem
A noisy objective function is a function that gives different answers each time the same input is evaluated.
We can make an objective function artificially noisy by adding small Gaussian random numbers to the inputs prior to their evaluation.
For example, we can define a one-dimensional version of the x^2 function and use the randn() function to add small Gaussian random numbers to the input with a mean of 0.0 and a standard deviation of 0.3.
# objective function
def objective(x):
    return (x + randn(len(x))*0.3)**2.0
The noise will make the function challenging for the algorithm to optimize, and it will very likely not locate the optima at x=0.0.
The complete example of using Nelder-Mead to optimize the noisy objective function is listed below.
# nelder-mead optimization of noisy one-dimensional convex function
from scipy.optimize import minimize
from numpy.random import rand
from numpy.random import randn

# objective function
def objective(x):
    return (x + randn(len(x))*0.3)**2.0

# define range for input
r_min, r_max = -5.0, 5.0
# define the starting point as a random sample from the domain
pt = r_min + rand(1) * (r_max - r_min)
# perform the search
result = minimize(objective, pt, method='nelder-mead')
# summarize the result
print('Status : %s' % result['message'])
print('Total Evaluations: %d' % result['nfev'])
# evaluate solution
solution = result['x']
evaluation = objective(solution)
print('Solution: f(%s) = %.5f' % (solution, evaluation))
Running the example performs the optimization, then reports the results.
Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.
In this case, the algorithm does not converge and instead uses the maximum number of function evaluations, which is 200.
Status: Maximum number of function evaluations has been exceeded.
Total Evaluations: 200
Solution: f([-0.6918238]) = 0.79431
The algorithm may converge on some runs of the code, but it will arrive at a point away from the optima.
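To see this behaviour concretely, the short self-contained sketch below (not part of the original example) repeats the noisy optimization from several random starting points and prints each located solution; the solutions typically scatter around, and away from, the true optimum at x=0.0.

# repeated nelder-mead runs on the noisy one-dimensional convex function
from scipy.optimize import minimize
from numpy.random import rand
from numpy.random import randn

# noisy objective function
def objective(x):
    return (x + randn(len(x))*0.3)**2.0

# define range for input
r_min, r_max = -5.0, 5.0
# repeat the search from several random starting points
for i in range(5):
    pt = r_min + rand(1) * (r_max - r_min)
    result = minimize(objective, pt, method='nelder-mead')
    print('Run %d: solution=%s, converged=%s' % (i, result['x'], result['success']))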
Multimodal Optimization Problem
Many nonlinear objective functions may have multiple optima; these are referred to as multimodal problems.
The problem may be structured such that it has multiple global optima with an equivalent function evaluation, or a single global optima and multiple local optima where algorithms like Nelder-Mead can get stuck in a local optima while searching for the global optima.
The Ackley function is an example of the latter. It is a two-dimensional objective function that has a single global optima and multiple local optima.
# ackley multimodal function
from numpy import arange
from numpy import exp
from numpy import sqrt
from numpy import cos
from numpy import e
from numpy import pi
from numpy import meshgrid
from matplotlib import pyplot
from mpl_toolkits.mplot3d import Axes3D

# objective function
def objective(x, y):
    return -20.0 * exp(-0.2 * sqrt(0.5 * (x**2 + y**2))) - exp(0.5 * (cos(2 * pi * x) + cos(2 * pi * y))) + e + 20

# define range for input
r_min, r_max = -5.0, 5.0
# sample input range uniformly at 0.1 increments
xaxis = arange(r_min, r_max, 0.1)
yaxis = arange(r_min, r_max, 0.1)
# create a mesh from the axis
x, y = meshgrid(xaxis, yaxis)
# compute targets
results = objective(x, y)
# create a surface plot with the jet color scheme
figure = pyplot.figure()
axis = figure.add_subplot(projection='3d')
axis.plot_surface(x, y, results, cmap='jet')
# show the plot
pyplot.show()
Running the example creates the surface plot of the Ackley function, showing the large number of local optima.
We would expect the Nelder-Mead algorithm to get stuck in one of the local optima while searching for the global optima.
Initially, when the simplex is large, the algorithm may jump over many local optima, but as the simplex contracts, it is likely to get stuck.
We can explore this with the example below that demonstrates the Nelder-Mead algorithm on the Ackley function.
# nelder-mead for multimodal function optimization
from scipy.optimize import minimize
from numpy.random import rand
from numpy import exp
from numpy import sqrt
from numpy import cos
from numpy import e
from numpy import pi

# objective function
def objective(v):
    x, y = v
    return -20.0 * exp(-0.2 * sqrt(0.5 * (x**2 + y**2))) - exp(0.5 * (cos(2 * pi * x) + cos(2 * pi * y))) + e + 20

# define range for input
r_min, r_max = -5.0, 5.0
# define the starting point as a random sample from the domain
pt = r_min + rand(2) * (r_max - r_min)
# perform the search
result = minimize(objective, pt, method='nelder-mead')
# summarize the result
print('Status : %s' % result['message'])
print('Total Evaluations: %d' % result['nfev'])
# evaluate solution
solution = result['x']
evaluation = objective(solution)
print('Solution: f(%s) = %.5f' % (solution, evaluation))
Running the example performs the optimization, then reports the results.
Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.
In this case, we can see that the search completed successfully but did not locate the global optima. It got stuck and found a local optima instead.
Each time we run the example, we will find a different local optima given the different random starting point for the search.
Status: Optimization terminated successfully.
Total Evaluations: 62
Solution: f([-4.9831427 -3.98656015]) = 11.90126
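As noted earlier in the tutorial, one practical way to reduce the chance of ending in a poor local optima is to restart the search from several random starting points and keep the best solution found. Below is a minimal sketch of such a restart loop for the Ackley example above; the number of restarts is an arbitrary choice and does not guarantee that the global optima will be found.

# nelder-mead with random restarts on the ackley function
from scipy.optimize import minimize
from numpy.random import rand
from numpy import exp
from numpy import sqrt
from numpy import cos
from numpy import e
from numpy import pi

# objective function
def objective(v):
    x, y = v
    return -20.0 * exp(-0.2 * sqrt(0.5 * (x**2 + y**2))) - exp(0.5 * (cos(2 * pi * x) + cos(2 * pi * y))) + e + 20

# define range for input
r_min, r_max = -5.0, 5.0
best_solution, best_eval = None, float('inf')
# restart the search from several random starting points and keep the best result
for i in range(30):
    pt = r_min + rand(2) * (r_max - r_min)
    result = minimize(objective, pt, method='nelder-mead')
    if result['fun'] < best_eval:
        best_solution, best_eval = result['x'], result['fun']
# report the best solution found across all restarts
print('Best: f(%s) = %.5f' % (best_solution, best_eval))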
Further Reading
This section provides additional resources on the topic if you are looking to go deeper.
Papers
A Simplex Method for Function Minimization, 1965
Books
Algorithms for Optimization, 2019
Numerical Optimization, 2006
APIs
Nelder-Mead Simplex algorithm (method='Nelder-Mead')
scipy.optimize.minimize API
scipy.optimize.OptimizeResult API
numpy.random.randn API
Articles
Nelder-Mead method, Wikipedia
Nelder-Mead algorithm, Wikipedia
Conclusion
In this guide, you discovered the Nelder-Mead optimization algorithm.
Specifically, you learned:
- The Nelder-Mead optimization algorithm is a type of pattern search that does not use function gradients.
- How to apply the Nelder-Mead algorithm for function optimization in Python.
- How to interpret the results of the Nelder-Mead algorithm on noisy and multimodal objective functions.