### An Intro to Optimization or Mathematical Programming

Whether it is a supervised learning problem or an unsupervised one, there will be some optimization algorithm operating in the background. Nearly any classification, regression, or clustering problem can be cast as an optimization problem.

In this guide, you will learn about optimization and some of the ideas connected to it.

After going through this guide, you will be aware of:

1. What mathematical programming, or optimization, is

2. The difference between maximization and minimization problems

3. The difference between local and global optimal solutions

4. The difference between constrained and unconstrained optimization

5. The difference between linear and non-linear programming

6. Examples of optimization

__Tutorial Overview__

This tutorial is divided into two parts:

1. Various introductory topics related to optimization

    - Constrained vs. unconstrained optimization
    - Equality vs. inequality constraints
    - Feasible region

2. Examples of optimization within machine learning

__What is Optimization or Mathematical Programming?__

In calculus and mathematics, an optimization problem is also referred to as a mathematical programming problem. In simple terms, it is the process of finding an element, variable, or quantity that best fits a set of given criteria or constraints.

__Maximization vs. Minimization Problems__

The simplest optimization problems are the minimization or maximization of scalar functions. If we have a scalar function of one or more variables, f(x_1, x_2, …, x_n), then the following is an optimization problem:

Find x_1, x_2, …, x_n where f(x) is at its minimum

Or we can pose an equivalent maximization problem.

When we define functions that quantify errors or penalties, we pose a minimization problem. Conversely, if a learning algorithm builds a function that models the accuracy of a method, we maximize that function.

Many automated software tools for optimization implement either a maximization or a minimization task, but not both. However, we can convert a maximization problem into a minimization problem (and vice versa) by negating f(x), i.e.

Maximize f(x) w.r.t. x is equivalent to Minimize -f(x) w.r.t. x

As the two problems are equivalent, we'll discuss only one of them in the remainder of the tutorial; the same rules and definitions apply to its equivalent.
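To make the conversion concrete, here is a minimal Python sketch. The function, the learning rate, and the `minimize` helper are my own illustrative choices, not from the text; any off-the-shelf minimizer would serve the same role.

```python
def minimize(f, x0, lr=0.1, steps=1000, eps=1e-6):
    """Minimize a 1-D function with gradient descent and a numerical derivative."""
    x = x0
    for _ in range(steps):
        grad = (f(x + eps) - f(x - eps)) / (2 * eps)  # central difference
        x -= lr * grad
    return x

def f(x):
    # a concave function with its maximum at x = 3
    return -(x - 3) ** 2 + 5

# maximize f(x) by minimizing its negation -f(x)
x_star = minimize(lambda x: -f(x), x0=0.0)
print(round(x_star, 3))  # ≈ 3.0
```

The minimizer never needs to know it is being used for maximization; the sign flip does all the work.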

__Global vs. Local Optimum Points__

In machine learning, we often encounter functions that are highly non-linear, with a complicated landscape. There may be a point where the function has its lowest value only within a small, local region around that point. Such a point is called a local minimum point.

This is in contrast to the global minimum point, where the function has its least value over its entire domain. The figure below shows local and global optimum points.
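To see how the starting point determines which minimum a gradient-based method reaches, here is a hypothetical sketch; the function, step size, and starting points are my own choices for illustration:

```python
def gradient_descent(f, x0, lr=0.01, steps=2000, eps=1e-6):
    """Follow the (numerical) gradient downhill from x0."""
    x = x0
    for _ in range(steps):
        grad = (f(x + eps) - f(x - eps)) / (2 * eps)
        x -= lr * grad
    return x

def f(x):
    # a non-linear function with one local and one global minimum
    return x ** 4 - 3 * x ** 2 + x

x_local = gradient_descent(f, x0=2.0)    # descends into the local minimum
x_global = gradient_descent(f, x0=-2.0)  # descends into the global minimum

# both are minima, but the local one has a higher function value
print(f(x_global) < f(x_local))  # True
```

Gradient descent only sees the slope around its current point, so it settles into whichever basin it starts in; this is exactly the local-vs-global distinction above.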

__Unconstrained vs. Constrained Optimization__

In many machine learning problems, we are interested in finding the global optimum point with no constraints or restrictions on the search region. Such problems are called unconstrained optimization problems.

At times we have to solve an optimization problem subject to certain constraints. Such problems are called constrained optimization problems. For instance:

Minimize x^2 + y^2 subject to x + y <= 1

Examples of constrained optimization include:

- Find the minimum of a function when the variables in the domain must sum to one
- Find the minimum of a function such that certain vectors are normal to one another
- Find the minimum of a function such that certain domain variables lie in a specific range.
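As a sketch of the first item above, one common trick is to fold an equality constraint into the objective as a quadratic penalty. Everything below (the `penalized` helper, the penalty weight `mu`, the step size) is my own illustrative choice, not from the text: we minimize x^2 + y^2 subject to x + y = 1, whose analytic solution is x = y = 0.5.

```python
def numerical_grad(f, v, eps=1e-6):
    # central-difference gradient of f at point v
    g = []
    for i in range(len(v)):
        vp, vm = list(v), list(v)
        vp[i] += eps
        vm[i] -= eps
        g.append((f(vp) - f(vm)) / (2 * eps))
    return g

def penalized(v, mu=100.0):
    # objective x^2 + y^2 plus a quadratic penalty for violating x + y = 1
    x, y = v
    return x ** 2 + y ** 2 + mu * (x + y - 1) ** 2

v = [0.0, 0.0]
lr = 0.004
for _ in range(500):
    g = numerical_grad(penalized, v)
    v = [vi - lr * gi for vi, gi in zip(v, g)]

print([round(vi, 2) for vi in v])  # ≈ [0.5, 0.5]
```

The penalty weight trades off constraint satisfaction against the objective; as `mu` grows, the unconstrained minimum of the penalized function approaches the constrained solution.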

__Feasible Region__

All the points in space where the constraints of the problem hold true make up the feasible region. An optimization algorithm searches for optimal points within the feasible region. The feasible region for the two types of constraints is shown in the figure in the next section.

For an unconstrained optimization problem, the entire domain of the function is the feasible region.
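A feasibility test is simply a check of the constraints. Using the earlier example constraint x + y <= 1, a minimal sketch (the candidate points are arbitrary values of my own choosing):

```python
candidates = [(0.0, 0.0), (0.3, 0.4), (0.8, 0.9), (1.5, -0.7)]

def is_feasible(x, y):
    # constraint from the earlier example: x + y <= 1
    return x + y <= 1

# keep only the points that lie in the feasible region
feasible = [p for p in candidates if is_feasible(*p)]
print(feasible)  # [(0.0, 0.0), (0.3, 0.4), (1.5, -0.7)]
```

Only points that pass this check are even candidates for the constrained optimum; (0.8, 0.9) is rejected because 0.8 + 0.9 > 1.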

__Equality vs. Inequality Constraints__

The constraints imposed in an optimization problem can be equality constraints or inequality constraints. The figure below shows the two types of constraints.

__Linear vs. Non-linear Programming__

An optimization problem where the objective function is linear and all equality or inequality constraints are also linear is called a linear programming problem.

If either the objective function or one or more of the constraints is non-linear, then we have a non-linear programming problem.

To visualize the difference between linear and non-linear functions, see the figure below:
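As a small linear programming sketch (the LP itself and all names below are my own illustrative choices, not from the text): for a two-variable LP, an optimum, if one exists, lies at a vertex of the feasible polygon, so we can simply enumerate the intersections of constraint boundaries.

```python
from itertools import combinations

# maximize 3x + 2y subject to x + y <= 4, x <= 2, x >= 0, y >= 0
# each constraint is written in the form a*x + b*y <= c
constraints = [
    (1, 1, 4),    # x + y <= 4
    (1, 0, 2),    # x <= 2
    (-1, 0, 0),   # x >= 0
    (0, -1, 0),   # y >= 0
]

def feasible(x, y, tol=1e-9):
    return all(a * x + b * y <= c + tol for a, b, c in constraints)

best = None
for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
    det = a1 * b2 - a2 * b1
    if det == 0:
        continue  # parallel boundaries, no intersection
    # solve the 2x2 system of boundary equations by Cramer's rule
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    if feasible(x, y):
        value = 3 * x + 2 * y
        if best is None or value > best[0]:
            best = (value, x, y)

print(best)  # (10.0, 2.0, 2.0): maximum of 10 at x = 2, y = 2
```

Vertex enumeration is only practical for tiny problems; real solvers use the simplex method or interior-point methods, but they exploit the same fact that linear objectives are optimized at vertices.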

__Examples of Optimization within Machine Learning__

Listed here are some well-known machine learning algorithms that employ optimization. Keep in mind that nearly all machine learning algorithms use some form of optimization.

- Gradient descent within neural networks (unconstrained optimization)
- Method of Lagrange multiplier in support vector machines (constrained optimization)
- Principal component analysis (constrained optimization)
- Clustering through expectation maximization algorithm (constrained optimization)
- Logistic Regression (unconstrained optimization)
- Genetic algorithms in evolutionary learning (different variants exist to solve both constrained and unconstrained optimization problems)
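As one concrete instance from the list above, here is a sketch of logistic regression fitted by gradient descent, i.e. unconstrained minimization of the cross-entropy loss. The tiny one-dimensional dataset, learning rate, and iteration count are my own made-up choices for illustration:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# toy 1-D dataset: label 1 when the feature is positive
data = [(-2.0, 0), (-1.0, 0), (-0.5, 0), (0.5, 1), (1.0, 1), (2.0, 1)]

# fit weight and bias by gradient descent on the cross-entropy loss
w, b, lr = 0.0, 0.0, 0.5
for _ in range(2000):
    gw = gb = 0.0
    for x, y in data:
        p = sigmoid(w * x + b)
        gw += (p - y) * x  # d(loss)/dw summed over the dataset
        gb += (p - y)      # d(loss)/db
    w -= lr * gw
    b -= lr * gb

predictions = [1 if sigmoid(w * x + b) > 0.5 else 0 for x, _ in data]
print(predictions)  # [0, 0, 0, 1, 1, 1]
```

The training loop is nothing more than an unconstrained minimizer: there is no feasible region to respect, only a loss surface to descend.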

__Extensions__

This section lists some ideas for extending the tutorial that you may wish to explore.

- Method of Lagrange multipliers
- Non-linear optimization techniques
- The simplex method

__Further Reading__

This section provides more resources on the topic if you are looking to go deeper.

*Tutorials*

Function of several variables and gradient vectors

Gradient descent for machine learning

Why optimization is critical in machine learning

How to select an optimization algorithm

*Books*

Thomas’ Calculus, 14^{th} Edition, 2017 (based on the original works of George B. Thomas, revised by Joel Hass, Christopher Heil, Maurice Weir.)

Calculus, 3^{rd} edition, 2017 (Gilbert Strang)

Calculus, 8^{th} edition, 2015 (James Stewart)

__Conclusion__

In this guide, you discovered what mathematical programming, or optimization, is. Specifically, you learned:

- Maximization vs. minimization
- Constrained vs. unconstrained optimization
- Why optimization is critical in machine learning