Data Science Fundamentals
Data Scientists tackle a wide range of real-life problems using data and various techniques. Mathematical optimisation, a powerful technique that can be applied to problems in many domains, is a valuable addition to a Data Scientist’s toolkit. In this practical introductory post, we will familiarise ourselves with three popular optimisation libraries in Python: Google’s OR-Tools, IBM’s DOcplex and COIN-OR Foundation’s PuLP.
Mathematical optimisation is about finding the optimal choice for a quantitative problem within predefined bounds. It has three components:
- Objective function(s): Tells us how good a solution is and allows us to compare solutions. An optimal solution is one that maximises or minimises the objective function, depending on the use case.
▶ ️In some cases, there can be multiple objective functions. This adds complexity in determining what an optimal solution is.
▶ ️In some cases, there may be no objective function. Such optimisation problems are called feasibility problems.
- Decision variable(s): Represent the value or values we want to find, that is, the answer we are looking for in a quantitative problem. Optimisation can be split into two kinds depending on the type of decision variables:
▶ ️ Discrete optimisation: Decision variables are discrete. Allocating a timetable and finding the shortest travel path between two locations are examples of discrete optimisation. If you want to learn more about discrete optimisation, this course and/or this guide might be of interest to you.
▶ ️Continuous optimisation: Decision variables are continuous. You may have already heard the term optimisation in the context of machine learning, which is one area where continuous optimisation is used. If you want to learn more about continuous optimisation, you may find this tutorial useful.
- Constraint(s): Define the feasible range of solutions for the decision variables.
▶ ️In some continuous optimisation problems, there may be no constraints. This is called unconstrained optimisation.
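To make the three components concrete, here is a minimal sketch in plain Python of a tiny discrete optimisation problem solved by brute force. The problem itself (two products, the profit coefficients and the resource limits) is made up purely for illustration, and exhaustive enumeration is not how the libraries mentioned above work; it is only practical here because the search space is tiny.

```python
from itertools import product

# Decision variables: integers x and y (hypothetical units of two products),
# each allowed to take a value from 0 to 10.
candidates = product(range(11), repeat=2)


def objective(x, y):
    """Objective function: profit to maximise (made-up coefficients)."""
    return 3 * x + 2 * y


def is_feasible(x, y):
    """Constraints: made-up limits on two shared resources."""
    return x + y <= 8 and 2 * x + y <= 10


# Keep only the solutions that satisfy every constraint.
feasible = [(x, y) for x, y in candidates if is_feasible(x, y)]

# The optimal solution is the feasible one with the highest objective value.
best = max(feasible, key=lambda solution: objective(*solution))
best_value = objective(*best)

print(best, best_value)  # → (2, 6) 18
```

If we dropped the objective function and simply asked for any element of `feasible`, this would become a feasibility problem.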