Heuristics, Approximation Algorithms, and Relaxations: An Introduction
Heuristics
While all NP-hard optimization problems are equally hard to solve exactly, they may differ wildly from the approximation point of view. If the goal is merely to obtain an answer that is “good enough”, some problems become much easier (such as $\text{KNAPSACK}$), while others (such as $\text{CLIQUE}$) remain extremely hard. When efficient algorithms or traditional techniques do not exist, problems are solved using “heuristic methods”.
A heuristic algorithm or heuristic is a procedure that tends to give reasonable or near-optimal answers for an optimization problem, but that does so by sacrificing optimality, completeness, accuracy, or precision for speed. Essentially, these are shortcuts, taken when other, more exact methods are either too slow, unknown, or proven to be non-existent. Some heuristic methods may guarantee an optimal solution, but take exponential time; some may run in polynomial time but not guarantee an optimal (correct) solution. Some heuristics are based on a strong foundation of underlying theory, whereas others are based mostly on prior experience or “rules of thumb” (the latter are more prone to producing erroneous results). Heuristic methods underpin the entire field of Artificial Intelligence.
Heuristic methods can be broadly categorized into construction methods (greedy algorithms) and local search methods. Greedy algorithms attempt to make the optimal decision at each stage, in hopes that this will lead to the overall optimal decision. (For example, Dijkstra’s algorithm is a greedy algorithm.) Local search algorithms start with an initial solution, and explore the neighbourhood of the current solution. A new solution is chosen from this neighbourhood, replacing the initial solution. This process is run iteratively until a termination condition is met.
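To make the local search idea concrete, here is a toy sketch of my own (not from any source cited here) of hill-climbing local search for MAX-CUT: start from an arbitrary partition, and move to a neighbouring solution (flip one vertex across the partition) whenever that increases the cut, terminating at a local optimum.

```python
def local_search_maxcut(nodes, edges):
    """Local search for MAX-CUT: repeatedly flip a single vertex to the
    other side of the partition whenever the flip improves the cut value."""
    side = {v: 0 for v in nodes}  # initial solution: everyone on side 0
    cut = lambda: sum(side[u] != side[v] for u, v in edges)
    improved = True
    while improved:
        improved = False
        for v in nodes:
            before = cut()
            side[v] ^= 1          # neighbourhood move: flip vertex v
            if cut() > before:
                improved = True   # keep the better neighbouring solution
            else:
                side[v] ^= 1      # revert: flip did not improve the cut
    return side, cut()

side, value = local_search_maxcut([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
```

On a triangle, this stops with 2 of the 3 edges cut, which here happens to be globally optimal; in general, local search only guarantees a local optimum.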
Examples of heuristics include backtrack search (plus its variants), mathematical programming methods, genetic algorithms, tabu search, and simulated annealing.
Approximation Algorithms
Typically, in computer science, when analyzing computational and space complexity of algorithms, we ignore constants and treat them as negligible. However, when viewing algorithmic complexity in real-world contexts, things like constants can actually matter. It is, in fact, possible for constants to get so large that an algorithm is functionally unusable. (See this page on “galactic algorithms”.) Just because an algorithm “runs in polynomial time” doesn’t mean it won’t be excruciatingly slow or run out of memory on real-world machines. This is, for example, the case when examining the “polynomial time” exact algorithms for solving the Densest Subgraph Problem.
When exact algorithms prove too cumbersome (or possibly, don’t exist), approximation algorithms can be an alternate method for attempting to solve problems. Instead of designing an algorithm that always produces the optimal answer, one can design an algorithm that always gets “close enough” to the optimal answer. While there is a tradeoff in terms of accuracy, there are generally huge gains in terms of speed or space complexity.
The goal of approximation algorithms is to provide fast (polynomial-time) algorithms that give some sort of guarantee on the “quality” of the solution found. Each instance of an optimization problem has a set of feasible solutions, and every optimization problem has an objective function that assigns a real or rational number to each feasible solution. In a minimization problem, we are looking for a feasible solution that minimizes the objective function; in a maximization problem, we are looking for one that maximizes it.
Formally, we define approximation algorithms in the following way: Let $P$ be a problem (this means $P$ is the space of all possible inputs, for example, the set of all graphs), and let $I \in P$ be an instance to the problem ($I$ is a particular input, such as a specific graph $G$). Let $OPT(I)$ denote the value of the optimal solution for the problem (for example, the density of the densest subgraph of $G$). An algorithm $\mathcal{A}$ is called an $\alpha$-approximation for $P$ if:
For a maximization problem, $\alpha \in (0,1]$ and, for all $I \in P$: $\alpha \cdot OPT(I) \leq \mathcal{A}(I) \leq OPT(I)$ (a lower bound on the value returned).
For a minimization problem, $\alpha \geq 1$ and, for all $I \in P$: $OPT(I) \leq \mathcal{A}(I) \leq \alpha \cdot OPT(I)$ (an upper bound on the value returned).
Note: some literature uses an alternative convention for maximization problems, which redefines $\alpha$ as a number $\geq 1$ and requires $\mathcal{A}(I) \geq \frac{1}{\alpha} \cdot OPT(I)$. This keeps the same constants ($\geq 1$) for both maximization and minimization problems. While this notation is also in use, I personally find it more confusing and therefore don’t use it.
If I have a $\frac{1}{5}$-approximation for a maximization problem, that means the algorithm always returns a value that is at least 20% of the optimal value. Approximation algorithms always return a value that is within a multiplicative factor of $OPT(I)$.
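To make the minimization case concrete, here is a sketch of the classic maximal-matching 2-approximation for minimum VERTEX COVER ($\alpha = 2$), checked against a brute-force optimum on a small graph; the function names are my own illustrative choices.

```python
from itertools import combinations

def vertex_cover_2approx(edges):
    """Repeatedly pick an uncovered edge and add BOTH of its endpoints.
    The picked edges form a matching, and any cover must contain at least
    one endpoint of each matching edge, so |cover| <= 2 * OPT."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover

def vertex_cover_exact(nodes, edges):
    """Brute-force minimum vertex cover (exponential; tiny graphs only)."""
    for k in range(len(nodes) + 1):
        for subset in combinations(nodes, k):
            s = set(subset)
            if all(u in s or v in s for u, v in edges):
                return s

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
approx = vertex_cover_2approx(edges)        # size 4 on this graph
opt = vertex_cover_exact(range(4), edges)   # {0, 2}, size 2
assert len(opt) <= len(approx) <= 2 * len(opt)   # OPT <= A(I) <= 2 * OPT
```

Note that the guarantee is worst-case: on many instances the greedy cover is much closer to optimal than the factor of 2 suggests.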
The “gold standard” for an approximation algorithm is a $(1-\epsilon)$-approximation algorithm (or $(1+\epsilon)$-approximation algorithm, for a minimization problem). These are algorithms which allow you to specify how much error ($\epsilon$) you’re willing to tolerate, at the cost of the number of iterations or runtime of the algorithm depending on $\epsilon$ in some way (usually some power of $\epsilon$ is in the denominator).
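KNAPSACK is the standard example of such a scheme: scaling the item values down by a factor that depends on $\epsilon$, then running an exact dynamic program on the scaled values, gives a $(1-\epsilon)$-approximation whose runtime grows as $\epsilon$ shrinks. A rough sketch of this well-known FPTAS, with illustrative names of my own:

```python
def knapsack_fptas(values, weights, capacity, eps):
    """FPTAS sketch for 0/1 KNAPSACK: scale values down by K = eps*vmax/n,
    then solve the scaled instance exactly by dynamic programming.
    Scaling loses at most eps * OPT of value, so the answer is >= (1-eps)*OPT."""
    n = len(values)
    K = eps * max(values) / n
    scaled = [int(v / K) for v in values]
    V = sum(scaled)
    INF = float("inf")
    # dp[s] = (min weight, chosen item indices) achieving scaled value s
    dp = [(INF, ())] * (V + 1)
    dp[0] = (0, ())
    for i in range(n):
        for s in range(V, scaled[i] - 1, -1):
            w_prev, items = dp[s - scaled[i]]
            if w_prev + weights[i] < dp[s][0]:
                dp[s] = (w_prev + weights[i], items + (i,))
    best_s = max(s for s in range(V + 1) if dp[s][0] <= capacity)
    chosen = dp[best_s][1]
    return sum(values[i] for i in chosen), chosen

value, items = knapsack_fptas([60, 100, 120], [10, 20, 30], 50, eps=0.1)
```

The DP table has $O(n \cdot V)$ entries with $V = O(n^2/\epsilon)$, so the $\frac{1}{\epsilon}$ factor shows up in the runtime exactly as described above: tighter tolerance, slower algorithm.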
Remark: The approximation ratio of an algorithm for a minimization problem is the supremum, over all instances of the problem, of the ratio between the value of the solution returned by the algorithm and the optimal value (for maximization problems, the infimum). It is therefore a bound on the worst-case performance of the algorithm.
Relaxations
Relaxations are an important technique in the design of approximation algorithms. The idea is that you loosen one of the constraints of a problem, enlarging the feasible solution space. This new problem should be easier to solve than the initial problem. The solution to the relaxation should then give insight into how to solve, or at least approximate, the original problem.
A relaxation of a maximization problem is another maximization problem with two properties:
- The original problem’s set of feasible solutions is a subset of the relaxed problem’s set of feasible solutions.
- The relaxed problem’s optimal value is an upper bound on the original problem’s optimal value (enlarging the feasible set of a maximization problem can only increase the maximum).
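A standard example (my illustration): fractional KNAPSACK is a relaxation of 0/1 KNAPSACK. Every 0/1 solution is also a feasible fractional solution, and the fractional optimum, which a simple greedy by value density computes exactly, upper-bounds the 0/1 optimum.

```python
from itertools import combinations

def fractional_knapsack(values, weights, capacity):
    """Relaxation: items may be taken fractionally. Greedy by value/weight
    density is provably optimal for this relaxed problem."""
    total = 0.0
    for v, w in sorted(zip(values, weights),
                       key=lambda vw: vw[0] / vw[1], reverse=True):
        take = min(w, capacity)   # take as much of the densest item as fits
        total += v * take / w
        capacity -= take
    return total

def knapsack_01_exact(values, weights, capacity):
    """Brute-force 0/1 optimum (exponential; tiny instances only)."""
    n, best = len(values), 0
    for k in range(n + 1):
        for s in combinations(range(n), k):
            if sum(weights[i] for i in s) <= capacity:
                best = max(best, sum(values[i] for i in s))
    return best

vals, wts, cap = [60, 100, 120], [10, 20, 30], 50
relaxed = fractional_knapsack(vals, wts, cap)   # 240.0
integral = knapsack_01_exact(vals, wts, cap)    # 220
assert relaxed >= integral   # relaxed optimum bounds the original optimum
```

This bound is exactly what branch-and-bound solvers and LP-rounding approximation algorithms exploit: the relaxation's value certifies how far a candidate 0/1 solution can be from optimal.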
References and Further Reading
- Chekuri’s Notes from his Approximation Algorithms course (CS 583, Spring 2018, Introduction Lecture)
- Cornell University Computational Optimization Open Textbook, Heuristic Algorithms page
- The Wikipedia article on heuristics
- Local Search Algorithms in AI: A Comprehensive Guide
- Wikipedia article on Relaxations