Polynomial Time and “Efficient” Algorithms

Jimmy Wu
Probably Approximately Correct
6 min read · Jul 21, 2016

Why do computational theorists consider the set of “efficient” algorithms to be those that run in polynomial time? Why not some other limit, like O(n³)? This same question is also often phrased, a little accusingly:

Do theorists actually consider an O(n¹⁰⁰)-time algorithm fast in practice?

Let’s get this out of the way: the answer is unequivocally no.

One then wonders, does the claim “polynomial-time ≡ efficient” still have a compelling argument, something beyond simply “fast in practice”? This is a very fair question, yet one that algorithms courses rarely address. No explanation appears in my favorite algorithms textbook, nor until page 1,000+ in the encyclopedia pretending to be an educational text used by most universities.

Here we’ll explore a few justifications for, and limitations of, “poly(n) as efficient”.

Reason 1: Polynomial Properties and a Programmer’s Intuition

Consider an elementary fact:

Polynomials are closed under addition and multiplication: if p(x) and q(x) are polynomials, then so are p(x) + q(x) and p(x)q(x).

Now suppose a programmer wants to formally define “efficient”. They reason as follows:

  • Should an algorithm with running time O(n) be considered efficient? Of course—linear time is needed merely to read its input.
  • Naturally, we use programs as subroutines within larger programs. It’s sensible to say that if an algorithm B calls an efficient algorithm A in an efficient manner, then B should also be considered efficient. For example,
procedure B(x):
    n = |x|
    do n times:
        A(x)   // A is some efficient procedure
  • Ultimately, the main procedure sits atop some constant number of subroutine layers (constant because the codebase itself is fixed).

But every time we nest a procedure within another, we multiply their running times; whenever two procedures are placed in series, we add them. The possible time bounds from this nested construction make up precisely the polynomials: O(n^c) for any fixed c>0.
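To make this concrete, here is a small Python sketch (the names scan_pairs and call_repeatedly are made up purely for illustration): the subroutine takes O(n²) time, the caller invokes it n times, and the total is O(n³), still a polynomial.

def scan_pairs(x):
    # Hypothetical "efficient" subroutine A: compares all pairs, O(n^2) time.
    n = len(x)
    count = 0
    for i in range(n):
        for j in range(n):
            if x[i] == x[j]:
                count += 1
    return count

def call_repeatedly(x):
    # Plays the role of B in the pseudocode above: calls A n times.
    # Total running time: n * O(n^2) = O(n^3), still a polynomial.
    n = len(x)
    return [scan_pairs(x) for _ in range(n)]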

A second simple property:

Polynomials are closed under composition: if p(x) and q(x) are polynomials, then so is p(q(x)).

This means that if we define “efficient” as “polynomial-time”, then we are allowed to perform another natural operation: compose programs. Under this definition, we can write the following:

procedure D(x):
    return C(B(A(x)))   // C, B, and A are all efficient procedures

and the overall procedure D is still efficient in terms of the original input x. In fact, any constant number of compositions is fine. This is because when we compose programs, their running times are also composed. In particular, the output of A may be larger than the input x, but only polynomially larger; therefore B also spends at most poly(|x|) time, and so forth.
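An equally artificial Python sketch of such a composition (again, the stage functions are hypothetical): the first stage inflates its input quadratically, and the later stages do polynomial work on whatever they receive, so the pipeline remains polynomial in the size of the original input.

def stage_a(x):
    # Hypothetical efficient step whose output is quadratically larger than its input.
    return [item for item in x for _ in range(len(x))]  # size n -> size n^2

def stage_b(y):
    # Linear-time step on its own input (which may already have size n^2).
    return [2 * item for item in y]

def stage_c(z):
    # Another polynomial-time step.
    return sum(z)

def pipeline(x):
    # Composed running time: poly(n) + poly(n^2) + poly(n^2) = poly(n).
    return stage_c(stage_b(stage_a(x)))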

Sanity check: Neither the composition of programs nor the nesting of subroutines discussed earlier works out if we define “efficient” as taking only, say, O(n³) time. The difference:

The set {polynomials of degree at most 3} is not closed under composition or multiplication: composing n³ with n³ already gives n⁹.

All this is to say that from the perspective of computer programs, polynomial time is very well-motivated and forms the basis of a clean theory of complexity.

Reason 2: Robustness to Variations in Machine Model

Another wonderful property of polynomial time is that it is robust to different assumptions about the computer hardware. For example:

  • In the world of high-level programming abstractions, we are accustomed to thinking of computers as random access machines (RAM), in which accessing any location in memory is O(1)-time (sometimes instead defined as O(log N)-time to reach address N).
  • Meanwhile in the more primitive Turing machine (TM) model, accessing an arbitrary memory location involves sliding the head of the machine along a tape, one cell at a time, until the desired location is reached.
[Image: a cartoonized Turing machine]

Fortunately, it is known that if a RAM can compute something in T(n) steps, then a TM can do it in O(T(n)²) steps. Thus the complexity class P (problems solvable in poly(n) time) is the same on both Turing machines and RAM.
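Concretely: if a RAM algorithm runs in T(n) = O(n^c) steps for some constant c, then the simulating Turing machine takes O(T(n)²) = O(n^(2c)) steps, which is again a polynomial, just one of higher degree.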

More generally, complexity theorists hypothesize that any reasonable model of computation in the Universe can be simulated, with at most polynomial overhead, on a Turing machine. This so-called complexity-theoretic Church-Turing thesis has withstood many decades of new models of computation. (Recently it has been challenged by quantum computing, though it’s a wide-open question whether scalable quantum computers are physically realizable. Even if they are, they almost certainly will not crack NP-complete problems.)

Reason 3: Anecdotal Arguments from the Research History

One thing we should easily agree on is that algorithms expending exponential time—that is, Ω(2^poly(n))—should be considered inefficient on principle. After all, for many problems, O(2^poly(n)) can be achieved by the dumbest algorithm of all: “exhaustively try all possible solutions”. This method solves all problems in the large complexity class NP.

NP: problems for which we can check proposed solutions in polynomial time.
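For a concrete, if toy, Python illustration of both points (the clause encoding here is our own, chosen just for this sketch): checking a single proposed assignment takes polynomial time, and the dumbest algorithm simply runs that check on all 2^n assignments.

from itertools import product

def satisfies(formula, assignment):
    # Polynomial-time check: each clause is a list of (variable_index, is_negated)
    # literals, and the assignment is a tuple of booleans.
    return all(
        any(assignment[var] != negated for var, negated in clause)
        for clause in formula
    )

def brute_force_sat(formula, num_vars):
    # The dumbest algorithm of all: try every one of the 2^n assignments,
    # verifying each one in polynomial time.
    for assignment in product([False, True], repeat=num_vars):
        if satisfies(formula, assignment):
            return assignment
    return None

For example, the formula (x₀ ∨ ¬x₁) ∧ (x₁ ∨ x₂) would be encoded as [[(0, False), (1, True)], [(1, False), (2, False)]].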

A nearly-as-trivial idea works for all problems in an even larger class:

PSPACE: problems solvable in polynomial space, and allowed unlimited time.

In contrast, a polynomial-time algorithm is a special object, because it means that the overwhelming majority of potential solutions to the problem can be cleverly dismissed. In other words, polynomial time represents a clear antithesis to brute-force search.

“When you develop a polynomial-time algorithm for a problem, that’s really amazing. You’ve really done a good day’s work. There was this man named Jack Edmonds who, in his time in the 60’s, found these amazing algorithms for some very complicated problems — polynomial-time algorithms. I know for a fact that he slept very well that night.”
— Christos Papadimitriou

To be sure, there are plenty of not-too-strange functions that grow superpolynomially, yet are asymptotically much smaller than exponentials.

Example: Clearly O(n^log(n)) outgrows any polynomial. And since it can be written as O(2^log²(n)), it is also substantially subexponential.

Such “quasi-polynomial” running times are the best known for some prominent problems, such as graph isomorphism and planted clique.

Yet throughout the many decades that researchers have investigated computational problems, the vast majority of them have either admitted polynomial-time algorithms, or stubbornly resisted anything better than exponential.

  • Problems known to be in P: shortest path, 2-SAT, linear programming, bipartite matching, DNA sequence alignment, max flow, primality testing, and thousands more.
  • Problems with only exponential-time algorithms known: longest path, 3-SAT, integer linear programming, 3D matching, multiple sequence alignment, vehicle routing, chess, local Hamiltonian, and thousands more.

Meanwhile, the area in between (superpolynomial but subexponential) is much sparser.

In fact in the above lists, several corresponding examples (shortest vs. longest path, etc.) are close cousins. It turns out that this phenomenon—similar-sounding problems separated by an exponential valley—is extremely common. Empirically, our decision to make polynomial solutions the enemy of exponential ones is right on the mark.

Practical Matters and Beyond the Class P

Interestingly, by the time hierarchy theorem, we know that there exist problems solvable in time O(n³) but not O(n²), or even in O(n⁸¹) but not O(n⁸⁰ log⁵⁰ n). In other words, generally, giving a machine more time makes it strictly more powerful.

Despite this result, the real world does not seem to be littered with extravagant polynomial running times. For problems in P of any practical or theoretical interest, it is rare for the best known upper bound to remain worse than about O(n⁶) for very long (there are some exotic exceptions, for the interested). This leads computer scientists to believe that the large-exponent problems guaranteed to exist by the time hierarchy theorem are artificial-looking or contrived.

To be sure, an O(n⁶) algorithm is not just unappetizing; in most realistic settings, it is unusable. The rise of “big data” (a term at which we roll our eyes, but nonetheless an important phenomenon) means that even O(n³) is too costly, and O(n²) is pushing it—unless your problem is embarrassingly parallelizable. Developers and data scientists who work with tera- or petabytes of data are really looking for O(n log n) or O(n)-time algorithms.

Today’s programmer may face other daunting constraints:

  • The inputs may be given piece by piece in a data stream rather than all at once; algorithms that work with these are called streaming algorithms.
  • Their algorithm may be required to make decisions in real time, incrementally outputting something after seeing each data element. Online algorithms are the response to this need.

Amazingly, we know of many useful streaming algorithms that run in sublinear (o(n), pronounced “little-oh of n”) time or space—not even enough to read or store the entire input. Such algorithms are known for many big data-type applications, especially calculating statistics about a dataset (medians, distinct elements, higher frequency moments, etc.).
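As one concrete taste of the genre (the choice of example here is ours): the classic Misra-Gries summary finds the frequently occurring elements of a stream in a single pass while storing only k − 1 counters, no matter how long the stream is.

def misra_gries(stream, k):
    # One-pass summary using at most k-1 counters: O(k) space, independent of
    # the stream length. Any element occurring in more than a 1/k fraction of
    # the stream is guaranteed to survive among the returned counters.
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No room for a new counter: decrement all and evict those hitting zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters

A second pass over the stream (or exact counting of the few surviving candidates) weeds out false positives; the point is that the first pass never needs more than O(k) memory.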

This concludes our brief journey. Modern applications demand highly competitive algorithms under severe constraints, and researchers are working hard to deliver them. Yet the traditional notion of polynomial time and the class P remain near and dear to our hearts—not just as a coarse rule of thumb for efficiency, but really, because of their philosophical importance and their service as the bedrock for a robust theory of computation.
