天天看点

Quora上关于P, NP, NP-complete, and NP-hard问题的解答

NP问题的概念自己始终没有吃透。国内的博客有不少中文资料,但是写的好的真是很少。今天看到Quora上的关于NP问题的概念介绍,特别拿过来分享给大家,希望能帮助大家加深对该问题的理解。

Quora的关于P,NP等问题解释:

https://www.quora.com/What-are-P-NP-NP-complete-and-NP-hard

网页的pdf版本:

http://pan.baidu.com/s/1kTlO1bL

我不知道国内是否会屏蔽,这里我直接贴上部分原文,里边有不同人的解释。这里贴上两个浏览最多的解释,其他的我放在了一个pdf里,大家可以下载下来查看:

Jessica Su, Stanford algorithms TA

177.5k Views • Upvoted by Jeff Nelson, Invented Chromebook. Former Googler. • Elynn Lee, CS major at UT Austin, Google NYC 2013 and 2014 intern, Former Amazon and Facebook intern, research… • Shrey Banga (श्रेय), Platform Engineer at Quora

Jessica has 87 endorsements in Computer Science.

These refer to how long it takes a program to run.  Problems in class P can be solved with algorithms that run in  polynomial time.

Say you have an algorithm that finds the smallest integer in an array.  One way to do this is by iterating over all the integers of the array and keeping track of the smallest number you've seen up to that point.  Every time you look at an element, you compare it to the current minimum, and if it's smaller, you update the minimum.

How long does this take?  Let's say there are n elements in the array.  For every element the algorithm has to perform a constant number of operations.  Therefore we can say that the algorithm runs in O(n) time, or that the runtime is a linear function of how many elements are in the array.*  So this algorithm runs in  linear time.

You can also have algorithms that run in  quadratic time  (O(n^2)),  exponential time  (O(2^n)), or even  logarithmic time  (O(log n)).  Binary search (on a balanced tree) runs in logarithmic time because the height of the binary search tree is a logarithmic function of the number of elements in the tree.

If the running time is some polynomial function of the size of the input**, for instance if the algorithm runs in linear time or quadratic time or cubic time, then we say the algorithm runs in  polynomial time  and the problem it solves is in class  P .

NP

Now there are a lot of programs that don't (necessarily) run in polynomial time on a regular computer, but do run in polynomial time on a nondeterministic Turing machine.  These programs solve problems in  NP , which stands for nondeterministic polynomial time.   A nondeterministic Turing machine can do everything a regular computer can and more.***  This means all problems in P are also in NP.

An equivalent way to define NP is by pointing to the problems that can be verified in polynomial time.  This means there is not necessarily a polynomial-time way to find a solution, but once you have a solution it only takes polynomial time to verify that it is correct.

Some people think P = NP, which means any problem that can be verified in polynomial time can also be solved in polynomial time and vice versa.  If they could prove this, it would revolutionize computer science because people would be able to construct faster algorithms for a lot of important problems.

NP-hard

What does NP-hard mean?  A lot of times you can solve a problem by reducing it to a different problem.  I can reduce Problem B to Problem A if, given a solution to Problem A, I can easily construct a solution to Problem B.  (In this case, "easily" means "in polynomial time.")

If a problem is  NP-hard , this means I can reduce any problem in NP to that problem.  This means if I can solve that problem, I can easily solve any problem in NP.  If we could solve an NP-hard problem in polynomial time, this would prove P = NP.

NP-complete

A problem is  NP-complete  if the problem is both

  • NP-hard, and
  • in NP.

* A technical point: O(n) actually means the algorithm runs in  asymptotically  linear time, which means the time complexity approaches a line as n gets very large.  Also, O(n) is technically an upper bound, so if the algorithm ran in sublinear time you could still say it's O(n), even if that's not the best description of it.

** Note that if the input has many different parameters, like n and k, it might be polynomial in n and exponential in k

*** Per  Xuan Luo 's comment, deterministic and nondeterministic Turing machines can compute exactly the same things, since every nondeterministic Turing machine can be simulated by a deterministic Turing machine (a "regular computer").  However, they may compute things in different amounts of time. Updated 4 Jan 2013  •  View Upvotes

Alex Flint , PhD Student, AI/Computer Vision/Robotics, Oxford University 51.1k Views  •  Alex  is a Most Viewed Writer in  P vs. NP .

Complexity classes are one way to talk about how difficult or easy a problem is. Complexity theory gets very technical but the basics are actually extraordinarily intuitive, and it's possible to understand the P versus NP issue with very little math background.

The kinds of problems we deal with in complexity theory come in pairs: a "search" version and a "verification" version. For example --

  • Problem: sorting.

    Search version: input a list of numbers X and output the same list in sorted order (call it Y). 

    Verification version: input a list of numbers X and another list Y, and output "YES" if Y is the sorted version of X and "NO" otherwise.

  • Problem: graph coloring.

    Search version: input a network of nodes and edges X, and output colors Y for each node such that no adjacent nodes have the same color. 

    Verification version: input a network X and a set of colors Y and output "YES" if all adjacent nodes have different colors and "NO" otherwise.

  • Problem: partition. 

    Search version: input some numbers X and divide the numbers into two groups that add up to exactly the same value (call the assignment of numbers to their group Y). 

    Verification version: input some numbers X and the groupings Y and output "YES" if the two groups add up to the same value, or "NO" otherwise.

This is the P versus NP problem:

Are there any problems for which the verification version can be solved efficiently but for which there is no efficient solution to the search version?

If there is a fast solution to the  search version of a problem then the problem is said to be  Polynomial-time, or  P for short. If there is a fast solution to the  verification version of a problem then the problem is said to be  Nondeterministic Polynomial-time, or  NP for short. The question of "P=NP" is then the question of whether these sets are identical. 注:P问题能在多项式时间内找到解;NP问题则在于多项式时间内验证解。由此引出了数学难题P是否等于NP。

(The "nondeterministic polynomial-time" terminology is terribly counter-intuitive ( adj. 反直觉的;违反语感的)  in my opinion. It was used originally because whenever a  Turing machine  can efficiently solve the  verification  version of a problem, a  non-deterministic Turing machine  can efficiently solve the corresponding  search  problem. But this is really not at all important to understanding P vs NP.)

In the case of the sorting problem above, there are fast algorithms for both the  search  and  verification  versions. But for the other two problems, the  verification  versions are easy (heck, my grandmother could probably write a computer program to check that two lists of numbers add up to the same value) but the  search  versions are difficult, and indeed there are no fast solutions known. So all three problems are in  NP , but only the first is (known to be) in  P .

Some problems can be translated into one another in such a way that a fast solution to one problem would automatically give us a fast solution to the other. There are some problems that every single problem in NP can be translated into, and a fast solution to such a problem would automatically give us a fast solution to every problem in NP. This group of problems are known as  NP-Hard . Some problems in NP-Hard are actually not themselves in NP; the group of problems that are in  both  NP and NP-Hard is called  NP-Complete .

You start to see the far-reaching implications of a fast solution to any one problem in NP-Hard: we would automatically get a fast solution to  every  problem in NP, which would mean that whenever there is a fast solution to the  verification  version of a problem then there is  always  a fast solution to the corresponding  search  version.

Remember how the verification versions of those problems seemed easy but the search versions seemed hard? A fast solution to any NP-Complete problem would be mean that as long as you can  verify  proposed solutions to a problem you would  never  need to search through a substantial大量的,大部分的 fraction of the search space to find solutions; there would  always  be a faster way. This seems implausible难以置信的 to most mathematicians (and for deeper reasons that I've listed here) and that is why most mathematicians think that there are no fast solutions to NP-complete problems. But we haven't proved it yet. Updated 2 Jun 2013  •  View Upvotes

继续阅读