Hacker News

It is worth pointing out here that Quicksort has a worst-case complexity of Ɵ(n²). Yet in practice it has proved to be the best choice in many applications.


That's true, but it's also worth pointing out that the worst case can be trivially prevented by shuffling the array before you Quicksort it. Which is maybe a confirmation of your larger point.
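To make the shuffle-first idea concrete, here is a minimal Python sketch (my own illustration, not anyone's production code): a deliberately naive first-element-pivot Quicksort, which is Θ(n²) on already-sorted input, wrapped by a shuffle that makes that pathological ordering overwhelmingly unlikely.

```python
import random

def naive_quicksort(a):
    """Quicksort with the classic first-element pivot: Θ(n²) on pre-sorted input."""
    if len(a) <= 1:
        return list(a)
    pivot, rest = a[0], a[1:]
    return (naive_quicksort([x for x in rest if x < pivot])
            + [pivot]
            + naive_quicksort([x for x in rest if x >= pivot]))

def shuffled_quicksort(a):
    """Shuffle first, so a pre-sorted input no longer hits the quadratic case."""
    a = list(a)
    random.shuffle(a)  # destroys any pre-existing order before sorting
    return naive_quicksort(a)
```

With the shuffle, even `shuffled_quicksort(list(range(1000)))` runs in expected O(n log n), whereas the naive version would degrade quadratically on the same input.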


The principled way to avoid bad worst-case performance for quicksort is to select the pivot using a median of medians approach. That gives you good asymptotic performance even for adversarial inputs, but in practice it tends to be slower. Which fits well with the theme of this article.


You can trivially achieve bounded worst case performance with minimal overhead by doing median of medians only, say, one time in 5.

In the average case you do the expensive median of medians on a small fraction of the data and so only pay a few percent average penalty. And in the worst case you still get n log(n) performance.
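A sketch of one way to read the "one time in 5" idea (my interpretation, not the commenter's actual code): pay for the guaranteed-good median-of-medians pivot only on every fifth recursion level, and use a cheap random pivot the rest of the time. Between good levels the problem can shrink slowly, but each good level cuts the problem by a constant factor, so the total depth stays O(log n).

```python
import random

def median_of_medians(a):
    """Return an element of a guaranteed to be a reasonably central pivot."""
    if len(a) <= 5:
        return sorted(a)[len(a) // 2]
    medians = [sorted(g)[len(g) // 2]
               for g in (a[i:i + 5] for i in range(0, len(a), 5))]
    return median_of_medians(medians)

def hybrid_quicksort(a, depth=0):
    if len(a) <= 1:
        return list(a)
    # Expensive, guaranteed-good pivot on every fifth level only;
    # a cheap random pivot otherwise.
    if depth % 5 == 0:
        pivot = median_of_medians(a)
    else:
        pivot = random.choice(a)
    return (hybrid_quicksort([x for x in a if x < pivot], depth + 1)
            + [x for x in a if x == pivot]
            + hybrid_quicksort([x for x in a if x > pivot], depth + 1))
```

Since the median-of-medians pivot is always an element of the array, the `==` partition is never empty, so progress is guaranteed even on adversarial input.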


How is the worst case not n² in the cases where you don't use median of medians?


Every fifth pivot is chosen to be an actual median.


The median-of-medians approach is also somewhat tricky to implement efficiently (avoiding array copies and so on), although it admittedly looks neat on the slides of an algorithms lecture.


Strictly speaking, it is not true that shuffling first avoids the worst case. Given a deterministic shuffling function, some permutation of input needs to shuffle to sorted order, which will again trigger the worst case.

Of course, shuffling is still potentially helpful because ascending/descending order is often common in many applications.


Randomizing the pivots gives an O(n log n) expected running time, with a vanishingly small probability of anything slower for any n large enough for the worst case to matter.
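For reference, pivot randomization differs from shuffling in that the random choice happens at every partition rather than once up front. A minimal sketch (my own illustration, using list comprehensions rather than in-place partitioning for clarity):

```python
import random

def randomized_quicksort(a):
    if len(a) <= 1:
        return list(a)
    # Random pivot: expected O(n log n) comparisons for ANY fixed input,
    # since no single input can reliably force bad pivots.
    pivot = random.choice(a)
    return (randomized_quicksort([x for x in a if x < pivot])
            + [x for x in a if x == pivot]
            + randomized_quicksort([x for x in a if x > pivot]))
```

The expectation here is over the algorithm's own coin flips, not over the input distribution, which is exactly the point of the surrounding discussion.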


Even without randomization of pivots expected run time is O(n log n). If the order of the input data is random, I don't believe randomizing pivots changes the distribution of running times.

What changes is that one very common permutation of the data (ascending order) no longer has O(n²) performance.


There's little justification for expecting data to behave randomly unless our algorithm introduces it.


Correct. In general, real data will seldom, if ever, look like random data.


Shuffling with a too-small seed will leave the shuffle susceptible to brute-force attacks; but that in no way invalidates its use as a dodge for pathological (pre-sorted) arrays submitted to Quicksort in the course of normal usage. You have a better chance of being eaten by a dinosaur than of seeing O(n²) Quicksort performance on shuffled data that comes from anything other than an attacking adversary.


You're right that shuffling doesn't avoid the worst case, but your explanation isn't quite correct. See my other comment for my explanation.

I'm not sure what you mean by deterministic shuffling. Shuffling is supposed to be random, and therefore non-deterministic. In practice, you'd probably use a pseudorandom function to implement shuffling, but the goal of pseudorandom functions is to be indistinguishable from random functions, so that detail shouldn't matter.


Shuffling can never prevent the worst case of any algorithm. "Worst case" means the worst out of all possible cases. Shuffling doesn't change the set of cases, and it doesn't change anything about any individual case.

Probably what you meant was that shuffling makes the worst case very unlikely.

On a practical level, if you have to shuffle the data before sorting to make your sort algorithm work better, you've most likely already blown any performance advantage. So you might as well choose a different sort algorithm.


I believe Big Theta means bounded above and below by `n^2`, where the bounds differ by a constant factor. Saying that Quicksort is Ɵ(n^2) means that it has worst case `n^2` AND best case `n^2`. But the best case of Quicksort is `n`.

So Quicksort is O(n^2); the worst case is <= cn^2, for a constant c.

Sorry to bring it up. I was wondering if I wanted to be the annoying guy who nitpicks on that; but then I thought you might be interested in the difference.

   [edit1: was wrong, don't want to mislead]
   [edit2: not sure I was wrong anymore, this is confusing]


No, it means the worst case is bounded above and below by n^2. That is, the worst-case scales no worse or better than n^2, but approximately proportional, asymptotically. (Whereas, for example, every O(n^2) algorithm also happens to be O(2^n))

The best case is something entirely separate, and can itself be bounded from above or below.


It doesn't make sense to say there's an upper and lower bound on the "worst" case. If that were true, the upper bound would be the new worst case, and the lower bound would be some middle case. A best or worst case is inherently always going to be big-theta.


The Ɵ here describes how precisely we can pin down the worst case: saying it's Ɵ(n^2) means we can tell that the worst case is at least quadratic, and also no worse than quadratic. If we were less good at the math, we might only have shown that the worst case is at least linear and no worse than exponential (which is also true of quicksort).


To help you wrt your edit 2, you seem to be confusing two things:

- the Ɵ() and O() notations can be applied to mathematical functions. For example 5n^2 + 3n = Ɵ(n^2) [^1] or 3n^2 = O(n^3).

- algorithms have several distinct complexities: best case, worst case, and average case. Each of these is its own mathematical function.

Each of these functions can be given appropriate Ɵ() or O() bounds.

[^1]: really meaning λn.5n^2 + 3n ∈ Ɵ(λn.n^2), it's a relation between functions.
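As a concrete instance of the footnote's claim, here is one choice of witnessing constants for 5n² + 3n ∈ Ɵ(n²) (the specific constants are my own, any valid choice works):

```latex
% Choose c_1 = 5, c_2 = 6, n_0 = 3. Then for all n \ge n_0 (so 3n \le n^2):
c_1 n^2 = 5n^2 \;\le\; 5n^2 + 3n \;\le\; 5n^2 + n^2 = 6n^2 = c_2 n^2,
% hence 5n^2 + 3n \in \Theta(n^2).
```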


Except you are wrong. O(x) is an upper bound, Omega(x) is a lower bound, and something is Theta(x) iff it is both Omega(x) and O(x) for the same x [1].

And if a stackoverflow link isn't good enough, then go read CLRS, Dasgupta, or Skiena.

Quicksort has no Theta(x): it is O(n^2) and Omega(n log n), with an expected runtime of O(n log n). The reason Quicksort performs so well is that it devolves to its worst case only in very specialized situations.

[1] http://stackoverflow.com/questions/10376740/big-theta-notati...

Edit: Replied to the wrong commenter, so the replied to post isn't wrong, the grandparent post is.


After reviewing, I've understood two meanings people give to Theta(n):

    1. Theta(n) means that it's O(n), O(n^2), O(...), but
       Theta(n) is a proper, tighter statement of the upper
       bound. It doesn't say anything about the lower bound.

    2. Theta(n) means that it's tightly bounded above and below
       by n: it is both Omega(n) and O(n), and thus Theta(n).
I think 2 is the proper definition (thus my 2nd edit), but I'm not totally sure. I've left my original comment there with the edit chain in the hope that discussion would surface the proper definition. Unfortunately I'm getting downvoted for that, which I find odd.


I was confused at first :)


Saying it is bounded by O(n²) is too vague, because it's the same as being bounded by O(n³) and so on. A linear algorithm is also O(n²) and O(n³).

That is why I emphasized that the worst case in particular has its bound within a constant factor of n², hence Ɵ.



