Just to throw it out there, here's a 6th myth to consider:
> Hash tables are fast
When you're dealing with lots of keys, they're usually the best option out there. But if you're dealing with a small number of keys, a simple array can often be faster. A hash table lookup carries a constant overhead for hashing the key, plus possibly an extra pointer indirection, depending on whether the implementation uses open addressing or chaining. If a simple linear or binary search of the array takes less time than that overhead (or even not much more), the array will be faster.
Of course, the point where the hash table overtakes the array depends on a lot of factors. As usual, the only sure-fire way to know which option is better in a particular situation is to measure.
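To make the trade-off concrete, here's a minimal OCaml sketch (the key count, key shapes, and iteration count are arbitrary choices of mine): it times repeated lookups of one key in a plain association list against the same lookups in a Hashtbl.

```ocaml
(* Micro-benchmark sketch: linear search in a small association list
   vs. Hashtbl lookup.  Results depend heavily on key count, key type,
   cache behaviour, and compiler, so treat this as a template for
   measuring, not as evidence either way. *)
let n_keys = 16                  (* "small number of keys" *)
let n_lookups = 5_000_000

let keys = List.init n_keys (fun i -> Printf.sprintf "key%02d" i)

(* Association list: no hashing, just a linear scan over the pairs. *)
let assoc = List.map (fun k -> (k, String.length k)) keys

(* Standard-library hash table holding the same bindings. *)
let table = Hashtbl.create n_keys
let () = List.iter (fun k -> Hashtbl.replace table k (String.length k)) keys

let time label f =
  let t0 = Sys.time () in
  let result = f () in
  Printf.printf "%-12s %.3fs\n" label (Sys.time () -. t0);
  result

let () =
  let probe = List.nth keys (n_keys / 2) in    (* a key in the middle *)
  ignore (time "assoc list" (fun () ->
    let acc = ref 0 in
    for _ = 1 to n_lookups do acc := !acc + List.assoc probe assoc done;
    !acc));
  ignore (time "Hashtbl" (fun () ->
    let acc = ref 0 in
    for _ = 1 to n_lookups do acc := !acc + Hashtbl.find table probe done;
    !acc))
```

In my experience the crossover point moves around a lot with the key type (string hashing is comparatively expensive) and with how hot the data is in cache, which is exactly why measuring beats guessing.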
I recently read a reminiscence by one of Wirth's students about working on the Oberon compiler. Its development was driven by a few firm principles; one was, roughly, that anything you add must pay for itself or improve the speed of compilation--the benchmark being compiling the whole Oberon system. The students had implemented a fancy data structure, I think a balanced tree, for use in register allocation. Wirth found out and ordered them to remove it and replace it with a linked list. They groaned but complied--and the compiler came out both smaller and faster, despite the worse worst case.
"I still vividly remember the day that Wirth
decided to replace the elegant data structure used in the compiler’s symbol table
handler by a mundane linear list. In the original compiler, the objects in the symbol
table had been sorted in a tree data structure (in identifier lexical order) for fast
access, with a separate linear list representing their declaration order. One day Wirth
decided that there really weren’t enough objects in a typical scope to make the
sorted tree cost-effective. All of us Ph.D. students were horrified: it had taken time
to implement the sorted tree, the solution was elegant, and it worked well – so why
would one want to throw it away and replace it by something simpler, and even
worse, something as prosaic as a linear list? But of course, Wirth was right, and the
simplified compiler was both smaller and faster than its predecessor."
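For what it's worth, the "prosaic" structure amounts to something like this (an OCaml sketch of the idea only; the real compiler is written in Oberon and its symbol table records carry far more information):

```ocaml
(* Sketch: a per-scope symbol table as a plain linear list, kept in
   declaration order.  With only a handful of names per scope, a linear
   scan is cheap, and declaration order falls out for free instead of
   needing a second list alongside a sorted tree. *)
type obj = {
  name : string;
  (* ... type, address, flags, etc. would go here ... *)
}

type scope = { mutable objects : obj list }   (* newest declaration first *)

let declare scope name =
  let o = { name } in
  scope.objects <- o :: scope.objects;
  o

let lookup scope name =
  List.find_opt (fun o -> o.name = name) scope.objects
```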
Perhaps this was a case where the total number of elements was well bounded. After all, CPUs don't gain general-purpose registers very often, or the size of the scope considered during scheduling may be limited by other constraints anyway.
Yes, absolutely. Along similar lines, Hindley-Milner type inference is exponential in the worst case, but in practice nobody writes code pathological enough to trigger it.
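For the curious, the classic way to provoke that worst case looks roughly like this (an OCaml sketch; the names f0..f3 are mine):

```ocaml
(* Classic pathological case: each definition squares the number of
   copies of 'a in the inferred result type, so type sizes grow
   doubly exponentially in the number of definitions.  f3's printed
   type already has hundreds of components; a few more such lines and
   the types become too large to build or print.  Nobody writes real
   code shaped like this, which is why HM inference is fast in
   practice. *)
let f0 = fun x -> (x, x)       (* 'a -> 'a * 'a                    *)
let f1 = fun y -> f0 (f0 y)    (* result holds 4 copies of 'a      *)
let f2 = fun y -> f1 (f1 y)    (* 16 copies                        *)
let f3 = fun y -> f2 (f2 y)    (* 256 copies                       *)
```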