Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

You know how when you try to implement a function to compute Fibonacci numbers naively with recursion you end up with a minor performance problem?

In Python:

    def fib(n):
        if n < 2:
            return n
        else:
            return fib(n - 1) + fib(n - 2)
This is pretty close to a literal transcription of the definition of a Fibonacci number.

The problem is that we end up recomputing intermediate results quite a bit. Here's a slightly modified version of the above function, which counts the number of times that fib() is called with each n as an argument:

    numbercounts = dict()
    def fib(n):
        global numbercounts

        numbercounts[n] = numbercounts.get(n, 0) + 1
	
        if n < 2:
            return n
        else:
            return fib(n - 1) + fib(n - 2)
Calling fib(20) and then printing the values of numbercounts gives us the following. This shows for each value of n, how many times it was evaluated. I have cleaned it up a bit below for display.

    n: number_times_evaluated

    0: 4181
    1: 6765
    2: 4181
    3: 2584
    4: 1597
    5: 987
    6: 610
    7: 377
    8: 233
    9: 144
    10: 89
    11: 55
    12: 34
    13: 21
    14: 13
    15: 8
    16: 5
    17: 3
    18: 2
    19: 1
    20: 1
The key observation is that fib(low_number) gets called many, many times for fib(high_number). You may notice an interesting trend in the number of times fib(n) is called. The number of invocations for n is fib(1), for n - 1 is fib(2) and so on. That's not important, but just cool.

Anyway, we will have to evaluate the same intermediate result many times. We can optimize this with memoization, which is really at the core of dynamic programming. Rather than recalculating fib(n) every time we see n, we can calculate the value of fib(n) once and store the value. Then when we see fib(n) again, we can simply look up the value (cheap) rather than go to the trouble of recalculating it (expensive).

Below is a very simple implementation of this in Python:

    numbercounts = dict()
    fibs = {0: 0, 1: 1}
    def fib(n):
        global numbercounts, fibs

        numbercounts[n] = numbercounts.get(n, 0) + 1

        if n in fibs:
            return fibs[n]
        else:
            fibs[n] = fib(n - 1) + fib(n - 2)
            return fibs[n]
Now, if we run fib(20) again and inspect the value of numbercounts, we get the following:

    n: number_times_evaluated

    0: 1
    1: 2
    2: 2
    3: 2
    4: 2
    5: 2
    6: 2
    7: 2
    8: 2
    9: 2
    10: 2
    11: 2
    12: 2
    13: 2
    14: 2
    15: 2
    16: 2
    17: 2
    18: 2
    19: 1
    20: 1
This shows how storing the intermediate values, fib(low_n), we can greatly reduce the number of operations necessary to calculate fib(n).

This is the essence of dynamic programming, storing oft-used, expensive-to-calculate intermediate values in a structure that supports fast lookups.

The Python samples above are the minimum to make the examples work. You may want to look at the Counter datatype[0], and a more general purpose memoizing decorator[1][2]. Of course, if you were going to start using dynamic programming, you may want to utilize functools' lru_cache[3].

I am sure there will be plenty of comments below to help both of us to learn from all of the mistakes I have made above.

[0] https://docs.python.org/3.5/library/collections.html#collect...

[1] https://wiki.python.org/moin/PythonDecorators

[2] https://wiki.python.org/moin/PythonDecoratorLibrary#Memoize

[3] https://docs.python.org/3/library/functools.html#functools.l...



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: