Friday, March 20, 2020

Dynamic Programming basics

Dynamic Programming always was a topic that really scared me back at school, but it is something that can become really manageable once you get the trick of it. Let's define it first.

Dynamic programming is an algorithm design technique that attempts to improve the time that takes solving a problem that can be stated in a recursive philosophy. The deal is making the most out of auxiliary data structures in order to save time complexity (by sacrificing a bit of space complexity). These data structures will hold particular solutions that can be reused for obtaining bigger solutions.

Let's start with a basic example to see what I mean. You normally would implement Fibonacci's function as follows (considering n≥0 as precondition):

function Fibonacci (n: int): int
    if n equals 0
        return 0
    else if n equals 1
        return 1
    else
        return Fibonacci(n-1) + Fibonacci(n-2)

(Well, if you didn't know about it, reading the pseudocode above is pretty straightforward: "Fibonacci's function of 0 is 0, of 1 is 1, and of anything else above 1 is the sum of the calculations for the last two numbers)

And, well, it does the job... but when you start inputting n=40, n=80, and on... you start to notice the problem. It freaking gets stuck.

So, how to efficiently solve this problem? Let's see how it behaves. The following is a diagram on function calls for Fibonacci of 4:


function calls diagram

All these calls stack on memory in depth. Notice that there tends to be a lot of unnecessary calls of stuff we've already calculated. The same will happen for bigger inputs, of course... So, yeah, that won't look pretty for our processor.

Now, analyzing the above behavior, we get the following recurrence equation that describes it:

Tn = Tn-1 + Tn-2 + O(C)

This is equivalent to a O(2n) time complexity, which totally sucks.

So, what to do with that? Well, the best thing we could possibly do is storing the result of a call somewhere and just accessing that result just when needed. That is basically what Dynamic Programming is about.

There are two methods to do this: Top-down Dynamic Programming (also known as Memoization) or Bottom-up Dynamic Programming (also known as Tabulation). I will explain Bottom-up below, as it is the easiest one for me to work with.

Bottom-up is my favorite. You start from the base case (the most trivial problem you can solve), and then you gradually go solving bigger cases, step by step, until you reach the problem you want to solve. For this case, the algotithm will be as follows:

function BottomUpDPFunction(<input>: <input type>): <output type>
    <create helper data structure>
    <store trivial problem's solution on the structure>
    for <i-th problem> from <first non-trivial problem's index> to <general problem's index>
        <solve i-th problem using the (i-1)-th problem in the data structure> 
    return <general problem's solution stored in the data structure>

Therefore, implementing Fibonacci's goes like this:

function FibonacciBottomUp(n: int): int {
    solutions = new array of size n+1, initialized with zeros
    solutions[0] = 0    // First trivial problem
    solutions[1] = 1    // Second trivial problem

    for i from 2 to n
        solutions[i] = solutions[i-1] + solutions[i-2]    // Solve current problem with previous solutions

    return solutions[n]
}

(Notice that because of array's index numbers, our i-th problem corresponds to the (i-1)-th position)

If we analyze it, we get a time complexity of O(n) !!! Isn't that cool!?  (Of course, we are sacrificing some additional memory; in this case we have a space complexity of O(n), which is not much of a problem anyway, but we have to be aware of that)

Please let me know what you think. If you have any suggestions or corrections, please tell me in the comments section below.

Thanks for reading!


[--EDIT--]


Adding here the Top-Down approach because this post feels incomplete without it:
function TopDownDPFunction(<input>: <input type>, <helper data structure>: <array or dictionary of types according to problem>): <output type>
    if <base case conditions>
        return <base case solution in helper data structure>
              else
                  if <helper data structure doesn't contain solution>
                      <store in helper data structure a solution that uses>: TopDownDPFunction(<reduced input>, <helper data structure>)

                  return <solution of the current problem found in the data structure>


So here is what the implementation would look like:

function FibonacciTopDown(n: int, solutions: array of int): int {
    if n equals 0 or n equals 1
        return solutions[n]
              else
                  if solutions[n] equals 0
                      solutions[n] = fibonacciTopDown(n-1, solutions) + fibonacciTopDown(n-2, solutions)
                  return solutions[n]
}

 And don't forget we need a wrapper function to run that
 
function Fibonacci(n: int): int {
    solutions = new array of size n+1, initialized with zeros
    solutions[0] = 0    // First trivial problem
    solutions[1] = 1    // Second trivial problem
    return FibonacciTopDown(n, solutions)
}

No comments:

Post a Comment