CODINGNEWS

Memoization

Oct 8

Memoization in computer programs is a really simple concept. It’s an optimization technique where a computer program returns a cached result when the same input occurs again. It’s essential when working with heavy calculation algorithms or time consuming functions. It basically works like this:

Ruby

In Ruby, memoization is almost part of the language thanks to Ruby’s conditional assignment (||=):

def slow_function
  sleep 5
  'result'
end

def memoization
  @takes_time ||= slow_function
end

memoization()
# Wait 5 seconds
=> 'result'
memoziation()
# Inmediateley
=> 'result'

We can also refactor a little bit using blocks (Ruby will treat this block as a single thing):

def slow_function
  @takes_time ||= begin
    sleep 5
    'result'
  end
end

slow_function()
# Wait 5 seconds
=> 'result'
slow_function()
# Inmediateley
=> 'result'

There is one caveat with this method. If slow_function returns nil, it will perform the heavy calculation every time. In that specific and annoying case we can use the object method defined?:

class HeavyClass
  def call
    return @call if defined?(@call)
    @call = slow_method
  end

  def slow_method
    sleep 5
    nil
  end
end

heavy_object = HeavyClass.new
heavy_object.call
# Wait 5 seconds
=> nil
heavy_object.call
# Inmediateley
=> nil

What if we use parameters? Simple, we will use a Hash. We will initialize our Hash with a block. According to the Ruby documentation if we call our hash with an unexisting key, it will call our block as:

hash = Hash.new do |h, key|
  h[key] = rand
end

hash[:non_existing_key]
=> 0.3884906318307012
hash[:other_non_existing_key]
=> 0.19581108461258834
hash[:non_existing_key]
=> 0.3884906318307012

Now, going back to our memoization post:

def takes_a_while
  sleep 5
  'result'
end

def slow_method(param)
  @hash ||= Hash.new do |h, key|
    h[key] = takes_a_while
  end
  @hash[param]
end

slow_method(1)
# Wait 5 seconds
=> 'result'
slow_method(1)
# Inmediately
=> 'result'
slow_method(2)
# Wait 5 seconds
=> 'result'

Python

Sadly, in Python things are not as easy as they are in Ruby, but that does not mean that they are impossible:

from time import sleep

cache = {}
def slow_method(param):
    if param in cache:
        return cache[param]
    else:
        sleep(5)
        cache[param] = param
        return cache[param]
slow_method(1)
# Wait 5 seconds
=> 1
slow_method(1)
# Inmediately
=> 1

We can also define a function and use it as a decorator as in:

from time import sleep

def memoize(f):
    memo = {}
    def helper(x):
        if x not in memo:
            memo[x] = f(x)
        return memo[x]
    return helper

@memoize
def slow_function(n):
    sleep(5)
    return n
slow_method(1)
# Wait 5 seconds
=> 1
slow_method(1)
# Inmediately
=> 1

If you are using Python 3 maybe you can check functools.lru_cache on the official documentation and if you are using Django you should also check the django-memoize library.

Javascript

We can define a method in the Function prototype and use a concept very similar as the Python one:

Function.prototype.memoize = function(){
  var self = this, cache = {};
  return function( arg ){
    if(arg in cache) {
      return cache[arg];
    } else {
      return cache[arg] = self( arg );
    }
  }
}

function slow() {
  console.log('Heavy calculation...');
  return 5;
}

var memSlow = slow.memoize();

> memSlow();
Heavy calculation...
5
> memSlow();
5

But if you are using underscore (and your should!) you can use the memoize method:

var fibonacci = _.memoize(function(n) {
  return n < 2 ? n: fibonacci(n - 1) + fibonacci(n - 2);
});

Much easier! We define the function once, and we don’t need to memoize it later.

What is all this doing here?

Well, if you work on data journalism you might be doing some scraping and some calculations on huge data sets. I think that the memoization concept is really useful to optimize some of your tools or scripts. Something to have in mind.

comments powered by Disqus