Saturday, August 28, 2010

Learning to see patterns in my own behaviour

So, a week and a half ago I was looking at a question on Stack Overflow (Algorithm to calculate the number of combinations to form 100 ). I set about solving it in Haskell, and came up against a block to my success:

Given a list of numbers xs and another number n, generate a list of all the possible combinations lists of length n that contain the numbers from xs.

So, given the list [1,2] and the number 3, the function should generate this list of lists: [[1,1,1],[1,1,2],[1,2,1],[1,2,2],[2,1,1],[2,1,2],[2,2,1],[2,2,2]]

I was pretty sure that this had been done before, but because I'm trying to get better at deducing algorithms, I'm stubborn, and I'm doing this for fun I decided to figure out the algorithm for myself.

It wasn't as easy as it seemed.

I sat down and wrote out the outputs for a few different sets of inputs, I looked at them, I looked some more. I could see a couple of patterns, namely that (length of xs)n is the length of the final output and that you could create a rectangle of answers with width length of xs and height (length of xs - 1)n. Neither of these were helpful.

I left the problem alone for a while, hoping that time would give me perspective. I was surprised how hard I was finding it to find the pattern.

Today I came back to it with a fresh brain and time to kill. I took a walk to the park, sat down, started to write out the output where the input is a list of length 3, and n as 3. As I was writing, I had the realisation that the way to solve this was to figure out the algorithm of how to write it down. The problem in my previous examples of output was that I hadn't written it in a good enough pattern. I started writing out the output for a different input a list of length 4, with n of 4 (256 items, for those keeping count). This time I was very systematic about how I wrote out the output. I got to the 44th list in the list and stopped to see if I could see it yet. I could: the last element in the individual lists was repeating every 4 items.

I stood up and, as is my wont when I am thinking, I started pacing. I must have looked a little unhinged, as I was pacing in a small circle around my bag.

It took me a few minutes, but eventually I figured out how to represent what I was seeing in my written output as an algorithm: the first time through, each item of xs is appended to an empty list, for each subsequent time through, each item in xs is appended to each list in the list of lists.

In Haskell, I came up with this function to do the work:

makeallsets :: Integral a => [a] -> a -> [[a]] makeallsets xs n = mas (addtoonelist [] xs) xs (n - 1) where mas yss _ 0 = yss mas yss xs (n + 1) = mas (addtoeachlist yss xs) xs n where addtoeachlist [] xs = [] addtoeachlist (ys:yss) xs = (addtoonelist ys xs) ++ (addtoeachlist yss xs) addtoonelist ys [] = [] addtoonelist ys (x:xs) = (x : ys) : (addtoonelist ys xs)

This allowed me to create an answer to the Stack Overflow problem. (Although there's no point posting it for 3 very good reasons: 1. it's not in the target language (which is Scala); 2. It uses the brute force approach; 3. There is already a better answer.)

Score 1 for perseverance!

P.s. if anyone would like to show me a better way, I'd be very glad to hear it.