Programming Chalenges: The 3n+1 problem

3n+1 is the first, "warm-up" problem at Programming Chalenges site: http://www.programming-challenges.com/pg.php?page=downloadproblem&probid=110101&format=html (This problem illustrates Collatz conjecture: http://en.wikipedia.org/wiki/3n_%2B_1#Program_to_calculate_Collatz_sequences ) As long as the judge on this site takes only C and Java solutions, I submitted in Java some add-hock code (see at the end of this message) where I used recursion and a cache of computed cycles. Judge accepted my code and measured 0.292 sec with best overall submissions of 0.008 sec to solve the problem. *** Question: I wonder how to implement cache for this problem in Haskell? At the moment, I am not so much interested in the speed of the code, as in nice implementation. To illustrate my question I add the problem description and my Java solution at the end of this message. Thanks! *** Problem Consider the following algorithm to generate a sequence of numbers. Start with an integer *n*. If *n* is even, divide by 2. If *n* is odd, multiply by 3 and add 1. Repeat this process with the new value of *n*, terminating when *n* = 1. For example, the following sequence of numbers will be generated for *n* = 22: 22 11 34 17 52 26 13 40 20 10 5 16 8 4 2 1 It is *conjectured* (but not yet proven) that this algorithm will terminate at *n* = 1 for every integer *n*. Still, the conjecture holds for all integers up to at least 1, 000, 000. For an input *n*, the *cycle-length* of *n* is the number of numbers generated up to and *including* the 1. In the example above, the cycle length of 22 is 16. Given any two numbers *i* and *j*, you are to determine the maximum cycle length over all numbers between *i* and *j*, *including* both endpoints. InputThe input will consist of a series of pairs of integers *i* and *j*, one pair of integers per line. All integers will be less than 1,000,000 and greater than 0. OutputFor each pair of input integers *i* and *j*, output *i*, *j* in the same order in which they appeared in the input and then the maximum cycle length for integers between and including *i* and *j*. These three numbers should be separated by one space, with all three numbers on one line and with one line of output for each line of input. Sample Input 1 10 100 200 201 210 900 1000 Sample Output 1 10 20 100 200 125 201 210 89 900 1000 174 *** my Java solution import java.io.BufferedReader; import java.io.InputStreamReader; public class Main { final static BufferedReader reader_ = new BufferedReader(new InputStreamReader(System.in)); /** * @param args */ public static void main(String[] args) { new Problem().run(); } static String[] ReadLn() { String[] tokens = null; try { String line = reader_.readLine(); String REGEX_WHITESPACE = "\\s+"; String cleanLine = line.trim().replaceAll(REGEX_WHITESPACE, " "); tokens = cleanLine.split(REGEX_WHITESPACE); } catch (Exception e) {} return tokens; } } class Problem implements Runnable { long CACHE_SIZE = 65536; private final long[] cache_ = new long[(int) CACHE_SIZE]; /** * Compute cycle length for a single number * * @param n number for which we find cycle length * @return cycle length */ long cycleLen(long n) { long len = 1; if (n != 1) { len = getFromCache(n); if (len == 0) { //not yet in cache // Recursively compute and store all intermediate values of cycle length if ((n & 1) == 0) { len = 1 + cycleLen(n >> 1); } else { len = 1 + cycleLen(n * 3 + 1); } putInCache(n, len); } } return len; } void putInCache(long n, long len) { if(n < CACHE_SIZE) { cache_[(int)n] = len; } } long getFromCache(long n) { long result = 0; if(n < CACHE_SIZE) { result = cache_[(int)n]; } return result; } /** * Find max cycle on interval * * @param from interval start * @param to interval end * @return max cycle */ Long maxCycle(Long from, Long to) { Long result = 0L; Long cycle = 0L; // Get all values of cycle length on the interval and put these values into a sorted set for (long i = from; i <= to; i++) { cycle = cycleLen(i); if (cycle > result) result = cycle; } return result; } public void run() { String[] tokens = null; long from, to, result = 0; long arg1, arg2 = 0; while ((tokens = Main.ReadLn()) != null) { if (tokens.length == 2) { arg1 = new Long(tokens[0]).longValue(); arg2 = new Long(tokens[1]).longValue(); from = (arg1 <= arg2) ? arg1 : arg2; to = (arg2 >= arg1 ) ? arg2 : arg1; result = maxCycle(from, to); out(arg1+" "+arg2+" "+result); } } } static void out(String msg) { System.out.println(msg); } }

This is very similar to the Problem 14 on Project Euler [1]. Should you really want a nice solution in Haskell, and noting that this is a big spoiler, there are some on the wiki [2]. But I highly recommend you to try coding your own solutions before looking at the site =). Cheers! [1] http://projecteuler.net/index.php?section=problems&id=14 [2] http://www.haskell.org/haskellwiki/Euler_problems/11_to_20#Problem_14 -- Felipe.

Thanks, Felipe! Sure I will try to code solution myself. Doing it all by yourself is the main fun of everything, isn't it? I was just asking about the *idea* of implementing cache in Haskell, in general, not just in this case only. On Thu, Apr 14, 2011 at 3:11 PM, Felipe Almeida Lessa < felipe.lessa@gmail.com> wrote:
This is very similar to the Problem 14 on Project Euler [1]. Should you really want a nice solution in Haskell, and noting that this is a big spoiler, there are some on the wiki [2]. But I highly recommend you to try coding your own solutions before looking at the site =).
Cheers!
[1] http://projecteuler.net/index.php?section=problems&id=14 [2] http://www.haskell.org/haskellwiki/Euler_problems/11_to_20#Problem_14
-- Felipe.
-- All the best, Dmitri O. Kondratiev "This is what keeps me going: discovery" dokondr@gmail.com http://sites.google.com/site/dokondr/welcome

Hi Dmitri, As a reference you might want to take a look at the Project Euler problem #14 and the corresponding dubious entry in the HaskellWiki: http://www.haskell.org/haskellwiki/Euler_problems/11_to_20#Problem_14 .
*** Question: I wonder how to implement cache for this problem in Haskell? At the moment, I am not so much interested in the speed of the code, as in nice implementation.
a while ago I wondered about the same question: memoization and more specifically tabulation (DP) in Haskell. After some time I came up with code which does it IMO in a fairly nice way using an array of lazy computations which gives you top-down DP for free at the expense of performance. Another way is to use lazily built tries ala MemoTrie: http://www.haskell.org/haskellwiki/MemoTrie . Specific implementation:
import Data.Array import Data.MemoTrie
I write the collatz function using open recursion in order to separate the recursive function from the memo scheme.
type Gen a = a -> a type Fix a = Gen a -> a
collatzGen :: Gen (Integer -> Integer) collatzGen c 1 = 1 collatzGen c n | even n = 1 + c (div n 2) | otherwise = 1 + c (3*n + 1)
We can use plain old Data.Function.fix to get the usual stupid recursive function or use our custom fixpoint combinators. |fixtab| for example uses an array to cache the result in the given range. When the argument is outside of the range the normal recursive scheme is applied.
fixtab :: Ix i => (i, i) -> Fix (i -> r) fixtab b f = f' where a = listArray b [ f f' i | i <- range b ] f' i = if inRange b i then a!i else f f' i
Another option is to use the MemoTrie package where we can write a fixpoint combinator like this:
fixmemo :: HasTrie i => Fix (i -> r) fixmemo f = let m = memo (f m); in m
And of course the final collatz function:
collatz :: Integer -> Integer collatz = fixtab (1,1000000) collatzGen
Cheers, Steven

So if we were to emulate your Java solution, we'd do
import Data.Array
cacheSize :: Int
cacheSize = 65536
table :: Array Int Integer
table = listArray (1,cacheSize) (1 : map go [2..cacheSize]) where
go n
| even n = 1 + lookup (n `div` 2)
| otherwise = 1 + lookup (3 * n + 1)
lookup :: Integer -> Integer
lookup n
| n < cacheSize = table ! (fromInteger n)
| even n = 1 + lookup (n `div` 2)
| otherwise = 1 + lookup (3 * n + 1)
The rest of the code is just some simple i/o.
The table is filled up lazily as you request values from it.
On Thu, Apr 14, 2011 at 3:29 AM, Dmitri O.Kondratiev
3n+1 is the first, "warm-up" problem at Programming Chalenges site:
http://www.programming-challenges.com/pg.php?page=downloadproblem&probid=110101&format=html
(This problem illustrates Collatz conjecture:
http://en.wikipedia.org/wiki/3n_%2B_1#Program_to_calculate_Collatz_sequences )
As long as the judge on this site takes only C and Java solutions, I submitted in Java some add-hock code (see at the end of this message) where I used recursion and a cache of computed cycles. Judge accepted my code and measured 0.292 sec with best overall submissions of 0.008 sec to solve the problem.
*** Question: I wonder how to implement cache for this problem in Haskell? At the moment, I am not so much interested in the speed of the code, as in nice implementation.
To illustrate my question I add the problem description and my Java solution at the end of this message. Thanks!
*** Problem
Consider the following algorithm to generate a sequence of numbers. Start with an integer *n*. If *n* is even, divide by 2. If *n* is odd, multiply by 3 and add 1. Repeat this process with the new value of *n*, terminating when *n* = 1. For example, the following sequence of numbers will be generated for *n* = 22: 22 11 34 17 52 26 13 40 20 10 5 16 8 4 2 1 It is *conjectured* (but not yet proven) that this algorithm will terminate at *n* = 1 for every integer *n*. Still, the conjecture holds for all integers up to at least 1, 000, 000.
For an input *n*, the *cycle-length* of *n* is the number of numbers generated up to and *including* the 1. In the example above, the cycle length of 22 is 16. Given any two numbers *i* and *j*, you are to determine the maximum cycle length over all numbers between *i* and *j*, * including* both endpoints.
Input The input will consist of a series of pairs of integers *i* and *j*, one pair of integers per line. All integers will be less than 1,000,000 and greater than 0.
OutputFor each pair of input integers *i* and *j*, output *i*, *j* in the same order in which they appeared in the input and then the maximum cycle length for integers between and including *i* and *j*. These three numbers should be separated by one space, with all three numbers on one line and with one line of output for each line of input.
Sample Input
1 10 100 200 201 210 900 1000
Sample Output
1 10 20 100 200 125 201 210 89 900 1000 174
*** my Java solution
import java.io.BufferedReader; import java.io.InputStreamReader; public class Main { final static BufferedReader reader_ = new BufferedReader(new InputStreamReader(System.in)); /** * @param args */ public static void main(String[] args) { new Problem().run(); } static String[] ReadLn() { String[] tokens = null; try { String line = reader_.readLine(); String REGEX_WHITESPACE = "\\s+"; String cleanLine = line.trim().replaceAll(REGEX_WHITESPACE, " "); tokens = cleanLine.split(REGEX_WHITESPACE); } catch (Exception e) {} return tokens; } }
class Problem implements Runnable { long CACHE_SIZE = 65536; private final long[] cache_ = new long[(int) CACHE_SIZE]; /** * Compute cycle length for a single number * * @param n number for which we find cycle length * @return cycle length */ long cycleLen(long n) { long len = 1; if (n != 1) { len = getFromCache(n); if (len == 0) { //not yet in cache // Recursively compute and store all intermediate values of cycle length if ((n & 1) == 0) { len = 1 + cycleLen(n >> 1); } else { len = 1 + cycleLen(n * 3 + 1); } putInCache(n, len); } } return len; }
void putInCache(long n, long len) { if(n < CACHE_SIZE) { cache_[(int)n] = len; } }
long getFromCache(long n) { long result = 0; if(n < CACHE_SIZE) { result = cache_[(int)n]; } return result; }
/** * Find max cycle on interval * * @param from interval start * @param to interval end * @return max cycle */ Long maxCycle(Long from, Long to) { Long result = 0L; Long cycle = 0L; // Get all values of cycle length on the interval and put these values into a sorted set for (long i = from; i <= to; i++) { cycle = cycleLen(i); if (cycle > result) result = cycle; } return result; }
public void run() { String[] tokens = null; long from, to, result = 0; long arg1, arg2 = 0; while ((tokens = Main.ReadLn()) != null) { if (tokens.length == 2) { arg1 = new Long(tokens[0]).longValue(); arg2 = new Long(tokens[1]).longValue(); from = (arg1 <= arg2) ? arg1 : arg2; to = (arg2 >= arg1 ) ? arg2 : arg1; result = maxCycle(from, to); out(arg1+" "+arg2+" "+result); } } }
static void out(String msg) { System.out.println(msg); }
}
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Thanks, everybody!
Your feedback is a great food for my mind (as Lewis Carroll once wrote :)
When asking "how to implement cache in Haskell" I was hopping that there
exists some solution without using Data.Array, more "functional" approach,
if I may say so ...
I must be wrong, though (need more time to fully comprehend solution that
Steven described in this thread ).
On Thu, Apr 14, 2011 at 3:22 PM, Ryan Ingram
So if we were to emulate your Java solution, we'd do
import Data.Array
cacheSize :: Int cacheSize = 65536
table :: Array Int Integer table = listArray (1,cacheSize) (1 : map go [2..cacheSize]) where go n | even n = 1 + lookup (n `div` 2) | otherwise = 1 + lookup (3 * n + 1)
lookup :: Integer -> Integer lookup n | n < cacheSize = table ! (fromInteger n) | even n = 1 + lookup (n `div` 2) | otherwise = 1 + lookup (3 * n + 1)
The rest of the code is just some simple i/o.
The table is filled up lazily as you request values from it.
-- All the best, Dmitri O. Kondratiev
"This is what keeps me going: discovery" dokondr@gmail.com http://sites.google.com/site/dokondr/welcome

Hi Dimitri,
When asking "how to implement cache in Haskell" I was hopping that there exists some solution without using Data.Array, more "functional" approach, if I may say so ...
Steven's second solution is purely functional. It uses so-called tries to cache results instead of mutable arrays. Here is an alternative definition of Steven's `fixmemo` function: fixmemo :: HasTrie a => ((a -> b) -> (a -> b)) -> (a -> b) fixmemo f = fix (memo . f) It uses the standard fixpoint combinator Data.Function.fix :: (a -> a) -> a and has a similar (but more restrictive) type. In order to understand Steven's solution you need to know how to define recursive functions using a fixpoint combinator. For example, you can define the Fibonacci function as follows: fibonacci :: Int -> Integer fibonacci = fix fib fib :: (Int -> Integer) -> (Int -> Integer) fib fib 0 = 0 fib fib 1 = 1 fib fib n = fib (n-1) + fib (n-2) Note how the first argument `fib` of the function `fib` shadows the name of the global function `fib`. The advantage of this complicated definition is that you get a memoized version of the `fibonacci` function simply by using `fixmemo` instead of `fix`: memoFib :: Int -> Integer memoFib = fixmemo fib This definition avoids to recompute the same recursive calls over and over again in is therefore much faster than the original version. That said, memoization helps to solve your original problem only if you reuse the cache for different top-level calls of the Collatz length function. According to the Collatz conjecture no argument appears again when computing the Collatz length of a number (otherwise the computation would not terminate). Ryan's solution achieves reuse of the cache by defining it at the top level and you can do the same with tries. For the `fib` example it would look like this: fibCache :: Int :->: Integer fibCache = trie (fib cachedFib) cachedFib :: Int -> Integer cachedFib = untrie fibCache Now even independent calls to `cachedFib` reuse previously computed results. In my experiments, caching did not pay off, even for computing the maximum Collatz length of a given range. Without caching it took less than 5 seconds to compute the maximum Collatz length of all numbers between 1 and 1000000. With caching, it took 30 seconds. Cheers, Sebastian

2011/4/14 Sebastian Fischer
The advantage of this complicated definition is that you get a memoized version of the `fibonacci` function simply by using `fixmemo` instead of `fix`:
Wow. I think something about 'fix' just made sense thanks to your post, though I had read a few blog posts about it. David.

Am 14.04.2011 12:29, schrieb Dmitri O.Kondratiev:
3n+1 is the first, "warm-up" problem at Programming Chalenges site: http://www.programming-challenges.com/pg.php?page=downloadproblem&probid=110101&format=html http://www.programming-challenges.com/pg.php?page=downloadproblem&probid=110101&format=html
(This problem illustrates Collatz conjecture: http://en.wikipedia.org/wiki/3n_%2B_1#Program_to_calculate_Collatz_sequences)
As long as the judge on this site takes only C and Java solutions, I submitted in Java some add-hock code (see at the end of this message) where I used recursion and a cache of computed cycles. Judge accepted my code and measured 0.292 sec with best overall submissions of 0.008 sec to solve the problem.
*** Question: I wonder how to implement cache for this problem in Haskell? At the moment, I am not so much interested in the speed of the code, as in nice implementation.
I'ld use something like: import qualified Data.Map as Map addToMap :: Integer -> Map.Map Integer Integer -> Map.Map Integer Integer addToMap n m = case Map.lookup n m of Nothing -> let l = if even n then div n 2 else 3 * n + 1 p = addToMap l m Just s = Map.lookup l p in Map.insert n (s + 1) p Just _ -> m addRangeToMap :: Integer -> Integer -> Map.Map Integer Integer -> Map.Map Integer Integer addRangeToMap i j m = if j < i then m else addRangeToMap i (j - 1) $ addToMap j m getMaxLength :: Integer -> Integer -> Map.Map Integer Integer -> Integer getMaxLength i j = Map.foldWithKey (\ k l -> if i > k || k > j then id else max l) 0 -- putting it all togeter getRangeMax :: Integer -> Integer -> Integer getRangeMax i j = getMaxLength i j $ addRangeToMap i j $ Map.singleton 1 1

On Thu, Apr 14, 2011 at 4:29 AM, Dmitri O.Kondratiev
3n+1 is the first, "warm-up" problem at Programming Chalenges site:
http://www.programming-challenges.com/pg.php?page=downloadproblem&probid=110101&format=html
(This problem illustrates Collatz conjecture:
http://en.wikipedia.org/wiki/3n_%2B_1#Program_to_calculate_Collatz_sequences )
As long as the judge on this site takes only C and Java solutions, I submitted in Java some add-hock code (see at the end of this message) where I used recursion and a cache of computed cycles. Judge accepted my code and measured 0.292 sec with best overall submissions of 0.008 sec to solve the problem.
*** Question: I wonder how to implement cache for this problem in Haskell? At the moment, I am not so much interested in the speed of the code, as in nice implementation.
This is the exact problem data-memocombinators was written to solve. http://hackage.haskell.org/packages/archive/data-memocombinators/0.4.1/doc/h... For this problem, it is too slow to memoize everything; you have to use a bounded memo table. That's why I use a combinator-based memo approach as opposed to the type-directed approach used in eg. MemoTrie. The memo table you need is something like switch (<10^6) integral id Luke

On Thu, Apr 14, 2011 at 8:02 PM, Luke Palmer
For this problem, it is too slow to memoize everything; you have to use a bounded memo table. That's why I use a combinator-based memo approach as opposed to the type-directed approach used in eg. MemoTrie. The memo table you need is something like
switch (<10^6) integral id
I think because of the definition of `Data.Function.fix` fix f = let x = f x in x which uses sharing, the definition fibonacci = fix (switch (<10^6) integral id . fib) chaches results even of independent global calls. If `fix` would be defined as fix f = f (fix f) it would only cache the recursive calls of each individual call. Do you agree? Here is a fixpoint combinator that is parameterized by a memo combinator: fixmemo :: Memo a -> ((a -> b) -> (a -> b)) -> (a -> b) fixmemo memo f = fix (memo . f) If I use fixmemo (switch (<=10^6) integral id) collatz the computation of the maximum Collatz length between 1 and 10^6 takes around 16 seconds (rather than 4 seconds without memoization). But when using fixmemo (arrayRange (1,10^6)) collatz memoization actually pays off and run time goes down to around 3 seconds. I uploaded the program underlying my experiments to github (spoiler alert): https://gist.github.com/921469 I knew that memocombinators are more flexible than a type-based MemoTrie but it is nice to see that they also lead to more efficient implementations and allow to define the other array-based implementation from this thread in a modular way. Sebastian

Hi Dmitri,
*** Question: I wonder how to implement cache for this problem in Haskell? At the moment, I am not so much interested in the speed of the code, as in nice implementation.
Yet another option for memoization implementation: to use "monad-memo" package [1] which provides memoization for monadic function (using Data.Map by default). To use it we need to define our recursive function in monadic form and add `memo` in place of its recursive call:
import Control.Applicative import Control.Monad.Memo
-- calculates the length of sequence (with memoization) calcM 1 = return 1 calcM n = succ <$> memo calcM (if even n then n `div` 2 else n*3 + 1)
Here is a helper-function to run this calculation (we don't really need it here since `calcM` is going to be called from other monadic function directly):
runCalc :: Integer -> Integer runCalc = startEvalMemo . calcM
NB: the inferred type for `calcM` might look a little bit.. verbose, but we don't really need to specify it or expose `calcM` from our module:
calcM :: (MonadMemo a1 a m, Num a, Functor m, Integral a1, Enum a) => a1 -> m a
The implementation of the function to calculate the maximum length of the sequence for all numbers in a range is straightforward (and also monadic):
-- NB: memoization cache is shared among all 'calcM' calls (very efficient) calcRangeM f t = maximum <$> forM [f..t] calcM
and a similar helper run-function (is not needed for the task either):
runCalcRange :: Integer -> Integer -> Integer runCalcRange f t = startEvalMemo $ calcRangeM f t
To run `calcRangeM` for the input list of values (map the function over it) we can define yet another simple monadic function which calls `calcRangeM` directly (thus reusing/building the same memoization cache):
solM = mapM (uncurry calcRangeM)
and a helper run-function:
runSol :: [(Integer, Integer)] -> [Integer] runSol = startEvalMemo . solM
Composing all these together results in the following test code (I hard-coded the input for the sake of simplicity):
import Control.Applicative import Control.Monad.Memo
calcM 1 = return 1 calcM n = succ <$> memo calcM (if even n then n `div` 2 else n*3 + 1)
calcRangeM f t = maximum <$> forM [f..t] calcM
solM = mapM (uncurry calcRangeM)
runSol = startEvalMemo . solM
main = print $ runSol [ (1, 10), (100, 200), (201, 210), (900, 1000) ]
As for the performance, `main = print $ runSol [(1, 1000000)]` takes ~40 seconds with -O2 on my laptop. [1] http://hackage.haskell.org/package/monad-memo
participants (9)
-
Christian Maeder
-
David Virebayre
-
Dmitri O.Kondratiev
-
Eduard Sergeev
-
Felipe Almeida Lessa
-
Luke Palmer
-
Ryan Ingram
-
Sebastian Fischer
-
Steven Keuchel