
OK, so I *started* writing a request for help, but... well I answered my own question! See the bottom... I was reading Wikipedia, and I found this: http://en.wikipedia.org/wiki/Burrows-Wheeler_transform I decided to sit down and see what that looks like in Haskell. I came up with this: module BWT where import Data.List rotate1 :: [x] -> [x] rotate1 [] = [] rotate1 xs = last xs : init xs rotate :: [x] -> [[x]] rotate xs = take (length xs) (iterate rotate1 xs) bwt :: (Ord x) => [x] -> [x] bwt = map last . sort . rotate step :: (Ord x) => [x] -> [[x]] -> [[x]] step xs = zipWith (:) xs . sort inv_bwt :: (Ord x) => x -> [x] -> [x] inv_bwt mark xs = head . filter (\xs -> head xs == mark) . head . drop ((length xs) - 1) . iterate (step xs) . map (\x -> [x]) $ xs My my, isn't that SO much shorter? I love Haskell! :-D Unfortunately, the resident C++ expert fails to grasp the concept of "example code", and insists on comparing the efficiency of this program to the C one on the website. Fact is, he's translated the presented C into C++, and it can apparently transform a 145 KB file in 8 seconds using only 3 MB of RAM. The code above, however, took about 11 seconds to transform 4 KB of text, and that required about 60 MB of RAM. (I tried larger, but the OS killed the process for comsuming too much RAM.) Well anyway, the code was written for simplicity, not efficiency. I've tried to explain this, but apparently that is beyond his comprehension. So anyway, it looks like we have a race on. >:-D The first thing I did was the optimisation mentioned on Wikipedia: you don't *need* to build a list of lists. You can just throw pointers around. So I arrived at this: module BWT2 (bwt) where import Data.List rotate :: Int -> [x] -> Int -> [x] rotate l xs n = (drop (l-n) xs) ++ (take (l-n) xs) bwt xs = let l = length xs ys = rotate l xs in map (last . rotate l xs) $ sortBy (\n m -> compare (ys n) (ys m)) [0..(l-1)] This is indeed *much* faster. With this, I can transform 52 KB of text in 9 minutes + 60 MB RAM. The previous version seemed to have quadratic memory usage, whereas this one seems to be linear. 52 KB would have taken many months with the first version! Still, 9 minutes (for a file 3 times smaller) is nowhere near 8 seconds. So we must try harder... For my next trick, ByteStrings! (Never used them before BTW... this is my first try!) module BWT3 (bwt) where import Data.List import qualified Data.ByteString as Raw rotate :: Int -> Raw.ByteString -> Int -> Raw.ByteString rotate l xs n = (Raw.drop (l-n) xs) `Raw.append` (Raw.take (l-n) xs) bwt xs = let l = Raw.length xs ys = rotate l xs in Raw.pack $ map (Raw.last . rotate l xs) $ sortBy (\n m -> compare (ys n) (ys m)) [0..(l-1)] Now I can transform 52 KB in 54 seconds + 30 MB RAM. Still nowhere near C++, but a big improvement none the less. Woah... What the hell? I just switched to Data.ByteString.Lazy and WHAM! Vast speed increases... Jeepers, I can transform 52 KB so fast I can't even get to Task Manager fast enough to *check* the RAM usage! Blimey... OK, just tried the 145 KB test file that Mr C++ used. That took 2 seconds + 43 MB RAM. Ouch.