
On Fri, Jun 22, 2007 at 09:26:40PM +0100, Andrew Coppin wrote:
OK, so I *started* writing a request for help, but... well I answered my own question! See the bottom...
...
module BWT3 (bwt) where
import Data.List import qualified Data.ByteString as Raw
rotate :: Int -> Raw.ByteString -> Int -> Raw.ByteString rotate l xs n = (Raw.drop (l-n) xs) `Raw.append` (Raw.take (l-n) xs)
bwt xs = let l = Raw.length xs ys = rotate l xs in Raw.pack $ map (Raw.last . rotate l xs) $ sortBy (\n m -> compare (ys n) (ys m)) [0..(l-1)]
Now I can transform 52 KB in 54 seconds + 30 MB RAM. Still nowhere near C++, but a big improvement none the less.
The trouble is that Raw.append is an O(N) operation, making the computation O(N^2) where it ought to be O(NlogN).
Woah... What the hell? I just switched to Data.ByteString.Lazy and WHAM! Vast speed increases... Jeepers, I can transform 52 KB so fast I can't even get to Task Manager fast enough to *check* the RAM usage! Blimey...
OK, just tried the 145 KB test file that Mr C++ used. That took 2 seconds + 43 MB RAM. Ouch.
In this case append is an O(1) operation. But you're still getting killed on prefactors, because you're generating a list of size N and then sorting it. Lists are just not nice data structures to sort, nor are they nice to have for large N. To get better speed and memory use, I think you'd want to avoid the intermediate list in favor of some sort of strict array, but that'd be ugly. -- David Roundy Department of Physics Oregon State University