Re: [Haskell-cafe] excercise - a completely lazy sorting algorithm

7 Jul 2009


      The "sorted array of bags of unsorted input" is a nice idea.  However,
you have to use the data structure in a single-threaded [1] fashion to
obtain the claimed bounds.

Here's a pure solution that uses amortization and laziness.
...
import qualified Data.Sequence as S
import Data.Sequence ((><))
import Data.Foldable
import Data.Monoid
Suppose we have a function to find the the median of a list, and
partition it into three sublists: Smaller than the median, equal to
the media, larger than the median.  That function should run in linear
time.
...
partitionOnMedian :: forall a. (Ord a) => (S.Seq a) -> BTreeRaw a (S.Seq a)
where the following data structure holds the sublists and some
bookkeeping information:
...
data BTreeRaw a m = Leaf
                  | Node {cmp::(a->Ordering)
                         , lN :: Int
                         , less::m
                         , eq :: (S.Seq a)
                         , gN :: Int
                         , greater::m
                         }
where 'lN' and 'gN' are the length of 'less' and 'greater'.

We can make BTreeRaw a functor:
...
instance Functor (BTreeRaw a) where
    fmap f Leaf = Leaf
    fmap f (Node c lN l e gN g) = Node c lN (f l) e gN (f g)
Now using a fixed-point construction we can bootstrap a sorting
algorithm from partitionOnMedian:
...
data Fix m = Fix {unfix :: (m (Fix m))}
type BTree a = Fix (BTreeRaw a)
...
treeSort :: forall a. (Ord a) => S.Seq a -> BTree a
treeSort = Fix . helper . partitionOnMedian
    where helper = fmap (Fix . helper . partitionOnMedian)
Now treeSort produces the thunk of a balanced binary search tree.  Of
course we can get a sorted list out of it (forcing the whole
structure):
...
flatten :: BTree a -> S.Seq a
flatten (Fix Leaf) = S.empty
flatten (Fix (Node _ lN l e gN g)) = flatten l >< e >< flatten g
...
mySort = flatten . treeSort
But we can also get elements efficently, forcing only a linear amount
of comparisions in the worst case:
...
index :: BTree a -> Int -> a
index (Fix Leaf) _ = error "tried to get an element of Leaf"
index (Fix (Node lN l e gN g)) i | i < lN
                                     = index l i
                                 | i - lN < S.length e
                                     = S.index e (i-lN)
                                 | i - lN - S.length e < gN
                                     = index g (i - lN - S.length e)
                                 | i - lN - S.length e - gN >= 0
                                     = error "index out of bounds"
Although we do have to force comparisions only once every time we
touch the same element in the tree, we do still have to traverse the
tree (in logarithmic time).

If you want linear time access on first touch of an element and
constant time access afterwards us toArray:
...
toArray :: (IA.IArray a t) => Fix (BTreeRaw t) -> a Int t
toArray tree = IA.listArray (0,maxI) (map (index tree) [0..maxI])
    where size (Fix Leaf) = 0
          size (Fix (Node lN _ e gN _)) = lN + S.length e + gN
          maxI = size tree - 1
[1] Single-Threaded in the sense of Okasaki's use of the word.

Re: [Haskell-cafe] excercise - a completely lazy sorting algorithm

Matthias Görgens