
The "sorted array of bags of unsorted input" is a nice idea. However, you have to use the data structure in a single-threaded [1] fashion to obtain the claimed bounds. Here's a pure solution that uses amortization and laziness.
import qualified Data.Sequence as S import Data.Sequence ((><)) import Data.Foldable import Data.Monoid
Suppose we have a function to find the the median of a list, and partition it into three sublists: Smaller than the median, equal to the media, larger than the median. That function should run in linear time.
partitionOnMedian :: forall a. (Ord a) => (S.Seq a) -> BTreeRaw a (S.Seq a)
where the following data structure holds the sublists and some bookkeeping information:
data BTreeRaw a m = Leaf | Node {cmp::(a->Ordering) , lN :: Int , less::m , eq :: (S.Seq a) , gN :: Int , greater::m }
where 'lN' and 'gN' are the length of 'less' and 'greater'. We can make BTreeRaw a functor:
instance Functor (BTreeRaw a) where fmap f Leaf = Leaf fmap f (Node c lN l e gN g) = Node c lN (f l) e gN (f g)
Now using a fixed-point construction we can bootstrap a sorting algorithm from partitionOnMedian:
data Fix m = Fix {unfix :: (m (Fix m))} type BTree a = Fix (BTreeRaw a)
treeSort :: forall a. (Ord a) => S.Seq a -> BTree a treeSort = Fix . helper . partitionOnMedian where helper = fmap (Fix . helper . partitionOnMedian)
Now treeSort produces the thunk of a balanced binary search tree. Of course we can get a sorted list out of it (forcing the whole structure):
flatten :: BTree a -> S.Seq a flatten (Fix Leaf) = S.empty flatten (Fix (Node _ lN l e gN g)) = flatten l >< e >< flatten g
mySort = flatten . treeSort
But we can also get elements efficently, forcing only a linear amount of comparisions in the worst case:
index :: BTree a -> Int -> a index (Fix Leaf) _ = error "tried to get an element of Leaf" index (Fix (Node lN l e gN g)) i | i < lN = index l i | i - lN < S.length e = S.index e (i-lN) | i - lN - S.length e < gN = index g (i - lN - S.length e) | i - lN - S.length e - gN >= 0 = error "index out of bounds"
Although we do have to force comparisions only once every time we touch the same element in the tree, we do still have to traverse the tree (in logarithmic time). If you want linear time access on first touch of an element and constant time access afterwards us toArray:
toArray :: (IA.IArray a t) => Fix (BTreeRaw t) -> a Int t toArray tree = IA.listArray (0,maxI) (map (index tree) [0..maxI]) where size (Fix Leaf) = 0 size (Fix (Node lN _ e gN _)) = lN + S.length e + gN maxI = size tree - 1
[1] Single-Threaded in the sense of Okasaki's use of the word.