Manual constructor specialization

I have a rope data type with the invariant that one of its data constructors can never appear as a leaf. module Data.Rope where import Data.Word (Word8) data Rope = Empty | Leaf | Node !Rope !Rope index :: Rope -> Int -> Word8 index Empty _ = error "empty" index Leaf _ = error "leaf" index (Node l r) n = index' l n where index' Leaf _ = error "leaf" index' (Node l r) n = index' l n I removed some of the actual details (Leafs have a ByteString member and Nodes have a length and a depth field). The point is that Empty can only appear at the top by construction (i.e. it's not possible to construct a rope that breaks this invariant using the exported API). If I understand compilation of case statements correctly they will compile into some if like construct that checks the constructor tag. This means that I would like that once we have established that the constructor is not Empty we could use a specialized function to traverse the tree without checking for Empty on each iteration. The above doesn't achieve this as GHC inserts an automatic error branch for Empty in index'. Perhaps one solution is to reorder the cases and remove index'? -- Johan

On 10/9/07, Johan Tibell
data Rope = Empty | Leaf | Node !Rope !Rope
The point is that Empty can only appear at the top by construction
How about indicating this in your data type? I.e., data Rope = Empty | NonEmptyRope data NonEmptyRope = Leaf | Node !NonEmptyRope !NonEmptyRope

On 10/9/07, David Benbennick
On 10/9/07, Johan Tibell
wrote: data Rope = Empty | Leaf | Node !Rope !Rope
The point is that Empty can only appear at the top by construction
How about indicating this in your data type? I.e.,
data Rope = Empty | NonEmptyRope data NonEmptyRope = Leaf | Node !NonEmptyRope !NonEmptyRope
That would be an idea. What are the performance effects of this? I'm trying to not have too many layers of indirection (I will spend quite some time reading -ddumpsimpl to optimize this library so anything that makes that simpler is a Good Thing.)

On Tue, 2007-10-09 at 17:40 +0200, Johan Tibell wrote:
On 10/9/07, David Benbennick
wrote: On 10/9/07, Johan Tibell
wrote: data Rope = Empty | Leaf | Node !Rope !Rope
The point is that Empty can only appear at the top by construction
How about indicating this in your data type? I.e.,
data Rope = Empty | NonEmptyRope data NonEmptyRope = Leaf | Node !NonEmptyRope !NonEmptyRope
That would be an idea. What are the performance effects of this? I'm trying to not have too many layers of indirection (I will spend quite some time reading -ddumpsimpl to optimize this library so anything that makes that simpler is a Good Thing.)
Another approach would be to define your invariant as a QuickCheck property. If you run your test suite often enough you will still find errors quickly. W.r.t. performance, note that allocating data on the C-stack might be slow, so make sure you also measure this. / Thomas

Johan Tibell wrote:
On 10/9/07, David Benbennick
wrote: On 10/9/07, Johan Tibell
wrote: data Rope = Empty | Leaf | Node !Rope !Rope The point is that Empty can only appear at the top by construction How about indicating this in your data type? I.e.,
data Rope = Empty | NonEmptyRope data NonEmptyRope = Leaf | Node !NonEmptyRope !NonEmptyRope
That would be an idea. What are the performance effects of this? I'm trying to not have too many layers of indirection (I will spend quite some time reading -ddumpsimpl to optimize this library so anything that makes that simpler is a Good Thing.)
What you might want as a wrapper is a strict version of a maybe type:
data Rope = NoRope | JustRope {-# UNPACK #-} !ARope data ARope = Leaf {- rest of data here -} | Node {-# UNPACK #-} !NonEmptyRope {-# UNPACK #-} !NonEmptyRope
With the "constructor tagging" in the latest GHC the runtime indirection costs for such 2-constructor types should be quite low. -- Chris Kuklewicz

A GADT version seems to generate OK code: data Top data NTop data Rope t where Empty :: Rope Top Leaf :: Rope NTop Node :: !(Rope NTop) -> !(Rope NTop) -> Rope NTop index :: Rope t -> Int -> Word8 index Empty _ = error "empty" index Leaf _ = error "leaf" index (Node l r) n = index' l n where index' :: Rope NTop -> Int -> Word8 index' Leaf _ = error "leaf" index' (Node l r) n = index' l n Here's the result of -O -ddump-simpl for index': $windex'_rx5 = \ (w_swu :: Tree.Rope Tree.NTop) -> case w_swu of wild_Xf { Tree.Leaf -> lvl1_rwZ; Tree.Node l_adF r_adG -> $windex'_rx5 l_adF } Regards, Zun.
participants (6)
-
Bryan O'Sullivan
-
ChrisK
-
David Benbennick
-
Johan Tibell
-
Roberto Zunino
-
Thomas Schilling