ANN: data-fix-cse -- Common subexpression elimination for EDSLs

I'm glad to announce the package for Common subexpression elimination [1]. It's an implementation of the hashconsig algorithm as described in the paper 'Implementing Explicit and Finding Implicit Sharing in EDSLs' by Oleg Kiselyov. Main point of the library is to define this algorithm in the most generic way. You can define the AST for your DSL as fixpoint type[2]. And then all you need to use the library is to define the instance for type class `Traversable`. This idea is inspired by `data-reify` [3] package which you can use to transform your ASTs to DAGs too. But it relies on inspection of the references for values when `data-fix-cse` doesn't sacrifices the purity. A short example: Let's define a tiny DSL for signals import Data.Fix type Name = String type E = Fix Exp data Exp a = Const Double | ReadPort Name | Tfm Name [a] | Mix a a deriving (Show, Eq, Ord) We can make constant signals, read them from some ports and transform them (apply some named function to the list of signals) and mix two signals. Let's define an instance of the Traversable (hence for the Functor and Foldable) import Control.Applicative import Data.Monoid import Data.Traversable import Data.Foldable instance Functor Exp where fmap f x = case x of Const d -> Const d ReadPort n -> ReadPort n Mix a b -> Mix (f a) (f b) Tfm n as -> Tfm n $ fmap f as instance Foldable Exp where foldMap f x = case x of Mix a b -> f a <> f b Tfm n as -> mconcat $ fmap f as _ -> mempty instance Traversable Exp where traverse f x = case x of Mix a b -> Mix <$> f a <*> f b Tfm n as -> Tfm n <$> traverse f as a -> pure a Now we can use the functio `cse` cse :: (Eqhttp://hackage.haskell.org/packages/archive/base/4.6.0.1/doc/html/Data-Eq.ht...(f Inthttp://hackage.haskell.org/packages/archive/base/4.6.0.1/doc/html/Data-Int.h...), Ordhttp://hackage.haskell.org/packages/archive/base/4.6.0.1/doc/html/Data-Ord.h...(f Inthttp://hackage.haskell.org/packages/archive/base/4.6.0.1/doc/html/Data-Int.h...), Traversablehttp://hackage.haskell.org/packages/archive/base/4.6.0.1/doc/html/Data-Trave...f) => Fixhttp://hackage.haskell.org/packages/archive/data-fix/0.0.1/doc/html/Data-Fix...f -> Daghttp://hackage.haskell.org/packages/archive/data-fix-cse/0.0.1/doc/html/Data...f to transform our AST to DAG. DAG is already sorted. Later we can define a handy wrapper to hide the details from the client newtype Sig = Sig { unSig :: E } You can find examples in the package archive Extra-Source-Files: test/Exp.hs test/Impl.hs test/Expl.hs If you want to see a real world example of usage you can find it in the csound-expression[4]. An edsl for the Csound language. One side-note form my experience: Fixpoint types can be very flexible. It's easy to compose them. If suddenly we need to add some extra data to all cases from the example above we can easily do it with just another Functor: Imagine that we want to use a SampleRate value with all signals. Then we can do it like this: type E = Fix SampledExp data SampledExp a = SampledExp SampleRate (Exp a) then we should define an instance of the type class Traversable for our new type SampleRate. The Exp doesn't change. [1] http://hackage.haskell.org/package/data-fix-cse-0.0.1 [2] http://hackage.haskell.org/package/data-fix-0.0.1 [3] http://hackage.haskell.org/package/data-reify [4] http://hackage.haskell.org/package/csound-expression Anton

2013-02-19 12:10, Anton Kholomiov skrev:
I'm glad to announce the package for Commonsubexpression elimination [1]. It's an implementation of the hashconsig algorithm as described in the paper 'Implementing Explicit and Finding Implicit Sharing in EDSLs' by Oleg Kiselyov.
Main point of the library is to define this algorithm in the most generic way. You can define the AST for your DSL as fixpoint type[2]. And then all you need to use the library is to define the instance for type class `Traversable`.
One way to make the library even more useful would have been to base it on compdata instead of data-fix. Compdata has support for composable types and lots of extra functionality. On the other hand, it's easy enough to translate from compdata terms to your `Fix`...
One side-note form my experience: Fixpoint types can be very flexible. It's easy to compose them. If suddenly we need to add some extra data to all cases from the example above we can easily do it with just another Functor:
Imagine that we want to use a SampleRate value with all signals. Then we can do it like this:
type E = Fix SampledExp
data SampledExp a = SampledExp SampleRate (Exp a)
then we should define an instance of the type class Traversable for our new type SampleRate. The Exp doesn't change.
Very useful indeed! A more principled way to extend data types in this way is Data Types à la Carte: http://dx.doi.org/10.1017/S0956796808006758 (Implemented in compdata.) / Emil

There are several packages that already define fixpoints (another one is
about unification), but all packages that I'm aware of define a lot of
functionality
that I don't need (and actually don't understand, packages with fixpoint
types
tend to be rather dense with math). I'd like it to be simple and
lightweight.
Just fixpoints, just folds and unfolds.
2013/2/19 Emil Axelsson
2013-02-19 12:10, Anton Kholomiov skrev:
I'm glad to announce the package for Commonsubexpression elimination [1].
It's an implementation of the hashconsig algorithm as described in the paper 'Implementing Explicit and Finding Implicit Sharing in EDSLs' by Oleg Kiselyov.
Main point of the library is to define this algorithm in the most generic way. You can define the AST for your DSL as fixpoint type[2]. And then all you need to use the library is to define the instance for type class `Traversable`.
One way to make the library even more useful would have been to base it on compdata instead of data-fix. Compdata has support for composable types and lots of extra functionality. On the other hand, it's easy enough to translate from compdata terms to your `Fix`...
One side-note form my experience: Fixpoint types can be very flexible.
It's easy to compose them. If suddenly we need to add some extra data to all cases from the example above we can easily do it with just another Functor:
Imagine that we want to use a SampleRate value with all signals. Then we can do it like this:
type E = Fix SampledExp
data SampledExp a = SampledExp SampleRate (Exp a)
then we should define an instance of the type class Traversable for our new type SampleRate. The Exp doesn't change.
Very useful indeed! A more principled way to extend data types in this way is Data Types à la Carte:
http://dx.doi.org/10.1017/**S0956796808006758http://dx.doi.org/10.1017/S0956796808006758
(Implemented in compdata.)
/ Emil

Fully understandable! Compdata would be quite a heavy dependency for your library. I'm just generally fond of the idea of collecting all DSL implementation tricks under one umbrella. That requires using the same term representation. / Emil 2013-02-19 14:12, Anton Kholomiov skrev:
There are several packages that already define fixpoints (another one is about unification), but all packages that I'm aware of define a lot of functionality that I don't need (and actually don't understand, packages with fixpoint types tend to be rather dense with math). I'd like it to be simple and lightweight. Just fixpoints, just folds and unfolds.
2013/2/19 Emil Axelsson
mailto:emax@chalmers.se> 2013-02-19 12:10, Anton Kholomiov skrev:
I'm glad to announce the package for Commonsubexpression elimination [1].
It's an implementation of the hashconsig algorithm as described in the paper 'Implementing Explicit and Finding Implicit Sharing in EDSLs' by Oleg Kiselyov.
Main point of the library is to define this algorithm in the most generic way. You can define the AST for your DSL as fixpoint type[2]. And then all you need to use the library is to define the instance for type class `Traversable`.
One way to make the library even more useful would have been to base it on compdata instead of data-fix. Compdata has support for composable types and lots of extra functionality. On the other hand, it's easy enough to translate from compdata terms to your `Fix`...
One side-note form my experience: Fixpoint types can be very flexible. It's easy to compose them. If suddenly we need to add some extra data to all cases from the example above we can easily do it with just another Functor:
Imagine that we want to use a SampleRate value with all signals. Then we can do it like this:
type E = Fix SampledExp
data SampledExp a = SampledExp SampleRate (Exp a)
then we should define an instance of the type class Traversable for our new type SampleRate. The Exp doesn't change.
Very useful indeed! A more principled way to extend data types in this way is Data Types à la Carte:
http://dx.doi.org/10.1017/__S0956796808006758 http://dx.doi.org/10.1017/S0956796808006758
(Implemented in compdata.)
/ Emil

What a delightfully elegant approach to CSE! I've been thinking about CSE
for DSELs and about functor fixpoints, but it never occurred to me to put
the two together.
Do you think the approach can be extended for non-regular (nested)
algebraic types (where the recursive data type is sometimes at a different
type instance)? For instance, it's very handy to use GADTs to capture
embedded language types in host language (Haskell) types, which leads to
non-regularity.
- Conal
On Tue, Feb 19, 2013 at 3:10 AM, Anton Kholomiov
I'm glad to announce the package for Common subexpression elimination [1].
It's an implementation of the hashconsig algorithm as described in the paper 'Implementing Explicit and Finding Implicit Sharing in EDSLs' by Oleg Kiselyov.
Main point of the library is to define this algorithm in the most generic way. You can define the AST for your DSL as fixpoint type[2]. And then all you need to use the library is to define the instance for type class `Traversable`. This idea is inspired by `data-reify` [3] package which you can use to transform your ASTs to DAGs too. But it relies on inspection of the references for values when `data-fix-cse` doesn't sacrifices the purity.
A short example:
Let's define a tiny DSL for signals
import Data.Fix
type Name = String
type E = Fix Exp
data Exp a = Const Double | ReadPort Name | Tfm Name [a] | Mix a a deriving (Show, Eq, Ord)
We can make constant signals, read them from some ports and transform them (apply some named function to the list of signals) and mix two signals.
Let's define an instance of the Traversable (hence for the Functor and Foldable)
import Control.Applicative
import Data.Monoid import Data.Traversable import Data.Foldable
instance Functor Exp where fmap f x = case x of Const d -> Const d ReadPort n -> ReadPort n Mix a b -> Mix (f a) (f b) Tfm n as -> Tfm n $ fmap f as
instance Foldable Exp where foldMap f x = case x of Mix a b -> f a <> f b Tfm n as -> mconcat $ fmap f as _ -> mempty
instance Traversable Exp where traverse f x = case x of Mix a b -> Mix <$> f a <*> f b Tfm n as -> Tfm n <$> traverse f as a -> pure a
Now we can use the functio `cse`
cse :: (Eqhttp://hackage.haskell.org/packages/archive/base/4.6.0.1/doc/html/Data-Eq.ht...(f Inthttp://hackage.haskell.org/packages/archive/base/4.6.0.1/doc/html/Data-Int.h...), Ordhttp://hackage.haskell.org/packages/archive/base/4.6.0.1/doc/html/Data-Ord.h...(f Inthttp://hackage.haskell.org/packages/archive/base/4.6.0.1/doc/html/Data-Int.h...), Traversablehttp://hackage.haskell.org/packages/archive/base/4.6.0.1/doc/html/Data-Trave...f) => Fixhttp://hackage.haskell.org/packages/archive/data-fix/0.0.1/doc/html/Data-Fix...f -> Daghttp://hackage.haskell.org/packages/archive/data-fix-cse/0.0.1/doc/html/Data...f
to transform our AST to DAG. DAG is already sorted.
Later we can define a handy wrapper to hide the details from the client
newtype Sig = Sig { unSig :: E }
You can find examples in the package archive
Extra-Source-Files: test/Exp.hs test/Impl.hs test/Expl.hs
If you want to see a real world example of usage you can find it in the csound-expression[4]. An edsl for the Csound language.
One side-note form my experience: Fixpoint types can be very flexible. It's easy to compose them. If suddenly we need to add some extra data to all cases from the example above we can easily do it with just another Functor:
Imagine that we want to use a SampleRate value with all signals. Then we can do it like this:
type E = Fix SampledExp
data SampledExp a = SampledExp SampleRate (Exp a)
then we should define an instance of the type class Traversable for our new type SampleRate. The Exp doesn't change.
[1] http://hackage.haskell.org/package/data-fix-cse-0.0.1 [2] http://hackage.haskell.org/package/data-fix-0.0.1 [3] http://hackage.haskell.org/package/data-reify [4] http://hackage.haskell.org/package/csound-expression
Anton
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Do you think the approach can be extended for non-regular (nested) algebraic types (where the recursive data type is sometimes at a different type instance)? For instance, it's very handy to use GADTs to capture embedded language types in host language (Haskell) types, which leads to non-regularity.
I'm not sure I understand the case you are talking about. Can you write a simple example of the types like this? Cheers, Anton

On Tue, Feb 19, 2013 at 9:28 PM, Anton Kholomiov
Do you think the approach can be extended for non-regular (nested)
algebraic types (where the recursive data type is sometimes at a different type instance)? For instance, it's very handy to use GADTs to capture embedded language types in host language (Haskell) types, which leads to non-regularity.
I'm not sure I understand the case you are talking about. Can you write a simple example of the types like this?
Here's an example of a type-embedded DSEL, represented as a GADT:
data E :: * -> * where Lit :: Show a => a -> E a Op :: Op a -> E a App :: E (a -> b) -> E a -> E b ...
data Op :: * -> * where Add :: Num a => E (a -> a -> a) Mul :: Num a => E (a -> a -> a) Neg :: Num a => E (a -> a) ...
-- Conal

I don't know how to express it. You need to have some dynamic
representation since
dag is a container of `(Int, f Int)`. I've tried to go along this road
type Exp a = Fix (E a)
data E c :: * -> * where
Lit :: Show a => a -> E a c
Op :: Op a -> E a c
App :: Phantom (a -> b) c -> Phantom a c -> E b c
data Op :: * -> * where
Add :: Num a => Op (a -> a -> a)
Mul :: Num a => Op (a -> a -> a)
Neg :: Num a => Op (a -> a)
newtype Phantom a b = Phantom { unPhantom :: b }
But got stuck with the definition of
app :: Exp (a -> b) -> Exp a -> Exp b
app f a = Fix $ App (Phantom f) (Phantom a)
App requires the arguments to be of the same type (in the second type
argument `c`).
2013/2/23 Conal Elliott
On Tue, Feb 19, 2013 at 9:28 PM, Anton Kholomiov < anton.kholomiov@gmail.com> wrote:
Do you think the approach can be extended for non-regular (nested)
algebraic types (where the recursive data type is sometimes at a different type instance)? For instance, it's very handy to use GADTs to capture embedded language types in host language (Haskell) types, which leads to non-regularity.
I'm not sure I understand the case you are talking about. Can you write a simple example of the types like this?
Here's an example of a type-embedded DSEL, represented as a GADT:
data E :: * -> * where Lit :: Show a => a -> E a Op :: Op a -> E a App :: E (a -> b) -> E a -> E b ...
data Op :: * -> * where Add :: Num a => E (a -> a -> a) Mul :: Num a => E (a -> a -> a) Neg :: Num a => E (a -> a) ...
-- Conal

This should be possible using higher-order terms, as in http://hackage.haskell.org/packages/archive/compdata/latest/doc/html/Data-Co... The only complication I see is that the Dag nodes would get heterogeneous types requiring existential quantification with a `Typeable` constraint. A better representation might be typed ASGs [1] Syntactic has typed ASTs and it has a module that does something similar to data-fix-cse (uses a combination of stable names and hashing), but it needs some fixing up. / Emil [1]: http://dl.acm.org/citation.cfm?id=2426909 2013-02-20 01:58, Conal Elliott skrev:
Do you think the approach can be extended for non-regular (nested) algebraic types (where the recursive data type is sometimes at a different type instance)? For instance, it's very handy to use GADTs to capture embedded language types in host language (Haskell) types, which leads to non-regularity.
participants (3)
-
Anton Kholomiov
-
Conal Elliott
-
Emil Axelsson