
Hi,
Is there currently a consensus on what is the best way to embed a pure dataflow language while keeping control over value sharing?
I have a stream processing language that has a monadic interface (a -> b -> m c) to track variable sharing so it can compile down to C or Verilog, but I would like to create an expression interface layer on top (s a -> s b -> s c) that still allows me to recover the underlying monad representation to implement sharing. I got used to writing in monadic form but I really can't sell this to EEs...
Other than a TH or syntax frontend, unsafe tricks, or Conal's compiling to categories which requires a plugin (I believe), is there another way?
Cheers,
Tom

Applicative do notation?
Doesn’t recent GHC also have source plugins? I’m admittedly unfamiliar with those.
Have you looked at how stuff like the ivory/tower EDSL libraries do their embedding?
Do you have enough examples for them to treat getting started as a “use your cookbook” and then go from there?
_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

On 2/5/20 8:17 AM, Carter Schonwald wrote:
Applicative do notation ?
Probably not. See below.
Doesn’t recent ghc also have source plugins ? I’m admittedly unfamiliar with those ?
Have you looked at how stuff like the ivory/tower edsl libraries do their embedding?
I'm unaware of these. Thanks for the pointer.
Do you have enough examples for them to treat getting started as a “use your cookbook” and then go from there ?
Haskell is already a stretch. Doing things like this:
    a <- (b `and`) =<< not c
is just never going to work. Keeping it in ANF:
    nc <- not c
    a <- b `and` nc
is easier to explain, but it is very hard to justify why you can only put the result of a single operation in a variable, and why sometimes it is "let" and other times it is that arrow. Basically I'm asking people to give up expressions while they work just fine in C or Verilog. It's a step down. I really don't blame them.
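For readers who want to try the two styles side by side, here is a minimal sketch. It assumes a hypothetical netlist monad `M` and hypothetical primitives `input`, `not'` and `and'`; none of these names come from the poster's actual library. Every operation emits one node and returns its wire id, which is what makes the sharing explicit.

```haskell
-- Hypothetical netlist node; operands are wire ids (indices into the netlist).
data Node = NInput String | NNot Int | NAnd Int Int
  deriving (Eq, Show)

-- A minimal state monad over the list of nodes emitted so far.
newtype M a = M { runM :: [Node] -> (a, [Node]) }

instance Functor M where
  fmap f (M g) = M $ \s -> let (a, s') = g s in (f a, s')

instance Applicative M where
  pure a = M $ \s -> (a, s)
  M f <*> M g = M $ \s ->
    let (h, s1) = f s
        (a, s2) = g s1
    in (h a, s2)

instance Monad M where
  M g >>= k = M $ \s -> let (a, s') = g s in runM (k a) s'

-- Append a node and return the wire id it was given.
emit :: Node -> M Int
emit n = M $ \s -> (length s, s ++ [n])

input :: String -> M Int
input = emit . NInput

not' :: Int -> M Int
not' = emit . NNot

and' :: Int -> Int -> M Int
and' a b = emit (NAnd a b)

-- ANF style: every intermediate result gets a name.
anf :: M Int
anf = do
  b  <- input "b"
  c  <- input "c"
  nc <- not' c
  b `and'` nc

-- Nested style: same circuit, but the plumbing is visible.
nested :: M Int
nested = do
  b <- input "b"
  c <- input "c"
  (b `and'`) =<< not' c

-- Run a builder and return the emitted netlist.
netlist :: M a -> [Node]
netlist m = snd (runM m [])
```

Both `netlist anf` and `netlist nested` describe the same four nodes; the only difference is how much monadic plumbing the author has to write and read.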

On 2/5/20 8:17 AM, Carter Schonwald wrote:
Have you looked at how stuff like the ivory/tower edsl libraries do their embedding?
That is a nice project! Thanks for pointing it out.
For Ivory, the embedding is monadic so not any different in that respect, and this is also much more expressive than what I need.
I guess I don't know what exactly I don't know...
I'm doing something quite straightforward. Basically I know that the language I'm embedding only has pure functions mapping pairs of sequences to pairs of sequences with the restriction that the mapping is causal when you look at individual elements in a stream, but I don't think this fact is even observable after abstracting to streams.
Keeping track of the sharing information necessary to be able to compile it to an external target introduces an effect. But this is the _only_ effect, and it is an implementation detail that makes me lose all the nice properties of pure functions. That just feels wrong. I'm sure I'm missing something.
I believe the core issue is that I'm not understanding something quite fundamental. Why is it so hard to recover sharing information if the thing that is embedded is pure? I suspect the answer is something something referential transparency but how exactly?
This is what I sort of understand:
- Compiling to categories fixes the problem completely using a big gun: abstracting over function abstraction and application. It's great, but can't be done in Haskell as is. This is probably the cleanest solution. I suspect this also has the answer to my question above but I don't quite see it.
- There is another Functional HDL that solves this using some unsafe reference trick to keep track of the sharing. I believe it is CλaSH but I'm not sure. I believe you can get away with this because the semantics is pure so in practice doesn't cause any inconsistencies, but it really doesn't sound like something I would do without some kind of proof that it is actually ok. If it is ok, it would probably make sense to abstract this in a library. Maybe someone has done that already?
- You can try to recover sharing later by doing common subexpression elimination. This works but has complexity issues and doesn't scale to large systems.
- Maybe it is possible to hide the compiler using existential types. I tried something along these lines but I couldn't figure it out so I don't know if it's just lack of insight or just impossible. Probably the latter.
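The referential-transparency point and the cost of after-the-fact CSE can be illustrated with a small sketch. The `Expr` type and the helper names below are hypothetical. A pure traversal cannot tell `let e = ... in And e e` from `And (...) (...)`, so it sees a tree whose size doubles at every level, and structural CSE has to walk that whole tree to find the few distinct nodes.

```haskell
import Data.List (nub)

-- Hypothetical pure expression type for illustration.
data Expr = Input String | Not Expr | And Expr Expr
  deriving (Eq, Show)

-- Each level uses the previous result twice. On the heap this is a
-- chain of n + 1 objects, but pure code can only observe a tree.
chain :: Int -> Expr
chain 0 = Input "x"
chain n = let e = chain (n - 1) in And e e

-- All subterms as a pure traversal sees them, duplicates included.
subterms :: Expr -> [Expr]
subterms e = e : case e of
  Input _ -> []
  Not a   -> subterms a
  And a b -> subterms a ++ subterms b

-- Number of nodes in the tree view: 2^(n+1) - 1 for 'chain n'.
treeSize :: Expr -> Int
treeSize = length . subterms

-- Naive CSE by structural equality: finds the n + 1 distinct nodes,
-- but only after visiting every node of the exponential tree.
distinctSubterms :: Expr -> Int
distinctSubterms = length . nub . subterms
```

For example, `treeSize (chain 3)` is 15 while `distinctSubterms (chain 3)` is 4. A real CSE pass would use hashing rather than `nub`, but it still starts from the unshared tree view, which is where the scaling problem comes from.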

Try using Clash; it's its own thing, and overanalysis might be more challenging than just trying it out.

On 2/5/20 10:39 AM, Carter Schonwald wrote:
Try using clash, its its own thing, and overanalysis might be more challenging than just trying it out
Fair point.
- There is another Functional HDL that solves this using some unsafe reference trick to keep track of the sharing. I believe it is CλaSH but I'm not sure. I believe you can get away with this because the semantics is pure so in practice doesn't cause any inconsistencies, but it really doesn't sound like something I would do without some kind of proof that it is actually ok. If it is ok, it would probably make sense to abstract this in a library. Maybe someone has done that already?
I was able to recover some information from my notes. I ran into this a while ago, then decided to keep the implementation simple and just use Monads:
A survey in Andy Gill's presentation on observable sharing:
http://www.ittc.ku.edu/~andygill/talks/20090903-hask.pdf
Just checked the Clash website and there is a link to Christiaan Baaij's master thesis, which has a description of the sharing problem in Appendix C:
https://essay.utwente.nl/59482/1/scriptie_C_Baaij.pdf
I think there is probably a use for a generic library that can do this kind of sharing recovery. Still I'm not quite happy with the "can be unsafe in some cases" remarks and would like to learn more. However there might be a tradeoff to use this as a simplified interface to something that is implemented in a safe monadic style under the hood.

The SBV library (https://hackage.haskell.org/package/sbv) uses the ideas in Andy Gill’s Observable sharing paper (http://www.ittc.ku.edu/~andygill/papers/reifyGraph.pdf) to safely observe sharing. Expressions remain pure, so long as “observation” of the sharing is done in the IO monad.
In my experience, this works really well and closely captures the application model: You want your users to program as if in a pure language, but the various backends (For SBV, this means C-compilation, SMTLib translation, Test-case generation etc.) already happens in a monadic framework, so it all works out rather nicely.
SBV doesn’t use Andy’s data-reify package (https://hackage.haskell.org/package/data-reify), but that’s mostly historic. I’d definitely give that package a try. But if it doesn’t work for you for whatever reason (maybe the API doesn’t quite fit), Andy’s paper is extremely well written and you can easily use his ideas to roll your own.
-Levent.

On 2/5/20 12:53 PM, Levent Erkok wrote:
The SBV library (https://hackage.haskell.org/package/sbv) uses the ideas in Andy Gill’s Observable sharing paper (http://www.ittc.ku.edu/~andygill/papers/reifyGraph.pdf) to safely observe sharing. Expressions remain pure, so long as “observation” of the sharing is done in the IO monad.
In my experience, this works really well and closely captures the application model: You want your users to program as if in a pure language, but the various backends (For SBV, this means C-compilation, SMTLib translation, Test-case generation etc.) already happens in a monadic framework, so it all works out rather nicely.
This would probably fit my case also.
SBV doesn’t use Andy’s data-reify package (https://hackage.haskell.org/package/data-reify), but that’s mostly historic. I’d definitely give that package a try. But if it doesn’t work for you for whatever reason (maybe the API doesn’t quite fit), Andy’s paper is extremely well written and you can easily use his ideas to roll your own.
Thanks for the pointers!

Den 2020-02-05 kl. 17:40, skrev Tom Schouten:
I think there is probably a use for a generic library that can do this kind of sharing recovery. Still I'm not quite happy with the "can be unsafe in some cases" remarks and would like to learn more. However there might be a tradeoff to use this as a simplified interface to something that is implemented in a safe monadic style under the hood.
Here's the generic library that Kansas Lava uses for sharing recovery:
https://hackage.haskell.org/package/data-reify
The original Observable Sharing paper explains in more detail the properties of observable sharing:
https://www.researchgate.net/profile/David_Sands3/publication/225679607_Obse...
Basically, the worst thing that can happen is that you get less sharing (and maybe in some cases even more) than you expect because of whether or not GHC decides to inline something. Also, beware that overloaded definitions may not be shared due to the extra dictionary argument.
In practice, I've had very few surprises with observable sharing. But of course, the risk of unexpected duplication is not ideal when we're talking hardware :-)
/ Emil

On 2/5/20 1:18 PM, Emil Axelsson wrote:
Den 2020-02-05 kl. 17:40, skrev Tom Schouten:
I think there is probably a use for a generic library that can do this kind of sharing recovery. Still I'm not quite happy with the "can be unsafe in some cases" remarks and would like to learn more. However there might be a tradeoff to use this as a simplified interface to something that is implemented in a safe monadic style under the hood.
Here's the generic library that Kansas Lava uses for sharing recovery:
https://hackage.haskell.org/package/data-reify
The original Observable Sharing paper explains in more detail the properties of observable sharing:
https://www.researchgate.net/profile/David_Sands3/publication/225679607_Obse...
Basically, the worst thing that can happen is that you get less sharing (and maybe in some cases even more) than you expect because of whether or not GHC decides to inline something. Also, beware that overloaded definitions may not be shared due to the extra dictionary argument.
In practice, I've had very few surprises with observable sharing. But of course, the risk of unexpected duplication is not ideal when we're talking hardware :-)
/ Emil
Thanks. That unpredictable nature doesn't sit well with me :)

It looks like Bluespec open-sourced their toolchain (though its build system was a tad confusing when I tried to build it on my Mac):
https://github.com/B-Lang-org/bsc
The Clash build setup process looks positively relaxing by comparison :)

On Wed, Feb 05, 2020 at 07:56:39AM -0500, Tom Schouten wrote:
Is there currently a consensus on what is the best way to embed a pure dataflow language while keeping control over value sharing?
Have you looked into how Edward Kmett's AD package does observable sharing? Observable sharing is particularly important for reverse mode AD.
-- For this form of reverse-mode AD we use 'System.Mem.StableName.StableName' to recover
-- sharing information from the tape to avoid combinatorial explosion, and thus
-- run asymptotically faster than it could without such sharing information, but the use
-- of side-effects contained herein is benign.
https://github.com/ekmett/ad/blob/c3d9599030f7e4793896013c69ab6b19ce403906/s...
Tom
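As a rough illustration of the `StableName` idea mentioned in that comment, here is a minimal roll-your-own sketch that uses only `base`. It is not the code from the AD package, SBV or data-reify; the `Expr` and `Node` types and the `reify` function are hypothetical. The user-facing expressions stay pure, and sharing is only observed inside `IO`.

```haskell
import Data.IORef
import System.Mem.StableName

-- Hypothetical pure expression type that users would write against.
data Expr
  = Input String
  | Not Expr
  | And Expr Expr

-- Flattened node whose children are node ids.
data Node
  = NInput String
  | NNot Int
  | NAnd Int Int
  deriving (Eq, Show)

-- Walk the expression in IO and give every distinct heap object one id.
-- Assumption: the expression is a finite DAG; this sketch registers a
-- node only after its children, so it would loop on cyclic structures.
reify :: Expr -> IO ([(Int, Node)], Int)
reify root = do
  seen  <- newIORef []  -- association list: stable name -> node id
  nodes <- newIORef []  -- emitted nodes, newest first
  let go e = do
        -- Force to WHNF first so the name refers to the evaluated object.
        name  <- e `seq` makeStableName e
        table <- readIORef seen
        case lookup name table of
          Just i  -> pure i  -- already visited: reuse the id
          Nothing -> do
            node <- case e of
              Input s -> pure (NInput s)
              Not a   -> NNot <$> go a
              And a b -> NAnd <$> go a <*> go b
            i <- length <$> readIORef nodes
            modifyIORef nodes ((i, node) :)
            modifyIORef seen ((name, i) :)
            pure i
  r  <- go root
  ns <- readIORef nodes
  pure (reverse ns, r)

-- 'nc' is bound once and used twice, so both operands of 'And'
-- point at the same heap object.
example :: Expr
example =
  let nc = Not (Input "c")
  in And nc nc
```

Running `reify example` should visit the shared `Not` node only once. As noted elsewhere in this thread, how much sharing is observed for less explicit bindings can depend on what GHC inlines, so this is a sketch of the mechanism, not a guarantee about any particular program.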

On 2/5/20 5:35 PM, Tom Ellis wrote:
On Wed, Feb 05, 2020 at 07:56:39AM -0500, Tom Schouten wrote:
Is there currently a consensus on what is the best way to embed a pure dataflow language while keeping control over value sharing?
Have you looked into how Edward Kmett's AD package does observable sharing? Observable sharing is particularly important for reverse mode AD.
-- For this form of reverse-mode AD we use 'System.Mem.StableName.StableName' to recover
-- sharing information from the tape to avoid combinatorial explosion, and thus
-- run asymptotically faster than it could without such sharing information, but the use
-- of side-effects contained herein is benign.
https://github.com/ekmett/ad/blob/c3d9599030f7e4793896013c69ab6b19ce403906/s...
No, and thanks! Getting many great responses. Will have to chew on this for a bit.
participants (5)
- Carter Schonwald
- Emil Axelsson
- Levent Erkok
- Tom Ellis
- Tom Schouten