[Rejected Paper] Experience Report: Writing NetBSD Sound Drivers in Haskell


Kiwamu Okabe

10 Jun 2014

10:13 a.m.

Hi jhc-hackers,

I have written a paper, "Experience Report: Writing NetBSD Sound Drivers in Haskell".

http://metasepi.org/papers.html

It explains Ajhc's customized version of jhc's GC, and may sometimes be useful for developing jhc.

Thanks,
-- Kiwamu Okabe at METASEPI DESIGN


John Meacham

10 Jun

10:55 a.m.


Oooh, and you documented jhc's RTS while you were at it. That is great :)

I have been thinking about a way to extend JGC to seamlessly handle interfaces that use stack-allocated C structs, the main target being GMP. Right now I can do pretty well by having a 'self-pointing', 'self-cleaning' ForeignPtr. The definition of ForeignPtr is

    data ForeignPtr = ForeignPtr Addr_

However, it is a little magic in that you can allocate it in a larger space than it would naturally take up, so I can have Addr_ point to the word directly following it in memory. The garbage collector treats it like any other object and needs no finalizer: as long as the ForeignPtr is live, the area it points to is live, since they are the same space. As far as the code is concerned, it could be pointing to an external C structure. This is a very efficient and very fast way to throw around C structures without having to worry about whether they were allocated in Haskell or in C.

However, things like GMP require the memory region to be initialized and freed, since it may have internal pointers. I could just continue with standard foreign pointers and attach a destructor, but this has a couple of problems:

- I have to initialize the memory area. This is hard to do without invoking the IO monad to ensure proper sequencing, which seems heavy for the result of an addition, and it is hard for the compiler to "see through" an unsafePerformIO when optimizing.
- Every Integer will have to carry around two extra words: a self pointer that always points to its own memory location, and a pointer to a destructor that is always going to be the same.
- Memory will be destructed just when it is likely to be immediately re-used as an Integer. It would be good to deforest this destruct-construct pair.

To solve these, I was thinking of associating a constructor/destructor with a type rather than a value. By creating an entire block of the same type, it can initialize the entire block at once, then delay the destructor until the entire block is freed. Since GMP ints can be re-used in place, it would get rid of almost all initialization and destruction overhead. A 'delayed destructor', if you will, that only needs to be called if the memory location is going to be used for a different type. Allocations in jhc are already tagged by type, so this isn't difficult to keep track of.
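A minimal C sketch of the self-pointing layout described at the start of this message may help. It is only an illustration: the names `self_ptr`, `self_ptr_alloc`, `self_ptr_wrap` and `self_ptr_is_inline` are hypothetical, and `malloc` stands in for the collector's allocator, so this is not jhc's actual representation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical layout: one header word (playing the role of Addr_),
   optionally followed in the same allocation by the C structure itself. */
typedef struct self_ptr {
    void *addr;               /* the only field the Haskell side sees */
    unsigned char payload[];  /* inline storage, present only when allocated oversized */
} self_ptr;

/* Allocate header and payload as one object and make the header point at
   the word that follows it. Assumption: malloc replaces the GC allocator. */
self_ptr *self_ptr_alloc(size_t payload_size)
{
    self_ptr *p = malloc(sizeof *p + payload_size);
    if (p != NULL)
        p->addr = p->payload;
    return p;
}

/* Wrap memory that was allocated by C code; the header looks the same. */
self_ptr *self_ptr_wrap(void *external)
{
    self_ptr *p = malloc(sizeof *p);
    if (p != NULL)
        p->addr = external;
    return p;
}

/* Code using the pointer never needs this test; it is here only to show
   that the two cases differ in where addr points, not in how it is used. */
int self_ptr_is_inline(const self_ptr *p)
{
    return p->addr == (const void *)p->payload;
}
```

Because the payload lives inside the same object as the header, keeping the header alive keeps the payload alive, which is why no finalizer is needed in the inline case.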
I was thinking something like

    data {-# CCONSTRUCTOR "init_integer" #-} {-# CDESTRUCTOR delayed "fini_integer" #-} Integer_ :: #
    data Integer = Integer Integer_

Since Integer_ is unboxed, we can ensure they are only created in the right heap by using a primitive to do so; there is no way for a user to conjure up an unboxed type that doesn't take part in unboxed num polymorphism, so can be represented by 0# 1# etc.

John

-- John Meacham - http://notanumber.net/
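The 'delayed destructor' proposal in the message above can be sketched in C roughly as follows. This is a toy model under stated assumptions: `big_t` is a hypothetical stand-in for a GMP integer, `integer_block` and its three functions are invented for illustration, and the counters exist only to make the behaviour observable. Only the names `init_integer` and `fini_integer` come from the message.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SLOTS 8

/* Hypothetical stand-in for a GMP-style value that needs init and fini. */
typedef struct { int initialized; long value; } big_t;

int init_calls = 0, fini_calls = 0;   /* instrumentation only */

void init_integer(void *p) { big_t *b = p; b->initialized = 1; b->value = 0; init_calls++; }
void fini_integer(void *p) { big_t *b = p; b->initialized = 0; fini_calls++; }

/* A block whose slots all hold the same type. */
typedef struct {
    int constructed;                  /* have all slots been initialized? */
    unsigned char used[BLOCK_SLOTS];
    big_t slots[BLOCK_SLOTS];
} integer_block;

/* Hand out a slot. The constructor runs once per block, not once per value. */
big_t *integer_block_alloc(integer_block *b)
{
    size_t i;
    if (!b->constructed) {
        for (i = 0; i < BLOCK_SLOTS; i++)
            init_integer(&b->slots[i]);
        b->constructed = 1;
    }
    for (i = 0; i < BLOCK_SLOTS; i++) {
        if (!b->used[i]) {
            b->used[i] = 1;
            return &b->slots[i];
        }
    }
    return NULL;
}

/* What the collector would do for a dead value: free the slot and leave it
   constructed, so the next Integer can re-use it in place. */
void integer_block_release(integer_block *b, big_t *v)
{
    b->used[v - b->slots] = 0;
}

/* The delayed destructor: runs only when the block is given to another type. */
void integer_block_retire(integer_block *b)
{
    size_t i;
    if (!b->constructed)
        return;
    for (i = 0; i < BLOCK_SLOTS; i++) {
        fini_integer(&b->slots[i]);
        b->used[i] = 0;
    }
    b->constructed = 0;
}
```

In this model, releasing and re-allocating a slot calls neither function, which is the destruct-construct pair being eliminated. A re-used slot still holds its old value, so the caller is expected to overwrite it, as GMP operations do with their result operand.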

Kiwamu Okabe

11:19 a.m.


Hi John,

On Tue, Jun 10, 2014 at 7:55 PM, John Meacham wrote:
> Oooh. and you documented jhc's RTS while you were at it. that is great :)

BTW, what do you think about jgc having multiple "arenas", one per context? That would make a reentrant GC possible on jhc. Of course, I know you are unhappy about the cost of initializing jgc on every C => Haskell call. Do you have any ideas for the multiple "arenas"?

Best regards,
-- Kiwamu Okabe at METASEPI DESIGN

John Meacham

11:26 a.m.


Hmm... what were you thinking of in terms of how it would change the API?

By reentrant, do you mean you want C functions that were called by Haskell to be able to call back into Haskell again? What is the issue with the context that is stowed? Or are you talking about SMP/lightweight threads?

-- John Meacham - http://notanumber.net/

John Meacham

11:28 a.m.


Ah, I think I see what you mean by reentrant in your paper. Can you point me to your context switching code in Ajhc?

Is SMP a concern for you, or are you mainly concerned about hardware interrupts?

-- John Meacham - http://notanumber.net/

Kiwamu Okabe

11:42 a.m.


Hi John,

On Tue, Jun 10, 2014 at 8:28 PM, John Meacham wrote:
> Ah. I think I see what you mean by reentrant in your paper. Can you point me to your context switching code in ajhc?

Here it is:

https://github.com/ajhc/ajhc/blob/arafura/rts/rts/conc.c#L33

It is a sample using pthreads, but CLHs can use any threading style by calling C code that performs the context switch.

> Is SMP a concern for you or are you mainly concerned about hardware interrupts?

Both: threads on SMP, and interrupts. The former uses an active context switch, and the example above shows it. The latter uses a passive context switch; however, the interrupt context begins in a C context. The C context creates a new "arena" when it calls C => Haskell.

Thanks,
-- Kiwamu Okabe at METASEPI DESIGN

John Meacham

11:49 a.m.


Hmm... if we allocate the gc_stack on an aligned boundary, can we recover the arena by keeping a pointer at its base? Sort of like how I recover the cache block pointer from an arbitrary heap location by rounding down to the block boundary.

The main issue would be how it affects allocation speed. It's okay to make the GC slower as long as allocation is still fast. Before, pre-populating the cache pointers sped things up considerably; how would it make sure to use one from the current arena without slowing down allocation in general?

John

-- John Meacham - http://notanumber.net/
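The rounding-down trick John mentions in the message above can be illustrated with a small C sketch. Everything here is an assumption made for illustration: the 64 KiB alignment, the names `ARENA_ALIGN`, `region_header`, `region_new` and `arena_of`, and the use of C11 `aligned_alloc` in place of the RTS's own allocator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical region size and alignment; must be a power of two. */
#define ARENA_ALIGN ((uintptr_t)1 << 16)

typedef struct arena {
    int id;                   /* placeholder for the real per-context GC state */
} arena;

/* Stored at the aligned base of every region (for example a gc_stack). */
typedef struct region_header {
    arena *owner;
} region_header;

/* Allocate one aligned region and record its owning arena at the base. */
void *region_new(arena *a)
{
    void *mem = aligned_alloc(ARENA_ALIGN, ARENA_ALIGN);
    if (mem != NULL)
        ((region_header *)mem)->owner = a;
    return mem;
}

/* Recover the arena from any address inside a region: mask off the low
   bits to reach the base, then read the pointer kept there. */
arena *arena_of(const void *p)
{
    uintptr_t base = (uintptr_t)p & ~(ARENA_ALIGN - 1);
    return ((region_header *)base)->owner;
}
```

The lookup costs one mask and one load. Whether that is cheap enough on the allocation path is exactly the open question raised in the message.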

Kiwamu Okabe

11:58 a.m.


Hi John,

On Tue, Jun 10, 2014 at 8:49 PM, John Meacham wrote:
> The main issue would be how it affects allocation speed, its okay to make the GC slower as long as allocation is still fast, Before pre-populating the cache pointers sped things up considerably, how would it make sure to use one from the current arena without slowing down allocation in general?

I don't have any benchmarks for it yet. I worry about the cost of initializing the arena on C => Haskell calls. The current jgc has no such cost, but my jgc initializes the arena on every C => Haskell call. Please imagine the cost of making all of the find_cache() calls.

Regards,
-- Kiwamu Okabe at METASEPI DESIGN

John Meacham

12:07 p.m.


Yeah, find_cache is fairly slow. In fact, just checking if it is NULL noticeably slows things down.

So, something that could be done is to generate a struct with each cache used as an offset in it, basically putting the entire generated s_cache table in a struct and then initializing them all when the arena is allocated. That would add a single indirection through the arena to the caches, which might not be too bad...

What would be better is to use a thread- or processor-local register.

John

-- John Meacham - http://notanumber.net/
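A rough C sketch of the "cache table in a struct" idea from the message above follows. It is a simplified model and not jhc's generated code: the `s_cache` fields shown, the cache names in `cache_table`, and the functions `cache_init`, `arena_init` and `cons_cache` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified cache descriptor; the real s_cache carries more state. */
struct s_cache {
    size_t   obj_size;        /* bytes per object served by this cache */
    unsigned num_ptrs;        /* leading pointer words the GC must trace */
    int      ready;           /* set once the cache has been initialized */
};

/* One field per cache the compiled program uses (names made up here), so
   each cache sits at a constant offset known at compile time. */
struct cache_table {
    struct s_cache cons_cell;
    struct s_cache thunk;
    struct s_cache boxed_int;
};

struct arena {
    struct cache_table caches;
};

void cache_init(struct s_cache *c, size_t obj_size, unsigned num_ptrs)
{
    c->obj_size = obj_size;
    c->num_ptrs = num_ptrs;
    c->ready = 1;
}

/* All caches are set up once, when the arena is created. */
void arena_init(struct arena *a)
{
    cache_init(&a->caches.cons_cell, 3 * sizeof(void *), 2);
    cache_init(&a->caches.thunk,     2 * sizeof(void *), 1);
    cache_init(&a->caches.boxed_int, 2 * sizeof(void *), 0);
}

/* An allocation site needs one indirection through the arena plus a
   constant offset: no lookup and no NULL check. */
struct s_cache *cons_cache(struct arena *a)
{
    return &a->caches.cons_cell;
}
```

For the thread-local alternative, the closest portable approximation in C11 would be a `_Thread_local` pointer to the current arena; a dedicated register would need compiler- or platform-specific support.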

John Meacham

2:16 p.m.


Hmm... well, in any case, collecting the whole context into a handy struct is a good cleanup anyway, even if there is just a single global one. So I should backport that as well as the pthreads code.

-- John Meacham - http://notanumber.net/

Kiwamu Okabe

2:22 p.m.


Hi John,

Thanks for your advice.

On Tue, Jun 10, 2014 at 11:16 PM, John Meacham wrote:
> Hmm.. well in any case, collecting the whole context into a handy struct is a good cleanup anyway, even if there is just a single global one. So I should backport that as well as the pthreads code.

I think depending strongly on pthreads is a bad idea, because it would destroy jhc's minimalism. Ajhc is the result of choosing a design with a selectable thread architecture, but it is slower than jhc... I should think more about how to merge CLHs into jhc...

Thanks,
-- Kiwamu Okabe at METASEPI DESIGN

Henning Thielemann

9:33 p.m.

New subject: [jhc] [Rejected Paper] Experience Report: Writing NetBSD Sound Drivers in Haskell

On 10.06.2014 at 12:13, Kiwamu Okabe wrote:
> Hi jhc-hackers,
>
> I have written a paper "Experience Report: Writing NetBSD Sound Drivers in Haskell".
>
> http://metasepi.org/papers.html

Since you are concerned with low-level Haskell programming, what do you think about the Reduceron project? I also wondered whether it would be possible to teach functional programming to processors with customizable machine code, like the Transmeta processors. I don't know whether comparable projects are still alive.

Kiwamu Okabe

11 Jun

12:03 a.m.


Hi Henning,

On Wed, Jun 11, 2014 at 6:33 AM, Henning Thielemann wrote:
> Since you are concerned with low-level Haskell programming, what do you think about the Reduceron project? I also wondered whether it would be possible to teach functional programming to processors with customizable machine code like the Transmeta processors. I don't know whether comparable projects are still alive.

http://www.cs.york.ac.uk/fp/reduceron/

Oh, I didn't know about it. Thanks.

I am not good at talking about HDL. However, a part of jhc's runtime could perhaps be designed in HDL. The main part of jhc's runtime is the GC. The GC clears the mark bit array before marking, and that clearing could be executed in parallel in HDL.

Regards,
-- Kiwamu Okabe at METASEPI DESIGN
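To show why clearing the mark bits is a natural candidate for parallel hardware, here is a small C sketch of a mark bitmap. The size and the names `mark_bitmap`, `mark_set`, `mark_get` and `mark_clear_all` are assumptions for illustration and do not reflect jhc's actual data structures.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define MARK_BITS  1024       /* hypothetical: one bit per heap slot in a block */
#define WORD_BITS  (sizeof(unsigned long) * CHAR_BIT)
#define MARK_WORDS ((MARK_BITS + WORD_BITS - 1) / WORD_BITS)

typedef struct {
    unsigned long words[MARK_WORDS];
} mark_bitmap;

/* Record that slot i was reached during marking. */
void mark_set(mark_bitmap *m, size_t i)
{
    m->words[i / WORD_BITS] |= 1UL << (i % WORD_BITS);
}

int mark_get(const mark_bitmap *m, size_t i)
{
    return (int)((m->words[i / WORD_BITS] >> (i % WORD_BITS)) & 1UL);
}

/* Clear every mark before a new marking pass. Each word is written
   independently of the others, so the loop has no data dependencies and
   could in principle be done for all words at once in hardware. */
void mark_clear_all(mark_bitmap *m)
{
    size_t w;
    for (w = 0; w < MARK_WORDS; w++)
        m->words[w] = 0;
}
```

In software this loop is sequential (or a `memset`); the point of the sketch is only that nothing in it forces that ordering.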
