Re: [Haskell] installing streams library

On 5/20/06, Donald Bruce Stewart wrote:
Data.ByteString is in the base libraries now. For a bit of the flavour, see: http://haskell.org/haskellwiki/Wc
In this message http://article.gmane.org/gmane.comp.lang.haskell.general/13625 Bulat says:

i foresee that Streams + Fast Packed Strings together will yield a breakthrough in GHC I/O speed, and this can be implemented even without waiting for GHC 6.6

Before reading this I had thought it might be an XOR situation, but now it seems a happy coexistence may be possible. Are there any preliminary results on how these might work together, and the potential speedups?

--
Chad Scherrer
"Time flies like an arrow; fruit flies like a banana" -- Groucho Marx

Hello,

I really wanted to respond to the parent thread, but I deleted it already, so this message will be a bit out of context.

For my own needs, I cabalized and debianized the Streams library. It generates binary debs for ghc6 and hugs -- but I think the hugs version is broken. In any case, it is a start; you can download the packaging at:

http://www.n-heptane.com/nhlab/tmp/Streams_packaging.tar.gz

That tarball contains only the packaging -- you just untar it over the top of an existing 'Streams' directory that already contains the source. I have only done minimal testing on it.

Cabal Question:

The streams library uses cpphs to do some preprocessing. There is a hugs-specific Makefile that invokes cpphs with the command line:

cpphs --noline -D__HUGS__ -D__HUGS_VERSION__=2005 -DSIZEOF_HSINT=4 -DSIZEOF_HSWORD=4

I tried adding the -D flags to the cc-includes section of the .cabal file, but that caused the ghc6 build to start failing. Is there an easy way to specify flags that should only be used with cpphs and only when building for hugs? Or is this one of those cases where I need to use some of the fancy hook features of cabal?

Thanks.
j.

On Sun, May 21, 2006 at 01:20:54PM -0700, Jeremy Shaw wrote:
Cabal Question:
The streams library uses cpphs to do some preprocessing. There is a hugs specific Makefile that invokes cpphs with the command-line:
cpphs --noline -D__HUGS__ -D__HUGS_VERSION__=2005 -DSIZEOF_HSINT=4 -DSIZEOF_HSWORD=4
I tried adding the -D stuff to the cc-includes section of the .cabal file, but that caused the ghc6 build to start failing. Is there an easy way to specify flags that should only be used with cpphs and only when building hugs? Or is this one of those cases when I need to use some of the fancy hook features of cabal?
Cabal already adds -D__HUGS__ when building for Hugs, __HUGS_VERSION__ isn't used, and the SIZEOF's aren't universally valid. For GHC, the package gets them from MachDeps.h (an undocumented interface). Doing it portably probably requires autoconfery. Apart from that, the main in Setup.lhs could just be defaultMain.
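For reference, the trivial Setup.lhs that Ross describes is the standard two-liner (plus the runhaskell header):

#!/usr/bin/env runhaskell

> import Distribution.Simple
> main = defaultMain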

Hello Ross, Monday, May 22, 2006, 3:59:17 AM, you wrote:
cpphs --noline -D__HUGS__ -D__HUGS_VERSION__=2005 -DSIZEOF_HSINT=4 -DSIZEOF_HSWORD=4
Cabal already adds -D__HUGS__ when building for Hugs, __HUGS_VERSION__ isn't used, and the SIZEOF's aren't universally valid. For GHC, the package gets them from MachDeps.h (an undocumented interface). Doing it portably probably requires autoconfery.
but what can one do if the library depends on the Hugs version and the word size of the target machine? GHC can solve such problems by providing all the necessary preprocessor symbols, but Hugs can't :(

--
Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
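A side note on the word-size half of the problem: it can also be determined at runtime rather than baked in at preprocessing time. A small sketch, assuming the implementation ships Foreign.Storable (GHC does; recent Hugs should, via its FFI libraries):

import Foreign.Storable (sizeOf)

-- compute what -DSIZEOF_HSINT hard-codes: the size in bytes of Int
-- on the host machine
sizeofHsInt :: Int
sizeofHsInt = sizeOf (undefined :: Int)

main :: IO ()
main = putStrLn ("SIZEOF_HSINT = " ++ show sizeofHsInt)

This doesn't help with conditional compilation (the preprocessor still can't see it), but it covers code that only needs the value.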

Hello Jeremy, Monday, May 22, 2006, 12:20:54 AM, you wrote:
For my own needs, I cabalized and debianized the Streams library. It generates binary debs for ghc6 and hugs -- but I think the hugs version is broken. In any case, it is a start, you can download the packaging at:
can i include your work in the library itself? is it better to include the 'debian' directory in my archive, or leave this to the debian packagers? can you say how you use my library? it's both interesting for me and can be helpful in deciding how it should be developed further.

--
Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

At Thu, 25 May 2006 13:42:11 +0400, Bulat Ziganshin wrote:
Hello Jeremy,
Monday, May 22, 2006, 12:20:54 AM, you wrote:
For my own needs, I cabalized and debianized the Streams library. It generates binary debs for ghc6 and hugs -- but I think the hugs version is broken. In any case, it is a start, you can download the packaging at:
can i include your work in the library itself?
Absolutely.
is it better to include the 'debian' directory in my archive, or leave this to the debian packagers?
If someone volunteers to maintain the package -- then it is probably better not to keep a copy of the debian directory in your archive -- because it will often be out of date and confuse people -- and debian users will be able to get the debianized source easily by typing 'apt-get source haskell-streams'. On the other hand -- if there is no one officially maintaining it -- it would be useful to provide the debian directory (with a disclaimer) so that debian users can easily build and install the .deb, since subverting the debian package system tends to lead to long-term complications.
can you say how you use my library? it's both interesting for me and can be helpful in deciding how it should be developed further
I am using it to serialize/deserialize haskell data structures so I can store them in a Berkeley database. To get them into BDB I need to convert the haskell data structure into a C structure that looks like this:

struct __db_dbt {
        void     *data;   /* Key/data */
        u_int32_t size;   /* key/data length */
};

Currently I am doing it like this -- but this will clearly fail if the serialized data structure is longer than 512 bytes...

withDBT :: (Binary a) => a -> (Ptr DBT -> IO b) -> IO b
withDBT thedata f =
    allocaBytes #{size DBT} $ \dbtPtr ->
    allocaBytes 512 $ \dataPtr ->
        do h <- openMemBuf dataPtr 512
           withByteAlignedLE h $ flip put_ thedata
           wrote <- vTell h
           vClose h
           #{poke DBT, data} dbtPtr (castPtr dataPtr)
           #{poke DBT, size} dbtPtr ((fromIntegral wrote) :: Int)
           f dbtPtr

I don't really need the file-system interface for this project -- what would be nice is something like 'withCStringLen' and 'peekCString' for the encode/decode functions:

type PtrLen a = (Ptr a, Int)

encodePtrLen :: (Binary a) => a -> (PtrLen a -> IO b) -> IO b
decodePtr    :: (Binary a) => Ptr a -> IO a

I could simulate this by using 'encode' to convert the data structure to a String and then use 'withCStringLen' to get the pointer and length -- but having the intermediate String seems like it could be a big performance hit.

Two alternative ideas are:

(1) accurately pre-calculate the size of the serialized structure and allocate the correct amount of memory from the start

(2) start with a 'guess' and realloc the memory if the initial guess is too small.

Both of those alternatives have their own problems -- so I think only testing will tell what works best... I have not looked at the library exhaustively, so if there is already a good way to do this, let me know.

Thanks!
j.
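The String-based simulation Jeremy mentions is only a couple of lines. A sketch, assuming the Streams library's Binary class and a pure encode :: Binary a => a -> String (Bulat refers to such String-producing encode functions later in the thread):

import Foreign.C.String (CStringLen, withCStringLen)

-- simulate encodePtrLen via an intermediate String: encode the value,
-- then borrow a (Ptr CChar, Int) view of the result. The intermediate
-- String is exactly the performance concern raised above.
encodePtrLenViaString :: Binary a => a -> (CStringLen -> IO b) -> IO b
encodePtrLenViaString x f = withCStringLen (encode x) f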

Hello Jeremy, Sunday, May 28, 2006, 1:29:02 AM, you wrote:

can i include your work in the library itself?

Absolutely.

thanks
is it better to include the 'debian' directory in my archive, or leave this to the debian packagers?
If someone volunteers to maintain the package -- then it is probably
afaiu, you are talking about someone maintaining the debian packaging? it seems that i should include these files now and omit them once someone starts to maintain the package?
would be useful to provide the debian directory (with a disclaimer) so
what disclaimer?
To get them into BDB I need to convert the haskell data structure into a C structure that looks like this:
struct __db_dbt {
        void     *data;   /* Key/data */
        u_int32_t size;   /* key/data length */
};
Currently I am doing it like this -- but this will clearly fail if the serialized data structure is longer than 512 bytes...
withDBT :: (Binary a) => a -> (Ptr DBT -> IO b) -> IO b
withDBT thedata f =
    allocaBytes #{size DBT} $ \dbtPtr ->
    allocaBytes 512 $ \dataPtr ->
        do h <- openMemBuf dataPtr 512
           withByteAlignedLE h $ flip put_ thedata
           wrote <- vTell h
           vClose h
           #{poke DBT, data} dbtPtr (castPtr dataPtr)
           #{poke DBT, size} dbtPtr ((fromIntegral wrote) :: Int)
           f dbtPtr
i would prefer to split it into two parts. and the BDB-interfacing part can also be implemented using binary i/o:

withDBT00 :: (Ptr DBT -> IO b) -> Ptr a -> Int -> IO b
withDBT00 f buf size = do
    h <- createMemBuf 20 >>= openByteAlignedLE
    put_ h buf
    putWord32 h size
    vRewind h
    (dbt,_) <- vReceiveBuf h
    result <- f dbt
    vClose h
    return result

withDBT f thedata = encodeMemBufLE (withDBT00 f) thedata
I don't really need the file-system interface for this project -- what would be nice is something like 'withCStringLen' and 'peekCString' for the encode/decode functions:
type PtrLen a = (Ptr a, Int)

encodePtrLen :: (Binary a) => a -> (PtrLen a -> IO b) -> IO b
decodePtr    :: (Binary a) => Ptr a -> IO a
encodeMemBufLE f thedata = do
    h <- createMemBuf 512 >>= openByteAlignedLE
    put_ h thedata
    vRewind h
    (buf,size) <- vReceiveBuf h
    result <- f buf size
    vClose h
    return result

decodeMemBufLE buf size = do
    h <- openMemBuf buf size >>= openByteAlignedLE
    result <- get h
    vClose h
    return result

but this will work only with Streams 0.1. you have spotted the problem: there is no official way to get access to the whole buffer contents if the buffer was created with createMemBuf.
I could simulate this by using 'encode' to convert the data structure to a String and then use 'withCStringLen' to get the pointer and length -- but having the intermediate String seems like it could be a big performance hit.
Strings are slow in themselves, and moreover 'encode' has O(n^2) complexity
Two alternative ideas are:
(1) accurately pre-calculate the size of the serialized structure and allocate the correct amount of memory from the start
it's a good idea to have a 'binarySize :: Binary a => a -> Int' function, although using it will halve the speed (the data gets traversed twice, once to measure and once to write), so for you it's not the best solution
(2) start with a 'guess' and realloc the memory if the initial guess is too small.
createMemBuf does exactly this :) that's why the whole Streams part exists. actually i started with the trivial

instance ByteStream Handle where
    vPutByte h n = hPutChar h (chr (fromEnum n))
    vGetByte h   = do c <- hGetChar h
                      return $! (toEnum (ord c))

and only after the Binary part was mature enough did i go on to add all those fancy Stream features.
I have not looked at the library exhaustively, so if there is already a good way to do this, let me know.
a way exists, but it's not guaranteed by the library interface and relies on my knowledge of the library internals. i will add an interface which guarantees access to the full buffer contents. after that, 'encodeMemBufLE' can be written using official library capabilities. i will also add encodeMemBuf*/decodeMemBuf* to the lib, although i'm not sure that these functions are universal enough.

--
Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
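To illustrate the binarySize idea from above: a size-only pass is a second traversal that mirrors the encoder's layout, which is where the halved speed comes from. A hypothetical, self-contained sketch (not part of the library; the byte counts are made up):

class BinarySize a where
    binarySize :: a -> Int

-- one byte per Char, say
instance BinarySize Char where
    binarySize _ = 1

-- a 4-byte length prefix, then the elements
instance BinarySize a => BinarySize [a] where
    binarySize xs = 4 + sum (map binarySize xs)

instance (BinarySize a, BinarySize b) => BinarySize (a, b) where
    binarySize (x, y) = binarySize x + binarySize y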

On Sun, 2006-05-28 at 14:44 +0400, Bulat Ziganshin wrote:
Hello Jeremy,
Sunday, May 28, 2006, 1:29:02 AM, you wrote:
Two alternative ideas are:
(1) accurately pre-calculate the size of the serialized structure and allocate the correct amount of memory from the start
it's a good idea to have a 'binarySize :: Binary a => a -> Int' function, although using it will halve the speed, so for you it's not the best solution
(2) start with a 'guess' and realloc the memory if the initial guess is too small.
createMemBuf does exactly this :)
One of the areas where we found that Data.ByteString.Lazy was performing better than the ordinary Data.ByteString is cases like this where we do not know beforehand how big the buffer will be. If you have to use a single contiguous buffer then it involves guessing and possible reallocation. With a 'chunked' representation like ByteString.Lazy it's not a problem, as we just allocate another chunk and start to fill that. Obvious examples include concat and getContents.

Would the same make sense for a MemBuf stream? Why does it need to be a single large buffer? Couldn't it be a list of buffers?

Duncan
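The shape Duncan describes can be pictured with a few lines of Haskell (a sketch; Data.ByteString.Lazy's real representation differs in details such as chunk strictness):

import qualified Data.ByteString as B

-- a chunked buffer: a list of strict chunks. "growing" it means adding
-- another chunk; earlier data is never copied or reallocated
data Chunked = Empty
             | Chunk B.ByteString Chunked

-- flattening to one contiguous buffer remains possible when needed
toContiguous :: Chunked -> B.ByteString
toContiguous = B.concat . chunks
  where chunks Empty          = []
        chunks (Chunk c rest) = c : chunks rest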

Hello Duncan, Sunday, May 28, 2006, 3:05:53 PM, you wrote:
createMemBuf does exactly this :)
One of the areas where we found that Data.ByteString.Lazy was performing better than the ordinary Data.ByteString is cases like this where we do not know beforehand how big the buffer will be.
i like your idea of using ByteString.Lazy to implement fast and easy-to-use i/o, although i don't think that its speed will be within 10% of C :) ghc by itself generates code that is several times slower than gcc-generated code, and you can't do anything against this except implement everything in C. i've written about this on the ghc-users list in February: for example, a simple "a[i]=b[i]+c[i]" floating-point loop runs 20 times slower.

but, nevertheless, i think that this is a great idea - much faster than String-based hGetContents. it should help in numerous programs that need fast-and-dirty text processing, although it needs further development of the library in order to implement the full String-like interface for LazyByteString.
If you have to use a single contiguous buffer then it involves guessing and possible reallocation. With a 'chunked' representation like ByteString.Lazy it's not a problem as we just allocate another chunk and start to fill that.
Obvious examples include concat and getContents.
Would the same make sense for a MemBuf stream? Why does it need to be a single large buffer? Couldn't it be a list of buffers?
i also had this idea, and it can be implemented in 1 day, i think (when someone needs it). but this is not for Jeremy - he needs a contiguous buffer for interfacing with BDB. btw, it's better to use a UArray instead of a list.

--
Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
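For concreteness, the floating-point loop Bulat quotes ("a[i]=b[i]+c[i]") looks like this on the Haskell side; an illustrative version using unboxed IO arrays (the 20x figure is his measurement, not reproduced here):

import Data.Array.IO (IOUArray, getBounds, readArray, writeArray)

-- a[i] = b[i] + c[i] over unboxed Double arrays
addInto :: IOUArray Int Double -> IOUArray Int Double
        -> IOUArray Int Double -> IO ()
addInto a b c = do
    (lo, hi) <- getBounds a
    let loop i | i > hi    = return ()
               | otherwise = do x <- readArray b i
                                y <- readArray c i
                                writeArray a i (x + y)
                                loop (i + 1)
    loop lo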

On Sun, 2006-05-28 at 20:40 +0400, Bulat Ziganshin wrote:
Hello Duncan,
Sunday, May 28, 2006, 3:05:53 PM, you wrote:
createMemBuf does exactly this :)
One of the areas where we found that Data.ByteString.Lazy was performing better than the ordinary Data.ByteString is cases like this where we do not know beforehand how big the buffer will be.
i like your idea of using ByteString.Lazy to implement fast and easy-to-use i/o, although i don't think that its speed will be within 10% of C :)
Actually Donald recently posted a benchmark (to the libraries mailing list) of ByteString.Lazy where we were getting within 6% of C. That was on a 10GB file.
ghc by itself generates code that is several times slower than gcc-generated code, and you can't do anything against this except implement everything in C.
ByteString does use C code in places, and ByteString.Lazy inherits the benefits of that. Both modules also use array fusion to combine pipelines of loops into a single loop, which has big performance benefits and is not something you can easily do in C. Using fusion also means one doesn't have to allocate so many buffers, and some transformations can work in-place on intermediate buffers. You might be able to use similar fusion techniques for layering Streams.
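For example, a pipeline like the following is the kind of thing fusion can collapse into a single loop; whether a particular pair of operations fuses depends on the rewrite rules the library ships:

import Data.Char (toUpper)
import qualified Data.ByteString.Char8 as B

-- with fusion, the filter and the map can run as one pass over the
-- input, without materializing the intermediate ByteString
shout :: B.ByteString -> B.ByteString
shout = B.map toUpper . B.filter (/= ' ')

main :: IO ()
main = B.getContents >>= B.putStr . shout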
but, nevertheless, i think that this is a great idea - much faster than String-based hGetContents. it should help in numerous programs that need fast-and-dirty text processing, although it needs further development of the library in order to implement the full String-like interface for LazyByteString.
Data.ByteString.Lazy implements more or less the same interface as Data.ByteString which in turn implements almost the same interface as Data.List. We're still working on improving the API.
If you have to use a single contiguous buffer then it involves guessing and possible reallocation. With a 'chunked' representation like ByteString.Lazy it's not a problem as we just allocate another chunk and start to fill that.
Obvious examples include concat and getContents.
Would the same make sense for a MemBuf stream? Why does it need to be a single large buffer? Couldn't it be a list of buffers?
i also had this idea, and it can be implemented in 1 day, i think (when someone needs it). but this is not for Jeremy - he needs a contiguous buffer for interfacing with BDB.
The approach we're taking for Data.ByteString.Lazy is that when a contiguous buffer is needed (eg for passing to foreign code) that we convert it to an ordinary strict Data.ByteString.
btw, it's better to use a UArray instead of a list
Not if you want to generate or consume the stream lazily. Duncan
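In today's bytestring API, the conversion Duncan mentions looks like this (the function names are from the current library, not necessarily the May 2006 one):

import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL
import Foreign.C.String (CStringLen)

-- flatten the lazy chunks into one strict, contiguous ByteString, then
-- hand foreign code a pointer/length view of it
withContiguous :: BL.ByteString -> (CStringLen -> IO a) -> IO a
withContiguous lbs = B.useAsCStringLen (B.concat (BL.toChunks lbs))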

Hello Bulat, Sunday, May 28, 2006, 2:44:37 PM, you wrote:
type PtrLen a = (Ptr a, Int)

encodePtrLen :: (Binary a) => a -> (PtrLen a -> IO b) -> IO b
decodePtr    :: (Binary a) => Ptr a -> IO a
Finally i've implemented the following (you would then use 'withForeignPtr' to work with the contents of the ForeignPtr):

-- -----------------------------------------------------------------------------
-- Encode/decode contents of memory buffer

encodePtr             :: (Binary a, Integral size) => a -> IO (ForeignPtr x, size)
encodePtrLE           :: (Binary a, Integral size) => a -> IO (ForeignPtr x, size)
encodePtrBitAligned   :: (Binary a, Integral size) => a -> IO (ForeignPtr x, size)
encodePtrBitAlignedLE :: (Binary a, Integral size) => a -> IO (ForeignPtr x, size)

encodePtr             = encodePtr' openByteAligned
encodePtrLE           = encodePtr' openByteAlignedLE
encodePtrBitAligned   = encodePtr' openBitAligned
encodePtrBitAlignedLE = encodePtr' openBitAlignedLE

decodePtr             :: (Binary a, Integral size) => Ptr x -> size -> IO a
decodePtrLE           :: (Binary a, Integral size) => Ptr x -> size -> IO a
decodePtrBitAligned   :: (Binary a, Integral size) => Ptr x -> size -> IO a
decodePtrBitAlignedLE :: (Binary a, Integral size) => Ptr x -> size -> IO a

decodePtr             = decodePtr' openByteAligned
decodePtrLE           = decodePtr' openByteAlignedLE
decodePtrBitAligned   = decodePtr' openBitAligned
decodePtrBitAlignedLE = decodePtr' openBitAlignedLE

-- Universal function that encodes data with any alignment
encodePtr' open thedata = do
    h <- createMemBuf 512 >>= open
    put_ h thedata
    vFlush h
    vRewind h
    (buf,size) <- vReceiveBuf h READING               -- FIXME: MemBuf-implementation specific
    fptr <- newForeignPtr finalizerFree (castPtr buf) -- FIXME: also MemBuf-implementation specific
    return (fptr,size)

-- Universal function that decodes data written with any alignment
decodePtr' open ptr size = do
    h <- openMemBuf ptr size >>= open
    result <- get h
    vClose h
    return result

--
Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
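A round-trip through these functions would look something like this (a sketch; it assumes the library provides Binary instances for Int and lists):

import Data.Word (Word8)
import Foreign.ForeignPtr (ForeignPtr, withForeignPtr)

-- encode a value into a fresh buffer, then decode it back out, using
-- withForeignPtr as Bulat suggests
roundTrip :: IO ()
roundTrip = do
    (fptr, size) <- encodePtr [1, 2, 3 :: Int] :: IO (ForeignPtr Word8, Int)
    v <- withForeignPtr fptr $ \ptr -> decodePtr ptr size
    print (v :: [Int])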

[See comments at bottom] Bulat Ziganshin wrote:
Finally i've implemented the following (you would then use 'withForeignPtr' to work with the contents of the ForeignPtr):

-- -----------------------------------------------------------------------------
-- Encode/decode contents of memory buffer

encodePtr             :: (Binary a, Integral size) => a -> IO (ForeignPtr x, size)
encodePtrLE           :: (Binary a, Integral size) => a -> IO (ForeignPtr x, size)
encodePtrBitAligned   :: (Binary a, Integral size) => a -> IO (ForeignPtr x, size)
encodePtrBitAlignedLE :: (Binary a, Integral size) => a -> IO (ForeignPtr x, size)

encodePtr             = encodePtr' openByteAligned
encodePtrLE           = encodePtr' openByteAlignedLE
encodePtrBitAligned   = encodePtr' openBitAligned
encodePtrBitAlignedLE = encodePtr' openBitAlignedLE

decodePtr             :: (Binary a, Integral size) => Ptr x -> size -> IO a
decodePtrLE           :: (Binary a, Integral size) => Ptr x -> size -> IO a
decodePtrBitAligned   :: (Binary a, Integral size) => Ptr x -> size -> IO a
decodePtrBitAlignedLE :: (Binary a, Integral size) => Ptr x -> size -> IO a

decodePtr             = decodePtr' openByteAligned
decodePtrLE           = decodePtr' openByteAlignedLE
decodePtrBitAligned   = decodePtr' openBitAligned
decodePtrBitAlignedLE = decodePtr' openBitAlignedLE
Am I the only one who finds this encoding-of-types in the _name_ of a function quite distasteful? There is no type safety being enforced here, no ensuring one will not be encoding a Ptr one way and decoding it another. Why not use Haskell's type system to help you there? One could imagine putting encodePtr' and decodePtr' in a type class, for example? Or many other solutions.

This is meant as a general critique of the habit of encoding types into function names, not of the particular instance above. My interest in starting this thread is to discuss the solutions that work, and the situations where no solution currently seems to exist. I believe there may be instances of encoding-types-in-names that are currently necessary in Haskell because the type system is not powerful enough to do anything else. Using Typeable and a type-witness just moves the problem, it does not ``solve'' it.

Jacques

Hello Jacques, Wednesday, May 31, 2006, 5:33:39 PM, you wrote:
decodePtrBitAlignedLE :: (Binary a, Integral size) => Ptr x -> size -> IO a
Am I the only one who finds this encoding-of-types in the _name_ of a function quite distasteful? There is no type safety being enforced here,

can you please write the code you suggested? i'm not sure that the type "a" should be encoded only to the area pointed to by "Ptr a" - the binary encoding of a value and its memory representation are different concepts, although they look similar.

--
Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

You would need to define a type class

class (Binary a) => EncodedPtr a b

where the 'a' is as you have it currently, and the 'b' would be an enumerated type which tracks the memory representation. I agree they are different concepts - that is why an EncodedPtr would require 2 type parameters. Of course, this class would define encode/decode functions, but without the need for the name encoding (and with additional safety).

Jacques

Bulat Ziganshin wrote:
Hello Jacques,
Wednesday, May 31, 2006, 5:33:39 PM, you wrote:
decodePtrBitAlignedLE :: (Binary a, Integral size) => Ptr x -> size -> IO a
Am I the only one who finds this encoding-of-types in the _name_ of a function quite distasteful? There is no type safety being enforced here,

can you please write the code you suggested? i'm not sure that the type "a" should be encoded only to the area pointed to by "Ptr a" - the binary encoding of a value and its memory representation are different concepts, although they look similar.
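A self-contained sketch of the design Jacques is proposing, using phantom types. All names and the toy 16-bit encoding here are hypothetical; the real library would carry a ForeignPtr and its Binary class instead:

import Data.Bits (shiftR, (.&.))
import Data.Word (Word8)

-- tags for the memory representation; they carry no runtime data
data LE = LE
data BE = BE

-- the representation rides along as a phantom type parameter
newtype Encoded order = Encoded (Word8, Word8)

class ByteOrder order where
    encode16 :: Int -> Encoded order
    decode16 :: Encoded order -> Int

instance ByteOrder LE where
    encode16 n = Encoded (fromIntegral (n .&. 0xff), fromIntegral (n `shiftR` 8))
    decode16 (Encoded (lo, hi)) = fromIntegral lo + 256 * fromIntegral hi

instance ByteOrder BE where
    encode16 n = Encoded (fromIntegral (n `shiftR` 8), fromIntegral (n .&. 0xff))
    decode16 (Encoded (hi, lo)) = 256 * fromIntegral hi + fromIntegral lo

-- decode16 (encode16 513 :: Encoded LE) is 513 again; decoding that
-- same value with the BE instance simply does not type-check

The encoding now lives in the type, so the encode/decode pair is kept consistent by the type checker rather than by a naming convention.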

Hello Jacques, Wednesday, May 31, 2006, 5:33:39 PM, you wrote:
encodePtr             :: (Binary a, Integral size) => a -> IO (ForeignPtr x, size)
encodePtrLE           :: (Binary a, Integral size) => a -> IO (ForeignPtr x, size)
encodePtrBitAligned   :: (Binary a, Integral size) => a -> IO (ForeignPtr x, size)
encodePtrBitAlignedLE :: (Binary a, Integral size) => a -> IO (ForeignPtr x, size)
Am I the only one who finds this encoding-of-types in the _name_ of a function quite distasteful? There is no type safety being enforced here, no ensuring one will not be encoding a Ptr one way and decoding it another. Why not use Haskell's type system to help you there?
i misunderstood you when i wrote my previous message. now i can say: you are right. but in practice this means more typing and coercions, especially when we get to ForeignPtrs. moreover, in most cases, imho, data encoded by 'encodePtr*' will go to FFI libraries, so we can't use typechecking anyway.

i'm not against your idea - you are absolutely right that this would be more the Haskell way - but can it be implemented without additional complications for library users?

--
Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Bulat Ziganshin wrote:
i'm not against your idea - you are absolutely right that this would be more the Haskell way - but can it be implemented without additional complications for library users?
C is a language which pushes the boundaries of "no complications" (ie convenience) quite far (and yet claims to have types). The beauty of Haskell is that you are forced to think before you lay down some code, to make sure what you write really is meaningful. A Haskell API for a library /can/ likewise force its users to think about what they really need to do before they lay down some code. Yes, that makes the use more complicated. Convenience is a short-term gain, and going from 'convenient' languages (think Perl and Python here) to Haskell is quite the shock! But think of the long-term gains of doing it correctly / the Haskell way.

I am completely biased in this regard: I have spent several years maintaining ~800Kloc of legacy dynamically typed [commercial] code. A lot of this code has a life-span of roughly 20 years [ie once written, it takes an average of 20 years before it gets re-written]. That experience has converted me to a static-type fan, as well as a fan of designs that are for the "long term"; short-term convenience is something that is great for short-lived code (< 5 years is short-lived to me ;-) ).

I think the choice really boils down to the expected life-span of your library, as well as the expected size of the user base.

Jacques (stepping off my soap-box now...)

Hello Jacques, Wednesday, May 31, 2006, 8:07:29 PM, you wrote:
I am completely biased in this regard: I have spent several years maintaining ~800Kloc of legacy dynamically typed [commercial] code. A lot of this code has a life-span of roughly 20 years [ie once written, it takes an average of 20 years before it gets re-written]. That experience has converted me to a static-type fan, as well as a fan of designs that are for the "long term"; short-term convenience is something that is great for short-lived code (< 5 years is short-lived to me ;-) ).
my own programming experience says the same - strong typing significantly simplifies program writing by ensuring its correctness, and Haskell catches many problems as soon as i compile the code. but in this case we would add more complexity to the standard use of the functions (when just a Ptr is required) without any improvement in reliability, just to catch potential problems with unusual usage. moreover, there are also encode/encodeLE/... functions that produce a String - they don't need any special String types either.

why do i include the encoding type in the function name? just to simplify usage: all the 'encodePtr*' functions can be expressed via one encodePtrWith, but i don't think that many people would want to write this themselves:

encodePtr             = encodePtrWith put_
encodePtrLE           = encodePtrLEWith put_
encodePtrBitAligned   = encodePtrBitAlignedWith put_
encodePtrBitAlignedLE = encodePtrBitAlignedLEWith put_

encodePtrLEWith           write = encodePtrWith (\s a -> withByteAlignedLE s (`write` a))
encodePtrBitAlignedWith   write = encodePtrWith (\s a -> withBitAligned   s (`write` a))
encodePtrBitAlignedLEWith write = encodePtrWith (\s a -> withBitAlignedLE s (`write` a))

--
Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Am I the only one who finds this encoding-of-types in the _name_ of a function quite distasteful? There is no type safety being enforced here, no ensuring one will not be encoding a Ptr one way and decoding it another. Why not use Haskell's type system to help you there?
When marshalling data you often don't want any type safety. You often want to explicitly linearize data from one type and then unlinearize it into another type. The net result is that of casting. In fact, you can write a marshalling library with an interface based entirely on this concept: http://www.lava.net/~newsham/x/Pkts5.lhs The interface is, in essence, a glorified casting mechanism. To marshall data you convert it to an array of bytes, and to unmarshall data you unconvert it.
Jacques
Tim Newsham http://www.lava.net/~newsham/
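Tim's "glorified cast" can be made concrete with nothing but the FFI marshalling primitives. A sketch (his library works over explicit byte arrays rather than a raw pointer like this):

import Data.Int (Int32)
import Data.Word (Word32)
import Foreign (Storable, alloca, castPtr, peek, poke)

-- linearize one type's bytes and unlinearize them as another type;
-- the caller must ensure both representations have the same size
castViaBytes :: (Storable a, Storable b) => a -> IO b
castViaBytes x = alloca $ \p -> do
    poke p x
    peek (castPtr p)

main :: IO ()
main = do
    w <- castViaBytes (-1 :: Int32)
    print (w :: Word32)   -- prints 4294967295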

I have no problems with marshalling/unmarshalling (and even with the implicit casting going on). What I dislike is having a bunch of functions which are "the same" but with different names, where the difference boils down to enumerated types that end up being encoded in the function name. Regardless of type safety. Jacques Tim Newsham wrote:
Am I the only one who finds this encoding-of-types in the _name_ of a function quite distasteful? There is no type safety being enforced here, no ensuring one will not be encoding a Ptr one way and decoding it another. Why not use Haskell's type system to help you there?

When marshalling data you often don't want any type safety. You often want to explicitly linearize data from one type and then unlinearize it into another type. The net result is that of casting. In fact, you can write a marshalling library with an interface based entirely on this concept:

http://www.lava.net/~newsham/x/Pkts5.lhs

The interface is, in essence, a glorified casting mechanism. To marshall data you convert it to an array of bytes, and to unmarshall data you unconvert it.
Jacques
Tim Newsham http://www.lava.net/~newsham/

chad.scherrer:
On 5/20/06, Donald Bruce Stewart <dons@cse.unsw.edu.au> wrote:

Data.ByteString is in the base libraries now. For a bit of the flavour, see: http://haskell.org/haskellwiki/Wc

In this message http://article.gmane.org/gmane.comp.lang.haskell.general/13625 Bulat says:

i foresee that Streams + Fast Packed Strings together will yield a breakthrough in GHC I/O speed, and this can be implemented even without waiting for GHC 6.6
Before reading this I had thought it might be an XOR situation, but now it seems a happy coexistence may be possible. Are there any preliminary results on how these might work together, and the potential speedups?
I imagine that Bulat could easily write some IO operations that construct ByteStrings as results. So that should be very possible. -- Don

Hello Donald, Monday, May 22, 2006, 4:19:59 AM, you wrote:
i foresee that Streams + Fast Packed Strings together will yield a breakthrough in GHC I/O speed, and this can be implemented even without waiting for GHC 6.6
Before reading this I had thought it might be an XOR situation, but now it seems a happy coexistence may be possible. Are there any preliminary results on how these might work together, and the potential speedups?
I imagine that Bulat could easily write some IO operations that construct ByteStrings as results. So that should be very possible.
yes, it is no problem. i tried this with other packed string libraries, but in those attempts the results were bad because of inefficiencies in my library's organization. now i've changed these internals according to an idea of Simon's, and i think that ByteString I/O can be done at the same speed as low-level character i/o, i.e. about 100-200 mb/sec on modern cpus (faster than disk I/O itself :) )

--
Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
participants (8)

- Bulat Ziganshin
- Chad Scherrer
- dons@cse.unsw.edu.au
- Duncan Coutts
- Jacques Carette
- Jeremy Shaw
- Ross Paterson
- Tim Newsham