
On Fri, 2007-04-06 at 00:26 +0900, Robert Marlow wrote:
> Hi Bulat
>
> On Thu, 2007-04-05 at 15:08 +0400, Bulat Ziganshin wrote:
> > But why do you provide a ByteString-only API? I think the more common
> > idiom is to provide String functions here and use something like
> > Network.ByteString and Network.ByteString.Lazy modules to provide the
> > strict and lazy ByteString equivalents of the String functions from
> > Network.hs.
>
> Mostly because I wanted ByteStrings so that's what I implemented :)
>
> Good point though. I've uploaded a replacement patch changing the
> Network functions to use String and adding Network.ByteString and
> Network.ByteString.Lazy. Thanks for the suggestion.

I'm not sure this really makes sense.

In most situations there is an obvious candidate amongst String, strict ByteString and lazy ByteString. In this case, for datagram communication, the obvious choice is indeed strict ByteString.

Correct me if I'm wrong, but datagrams are relatively small contiguous chunks and they arrive in our memory space all in one go. So they are not at all like a continuous stream of data, which is what a lazy ByteString models. There would never be any advantage to using a lazy ByteString in this case; it would always have just one chunk.

Similarly, for String, one has to go via a strict contiguous chunk representation in the first place, so any String interface would be a trivial wrapper on a ByteString representation.

Remember that the types are trivially inter-convertible with a single function call[1]. I'm not sure that we need two whole extra modules to replace a single pack/unpack call in a calling module.

It's exactly this kind of thing that makes me worry about people creating a Stringlike class. By passing the operations in via a class, rather than converting representations on the boundary, we are in danger of losing all the performance benefits we were after in the first place.

I'm sure it makes more sense to provide a class that gives us a string equivalent of fromIntegral. That way, operations that want to provide an API that works on any string can choose the best internal representation and just use the conversion on the boundary. Then we only need to inline the conversion into the calling program to make it fast. As with fromIntegral, that conversion can often be optimised or turned into a no-op.

For performance, class dictionary use should be kept as near to the 'surface' as possible. For example, consider this standard List module function:

    elemIndex :: Eq a => a -> [a] -> Maybe Int
    elemIndex x = findIndex (x==)

This is not a naive definition. It is very cunning.

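To make the comparison concrete, here is a minimal sketch of the two styles. The names elemIndexSurface and elemIndexFull are made up for illustration so they do not clash with Data.List; the second one is only a guess at what a written-out version might look like, not code from any library.

```haskell
import Data.List (findIndex)

-- Same shape as the definition quoted above: the Eq constraint is used
-- only to build the predicate (x ==); the traversal lives in findIndex,
-- which knows nothing about Eq.
elemIndexSurface :: Eq a => a -> [a] -> Maybe Int
elemIndexSurface x = findIndex (x ==)

-- Hypothetical written-out version: the loop is part of the function and
-- (==) is called inside it, so the Eq instance is needed throughout the
-- whole body rather than just at the surface.
elemIndexFull :: Eq a => a -> [a] -> Maybe Int
elemIndexFull x = go 0
  where
    go _ []     = Nothing
    go n (y:ys)
      | x == y    = Just n
      | otherwise = go (n + 1) ys
```

Both functions return the same results; the difference is only in how much code has to be inlined before the compiler can use a known Eq instance.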
If we wrote a full version of elemIndex in the style of findIndex, but using == at the appropriate point, then to optimise uses of elemIndex where we know the particular Eq class instance we'd have to inline the whole of elemIndex. This isn't a tiny amount of code and GHC is normally disinclined to do that. So we'd end up passing an Eq dictionary. Disaster!

Instead, with the above definition we've lifted the use of the class right to the surface. Now elemIndex looks tiny and GHC will inline it in the calling context, where we know the Eq instance. So we just build a little specialised (x==) function and make a call to the findIndex function. We get minimal code duplication and pretty fast results.

And all this happens without having to bludgeon the compiler with INLINE or SPECIALISE pragmas. In other words, it works just fine on ordinary user code.

OK, enough ranting.

Duncan

[1] Well, two to get between strict and lazy ByteStrings, but that's kind of deliberate, to encourage people to think twice about doing that.
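
A rough sketch of the "convert on the boundary" idea and of the conversions mentioned in [1], for illustration only. Packable, toWire, fromWire, convertString, wireLength and datagramLength are hypothetical names, not part of any existing library; the only real library calls are pack and unpack from Data.ByteString.Char8, concat and length from Data.ByteString, and fromChunks and toChunks from Data.ByteString.Lazy.

```haskell
{-# LANGUAGE FlexibleInstances #-}

import qualified Data.ByteString       as S
import qualified Data.ByteString.Char8 as C
import qualified Data.ByteString.Lazy  as L

-- Hypothetical class: conversion to and from the one internal
-- representation (strict ByteString), in the spirit of fromIntegral.
class Packable s where
  toWire   :: s -> S.ByteString
  fromWire :: S.ByteString -> s

-- Strict ByteString is already the internal form, so this is a no-op.
instance Packable S.ByteString where
  toWire   = id
  fromWire = id

-- Strict <-> lazy takes two calls each way, as noted in [1].
instance Packable L.ByteString where
  toWire     = S.concat . L.toChunks
  fromWire s = L.fromChunks [s]

-- String goes through Char8 pack/unpack, which keeps only the low
-- 8 bits of each Char.
instance Packable [Char] where
  toWire   = C.pack
  fromWire = C.unpack

-- The fromIntegral-like conversion between any two supported types.
convertString :: (Packable a, Packable b) => a -> b
convertString = fromWire . toWire

-- Stand-in for a datagram operation; it works on strict ByteString only.
wireLength :: S.ByteString -> Int
wireLength = S.length

-- Thin wrapper: the class is used only at the surface, and the real
-- work is done on the single internal representation.
datagramLength :: Packable s => s -> Int
datagramLength = wireLength . toWire
```

The wrapper is small enough that it can plausibly be inlined at a call site where the instance is known, which is the same trick as the elemIndex definition above.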