
On 9/8/10 10:18 AM, Ian Lynagh wrote:
On Tue, Sep 07, 2010 at 11:21:19PM +0100, Duncan Coutts wrote:
Many of these are deliberate and sensible.
Some at least seem just gratuitously different, e.g.:
    BS:
        break :: (Char -> Bool) -> ByteString -> (ByteString, ByteString)
        breakSubstring :: ByteString -> ByteString -> (ByteString, ByteString)
    Text:
        break :: Text -> Text -> (Text, Text)
        breakBy :: (Char -> Bool) -> Text -> (Text, Text)
One consistency problem I see with this is that the ByteString versions permit breaking on a disjunctive pattern (e.g., \c -> c=='a' || c=='q'), whereas the Text version would require multiple passes to perform these queries, since it takes a Text instead of a (Char -> Bool). Since proper usage of Text.break requires being able to do various normalizations on the query, it's unclear whether this inconsistency can be remedied effectively. If it cannot, then the names of the functions should be adjusted to make this difference clear.

Other than that, I do agree with the philosophy of the "deliberate and sensible" differences. Though, given the philosophy that these aren't Char-wise operations, why does Text.breakBy accept a (Char -> Bool)? Is this just an optimization for common cases like breaking on Unicode-defined whitespace code points?

--
Live well,
~wren
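
To make the single-pass versus multiple-pass point concrete, here is a minimal sketch. It assumes a current release of the text package, where the predicate version is named `break` and the substring version is named `breakOn` (the `breakBy` and `break` of the signatures quoted above, respectively). `breakOnAny` is a hypothetical helper written for illustration, not part of any library, and the sketch does no Unicode normalization of the needles.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import           Data.List (minimumBy)
import           Data.Ord  (comparing)
import qualified Data.Text as T

-- Predicate style: a disjunction is expressed directly and
-- the input is scanned once.
breakOnAOrQ :: T.Text -> (T.Text, T.Text)
breakOnAOrQ = T.break (\c -> c == 'a' || c == 'q')

-- Substring style: one search per alternative, then keep the split
-- with the shortest prefix (the earliest match).
-- Hypothetical helper; empty needles are skipped because T.breakOn
-- rejects an empty pattern.
breakOnAny :: [T.Text] -> T.Text -> (T.Text, T.Text)
breakOnAny needles hay =
  case [ T.breakOn n hay | n <- needles, not (T.null n) ] of
    []         -> (hay, T.empty)
    candidates -> minimumBy (comparing (T.length . fst)) candidates

main :: IO ()
main = do
  print (breakOnAOrQ "xyzqabc")            -- ("xyz","qabc")
  print (breakOnAny ["a", "q"] "xyzqabc")  -- ("xyz","qabc")
```

Both calls give the same split here, but the second one walks the input once per needle, which is the extra cost the disjunctive case incurs when the API only accepts a Text pattern.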