
On 2007.12.12 12:51:58 -0600, Tommy M McGuire
Gwern Branwen wrote:
Some of those really look like they could be simpler, like 'copy' - couldn't that simply be 'main = interact (id)'? Have you seen http://haskell.org/haskellwiki/Simple_Unix_tools? For example, 'charcount' could be a lot simpler - 'charcount = showln . length' would work, wouldn't it, for the core logic, and the whole thing might look like:
main = (putStr . showln . length) =<< getContents
Similarly wordcount could be a lot shorter, like 'wc_l = showln . length . lines' (showln is a convenience function: showln a = show a ++ "\n")
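For reference, the suggestions above can be sketched as pure String -> String functions handed to interact. This is only a sketch: the names copy, charcount, linecount and wordcount are chosen here for illustration (linecount plays the role of 'wc_l'), and only showln comes from the message itself. Since interact writes the resulting String directly, no print or putStr is needed.

```haskell
module Main where

-- Convenience helper from the message: show a value and end the line.
showln :: Show a => a -> String
showln a = show a ++ "\n"

-- copy: pass standard input through unchanged.
copy :: String -> String
copy = id

-- charcount: number of characters in the input.
charcount :: String -> String
charcount = showln . length

-- linecount: number of lines in the input (the 'wc_l' above).
linecount :: String -> String
linecount = showln . length . lines

-- wordcount: number of whitespace-separated words.
wordcount :: String -> String
wordcount = showln . length . words

-- Swap in any of the functions above to get the corresponding tool.
main :: IO ()
main = interact charcount
```

Keeping the core logic pure means each tool can be tried in GHCi without touching standard input, for example by evaluating linecount on a literal string.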
Yes, that's absolutely true, and I am adding a section showing implementations based on interact as soon as I send this message. The reason I didn't do so before is that I was trying to (to an extent) preserve the structure of the original implementations, which means using an imperative style.
Yes, I'm looking at it now. Pretty nice.
Strangely, I have considerably more confidence in the imperative-ish Haskell code than I do in the imperative Pascal code, in spite of the fact that they are essentially the same. Probably this is due to the referential transparency that monadic IO preserves and that does not even enter into the picture in traditional Pascal. For example, the pseudo-nroff implementation has a giant, horrible block of a record (containing the state taken directly from K&P) that is threaded through the program, but I am tolerably happy with it because I know that is the *only* state going through the program.
Further, while interact could probably handle all of the filter-style programs (and if I understand correctly, could also work for the main loop of the interactive editor)
If your editor is referentially transparent, I think. Something like ed or sed could be done, as long as you didn't implement any of the IO stuff (like ! for ed).
and a similar function could handle the later file-reading programs, I do not see how to generalize that to the out-of-core sort program.
Well, for out-of-core sort, someone several months back posted a very neat solution using ByteStrings which, iirc, had performance competitive with GNU sort's out-of-core sort.... [much searching later] Ah! Here we go: "[Haskell-cafe] External Sort and unsafeInterleaveIO" http://www.haskell.org/pipermail/haskell-cafe/2007-July/029156.html. I at least found it interesting.
(Plus, interact is scary. :-D )
It's not scary! It's neat!
I... I want to provide a one-liner for 'detab', but it looks impressively monstrous and I'm not sure I understand it.
If you think that's bad.... :-)
detab is one of the programs I do not like. I kept the "direct translation" approach up through that, but I think it really hides the simplicity there; detab copies its input to its output replacing tabs with 1-8 spaces, based on where the tab occurs in a line. The only interesting state dealt with is the count of characters on each line, but that gets hidden, not emphasized.
On the other hand, I'm not looking for one-liners; I really want clarity as opposed to cleverness.
Well, one-liners generally can be expanded to 2 or 3 lines if you want to add descriptive variable names. Better to start with a short version and expand it where unclear than have a long unclear one in the first place, right?
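As an illustration of the detab description above, here is a minimal sketch in which the per-line column count is the only state and is passed explicitly. It is neither the one-liner nor the direct translation discussed in the thread; the names tabWidth, detabLine and detab, and the fixed tab width of 8, are assumptions made for the example.

```haskell
module Main where

-- Assumed tab width; tab stops fall at columns 0, 8, 16, ...
tabWidth :: Int
tabWidth = 8

-- Expand the tabs in a single line, carrying the current column along.
detabLine :: String -> String
detabLine = go 0
  where
    go _ [] = []
    go col ('\t' : cs) =
      -- Distance to the next tab stop: always between 1 and tabWidth.
      let n = tabWidth - col `mod` tabWidth
      in replicate n ' ' ++ go (col + n) cs
    go col (c : cs) = c : go (col + 1) cs

-- The column count resets on every line, so lines are handled independently.
detab :: String -> String
detab = unlines . map detabLine . lines

main :: IO ()
main = interact detab
```

One caveat of the lines/unlines round trip: the output always ends with a newline, even if the input did not.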
One final comment: as regards run-length encoding, there's a really neat way to do it. I wrote a little article on how to do it a while ago, so I guess I'll just paste it in here. :)
That *is* neat.
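The article itself is not reproduced in this message, so the following is not necessarily the approach it describes. It is just one common way to sketch run-length encoding in Haskell, using group from Data.List; the names encode and decode are chosen here for illustration.

```haskell
module Main where

import Data.List (group)

-- Collapse each run of equal elements into a (count, element) pair.
-- 'group' never yields empty runs, so the pattern below always matches.
encode :: Eq a => [a] -> [(Int, a)]
encode xs = [(length run, x) | run@(x : _) <- group xs]

-- Expand every (count, element) pair back into a run.
decode :: [(Int, a)] -> [a]
decode = concatMap (uncurry replicate)

main :: IO ()
main = print (encode "aaabccdddd")
-- prints [(3,'a'),(1,'b'),(2,'c'),(4,'d')]
```

decode inverts encode, so decode (encode s) gives back s for any finite list s.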
-- Tommy M. McGuire
Thanks. It took a while to write, but I never really found any place to put it up for other people to read.
-- gwern
GSM Submarine E. 510 ddnp building y friends RDI JCET