Re: [Haskell-cafe] Haskell & monads for newbies

Peter Verswyvelen wrote:
Ouch, I should not have brought up these monads again! I should have known better ;-)
Mmm... ;-)
I hope the Haskell community understands that for outsiders / newbies who want to learn or just take a look at Haskell and then do some Googling, all this monad talk looks a bit, er, "strange"?
Yeah. I spent time on another forum trying to explain Haskell. I wrote a fifty-mile-long post that's basically a complete beginner's introduction to the language. Every reply was of the form "wow, did you write all that yourself? That's really good! You should write stuff for a living... Haskell sounds kinda cool, but... monads look very hard, and basically everything I write programs for is mainly about I/O..." I'm sure others have seen something similar...
Maybe in the next version of Haskell, monads should be called something different
SPJ suggested "warm fuzzy thing". ;-)
and then all these tutorials and discussions about monads will be silently forgotten over time ;-)
Oh, I don't know about that... (I for one still haven't figured out how to work monad transformers, for example. And they look useful...)

On Sun, 15 Jul 2007 00:21:50 +0100, you wrote: [quoting a generic attitude]
"basically everything I write programs for is mainly about I/O..."
It's funny how people always seem to think that, but if you look at what they're really doing, I/O is usually the least of their worries. Programming GUIs is about the only reasonably common I/O-related task that has any sort of complexity. Most everything else is reading or writing streams of bytes; the hard part is what happens between the reading and the writing.

Steve Schafer
Fenestra Technologies Corp.
http://www.fenestra.com/

On Sat, 2007-07-14 at 21:25 -0400, Steve Schafer wrote:
On Sun, 15 Jul 2007 00:21:50 +0100, you wrote:
"basically everything I write programs for is mainly about I/O..."
It's funny how people always seem to think that [...] the hard part is what happens between the reading and the writing.
Or for a different slant look at xmonad. (www.xmonad.org)

Derek Elkins wrote:
On Sat, 2007-07-14 at 21:25 -0400, Steve Schafer wrote:
It's funny how people always seem to think that [...] the hard part is what happens between the reading and the writing.
Or for a different slant look at xmonad. (www.xmonad.org)
I'm still really really fuzzy on why this exists... Anyway, I have pointed out that somebody once wrote a Quake clone in Haskell - and that's about as interactive, I/O-intensive and performance-demanding as it gets. It's basically doing all the stuff that Haskell supposedly sucks at. And yet it works. (I'm told...) However, this argument convinces nobody. Saying "it can be done" is different from saying "it can be done easily". ("One person did it" seems to imply the former, whereas "loads of people have done it" would imply the latter...)

On Sun, 2007-07-15 at 17:11 +0100, Andrew Coppin wrote:
Derek Elkins wrote:
Or for a different slant look at xmonad. (www.xmonad.org)
I'm still really really fuzzy on why this exists...
What?
Anyway, I have pointed out that somebody once wrote a Quake clone in Haskell [...] Saying "it can be done" is different from saying "it can be done easily".
The reason I pointed it out is that it (a window manager) is something one usually thinks of as being "nothing but IO", yet this is not at all how xmonad is implemented. As for ease, it is shorter, more featureful, more robust, and was implemented much more quickly than its "inspiration", dwm. It also seems pretty popular for its age (not just in the Haskell community) as far as minimalistic window managers go.
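To make that concrete, here is a toy sketch (not xmonad's actual code) of why a tiling WM isn't "nothing but IO": the tiling decision itself can be an ordinary pure function, leaving only a thin layer to talk to X.

data Rect = Rect { rx, ry, rw, rh :: Int } deriving Show

-- Pure layout logic: split a screen rectangle into n equal vertical
-- columns, one per window.
tile :: Rect -> Int -> [Rect]
tile (Rect x y w h) n = [ Rect (x + i * cw) y cw h | i <- [0 .. n - 1] ]
  where cw = w `div` max 1 n

main :: IO ()
main = mapM_ print (tile (Rect 0 0 1280 800) 3)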

Derek Elkins wrote:
On Sun, 2007-07-15 at 17:11 +0100, Andrew Coppin wrote:
I'm still really really fuzzy on why this exists...
What?
xmonad.
The reason I pointed it out is that it (a window manager) is something one usually thinks of as being "nothing but IO", yet this is not at all how xmonad is implemented.
Indeed it sure *looks* like it should be "all I/O"... ;-)
As for ease, it is shorter, more featureful, more robust, and was implemented much more quickly than its "inspiration", dwm. It also seems pretty popular for its age (not just in the Haskell community) as far as minimalistic window managers go.
More... featureful...? It's a minimalistic WM. It even says so on the tin. Either it's minimal or it isn't... As for "robust"... it tiles windows. What could possibly go wrong?

On Jul 15, 2007, at 14:23, Andrew Coppin wrote:
More... featureful...?
It's a minimalistic WM. It even says so on the tin. Either it's minimal or it isn't...
minimalistic != minimal

The disconnect here is that most people don't want a *truly* minimal WM. They want one which stays out of the way as much as possible, while still providing what turn out to be fairly sophisticated services ("tiling is simple" --- so what happens when a program decides it wants to open a pop-up menu?).

--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com
system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH

On Sun, Jul 15, 2007 at 07:23:37PM +0100, Andrew Coppin wrote:
As for "robust"... it tiles windows. What could possibly go wrong?
I'm told that early versions of dwm had a habit of segfaulting if you looked at them wrong. Just the usual C stuff. Which, in a normal setup, will cause the rest of your login session to die with broken pipes and whatnot...

Stefan

On 7/15/07, Andrew Coppin wrote:
Anyway, I have pointed out that somebody once wrote a Quake clone in Haskell
Really? Do you have a link? This would be quite hard to do, so I'm going to assume that someone who went to the effort of doing this would make an effort to publicize it.

On Jul 15, 2007, at 14:34, Hugh Perkins wrote:
On 7/15/07, Andrew Coppin wrote:
Anyway, I have pointed out that somebody once wrote a Quake clone in Haskell
Really? Do you have a link? [...]
http://www.haskell.org/haskellwiki/Frag

--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com
system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH

On 7/15/07, Brandon S. Allbery KF8NH wrote:
Wow, cool! :-D Looks awesome :-)

Steve Schafer wrote:
"basically everything I write programs for is mainly about I/O..."
It's funny how people always seem to think that [...] the hard part is what happens between the reading and the writing.
Indeed, I replied "if all you want to do is run some algorithm over some data, you can write a thin layer to do the I/O, and then write the rest in pure code". And everybody was like "er, I don't understand what distinction you think you're making..."

Obviously, what I meant is that one can write

main = do
  stuff <- readFile "source.txt"
  writeFile "source.exe" $ compile stuff

compile = ...

I guess because in most normal programming languages you can do I/O anywhere you damn like, it doesn't occur to most programmers that it's possible to make a separation. (Most seem to realise that, e.g., mixing business logic with GUI code is a Bad Thing, though...)

I saw a quote somewhere round here that went like this:

"Haskell isn't really suited to heavily I/O-oriented programs."
"What, you mean like darcs?"
"...oh yeah."

;-)
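A self-contained toy version of that pattern (the "compiler" here is just a stand-in pure function, and the file names are made up):

import Data.Char (toUpper)

-- Thin I/O shell: read, apply a pure function, write.
main :: IO ()
main = do
  stuff <- readFile "source.txt"
  writeFile "output.txt" (compile stuff)

-- The entire "compiler" is pure code; swap in any transformation you like.
compile :: String -> String
compile = map toUpper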

On 15/07/07, Andrew Coppin wrote:
I guess because in most normal programming languages you can do I/O anywhere you damn like, it doesn't occur to most programmers that it's possible to make a separation. (Most seem to realise that, e.g., mixing business logic with GUI code is a Bad Thing, though...)
Hmm, I would speculate (I have no hard data, in other words...) that it's more the case that in imperative languages, you do I/O throughout the program because that defers the I/O (which is slow) to the last possible moment, and it allows you to reuse memory buffers. People's intuition about performance and memory usage says that delaying I/O is good, and that "separating" I/O and logic (which is taken to mean slurping data in all at once and then processing it) is memory intensive and risks doing unnecessary I/O.

Haskell handles this with laziness. The canonical example is counting characters in a file, where you just grab the whole file and use length. An imperative programmer's intuition says that this wastes huge amounts of memory compared to reading character by character and incrementing a count. Lazy I/O means that no more than one character needs to be in RAM at any one time, without the programmer needing to do the bookkeeping.

If lazy I/O were publicised in this way, as separation of concerns (I/O and processing) with the compiler and language handling the work of minimising memory use and avoiding unnecessary I/O, then maybe the message might get through better. However, the only article I've ever seen taking this approach (http://blogs.nubgames.com/code/?p=22) didn't seem to get a good reception in the Haskell community, sparking comments that hGetContents and similar functions had a number of issues which made them "bad practice". The result was to leave me with a feeling that separating I/O and processing in Haskell really was hard, but I never quite understood why...

So I guess that leaves me with the question: is separating I/O and processing really the right thing to do (in terms of memory usage and performance) in Haskell, and if so, why isn't it advertised more? (And for extra credit, please explain why the article I quoted above didn't make more of an impact in the Haskell community... :-))

Paul.
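A minimal sketch of that canonical character-counting example, assuming GHC's lazy readFile and a hypothetical file name:

-- Count the characters in a file. Lazy I/O streams the contents:
-- length consumes each chunk as it arrives, and consumed chunks can be
-- garbage-collected, so the whole file is never resident at once.
main :: IO ()
main = do
  s <- readFile "input.txt"
  print (length s)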

On Sunday 15 July 2007, Paul Moore wrote:
On 15/07/07, Andrew Coppin wrote:
I guess because in most normal programming languages you can do I/O anywhere you damn like, it doesn't occur to most programmers that it's possible to make a separation. [...]
Haskell handles this with laziness. [...] Lazy I/O means that no more than one character needs to be in RAM at any one time, without the programmer needing to do the bookkeeping.
If lazy I/O were publicised in this way, as separation of concerns (I/O and processing) with the compiler and language handling the work of minimising memory use and avoiding unnecessary I/O, then maybe the message might get through better. However, the only article I've ever seen taking this approach (http://blogs.nubgames.com/code/?p=22) didn't seem to get a good reception in the Haskell community, sparking comments that hGetContents and similar functions had a number of issues which made them "bad practice". The result was to leave me with a feeling that separating I/O and processing in Haskell really was hard, but I never quite understood why...
Because hGetContents only buys you laziness /if you use it lazily/. And laziness is, technically, a denotational property, but it is a very operational-feeling denotational property. And operational reasoning is difficult in imperative languages and gets really, really hard in lazy functional languages. And the article you cite falls flat on its face in trying to be lazy:
readWithIncludes :: String -> IO [String]
readWithIncludes f = do
  s <- readFile f
  ss <- mapM expandIncludes (lines s)
  return (concat ss)

expandIncludes :: String -> IO [String]
expandIncludes s =
  if isInclude s
    then readWithIncludes (includeFile s)
    else return [s]
That's calling mapM, a strict function, on the result of lines s --- an arbitrarily long list.

More generally, I suspect the Haskell community has a collective memory of stream I/O, back when this sort of thing used to be /really, really important/, because your program had type [Response] -> [Request] and if it wasn't lazy enough in its argument, you'd get a deadlock --- and that deadlock had nothing whatsoever to do with the result of applying your function to total arguments, so reasoning about it required abandoning every Haskeller's instinct to reason about functions only over total (or even finite total) arguments. interact takes a function with a type eerily similar to [Response] -> [Request], which means its argument has all the same problems.

Laziness is great and everything --- but it's a lot of work, even in Haskell.
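For what it's worth, here is one way the excerpt could be made lazy. This is a sketch of my own, not the article's code: it uses unsafeInterleaveIO (the mechanism hGetContents itself is built on), and isInclude/includeFile are hypothetical stand-ins for the article's helpers.

import System.IO.Unsafe (unsafeInterleaveIO)
import Data.List (isPrefixOf)

-- Hypothetical stand-ins for the article's helpers.
isInclude :: String -> Bool
isInclude = ("#include " `isPrefixOf`)

includeFile :: String -> FilePath
includeFile = drop (length "#include ")

readWithIncludes :: FilePath -> IO [String]
readWithIncludes f = do
  s <- readFile f
  lazyConcatMapM expandIncludes (lines s)

expandIncludes :: String -> IO [String]
expandIncludes s
  | isInclude s = readWithIncludes (includeFile s)
  | otherwise   = return [s]

-- Like mapM followed by concat, but each recursive step is deferred
-- until its output is actually demanded, so the result streams instead
-- of being built strictly.
lazyConcatMapM :: (a -> IO [b]) -> [a] -> IO [b]
lazyConcatMapM _ []       = return []
lazyConcatMapM g (x : xs) = do
  ys   <- g x
  rest <- unsafeInterleaveIO (lazyConcatMapM g xs)
  return (ys ++ rest)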
Jonathan Cast
http://sourceforge.net/projects/fid-core
http://sourceforge.net/projects/fid-emacs

Paul Moore wrote:
Haskell handles this with laziness. The canonical example is counting characters in a file, where you just grab the whole file and use length. [...]
Indeed, I had *this* conversation with Mr C++ as well... He proudly showed off a 3-page alphabet soup of C++ which allows him to do bit-level processing of a file as if it's really a collection of bits. And I said that in my program, I just grab a list of bytes and convert it into a list of bits. And he was like "wow - that's going to waste a heck of a lot of RAM..." But using the magic of getContents... actually no, it isn't. ;-)
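The bytes-to-bits trick can be a few lines. A sketch, where the most-significant-first bit order and the use of getContents are my own choices:

import Data.Bits (testBit)
import Data.Char (ord)

-- Expand each byte (here, a Char's code point, assumed < 256) into its
-- eight bits, most significant first.
toBits :: String -> [Bool]
toBits = concatMap (\c -> [testBit (ord c) i | i <- [7, 6 .. 0]])

main :: IO ()
main = do
  s <- getContents             -- lazily read stdin
  print (length (toBits s))    -- e.g. count the bits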
If lazy I/O were publicised in this way, as separation of concerns (I/O and processing)... [...] The result was to leave me with a feeling that separating I/O and processing in Haskell really was hard, but I never quite understood why...
So I guess that leaves me with the question: is separating I/O and processing really the right thing to do (in terms of memory usage and performance) in Haskell, and if so, why isn't it advertised more?
It's something I use all the time... Of course, as soon as you want to scan the data *twice*... well, if you do it in the obvious way, the GC system will hold who knows how many MB (or even GB) of data in RAM ready for you to scan it the second time. I have a vague recollection of somebody muttering something about ByteStrings and memory-mapped files...?
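A sketch of the strict-ByteString route for the two-pass case (the file name is made up): the data is held once, compactly, and both passes walk the same buffer.

import qualified Data.ByteString as B

main :: IO ()
main = do
  bs <- B.readFile "input.bin"  -- strict: one compact buffer
  print (B.length bs)           -- first pass
  print (B.count 0 bs)          -- second pass over the same buffer, no re-read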

Hello Andrew,

Monday, July 16, 2007, 1:06:42 AM, you wrote:
I have a vague recollection of somebody muttering something about ByteStrings and memory-mapped files...?
http://www.haskell.org/library/StreamsBeta.tar.gz

You can either open a memory-mapped file with openBinaryMMFile and use it to read/write any data, including ByteStrings, or use the following code, which maps a file into memory and allows it to be used as a ByteString:

-----------------------------------------------------------------------------
-- Mapping file contents into ByteString / memory buffer

#if defined(__GLASGOW_HASKELL__)
#if defined(USE_BYTE_STRING)
-- | Like mmapBinaryFilePtr, but returns a ByteString representing
-- the entire file contents.
mmapBinaryFileBS :: FilePath -> IO ByteString
mmapBinaryFileBS f = do
    (fp, l) <- mmapBinaryFilePtr f
    return $ fromForeignPtr fp 0 l
#endif

-- | Like 'readFile', this reads an entire file directly into a
-- 'ByteString', but it is even more efficient. It involves directly
-- mapping the file to memory. This has the advantage that the contents
-- of the file never need to be copied. Also, under memory pressure the
-- page may simply be discarded, while in the case of readFile it would
-- need to be written to swap. You can run into bus errors if the file
-- is modified.
mmapBinaryFilePtr :: FilePath -> IO (ForeignPtr a, Int)
mmapBinaryFilePtr f = do
    fd  <- openBinaryFD f ReadMode
    len <- fdFileSize fd
    l   <- checkedFromIntegral len $ do  -- some files are >4GB these days ;)
             fail $ "mmapBinaryFilePtr: file '"++f++"' is too big ("++show len++" bytes)!"
    -- Don't bother mmaping small files because each mmapped file takes up
    -- at least one full VM block.
    if l < mmap_limit
      then do fp <- mallocForeignPtrBytes l
              withForeignPtr fp $ \p -> fdGetBuf fd p l
              fdClose fd
              return (fp, l)
      else do mmfd <- myOpenMMap fd ReadMode
              p    <- myMMap mmfd ReadMode 0 l
              let unmap = do myUnMMap p l
                             myCloseMMap mmfd
                             fdClose fd
                             return ()
              fp <- FC.newForeignPtr p unmap
              return (fp, l)
  where mmap_limit = 16*1024
#endif

--
Best regards,
Bulat                          mailto:Bulat.Ziganshin@gmail.com
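A hypothetical usage sketch for the code above, assuming the Streams library is installed and built with USE_BYTE_STRING (the file name is made up):

import qualified Data.ByteString as B

main :: IO ()
main = do
  bs <- mmapBinaryFileBS "huge.dat"  -- pages are mapped in on demand
  print (B.length bs)                -- first scan
  print (B.count 0 bs)               -- second scan: the OS can discard
                                     -- clean pages instead of swapping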
participants (9):
- Andrew Coppin
- Brandon S. Allbery KF8NH
- Bulat Ziganshin
- Derek Elkins
- Hugh Perkins
- Jonathan Cast
- Paul Moore
- Stefan O'Rear
- Steve Schafer