A better type signature for `forM_`

Hi all, I was recently faced with some unexpected behaviour from a piece of code that type checks and has zero warnings (even with -Wall). The code is below (and depends on the hashtables package). The error was using the <$> operator instead of the =<< operator. Using the former, it just builds up a list of IO actions that never get run. As pointed out to me on IRC (thanks pjdeport), chaning the type signature of `forM_` to forM_' :: (Monad m, Foldable t) => t a -> (a -> m ()) -> m () would have resulted in an error. Yes, this change would break existing code (breaking code would require an explicit `void $` inside the `forM_`) but does anyone else think this is a good idea? Erik import Control.Monad import qualified Data.HashTable.IO as HT type EvenCache = HT.BasicHashTable Int Bool main :: IO () main = do ht <- buildTable xs <- HT.toList ht putStrLn $ "cache: length " ++ show (length xs) buildTable :: IO EvenCache buildTable = do ht <- HT.new forM_ pairs $ \ (k,v) -> maybe (HT.insert ht k v) (const $ abort k) <$> HT.lookup ht k return ht where xs = [1 .. 10] :: [Int] pairs = map (\ i -> (i, even i)) xs abort k = error $ "cache: duplicate key " ++ show k ++ "." -- ---------------------------------------------------------------------- Erik de Castro Lopo http://www.mega-nerd.com/

For me the convenience of |for_| (without having to use |void|) is more important than increased safety, but I accept that others’ needs are different – sometimes avoiding such nasty surprises is incredibly important, sometimes not so, and it depends both on the developer and the project. In an ideal world it’d probably be an optional warning (which I’d be able to disable, just like I do with |fwarn-unused-do-bind|), but I have no idea how hard it would be to implement and I can't imagine how it might look anyway. Bottom line: if it will result in an error and not a warning (i.e. changing the type of |mapM_|/|forM_|/|for_|, as originally proposed), I’m mildly against this idea.

For what it's worth, we made that change in mono-traversable already,
although the primary motivation was performance, not safety.
https://github.com/snoyberg/mono-traversable/issues/28
http://www.yesodweb.com/blog/2014/05/foldable-mapm-maybe-recursive
On Fri, Apr 1, 2016, 2:37 AM Artyom
For me the convenience of for_ (without having to use void) is more important than increased safety, but I accept that others’ needs are different – sometimes avoiding such nasty surprises is incredibly important, sometimes not so, and it depends both on the developer and the project.
In an ideal world it’d probably be an optional warning (which I’d be able to disable, just like I do with fwarn-unused-do-bind), but I have no idea how hard it would be to implement and I can't imagine how it might look anyway.
Bottom line: if it will result in an error and not a warning (i.e. changing the type of mapM_/forM_/for_, as originally proposed), I’m mildly against this idea. _______________________________________________ Libraries mailing list Libraries@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

Erik de Castro Lopo
writes:
As pointed out to me on IRC (thanks pjdeport), chaning the type signature of `forM_` to forM_' :: (Monad m, Foldable t) => t a -> (a -> m ()) -> m () would have resulted in an error.
What about mapM_, replicateM_, etc.? In each case, '_' means "if you return a value from your action, it is ignored", and not "you must return unit". -- John Wiegley GPG fingerprint = 4710 CF98 AF9B 327B B80F http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2

John Wiegley wrote:
Erik de Castro Lopo
writes: As pointed out to me on IRC (thanks pjdeport), chaning the type signature of `forM_` to forM_' :: (Monad m, Foldable t) => t a -> (a -> m ()) -> m () would have resulted in an error.
What about mapM_, replicateM_, etc.? In each case, '_' means "if you return a value from your action, it is ignored", and not "you must return unit".
Which is the exact problem. I suspect that most people use `forM_` as "assume the action returns unit" rather than "assume the return value of the action is ignored". FWIW, ignoring a list of unit is perfectly fine, but ignoring a list of `IO ()` is almost certainly not what people meant. Erik -- ---------------------------------------------------------------------- Erik de Castro Lopo http://www.mega-nerd.com/

On Fri, Apr 1, 2016 at 1:13 AM, Erik de Castro Lopo
Which is the exact problem. I suspect that most people use `forM_` as "assume the action returns unit" rather than "assume the return value of the action is ignored".
A review of my code shows I mostly use it as "assume the return value of the action is ignored". Many things return data that is sometimes useful, but not always. Rarely do they have an '_' variant or the like to allow you to be more specific. I wouldn't find it to be overly onerous to throw "void" in in those cases though. I will point out that "void" isn't exported from the Prelude, while "mapM_" is. -davean

On Fri, Apr 01, 2016 at 10:23:24AM -0400, davean wrote:
On Fri, Apr 1, 2016 at 1:13 AM, Erik de Castro Lopo wrote:
Which is the exact problem. I suspect that most people use `forM_` as "assume the action returns unit" rather than "assume the return value of the action is ignored".
A review of my code shows I mostly use it as "assume the return value of the action is ignored".
Many things return data that is sometimes useful, but not always. Rarely do they have an '_' variant or the like to allow you to be more specific.
(emphasis added:)
I wouldn't find it to be overly onerous to throw "void" in in those cases though.
The first email in this thread hinted at another possible solution, with "unexpected behavior from... code that... has zero warning (even with -Wall)." Even if forM_ gets a new type, I don't think people will be much assisted with the new message: Expected: IO () Actual: IO (IO Int) That message actually mixes two different problems. If the wisdom becomes "throw a void on it", which I've already seen suggested a few times, you easily get back to the original problem. Change the body of the lambda to void $ maybe (HT.insert ht k v) (const $ abort k) <$> HT.lookup ht k ^^^^^^^ ...and now it *still* typechecks without warnings, but the original problem remains. Better, if possible, would be Warning: Discarded monadic value [in forM_] Even that wouldn't catch my 'void $' above, but it also wouldn't be indirectly *suggesting* it as a solution. However, the thing Snoyberg raised about performance seems important. Busting the stack while trying to throw away variables seems unfortunate. Is Free the best counterexample to the "performance" of `t a -> (a -> m ()) -> m ()`? Does Free really matter? It's not exactly known for being a performant option.

If we provide (under whatever names) both the discard-value and the require-unit versions, then at least for the inlined case we could safely remove the overhead of void with a rewrite rule. On Fri, Apr 1, 2016 at 8:12 AM, Bryan Richter wrote:
On Fri, Apr 01, 2016 at 10:23:24AM -0400, davean wrote:
On Fri, Apr 1, 2016 at 1:13 AM, Erik de Castro Lopo wrote:
Which is the exact problem. I suspect that most people use `forM_` as "assume the action returns unit" rather than "assume the return value of the action is ignored".
A review of my code shows I mostly use it as "assume the return value of the action is ignored".
Many things return data that is sometimes useful, but not always. Rarely do they have an '_' variant or the like to allow you to be more specific.
(emphasis added:)
I wouldn't find it to be overly onerous to throw "void" in in those cases though.
The first email in this thread hinted at another possible solution, with "unexpected behavior from... code that... has zero warning (even with -Wall)."
Even if forM_ gets a new type, I don't think people will be much assisted with the new message:
Expected: IO () Actual: IO (IO Int)
That message actually mixes two different problems. If the wisdom becomes "throw a void on it", which I've already seen suggested a few times, you easily get back to the original problem. Change the body of the lambda to
void $ maybe (HT.insert ht k v) (const $ abort k) <$> HT.lookup ht k ^^^^^^^
...and now it *still* typechecks without warnings, but the original problem remains.
Better, if possible, would be
Warning: Discarded monadic value [in forM_]
Even that wouldn't catch my 'void $' above, but it also wouldn't be indirectly *suggesting* it as a solution.
However, the thing Snoyberg raised about performance seems important. Busting the stack while trying to throw away variables seems unfortunate. Is Free the best counterexample to the "performance" of `t a -> (a -> m ()) -> m ()`? Does Free really matter? It's not exactly known for being a performant option.
_______________________________________________ Libraries mailing list Libraries@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

No, we can't in general. With the exception of Control.Arrow (why that
exception? No clue!) GHC's rewrite rules try very hard not to change the
meaning of even bad user code. Yes, x <$ xs is supposed to be equivalent to
fmap (const x) xs, and fmap is supposed to obey the functor laws. But
neither of these is guaranteed, so a rewrite rule depending on these laws
could change the meaning of user code.
On Apr 1, 2016 1:03 PM, "David Thomas"
If we provide (under whatever names) both the discard-value and the require-unit versions, then at least for the inlined case we could safely remove the overhead of void with a rewrite rule.
On Fri, Apr 1, 2016 at 8:12 AM, Bryan Richter wrote:
On Fri, Apr 01, 2016 at 10:23:24AM -0400, davean wrote:
On Fri, Apr 1, 2016 at 1:13 AM, Erik de Castro Lopo wrote:
Which is the exact problem. I suspect that most people use `forM_` as "assume the action returns unit" rather than "assume the return value of the action is ignored".
A review of my code shows I mostly use it as "assume the return value of the action is ignored".
Many things return data that is sometimes useful, but not always. Rarely do they have an '_' variant or the like to allow you to be more specific.
(emphasis added:)
I wouldn't find it to be overly onerous to throw "void" in in those cases though.
The first email in this thread hinted at another possible solution, with "unexpected behavior from... code that... has zero warning (even with -Wall)."
Even if forM_ gets a new type, I don't think people will be much assisted with the new message:
Expected: IO () Actual: IO (IO Int)
That message actually mixes two different problems. If the wisdom becomes "throw a void on it", which I've already seen suggested a few times, you easily get back to the original problem. Change the body of the lambda to
void $ maybe (HT.insert ht k v) (const $ abort k) <$> HT.lookup ht k ^^^^^^^
...and now it *still* typechecks without warnings, but the original problem remains.
Better, if possible, would be
Warning: Discarded monadic value [in forM_]
Even that wouldn't catch my 'void $' above, but it also wouldn't be indirectly *suggesting* it as a solution.
However, the thing Snoyberg raised about performance seems important. Busting the stack while trying to throw away variables seems unfortunate. Is Free the best counterexample to the "performance" of `t a -> (a -> m ()) -> m ()`? Does Free really matter? It's not exactly known for being a performant option.
_______________________________________________ Libraries mailing list Libraries@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
_______________________________________________ Libraries mailing list Libraries@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

I think it's an interesting idea from a safety standpoint. Unfortunately,
as Ed pointed out to me in a similar context, `void` isn't always free.
On Mar 31, 2016 7:22 PM, "Erik de Castro Lopo"
Hi all,
I was recently faced with some unexpected behaviour from a piece of code that type checks and has zero warnings (even with -Wall). The code is below (and depends on the hashtables package).
The error was using the <$> operator instead of the =<< operator. Using the former, it just builds up a list of IO actions that never get run.
As pointed out to me on IRC (thanks pjdeport), chaning the type signature of `forM_` to
forM_' :: (Monad m, Foldable t) => t a -> (a -> m ()) -> m ()
would have resulted in an error.
Yes, this change would break existing code (breaking code would require an explicit `void $` inside the `forM_`) but does anyone else think this is a good idea?
Erik
import Control.Monad
import qualified Data.HashTable.IO as HT
type EvenCache = HT.BasicHashTable Int Bool
main :: IO () main = do ht <- buildTable xs <- HT.toList ht putStrLn $ "cache: length " ++ show (length xs)
buildTable :: IO EvenCache buildTable = do ht <- HT.new forM_ pairs $ \ (k,v) -> maybe (HT.insert ht k v) (const $ abort k) <$> HT.lookup ht k return ht where xs = [1 .. 10] :: [Int] pairs = map (\ i -> (i, even i)) xs abort k = error $ "cache: duplicate key " ++ show k ++ "."
-- ---------------------------------------------------------------------- Erik de Castro Lopo http://www.mega-nerd.com/ _______________________________________________ Libraries mailing list Libraries@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

David Feuer wrote:
I think it's an interesting idea from a safety standpoint. Unfortunately, as Ed pointed out to me in a similar context, `void` isn't always free.
I would love to see an elaboration of that. Erik -- ---------------------------------------------------------------------- Erik de Castro Lopo http://www.mega-nerd.com/

Suppose you have something like
data Free f a = Pure a | Free (f (Free f a))
instance Functor f => Functor (Free f) where
fmap = mapFree
-- Separated out for clarity
mapFree :: Functor f
=> (a -> b)
-> Free f a -> Free f b
mapFree f (Pure a) = Pure (f a)
mapFree f (Free m) = Free (fmap (mapFree f) m)
Now void x = () <$ x, and `Free f` doesn't offer any potential for a cheap
<$. So voiding out Free f a is going to cost you some. Laziness will help,
but you'll face extra allocation.
On Apr 1, 2016 1:02 AM, "Erik de Castro Lopo"
David Feuer wrote:
I think it's an interesting idea from a safety standpoint. Unfortunately, as Ed pointed out to me in a similar context, `void` isn't always free.
I would love to see an elaboration of that.
Erik -- ---------------------------------------------------------------------- Erik de Castro Lopo http://www.mega-nerd.com/ _______________________________________________ Libraries mailing list Libraries@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

On Fri, 1 Apr 2016, Erik de Castro Lopo wrote:
As pointed out to me on IRC (thanks pjdeport), changing the type signature of `forM_` to
forM_' :: (Monad m, Foldable t) => t a -> (a -> m ()) -> m ()
would have resulted in an error.
I would prefer that type (and the current behavior could be provided by a different function forVoidM_). If people do not want a change, the next best solution would be to add forUnitM_ with the type above. Then I could add a HLint warning about using current forM_. Or even better: Create a new module Data.Foldable.Unit or Control.Monad.Unit and put all result ignoring functions like forM_, mapM_, replicateM_ there.
participants (9)
-
Artyom
-
Bryan Richter
-
davean
-
David Feuer
-
David Thomas
-
Erik de Castro Lopo
-
Henning Thielemann
-
John Wiegley
-
Michael Snoyman