
Hello cafe, We, the network library maintainers, have questions about System.Mem.Weak.addFinalizer in GHC 7.10. We are trying to make the library extensible especially for the socket type and the socket address type. For this purpose, a proposed definition for Socket in version 3.0.0.0 is a just CInt: newtype Socket = Socket CInt deriving (Eq, Show) One problem is that unreachable Socket cannot be GCed if CInt (a file descriptor) is not closed. To let GC work, we are trying to add a finalizer via addFinalizer: socket family stype protocol = do fd <- c_socket ... ... let s = Socket fd addFinalizer s $ close s ruturn s This works well for many cases. Unfortunately, we *sometime* got the following error with GHC 7.10: tests/SimpleSpec.hs:56: 1) Simple.sendMany works well uncaught exception: IOException of type InvalidArgument (threadWait: invalid argument (Bad file descriptor)) It seems to us that Socket is GCed too early in GHC 7.10. We don't see this behavior in other versions of GHC. So, here are our questions: Q1) Do we use addFinalizer correctly? Q2) Is this a bug of GHC 7.10? Q3) If this is a bug of GHC 7.10, are there any workarounds? --Kazu

On Jan 29, 2018, at 10:25 PM, Kazu Yamamoto (山本和彦)
wrote: socket family stype protocol = do fd <- c_socket ... ... let s = Socket fd addFinalizer s $ close s ruturn s
For the record, I think I've convinced Kazu Yamamoto that this is an anti-pattern. Such a finalizer would easily end up closing already closed sockets, whose file-descriptors may already be associated with other open files or sockets. That way lie all sorts of difficult to isolate race-condition bugs. To make this safe, the close function would need to mutate the socket, invalidating the enclosed file-descriptor, and would then need to be a NOP or just raise an exception if the socket is closed again (the finalizer should invoke a close variant that just returns without raising exceptions if the socket is already closed). There is, AFAIK still an unresolved bug along these lines somewhere in http-client and its dependencies. So far no reproducing cases have been provided. No very recent reports either, perhaps it went away, or people have just been more lucky lately: https://github.com/snoyberg/http-client/issues/252 https://github.com/vincenthz/hs-tls/issues/179 All that said, the original question about addFinalizer vs. GHC 7.10 may still be worth exploring, even if its use-case for Sockets goes away. So, please don't take this poset to mean that the original question should be ignored. -- Viktor.

Hi, I registered this issue to network: https://github.com/haskell/network/issues/302 Viktor, thank you! --Kazu
On Jan 29, 2018, at 10:25 PM, Kazu Yamamoto (山本和彦)
wrote: socket family stype protocol = do fd <- c_socket ... ... let s = Socket fd addFinalizer s $ close s ruturn s
For the record, I think I've convinced Kazu Yamamoto that this is an anti-pattern. Such a finalizer would easily end up closing already closed sockets, whose file-descriptors may already be associated with other open files or sockets. That way lie all sorts of difficult to isolate race-condition bugs. To make this safe, the close function would need to mutate the socket, invalidating the enclosed file-descriptor, and would then need to be a NOP or just raise an exception if the socket is closed again (the finalizer should invoke a close variant that just returns without raising exceptions if the socket is already closed).
There is, AFAIK still an unresolved bug along these lines somewhere in http-client and its dependencies. So far no reproducing cases have been provided. No very recent reports either, perhaps it went away, or people have just been more lucky lately:
https://github.com/snoyberg/http-client/issues/252 https://github.com/vincenthz/hs-tls/issues/179
All that said, the original question about addFinalizer vs. GHC 7.10 may still be worth exploring, even if its use-case for Sockets goes away. So, please don't take this poset to mean that the original question should be ignored.
-- Viktor.
_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

Attaching a finalizer to a regular data type like that is a bit of a hazard
prone process. If GHC unpack your Socket in another data constructor it
will get "freed" and then your socket can get finalized while you still
have access to the CInt!
data MySockets = MySockets {-# unpack #-} !Socket {-# unpack #-} !Socket
will happily 'eat' your data constructor wrapper and if there are no
references remaining to a copy of it on the heap, their finalizers will
fire. Worker-wrapper transforms can do this even without any "containing"
data type if you just increase your optimization level!
It would be much, much safer to attach the finalizer to something that has
a "presence" all its own, like Weak# () as is done in ForeignPtr. This
would result in something like:
data Socket = Socket !CInt (Weak# ())
Then when it gets unpacked into another data constructor, then the Weak# ()
still exists. This isn't free, it comes at the cost that your sockets take
a couple of words each (plus finalizer space), but the approach you are
taking now isn't free either as it isn't really sound. ;)
*tl;dr* don't attach finalizers to regular Haskell data types if you can
help it
-Edward
On Tue, Jan 30, 2018 at 12:18 AM, Kazu Yamamoto
Hi,
I registered this issue to network:
https://github.com/haskell/network/issues/302
Viktor, thank you!
--Kazu
On Jan 29, 2018, at 10:25 PM, Kazu Yamamoto (山本和彦)
wrote: socket family stype protocol = do fd <- c_socket ... ... let s = Socket fd addFinalizer s $ close s ruturn s
For the record, I think I've convinced Kazu Yamamoto that this is an anti-pattern. Such a finalizer would easily end up closing already closed sockets, whose file-descriptors may already be associated with other open files or sockets. That way lie all sorts of difficult to isolate race-condition bugs. To make this safe, the close function would need to mutate the socket, invalidating the enclosed file-descriptor, and would then need to be a NOP or just raise an exception if the socket is closed again (the finalizer should invoke a close variant that just returns without raising exceptions if the socket is already closed).
There is, AFAIK still an unresolved bug along these lines somewhere in http-client and its dependencies. So far no reproducing cases have been provided. No very recent reports either, perhaps it went away, or people have just been more lucky lately:
https://github.com/snoyberg/http-client/issues/252 https://github.com/vincenthz/hs-tls/issues/179
All that said, the original question about addFinalizer vs. GHC 7.10 may still be worth exploring, even if its use-case for Sockets goes away. So, please don't take this poset to mean that the original question should be ignored.
-- Viktor.
_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

On Jan 30, 2018, at 4:50 PM, Edward Kmett
wrote: It would be much, much safer to attach the finalizer to something that has a "presence" all its own, like Weak# () as is done in ForeignPtr. This would result in something like:
data Socket = Socket !CInt (Weak# ())
Then when it gets unpacked into another data constructor, then the Weak# () still exists. This isn't free, it comes at the cost that your sockets take a couple of words each (plus finalizer space), but the approach you are taking now isn't free either as it isn't really sound. ;)
tl;dr don't attach finalizers to regular Haskell data types if you can help it
THanks, good to know. I gather the unpacking can/will happen even if Socket internals are [made] opaque to other modules? And of course in this case, in addition to avoiding running the finalizer too early, it is critical that each socket be closed at most once. Therefore, to support finalization, and make the API safe for multiple close (as seems to be the case with System.IO Handle's for example) there's a need for additional mutable state in the Socket, to keep track of whether it has or has not yet been closed. I am curious as to what your suggestion would be as to how to best keep track of such state. 1. Employ a separate MVar to keep track of socket state, and update it on close to ensure at most once close. 2. Wrap the file descriptor in an IORef, and set it to an invalid value (-1 on Unix, INVALID_SOCKET on Windows) on close. This avoids misuse not only with close, but also with attempts at read/write/... I/O after close. However it does not avoid (far less likely I think) races to close the socket from multiple threads. 3. Move the state to a wrapper structure managed in FFI code so that all socket operations are via a foreign pointer to a C-structure in which the file descriptor is invalidated on close. 4. Other suggestions... -- Viktor.

Yes, this can happen even if you're opaque and don't export stuff.
You could attach the finalizer to an (unpacked) IORef that holds the Bool
that says whether you've been finalized. An IORef holds onto a MutVar#
under the hood, and so it also maintains the same sort of stable presence
Weak# offered above. Similarly you _could_ just stuff the info in an MVar
or ForeignPtr. Each of those has support for attaching a finalizer directly
to the heap allocated part that lives in # and therefore isn't vulnerable
to being inlined or unpacked.
There probably should be able to be similar options for attaching the
finalizer to a MutableByteArray#, even if the combinators don't exist right
now, etc.
I'd personally go with an IORef and then atomicModifyIORef during the
"close" and "finalize" operations to extract and set the flag in one
operation. That way there isn't any communication overhead, but IORef,
ForeignPtr, MVar, etc. could all work.
-Edward
On Tue, Jan 30, 2018 at 5:09 PM, Viktor Dukhovni
On Jan 30, 2018, at 4:50 PM, Edward Kmett
wrote: It would be much, much safer to attach the finalizer to something that has a "presence" all its own, like Weak# () as is done in ForeignPtr. This would result in something like:
data Socket = Socket !CInt (Weak# ())
Then when it gets unpacked into another data constructor, then the Weak# () still exists. This isn't free, it comes at the cost that your sockets take a couple of words each (plus finalizer space), but the approach you are taking now isn't free either as it isn't really sound. ;)
tl;dr don't attach finalizers to regular Haskell data types if you can help it
THanks, good to know. I gather the unpacking can/will happen even if Socket internals are [made] opaque to other modules?
And of course in this case, in addition to avoiding running the finalizer too early, it is critical that each socket be closed at most once. Therefore, to support finalization, and make the API safe for multiple close (as seems to be the case with System.IO Handle's for example) there's a need for additional mutable state in the Socket, to keep track of whether it has or has not yet been closed.
I am curious as to what your suggestion would be as to how to best keep track of such state.
1. Employ a separate MVar to keep track of socket state, and update it on close to ensure at most once close.
2. Wrap the file descriptor in an IORef, and set it to an invalid value (-1 on Unix, INVALID_SOCKET on Windows) on close. This avoids misuse not only with close, but also with attempts at read/write/... I/O after close. However it does not avoid (far less likely I think) races to close the socket from multiple threads.
3. Move the state to a wrapper structure managed in FFI code so that all socket operations are via a foreign pointer to a C-structure in which the file descriptor is invalidated on close.
4. Other suggestions...
-- Viktor.
_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

Hi Edward, Thank you for your excellent explanation! I'm now being convinced by the IORef approach. I should write PoC for that. --Kazu
Yes, this can happen even if you're opaque and don't export stuff.
You could attach the finalizer to an (unpacked) IORef that holds the Bool that says whether you've been finalized. An IORef holds onto a MutVar# under the hood, and so it also maintains the same sort of stable presence Weak# offered above. Similarly you _could_ just stuff the info in an MVar or ForeignPtr. Each of those has support for attaching a finalizer directly to the heap allocated part that lives in # and therefore isn't vulnerable to being inlined or unpacked.
There probably should be able to be similar options for attaching the finalizer to a MutableByteArray#, even if the combinators don't exist right now, etc.
I'd personally go with an IORef and then atomicModifyIORef during the "close" and "finalize" operations to extract and set the flag in one operation. That way there isn't any communication overhead, but IORef, ForeignPtr, MVar, etc. could all work.
-Edward
On Tue, Jan 30, 2018 at 5:09 PM, Viktor Dukhovni
wrote: On Jan 30, 2018, at 4:50 PM, Edward Kmett
wrote: It would be much, much safer to attach the finalizer to something that has a "presence" all its own, like Weak# () as is done in ForeignPtr. This would result in something like:
data Socket = Socket !CInt (Weak# ())
Then when it gets unpacked into another data constructor, then the Weak# () still exists. This isn't free, it comes at the cost that your sockets take a couple of words each (plus finalizer space), but the approach you are taking now isn't free either as it isn't really sound. ;)
tl;dr don't attach finalizers to regular Haskell data types if you can help it
THanks, good to know. I gather the unpacking can/will happen even if Socket internals are [made] opaque to other modules?
And of course in this case, in addition to avoiding running the finalizer too early, it is critical that each socket be closed at most once. Therefore, to support finalization, and make the API safe for multiple close (as seems to be the case with System.IO Handle's for example) there's a need for additional mutable state in the Socket, to keep track of whether it has or has not yet been closed.
I am curious as to what your suggestion would be as to how to best keep track of such state.
1. Employ a separate MVar to keep track of socket state, and update it on close to ensure at most once close.
2. Wrap the file descriptor in an IORef, and set it to an invalid value (-1 on Unix, INVALID_SOCKET on Windows) on close. This avoids misuse not only with close, but also with attempts at read/write/... I/O after close. However it does not avoid (far less likely I think) races to close the socket from multiple threads.
3. Move the state to a wrapper structure managed in FFI code so that all socket operations are via a foreign pointer to a C-structure in which the file descriptor is invalidated on close.
4. Other suggestions...
-- Viktor.
_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
participants (3)
-
Edward Kmett
-
Kazu Yamamoto
-
Viktor Dukhovni