
On Wed, Oct 09, 2024 at 12:15:32PM +0530, Harendra Kumar wrote:
> We do use low-level C APIs and GHC APIs to create a Handle in the event-watching module, but that is for the watch root, not for the file that is experiencing this problem. Here is how it works. We have a top-level directory which is watched for events using inotify. We first create this directory, then set up an inotify instance (inotify_init), which returns a C file descriptor. From that fd we create a Handle, and this Handle is used for watching inotify events. While we are reading events from the watched directory, we create a file inside it; the resource-busy issue occurs when creating that file. So the Handle for the file in question is not created in a non-standard manner; it is the parent directory's watch Handle that is. I do not know whether that somehow affects anything, or whether the fact that the directory is being watched using inotify makes any difference.
>
> The code for creating the watch Handle is here: https://github.com/composewell/streamly/blob/bbac52d9e09fa5ad760ab6ee5572c70... . Viktor, you may want to take a quick look at this to see if it can make any difference to the issue at hand.
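If I am reading that correctly, the construction amounts to something like the following (a simplified, untested sketch; the FFI import and the helper name are illustrative, and the actual code, which may work at a lower level, is at the link above):

    {-# LANGUAGE ForeignFunctionInterface #-}

    -- Simplified sketch of wrapping a raw inotify descriptor in a GHC
    -- Handle.  Illustrative only; see the streamly source linked above
    -- for the real construction.
    import Foreign.C.Types (CInt (..))
    import GHC.IO.Handle.FD (fdToHandle)
    import System.IO (Handle)

    foreign import ccall unsafe "inotify_init"
        c_inotify_init :: IO CInt

    mkWatchHandle :: IO Handle
    mkWatchHandle = do
        fd <- c_inotify_init -- raw C descriptor for the inotify instance
        fdToHandle fd        -- wrap it in a Handle for reading events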
I don't have the cycles to isolate the problem. I still suspect that your code is somehow directly closing file descriptors associated with a Handle. This orphans the associated logical reader/writer lock, which is then inherited by the next incarnation of the same (dev, ino) pair.

However, if the filesystem underlying "/tmp" were actually "tmpfs", inode reuse would be quite unlikely, because tmpfs inodes are assigned from a strictly incrementing counter:

    $ for i in {1..10}; do touch /tmp/foobar; ls -i /tmp/foobar; rm /tmp/foobar; done
    3830 /tmp/foobar
    3831 /tmp/foobar
    3832 /tmp/foobar
    3833 /tmp/foobar
    3834 /tmp/foobar
    3835 /tmp/foobar
    3836 /tmp/foobar
    3837 /tmp/foobar
    3838 /tmp/foobar
    3839 /tmp/foobar

But IIRC you mentioned that on GitHub "/tmp" is ext4, not "tmpfs" (perhaps RAM-backed storage is a scarcer resource there), in which case inode reuse is indeed quite likely:

    $ for i in {1..10}; do touch /var/tmp/foobar; ls -i /var/tmp/foobar; rm /var/tmp/foobar; done
    25854141 /var/tmp/foobar
    25854142 /var/tmp/foobar
    25854141 /var/tmp/foobar
    25854142 /var/tmp/foobar
    25854141 /var/tmp/foobar
    25854142 /var/tmp/foobar
    25854141 /var/tmp/foobar
    25854142 /var/tmp/foobar
    25854141 /var/tmp/foobar
    25854142 /var/tmp/foobar

Since a normal open/close of a Handle acquires the lock after open and releases it before close, the evidence points to a bypass of the normal open-file lifecycle. Your codebase contains a fair amount of custom file-management logic, which could be the source of the problem.

To find the problem code path, you'd probably need to instrument the RTS lock/unlock code to log its activity: the (mode, descriptor, dev, ino) tuples being added and removed. You'd also want to strace the execution so that you can identify descriptor open and close events. Ideally the problem will be reproducible even under strace.

Good luck.

-- 
    Viktor.
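P.S. For concreteness, here is an untested sketch of the kind of lifecycle bypass I have in mind (the file name and the direct close() import are illustrative, not taken from your code):

    {-# LANGUAGE ForeignFunctionInterface #-}

    -- Untested sketch.  openFile registers a writer lock keyed on the
    -- file's (dev, ino) pair in the RTS, and hClose releases it.  A
    -- direct close(2) behind the Handle's back skips the release, so
    -- the lock entry is orphaned.
    import Foreign.C.Types (CInt (..))
    import GHC.IO.FD (FD (..))
    import GHC.IO.Handle.FD (handleToFd)
    import System.IO

    foreign import ccall unsafe "close"
        c_close :: CInt -> IO CInt

    main :: IO ()
    main = do
        h <- openFile "/tmp/victim" WriteMode -- RTS writer lock acquired
        fd <- handleToFd h                    -- the Handle's raw descriptor
        _ <- c_close (fdFD fd)                -- closed directly: lock never released
        -- If the (dev, ino) pair of "/tmp/victim" is later reused by a
        -- new file (likely on ext4, per the experiment above), a
        -- subsequent openFile of that file can fail with
        -- "resource busy (file is locked)".
        return ()

The specific code path in your tree is presumably different; the point is that any path on which close(2) runs without the matching unlock leaves such an orphaned lock behind.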