Generating, serving, and deleting temporary files with Wai/Yesod

Hello all, I've been experimenting with generating graphs from log files using timeplot and serving them with yesod. Each generated graph is only served once, so I need a way to clean them up. A cron job would be ok, but it's one more thing I'd have to remember to do, and I'm lazy. Instead, I'd like to pass a "clean up action" to the enumerator sending the file, to be performed after EOF is reached. For most files, this would just close the handle. For temp files, it would also delete the file. Here's the gist of it: http://gist.github.com/640902 The first revision defines a monolithic fromTempFile :: FilePath -> Enumerator. The second imports a factored version from Network.Wai.Handler (see my wai commit: http://github.com/softmechanics/wai/commit/2cb5ea47ca458f4b91a73e01a1dd0a29e...). Not sure wai is the place for it. Perhaps wai-extra would be a better? Does this seem like a good approach? Regards, -matt

On Fri, Oct 22, 2010 at 8:21 PM, Matt Brown
Hello all,
I've been experimenting with generating graphs from log files using timeplot and serving them with yesod. Each generated graph is only served once, so I need a way to clean them up. A cron job would be ok, but it's one more thing I'd have to remember to do, and I'm lazy.
Instead, I'd like to pass a "clean up action" to the enumerator sending the file, to be performed after EOF is reached. For most files, this would just close the handle. For temp files, it would also delete the file. Here's the gist of it: http://gist.github.com/640902
The first revision defines a monolithic fromTempFile :: FilePath -> Enumerator. The second imports a factored version from Network.Wai.Handler (see my wai commit: http://github.com/softmechanics/wai/commit/2cb5ea47ca458f4b91a73e01a1dd0a29e...). Not sure wai is the place for it. Perhaps wai-extra would be a better?
Does this seem like a good approach?
This does seem like the right approach. Just as fair warning: using an enumerator for serving files loses out on the possible optimization of a sendfile system call. Another possibility may be to keep a pool of file names to use as temporary files and keep rotating their usage. But I *do* think your approach is the correct one. Regarding the API changes to wai: they seem reasonable, and I'd probably accept them. Some concerns about the implementation: * Instead of fromHandle' and fromFile', perhaps fromHandleFinally and fromFileFinally? * In fromFile', why are you calling hClose? withBinaryFile automatically closes the file handle. Are you worried about the temporary file getting deleted while the file handle is still open? * In fromHandle', onEOF won't get called if the iteratee returns Left. Is this intended behavior? * Your code won't work in the presence of exceptions. Ideally the body of fromHandle' would be wrapped with a call to finally. This point is related to the previous one. Michael

Michael Snoyman
This does seem like the right approach. Just as fair warning: using an enumerator for serving files loses out on the possible optimization of a sendfile system call. Another possibility may be to keep a pool of file names to use as temporary files and keep rotating their usage. But I *do* think your approach is the correct one.
On unixy systems the "correct" approach traditionally would be to open
the file, then unlink it before you send the data out the socket. When
the file descriptor is closed, the link count goes to zero and the file
goes away. Obviously this isn't portable though.
G
--
Gregory Collins

On Sat, Oct 23, 2010 at 11:04 PM, Gregory Collins
Michael Snoyman
writes: This does seem like the right approach. Just as fair warning: using an enumerator for serving files loses out on the possible optimization of a sendfile system call. Another possibility may be to keep a pool of file names to use as temporary files and keep rotating their usage. But I *do* think your approach is the correct one.
On unixy systems the "correct" approach traditionally would be to open the file, then unlink it before you send the data out the socket. When the file descriptor is closed, the link count goes to zero and the file goes away. Obviously this isn't portable though.
Besides portability, it doesn't fit in with the WAI request/response timeline. In WAI, the server won't start sending the response until the application returns it. Matt got around this problem by using the ResponseEnumerator constructor, which forces the server to run arbitrary code, but there's no such mechanism for the ResponseFile constructor. So I suppose two other possible approaches might be: * Extend the ResponseFile constructor to have a cleanup function. I don't really like this, as I don't think it's generally necessary. * Fork a thread just before returning the ResponseFile that sleeps for some amount of time (5 seconds should be *more* than ample) and then deletes the file. However, over all, I think Matt has the correct approach given the constraints involved. Michael

Thanks for the feedback. I reverted my last commit and pushed a
rewrite, trying to follow your advice. Seems much cleaner now; what
do you think?
http://github.com/softmechanics/wai/commit/b37747ab9e2fd031512661ea57dc52c96...
-matt
On Sat, Oct 23, 2010 at 10:54 AM, Michael Snoyman
On Fri, Oct 22, 2010 at 8:21 PM, Matt Brown
wrote: Hello all,
I've been experimenting with generating graphs from log files using timeplot and serving them with yesod. Each generated graph is only served once, so I need a way to clean them up. A cron job would be ok, but it's one more thing I'd have to remember to do, and I'm lazy.
Instead, I'd like to pass a "clean up action" to the enumerator sending the file, to be performed after EOF is reached. For most files, this would just close the handle. For temp files, it would also delete the file. Here's the gist of it: http://gist.github.com/640902
The first revision defines a monolithic fromTempFile :: FilePath -> Enumerator. The second imports a factored version from Network.Wai.Handler (see my wai commit: http://github.com/softmechanics/wai/commit/2cb5ea47ca458f4b91a73e01a1dd0a29e...). Not sure wai is the place for it. Perhaps wai-extra would be a better?
Does this seem like a good approach?
This does seem like the right approach. Just as fair warning: using an enumerator for serving files loses out on the possible optimization of a sendfile system call. Another possibility may be to keep a pool of file names to use as temporary files and keep rotating their usage. But I *do* think your approach is the correct one.
Regarding the API changes to wai: they seem reasonable, and I'd probably accept them. Some concerns about the implementation:
* Instead of fromHandle' and fromFile', perhaps fromHandleFinally and fromFileFinally? * In fromFile', why are you calling hClose? withBinaryFile automatically closes the file handle. Are you worried about the temporary file getting deleted while the file handle is still open? * In fromHandle', onEOF won't get called if the iteratee returns Left. Is this intended behavior? * Your code won't work in the presence of exceptions. Ideally the body of fromHandle' would be wrapped with a call to finally. This point is related to the previous one.
Michael

I think it's great; I've just released wai 0.2.2 with those code changes.
Thanks,
Michael
On Sat, Oct 23, 2010 at 11:56 PM, Matt Brown
Thanks for the feedback. I reverted my last commit and pushed a rewrite, trying to follow your advice. Seems much cleaner now; what do you think?
http://github.com/softmechanics/wai/commit/b37747ab9e2fd031512661ea57dc52c96...
-matt
On Sat, Oct 23, 2010 at 10:54 AM, Michael Snoyman
wrote: On Fri, Oct 22, 2010 at 8:21 PM, Matt Brown
wrote: Hello all,
I've been experimenting with generating graphs from log files using timeplot and serving them with yesod. Each generated graph is only served once, so I need a way to clean them up. A cron job would be ok, but it's one more thing I'd have to remember to do, and I'm lazy.
Instead, I'd like to pass a "clean up action" to the enumerator sending the file, to be performed after EOF is reached. For most files, this would just close the handle. For temp files, it would also delete the file. Here's the gist of it: http://gist.github.com/640902
The first revision defines a monolithic fromTempFile :: FilePath -> Enumerator. The second imports a factored version from Network.Wai.Handler (see my wai commit: http://github.com/softmechanics/wai/commit/2cb5ea47ca458f4b91a73e01a1dd0a29e...). Not sure wai is the place for it. Perhaps wai-extra would be a better?
Does this seem like a good approach?
This does seem like the right approach. Just as fair warning: using an enumerator for serving files loses out on the possible optimization of a sendfile system call. Another possibility may be to keep a pool of file names to use as temporary files and keep rotating their usage. But I *do* think your approach is the correct one.
Regarding the API changes to wai: they seem reasonable, and I'd probably accept them. Some concerns about the implementation:
* Instead of fromHandle' and fromFile', perhaps fromHandleFinally and fromFileFinally? * In fromFile', why are you calling hClose? withBinaryFile automatically closes the file handle. Are you worried about the temporary file getting deleted while the file handle is still open? * In fromHandle', onEOF won't get called if the iteratee returns Left. Is this intended behavior? * Your code won't work in the presence of exceptions. Ideally the body of fromHandle' would be wrapped with a call to finally. This point is related to the previous one.
Michael
participants (3)
-
Gregory Collins
-
Matt Brown
-
Michael Snoyman