
Hi,
We were discussing how to consume the eventlog incrementally on #ghc today. Would it be feasible to offer an API (e.g. GHC.EventLog) that allows programs to register for events that occur in the program? Programs would register listeners like so:
    registerEventListener (\ event -> doSomethingWith event)
The current write-to-file system could be implemented in terms of this API. We could even allow event logging to be turned on/off from within the application, allowing developers to attach to a running application, enable event logging, diagnose the app, and then turn off event logging again.
The RTS would invoke listeners every time a new event is written. This design has many benefits:
- We don't need to introduce the serialization, deserialization, and I/O overhead of first writing the eventlog to file and then parsing it again.
- Programs could monitor themselves and provide debug output (e.g. via some UI component).
- Users could write code that redirects the output elsewhere, e.g. to a socket for remote monitoring.
Would invoking a callback on each event add too big of an overhead? How about invoking the callback once every time the event buffer is full?
Johan
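
The proposal above can be modelled in ordinary Haskell to make its shape concrete. This is only a sketch under stated assumptions: no such API exists in GHC, and `Event`, `EventLog`, `newEventLog`, `setEventLogging`, `emitEvent` and `fileListener` are hypothetical names invented for illustration. `emitEvent` stands in for whatever the RTS would do when it writes an event.

```haskell
import Data.IORef (IORef, newIORef, readIORef, atomicModifyIORef')

-- Hypothetical event type; the real RTS emits binary records, not Haskell values.
data Event = Event { evTime :: Integer, evDesc :: String }
  deriving (Eq, Show)

type Listener = Event -> IO ()

-- Registry of listeners plus an on/off switch for event logging.
data EventLog = EventLog
  { elListeners :: IORef [Listener]
  , elEnabled   :: IORef Bool
  }

-- Logging starts switched off, with no listeners registered.
newEventLog :: IO EventLog
newEventLog = EventLog <$> newIORef [] <*> newIORef False

-- Add a listener; listeners run in registration order.
registerEventListener :: EventLog -> Listener -> IO ()
registerEventListener el l =
  atomicModifyIORef' (elListeners el) (\ls -> (ls ++ [l], ()))

-- Turn event delivery on or off at run time.
setEventLogging :: EventLog -> Bool -> IO ()
setEventLogging el on = atomicModifyIORef' (elEnabled el) (const (on, ()))

-- Stand-in for the RTS emitting an event: deliver it only while enabled.
emitEvent :: EventLog -> Event -> IO ()
emitEvent el ev = do
  on <- readIORef (elEnabled el)
  if on then readIORef (elListeners el) >>= mapM_ ($ ev) else pure ()

-- The write-to-file behaviour expressed as just another listener.
fileListener :: FilePath -> Listener
fileListener path ev =
  appendFile path (show (evTime ev) ++ " " ++ evDesc ev ++ "\n")
```

In this model, attaching to a running program amounts to calling `setEventLogging el True`, and detaching to calling it with `False`; events emitted while logging is off are simply dropped.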

Hello Johan,
I did the initial implementation of GHC.Eventlog. Sadly, I haven't
had time to work on it since starting a full-time job after
university. That being said, I am still interested in GHC and the
improvement of GHC.Eventlog. Hopefully soon, I will have the time to
do more development on GHC... hopefully. ;)
Anyway, from your description, I don't understand how a listener would
consume the eventlog incrementally?
I do think it would be useful to register listeners for events. I do
not think the invocation of a callback would be too much overhead,
rather the action the callback performs could be a very significant
overhead, such as sending eventlog data over a network connection.
But, if you are willing to accept the performance loss from the
callback's action to gain the event data then it seems worthwhile to
me.
I'm sure Simon M knows better than I do regarding this...
I look forward to hearing more. Thanks.
--
Donnie
_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

I'm very interested in what the best way to get incremental event data
from a running GHC process would be.
Looking at the code, we flush the event buffer fairly regularly, but
the event parser is currently strict.
So we'd need a lazy (or incremental) parser, that'll return a list of
successful event parses, then block. I suspect this mode would be
supported.
*My evil plan is to write a little monitoring web app that just
attaches to the event stream and renders it in a useful "heartbeat"
format*, but I need incremental parsing.
-- Don
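
The "return the successful parses, then block" behaviour can be sketched on a toy record format. Everything here is an assumption made for illustration: the framing (one length byte followed by that many payload bytes) is not the real eventlog format, and `Record`, `Step`, `parseAvailable` and `feed` are hypothetical names. The point is only that an incomplete trailing record is kept as leftover input rather than treated as a parse error.

```haskell
import qualified Data.ByteString as B

-- Toy framing, NOT the real eventlog format:
-- one length byte, then that many payload bytes.
newtype Record = Record B.ByteString
  deriving (Eq, Show)

-- Records parsed so far, plus the bytes that could not be consumed yet.
data Step = Step { parsed :: [Record], leftover :: B.ByteString }
  deriving (Eq, Show)

-- Parse as many complete records as the buffer holds. Stop (rather than
-- fail) at the first incomplete record so the caller can wait for more input.
parseAvailable :: B.ByteString -> Step
parseAvailable = go []
  where
    go acc buf = case B.uncons buf of
      Just (len, rest)
        | B.length rest >= fromIntegral len ->
            let (payload, rest') = B.splitAt (fromIntegral len) rest
            in go (Record payload : acc) rest'
      _ -> Step (reverse acc) buf

-- Feed a newly arrived chunk, prepending whatever was left over last time.
feed :: B.ByteString -> B.ByteString -> Step
feed pending chunk = parseAvailable (B.append pending chunk)
```

A consumer would loop: read a chunk from the event stream (blocking until one arrives), call `feed` with the previous `leftover`, handle the `parsed` records, and repeat.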

On Fri, Apr 29, 2011 at 12:00 AM, Don Stewart wrote:
> I'm very interested in what the best way to get incremental event data from a running GHC process would be.
> Looking at the code, we flush the event buffer fairly regularly, but the event parser is currently strict.
> So we'd need a lazy (or incremental) parser, that'll return a list of successful event parses, then block. I suspect this mode would be supported.
> *My evil plan is to write a little monitoring web app that just attaches to the event stream and renders it in a useful "heartbeat" format*, but I need incremental parsing.
A less general solution might be to have the program itself start a little web server on some port and use the API I proposed to serve JSON data with the aggregate statistics you care about. Example:
    main = do
        eventData <- newIORef
        server <- serveOn 8080 $ \ _req -> readIORef eventData >>= sendResponse eventData
        registerEventListener $ \ ev -> updateEventData eventData ev
        runNormalProgram
You can wrap the creation of the webserver in a little helper function and make any program "monitorable" simply by doing
    main = withMonitoring runApp
withMonitoring would take care of starting/stopping the webserver and processing events.
Just a thought.
Johan
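
One way the `withMonitoring` idea might be fleshed out is sketched below. This is not an existing library: the `Event` constructors, the `Stats` record, `statsJson`, and the signature of `withMonitoring` are all assumptions, and the "web server" is reduced to a pair of start/stop actions passed in by the caller. Because no RTS hook exists, the application is handed an `emit` function instead of events arriving by themselves.

```haskell
import Control.Exception (bracket_)
import Data.IORef (IORef, newIORef, readIORef, atomicModifyIORef')

-- Hypothetical event kinds; stand-ins for whatever the RTS would report.
data Event = ThreadCreated | ThreadFinished | GcDone
  deriving (Eq, Show)

-- Aggregate statistics a monitoring endpoint could serve.
data Stats = Stats { threadsLive :: !Int, gcs :: !Int }
  deriving (Eq, Show)

-- Fold one event into the shared statistics.
updateEventData :: IORef Stats -> Event -> IO ()
updateEventData ref ev = atomicModifyIORef' ref (\s -> (step s, ()))
  where
    step s = case ev of
      ThreadCreated  -> s { threadsLive = threadsLive s + 1 }
      ThreadFinished -> s { threadsLive = threadsLive s - 1 }
      GcDone         -> s { gcs = gcs s + 1 }

-- Hand-rolled JSON rendering, to keep the sketch free of dependencies.
statsJson :: Stats -> String
statsJson s =
  "{\"threadsLive\":" ++ show (threadsLive s) ++ ",\"gcs\":" ++ show (gcs s) ++ "}"

-- Start the (caller-supplied) server, run the app with an event sink,
-- and always stop the server afterwards, even if the app throws.
withMonitoring
  :: (IORef Stats -> IO ())       -- start serving the statistics
  -> (IORef Stats -> IO ())       -- stop serving
  -> ((Event -> IO ()) -> IO a)   -- the application, given an event sink
  -> IO a
withMonitoring startServer stopServer app = do
  ref <- newIORef (Stats 0 0)
  bracket_ (startServer ref) (stopServer ref) (app (updateEventData ref))
```

Using `bracket_` is the design choice worth noting: the monitoring endpoint is torn down on both normal exit and exceptions, which is what "taking care of starting/stopping the webserver" requires.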

I've put a library for incremental parsing of the event log here:
http://code.haskell.org/~dons/code/ghc-events-stream/
The goal is to implement something like:
http://www.erlang.org/doc/man/heart.html

On Thu, Apr 28, 2011 at 3:00 PM, Don Stewart wrote:
> So we'd need a lazy (or incremental) parser, that'll return a list of successful event parses, then block. I suspect this mode would be supported.
A while ago, I hacked something together on top of the current eventlog parser that would consume an event at a time, and record the seek offset of each successful parse. If parsing failed (due to unflushed data), it would try again later. I think I might even claim that this is a somewhat sensible and parsimonious approach, but I'm drinking wine right now, so my judgment might be impaired.
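
The offset-and-retry approach described above can be sketched as follows. This is not the code the message refers to. It is an illustration on an assumed toy framing (one length byte, then the payload), with the growing file modelled as an in-memory `ByteString`; `parseAt` and `drain` are hypothetical names.

```haskell
import qualified Data.ByteString as B

-- Toy framing (an assumption, not the eventlog format): length byte + payload.
-- Try to parse exactly one record starting at offset 'off'. Nothing means the
-- record is not fully flushed yet; the caller keeps 'off' and retries later.
parseAt :: B.ByteString -> Int -> Maybe (B.ByteString, Int)
parseAt file off = do
  (len, rest) <- B.uncons (B.drop off file)
  let n = fromIntegral len
  if B.length rest >= n
    then Just (B.take n rest, off + 1 + n)
    else Nothing

-- Consume every record available from 'off'. Returns the records together
-- with the offset of the first byte that has not been parsed successfully.
drain :: B.ByteString -> Int -> ([B.ByteString], Int)
drain file off = case parseAt file off of
  Nothing        -> ([], off)
  Just (r, off') -> let (rs, end) = drain file off' in (r : rs, end)
```

After each pass the caller stores the returned offset, sleeps or waits for the file to grow, re-reads, and calls `drain` again from that offset, so a half-written record costs only a retry.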

I managed to build one on top of attoparsec's lazy parser that "seems
to work" -- but I'd like ghc to flush a bit more regularly so I could
test it better.
-- Don
On Sun, May 1, 2011 at 7:59 PM, Bryan O'Sullivan wrote:
> On Thu, Apr 28, 2011 at 3:00 PM, Don Stewart wrote:
>> So we'd need a lazy (or incremental) parser, that'll return a list of successful event parses, then block. I suspect this mode would be supported.
> A while ago, I hacked something together on top of the current eventlog parser that would consume an event at a time, and record the seek offset of each successful parse. If parsing failed (due to unflushed data), it would try again later. I think I might even claim that this is a somewhat sensible and parsimonious approach, but I'm drinking wine right now, so my judgment might be impaired.

On Thu, Apr 28, 2011 at 11:53 PM, Donnie Jones wrote:
> Anyway, from your description, I don't understand how a listener would consume the eventlog incrementally?
I simply meant that I want to be able to register listeners for events instead of having to parse the eventlog file after the fact.
> I do think it would be useful to register listeners for events. I do not think the invocation of a callback would be too much overhead, rather the action the callback performs could be a very significant overhead, such as sending eventlog data over a network connection. But, if you are willing to accept the performance loss from the callback's action to gain the event data then it seems worthwhile to me.
A typical use of the callback would be to update some internal data structure of the program itself, thereby making the program self-monitoring. I've been toying with introducing log levels to the eventlog command line API so the consumer of the event log can specify the number of events it would like to receive. We could do something similar for the API, e.g.
    registerEventListener (schedEvents .|. ioManagerEvents) (\ e -> ...)
Johan
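
The class-mask idea can be illustrated with plain bit flags. This is a sketch under assumptions: `EventMask`, the numeric values of `schedEvents` and `ioManagerEvents`, the extra `gcEvents` class, and the helpers `wanted` and `filtered` are all invented here and do not correspond to anything in the RTS.

```haskell
import Data.Bits ((.|.), (.&.))
import Data.Word (Word32)

-- Hypothetical event classes encoded as bit flags in a mask.
type EventMask = Word32

schedEvents, ioManagerEvents, gcEvents :: EventMask
schedEvents     = 1
ioManagerEvents = 2
gcEvents        = 4   -- not mentioned in the thread; added to show filtering

-- Hypothetical event carrying the class it belongs to.
data Event = Event { evClass :: EventMask, evDesc :: String }
  deriving (Eq, Show)

-- True when the event's class is among those the listener asked for.
wanted :: EventMask -> Event -> Bool
wanted mask ev = evClass ev .&. mask /= 0

-- Wrap a listener so that it only sees events in the requested classes.
filtered :: EventMask -> (Event -> IO ()) -> Event -> IO ()
filtered mask listener ev
  | wanted mask ev = listener ev
  | otherwise      = pure ()
```

With these definitions, `filtered (schedEvents .|. ioManagerEvents) listener` passes scheduler and I/O manager events through to `listener` and silently drops the rest.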

On Thu, 2011-04-28 at 23:31 +0200, Johan Tibell wrote:
> The RTS would invoke listeners every time a new event is written. This design has many benefits:
> - We don't need to introduce the serialization, deserialization, and I/O overhead of first writing the eventlog to file and then parsing it again.
The events are basically generated in serialised form (via C code that writes them directly into the event buffer). They never exist as Haskell data structures, or even C structures.
> - Programs could monitor themselves and provide debug output (e.g. via some UI component).
> - Users could write code that redirects the output elsewhere e.g. to a socket for remote monitoring.
> Would invoking a callback on each event add too big of an overhead?
Yes, by orders of magnitude. In fact it's impossible because the act of invoking the callback would generate more events... :-)
> How about invoking the callback once every time the event buffer is full?
That's much more realistic. Still, do we need the generality of pushing the event buffers through the Haskell code? For some reason it makes me slightly nervous. How about just setting which output FD the event buffers get written to.
Turning all events or various classes of events on/off at runtime should be doable. The design already supports multiple classes, though currently it just has one class (the 'scheduler' class). The current design does not support fine grained filtering at the point of event generation.
Those two features combined (plus control over the frequency of event buffer flushing) would be enough to implement a monitoring socket interface (web http or local unix domain socket).
Making the parser in the ghc-events package incremental would be sensible and quite doable as people have already demonstrated.
Duncan
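
The difference between a callback per event and a callback per full buffer can be shown with a small model. This is a sketch only: the real event buffers live in C inside the RTS and hold serialised bytes, whereas `Buffer`, `newBuffer`, `post` and `flush` below are hypothetical Haskell stand-ins, and the code is single-threaded for simplicity.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- A bounded buffer of pending events with a sink that runs once per batch.
data Buffer e = Buffer
  { capacity :: Int
  , pending  :: IORef [e]       -- newest event first
  , sink     :: [e] -> IO ()    -- receives a batch, oldest event first
  }

newBuffer :: Int -> ([e] -> IO ()) -> IO (Buffer e)
newBuffer cap k = do
  ref <- newIORef []
  pure (Buffer cap ref k)

-- Hand the buffered events to the sink and start over. Calling this on a
-- timer would give control over how often data becomes visible downstream.
flush :: Buffer e -> IO ()
flush b = do
  es <- readIORef (pending b)
  if null es then pure () else do
    writeIORef (pending b) []
    sink b (reverse es)

-- Append one event; the sink is invoked only when the buffer fills up.
post :: Buffer e -> e -> IO ()
post b e = do
  es <- readIORef (pending b)
  let es' = e : es
  writeIORef (pending b) es'
  if length es' >= capacity b then flush b else pure ()
```

Here the sink could equally be "write these bytes to a file descriptor", which is the less general alternative suggested above; the batching logic is the same either way.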

I've got a proof of concept event-log monitoring server and
incremental parser for event streams:
* http://code.haskell.org/~dons/code/ghc-events-stream/
* http://code.haskell.org/~dons/code/ghc-monitor/
Little screen shot of the snap server running, watching a Haskell
process' eventlog fifo:
* http://i.imgur.com/Xfr9I.png
The main issue at the moment is that GHC is irregular in scheduling
flushing of the event log stream, so it might be hours or days before
you see any activity. This isn't useful for heartbeat style
monitoring.
Also, we need to break out a bit of ThreadScope to give access to its
analytics (e.g. rendering time series).
-- Don
participants (5)
- Bryan O'Sullivan
- Don Stewart
- Donnie Jones
- Duncan Coutts
- Johan Tibell