garbage collector woes

I'm at my wits' end with respect to GHC's garbage collector and would very much appreciate a code review of my MySQL driver for HDBC, which is here: <http://www.maubi.net/~waterson/REPO/HDBC-mysql/Database/HDBC/MySQL/Connectio...
In particular, the problem that I'm having is that my "statements" (really, just iterators over a SQL query result set) are getting garbage collected prematurely. I've included the source for the relevant functions below. The first is newStatement, which creates a new HDBC Statement record (an elided version, below):
    newStatement :: ForeignPtr MYSQL -> String -> IO Types.Statement
    newStatement mysql__ query = withForeignPtr mysql__ $ \mysql_ -> do
      stmt_ <- mysql_stmt_init mysql_
      ...
      stmt__ <- newForeignPtr mysql_stmt_close stmt_
      ...
      return $ Types.Statement
                 { ...
                 , Types.fetchRow = fetchRow mysql__ stmt__ results
                 }
The mysql_stmt_init function is the MySQL C API that creates a new statement record. Its result must be freed by mysql_stmt_close, and therefore is wrapped with a ForeignPtr appropriately. The fetchRow function is part of the HDBC Statement API, and its signature is:
fetchRow :: IO (Maybe [SqlValue])
I.e., it retrieves the next row from the result set and returns the column values as a list. The "results" parameter is not relevant to this discussion: it maintains the buffers used for storage of the column values. The next function of interest is my implementation of fetchRow, copied in its entirety, below:
    fetchRow :: ForeignPtr MYSQL -> ForeignPtr MYSQL_STMT -> [MYSQL_BIND] -> IO (Maybe [Types.SqlValue])
    fetchRow mysql__ stmt__ results =
      withForeignPtr mysql__ $ \_ ->
        withForeignPtr stmt__ $ \stmt_ -> do
          rv <- mysql_stmt_fetch stmt_
          case rv of
            0                             -> row
            #{const MYSQL_DATA_TRUNCATED} -> row
            #{const MYSQL_NO_DATA}        -> return Nothing
            _                             -> statementError stmt_
      where row = mapM cellValue results >>= \cells -> return $ Just cells
It unwraps the ForeignPtrs and invokes the mysql_stmt_fetch C API with the native MYSQL_STMT structure, causing the column values for the next row to get bound into storage areas that we can convert to Haskell values. Typically, this code gets invoked in the generic HDBC layer by the fetchAllRows routine, which looks like this:
    fetchAllRows :: Statement -> IO [[SqlValue]]
    fetchAllRows sth = unsafeInterleaveIO $ do
      row <- fetchRow sth
      case row of
        Nothing -> return []
        Just x  -> do remainder <- fetchAllRows sth
                      return (x : remainder)
The sth argument is simply an instance of the Statement record that I've returned from the low-level MySQL driver. This all works, except... The behavior that I'm seeing is that the ForeignPtr to the native MYSQL_STMT struct becomes unreachable after some number of iterations of the above loop. This causes mysql_stmt_close to be invoked, and subsequent calls to mysql_stmt_fetch fail with an error. Curiously, if I replace
stmt__ <- newForeignPtr mysql_stmt_close stmt_
with
stmt__ <- newForeignPtr_ stmt_
in my newStatement routine (i.e., so that mysql_stmt_close is never called), then fetchAllRows will merrily fetch all the rows from my statement handle. The down side, of course, being that I'll leak statement handles.
So, I'm completely puzzled as to how I can continue to extract values from the apparently unreachable pointer! Using System.Mem.Weak, I've verified that:
* The Statement record that I return from newStatement becomes unreachable.
* The applied value of fetchRow becomes unreachable; e.g.:
      let r = fetchRow mysql__ stmt__ results
      addFinalizer r $ putStrLn "r is unreachable!"
I'm not sure how this can be, given that fetchAllRows appears to be explicitly maintaining a reference to sth, which in turn ought to reference my fetchRow application.
I've created a test program that directly invokes my driver code the same way that the generic HDBC layer does, and have verified that the lack of "-fno-cse" with unsafeInterleaveIO in the generic HDBC module is not relevant. I was similarly able to verify that this problem occurs with "-O0", so it's not due to any particular GHC optimization.
Any thoughts on how to debug this would be greatly appreciated!
Thanks in advance,
chris
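For readers who want to try the same kind of reachability check, here is a minimal sketch. It is not the code from the driver: it swaps in mkWeakIORef on an IORef instead of addFinalizer on an ordinary closure, because the System.Mem.Weak documentation cautions that finalizers attached to ordinary Haskell values may run earlier than expected. The name reachableAfterGC is hypothetical.

```haskell
import Data.IORef (mkWeakIORef, newIORef, readIORef)
import Data.Maybe (isJust)
import System.Mem (performGC)
import System.Mem.Weak (deRefWeak)

-- Hypothetical helper: attach a weak pointer (with a finalizer) to an
-- IORef, force a major GC, and report whether the IORef is still alive.
reachableAfterGC :: IO Bool
reachableAfterGC = do
  ref  <- newIORef (0 :: Int)
  weak <- mkWeakIORef ref (putStrLn "ref is unreachable!")
  performGC
  alive <- deRefWeak weak   -- Nothing once the key has been collected
  _ <- readIORef ref        -- this later use keeps ref reachable across the GC
  return (isJust alive)
```

Because ref is still used after performGC, the check should report that it is alive. A finalizer message for a plain closure, as in the addFinalizer experiment above, is weaker evidence, since the compiler is free to duplicate or discard such values.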

On Feb 17, 2009, at 12:22 PM, Chris Waterson wrote:
I'm at my wits' end with respect to GHC's garbage collector and would very much appreciate a code review of my MySQL driver for HDBC, which is here:
<http://www.maubi.net/~waterson/REPO/HDBC-mysql/Database/HDBC/MySQL/Connectio...
In particular, the problem that I'm having is that my "statements" (really, just iterators over a SQL query result set) are getting garbage collected prematurely.
So (*blush*), my woes turned out to be my misunderstanding of the MySQL C API, which I have now come to terms with. I apologize for the noise here. chris

waterson:
So (*blush*), my woes turned out to be my misunderstanding of the MySQL C API, which I have now come to terms with. I apologize for the noise here.
Is the solution written up somewhere so we can point to that next time? :)

On Feb 19, 2009, at 3:41 PM, Don Stewart wrote:
Is the solution written up somewhere so we can point to that next time? :)
Well, the applicability to the community at large is probably minimal, but my misadventure follows...
The MySQL C API has "statements" that are associated with a database "connection". You connect to the database, and issue statements to query and manipulate it. The statement encapsulates, basically, the state of iteration through a result set.
It turns out that a connection allows only one statement to be active at a time, and that "closing" any statement associated with a connection appears to close all other statements associated with that connection, too.
I wrap the MySQL "statement" in a ForeignPtr whose finalizer closes the statement. That finalizer, as it turns out, would close the *next* statement that I'd created on the connection as a side effect. I was incorrectly interpreting this as the next statement being garbage collected prematurely.
I tried to mitigate this surprising effect by 1) making sure that a statement gets finalized as soon as its result set is exhausted, and 2) adding some warnings to the driver docs about this wonderful feature.
chris
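Mitigation 1) can be sketched roughly as follows. This is not the driver's code: MockStmt, newMockStmt, fetchRowEager, fetchAllEager and demo are hypothetical names, and a malloc'd counter stands in for the MYSQL_STMT handle so that the example runs without MySQL. The idea is to call finalizeForeignPtr as soon as the fetch reports no more data, so the close happens at a known point rather than whenever the garbage collector gets around to it.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)
import Foreign.Concurrent (newForeignPtr)
import Foreign.ForeignPtr (ForeignPtr, finalizeForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (free, malloc)
import Foreign.Storable (peek, poke)

-- Hypothetical stand-in for a statement handle: a malloc'd counter of
-- rows remaining, plus a flag recording that the handle has been closed.
data MockStmt = MockStmt
  { stmtPtr  :: ForeignPtr Int
  , stmtDone :: IORef Bool
  }

-- The finalizer plays the role of mysql_stmt_close: it counts the close
-- and frees the handle.
newMockStmt :: IORef Int -> Int -> IO MockStmt
newMockStmt closes n = do
  p <- malloc
  poke p n
  fp <- newForeignPtr p (modifyIORef' closes (+ 1) >> free p)
  done <- newIORef False
  return (MockStmt fp done)

-- Fetch one row; on exhaustion, finalize the handle immediately and
-- remember that it is closed so later calls never touch freed memory.
fetchRowEager :: MockStmt -> IO (Maybe Int)
fetchRowEager (MockStmt fp done) = do
  finished <- readIORef done
  if finished
    then return Nothing
    else do
      row <- withForeignPtr fp $ \p -> do
        n <- peek p
        if n <= 0
          then return Nothing
          else poke p (n - 1) >> return (Just n)
      case row of
        Just _  -> return row
        Nothing -> do
          writeIORef done True
          finalizeForeignPtr fp
          return Nothing

fetchAllEager :: MockStmt -> IO [Int]
fetchAllEager stmt = do
  row <- fetchRowEager stmt
  case row of
    Nothing -> return []
    Just x  -> (x :) <$> fetchAllEager stmt

-- Returns the rows fetched and how many times the "close" ran.
demo :: IO ([Int], Int)
demo = do
  closes <- newIORef 0
  stmt   <- newMockStmt closes 3
  rows   <- fetchAllEager stmt
  _      <- fetchRowEager stmt   -- safe: the statement is already marked done
  n      <- readIORef closes
  return (rows, n)
```

In this sketch the finalizer runs exactly once, at the moment the result set is exhausted, and further fetches simply return Nothing. It does not help a caller who abandons a statement partway through its result set; that case still relies on the garbage collector, which is presumably why the warnings in 2) are needed as well.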