
#13298: Compact API design improvements -------------------------------------+------------------------------------- Reporter: ezyang | Owner: ezyang Type: feature | Status: new request | Priority: normal | Milestone: 8.2.1 Component: | Version: 8.0.1 libraries/compact | Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: None/Unknown Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- I took a look at the compact API today and I realized that there are a lot of improvements we should make to it: 1. `fmap getCompact . compact` is a pretty common thing to do, if you don't actually care about the compact pointer. We should make a helper function for this. 2. The `SerializedCompact` data structure needs a bunch of instances. Especially, a user who wants to serialize the compact region needs to save this metadata somewhere, but we don't offer any help for doing this. 3. `importCompact` will always print a message to stderr saying pointer fixup is happening. Need to be able to suppress this message. 4. The serialization API is really unsafe; we should make it more safe by default by including some sort of fingerprint, at least. 5. There should be a convenience function for serializing to and from a file, and serializing to and from a handle. RDMA is always going to be delicate business (so keep the old API around) but for these cases we should pave the cowpaths. Here is some sample code that might help: {{{ {-# LANGUAGE ScopedTypeVariables #-} import System.Environment (getArgs) import qualified Data.Set as Set import System.IO import Data.Compact import Data.Compact.Serialized import Foreign.Ptr import Foreign.Storable import Foreign.Marshal.Alloc import Control.Monad main = do [dict_file, out_file] <- getArgs dict <- readFileLatin1 dict_file c <- compact (Set.fromList (words dict)) withBinaryFile out_file WriteMode $ \h -> withSerializedCompact c $ \sc -> do -- Write out the metadata header hPutStorable h (serializedCompactRoot sc) forM_ (serializedCompactBlockList sc) $ \(ptr, l) -> do hPutStorable h ptr hPutStorable h l hPutStorable h nullPtr -- Write out the payload forM_ (serializedCompactBlockList sc) $ \(ptr, l) -> hPutBuf h ptr (fromIntegral l) mb_r <- withBinaryFile out_file ReadMode $ \h -> do -- Read out the metadata header root <- hGetStorable h let go h xs = do ptr <- hGetStorable h if ptr == nullPtr then return (reverse xs) else do l <- hGetStorable h go h ((ptr, l):xs) blocks <- go h [] let sc = SerializedCompact { serializedCompactBlockList = blocks, serializedCompactRoot = root } -- Read the payload into memory importCompact sc $ \ptr l -> void $ hGetBuf h ptr (fromIntegral l) print (fmap getCompact mb_r == Just (getCompact c)) hPutStorable :: forall a. Storable a => Handle -> a -> IO () hPutStorable h a = alloca $ \ptr -> do poke ptr a hPutBuf h ptr (sizeOf (undefined :: a)) hGetStorable :: forall a. Storable a => Handle -> IO a hGetStorable h = alloca $ \ptr -> do hGetBuf h ptr (sizeOf (undefined :: a)) peek ptr readFileLatin1 f = do h <- openFile f ReadMode hSetEncoding h latin1 hGetContents h }}} I'm happy to do these but I want to make sure I'm not stepping on FB's toes, also I don't know if bgamari wants to take patches along these lines so late in the release cycle. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13298 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler