
I have a large tarball I want to link into an executable as a ByteString. What is the best way to do this? I can convert the tarball into a haskell file, but I'm afraid ghc would take a long time to compile it. Is there any way to link constant data directly with ghc? If not, what's the most efficient way to code large ByteStrings for fast compilation?

On Fri, Jan 1, 2010 at 4:09 PM, Tom Hawkins wrote:
> I have a large tarball I want to link into an executable as a ByteString. What is the best way to do this? I can convert the tarball into a haskell file, but I'm afraid ghc would take a long time to compile it. Is there any way to link constant data directly with ghc? If not, what's the most efficient way to code large ByteStrings for fast compilation?
In the limit, you can convert it to an assembly file. Something like this, though I've done very little checking indeed of the syntax. Consider this to be pseudocode.

foo.s
====

    .global bytestring, bytestring_end
    .label bytestring
    .db 0x0c 0xdf 0xwhatever
    .label bytestring_end

foo.hs
===

    import Foreign
    import Data.ByteString.Internal

    foreign import ptr bytestring :: Ptr Word8
    foreign import ptr bytestring_end :: Ptr Word8

    yourString :: ByteString
    yourString = unsafePerformIO $ do
        fptr <- newForeignPtr_ bytestring
        return $ fromForeignPtr (fptr, bytestring_end `minusPtr` bytestring, 0)
        -- ^ If I got the foreignPtr parameter order right

Unfortunately Data.ByteString.Internal, though still exported, is no longer haddocked; this makes it hard to check the parameters. You should go look up the 6.10.1 version's documentation, which is still correct.

-- Svein Ove Aas
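On the parameter-order question: as far as I know, fromForeignPtr takes three curried arguments (pointer, then offset, then length) rather than a tuple. Here is a minimal sketch that checks this without any linked-in symbol. The names sampleBuffer, wholeBuffer and tailOfBuffer are made up for illustration, and the buffer is a run-time allocation standing in for the real data.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import Data.ByteString.Internal (fromForeignPtr)
import Data.Word (Word8)
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrBytes, withForeignPtr)
import Foreign.Storable (pokeByteOff)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical stand-in for the linked-in data: a 4-byte buffer
-- allocated at run time and filled with 10, 20, 30, 40.
sampleBuffer :: ForeignPtr Word8
sampleBuffer = unsafePerformIO $ do
  fptr <- mallocForeignPtrBytes 4
  withForeignPtr fptr $ \p ->
    mapM_ (\(i, b) -> pokeByteOff p i (b :: Word8))
          (zip [0 ..] [10, 20, 30, 40])
  return fptr
{-# NOINLINE sampleBuffer #-}

-- Argument order: pointer, offset, length.
wholeBuffer :: ByteString
wholeBuffer = fromForeignPtr sampleBuffer 0 4

-- A non-zero offset skips leading bytes without copying them.
tailOfBuffer :: ByteString
tailOfBuffer = fromForeignPtr sampleBuffer 1 3
```

In GHCi, B.unpack wholeBuffer and B.unpack tailOfBuffer show which bytes each value covers. With a linked-in symbol, the length argument would be the distance between the start and end labels.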

On Fri, Jan 1, 2010 at 8:11 PM, John Millikin wrote:
> On Fri, Jan 1, 2010 at 08:49, Svein Ove Aas wrote:
>> foo.hs
>> ===
>> foreign import ptr bytestring :: Ptr Word8
>> foreign import ptr bytestring_end :: Ptr Word8
>
> Is this valid syntax? I get a syntax error in 6.10.1, and I don't see it documented in the FFI report.

That's why I called it pseudocode. No, it's not valid syntax, though it wouldn't have overly surprised me if it were. You probably get the idea, though: importing a symbol instead of a function.

Actually reading the FFI report got me this, though:

    foreign import ccall "&" bytestring :: Ptr Word8
    foreign import ccall "&" bytestring_end :: Ptr Word8

-- Svein Ove Aas

Thanks, this worked great. Just a few seconds to link in a 5M tarball. Details:

test.s:

    .global test_data
    test_data:
    .byte 0
    .byte 1
    .byte 2
    ...

Foo.hs:

    import Foreign
    import Data.ByteString.Internal
    import Data.Word
    import System.IO.Unsafe

    foreign import ccall "&" test_data :: Ptr Word8

    test :: ByteString
    test = fromForeignPtr (unsafePerformIO (newForeignPtr_ test_data)) 0 lengthOfTestData

Compiled with:

    ghc --make -W -fglasgow-exts -o something test.s ...
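The step of turning the tarball into test.s is not shown above. One possible sketch of a generator follows; it is not the script actually used. It assumes GNU assembler syntax as in the message, and the names toAsm, chunksOf and writeAsm are hypothetical. It also emits an extra `<sym>_end` label so the length need not be hard-coded. Platform-specific details such as section directives or symbol prefixes are not handled.

```haskell
import qualified Data.ByteString as B
import Data.List (intercalate)

-- Render a ByteString as assembler source. 'sym' is the exported symbol
-- name; a matching '<sym>_end' label marks the end of the data.
toAsm :: String -> B.ByteString -> String
toAsm sym bytes =
  unlines $
    [".global " ++ sym, ".global " ++ endSym, sym ++ ":"]
      ++ map byteLine (chunksOf 16 (B.unpack bytes))
      ++ [endSym ++ ":"]
  where
    endSym = sym ++ "_end"
    -- One .byte directive per 16 bytes, values in decimal.
    byteLine ws = "  .byte " ++ intercalate ", " (map show ws)

-- Split a list into pieces of at most n elements.
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs = let (h, t) = splitAt n xs in h : chunksOf n t

-- Hypothetical driver: read the file 'src' and write assembly to 'dst'.
writeAsm :: String -> FilePath -> FilePath -> IO ()
writeAsm sym src dst = B.readFile src >>= writeFile dst . toAsm sym
```

With both labels imported via `foreign import ccall "&"`, the length can be computed as the end pointer `minusPtr` the start pointer, as in the first reply. Building a String is not the fastest way to write a multi-megabyte file, but it keeps the sketch short.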

On Fri, Jan 1, 2010 at 7:09 AM, Tom Hawkins wrote:
> I have a large tarball I want to link into an executable as a ByteString. What is the best way to do this? I can convert the tarball into a haskell file, but I'm afraid ghc would take a long time to compile it. Is there any way to link constant data directly with ghc? If not, what's the most efficient way to code large ByteStrings for fast compilation?
Possibly the simplest is to use unsafePackAddress or unsafePackAddressLen:

    {-# LANGUAGE MagicHash #-}
    module Const where

    import Data.ByteString.Unsafe as U
    import System.IO.Unsafe

    my_bstr = unsafePerformIO $ U.unsafePackAddress "abcdefg"#

This trick of embedding raw strings (of type Addr#) is how happy and alex store their parser lookup tables in the modules they generate. I haven't seen any performance issues with it myself.

Hope that helps,
-Judah
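For binary data such as a tarball, unsafePackAddressLen is probably the safer of the two. As I understand it, unsafePackAddress finds the length by looking for a terminating NUL, so embedded zero bytes would cut the string short. A minimal sketch with a made-up five-byte literal (the name withLen is hypothetical):

```haskell
{-# LANGUAGE MagicHash #-}
import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import Data.ByteString.Unsafe (unsafePackAddressLen)
import System.IO.Unsafe (unsafePerformIO)

-- Five bytes, two of them NUL, written with decimal escapes.
-- The explicit length (5) must match the literal exactly.
withLen :: ByteString
withLen = unsafePerformIO $ unsafePackAddressLen 5 "\1\0\2\0\3"#
{-# NOINLINE withLen #-}
```

A generator for real data would emit one escape per byte and the total byte count. Decimal escapes separated by backslashes avoid the pitfall where a hex escape swallows a following hex-digit character.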
participants (4)
- John Millikin
- Judah Jacobson
- Svein Ove Aas
- Tom Hawkins