trouble compiling "import GHC.Prim(MutableByteArray#, ....." (building regex-tdfa from darcs) -- what's that # sign doing?

trying to compile regex-tdfa, I ran into another issue. (earlier I had a cabal problem but that's resolved.) there's a line that won't compile, neither for ghc 6.6.1 nor 6.7 import GHC.Prim(MutableByteArray#,RealWorld,Int#,sizeofMutableByteArray#,unsafeCoerce#) so the fresh darcs regex tdfa package won't build. This line (line 16 below) causes this error for ghc -e '' RunMutState.hs for both ghc 6.1 and 6.7 Much obliged for any help, Thomas. ********************************* hartthoma@linuxpt:~/installs/regex_darcs/regex-tdfa>runghc Setup.hs build Preprocessing library regex-tdfa-0.93... Building regex-tdfa-0.93... Text/Regex/TDFA/RunMutState.hs:16:32: parse error on input `#' hartthoma@linuxpt:~/installs/regex_darcs/regex-tdfa>head -n20 Text/Regex/TDFA/RunMutState.hs | cat -n 1 {-# LANGUAGE CPP #-} 2 module Text.Regex.TDFA.RunMutState(TagEngine(..),newTagEngine,newTagEngine2 3 ,newScratch,tagsToGroupsST 4 ,toInstructions,compareWith,resetScratch 5 ,SScratch(..),MScratch,WScratch) where 6 7 import Control.Monad(forM_,liftM,liftM2,liftM3,foldM) 8 --import Control.Monad.ST.Strict as S (ST) 9 --import qualified Control.Monad.ST.Lazy as L (ST) 10 import Control.Monad.State(MonadState(..),execState) 11 12 import Data.Array.Base(unsafeRead,unsafeWrite,STUArray(..)) 13 #ifdef __GLASGOW_HASKELL__ 14 import GHC.Arr(STArray(..)) 15 import GHC.ST(ST(..)) *** 16 import GHC.Prim(MutableByteArray#,RealWorld,Int#,sizeofMutableByteArray#,unsafeCoerce#) 17 #else 18 import Control.Monad(when) 19 import Control.Monad.ST(ST) 20 import Data.Array.ST(STArray) hartthoma@linuxpt:~/installs/regex_darcs/regex-tdfa> --- This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and destroy this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

On Fri, Aug 17, 2007 at 04:27:29PM -0400, Thomas Hartman wrote:
trying to compile regex-tdfa, I ran into another issue. (earlier I had a cabal problem but that's resolved.)
there's a line that won't compile, neither for ghc 6.6.1 nor 6.7
import GHC.Prim(MutableByteArray#,RealWorld,Int#,sizeofMutableByteArray#,unsafeCoerce#)
so the fresh darcs regex tdfa package won't build.
This line (line 16 below) causes this error for
ghc -e '' RunMutState.hs
for both ghc 6.1 and 6.7
There are at least two things going on here. 1. GHC-specific unboxed identifiers have a # in the name. I think this is a relic from back when the only reasonable way to namespace was to modify your compiler to add extra identifier characters, and use them in all non-portable identifiers. In any case, you have to enable the -fglasgow-exts option (or -XMagicHash in recent 6.7) to allow imports of such identifiers. 2. Explicitly importing GHC.Prim has been discouraged for as long as I can remember, and GHC HQ has finally made good on the promise to make it impossible. Code which imports it has a bug already, which can be fixed by switching to GHC.Exts. (Why? GHC.Prim is wired into the compiler, while GHC.Exts is a normal Haskell module, so by using GHC.Exts you are insulated from questions of what is primitive and what is derived but still unportable. Yes, this does change.) Stefan

Thanks Stefan. I got regex tdfa to compile on 6.7. FWIW, here's a patch,
generated with darcs whatsnew against a fresh unzip of regex tdfa 0.92
I didn't patch against the darcs head because this uses a "language"
progma in {-# options #-} in some file*, which ghc 6.7 didn't know what to
do with, nor I.
*: Text/Regex/TDFA/RunMutState.hs: {-# LANGUAGE CPP #-} (in darcs head,
which as I said, I did not patch against, rather I patched against 0.92
downloaded and unzipped.)
If there is a better way than this to send patches please advise, as I
don't do this terribly often. (Actually I have no idea how to apply the
below patch... is there a way?)
{
hunk ./Data/IntMap/CharMap.hs 1
+{-# OPTIONS -XGeneralizedNewtypeDeriving #-}
hunk ./Data/IntMap/EnumMap.hs 1
+{-# OPTIONS -XGeneralizedNewtypeDeriving #-}
hunk ./Data/IntSet/EnumSet.hs 1
+{-# OPTIONS -XGeneralizedNewtypeDeriving #-}
hunk ./Text/Regex/TDFA/ByteString/Lazy.hs 1
-{-# OPTIONS_GHC -fno-warn-orphans #-}
+{-# OPTIONS_GHC -fno-warn-orphans -fglasgow-exts #-}
hunk ./Text/Regex/TDFA/Common.hs 1
-{-# OPTIONS -funbox-strict-fields #-}
+{-# OPTIONS -funbox-strict-fields -XGeneralizedNewtypeDeriving #-}
hunk ./Text/Regex/TDFA/CorePattern.hs 1
+{-# OPTIONS -fglasgow-exts #-}
hunk ./Text/Regex/TDFA/CorePattern.hs 38
+import Data.Monoid
+import Control.Monad
hunk ./Text/Regex/TDFA/RunMutState.hs 1
+{-# OPTIONS -fglasgow-exts #-}
hunk ./Text/Regex/TDFA/String.hs 1
-{-# OPTIONS_GHC -fno-warn-orphans #-}
+{-# OPTIONS_GHC -fno-warn-orphans -fglasgow-exts #-}
hunk ./Text/Regex/TDFA/TDFA.hs 1
+{-# OPTIONS -fglasgow-exts #-}
hunk ./Text/Regex/TDFA/TDFA.hs 12
-import Control.Monad.RWS
+import Control.Monad (mplus)
+--import Control.Monad.RWS
hunk ./Text/Regex/TDFA/TDFA.hs 33
+import Data.Monoid
+
hunk ./Text/Regex/TDFA/TNFA.hs 1
-{-# OPTIONS_GHC -fno-warn-orphans #-}
+{-# OPTIONS_GHC -fno-warn-orphans -fglasgow-exts #-}
hunk ./Text/Regex/TDFA/Wrap.hs 1
-{-# OPTIONS -fno-warn-orphans #-}
+{-# OPTIONS -fno-warn-orphans -fglasgow-exts #-}
hunk ./Text/Regex/TDFA.hs 42
- ,module Text.Regex.TDFA.String
- ,module Text.Regex.TDFA.ByteString
- ,module Text.Regex.TDFA.ByteString.Lazy
- ,module Text.Regex.TDFA.Sequence
+ --,module Text.Regex.TDFA.String
+ --,module Text.Regex.TDFA.ByteString
+ --,module Text.Regex.TDFA.ByteString.Lazy
+ --,module Text.Regex.TDFA.Sequence
hunk ./regex-tdfa.cabal 16
-Build-Depends: regex-base >= 0.80, base >= 2.0, parsec, mtl
+Build-Depends: regex-base >= 0.80, base >= 2.0, parsec, mtl,
containers, array, bytestring
}
"Stefan O'Rear"
trying to compile regex-tdfa, I ran into another issue. (earlier I had a
cabal problem but that's resolved.)
there's a line that won't compile, neither for ghc 6.6.1 nor 6.7
import
GHC.Prim(MutableByteArray#,RealWorld,Int#,sizeofMutableByteArray#,unsafeCoerce#)
so the fresh darcs regex tdfa package won't build.
This line (line 16 below) causes this error for
ghc -e '' RunMutState.hs
for both ghc 6.1 and 6.7
There are at least two things going on here. 1. GHC-specific unboxed identifiers have a # in the name. I think this is a relic from back when the only reasonable way to namespace was to modify your compiler to add extra identifier characters, and use them in all non-portable identifiers. In any case, you have to enable the -fglasgow-exts option (or -XMagicHash in recent 6.7) to allow imports of such identifiers. 2. Explicitly importing GHC.Prim has been discouraged for as long as I can remember, and GHC HQ has finally made good on the promise to make it impossible. Code which imports it has a bug already, which can be fixed by switching to GHC.Exts. (Why? GHC.Prim is wired into the compiler, while GHC.Exts is a normal Haskell module, so by using GHC.Exts you are insulated from questions of what is primitive and what is derived but still unportable. Yes, this does change.) Stefan [attachment "signature.asc" deleted by Thomas Hartman/ext/dbcom] --- This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and destroy this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

On Fri, Aug 17, 2007 at 08:13:55PM -0400, Thomas Hartman wrote:
Thanks Stefan. I got regex tdfa to compile on 6.7. FWIW, here's a patch, generated with darcs whatsnew against a fresh unzip of regex tdfa 0.92
I didn't patch against the darcs head because this uses a "language" progma in {-# options #-} in some file*, which ghc 6.7 didn't know what to do with, nor I.
*: Text/Regex/TDFA/RunMutState.hs: {-# LANGUAGE CPP #-} (in darcs head, which as I said, I did not patch against, rather I patched against 0.92 downloaded and unzipped.)
That's a bug, in either GHC 6.7.x (please specify the date if you can, 6.7 is a pretty wide range!) or regex-tdfa. Does {-# OPTIONS_GHC -cpp #-} (theoretically equivalent) work? What's the error message?
If there is a better way than this to send patches please advise, as I don't do this terribly often. (Actually I have no idea how to apply the below patch... is there a way?)
$ mkdir ~/.darcs
$ echo 'Thomas Hartman
[patch]
{ hunk ./Data/IntMap/CharMap.hs 1 +{-# OPTIONS -XGeneralizedNewtypeDeriving #-}
Ick. {-# OPTIONS is very strongly deprecated, since it doesn't specify a compiler but must use a compiler-specific syntax. Much better to use LANGUAGE: {-# LANGUAGE GeneralizedNewtypeDeriving #-}
+Build-Depends: regex-base >= 0.80, base >= 2.0, parsec, mtl, containers, array, bytestring
That won't work; you must indent continuation lines. Stefan

Stefan O'Rear wrote:
On Fri, Aug 17, 2007 at 04:27:29PM -0400, Thomas Hartman wrote:
trying to compile regex-tdfa, I ran into another issue. (earlier I had a cabal problem but that's resolved.)
there's a line that won't compile, neither for ghc 6.6.1 nor 6.7
import GHC.Prim(MutableByteArray#,RealWorld,Int#,sizeofMutableByteArray#,unsafeCoerce#)
so the fresh darcs regex tdfa package won't build.
This line (line 16 below) causes this error for
ghc -e '' RunMutState.hs
for both ghc 6.1 and 6.7
There are at least two things going on here.
1. GHC-specific unboxed identifiers have a # in the name. I think this is a relic from back when the only reasonable way to namespace was to modify your compiler to add extra identifier characters, and use them in all non-portable identifiers. In any case, you have to enable the -fglasgow-exts option (or -XMagicHash in recent 6.7) to allow imports of such identifiers.
2. Explicitly importing GHC.Prim has been discouraged for as long as I can remember, and GHC HQ has finally made good on the promise to make it impossible. Code which imports it has a bug already, which can be fixed by switching to GHC.Exts. (Why? GHC.Prim is wired into the compiler, while GHC.Exts is a normal Haskell module, so by using GHC.Exts you are insulated from questions of what is primitive and what is derived but still unportable. Yes, this does change.)
Stefan
Hi, I wrote regex-tdfa, and since I don't use beyond GHC 6.6.1 I had not seen this problem emerge. The use of GHC.Prim and CPP is intimitely linked: from http://darcs.haskell.org/packages/regex-unstable/regex-tdfa/Text/Regex/TDFA/...
#ifdef __GLASGOW_HASKELL__ foreign import ccall unsafe "memcpy" memcpy :: MutableByteArray# RealWorld -> MutableByteArray# RealWorld -> Int# -> IO ()
{-# INLINE copySTU #-} copySTU :: (Show i,Ix i,MArray (STUArray s) e (ST s)) => STUArray s i e -> STUArray s i e -> ST s () copySTU (STUArray _ _ msource) (STUArray _ _ mdest) = -- do b1 <- getBounds s1 -- b2 <- getBounds s2 -- when (b1/=b2) (error ("\n\nWTF copySTU: "++show (b1,b2))) ST $ \s1# -> case sizeofMutableByteArray# msource of { n# -> case unsafeCoerce# memcpy mdest msource n# s1# of { (# s2#, () #) -> (# s2#, () #) }}
#else /* !__GLASGOW_HASKELL__ */
copySTU :: (MArray (STUArray s) e (ST s))=> STUArray s Tag e -> STUArray s Tag e -> ST s () copySTU source destination = do b@(start,stop) <- getBounds source b' <- getBounds destination -- traceCopy ("> copySTArray "++show b) $ do when (b/=b') (fail $ "Text.Regex.TDFA.RunMutState copySTUArray bounds mismatch"++show (b,b')) forM_ (range b) $ \index -> unsafeRead source index >>= unsafeWrite destination index #endif /* !__GLASGOW_HASKELL__ */
The entire point of using the ST monad is manage memory more efficiently than with (U)Array. The copySTU simply uses a "memcpy" to copy the whole source array into the destination efficiently. This lets me re-use the already allocated destination array. If there had been a high level "copyMArray" then this would not have been needed. The CPP is used to let non-GHC compilers copy element by element. The *right* solution is to patch the STUArray and/or MArray code to do this behind the scenes. So how does one get the array pointer without GHC.Prim in 6.7 ? -- Chris

On Sun, Aug 19, 2007 at 11:25:49PM +0100, ChrisK wrote:
#ifdef __GLASGOW_HASKELL__ foreign import ccall unsafe "memcpy" memcpy :: MutableByteArray# RealWorld -> MutableByteArray# RealWorld -> Int# -> IO ()
{-# INLINE copySTU #-} copySTU :: (Show i,Ix i,MArray (STUArray s) e (ST s)) => STUArray s i e -> STUArray s i e -> ST s () copySTU (STUArray _ _ msource) (STUArray _ _ mdest) = -- do b1 <- getBounds s1 -- b2 <- getBounds s2 -- when (b1/=b2) (error ("\n\nWTF copySTU: "++show (b1,b2))) ST $ \s1# -> case sizeofMutableByteArray# msource of { n# -> case unsafeCoerce# memcpy mdest msource n# s1# of { (# s2#, () #) -> (# s2#, () #) }}
#else /* !__GLASGOW_HASKELL__ */
copySTU :: (MArray (STUArray s) e (ST s))=> STUArray s Tag e -> STUArray s Tag e -> ST s () copySTU source destination = do b@(start,stop) <- getBounds source b' <- getBounds destination -- traceCopy ("> copySTArray "++show b) $ do when (b/=b') (fail $ "Text.Regex.TDFA.RunMutState copySTUArray bounds mismatch"++show (b,b')) forM_ (range b) $ \index -> unsafeRead source index >>= unsafeWrite destination index #endif /* !__GLASGOW_HASKELL__ */
The entire point of using the ST monad is manage memory more efficiently than with (U)Array. The copySTU simply uses a "memcpy" to copy the whole source array into the destination efficiently. This lets me re-use the already allocated destination array. If there had been a high level "copyMArray" then this would not have been needed. The CPP is used to let non-GHC compilers copy element by element. The *right* solution is to patch the STUArray and/or MArray code to do this behind the scenes.
So how does one get the array pointer without GHC.Prim in 6.7 ?
Import GHC.Exts, which exports everything GHC.Prim does, and according to the docs is "GHC Extensions: this is the Approved Way to get at GHC-specific extensions.". (Can't help you with the CPP issue though.) Stefan
participants (3)
-
ChrisK
-
Stefan O'Rear
-
Thomas Hartman