Detecting system endianness

Hi, Is there some way I can check the endianness of the machine my haskell code is running in? Thanks, Maurício

I think something like this might work:
Prelude GHC.Exts GHC.Word> let W64# x = 0x100000002 in W32# (unsafeCoerce# x)
2
You should get 1 for big-endian and 2 for little-endian.
(Disclaimer: not particularily well-tested.)
-- ryan
On Thu, Dec 18, 2008 at 3:27 AM, Mauricio
Hi,
Is there some way I can check the endianness of the machine my haskell code is running in?
Thanks, Maurício
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Actually, this is probably safer:
import Foreign.Marshal.Alloc
import Foreign.Ptr
import Foreign.Storable
import Data.Word
import System.IO.Unsafe
endianCheck = unsafePerformIO $ alloca $ \p -> poke p (0x01020304 ::
Word32) >> peek (castPtr p :: Ptr Word8)
littleEndian = endianCheck == 4
bigEndian = endianCheck == 1
-- ryan
On Thu, Dec 18, 2008 at 4:33 AM, Ryan Ingram
I think something like this might work:
Prelude GHC.Exts GHC.Word> let W64# x = 0x100000002 in W32# (unsafeCoerce# x) 2
You should get 1 for big-endian and 2 for little-endian.
(Disclaimer: not particularily well-tested.)
-- ryan
On Thu, Dec 18, 2008 at 3:27 AM, Mauricio
wrote: Hi,
Is there some way I can check the endianness of the machine my haskell code is running in?
Thanks, Maurício
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On Thursday 18 December 2008 13:40:47 Ryan Ingram wrote:
Actually, this is probably safer:
import Foreign.Marshal.Alloc import Foreign.Ptr import Foreign.Storable import Data.Word import System.IO.Unsafe
endianCheck = unsafePerformIO $ alloca $ \p -> poke p (0x01020304 :: Word32) >> peek (castPtr p :: Ptr Word8)
littleEndian = endianCheck == 4 bigEndian = endianCheck == 1
-- ryan
On Thu, Dec 18, 2008 at 4:33 AM, Ryan Ingram
wrote: I think something like this might work:
Prelude GHC.Exts GHC.Word> let W64# x = 0x100000002 in W32# (unsafeCoerce# x) 2
You should get 1 for big-endian and 2 for little-endian.
(Disclaimer: not particularily well-tested.)
Using modules Data.Binary, Data.Binary.Put and Data.Word, you can define littleEndian = (decode $ runPut $ putWord16host 42 :: Word8) == 42 Under the hood, it also uses peek and poke, but it looks a bit more functional.

I actually don't need a pure function, IO is OK. I'll try something in these lines. It doesn't build yet, with an error message I'll probably take a few months to understand: Couldn't match expected type `forall a. (Storable a) => a -> IO a' against inferred type `a -> IO a' Thanks, Maurício ----- import Control.Monad ; import Foreign ; import Foreign.C ; type CUInt16 = CUShort ; type CUInt8 = CChar ; littleEndianToHost,hostToLittleEndian :: forall a. (Storable a ) => a -> IO a ; (littleEndianToHost,hostToLittleEndian) = (f,f) where { f :: forall a. ( Storable a ) => a -> IO a ; f a = with ( 0x0102 :: CUInt16 ) $ \p -> do { firstByte <- peek ( castPtr p :: Ptr CUInt8 ) ; littleEndian <- return $ firstByte == 0x02 ; halfSize <- return $ div ( alignment a ) 2; reverse <- with a $ \val -> zipWithM (swapByte (castPtr val :: Ptr CUInt8)) [0..halfSize-1] [halfSize..2*halfSize-1] >> peek val ; return $ if littleEndian then a else reverse ; } ; swapByte p n1 n2 = do { v1 <- peekElemOff p n1 ; v2 <- peekElemOff p n2 ; pokeElemOff p n1 v2 ; pokeElemOff p n2 v1 } >> return () } -----
On Thursday 18 December 2008 13:40:47 Ryan Ingram wrote:
Actually, this is probably safer:
import Foreign.Marshal.Alloc import Foreign.Ptr import Foreign.Storable import Data.Word import System.IO.Unsafe
endianCheck = unsafePerformIO $ alloca $ \p -> poke p (0x01020304 :: Word32) >> peek (castPtr p :: Ptr Word8)
littleEndian = endianCheck == 4 bigEndian = endianCheck == 1
(...)

In a similar vein, is there already a function available to give the size of Word in bytes? Or should I write the usual Ptr conversion tricks to figure it out? Holger Siegel wrote:
On Thursday 18 December 2008 13:40:47 Ryan Ingram wrote:
Actually, this is probably safer:
import Foreign.Marshal.Alloc import Foreign.Ptr import Foreign.Storable import Data.Word import System.IO.Unsafe
endianCheck = unsafePerformIO $ alloca $ \p -> poke p (0x01020304 :: Word32) >> peek (castPtr p :: Ptr Word8)
littleEndian = endianCheck == 4 bigEndian = endianCheck == 1
-- ryan
Using modules Data.Binary, Data.Binary.Put and Data.Word, you can define
littleEndian = (decode $ runPut $ putWord16host 42 :: Word8) == 42
Under the hood, it also uses peek and poke, but it looks a bit more functional.
-- Live well, ~wren

Foreign.Storable.sizeOf wren:
In a similar vein, is there already a function available to give the size of Word in bytes? Or should I write the usual Ptr conversion tricks to figure it out?
Holger Siegel wrote:
On Thursday 18 December 2008 13:40:47 Ryan Ingram wrote:
Actually, this is probably safer:
import Foreign.Marshal.Alloc import Foreign.Ptr import Foreign.Storable import Data.Word import System.IO.Unsafe
endianCheck = unsafePerformIO $ alloca $ \p -> poke p (0x01020304 :: Word32) >> peek (castPtr p :: Ptr Word8)
littleEndian = endianCheck == 4 bigEndian = endianCheck == 1
-- ryan
Using modules Data.Binary, Data.Binary.Put and Data.Word, you can define
littleEndian = (decode $ runPut $ putWord16host 42 :: Word8) == 42
Under the hood, it also uses peek and poke, but it looks a bit more functional.
-- Live well, ~wren _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On Thu, 2008-12-18 at 22:35 -0500, wren ng thornton wrote:
In a similar vein, is there already a function available to give the size of Word in bytes? Or should I write the usual Ptr conversion tricks to figure it out?
How about this: (`div` 8) $ ceiling $ logBase 2 $ fromIntegral (maxBound :: Word) Could write an integral log_2 function to make it nicer :) - George

But why would you want that? I understand the only situation when talking about number of bytes makes sense is when you are using Foreign and Ptr. Besides that, you can only guess the amount of memory you need to deal with your data (taking laziness, GC etc. into account). Maurício
In a similar vein, is there already a function available to give the size of Word in bytes? Or should I write the usual Ptr conversion tricks to figure it out?
Holger Siegel wrote:
On Thursday 18 December 2008 13:40:47 Ryan Ingram wrote:
Actually, this is probably safer:
import Foreign.Marshal.Alloc import Foreign.Ptr import Foreign.Storable import Data.Word import System.IO.Unsafe

Maurício wrote:
But why would you want that? I understand the only situation when talking about number of bytes makes sense is when you are using Foreign and Ptr. Besides that, you can only guess the amount of memory you need to deal with your data (taking laziness, GC etc. into account).
Because I'm using both Ptr and Foreign? ;) See my recent announcement for bytestring-trie. One of the optimizations I'm working on is to read off a full natural word at a time, instead of just one byte. To do this properly I need to detect the word size so that I don't accidentally read garbage off the end of the ByteString when there's less than a natural word left. Detecting endianness is similar because it determines how to interpret that word as if it were an array of bytes, which is needed to get the correct behavior when interpreting the word as a bit-vector for trieing. That is, if you only read the "first" two bytes on a big-endian machine, then you're skipping the 4/6/? bytes which are actually at the beginning of the bytestring. I'm not sure how important physical endianness of bytes within a word is. For IntMap a common case is to get large contiguous chunks of keys, so logical big-endian trieing improves performance over logical little-endian. I'm not sure how common large contiguous chunks of bytestring keys are, though. Reading a word then changing the physical endianness of the bytes seems expensive. -- Live well, ~wren

But why would you want that? I understand the only situation when talking about number of bytes makes sense is when you are using Foreign and Ptr. (...)
Because I'm using both Ptr and Foreign? ;)
See my recent announcement for bytestring-trie. One of the optimizations I'm working on is to read off a full natural word at a time, (...)
I see, you mean the size of a machine word, not of Data.Word.

Maurício wrote:
But why would you want that? I understand the only situation when talking about number of bytes makes sense is when you are using Foreign and Ptr. (...)
Because I'm using both Ptr and Foreign? ;)
See my recent announcement for bytestring-trie. One of the optimizations I'm working on is to read off a full natural word at a time, (...)
I see, you mean the size of a machine word, not of Data.Word.
AFAIK, Data.Word.Word is defined to be "the same size as Prelude.Int" (which it isn't on GHC 6.8.2 on Intel OS X: 32bits vs 31bits) and Int is defined to be at least 31bits but can be more. My interpretation of this is that Int and Word will generally be implemented by the architecture's natural word size in order to optimize performance, much like C's "int" and "unsigned int" but with better definition of allowed sizes. This seems to be supported by the existence of definite-sized variants Word8, Word16, Word32... So yeah, I'm meaning the machine word, but I think Word is intended to proxy for that. Maybe I'm wrong, but provided that Word contains (or can be persuaded to contain) a round number of Word8 and that operations on Word are cheaper than the analogous sequence of operations on the Word8 representation, that's good enough for my needs. -- Live well, ~wren

On Tue, Dec 23, 2008 at 07:44:14PM -0500, wren ng thornton wrote:
AFAIK, Data.Word.Word is defined to be "the same size as Prelude.Int" (which it isn't on GHC 6.8.2 on Intel OS X: 32bits vs 31bits) and Int is defined to be at least 31bits but can be more. My interpretation of this is that Int and Word will generally be implemented by the architecture's natural word size in order to optimize performance, much like C's "int" and "unsigned int" but with better definition of allowed sizes. This seems to be supported by the existence of definite-sized variants Word8, Word16, Word32...
Of course, natural word size can mean 'natural pointer size' or 'natural int size'. Which are different on many architectures. So, you want to be careful about which you want.
So yeah, I'm meaning the machine word, but I think Word is intended to proxy for that. Maybe I'm wrong, but provided that Word contains (or can be persuaded to contain) a round number of Word8 and that operations on Word are cheaper than the analogous sequence of operations on the Word8 representation, that's good enough for my needs.
If you want to find out the 'natural' sizes, then look at the 'CInt', 'Ptr', and 'FunPtr' types, which follow the C 'int' 'void *' and 'void (*fn)()' types. So they will conform to the architecture ABI for the underlying spec/operating system. If you just want a type guarenteed to be able to hold a pointer or an integer, use 'IntPtr' or 'WordPtr' which are provided for just that case. John -- John Meacham - ⑆repetae.net⑆john⑈
participants (8)
-
Don Stewart
-
George Pollard
-
Holger Siegel
-
John Meacham
-
Mauricio
-
Maurício
-
Ryan Ingram
-
wren ng thornton