Indeed I misunderstood. As you already suspected this wouldn't work for Int# (or other unboxed types) sadly as the GC would assume these to be pointers which no doubt would lead to segfaults or worse.

Rereading your initial mail I can say the runtime currently doesn't support such a heap object.
If I understand you correctly what you would like is basically a something like:

Con n
P I#  P I# P I#  ...
       \/  
\/   \/  
     Pair1 Pair2 Pair3 ...

Where n gives the number of pairs.

I can see how it might be feasible to add a heap object like this to GHC but I'm unsure if it would be worth the complexity as it's layout diverges quite a bit from what GHC usually expects.

The other option would be to expose to users a way to have an object that consist of a given number of words and a bitmap which indicates to the GHC which fields are pointers. This is more or less
the representation that's already used to deal with stack frames iirc so that might not be as far fetched as it seems at first.
It might even be possible to implement some sort of prototype for this using hand written Cmm.

But there are not any plans to implement anything like this as far as I know.

Am 02/08/2022 um 20:51 schrieb Jaro Reinders:

It seems you have misunderstood me. I want to store *unboxed* Int#s inside the array, not just some unlifted types. Surely in the case of unboxed integers the unsafeCoerce# function can make the garbage collector crash as they might be interpreted as arbitrary pointers.

Cheers,

Jaro

On 02/08/2022 20:24, Andreas Klebinger wrote:

I think it's possible to do this *today* using unsafeCoerce#.

I was able to come up with this basic example below. In practice one would at the very least want to abstract away the gnarly stuff inside a
library. But since it sounds like you want to be the one to write a library that shouldn't be a problem.

{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}
{-# LANGUAGE UnliftedDatatypes #-}
module Main where
import GHC.Exts
import GHC.IO
import Unsafe.Coerce
import Data.Kind
data SA = SA (SmallMutableArray# RealWorld Any)
mkArray :: Int -> a -> IO (SA)
mkArray (I# n) initial = IO $ \s ->
    case unsafeCoerce# (newSmallArray# n initial s) of
        (# s', arr #) -> (# s', SA arr #)
readLifted :: SA -> Int -> IO a
readLifted (SA arr) (I# i) = IO (\s ->
    unsafeCoerce# (readSmallArray# arr i s)
    )
data UWrap (a :: UnliftedType) = UWrap a
-- UWrap is just here because we can't return unlifted types in IO
-- If you don't need your result in IO you can eliminate this indirection.
readUnlifted :: forall a. SA -> Int -> IO (UWrap a)
readUnlifted (SA arr) (I# i) = IO (\s ->
    case unsafeCoerce# (readSmallArray# arr i s) of
        (# s', a :: a #) -> (# s', UWrap a #)
    )
writeLifted :: a -> Int -> SA -> IO ()
writeLifted x (I# i) (SA arr) = IO $ \s ->
    case writeSmallArray# (unsafeCoerce# arr) i x s of
        s -> (# s, () #)
writeUnlifted :: (a :: UnliftedType) -> Int -> SA -> IO ()
writeUnlifted x (I# i) (SA arr) = IO $ \s ->
    case writeSmallArray# arr i (unsafeCoerce# x) s of
        s -> (# s, () #)
type UB :: UnliftedType
data UB = UT | UF
showU :: UWrap UB -> String
showU (UWrap UT) = "UT"
showU (UWrap UF) = "UF"
main :: IO ()
main = do
    arr <- mkArray 4 ()
    writeLifted True 0 arr
    writeLifted False 1 arr
    writeUnlifted UT 2 arr
    writeUnlifted UT 3 arr
    (readLifted arr 0 :: IO Bool) >>= print
    (readLifted arr 1 :: IO Bool) >>= print
    (readUnlifted arr 2 :: IO (UWrap UB)) >>= (putStrLn . showU)
    (readUnlifted arr 3 :: IO (UWrap UB)) >>= (putStrLn . showU)
    return ()

Cheers

Andreas

Am 02/08/2022 um 17:32 schrieb J. Reinders:
Could you use `StablePtr` for the keys?
That might be an option, but I have no idea how performant stable pointers are and manual management is obviously not ideal.

How does the cost of computing object hashes and comparing colliding
objects compare with the potential cache miss cost of using boxed
integers or a separate array?  Would such an "optimisation" be worth
the effort?
Literature on hash tables suggests that cache misses were a very important factor in running time (in 2001): https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.25.4189

I don’t know whether it has become less or more important now, but I have been told there haven’t been that many advances in memory latency.
_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs