Using mutable array after an unsafeFreezeArray, and GC details

A couple of updates: Edward Yang responded here, confirming the sort of track I was thinking on: http://blog.ezyang.com/2014/05/ghc-and-mutable-arrays-a-dirty-little-secret/ And I can report that: 1) cloning a frozen array doesn't provide the benefits of creating a new array and freezing 2) and anyway, I'm seeing some segfaults when cloning, freezing, reading then writing in my library I'd love to learn if there are any other approaches I might take, e.g. maybe with my own CMM primop variants? Thanks, Brandon

On 09/05/2014 19:21, Brandon Simmons wrote:
A couple of updates: Edward Yang responded here, confirming the sort of track I was thinking on:
http://blog.ezyang.com/2014/05/ghc-and-mutable-arrays-a-dirty-little-secret/
And I can report that: 1) cloning a frozen array doesn't provide the benefits of creating a new array and freezing 2) and anyway, I'm seeing some segfaults when cloning, freezing, reading then writing in my library
I'd love to learn if there are any other approaches I might take, e.g. maybe with my own CMM primop variants?
I'm not sure exactly what your workload looks like, but if you have arrays that tend to be unmodified for long periods of time it's sometimes useful to keep them frozen but thaw before mutating. How large are your arrays? Perhaps the new small array type (in HEAD but not 7.8) would help? Cheers, Simon

On Mon, May 12, 2014 at 4:32 AM, Simon Marlow
On 09/05/2014 19:21, Brandon Simmons wrote:
A couple of updates: Edward Yang responded here, confirming the sort of track I was thinking on:
http://blog.ezyang.com/2014/05/ghc-and-mutable-arrays-a-dirty-little-secret/
And I can report that: 1) cloning a frozen array doesn't provide the benefits of creating a new array and freezing 2) and anyway, I'm seeing some segfaults when cloning, freezing, reading then writing in my library
I'd love to learn if there are any other approaches I might take, e.g. maybe with my own CMM primop variants?
I'm not sure exactly what your workload looks like, but if you have arrays that tend to be unmodified for long periods of time it's sometimes useful to keep them frozen but thaw before mutating.
The idea is I'm using two atomic counters to coordinate concurrent readers and writers along an "infinite array" (a linked list of array segments that get allocated as needed and garbage collected as we go). So currently each cell in each array is written to only once, with a CAS.
How large are your arrays? Perhaps the new small array type (in HEAD but not 7.8) would help?
Thanks, maybe so! The arrays can be any size, but probably not smaller than length 64 (this will be static, at compile-time). I read through https://ghc.haskell.org/trac/ghc/ticket/5925, and it seems like the idea is to improve array creation. I'm pretty happy with the speed of cloning an array (but maybe cloneSmallArray will be even faster still). It also looks like stg_casSmallArrayzh (in PrimOps.cmm) omits the card marking (maybe the idea is if the array is already at ~128 elements or less, then the card-marking is all just overhead?). Brandon
Cheers, Simon

On 12/05/2014 21:28, Brandon Simmons wrote:
The idea is I'm using two atomic counters to coordinate concurrent readers and writers along an "infinite array" (a linked list of array segments that get allocated as needed and garbage collected as we go). So currently each cell in each array is written to only once, with a CAS.
Certainly you should freeze arrays when you're done writing to them, this will help the GC a lot.
How large are your arrays? Perhaps the new small array type (in HEAD but not 7.8) would help?
Thanks, maybe so! The arrays can be any size, but probably not smaller than length 64 (this will be static, at compile-time).
I read through https://ghc.haskell.org/trac/ghc/ticket/5925, and it seems like the idea is to improve array creation. I'm pretty happy with the speed of cloning an array (but maybe cloneSmallArray will be even faster still).
It also looks like stg_casSmallArrayzh (in PrimOps.cmm) omits the card marking (maybe the idea is if the array is already at ~128 elements or less, then the card-marking is all just overhead?).
That right, the cards currently cover 128 elements, and there's also a per-array dirty bit, so the card table in an array smaller than 128 elts is just overhead. Cheers, Simon

Given the following code: {-# LANGUAGE GADTs #-} data Any a where AInt :: Int -> Any Int -- demo 1 does not compile {- demo1 a = do case a of (AInt i) -> print i Couldn't match expected type ‘t’ with actual type ‘IO ()’ ‘t’ is untouchable inside the constraints (t1 ~ Int) bound by a pattern with constructor AInt :: Int -> Any Int, in a case alternative at /Users/doaitse/TryHaskell/TestGADT.hs:6:17-22 ‘t’ is a rigid type variable bound by the inferred type of demo1 :: Any t1 -> t at /Users/doaitse/TryHaskell/TestGADT.hs:5:1 Relevant bindings include demo1 :: Any t1 -> t (bound at /Users/doaitse/TryHaskell/TestGADT.hs:5:1) In the expression: print i In a case alternative: (AInt i) -> print i Failed, modules loaded: none. -} -- all the following go through without complaints: a = AInt 3 demo2 = do case a of (AInt i) -> print i demo3 :: IO () demo3 = do case a of (AInt i) -> print i demo4 = do case AInt 3 of (AInt i) -> print i demo5 :: Any Int -> IO () demo5 a = do case a of (AInt i) -> print i I do not see why the error message in demo1 arises. It claims it can't match some t with the type IO (), but when I tell that the result is IO () it can? I think at least the error message is confusing, and not very helpful. I would have in no way been able to get the clue that add a type signature as in the case of demo5 would solve the problem. What am I overlooking? Doaitse

I just hit a similar error the other day. I think the gist of it is that there are two perfectly good types, and neither is more general than the other. A slightly different example shows why more clearly: foo (AInt i) = (3 :: Int) Now, what type should this have? foo :: Any a -> a foo :: Any a -> Int both seem pretty good, and neither is clearly better. It's not *as* obvious what the two candidates might be for your example, but maybe you could imagine something like these two types: demo1 :: AInt a -> IO () type family Foo a type instance Foo Int = IO () demo1 :: AInt a -> Foo a Again, not too clear that one is better, I think. ~d On 2014-05-13 14:20, S. Doaitse Swierstra wrote:
Given the following code:
{-# LANGUAGE GADTs #-} data Any a where AInt :: Int -> Any Int
-- demo 1 does not compile {- demo1 a = do case a of (AInt i) -> print i
Couldn't match expected type ‘t’ with actual type ‘IO ()’ ‘t’ is untouchable inside the constraints (t1 ~ Int) bound by a pattern with constructor AInt :: Int -> Any Int, in a case alternative at /Users/doaitse/TryHaskell/TestGADT.hs:6:17-22 ‘t’ is a rigid type variable bound by the inferred type of demo1 :: Any t1 -> t at /Users/doaitse/TryHaskell/TestGADT.hs:5:1 Relevant bindings include demo1 :: Any t1 -> t (bound at /Users/doaitse/TryHaskell/TestGADT.hs:5:1) In the expression: print i In a case alternative: (AInt i) -> print i Failed, modules loaded: none. -}
-- all the following go through without complaints:
a = AInt 3 demo2 = do case a of (AInt i) -> print i
demo3 :: IO () demo3 = do case a of (AInt i) -> print i
demo4 = do case AInt 3 of (AInt i) -> print i
demo5 :: Any Int -> IO () demo5 a = do case a of (AInt i) -> print i
I do not see why the error message in demo1 arises. It claims it can't match some t with the type IO (), but when I tell that the result is IO () it can? I think at least the error message is confusing, and not very helpful. I would have in no way been able to get the clue that add a type signature as in the case of demo5 would solve the problem.
What am I overlooking?
Doaitse
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Hi. Daniel is certainly right to point out general problems with GADT pattern matching and principal types. Nevertheless, the changing behaviour of GHC over time is currently a bit confusing to me. In GHC-6.12.3, Doaitse's program fails with three errors (demo1, demo2, demo4, all the GADT pattern matches without type signature), and the error messages are: /home/andres/trans/GT.hs:7:15: GADT pattern match in non-rigid context for `AInt' Probable solution: add a type signature for the scrutinee of the case expression In the pattern: AInt i In a case alternative: (AInt i) -> print i In the expression: case a of { (AInt i) -> print i } /home/andres/trans/GT.hs:33:18: GADT pattern match with non-rigid result type `t' Solution: add a type signature for the entire case expression In a case alternative: (AInt i) -> print i In the expression: case a of { (AInt i) -> print i } In the expression: do { case a of { (AInt i) -> print i } } /home/andres/trans/GT.hs:41:18: GADT pattern match with non-rigid result type `t' Solution: add a type signature for the entire case expression In a case alternative: (AInt i) -> print i In the expression: case AInt 3 of { (AInt i) -> print i } In the expression: do { case AInt 3 of { (AInt i) -> print i } } These error messages are conservative, but clear. They ask the user to add a type signature. With GHC-7.0.4, GHC-7.2.1, GHC-7.4.2, and GHC-7.6.3, the program compiles without error, including demo1. With GHC-7.8.2, the compiler reports the error Doaitse mentioned: /home/andres/trans/GT.hs:7:27: Couldn't match expected type ‘t’ with actual type ‘IO ()’ ‘t’ is untouchable inside the constraints (t1 ~ Int) I think the error message would be more helpful if it would mention adding a type signature as a possible fix again. But, assuming for the time being that GHC is "correct" to reject this program and that GHC-7 was "wrong" up until now, then I'd like to know what the new rule for GADT pattern matching is. Obviously, it's more relaxed than it used to be (because demo2 and demo4 are still accepted). Also, am I correct that the OutsideIn JFP paper is still the correct description of what's going on in the type checker? If so, is GHC-7.6 or GHC-7.8 closer to implementing what the OutsideIn paper describes? Is the change in behaviour documented somewhere? Thanks. Cheers, Andres -- Andres Löh, Haskell Consultant Well-Typed LLP, http://www.well-typed.com Registered in England & Wales, OC335890 250 Ice Wharf, 17 New Wharf Road, London N1 9RF, England

My understanding of OutsideIn leads me to believe that GHC 7.8 has the behavior closer to that spec. See Section 5.2 of that paper (http://research.microsoft.com/en-us/um/people/simonpj/Papers/constraints/jfp...), which is a relatively accessible explanation of this phenomenon. Daniel's explanation is essentially a condensed version of that section.
I'm not surprised that the behavior changed between GHC 6.x and 7.x, as I believe 7.x is what brought in OutsideIn. I don't know much about the change between 7.6 and 7.8, though. And, I agree that the "untouchable" error messages are generally inscrutable. When I see that message in my own code, my takeaway is generally "I have a mistake somewhere near that line", nothing more specific or useful. I've accordingly posted a bug report #9109 (https://ghc.haskell.org/trac/ghc/ticket/9109). Please comment there if you have either useful examples or other contributions to the fix -- it may be hard to get this one right.
Richard
On May 13, 2014, at 5:51 PM, Andres Löh
Hi.
Daniel is certainly right to point out general problems with GADT pattern matching and principal types. Nevertheless, the changing behaviour of GHC over time is currently a bit confusing to me.
In GHC-6.12.3, Doaitse's program fails with three errors (demo1, demo2, demo4, all the GADT pattern matches without type signature), and the error messages are:
/home/andres/trans/GT.hs:7:15: GADT pattern match in non-rigid context for `AInt' Probable solution: add a type signature for the scrutinee of the case expression In the pattern: AInt i In a case alternative: (AInt i) -> print i In the expression: case a of { (AInt i) -> print i }
/home/andres/trans/GT.hs:33:18: GADT pattern match with non-rigid result type `t' Solution: add a type signature for the entire case expression In a case alternative: (AInt i) -> print i In the expression: case a of { (AInt i) -> print i } In the expression: do { case a of { (AInt i) -> print i } }
/home/andres/trans/GT.hs:41:18: GADT pattern match with non-rigid result type `t' Solution: add a type signature for the entire case expression In a case alternative: (AInt i) -> print i In the expression: case AInt 3 of { (AInt i) -> print i } In the expression: do { case AInt 3 of { (AInt i) -> print i } }
These error messages are conservative, but clear. They ask the user to add a type signature.
With GHC-7.0.4, GHC-7.2.1, GHC-7.4.2, and GHC-7.6.3, the program compiles without error, including demo1.
With GHC-7.8.2, the compiler reports the error Doaitse mentioned:
/home/andres/trans/GT.hs:7:27: Couldn't match expected type ‘t’ with actual type ‘IO ()’ ‘t’ is untouchable inside the constraints (t1 ~ Int)
I think the error message would be more helpful if it would mention adding a type signature as a possible fix again.
But, assuming for the time being that GHC is "correct" to reject this program and that GHC-7 was "wrong" up until now, then I'd like to know what the new rule for GADT pattern matching is. Obviously, it's more relaxed than it used to be (because demo2 and demo4 are still accepted). Also, am I correct that the OutsideIn JFP paper is still the correct description of what's going on in the type checker? If so, is GHC-7.6 or GHC-7.8 closer to implementing what the OutsideIn paper describes? Is the change in behaviour documented somewhere?
Thanks.
Cheers, Andres
-- Andres Löh, Haskell Consultant Well-Typed LLP, http://www.well-typed.com
Registered in England & Wales, OC335890 250 Ice Wharf, 17 New Wharf Road, London N1 9RF, England _______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
participants (6)
-
Andres Löh
-
Brandon Simmons
-
Daniel Wagner
-
Richard Eisenberg
-
S. Doaitse Swierstra
-
Simon Marlow