Strictness in data declaration not matched in assembler?

Consider the following code data Data = Data { unData :: !Int } func :: Data -> Int func x = case unData x of 1 -> 2 _ -> 0 Compiling with GHC 6.8.2 gives the following stg code Main.func = \r [x_slg] case x_slg of tpl_slx { Main.Data ipv_slj -> case ipv_slj of wild_sly { GHC.Base.I# ds_slm -> case ds_slm of ds1_slz { __DEFAULT -> Main.lvl1; 1 -> Main.lvl; }; }; }; The native code generator turns it into the following x86_64 assembler Main_func_info: leaq -8(%rbp),%rax cmpq %r14,%rax jb .LcnU movq %rsi,%rbx movq $sni_info,-8(%rbp) addq $-8,%rbp testq $7,%rbx jne sni_info jmp *(%rbx) .LcnU: movl $Main_func_closure,%ebx jmp *-8(%r13) sni_info: movq 7(%rbx),%rbx movq $snj_info,(%rbp) testq $7,%rbx jne snj_info jmp *(%rbx) snj_info: cmpq $1,7(%rbx) jne .LcnE movl $Main_lvl_closure+1,%ebx addq $8,%rbp jmp *(%rbp) .LcnE: movl $Main_lvl1_closure+1,%ebx addq $8,%rbp jmp *(%rbp) It seems to me that the !Int member of the Data constructor is being treated like it might be a thunk in sni_info (i.e., the whole "testq $7,%rbx" thing). Isn't this unnecessary as the "!" strictness flag means the Int argument must be forced by the Data constructor before being captured? Thanks! -Tyson

Tyson Whitehead wrote:
It seems to me that the !Int member of the Data constructor is being treated like it might be a thunk in sni_info (i.e., the whole "testq $7,%rbx" thing).
Isn't this unnecessary as the "!" strictness flag means the Int argument must be forced by the Data constructor before being captured?
Strictness does not imply unboxing. To see why not, think about the fact that unboxing breaks sharing. By keeping the pointer-indirection in place, we can share even strict fields between related values. Try -funbox-strict-fields or the UNBOX pragma. Jules

On Wednesday 15 October 2008 10:48:26 you wrote:
Strictness does not imply unboxing.
To see why not, think about the fact that unboxing breaks sharing. By keeping the pointer-indirection in place, we can share even strict fields between related values.
I believe I realize that. What I was wondering about was the fact that it seemed to think the pointer might be to a thunk (instead of constructor closure). Doesn't the strictness flag mean the following assembler would work sni_info: movq 7(%rbx),%rbx movq $snj_info,(%rbp) jmp snj_info (which could be cleaned up further by combining it with snj_info) instead of sni_info: movq 7(%rbx),%rbx movq $snj_info,(%rbp) testq $7,%rbx jne snj_info jmp *(%rbx) (i.e., the whole test if it is a thunk and conditionally evaluate it bit is unnecessary due to constructor the strictness flag). Cheers! -Tyson

I totally agree. Getting the value of the field should just evaluate
x and then use a pointer indirection; there should be no conditional
jumps involved in getting the value.
GHC is just doing the wrong thing.
-- Lennart
On Wed, Oct 15, 2008 at 3:58 PM, Tyson Whitehead
On Wednesday 15 October 2008 10:48:26 you wrote:
Strictness does not imply unboxing.
To see why not, think about the fact that unboxing breaks sharing. By keeping the pointer-indirection in place, we can share even strict fields between related values.
I believe I realize that. What I was wondering about was the fact that it seemed to think the pointer might be to a thunk (instead of constructor closure). Doesn't the strictness flag mean the following assembler would work
sni_info: movq 7(%rbx),%rbx movq $snj_info,(%rbp) jmp snj_info
(which could be cleaned up further by combining it with snj_info) instead of
sni_info: movq 7(%rbx),%rbx movq $snj_info,(%rbp) testq $7,%rbx jne snj_info jmp *(%rbx)
(i.e., the whole test if it is a thunk and conditionally evaluate it bit is unnecessary due to constructor the strictness flag).
Cheers! -Tyson _______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

On Oct 15, 2008, at 11:08 AM, Lennart Augustsson wrote:
I totally agree. Getting the value of the field should just evaluate x and then use a pointer indirection; there should be no conditional jumps involved in getting the value. GHC is just doing the wrong thing.
Can indirection nodes occur in this context? [I'd think not, but it depends on what pointer we're storing when we force the thunk in the constructor.] I could see the need for the test if indirection handling is required. -Jan
-- Lennart
On Wed, Oct 15, 2008 at 3:58 PM, Tyson Whitehead
wrote: On Wednesday 15 October 2008 10:48:26 you wrote:
Strictness does not imply unboxing.
To see why not, think about the fact that unboxing breaks sharing. By keeping the pointer-indirection in place, we can share even strict fields between related values.
I believe I realize that. What I was wondering about was the fact that it seemed to think the pointer might be to a thunk (instead of constructor closure). Doesn't the strictness flag mean the following assembler would work
sni_info: movq 7(%rbx),%rbx movq $snj_info,(%rbp) jmp snj_info
(which could be cleaned up further by combining it with snj_info) instead of
sni_info: movq 7(%rbx),%rbx movq $snj_info,(%rbp) testq $7,%rbx jne snj_info jmp *(%rbx)
(i.e., the whole test if it is a thunk and conditionally evaluate it bit is unnecessary due to constructor the strictness flag).
Cheers! -Tyson _______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

True, if there can be indirections then that's bad news.
That would make strict fields much less efficient.
But I don't see why indirections should be needed. Simon?
On Wed, Oct 15, 2008 at 4:21 PM, Jan-Willem Maessen
On Oct 15, 2008, at 11:08 AM, Lennart Augustsson wrote:
I totally agree. Getting the value of the field should just evaluate x and then use a pointer indirection; there should be no conditional jumps involved in getting the value. GHC is just doing the wrong thing.
Can indirection nodes occur in this context? [I'd think not, but it depends on what pointer we're storing when we force the thunk in the constructor.] I could see the need for the test if indirection handling is required.
-Jan
-- Lennart
On Wed, Oct 15, 2008 at 3:58 PM, Tyson Whitehead
wrote: On Wednesday 15 October 2008 10:48:26 you wrote:
Strictness does not imply unboxing.
To see why not, think about the fact that unboxing breaks sharing. By keeping the pointer-indirection in place, we can share even strict fields between related values.
I believe I realize that. What I was wondering about was the fact that it seemed to think the pointer might be to a thunk (instead of constructor closure). Doesn't the strictness flag mean the following assembler would work
sni_info: movq 7(%rbx),%rbx movq $snj_info,(%rbp) jmp snj_info
(which could be cleaned up further by combining it with snj_info) instead of
sni_info: movq 7(%rbx),%rbx movq $snj_info,(%rbp) testq $7,%rbx jne snj_info jmp *(%rbx)
(i.e., the whole test if it is a thunk and conditionally evaluate it bit is unnecessary due to constructor the strictness flag).
Cheers! -Tyson _______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Lennart Augustsson wrote:
True, if there can be indirections then that's bad news. That would make strict fields much less efficient. But I don't see why indirections should be needed. Simon?
This kind of thing has always been a problem for GHC, and IIRC hbc does/did better here. I don't know for sure whether we guarantee not to point to an indirection from a strict constructur field. I imagine it wouldn't be hard to arrange, but it is another invariant we'd have to maintain throughout the Core->Core phases. The real problem is that strictness is not represented in the type system, so we have no way to check that these kind of invariants are being respected. Cheers, Simon
On Wed, Oct 15, 2008 at 4:21 PM, Jan-Willem Maessen
wrote: On Oct 15, 2008, at 11:08 AM, Lennart Augustsson wrote:
I totally agree. Getting the value of the field should just evaluate x and then use a pointer indirection; there should be no conditional jumps involved in getting the value. GHC is just doing the wrong thing. Can indirection nodes occur in this context? [I'd think not, but it depends on what pointer we're storing when we force the thunk in the constructor.] I could see the need for the test if indirection handling is required.
-Jan
-- Lennart
On Wed, Oct 15, 2008 at 3:58 PM, Tyson Whitehead
wrote: On Wednesday 15 October 2008 10:48:26 you wrote:
Strictness does not imply unboxing.
To see why not, think about the fact that unboxing breaks sharing. By keeping the pointer-indirection in place, we can share even strict fields between related values. I believe I realize that. What I was wondering about was the fact that it seemed to think the pointer might be to a thunk (instead of constructor closure). Doesn't the strictness flag mean the following assembler would work
sni_info: movq 7(%rbx),%rbx movq $snj_info,(%rbp) jmp snj_info
(which could be cleaned up further by combining it with snj_info) instead of
sni_info: movq 7(%rbx),%rbx movq $snj_info,(%rbp) testq $7,%rbx jne snj_info jmp *(%rbx)
(i.e., the whole test if it is a thunk and conditionally evaluate it bit is unnecessary due to constructor the strictness flag).
Cheers! -Tyson _______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

| I totally agree. Getting the value of the field should just evaluate
| x and then use a pointer indirection; there should be no conditional
| jumps involved in getting the value.
| GHC is just doing the wrong thing.
You're right. As Simon says, GHC's Core language has no type distinction between an Int that is evaluated and an Int that might be a thunk, so we generate conservative code.
However, even for strict functions we do not *guarantee* to pass evaluated arguments. Consider
f x y = if y==0 then g x else 0
g x = x+1
Here g is strict, but f is not (in x). The code that GHC generates for 'f' simply passes 'x' along to 'g'. It does not evaluate 'x' and then pass it. So the fact that 'g' is strict is used to *allow* the caller to use call by value; it does not *require* the caller to do so. So that means that 'g' cannot rely on getting an evaluated argument. One could change that, but that's the way it is right now.
For strict *constructors*, on the other hand, we *do* guarantee to evaluate the argument before building the constructor. We generate a wrapper thus
wC = \ab. case a of { a' -> C a' b }
(Remember 'case' always evaluates in Core.) So for strict constructors we could take advantage of the known evaluated-ness of the result to avoid the test.
BUT people who care probably UNPACK their strict fields too, which is even better. The time you can't do that is for sum types
data T = MkT ![Int]
Whether this case is important enough to be worth the complexity cost, I'm not sure.
If someone felt like creating a ticket for this thread, and pasting in the payload of the messages, that'd help us not to forget it. As of now, it doesn't look like a terribly big win to me. But I could be wrong -- intuition is not always a good guide.
Simon
| -----Original Message-----
| From: glasgow-haskell-users-bounces@haskell.org [mailto:glasgow-haskell-users-
| bounces@haskell.org] On Behalf Of Lennart Augustsson
| Sent: 15 October 2008 16:09
| To: Tyson Whitehead
| Cc: GHC users
| Subject: Re: Strictness in data declaration not matched in assembler?
|
| I totally agree. Getting the value of the field should just evaluate
| x and then use a pointer indirection; there should be no conditional
| jumps involved in getting the value.
| GHC is just doing the wrong thing.
|
| -- Lennart
|
| On Wed, Oct 15, 2008 at 3:58 PM, Tyson Whitehead

On 16/10/2008, at 21:34, Simon Peyton-Jones wrote:
For strict *constructors*, on the other hand, we *do* guarantee to evaluate the argument before building the constructor. We generate a wrapper thus wC = \ab. case a of { a' -> C a' b } (Remember 'case' always evaluates in Core.) So for strict constructors we could take advantage of the known evaluated-ness of the result to avoid the test.
BUT people who care probably UNPACK their strict fields too, which is even better. The time you can't do that is for sum types data T = MkT ![Int]
You also can't do it for polymorphic components. I've used code like: data T a = MkT !a foo :: T (a,b) -> a foo (MkT (x,y)) = x Here, unpacking doesn't work but foo could still access the components of the pair directly. Roman

| > BUT people who care probably UNPACK their strict fields too, which | > is even better. The time you can't do that is for sum types | > data T = MkT ![Int] | | You also can't do it for polymorphic components. I've used code like: | | data T a = MkT !a | | foo :: T (a,b) -> a | foo (MkT (x,y)) = x | | Here, unpacking doesn't work but foo could still access the components | of the pair directly. Excellent point Roman. S

On Thursday 16 October 2008 07:03:05 Roman Leshchinskiy wrote:
On 16/10/2008, at 21:34, Simon Peyton-Jones wrote:
BUT people who care probably UNPACK their strict fields too, which is even better. The time you can't do that is for sum types data T = MkT ![Int]
You also can't do it for polymorphic components. I've used code like:
data T a = MkT !a
foo :: T (a,b) -> a foo (MkT (x,y)) = x
Here, unpacking doesn't work but foo could still access the components of the pair directly.
This is actually the situation I was originally looking at. I just simplified it for the sake of posting readable core and assembler. Specifically, I was looking at some of the assembler GHC was generating for some array code to see if it could do a clean enough job to be used instead of C, and was finding this sort of thing because STUArrau is defined as data STUArray s i a = STUArray !i !i !Int (MutableByteArray# s) I also seem to recall seeing the same sort of thing in some of the state code I was also looking at because STRep is defined as type STRep s a = State# s -> (# State# s, a #) (the a being the issue here). With regard to cost, it is probably not that representative, but a typical code path for the toy example I posted earlier goes from leaq -8(%rbp),%rax cmpq %r14,%rax -- not taken jump -- (stack overflow check passed) movq %rsi,%rbx movq $sni_info,-8(%rbp) addq $-8,%rbp testq $7,%rbx -- taken jump -- (x argument had already been forced) movq 7(%rbx),%rbx movq $snj_info,(%rbp) testq $7,%rbx -- taken jump -- (strict constructor argument is already forced) cmpq $1,7(%rbx) -- not taken jump -- (first of the case options) movl $Main_lvl_closure+1,%ebx addq $8,%rbp jmp *(%rbp) to leaq -8(%rbp),%rax cmpq %r14,%rax -- not taken jump -- (stack overflow check passed) movq %rsi,%rbx movq $sni_info,-8(%rbp) addq $-8,%rbp testq $7,%rbx -- taken jump -- (x argument has already been forced) movq 7(%rbx),%rbx cmpq $1,7(%rbx) -- not taken jump -- (first of the case options) movl $Main_lvl_closure+1,%ebx addq $8,%rbp jmp *(%rbp) which is a 22% reduction (18 to 14) in instructions executed in the entire function, or a 40% reduction (10 to 6) in instruction executed in the core of the function (i.e., after the function's argument is possibly forced). Cheers! -Tyson PS: Thanks everyone for the very informative and interesting discussion.

twhitehead:
On Thursday 16 October 2008 07:03:05 Roman Leshchinskiy wrote:
On 16/10/2008, at 21:34, Simon Peyton-Jones wrote:
BUT people who care probably UNPACK their strict fields too, which is even better. The time you can't do that is for sum types data T = MkT ![Int]
You also can't do it for polymorphic components. I've used code like:
data T a = MkT !a
foo :: T (a,b) -> a foo (MkT (x,y)) = x
Here, unpacking doesn't work but foo could still access the components of the pair directly.
This is actually the situation I was originally looking at. I just simplified it for the sake of posting readable core and assembler.
Specifically, I was looking at some of the assembler GHC was generating for some array code to see if it could do a clean enough job to be used instead of C, and was finding this sort of thing because STUArrau is defined as
data STUArray s i a = STUArray !i !i !Int (MutableByteArray# s)
FWIW, I get much nicer code with uvector (which uses type families to select monomorphic instances of things, and aggressive inlining, to yield much better code in practice). The DPH arrays library uses a similar method. So you might make some progress by taking that direction. -- Don

On Thursday 16 October 2008 14:34:01 Don Stewart wrote:
FWIW, I get much nicer code with uvector (which uses type families to select monomorphic instances of things, and aggressive inlining, to yield much better code in practice). The DPH arrays library uses a similar method.
Thanks for the hint Don! : ) I've had a look at the documentation, and I'll hopefully give it a try soon. Cheers! -Tyson PS: Is the document still up-to-date with regards to recommending via c?

twhitehead:
On Thursday 16 October 2008 14:34:01 Don Stewart wrote:
FWIW, I get much nicer code with uvector (which uses type families to select monomorphic instances of things, and aggressive inlining, to yield much better code in practice). The DPH arrays library uses a similar method.
Thanks for the hint Don! : )
I've had a look at the documentation, and I'll hopefully give it a try soon.
Cheers! -Tyson
PS: Is the document still up-to-date with regards to recommending via c?
Yep, but you can always experiment with -fasm -- Don

twhitehead:
Consider the following code
data Data = Data { unData :: !Int }
func :: Data -> Int func x = case unData x of 1 -> 2 _ -> 0
Compiling with GHC 6.8.2 gives the following stg code
Main.func = \r [x_slg] case x_slg of tpl_slx { Main.Data ipv_slj -> case ipv_slj of wild_sly { GHC.Base.I# ds_slm -> case ds_slm of ds1_slz { __DEFAULT -> Main.lvl1; 1 -> Main.lvl; }; }; };
Note that using -funbox-strict-fields helps, A.func :: A.Data -> Int A.func = \ (x_afx :: A.Data) -> case x_afx of tpl_B2 { A.Data rb_B4 -> case rb_B4 of ds_Xg7 { __DEFAULT -> A.lvl1 1 -> A.lvl } } No I#. I'd expect if 'func' was inlined, for the return to be unboxed as well. -- Don
participants (8)
-
Don Stewart
-
Jan-Willem Maessen
-
Jules Bean
-
Lennart Augustsson
-
Roman Leshchinskiy
-
Simon Marlow
-
Simon Peyton-Jones
-
Tyson Whitehead