[GHC] #10124: Simple case analyses generate too many branches

#10124: Simple case analyses generate too many branches -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.4 Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: None/Unknown Unknown/Multiple | Blocked By: Test Case: | Related Tickets: Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Take the following example, {{{#!hs f :: Int -> Bool f a = case a of 1 -> True 5 -> True 8 -> True 9 -> True 11 -> True 19 -> True _ -> False {-# NOINLINE f #-} main = print $ f 8 }}} This gets lowered to the following Core (by GHC 7.8), {{{#!hs f f = \ a_s3EH -> case a_s3EH of _ { I# ds_s3EJ -> case ds_s3EJ of _ { __DEFAULT -> False; 1 -> True; 5 -> True; 8 -> True; 9 -> True; 11 -> True; 19 -> True } } }}} I have expected GHC to lower this into a nice string of logical operations, with perhaps a couple of branches at the end to determine the result. Unfortunately, this is not what happens. Instead the C-- is a sea of branches, {{{ c3F7: _s3EK::I64 = I64[R1 + 7]; if (%MO_S_Lt_W64(_s3EK::I64, 9)) goto c3Fz; else goto c3FA; c3Fz: if (%MO_S_Lt_W64(_s3EK::I64, 5)) goto c3Fq; else goto c3Fr; c3Fq: if (_s3EK::I64 != 1) goto c3Ff; else goto c3Fg; c3Fr: if (%MO_S_Lt_W64(_s3EK::I64, 8)) goto c3Fn; else goto c3Fo; c3Fn: if (_s3EK::I64 != 5) goto c3Ff; else goto c3Fg; c3Fo: if (_s3EK::I64 != 8) goto c3Ff; else goto c3Fg; c3FA: if (%MO_S_Lt_W64(_s3EK::I64, 11)) goto c3Fw; else goto c3Fx; c3Fw: if (_s3EK::I64 != 9) goto c3Ff; else goto c3Fg; c3Fx: if (%MO_S_Lt_W64(_s3EK::I64, 19)) goto c3Ft; else goto c3Fu; c3Ft: if (_s3EK::I64 != 11) goto c3Ff; else goto c3Fg; c3Fu: if (_s3EK::I64 != 19) goto c3Ff; else goto c3Fg; c3Ff: R1 = False_closure+1; Sp = Sp + 8; call (P64[Sp])(R1) args: 8, res: 0, upd: 8; c3Fg: R1 = True_closure+2; Sp = Sp + 8; call (P64[Sp])(R1) args: 8, res: 0, upd: 8; ... }}} Which gets turned into the branchy assembler that you would expect. To my surprise even the LLVM backend isn't able to bring this back into a branchless form. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10124 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10124: Simple case analyses generate too many branches -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.4 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Changes (by bgamari): * failure: None/Unknown => Runtime performance bug -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10124#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10124: Simple case analyses generate too many branches -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.4 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #6135 | Differential Revisions: -------------------------------------+------------------------------------- Changes (by bgamari): * related: => #6135 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10124#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10124: Simple case analyses generate too many branches -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.4 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #6135 | Differential Revisions: -------------------------------------+------------------------------------- Comment (by simonpj): What did you mean by "a nice string of logical operations with a couple of branches"? What code do you want from this case expression? Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10124#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10124: Simple case analyses generate too many branches -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.4 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #6135 | Differential Revisions: -------------------------------------+------------------------------------- Comment (by bgamari): I would hope that the example above would get lowered to something like the following, {{{#!hs let !p = a ==# 1 `orI#` a ==# 5 `orI#` a ==# 8 `orI#` a ==# 9 `orI#` a ==# 11 `orI#` a ==# 19 case p of 1 -> do_something 0 -> do_something_else }}} The idea is that I'd want to avoid emitting branches in this case. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10124#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10124: Simple case analyses generate too many branches -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.4 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #6135, #9661 | Differential Revisions: -------------------------------------+------------------------------------- Changes (by jstolarek): * related: #6135 => #6135, #9661 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10124#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10124: Simple case analyses generate too many branches -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.4 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #6135, #9661 | Differential Revisions: -------------------------------------+------------------------------------- Comment (by simonpj): Ah, I see. So maybe you want a rule like this: {{{ isTrue# a || isTrue# b = isTrue# (a `orI#` b) }}} NB that the `(==)` method for `Int` says {{{ (I# x) `eqInt` (I# y) = isTrue# (x ==# y) }}} You'd have delay inlining `(||)` a bit, but you could do that. Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10124#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10124: Simple case analyses generate too many branches -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.4 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #6135, #9661 | Differential Revisions: -------------------------------------+------------------------------------- Comment (by nomeata): Would could go further and expect {{{ f = \ a_s3EH -> case a_s3EH of _ { I# ds_s3EJ -> case ds_s3EJ of _ { __DEFAULT -> x; 1 -> y; 5 -> y; 8 -> y; 9 -> y; 11 -> y; 19 -> y } } }}} to be replaced by {{{ f = \ a_s3EH -> case a_s3EH of _ { I# ds_s3EJ -> let p! = some_branchless_formula_involving ds_s3EJ case p of _ { __DEFAULT -> x; 1 -> y; } } }}} where `some_branchless_formula_involving` could be any expression that is `1` for `1,5,8,9,11,19` – whether it is a disjunction of equalities or some fancy bit-fiddling magic. Even more general, how about turning {{{ f = \ a_s3EH -> case a_s3EH of _ { I# ds_s3EJ -> case ds_s3EJ of _ { __DEFAULT -> x; 1 -> y; 5 -> y; 8 -> z; 9 -> y; 11 -> z; 19 -> z } } }}} into {{{ f = \ a_s3EH -> case a_s3EH of _ { I# ds_s3EJ -> let p! = some_branchless_formula_involving ds_s3EJ case p of _ { __DEFAULT -> x; 1 -> y; 2 -> z; } } }}} This would generate one branch per unique right-hand-side of a `->`, instead of one branch per literal matched against. Not sure if Core is the right place for this, though – it feels more like instruction selection in the code generator. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10124#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10124: Simple case analyses generate too many branches -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.4 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #6135, #9661 | Differential Revisions: -------------------------------------+------------------------------------- Comment (by simonpj): Well Joachim's stuff depends on spotting common RHSs. I'd rather avoid producing them in the first place. But one could imagine doing both. Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10124#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10124: Simple case analyses generate too many branches -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.4 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #6135, #9661 | Differential Revisions: -------------------------------------+------------------------------------- Comment (by bgamari): nomeata, indeed LLVM transforms the C analogue to the example given here into only 9 instructions. I'd agree that it seems better to avoid breaking up the case expression to begin with. This weekend I focused on giving the user the ability to write the branchless implementation explicitly by fixing #9661. It would be nice to have a story for how we could transform cases into branch-less form without user intervention, although I'm not sure how to know when this transform will pay off. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10124#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10124: Simple case analyses generate too many branches -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.4 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #6135, #9661 | Differential Revisions: -------------------------------------+------------------------------------- Comment (by nomeata):
I'd agree that it seems better to avoid breaking up the case expression to begin with.
Not sure if that helps, after all, we want to have good code even when the user writes a case statement to begin with: {{{ myIsSpace ' ' = True myIsSpace '\n' = True myIsSpace '\t' = True myIsSpace _ = False }}} I would expect the compiler to (try to) create optimal code from this specification, whether it’s a linear list of condition, an if-then-else- tree or a jump table. I looked at the CG, which even has a pass to unify the common branches, but we need CMM first, so we have to generate some assembly. We currently unconditionally generate an if-then-else tree there, so it is hard to rewrite that into something smarter. Maybe the way forward would be to implement this todo from 11 years ago: {{{ emitCmmLitSwitch :: CmmExpr -- Tag to switch on -> [(Literal, CmmAGraphScoped)] -- Tagged branches -> CmmAGraphScoped -- Default branch (always) -> FCode () -- Emit the code -- Used for general literals, whose size might not be a word, -- where there is always a default case, and where we don't know -- the range of values for certain. For simplicity we always generate a tree. -- -- ToDo: for integers we could do better here, perhaps by generalising -- mk_switch and using that. --SDM 15/09/2004 }}} For `emitCmmSwitch` there is already a case that generates a `CmmSwitch` statement, used when compiling to C. So we might want to try to create a `switch` statement also in `emitCmmLitSwitch` when compiling via LLVM, to leave it to the compiler. If we emit a switch statement in `emitCmmLitSwitch` unconditionally, then we can run the common block transformation first, and later, in a separate step replace the `CmmSwitch` by a tree of if-then-else statements, or by some branchless code, or whatever looks best by then. For LLVM, we might simply leave it as a switch statement. In order to do so, the type of `CmmSwitch` would have to support sparse switch statements better – it currently takes a `[Maybe Label]`. (Disclaimer: I don’t know much about code generation. If I’m not helpful here, just tell me :-)) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10124#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10124: Simple case analyses generate too many branches -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.4 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #6135, #9661 | Differential Revisions: -------------------------------------+------------------------------------- Comment (by simonpj): I'm all for doing both! * Try to avoid duplication in the first place (comment:6) * Recover sharing even if duplication happens, or the programmer wrote it (comment:10) Re comment:10, one could do it in Cmm, but I suspect it'd be better done in Core. Could be a variant of `Note [Combine identical alternatives]` in `SimplUtils`. Over to you, Ben. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10124#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10124: Simple case analyses generate too many branches -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.4 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #6135, | Differential Revisions: #9661,#10137 | -------------------------------------+------------------------------------- Changes (by nomeata): * related: #6135, #9661 => #6135, #9661,#10137 Comment: I believe that at least some of this can and should be handled in the code generator, where the code generation for branching on integers can be improved anyways. I describe a possible plan in #10137. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10124#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10124: Simple case analyses generate too many branches
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.8.4
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
Type of failure: Runtime | Unknown/Multiple
performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: #6135, | Differential Revisions:
#9661,#10137 |
-------------------------------------+-------------------------------------
Comment (by Joachim Breitner

#10124: Simple case analyses generate too many branches -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.4 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #6135, | Differential Revisions: #9661,#10137 | -------------------------------------+------------------------------------- Comment (by nomeata): With the above refactoring in place, it should be easier now to implement this on the Cmm level. I had a quick look, but I’m a bit lost with Cmm here (conditionals vs. boolean ops etc.). If someone can tell me what the correct Cmm code should be for, say, the example from comment:10, that’d be helpful. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10124#comment:14 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10124: Simple case analyses generate too many branches -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.4 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #6135, | Differential Revisions: #9661,#10137 | -------------------------------------+------------------------------------- Comment (by nomeata): Hmm, given the example from comment:10, we currently generate this code: {{{ block_c3S3_info: _c3S3: movq 7(%rbx),%rax cmpq $10,%rax jb _u3Si _u3Sj: cmpq $11,%rax jb _c3Sh _u3Sk: cmpq $32,%rax jne _c3Se _c3Sh: movl $GHC.Types.True_closure+2,%ebx addq $8,%rbp jmp *(%rbp) _c3S7: movl $myIsSpace_rkB_closure,%ebx jmp *-8(%r13) _c3Se: movl $GHC.Types.False_closure+1,%ebx addq $8,%rbp jmp *(%rbp) _u3Si: cmpq $9,%rax jb _c3Se jmp _c3Sh }}} while my code now generates {{{ _c3S3: movq 7(%rbx),%rax cmpq $32,%rax setne %bl movzbl %bl,%ebx cmpq $10,%rax setne %cl movzbl %cl,%ecx andq %rbx,%rcx cmpq $9,%rax setne %al movzbl %al,%eax andq %rcx,%rax testq %rax,%rax jne _c3Se _c3Sh: movl $GHC.Types.True_closure+2,%ebx addq $8,%rbp jmp *(%rbp) _c3S7: movl $myIsSpace_rkB_closure,%ebx jmp *-8(%r13) _c3Se: movl $GHC.Types.False_closure+1,%ebx addq $8,%rbp jmp *(%rbp) .size myIsSpace_rkB_info, .-myIsSpace_rkB_info }}} But not even this simple microbenchmark {{{ main :: IO () main = x `seq` return () where x = length $ filter myIsSpace $ concatMap (replicate 100000000) $ ['\001'..'z'] }}} shows a change in performance. bgamari, is that expected? Is it just that the assembly (generated from {{{ if ((_s3Rr::I64 != 9) & (_s3Rr::I64 != 10) & (_s3Rr::I64 != 32) != 0) goto c3Se; else goto c3Sh; }}} is too bad, or is this not a test case where this will help a lot? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10124#comment:15 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10124: Simple case analyses generate too many branches -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.4 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #6135, | Differential Revisions: #9661,#10137 | -------------------------------------+------------------------------------- Comment (by rwbarton): This benchmark does quite a lot of allocation, unless that changes in HEAD (I tested with 7.10.1). It would be better for microbenchmarking purposes to avoid all the allocation. Anyways I would expect the original to be a bit faster for this particular microbenchmark, since the branches will all be predicted correctly and a predicted branch is basically free. It might be a different matter with `concat (replicate 100000000 ['\001'..'z'])`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10124#comment:16 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10124: Simple case analyses generate too many branches -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.4 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #6135, | Differential Revisions: #9661,#10137 | -------------------------------------+------------------------------------- Comment (by rwbarton): Oh, I needed a `{-# NOINLINE myIsSpace #-}` to avoid the allocation, interesting... -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10124#comment:17 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10124: Simple case analyses generate too many branches -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.4 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #6135, | Differential Revisions: #9661,#10137 | -------------------------------------+------------------------------------- Comment (by rwbarton): That branchless assembly is certainly not the greatest though it doesn't look too terrible to me either. What happens if you use the LLVM backend? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10124#comment:18 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10124: Simple case analyses generate too many branches -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.4 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #6135, | Differential Revisions: #9661,#10137 | -------------------------------------+------------------------------------- Comment (by nomeata):
It might be a different matter with concat (replicate 100000000 ['\001'..'z']).
Eh, that’s of course the code I intended, thanks for noticing! ... unfortunately, not even that code gets faster: {{{ $ python -m timeit -s 'import os' 'os.system("./IsSpace-before")' 10 loops, best of 3: 4.9 sec per loop $ python -m timeit -s 'import os' 'os.system("./IsSpace-after")' 10 loops, best of 3: 4.9 sec per loop }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10124#comment:19 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10124: Simple case analyses generate too many branches -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.4 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #6135, | Differential Revisions: #9661,#10137 | -------------------------------------+------------------------------------- Comment (by nomeata): I think these numbers might be wrong due to mistakes on my side. Now I get {{{ 1 loops, best of 10: 27.1 sec per loop 1 loops, best of 10: 41.5 sec per loop }}} so the branchless code is actually much more expensive. (This is testing `length $ filter myIsSpace $ concatMap (const ['\001'..'z']) $ [0..50000000::Int]` which produces better code than `replicate`.) Also in light of Sven Pannes’s valuable comments (https://mail.haskell.org/pipermail/ghc-devs/2015-April/008858.html) I’m inclined to abandon this idea. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10124#comment:20 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC