[GHC] #9638: Speed up Data.Char.isDigit

#9638: Speed up Data.Char.isDigit -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: feature request | Status: new Priority: normal | Milestone: 7.10.1 Component: Compiler | Version: 7.9 Keywords: | Operating System: Architecture: Unknown/Multiple | Unknown/Multiple Difficulty: Easy (less than 1 | Type of failure: Runtime hour) | performance bug Blocked By: | Test Case: Related Tickets: | Blocking: | Differential Revisions: -------------------------------------+------------------------------------- `isDigit` is currently defined like this: {{{#!hs isDigit :: Char -> Bool isDigit c = c >= '0' && c <= '9' }}} This will short-circuit the right way if you're looking for digits mixed with spaces, but the wrong way if you're looking for digits mixed with letters. It also requires a conditional jump to do that short-circuiting (confirmed by inspecting the assembly). It should be better to use an unsigned comparison instead: {{{#!hs isDigit :: Char -> Bool isDigit c = (fromIntegral (ord c) :: Word) - 48 <= 9 }}} The interesting section looks like this {{{ movq 7(%rbx),%rax addq $-48,%rax cmpq $9,%rax setbe %al movzbl %al,%eax shlq $3,%rax movq ghczmprim_GHCziTypes_Bool_closure_tbl(%rax),%rbx addq $8,%rbp jmp *(%rbp) }}} or like this with -fllvm: {{{ movq 7(%rbx), %rax addq $-48, %rax cmpq $10, %rax sbbq %rax, %rax andq $8, %rax movq ghczmprim_GHCziTypes_Bool_closure_tbl(%rax), %rbx movq 8(%rbp), %rax addq $8, %rbp jmpq *%rax # TAILCALL }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9638 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9638: Speed up Data.Char.isDigit -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: feature | Status: new request | Milestone: 7.10.1 Priority: normal | Version: 7.9 Component: Compiler | Keywords: Resolution: | Architecture: Unknown/Multiple Operating System: | Difficulty: Easy (less than 1 Unknown/Multiple | hour) Type of failure: Runtime | Blocked By: performance bug | Related Tickets: Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by dfeuer): The same should probably apply to the (non-exported) `between` function in `GHC.IO.Encoding.UTF8`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9638#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9638: Speed up Data.Char.isDigit -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: feature | Status: new request | Milestone: 7.10.1 Priority: normal | Version: 7.9 Component: | Keywords: libraries/base | Architecture: Unknown/Multiple Resolution: | Difficulty: Easy (less than 1 Operating System: | hour) Unknown/Multiple | Blocked By: Type of failure: Runtime | Related Tickets: performance bug | Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Changes (by dfeuer): * cc: hvr, ekmett (added) * component: Compiler => libraries/base Comment: This approach should also combine well with ways digits are used. For instance, we could write (hopefully with better formatting) {{{#!hs digitToInt :: Char -> Int digitToInt c = case ord c - ord '0' of d | (fromIntegral d::Word) <= 9 -> d | otherwise -> case ord c - ord 'a' of e | (fromIntegral e::Word) <= 5 -> e + 10 | otherwise -> case ord c - ord 'A' of f | (fromIntegral f::Word) <= 5 -> f + 10 | otherwise = error ("Char.digitToInt: not a digit " ++ show c) }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9638#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9638: Speed up Data.Char.isDigit -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: feature | Status: new request | Milestone: 7.10.1 Priority: normal | Version: 7.9 Component: | Keywords: libraries/base | Architecture: Unknown/Multiple Resolution: | Difficulty: Easy (less than 1 Operating System: | hour) Unknown/Multiple | Blocked By: Type of failure: Runtime | Related Tickets: performance bug | Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by carter): These sound reasonable, do we have any (micro)benchmarks to validate that theres a meaningful improvement in performance to pay down the increased complexity of the implementation? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9638#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

These sound reasonable, do we have any (micro)benchmarks to validate
#9638: Speed up Data.Char.isDigit -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: feature | Status: new request | Milestone: 7.10.1 Priority: normal | Version: 7.9 Component: | Keywords: libraries/base | Architecture: Unknown/Multiple Resolution: | Difficulty: Easy (less than 1 Operating System: | hour) Unknown/Multiple | Blocked By: Type of failure: Runtime | Related Tickets: performance bug | Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by dfeuer): Replying to [comment:3 carter]: that there's a meaningful improvement in performance to pay down the increased complexity of the implementation? Yes. I tested evaluating `nf (map isDigit) testcase` with the following test cases: {{{#!hs toosmall = replicate 700 '-' toobig = replicate 700 'a' justright = replicate 700 '7' arb = "12asaa489oneuhl98'c43rub93d'i'i98car278urnatehodth98228hteou84348hcae84--- - - 424 42 hteohu 2 4hto24hth ehua942dp.dghckbaa---; 48923" arbitrary = arb ++ arb ++ reverse arb ++ reverse arb }}} As expected, the fancy code seemed to be very slightly worse for `toosmall`, certainly not worse enough to really get a proper measurement (it's only one extra addition). The rest did better with the fancy code; the difference was consistently biggest for `arbitrary`. Obviously we're not talking huge changes, but I think this function and similar ones get a good bit of use, so it's probably worth bothering. {{{ benchmarking toosmall/old time 12.31 μs (12.27 μs .. 12.36 μs) 1.000 R² (0.999 R² .. 1.000 R²) mean 12.37 μs (12.34 μs .. 12.40 μs) std dev 100.2 ns (69.41 ns .. 138.7 ns) benchmarking toosmall/new time 12.42 μs (12.41 μs .. 12.43 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 12.43 μs (12.42 μs .. 12.44 μs) std dev 36.07 ns (24.75 ns .. 59.63 ns) benchmarking toobig/old time 13.05 μs (13.04 μs .. 13.06 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 13.05 μs (13.04 μs .. 13.06 μs) std dev 26.99 ns (19.95 ns .. 43.44 ns) benchmarking toobig/new time 12.43 μs (12.42 μs .. 12.45 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 12.45 μs (12.44 μs .. 12.48 μs) std dev 65.84 ns (32.99 ns .. 129.0 ns) benchmarking justright/old time 13.07 μs (13.06 μs .. 13.08 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 13.08 μs (13.06 μs .. 13.09 μs) std dev 42.63 ns (30.33 ns .. 56.90 ns) benchmarking justright/new time 12.41 μs (12.40 μs .. 12.42 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 12.41 μs (12.41 μs .. 12.42 μs) std dev 25.29 ns (21.00 ns .. 31.20 ns) benchmarking arbitrary/old time 11.69 μs (11.68 μs .. 11.71 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 11.69 μs (11.68 μs .. 11.71 μs) std dev 50.41 ns (21.45 ns .. 85.71 ns) benchmarking arbitrary/new time 10.65 μs (10.64 μs .. 10.66 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 10.65 μs (10.64 μs .. 10.66 μs) std dev 21.31 ns (14.90 ns .. 36.62 ns) }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9638#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9638: Speed up Data.Char.isDigit -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: feature | Status: new request | Milestone: 7.10.1 Priority: normal | Version: 7.9 Component: | Keywords: libraries/base | Architecture: Unknown/Multiple Resolution: | Difficulty: Easy (less than 1 Operating System: | hour) Unknown/Multiple | Blocked By: Type of failure: Runtime | Related Tickets: performance bug | Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by schyler): +1 from me, cool idea. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9638#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9638: Speed up Data.Char.isDigit -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: feature | Status: new request | Milestone: 7.10.1 Priority: normal | Version: 7.9 Component: | Keywords: libraries/base | Architecture: Unknown/Multiple Resolution: | Difficulty: Easy (less than 1 Operating System: | hour) Unknown/Multiple | Blocked By: Type of failure: Runtime | Related Tickets: performance bug | Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by ekmett): Given the benchmarks it looks good to me. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9638#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9638: Speed up Data.Char.isDigit -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: feature | Status: new request | Milestone: 7.10.1 Priority: normal | Version: 7.9 Component: | Keywords: libraries/base | Architecture: Unknown/Multiple Resolution: | Difficulty: Easy (less than 1 Operating System: | hour) Unknown/Multiple | Blocked By: Type of failure: Runtime | Related Tickets: performance bug | Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by hvr): Well, fire up a code-revision over at Phab:differential :-) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9638#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9638: Speed up Data.Char.isDigit -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: ekmett Type: feature request | Status: new Priority: normal | Milestone: 7.12.1 Component: Core Libraries | Version: 7.9 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Changes (by thomie): * cc: core-libraries-committee@… (added) Comment: dfeuer, is 31571270625a690410b794b7cfe48d866c084e74/ghc sufficient to close this ticket or is there more to be done here? {{{ Author: David Feuer <> Date: Tue Oct 21 15:01:14 2014 -0500 Improve isDigit, isSpace, etc. Summary: Make things less branchy; use unsigned comparisons for range checking. Eliminate non-spaces more quickly in common cases in isSpace. Reviewers: ekmett, carter, austin Reviewed By: austin Subscribers: thomie, carter, ezyang, simonmar Differential Revision: https://phabricator.haskell.org/D340 GHC Trac Issues: #1473 }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9638#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9638: Speed up Data.Char.isDigit -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: feature request | Status: closed Priority: normal | Milestone: 8.0.1 Component: Core Libraries | Version: 7.9 Resolution: fixed | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #11382 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * status: new => closed * resolution: => fixed * related: => #11382 Comment: It seems that David's commit has addressed the principle complaint of this ticket. We can track further improvements in #11382. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9638#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9638: Speed up Data.Char.isDigit -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: feature request | Status: closed Priority: normal | Milestone: Component: Core Libraries | Version: 7.9 Resolution: fixed | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Runtime | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: #11382 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by thomie): * milestone: 8.0.1 => -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9638#comment:14 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC