
#9246: GHC generates poor code for repeated uses of min/max --------------------------------------------+------------------------------ Reporter: arotenberg | Owner: Type: feature request | Status: new Priority: normal | Milestone: 7.10.1 Component: Compiler | Version: 7.8.2 Resolution: | Keywords: Operating System: Windows | Architecture: x86_64 Type of failure: Runtime performance bug | (amd64) Test Case: | Difficulty: Unknown Blocking: | Blocked By: | Related Tickets: #6135 --------------------------------------------+------------------------------ Comment (by arotenberg): I tried using criterion to profile different versions of this function, but I couldn't get the results to not be noisy and "microbenchmarky". The way I originally found this issue was by profiling the raytracer program I'm writing, which fingered this function as both the single largest time sink and the largest allocator. I tried replacing the definition with this variant: {{{ testRayAABBIntersection :: Ray -> AABB -> Bool testRayAABBIntersection (Ray (Vec3 (D# ox) (D# oy) (D# oz)) _ (Vec3 (D# invDx) (D# invDy) (D# invDz))) (AABB (Vec3 (D# minX) (D# minY) (D# minZ)) (Vec3 (D# maxX) (D# maxY) (D# maxZ))) = let tx1 = minusTimes## minX ox invDx tx2 = minusTimes## maxX ox invDx ty1 = minusTimes## minY oy invDy ty2 = minusTimes## maxY oy invDy tz1 = minusTimes## minZ oz invDz tz2 = minusTimes## maxZ oz invDz tmin = min## tx1 tx2 `max##` min## ty1 ty2 `max##` min## tz1 tz2 tmax = max## tx1 tx2 `min##` max## ty1 ty2 `min##` max## tz1 tz2 in isTrue# (tmax >=## max## 0.0## tmin) {-# NOINLINE minusTimes## #-} minusTimes## :: Double# -> Double# -> Double# -> Double# minusTimes## minA oa invDa = (minA -## oa) *## invDa {-# NOINLINE min## #-} {-# NOINLINE max## #-} min##, max## :: Double# -> Double# -> Double# max## x y = if isTrue# (x <=## y) then y else x min## x y = if isTrue# (x <=## y) then x else y }}} This gets the function's heap allocation down to the expected zero bytes, but at the cost of actually making the program slower (both with and without profiling)! I haven't tried comparing results with the LLVM backend yet. I might look into it later today. It would be nice to see non-branching min/max implemented for the primitive numeric types where available. Min/max for integer types can probably be implemented efficiently on many platforms using conditional move instructions. I'm not particularly interested in implementing it myself, but I'm happy with having it as a feature request. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9246#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler