
#9246: GHC generates poor code for repeated uses of min/max ------------------------------+-------------------------------------------- Reporter: arotenberg | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Keywords: | Operating System: Windows Architecture: x86_64 | Type of failure: Runtime performance bug (amd64) | Test Case: Difficulty: Unknown | Blocking: Blocked By: | Related Tickets: #6135 | ------------------------------+-------------------------------------------- Consider the following module, which intends to implement a [http://tavianator.com/2011/05/fast-branchless-raybounding-box- intersections/ branchless ray-AABB intersection test]: {{{ module SimpleGeom where data Vec3 = Vec3 {-# UNPACK #-} !Double {-# UNPACK #-} !Double {-# UNPACK #-} !Double data Ray = Ray !Vec3 !Vec3 !Vec3 data AABB = AABB !Vec3 !Vec3 testRayAABBIntersection :: Ray -> AABB -> Bool testRayAABBIntersection (Ray (Vec3 ox oy oz) _ (Vec3 invDx invDy invDz)) (AABB (Vec3 minX minY minZ) (Vec3 maxX maxY maxZ)) = let tx1 = (minX - ox) * invDx tx2 = (maxX - ox) * invDx ty1 = (minY - oy) * invDy ty2 = (maxY - oy) * invDy tz1 = (minZ - oz) * invDz tz2 = (maxZ - oz) * invDz tmin = min tx1 tx2 `max` min ty1 ty2 `max` min tz1 tz2 tmax = max tx1 tx2 `min` max ty1 ty2 `min` max tz1 tz2 in tmax >= max 0 tmin }}} Everything is strict primitive operations, so GHC should generate very simple, fast code, right? But upon compiling with {{{ghc -O -ddump-simpl -ddump-to-file SimpleGeom}}}, I found a mess of nested local lambdas and similar performance-killing expression forms. (See the attached output file.) There are two possible issues I can see here. 1. GHC is trying to expand out all of the branches recursively (I would presume via case-of-cases transformation), which is a bad idea in this instance compared to just performing the cases sequentially and storing their results. 1. GHC is generating branches for floating-point min/max. Instruction sets like SSE2 include non-branching floating-point min/max instructions, which is exactly what this algorithm was designed to exploit, but GHC does not generate code that could take advantage of these instructions. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9246 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler