
#15176: Superclass `Monad m =>` makes program run 100 times slower
-------------------------------------+-------------------------------------
        Reporter:  danilo2           |                Owner:  osa1
            Type:  bug               |               Status:  new
        Priority:  highest           |            Milestone:  8.8.1
       Component:  Compiler          |              Version:  8.4.2
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Runtime           |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by osa1):

I managed to reproduce this. The original instructions no longer work (the
git repo disappeared, the branch doesn't exist in the new repo, etc.), so
here is what I did to reproduce:

- Clone https://github.com/luna/luna.git
- Run benchmark: `stack bench luna-core`
- Apply this patch:

{{{
diff --git a/core/src/Data/Graph/Fold/Layer.hs b/core/src/Data/Graph/Fold/Layer.hs
index 28d6c6cd..45150ff1 100644
--- a/core/src/Data/Graph/Fold/Layer.hs
+++ b/core/src/Data/Graph/Fold/Layer.hs
@@ -141,7 +141,7 @@ instance Monad m => Fold.Builder (Scoped s) m (SmallVectorA t alloc n a)

 -- === FoldableLayers === --

-class LayersFoldableBuilder__ t (layers :: [Type]) m where
+class Monad m => LayersFoldableBuilder__ t (layers :: [Type]) m where
     buildLayersFold__ :: SomePtr -> m (Fold.Result t) -> m (Fold.Result t)

 instance Monad m => LayersFoldableBuilder__ t '[] m where
diff --git a/core/src/Data/Graph/Fold/LayerMap.hs b/core/src/Data/Graph/Fold/LayerMap.hs
index 4b12bbf6..e5e54e45 100644
--- a/core/src/Data/Graph/Fold/LayerMap.hs
+++ b/core/src/Data/Graph/Fold/LayerMap.hs
@@ -117,7 +117,7 @@ instance Monad m => Fold.Builder (Scoped s) m (SmallVectorA t alloc n a)

 -- === FoldableLayers === --

-class LayersFoldableBuilder__ t (layers :: [Type]) m where
+class Monad m => LayersFoldableBuilder__ t (layers :: [Type]) m where
     buildLayersFold__ :: SomePtr -> m (Fold.Result t) -> m (Fold.Result t)

 instance Monad m => LayersFoldableBuilder__ t '[] m where
diff --git a/core/src/Data/Graph/Fold/Scoped.hs b/core/src/Data/Graph/Fold/Scoped.hs
index 2fade0f3..2e6b51df 100644
--- a/core/src/Data/Graph/Fold/Scoped.hs
+++ b/core/src/Data/Graph/Fold/Scoped.hs
@@ -131,7 +131,7 @@ instance Monad m => Fold.Builder (Scoped t) m (SmallVectorA s alloc n a)

 -- === FoldableLayers === --

-class LayersFoldableBuilder__ t (layers :: [Type]) m where
+class Monad m => LayersFoldableBuilder__ t (layers :: [Type]) m where
     buildLayersFold__ :: SomePtr -> m (Fold.Result t) -> m (Fold.Result t)

 instance Monad m => LayersFoldableBuilder__ t '[] m where
diff --git a/core/src/Data/Graph/Fold/ScopedMap.hs b/core/src/Data/Graph/Fold/ScopedMap.hs
index 217c55a6..4a3d34c8 100644
--- a/core/src/Data/Graph/Fold/ScopedMap.hs
+++ b/core/src/Data/Graph/Fold/ScopedMap.hs
@@ -129,7 +129,7 @@ instance Monad m => Fold.Builder (ScopedMap s) m (SmallVectorA t alloc n a)

 -- === FoldableLayers === --

-class LayersFoldableBuilder__ t (layers :: [Type]) m where
+class Monad m => LayersFoldableBuilder__ t (layers :: [Type]) m where
     buildLayersFold__ :: SomePtr -> m (Fold.Result t) -> m (Fold.Result t)

 instance Monad m => LayersFoldableBuilder__ t '[] m where
}}}

- Run benchmarks again

Most of the benchmarks are not affected, but three benchmarks are affected
quite significantly by this change:

Before the patch:

{{{
benchmarking ir/discovery/generic/10e6
time                 61.47 ms   (61.14 ms .. 61.80 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 61.02 ms   (60.79 ms .. 61.20 ms)
std dev              367.5 μs   (224.7 μs .. 582.3 μs)

benchmarking ir/discovery/partitions single var/10e6
time                 93.94 ms   (93.22 ms .. 94.75 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 93.57 ms   (92.95 ms .. 93.94 ms)
std dev              746.7 μs   (377.0 μs .. 1.245 ms)

benchmarking ir/discovery/partitions unify/10e6
time                 518.7 ms   (508.2 ms .. 523.9 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 515.6 ms   (512.3 ms .. 516.9 ms)
std dev              2.350 ms   (717.7 μs .. 3.196 ms)
variance introduced by outliers: 19% (moderately inflated)
}}}

After the patch:

{{{
benchmarking ir/discovery/generic/10e6
time                 1.309 s    (1.283 s .. 1.326 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.320 s    (1.312 s .. 1.334 s)
std dev              13.24 ms   (767.0 μs .. 16.27 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking ir/discovery/partitions single var/10e6
time                 1.355 s    (1.351 s .. 1.359 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.357 s    (1.356 s .. 1.359 s)
std dev              1.415 ms   (1.209 ms .. 1.452 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking ir/discovery/partitions unify/10e6
time                 5.459 s    (5.438 s .. 5.501 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 5.444 s    (5.435 s .. 5.452 s)
std dev              11.24 ms   (7.336 ms .. 13.71 ms)
variance introduced by outliers: 19% (moderately inflated)
}}}

Summary:

- ir/discovery/generic/10e6: 21x increase
- ir/discovery/partitions single var/10e6: 14x increase
- ir/discovery/partitions unify/10e6: 10x increase

No idea why yet.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15176#comment:7
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
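
The slowdown factors in the summary can be re-derived from the criterion `time` estimates quoted in the comment. The following is only a small sketch of that arithmetic, not part of the original report: the `Timing` type and the `timings` and `slowdown` names are made up for illustration, and the numbers are copied by hand from the benchmark output, converted to milliseconds.

```haskell
module Main (main) where

import Text.Printf (printf)

-- | One benchmark with its `time` estimate before and after the patch.
-- Hypothetical helper type; both fields are in milliseconds.
data Timing = Timing
  { benchName :: String
  , beforeMs  :: Double
  , afterMs   :: Double
  }

-- | Point estimates copied from the criterion output in the comment.
timings :: [Timing]
timings =
  [ Timing "ir/discovery/generic/10e6"               61.47 1309
  , Timing "ir/discovery/partitions single var/10e6" 93.94 1355
  , Timing "ir/discovery/partitions unify/10e6"      518.7 5459
  ]

-- | How many times longer the benchmark takes with the patch applied.
slowdown :: Timing -> Double
slowdown t = afterMs t / beforeMs t

main :: IO ()
main = mapM_ report timings
  where
    -- Print the benchmark name and its slowdown factor to one decimal place.
    report t = printf "%-42s %.1fx\n" (benchName t) (slowdown t)
```

Run with `runghc`, this gives factors of roughly 21.3x, 14.4x and 10.5x, which match the summary figures once rounded down. Using the `mean` estimates instead of `time` would shift the numbers slightly but not the overall picture.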