#14566: LiberateCase improvements
-------------------------------------+-------------------------------------
Reporter: simonpj | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 8.2.1
Keywords: | Operating System: Unknown/Multiple
Architecture: Unknown/Multiple | Type of failure: None/Unknown
Test Case: | Blocked By:
Blocking: | Related Tickets:
Differential Rev(s): | Wiki Page:
-------------------------------------+-------------------------------------
I was looking at the result of compiling `base:Data/Typeable/Internal`.
It has a fairly large top-level recursive group of functions, involving
`mkTrApp`. In doing so I noticed that `LiberateCase` was duplicating the
entire top-level blob at a call of a function that was actually just a
call to `error`. Totally nuts.
This ticket tracks the problem.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/14566>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
#14567: OccAnal loop-breaker scoring for NOINLINE things
-------------------------------------+-------------------------------------
Reporter: simonpj | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 8.2.1
Keywords: | Operating System: Unknown/Multiple
Architecture: Unknown/Multiple | Type of failure: None/Unknown
Test Case: | Blocked By:
Blocking: | Related Tickets:
Differential Rev(s): | Wiki Page:
-------------------------------------+-------------------------------------
When working on #14566 I found a case where the occurrence analyser
generated
{{{
Rec { {-# NOINLINE f #-}
f = e1[g]
; g {-# LOOPBREAKER #-} = e2[f] }
}}}
That is, even though `f` is marked `NOINLINE` we chose `g` to be the loop
breaker. Stupid! Since `f` is marked `NOINLINE` it will never be inlined
anyway, so it would be a much better loop breaker.
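The preference described above can be sketched as a toy scoring function. This is only an illustration under stated assumptions, not the occurrence analyser's actual code: the `Binding` record, the `score` function, the size threshold of 20 and `chooseLoopBreaker` are all hypothetical names and values made up for this example.

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)

-- Hypothetical, simplified model of one binding in a recursive group.
data Binding = Binding
  { bndrName :: String
  , noInline :: Bool  -- True if the binding is marked NOINLINE
  , rhsSize  :: Int   -- rough size of the right-hand side
  }

-- Lower score = better loop-breaker candidate (assumed convention).
-- A NOINLINE binding scores lowest: it will not be inlined anyway,
-- so nothing is lost by breaking the loop there.
score :: Binding -> Int
score b
  | noInline b     = 0
  | rhsSize b > 20 = 1  -- large right-hand side: inlining is less valuable
  | otherwise      = 2  -- small right-hand side: we would like to inline it

-- Pick the binding with the lowest score, if the group is non-empty.
chooseLoopBreaker :: [Binding] -> Maybe String
chooseLoopBreaker [] = Nothing
chooseLoopBreaker bs = Just (bndrName (minimumBy (comparing score) bs))
```

With a group such as `[Binding "f" True 50, Binding "g" False 5]`, this sketch picks `f`, which leaves `g` free to be inlined.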
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/14567>
#14565: Memory leak switching from -O1 to -O2
-------------------------------------+-------------------------------------
Reporter: dbeacham | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 8.2.2
Keywords: | Operating System: Unknown/Multiple
Architecture: x86_64 (amd64) | Type of failure: Runtime performance bug
Test Case: | Blocked By:
Blocking: | Related Tickets: #14379
Differential Rev(s): | Wiki Page:
-------------------------------------+-------------------------------------
GHC seems to get stuck at the SpecConstr stage when compiling. CPU and
memory usage then skyrocket.
I've managed to get the example down to what appears to be minimal, but
any of several minor changes makes the compilation go through quickly:
* changing `toIdx` to `const 0`
* removing one of the `V.forM_` layers.
* using "-fno-spec-constr" (unsurprisingly).
I can't reproduce it on 8.0.2.
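The `-fno-spec-constr` workaround from the list above does not have to be passed on the command line; it can be limited to the affected module with an `OPTIONS_GHC` pragma. A minimal sketch follows. The function `sumTo` is a hypothetical stand-in and is not the code from this ticket.

```haskell
{-# OPTIONS_GHC -fno-spec-constr #-}
-- The pragma above disables SpecConstr for this module only; other
-- modules are still compiled with whatever the command line asks for.

-- Hypothetical stand-in for the real code: a small accumulating loop.
sumTo :: Int -> Int
sumTo n = go 0 1
  where
    go acc i
      | i > n     = acc
      | otherwise = go (acc + i) (i + 1)
```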
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/14565>
#8457: -ffull-laziness does more harm than good
------------------------------------+-------------------------------------
Reporter: errge | Owner:
Type: bug | Status: new
Priority: high | Milestone: 7.8.1
Component: Compiler | Version: 7.7
Keywords: | Operating System: Unknown/Multiple
Architecture: Unknown/Multiple | Type of failure: None/Unknown
Difficulty: Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: |
------------------------------------+-------------------------------------
In this bug report I'd like to argue that `-ffull-laziness` shouldn't
be turned on automatically with either `-O` or `-O2`, because it's
dangerous and can cause serious memory leaks which are hard to debug
or prevent. I'll also try to show that its optimization benefits are
negligible. Actually, my benchmarks show that it's beneficial to turn
it off even in the cases where we don't hit a space leak.
We ran into this issue last week, but it had been reported several times
before: e.g. #917 and #5262.
A typical example is the following:
{{{
#!haskell
main :: IO ()
main = task () >> task ()
task :: () -> IO ()
task () = printvalues [1..1000000 :: Int]
printvalues :: [Int] -> IO ()
printvalues (x:xs) = print x >> printvalues xs
printvalues [] = return ()
}}}
We succeed with `-O0`, but fail with `-O`:
{{{
errge@curry:~/tmp $ ~/tmp/ghc/inplace/bin/ghc-stage2 -v0 -O0 -fforce-recomp lazy && ./lazy +RTS -t >/dev/null
<<ghc: 1620098744 bytes, 3117 GCs, 32265/42580 avg/max bytes residency (3 samples), 2M in use, 0.00 INIT (0.00 elapsed), 1.28 MUT (1.28 elapsed), 0.02 GC (0.02 elapsed) :ghc>>
errge@curry:~/tmp $ ~/tmp/ghc/inplace/bin/ghc-stage2 -v0 -O -fforce-recomp lazy && ./lazy +RTS -t >/dev/null
<<ghc: 1444098612 bytes, 2761 GCs, 3812497/13044272 avg/max bytes residency (7 samples), 28M in use, 0.00 INIT (0.00 elapsed), 1.02 MUT (1.03 elapsed), 0.12 GC (0.12 elapsed) :ghc>>
}}}
28M? What the leak!? Well, it's `-ffull-laziness`:
{{{
errge@curry:~/tmp $ ~/tmp/ghc/inplace/bin/ghc-stage2 -v0 -O -fno-full-laziness -fforce-recomp lazy && ./lazy +RTS -t >/dev/null
<<ghc: 1484098612 bytes, 2835 GCs, 34812/42580 avg/max bytes residency (2 samples), 1M in use, 0.00 INIT (0.00 elapsed), 1.04 MUT (1.04 elapsed), 0.02 GC (0.02 elapsed) :ghc>>
}}}
We get constant space and the fastest run-time too, since we spare
some cycles on GC.
Note that in this instance we are trying to explicitly disable sharing
by using `()` as a fake argument for the function. Also note that
this function may easily be a utility function in a larger code base
or in a library, so it's impractical to say that you shouldn't
use it twice "too close together".
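To make the leak easier to see, here is a hand-written approximation of what the program looks like once the constant list has been floated out of `task`. It is a sketch of the effect, not actual compiler output, and `values` is a hypothetical name for the floated binding.

```haskell
-- The list does not depend on the () argument, so full laziness can lift
-- it to a top-level value that is evaluated once and shared by every call.
values :: [Int]
values = [1..1000000]

-- `task` still takes its fake argument, but the argument no longer
-- controls when the list is built.
task :: () -> IO ()
task () = printvalues values

printvalues :: [Int] -> IO ()
printvalues (x:xs) = print x >> printvalues xs
printvalues []     = return ()

-- main is unchanged from the original program:
--   main = task () >> task ()
```

Because the second `task ()` still refers to `values`, the cells produced during the first traversal cannot be collected, and the whole list ends up resident in memory.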
Quoting from the GHC user guide:
{{{
-O2:
Means: “Apply every non-dangerous optimisation, even if it means
significantly longer compile times.”
The avoided “dangerous” optimisations are those that can make
runtime or space worse if you're unlucky. They are normally turned
on or off individually.
At the moment, -O2 is unlikely to produce better code than -O.
}}}
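This seems to be false at the moment: `-ffull-laziness` is switched on by
`-O` and `-O2`, yet it can make space usage much worse, as the example
above shows.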
We decided to make a broader investigation into this issue and wanted
to know if we can disable this optimization without too much pain.
We came up with this benchmark plan:
- let's benchmark GHC,
- compile all stages with -O, but hack the stage1 compiler to
emit `-t` statistics for every file compiled,
- gather these statistics while compiling the libraries and the
stage2 compiler.
On the second run we compile the stage1 compiler with
`-O -fno-full-laziness`, but leave everything else unchanged in the
environment.
When we have the results of both compilations of ~1600 files, we match
them up and compute the (logarithmic) ratio of CPU time and memory use
between the two compilations; these ratios are the final results of our
benchmark.
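The matching step can be sketched as follows. This is not the script we actually used (that one is in the repository linked below); `matchRuns` and the sample file names are hypothetical, and a plain association list is used only to keep the example dependency-free.

```haskell
-- Pair up the measurements of two runs by file name. Files that appear
-- in only one of the runs are dropped. The lookup is linear, which is
-- acceptable for roughly 1600 files.
matchRuns :: [(FilePath, a)] -> [(FilePath, a)] -> [(FilePath, (a, a))]
matchRuns orig new =
  [ (file, (o, n))
  | (file, o) <- orig
  , Just n <- [lookup file new]
  ]

-- Hypothetical sample data: CPU seconds per compiled file.
sampleOrig, sampleNew :: [(FilePath, Double)]
sampleOrig = [("A.hs", 1.0), ("B.hs", 2.0), ("C.hs", 0.5)]
sampleNew  = [("B.hs", 1.8), ("A.hs", 1.0)]

-- matchRuns sampleOrig sampleNew
--   == [("A.hs", (1.0, 1.0)), ("B.hs", (2.0, 1.8))]
```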
The results and the raw data can be found at
https://github.com/errge/notlazy.
The overall compilation time dropped from 26:20 to 25:12, which is
about a 4% improvement. Investigating the full matching shows that this
overall result comes from small improvements all over the place.
The results plotted:
- https://github.com/errge/notlazy/blob/master/cpu.png
- https://github.com/errge/notlazy/blob/master/mem.png
The graphs show the logarithmic (100*log_10(new/orig)) ratio of change
in CPU and memory consumption. Therefore negative values mean that
the new compilation method is faster or uses less memory.
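As a small worked example of the formula, the sketch below applies it to two pairs of numbers quoted in this report. The function name `logRatio` is made up for this illustration.

```haskell
-- 100 * log_10 (new / orig): zero means no change, negative means the
-- new measurement is smaller (faster, or less memory).
logRatio :: Double -> Double -> Double
logRatio orig new = 100 * logBase 10 (new / orig)

-- Overall compilation time, 26:20 versus 25:12, in seconds.
overallTime :: Double
overallTime = logRatio (26 * 60 + 20) (25 * 60 + 12)  -- about -1.9

-- Memory use of DsListComp.lhs, 69M versus 103M.
dsListCompMem :: Double
dsListCompMem = logRatio 69 103                        -- about 17.4
```

On this scale a value of 100 would mean a tenfold increase and -100 a tenfold decrease.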
As can be seen on the CPU graph, in most of the cases the difference
is negligible (actually smaller than what can be measured on small
files, which is why we have the spike at 0). Overall we see a small
improvement in CPU, and there are some outliers in both directions,
but there are more drastic improvements than drastic regressions.
On the memory graph the results are much closer to zero. There
is one big positive memory outlier: `DsListComp.lhs`. It uses 69M
originally and now uses 103M. But it compiles in 2 seconds both ways,
and there are files in the source tree which require 400M to compile, so
this is not an issue.
After all this, I'd like to hear other opinions about just disabling
this optimization in `-O` and `-O2` and leaving it as an option that
can be turned on when needed. My reasons once more:
- it's unsafe,
- it's hard to debug when you hit its issues,
- the optimization doesn't seem to be very productive,
- it's always easy to force sharing, but it's not easy to force
copying.
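The last point can be illustrated with a small sketch. The function names are hypothetical; the comments describe the intent of each version, not a guarantee about what the optimiser will do with it.

```haskell
-- Forcing sharing is easy: name the value once and use the name twice.
sumAndLengthShared :: Int -> (Int, Int)
sumAndLengthShared n =
  let xs = [1 .. n]  -- one list, kept alive until both results are computed
  in (sum xs, length xs)

-- Asking for copying is only a request: the list expression is written
-- out twice, but with optimisation on the compiler may still decide to
-- share it, and the source language gives no direct way to forbid that.
sumAndLengthCopied :: Int -> (Int, Int)
sumAndLengthCopied n = (sum [1 .. n], length [1 .. n])
```

Both versions return the same result; they differ only in how much of the list may have to be resident at once.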
Apparently a Haskell programmer should be lazy, but never fully lazy.
Research done by Gergely Risko <errge> and Mihaly Barasz <klao>,
confirmed on two different machines with no other running processes.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/8457>