
Hi,

I wonder if this might be in any way related to the HUGE number of new terms/types/coercions created by the specialiser, as documented in: https://ghc.haskell.org/trac/ghc/ticket/9630#comment:10

I don't have a profiled version of GHC, so I was wondering if you could run your tests with '-fno-specialise' and see how everything performs then?

Cheers,
Christiaan
On 12 Apr 2015, at 18:28, Michal Terepeta wrote:

So I've tried to compile Idris/Agda with profiled compilers, but this didn't quite work out because the deps didn't compile (apparently it's not possible to use Template Haskell with a profiled compiler).

Out of curiosity I had a look at compiling haskell-src-exts, since that takes quite a while. I've used ghc HEAD and 7.8.4 (both built with BuildFlavour=prof and bootstrapped with a standard ghc 7.8.4), and it's interesting: the current HEAD takes quite a bit longer and allocates way more than 7.8.4. One of the main things that stands out is the CallArity analysis (which IIRC was not there in 7.8.4). So unless I messed something up with measuring, the analysis seems to be pretty expensive.
Anyway, the results are below.
Cheers, Michal
** HEAD
Sun Apr 12 15:52 2015 Time and Allocation Profiling Report (Final)
ghc +RTS -p -RTS [...]
total time  = 147.84 secs (147841 ticks @ 1000 us, 1 processor)
total alloc = 172,378,600,408 bytes (excludes profiling overheads)

COST CENTRE                 MODULE         %time %alloc

SimplTopBinds               SimplCore       32.4   28.8
CallArity                   SimplCore       18.4   25.6
lintAnnots                  CoreLint         4.5    4.6
CoreTidy                    HscMain          4.5    5.1
pprNativeCode               AsmCodeGen       3.2    3.4
OccAnal                     SimplCore        3.2    3.1
occAnalBind.assoc           OccurAnal        2.6    2.5
StgCmm                      HscMain          2.3    1.9
Simplify                    SimplCore        2.1    0.2
RegAlloc                    AsmCodeGen       2.1    2.4
FloatOutwards               SimplCore        2.0    1.6
regLiveness                 AsmCodeGen       1.9    1.9
tc_rn_src_decls             TcRnDriver       1.8    1.3
sink                        CmmPipeline      1.7    1.5
NewStranal                  SimplCore        1.3    1.5
genMachCode                 AsmCodeGen       1.1    1.0
layoutStack                 CmmPipeline      1.0    1.0
** HEAD with -fno-call-arity
Sun Apr 12 18:16 2015 Time and Allocation Profiling Report (Final)
ghc +RTS -p -RTS [...] -fno-call-arity
total time  = 113.71 secs (113714 ticks @ 1000 us, 1 processor)
total alloc = 121,884,896,720 bytes (excludes profiling overheads)

COST CENTRE                 MODULE         %time %alloc

SimplTopBinds               SimplCore       37.2   36.6
CoreTidy                    HscMain          6.0    7.3
lintAnnots                  CoreLint         5.8    6.5
pprNativeCode               AsmCodeGen       4.1    4.8
OccAnal                     SimplCore        3.6    3.8
occAnalBind.assoc           OccurAnal        2.9    3.2
StgCmm                      HscMain          2.9    2.6
RegAlloc                    AsmCodeGen       2.6    3.4
FloatOutwards               SimplCore        2.6    2.3
regLiveness                 AsmCodeGen       2.5    2.8
tc_rn_src_decls             TcRnDriver       2.4    1.9
Simplify                    SimplCore        2.4    0.3
sink                        CmmPipeline      2.1    2.2
NewStranal                  SimplCore        1.7    2.1
genMachCode                 AsmCodeGen       1.4    1.4
layoutStack                 CmmPipeline      1.4    1.4
NativeCodeGen               CodeOutput       1.1    1.2
FloatInwards                SimplCore        1.1    1.4
do_block                    Hoopl.Dataflow   1.0    0.6
Digraph.scc                 Digraph          0.8    1.3
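
As a rough cross-check of the two HEAD runs above, the following Python sketch converts the CallArity cost centre's percentage share into absolute figures and compares it with the measured difference between the runs. The variable names are illustrative only, and the sketch assumes each report is a single run, so part of any gap may be noise or knock-on effects in later passes rather than the analysis itself.

```python
# Totals copied from the "HEAD" and "HEAD with -fno-call-arity" reports.
head_time_s = 147.84
head_alloc_bytes = 172_378_600_408
no_ca_time_s = 113.71
no_ca_alloc_bytes = 121_884_896_720

# CallArity's share of the HEAD run (%time and %alloc columns).
call_arity_time_pct = 18.4
call_arity_alloc_pct = 25.6

# What the cost centre alone accounts for, in absolute terms.
attributed_time_s = head_time_s * call_arity_time_pct / 100
attributed_alloc = head_alloc_bytes * call_arity_alloc_pct / 100

# What actually changed when the analysis was switched off.
measured_time_delta_s = head_time_s - no_ca_time_s
measured_alloc_delta = head_alloc_bytes - no_ca_alloc_bytes

print(f"attributed: {attributed_time_s:.1f} s, {attributed_alloc / 1e9:.1f}e9 bytes")
print(f"measured:   {measured_time_delta_s:.1f} s, {measured_alloc_delta / 1e9:.1f}e9 bytes")
```

On these numbers the measured gap (about 34 s and 50 billion bytes) is somewhat larger than what the CallArity cost centre alone accounts for (about 27 s and 44 billion bytes); the profiles by themselves do not say where the remainder comes from.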
** 7.8.4
Sun Apr 12 15:41 2015 Time and Allocation Profiling Report (Final)
ghc +RTS -p -RTS [...]
total time  = 93.11 secs (93112 ticks @ 1000 us, 1 processor)
total alloc = 103,135,975,120 bytes (excludes profiling overheads)

COST CENTRE                 MODULE         %time %alloc

SimplTopBinds               SimplCore       38.5   37.4
pprNativeCode               AsmCodeGen       6.2    7.2
StgCmm                      HscMain          3.9    4.2
RegAlloc                    AsmCodeGen       3.7    5.1
occAnalBind.assoc           OccurAnal        3.3    3.6
OccAnal                     SimplCore        3.3    3.6
regLiveness                 AsmCodeGen       3.1    3.4
FloatOutwards               SimplCore        2.9    2.4
sink                        CmmPipeline      2.8    2.8
Simplify                    SimplCore        2.6    0.3
tc_rn_src_decls             TcRnDriver       2.4    2.1
genMachCode                 AsmCodeGen       1.9    2.0
NewStranal                  SimplCore        1.8    2.1
layoutStack                 CmmPipeline      1.8    1.8
Core2Core                   HscMain          1.3    1.2
deSugar                     HscMain          1.1    1.1
do_block                    Hoopl.Dataflow   1.1    0.7
CoreTidy                    HscMain          1.0    1.1
CorePrep                    HscMain          1.0    1.1
Digraph.scc                 Digraph          0.9    1.5
versioninfo                 MkIface          0.9    1.0
zonkEvBndr_zonkTcTypeToType TcHsSyn          0.6    1.4
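
To put the three reports side by side, here is a small Python sketch that expresses each run's total time and allocation as a multiple of the 7.8.4 run. The dictionary layout and names are illustrative; the totals are the ones printed in the reports, and since each is a single run the ratios should be read as rough indications only.

```python
# Totals copied from the three profiling reports: (seconds, bytes allocated).
runs = {
    "HEAD": (147.84, 172_378_600_408),
    "HEAD -fno-call-arity": (113.71, 121_884_896_720),
    "7.8.4": (93.11, 103_135_975_120),
}

# Use the 7.8.4 run as the baseline.
base_time, base_alloc = runs["7.8.4"]

# Ratio of each run to the baseline: (time ratio, allocation ratio).
ratios = {
    name: (time_s / base_time, alloc / base_alloc)
    for name, (time_s, alloc) in runs.items()
}

for name, (time_ratio, alloc_ratio) in ratios.items():
    print(f"{name:<22} time x{time_ratio:.2f}  alloc x{alloc_ratio:.2f}")
```

On these single runs, HEAD takes roughly 1.6x the time and 1.7x the allocation of 7.8.4, and about 1.2x on both with -fno-call-arity.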
On Fri, Apr 3, 2015 at 4:49 PM David Feuer <david.feuer@gmail.com> wrote:

On a machine with an SSD instead of a hard disk, swapping greatly reduces the lifespan of the storage device.

On Fri, Apr 3, 2015 at 10:14 AM, Bertram Felgenhauer <bertram.felgenhauer@googlemail.com> wrote:

George Colpitts wrote:
I'm curious why the amount of RAM is relevant as all of our OS have virtual memory so it is only the size of the heap and the amount of swap that should be relevant for an Out Of Memory error, right?
The computer may not be your own. VPSs are essentially priced based on RAM available to the virtual server, and have limited swapping space, so this is an area where increased memory consumption hurts. Building binaries elsewhere is usually an option, but more painful than doing it on the VPS itself.
Cheers,
Bertram
_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users