RE: first stab at -ffunction-sections

On Mon, Nov 18, 2002 at 11:55:27AM -0000, Simon Marlow wrote:
What exactly are you trying to use -ffunction-sections for? I'm pretty sure it won't work as things stand currently, unless you can guarantee to be able to find a text/data boundary symbol for the garbage collector (currently it has to be able to distinguish text from data, we're in the process of lifting the restriction but it's a lot of work).
Trying to garbage-collect unused functions to reduce the size of the .text section.
You should be aware that -split-objs (the trick we use to build our libraries in lots of little bits) gets most of the benefit that you'd get from using -ffunction-sections. You might get slightly more fine-grained discarding of code with -ffunction-sections, but the effect won't be dramatic (I'm guessing). Also there's the issues of telling the garbage collector and the mangler about it. However, it would be nice to be able to use -ffunction-sections/--gc-sections instead of -split-objs. It's been at the back of my mind to have a go at this someday... Cheers, Simon

On Mon, Nov 18, 2002 at 04:57:50PM -0000, Simon Marlow wrote:
You should be aware that -split-objs (the trick we use to build our libraries in lots of little bits) gets most of the benefit that you'd get from using -ffunction-sections. You might get slightly more fine-grained discarding of code with -ffunction-sections, but the effect won't be dramatic (I'm guessing). Also there's the issues of telling the garbage collector and the mangler about it. However, it would be nice to be able to use -ffunction-sections/--gc-sections instead of -split-objs. It's been at the back of my mind to have a go at this someday...
I noticed a lot of not-obviously-used stuff brought in from various libraries and wanted to nuke some of the unneeded things. Step 1 was trying to compile the libraries with the option, which didn't quite fly... it looks like ghc-asm is the primary sufferer, and I'm not sure the compiler option is needed... $T_DOT_WORD = '\.(long|value|byte|zero)'; $T_DOT_GLOBAL = '\.globl'; $T_HDR_literal = "\.section\t\.rodata\n"; # or just use .text??? (WDP 95/11 ) $T_HDR_misc = "\.text\n\t\.align 4\n"; $T_HDR_data = "\.data\n\t\.align 4\n"; # ToDo: change align?? $T_HDR_consist = "\.text\n"; $T_HDR_closure = "\.data\n\t\.align 4\n"; # ToDo: change align? $T_HDR_srt = "\.text\n\t\.align 4\n"; # ToDo: change align? $T_HDR_info = "\.text\n\t\.align 4\n"; # NB: requires padding $T_HDR_entry = "\.text\n"; # no .align so we're right next to _info (argu ably wrong...?) $T_HDR_fast = "\.text\n\t\.align 4\n"; $T_HDR_vector = "\.text\n\t\.align 4\n"; # NB: requires padding $T_HDR_direct = "\.text\n\t\.align 4\n"; $T_create_word = "\t.word"; So basically it's tagging various items directly with the section they go to as it is, so a compiler option may very well be superfluous; just hack ghc-asm. Adding some extra goo in mangle_asm() to get the section name to be related to the symbol doesn't look too bad. -split-objs I didn't really realize was there. I see (tracing through ghc5, whatever debian's latest shipping version is): ghc/compiler/main/DriverFlags.hs:250 , ( "split-objs" , NoArg (if can_split then do writeIORef v_Split_object_files True add v_Opt_C "-fglobalise-toplev-name s" else hPutStrLn stderr "warning: don't know how to split \ \object files on this architecture" ) ) ghc/compiler/main/DriverFlags.hs:250 opt_EnsureSplittableC = lookUp FSLIT("-fglobalise-toplev-names") ./ghc/compiler/codeGen/CgConTbls.lhs:109 genConInfo comp_info data_con = -- Order of things is to reduce forward references mkAbstractCs [if opt_EnsureSplittableC then CSplitMarker else AbsCNop, closure_code, static_code] ./ghc/compiler/codeGen/CodeGen.lhs:260 maybeExternaliseId id | opt_EnsureSplittableC, -- Externalise the name for -split-objs isInternalName name = moduleName `thenFC` \ mod -> returnFC (setIdName id (mkExternalName uniq mod new_occ (nameSrcLoc name))) ./ghc/compiler/codeGen/CodeGen.lhs:276 maybeSplitCode | opt_EnsureSplittableC = CSplitMarker | otherwise = AbsCNop ./ghc/compiler/nativeGen/AbsCStixGen.lhs:272 gencode CSplitMarker | opt_EnsureSplittableC = returnUs (\xs -> StLabel mkSplitMarkerLabel : xs) | otherwise = returnUs id ./ghc/compiler/absCSyn/CLabel.lhs:245 mkSplitMarkerLabel = RtsLabel (Rts_Code "__stg_split_marker") Then in ghc/driver/split/ghc-split.lprl:287 (there's actually one per arch): # strip the marker $str =~ s/(\.text\n\t\.align .(,0x90)?\n)\.globl\s+.*_?__stg_split_marker.*\ n/$1/; $str =~ s/(\t\.align .(,0x90)?\n)\.globl\s+.*_?__stg_split_marker.*\n/$1/; ... So to me it looks feasible to figure out who's fooling with these things, though it's probably not necessary to do any of this within the compiler except for whatever might circumvent ghc-asm, if anything. At any rate, I am finding the amount of unused code/data linked into the generated executables significant... for instance, in a non- concurrent program: 080a1c64 D MVar_modifyMVarzu_closure 0805aeb8 T MVar_modifyMVarzu_entry 0805aece T MVar_modifyMVarzu_fast3 0805aeb8 T MVar_modifyMVarzu_info ... and as it's a 9-line script to mangle patches, it's certainly not using this: 0805b140 T __stginit_PosixDB The idea with -ffunction-sections or brewing up an equivalent is to build the libraries with it so when the final executable is linked, it imports only the code and statically-allocated data it uses from them. Thanks, Bill
participants (2)
-
Simon Marlow
-
William Lee Irwin III