
In article
<9584A4A864BD8548932F2F88EB30D1C60D17DC7F@TVP-MSG-01.europe.corp.microsoft.com>,
"Simon Marlow"
-rw-r--r-- 1 ashley ashley 2117554 May 28 04:04 HBase.hi
-rw-r--r-- 1 ashley ashley 2119865 May 28 08:15 HBase.p_hi
-rw-r--r-- 1 ashley ashley   72669 May 28 16:20 HBase.q_hi
Wow :-)
It looks like the problem is very data-heavy Unicode property files. For instance, Org.Org.Semantic.HBase.Text.UnicodeNames exports just one value:

    getCharacterName :: Char -> String

Inside the module is an "Array Char String" created from a "[(Char,String)]" that is a long list of Unicode character names. The file is automatically generated from a downloaded data file. For instance:

    getCharacterName '\x189F' "MONGOLIAN LETTER MANCHU ALI GALI DDHA"
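
To make the shape of such a module concrete, here is a minimal sketch. It is not the generated file itself: the two-entry list, the names nameList and nameTable, the use of accumArray to fill gaps, and the choice to return an empty string for unlisted characters are all assumptions made for illustration.

```haskell
module Main where

import Data.Array (Array, accumArray, bounds, inRange, (!))

-- Hypothetical two-entry excerpt. The real list is generated from the
-- downloaded Unicode data file and is far longer.
nameList :: [(Char, String)]
nameList =
  [ ('A',      "LATIN CAPITAL LETTER A")
  , ('\x189F', "MONGOLIAN LETTER MANCHU ALI GALI DDHA")
  ]

-- Build the lookup array once. Characters inside the bounds but absent
-- from the list get "" (an assumption; the original may behave differently).
nameTable :: Array Char String
nameTable = accumArray (\_ name -> name) "" (lo, hi) nameList
  where
    lo = minimum (map fst nameList)
    hi = maximum (map fst nameList)

-- The single value the module is meant to expose.
getCharacterName :: Char -> String
getCharacterName c
  | inRange (bounds nameTable) c = nameTable ! c
  | otherwise                    = ""

main :: IO ()
main = putStrLn (getCharacterName '\x189F')
-- prints: MONGOLIAN LETTER MANCHU ALI GALI DDHA
```

In this sketch nameList and nameTable are internal helpers; only getCharacterName is meant for use by other modules.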
For some reason, even though only getCharacterName is exported, when optimisation is switched on, the interface file balloons a thousandfold:

    $ ls -l UnicodeNames.*hi
    -rw-r--r-- 1 ashley ashley 5854480 May 28 02:49 UnicodeNames.hi
    -rw-r--r-- 1 ashley ashley 5854497 May 28 06:56 UnicodeNames.p_hi
    -rw-r--r-- 1 ashley ashley    2385 May 28 15:59 UnicodeNames.q_hi

What's the best way to stop this? Is it reasonable to simply switch off optimisation just for these few files?

Also, I'd like to make all that data disappear when a binary program that doesn't use it is stripped; currently it doesn't. Any ideas?

-- 
Ashley Yakeley, Seattle WA