
I would be more than happy to spend a little time pushing your program through the Hat debugging and tracing tools to see what we can find. Nhc98's heap profiling should also reveal interesting things, and I'll do that too if you like.
Hmmm... The code won't work with NHC. It uses the multi-parameter classes extension of GHC and Hugs.
Ah. The Hat tracing tools are pure Haskell'98, so I'm afraid they can't help when a program uses extensions.
In any case, it's the stack that's the problem, not the heap.
But the two are linked. A heap profile shows everything that is
reachable from the stack, hence a fuller stack will lead to a larger
amount of heap being retained. The kind of thing that is in the
heap then points back in an indirect way to what might be holding
onto it in the stack.
So, I generated a heap profile from Hugs, using -d20000 as the
profiling option. (There appears to be a small bug in Hugs, so the
output is invalid. However, it was pretty easy to patch it up in a
text editor.) I have attached the final PostScript graph below for you
have have a look.
The first thing to notice is that the graph does not grow uniformly
towards the right. This is good, as it means the program is not
retaining stuff unnecessarily long after it has been used.
The second thing to notice is that the graph is extremely spiky.
When the program fails with a stack overflow, at the far right,
it is inside the deepest spike. This suggests that part of your
program has a very deeply nested recursive algorithm.
Now, taking a look at the input file, I see that line 818 (where
Hugs dies with a stack overflow) comes just before a C++ routine
that is relatively long (60 lines). I'd be prepared to bet that
your parsing algorithm needs to get to the end of the individual C++
routine before it can start returning any results at all.
If ghc has a larger stack, then I would expect it to be able to
cope with larger spikes, so my guess is that when it dies at line
3000+, it is once again inside a C++ routine that is longer than
any previously seen in the input file.
To test this hypothesis, I moved the routine from just after line 818
back to earlier in the file, say line 95. Lo and behold, Hugs now
dies after line 95.
And in moving the routine, I suddenly noticed that it is not a
routine at all - it is all entirely within /* */ comments. So,
I tried uncommenting it. Now, Hugs happily accepts this routine,
and continues until we get another control stack overflow after line
1588. Guess what - the beginning of another long comment.
| ... the problem is degenerative, so no matter how large I make the
| stack, there can always be files large enough to fill it up.
Well I think we've found the culprit. Your parser is actually
very well behaved for space usage in most respects, but the parse
function for stripping comments is unnecessarily deeply recursive.
I'll leave it up to you to find a fix.
Regards,
Malcolm
begin 644 profile.ps.gz
M'XL("'^W.3L``W!R;V9I;&4N<',`[/U+TS3)D:4'[O$K Y3S\\_.;O]]\T+D"LJ
MH?6'&-=6MXAYPNH=>9]:)\DVK:_4&.[5[MVF=9F8SZR>$_=I=:!X#U8_BCZS
MNA/5J_+H#*UST7'N,3ZEKC:^!ZO'A7[)ZGAYS+W6_Z+V:75#']=>;PS]B]0I
M\SZ]OIG7:5TTGT'KJ:&1L#IL_*;4;X?NR>J^'@]ZO3CNQ>K,T9]6G\X:M-:U
M0R-A]?"\%ZVC^YCP^GN\/ZO;Q]^S>G^\6],)I/9`]071UZ)+R#75]0QYGTL'
M,?1NII_P?*3J+M+#Z'J-U)RHSL/C)=>'Y'6J*\FVI4?).,1U+#&63/^2>A35
MS<28,+U-:%5,IY.Z(-7WI)Y(=4%QGZ8GBC%H.J2X3],OQ;=BNJ>\3O52,
MB`6P\UG7^^,BURWK8.:0U'+XVF')6J56Q6MG#I[2GV)QS'RR6"-[?7_J/E4M
ME<,^+E;,K+FIA3/ST&K]3(FR6T9]K7*K:;V\7:Q2RZ(Z8@VUM@X;L5AB7YFF
M;>72U4K;.<";2"_<@NL29;?NQGV:Y=>E\&(5[K%;_:F6YF4QKO>^/V\/N>YM
M3>YJ[;TQ%JMM69I3)J%6Z/'=BH4Z98)JO @/('H"6`/X'L"YB-HGP#]!/:G`P'H(`$Z@(`.
M+H`##^B@!#Q@@0YFP!,=\"P(/$7BVXZFR!,(O[&M=W5NKY!;[=1Z;2+E3&-]
MR*XK5.2Q5A?IG=935S2R[P>].K\8>1NUKO7^\_`MID5`WGYG;VH$/V]'HR:5
MP3K:4@4CH[1SGENMG\^CT7WOP[[NQX>YY.&H^C5E!DJ_PHK@[ILHD>7K/3C(
M9GWUK:'5C+/.%EGIDUFFOM":P38Y-&C-3JW/K$X7!\J:U;HJM_?FYF@VK)FD
M?05'QU;LK2"IH2._^9Y]1X949NU6&5QNYQ4QZFS?AI^71O%@E>B(:A=EL*XN
M]U:A/:_2ME:EUF-K1D%7L[[NJBX@605;=W62BI:NGIDIDU5WK&:Z6G<]]U7#
M.5CENZ\KXE[1ED0'M][S/3:).%94T5Z2EJ\=1",S*I0HIA7]%5U)VSOZN;V^
M1#D@6Z*FK@$\-%.MT=:K5G&2*%NBM'Y;U=='AY+-@[DD*LPQH=%D5H8E"CTX
MK&U%KVD9TJBW<_E7F7LD6JY_JEE)*JX29;]R^8^S/L.*SB_=)I5,B>I;QUVA
MRE-V-&LWD.-,=Q'CL,NU^WBT)^"TRXYM[5I:D_QH9=G!;J=WB.=-#B>67=+]
ME4^S`_76[JH/:=_D@'K=E;6"ZRJ1IN[FTI&AN\![ZT%MU[EVC]?6N^Z239%=
M9WWY-5[D\$+9K;;7IA]^M:U=;IN0+G+XLNZ..X=UT5V9[*IOK9EX2F95=N.=
MJ;[*G*R[^%3(R.Y_*`(T:S`R;))MZ+S*Y2*[*\E2U$Q3:Y5D9%=VHS7>/25+
MV\J*/%^>CM-1-J5WSFTED/M<69AK?1DW.71;LC?7UMO5F),#2=]9G\ZVGZ]2
M3==L46?I:T=S=#AJ/4T-LX=D%U=V:N_\Y/DF\^?*:K5IL;Z^VU$VK'=LKQ-S
M5]L[BW;]J4:EA[BN[%NJ#S5K=^^
M<?7_94W#?8/A`Q._X?`7BT^Q=\B]'JPV]3>ZOU]]D1F#N9_2XP+U8>8^5?V;
M':7NPG)3W^?P_YE?U'5RZC,=XUK\J5E?45_KF"/-#^LQG_MH?2UV_VUXP,VW
M&WYM\_N&5T]\PAE#J[]X^/'$ESS>G_F9*_BL#VZU+1_T^&[-/^UU6/5=#[^H
M^+7WUXR_]"'J\\Z8UOWAGG=S7WE\#^9'C[&TJ8^]%KN'WLOROU^;TB2U>?7-
MS^=;?OOA(1:??N;UW=\?<YUP`<::(SR!6X]P.P)C<0C&.BW\@LPS*/<@]]/*
M2QCSIW`6;CTZ=.P*GV$P&(SKD,R'Q8,8\Z=P)#(^<_Y$LC`6MR)S@,Z["`^_
M<3+BVQ2^QNAKX7)D'D5Y'OEM.@?$][#*#TD/CG)'LA:IO)+!@A+.R;P7Y:,$
MCT6X*H/W)#R6P7<SCDO<B_!?:O:ZW$7CXMP8GPN4-Y,Y:N?4>#RH?)O<4SH7
MQ^OVRM/)?9QS>+R^$OP>Y[\(]R=C;^<%Q;,+9V@\@_"),L^N7*/4<B@/J7<G
M^_4BO[DX2KF_5?Y2S32W^LG%&3)ND\=GRGL:3"[A1&5LHWRIW#\HEVK<I_"L
MNB*T=:7ZW:8<K.#0&3\K&&[&W0HNH_"Z,G^FG*_,NSD?S/.MRA4;XT5X9(-=
M)!RSU$0I_ZP5-"]2Y[M-N6G1+\);NW64JBPHX;157^]UV8?C*GS]4R[<8!,:
M3RZX5,:A2R:>\NL\;^K<N^@SX>6-=V2<O>"["9^O1F/-/<+6,JY?,*N,!Q@,
M1>$(#F:<\`?'MV+<PF`O"N\P8TSE)*:GWOF*L58)ES%K+\ISS+A<.9!C7A)^
MY+A/XT[&/+\IKS(88-OB7.;>U_F8,9:$JSG&A/`X,RY7CN<8U\;_#'Z=<$,S
M;ZJ\T<D454YI_J;R38/]9ES4X.4)3W5\T\9A#2ZC\%L';TVXK[D?4UYLUC>=
M,YO,6^73QIPE7-NQYA@/U_='SM&-?A'^;KY;P_:Z=%E@OYFU,42P#T`#"WN-
MV7#$'F8(Q'CO??IM.0,$?9POU8#)_IT+9CFG*H,S!UE:D<X^@0L(.F-<PT?[
MU&?0:5?-&:K:`X\%N+[T7OYQ6>-?L-A9%A&8]H"$+P3WH(XO<'=^AX+[SBVW
M0,+K"6L`K-$F:/$T01B0W-<6P9CG="?P\]S-&#(]'EE!Z[[7%CQ[3ED&=??W
M92CX(-(K0#[8\HJ=]]2$P.ISV!CBWB<(`^-[K.<X_=[C'4'XL_HJZ/X,V`3X
MGU^*'1/@L@$[7.!Z;>W#NTF/)(AC`M9!!D/WKL<?^);'#DT(MZT>M9`&-#F@
M(31Q>JQ#;3(WD;+J81!^%L#/.J[B\I<YKB+N$)X+>N-S'T+/T_N"MPQC`T84
MC$,8O3#F/W\I\'W15PG?,LP`,&_`;`-S%,QLG^=#F$5I[H49&^9Y6!U@38&5
M"-8O6/4^KY6PPM*Z#*LYQ``0.4"\`5$*Q#:?(R*(HRCZ@I@-(CV(#R&JA%@4
M(MC/<2^%RQQF4WA.83UM!V@;0=L/VK;`=H>V2;B]HFT9;>=H&TC;1]IVTG85
MMKFT/<9M-6W':1M/VW]*&U"Z@=(4D-Z@M`BF4R@-0^D;2OM0NHC23)2>@K06
MI<,PC4;I-TK;4;J/TH247J2T)*4S(0U*Z5-,NU*ZEM*\E!ZFM#*EHRF-#>EO
M2IMCNIW2])3>I[(`E1.H#$'E"RA[4+D$RRQ4GJ&R#I6#J(Q$Y2<J6T&YB\ID
M6%ZCLAR5\Z@,2.5#*CM2N1+*G%0>Q;(JE6.IC$OE7RH;4[F9RM2?R]M4%N=R
M.I7AJ7Q/97^2"Y#,@.0)(&L@.03**$A^0;(-DGN03(3D)21+(3D+R&!(/H.R
M&Y+KD,R'Y$$D*R(Y$LF80/Y$LBF46Y%,B^1=)`LC.1G)T$B^!K(WDLNAS([D
M>23K(SD@R0A)?DBR19`[DDP2Y94DRR0Y)\E`23Y*LE.2JX+,E>2Q**LE.2[)
M>$G^2[)ADAN33!GDS22+1CDUR;!)ODVR;Y*+D\R<Y.D@:R<Y/,KH27Y/LGV2
M^Y--@.P%9$L@.P/8(,@^@;8+LFN0S8/L(60K(3L*V5@^VU_(-L-V&[+ID+V'
M;$%D)R(;$MF7P/9$=BFT69$]BVQ=9`<C&QG9S\BV!G8WLLFAO8YL>63G(QL@
MV0?)=DAV1;`YDCT2;95DQR0;)]D_R39*=E.RJ8*]E6RQ:*<E&R[9=\GV2W9A
MLAF3/1ELS62'1ALUV:_)MDUV;[*)D[V<;.ED9P<;/-GGT79/=GVR^1,>@+`"
MA",@C`'@#PB;@+@%PC00WH&P$(23(`P%X2L`>T&X#,1L$)Z#L!Z$`R&,".%'
M"%L"N!/"I"!>A;`LA',A#`SA8P@[0[B:SY@;PN,P5H=P/(3Q(?P/88,(-T28
M(L`;$18)<4J$82)\$V&?"!=%F"G"4P'6BG!8B-$B_!9ANPCW19@PPHL1EHQP
M9H!!(WP:8M<(UT:8-\+#$5:.<'2$L0/\'6'S$+='F#["^Q$6D'""A"$D?"%@
M#PF7B)A%PC,2UI%PD(21)/PD82L!=TF83,1K$I:3<)Z$`25\*&%'"5<*F%/"
MHR)6E7"LA'$E_"MA8PDW2YA:P-L2%A=QNH3A)7PO87\)%TR88<(3`]:8<,B(
M42;\,F&;"?=,F&C"2Q.6FG#6GS'8A,]F[#;AN@GS37APPHH3CIPPYH`_)VPZ
MXM8)TTYX=\+"$TZ>,/2$KP?L/>'R$;-/>'["^M-Q`'2,`!T_0,<6P'$'=$P"
M'J]`QS+0<0YT#`0='_'=1U)06^NR>QJZ'=QG5]%.9]$(RO-=.U-8T^Y1OUQZ
M#:@E];`_KU^],1,+A;V'&+OR_L9Q,?;>??W3\5+S9\LV+T?C+.-!'9]UG_6;
M^_UH7'>%9K=QIM]#N$?L._*UT;\_7P/TN\U:B'[OF4?1>:*S+;47E+EGS2\U
M`]>8D&.)9%Y*#:3.9QVI[;IVR#S8*_;I).-%YL]6^CQNNURWYMUK9PXN=UE7
M=+[.N7S-\[D/T/5AS+NRKM3J6K<IUAI9C[+>H>M8UD5U_4O]IZZ;M9>I]ZYK
MU5IO1Z[$UNDXHDS6]]<N:I-\@<0%J<71>*)&0#WZ>CZ/0](4M>*7L>>2N.=9
M;]+VS!(O]:[FOA_'6==6%3]D;R'QV;A/B^LB9I=X<&_+T+X?QI&9&]7X,_4A
M&K>.?8#$NYG_E#BYU4//FDLE;EWQ=5=O;O:;*RZ?Q]:M>#YUV[H/J-EKJR55
M]C)K_S!R3VO?4;/S_:ES@>Y7VDRU:P["]CF1UUC[HY[5:Y\C1Z7*OJJK&W?;
MRZS]V(BO91\W8F'9_XWW+OO&<8RJ[#<S=M-]:N:A?7_K^A?=%]>]U-^['>ZG
ML\:@^_"L&_K^/8XDE'W_F$,D7S"/VM0\0^SM+3_A-13-:SS;`JC'KTH^9,P]
MDD=Y5'_64BB_J?D7K\=IWJ;=/S<]-EGR/?,8U94GJG%6WZWD!R6_5+]YU1A%
M\U)CSI)\5CU[^\X.CVW-^KOFSUJA?M/]M.3=QEY-\G7C&$?)\XT]L^0'K_67
MNWRZVE9>L=6EM:@>YB/'\:22QVP_;BV]<G3&RG]FC5WSIF,,6KXUCG"6/&W&
M!9K?'7UM>6&OVWL^.7(0DH=.#;GGKV-\6MX[<^DK7SZ.S)0\^\@K2GY^/(/D
M]2N2J8%]/:P'M"OJ5<E]MVD=(7+44G\8SVYUBWAVJ7>\>G;3^L.JD[1RL?8Y
M\@RKOE*S2055>I^K+I/Z2*_GQ+U('2@UW5H_&KD9J3OUK-`R[=6VZE75G_4]
MZ&]JG2MJ/5(?R_VMUM5:Q75^2`[7ZW$UF9RD_K?J>$-[(/6_W%=IW3"?7>N-
M]_H>:J!*W5#KE.Z)T?IFYIZT+EJCJO8K4DN6>NKM:[OM6KN6.FSU2\U94M.7
M^FW.(5[W]1R2UXO]Z%FI,[<3L)<5J4&O^G1J7[6NG9XFK8>G=T#KZ%GOT/I[
MO;\:.%(3EKK]T,E)O?_E%W^HGD%U`J')$'U!QEFN2_#:I^H94@.I.HBL[:I^
M8FA`1'>1>R?7:Z2N9.D\\GM7?<C0"8BN9&C]1(\RWH/H6')-5?W+T$"*;B;W
MS*JW&=^8Z'2&+D'T/;U+?-Y5^[-T03E_JIYHC%W3(87&1?1+]Y=S[?)!]Q2:
M-M%+I6Y;=5;/SAC=5$NU]%FYQJFN:VA.1`\VWI'IR%(KMO1GK<[?9-UTW9K7
M$53OEGL2U\G%=R3ZNLPGNR[/CR-6/5^N*ZX##/V2Z0>]-J^ZP_JOMKJK#\=5
MN/=8=8[WBGQ/^TVN4WVDLP945YFZ;=5C#IV<Z#C'N!;]9^;Y5#?:V<ZK:C5%
M;_KLZ/V^]LRJ4\T]I>M;0VLDNMB:G4]M?UUM2T_;#M>;^/A$AUN]<*HQ+/I6
MT>]V]-X^I=6V=+_UY;?5X_"XBJ&S,IVQ[RU<G^RY2M<UAQ9<]-!#MRTZZFNO
M6IL>Q_'67W>^X'23_9CJMC-?X'IOS^NK3KR^FG9"B(9\Z<O'W"JZ]&OU[&L%
M?;>IGCW&A.C@.YJNI?A0/]\$A.==OB/1W0]/C.CUV_E;T\OR]8C.O\9UA3+B
M0Q%_0&IXU%?0F?,**:5M^1&:T%%!CO@*EH^AYN33^2%>-O$_7-LE>+G(?2[?
M1.XMU&^1=6;W:?@<HOZ.SEH^Q7<MOI`:2_>'KG'J)VFBQ"YY#?>AA)=-_"O/
M%VU"[V7Y7O+=JE^F7=T7B4/<9Q.>0O'GW!KHHCY:\?5D?*U^H.8BW=1W)CZB
MX7T4_]'P/IIO*3QWYG<*OZCYI+R>H_ZJ&L7/6NO7.Q)?UAB#XN>J[Z&VL^+Q
M$Q]8
LH