Hello everyone:
I made some changes to the Hugs sources which give small runtime
speed gains. The results are in the table below. The gains are
minor, but I thought some people in the group might be interested.
WHATIS WHATIS1 FHEAP
Speed Increase (percent)
gamteb 8 9 5
parser 6 8 gcfail
prolog 9 9 2
queens 10 13 4
Extra BSS memory (kBytes) 24 24 0
Nbr source lines changed 50 144 41
---Notes On The Table
The table headings WHATIS, WHATIS1, and FHEAP refer to three
different sets of changes. WHATIS and WHATIS1 are described
below. FHEAP was an experiment that simply used a fixed size
heap (an array instead of a malloc) and is impractical because
it rules out use of the Hugs -h flag.
The (motley) collection of test programs are from nofib.
I timed runhugs for each program without command line flags,
except that for queens I used -h1000000. Different programs
would probably give different results: the usual caveats apply.
These programs were just the nofib ones I had at hand.
I compiled with MSC 5.0 Service Pack 3 using modified Nov 2002
Hugs sources. I ran the timing tests on a Windows 98 machine.
---Strategy
I wanted to see what speed gains could be achieved by taking a
worm's eye view of the code, with no changes to fundamental
algorithms. I also wanted to make changes that would be limited
to just a few places in the code -- a minimal force approach.
To start off, I ran the MSC profiler. This confirmed that
whatIs() is a candidate for optimization, as already noted in
Mark's Gopher implementation document and in comments in
storage.c. But the profiler showed a few other hotspots too.
(An interesting thing to do is sort the profiler results on
execution line-count, which immediately tells you which are the
most executed lines in the program.)
---WHATIS Change
To reduce the time spent in whatIs, I created a byte array
whatCode of whatIs codes. I then created a whatIs macro which
replaces the whatIs function:
#define whatIs(c) (isPair(c)? \
(isTag(fst(c)) ? fst(c) : AP ) : \
whatCode[c])
Negative indexing into whatCode is prevented by the isPair. To
keep the size of whatCode reasonable, I had to reduce the range
of unboxed ints -- the bigger the range, the bigger the whatCode
array. I settled on a range of 2048 i.e. ints between -1023 and
1024 are unboxed, all others are boxed. The extra boxing will
actually slow down execution of certain Haskell programs, but as
far as I can tell the majority of real Hugs programs would be
unaffected. Increasing this range to, say, 10 000 would increase
memory usage by 10 000 bytes -- nothing really when you consider
that the memory footprint of winhugs is about 10 MBytes.
This change yields the speed improvements under the WHATIS
column in the table.
---WHATIS1 Change
IsTag is defined as
#define isTag(c) (TAGMIN<=(c) && (c)
Hello everyone: I made some changes to the Hugs sources which give small runtime speed gains.
I think you're the first person to look at optimizing Hugs in quite a while. Great stuff! I toyed with changing whatIs a few years ago. Instead of your table-based approach, I was thinking of using a bitwise encoding. My idea was to use one bit to select int/not int and that everything else would be interpreted as a triple of the form: (WhatisTag,ModuleId,Index) Where, for example, the WhatisTag is 5 bits, the ModuleId is 10 bits and the Index is 16 bits (the numbers will probably need tweaked somewhat). So, each module would have its own set of tables for storing names, types, etc. There's a whole bunch of tags like BANG, COMP, ASPAT, etc. which don't have any associated data - they'd be in a tuple of the form (WhatisTag,<unused>,MoreTag) I figured this would be faster because most whatis tests could be resolved just by looking at the tag. I also figured it would lead to a more flexible structure in Hugs because: - We regularily overflow a bunch of hardwired limits in Hugs as we load bigger and bigger programs. Programs are getting bigger not because modules are getting bigger but because we load more modules. This change would make those hardwired limits be per-module limits so Hugs ought to scale better. - Hugs uses a stacklike method for loading modules. If a module you loaded early changes, then all modules loaded after it have to be reloaded whether they depend on that module or not. This change could lead the way to relaxing that constraint. There's other ways to achieve both these goals - this just seemed like a good way of killing 3 birds with one stone. I wonder how the this change would stack up against your change? -- Alastair ps I'd really like to ask for your code so I can spend some time looking at it but I have to spend more time on things that actually pay me for the next few months so I doubt I'll manage it. Sorry.
I toyed with changing whatIs a few years ago. Instead of your table-based approach, I was thinking of using a bitwise encoding. My idea was to use one bit to select int/not int and that everything else would be interpreted as a triple of the form:
(WhatisTag,ModuleId,Index) ...
Alastair, as far as I can see, your bit-field approach is really the way to go. It would add small efficiencies in lots of places, and it just generally makes sense on the 32 bit word size we now use. I suppose we need someone who is both qualified and has time for this kind of thing (probably a group of zero) to bite the bullet and (a) figure out if it is worth the effort and (b) actually make the changes. I have scratched my brain considering other ways the execution machine could be optimized (within the same basic architecture). Mark (et al) really seem to have made every machine cycle count. It really is quite a pleasure seeing how the damn thing has been so streamlined. Perhaps a C-- backend could be directed to do more intelligent register dragging than the MSC C compiler; a GLOBALfst on machines other than 68Ks. But I am speculating... On a more down to earth note, it is not clear to me if the changes I made are worth integrating into the code, for the 10% speed increase they give. I'll leave it up to Sigbjorn's judgment. Oh, one more thing: if the following line was removed and "memory" was statically allocated, we would gain a little speed (on x86 architectures anyway). Since NUM_ADDRS is fixed and can't be changed by the user, there appears to be no down side to making this change. machine.c: memory = (Memory)farCalloc(NUM_ADDRS,... Thanks for your interest, and thanks to the others who responded to my original email. ---- Stephen Milborrow
"Stephen Milborrow"
Hello everyone:
I made some changes to the Hugs sources which give small runtime speed gains.
...
If anyone is interested, I would be happy to send the sources. They are modified Nov 2002 sources with all the changes demarcated by #define's. Six files changed.
I would be interested to know what results these changes give on other machines.
Hi Stephen, good writeup. I'd be interested in having a close look at your changes & very possibly integrate them into the CVS sources for the benefit of others to both use and test. --sigbjorn
Hello Sigbjorn,
How should I send the files: diff, diff -c, or all 6 complete changed
source files?
And to you directly, or to the hugs-users' group as well? I'm not sure what
the protocol is here. In any case I will tar and gzip what I send. Let me
know what is easiest for you.
Thanks,
Stephen.
----- Original Message -----
From: Sigbjorn Finne
Hello everyone:
I made some changes to the Hugs sources which give small runtime speed gains.
...
If anyone is interested, I would be happy to send the sources. They are modified Nov 2002 sources with all the changes demarcated by #define's. Six files changed.
I would be interested to know what results these changes give on other machines.
Hi Stephen, good writeup. I'd be interested in having a close look at your changes & very possibly integrate them into the CVS sources for the benefit of others to both use and test. --sigbjorn _______________________________________________ Hugs-Users mailing list Hugs-Users@haskell.org http://www.haskell.org/mailman/listinfo/hugs-users
Hi there,
'diff -u' would be a Fine Choice & if the changes aren't
too big, I suggest Cc'ing the hugs-bugs list for the benefit
of people that don't use CVS.
Hugs survives on the contributions of the community, so your
contrib is most welcome (as is that of others who might have
some changes up their sleeves!)
--sigbjorn
----- Original Message -----
From: "Stephen Milborrow"
Hello Sigbjorn, How should I send the files: diff, diff -c, or all 6 complete changed source files? And to you directly, or to the hugs-users' group as well? I'm not sure what the protocol is here. In any case I will tar and gzip what I send. Let me know what is easiest for you. Thanks, Stephen.
----- Original Message ----- From: Sigbjorn Finne
To: Stephen Milborrow Cc: Sent: Monday, December 02, 2002 4:52 PM Subject: Re: A Faster whatIs "Stephen Milborrow"
writes: Hello everyone:
I made some changes to the Hugs sources which give small runtime speed gains.
...
If anyone is interested, I would be happy to send the sources. They are modified Nov 2002 sources with all the changes demarcated by #define's. Six files changed.
I would be interested to know what results these changes give on other machines.
Hi Stephen,
good writeup. I'd be interested in having a close look at your changes & very possibly integrate them into the CVS sources for the benefit of others to both use and test.
--sigbjorn
Hi, I've attached the "faster whatIs" patches. Any eagle eyes to look over my changes to verify that I haven't introduced a bug would be appreciated. Stephen Milborrow
participants (3)
-
Alastair Reid -
Sigbjorn Finne -
Stephen Milborrow