
On 2001-07-13T16:55:57-0400, Ken Shan wrote:
Current hurdle: ghc-inplace doesn't seem to be finding its .hi files for basic stuff.
puffin:~$ cat Main.hs
module Main where
import IO
main = putStrLn "Hello, world!"
puffin:~$ u/ghc-port/alpha/ghc/compiler/ghc-inplace Main.hs
Main.hs:1: failed to load interface for `Prelude': Could not find interface file for `Prelude'
Main.hs:2: failed to load interface for `IO': Could not find interface file for `IO'
This problem arose because struct dirent differs between i386-linux and alpha-osf3. I fixed it by running the intermediate C program generated by hsc2hs remotely on our alpha machine instead of locally on our linux machine.
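To illustrate why the machine that runs the intermediate program matters, here is a minimal sketch in C. The function names are hypothetical, and this is not the code hsc2hs actually generates. It only shows that the size and field offsets of struct dirent are fixed by whichever platform compiles and runs the program:

```c
#include <dirent.h>
#include <stddef.h>

/* Hypothetical helpers, for illustration only. */

/* Size of struct dirent on the machine where this is compiled and run. */
size_t dirent_size(void)
{
    return sizeof(struct dirent);
}

/* Byte offset of the d_name field within struct dirent on that machine. */
size_t dirent_name_offset(void)
{
    return offsetof(struct dirent, d_name);
}
```

Printing these two values on i386-linux and on alpha-osf3 would be expected to give different numbers, so any Haskell source generated from them on the wrong machine carries the wrong layout.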
well tracked down.
"Hello, world!" works now! Yay!
:-) great work!
So far, I've discovered 3 reasons why .hc files are not entirely portable across platforms.
There are quite a few bootstrapping and cross-compilation issues to be resolved, and things got harder recently due to the use of hsc2hs to generate some of the Haskell sources, which means the .hs files and therefore the .hc files have platform-dependent content. This is somewhat unfortunate, but using careful cross-compilation techniques (as you suggest) we can work around it. I'm hoping that after this experience(!) we can do two things:
- add some support for cross-compilation to the build system.
- write down exactly what one needs to do to make this work, and put the instructions in the build system documentation.
1. ghc/includes/MachDeps.h, which is #included by some Haskell source files, in turn #includes ghc/includes/config.h, which differs from platform to platform.
SOLUTION: Modify MachDeps.h to #include the config.h from alpha-osf3, even when compiling on i386-linux.
Yep: for cross compilation of the .hc files, the first thing to do is run ./configure on the target platform and take the output back to the host. In general, the sources and build system should make the distinction between the host platform's config and the target platform's config, but it's probably a lot of work to get this right.
2. The .hs file produced by hsc2hs differs from platform to platform, because the intermediate C program it generates necessarily behaves differently on each platform.
SOLUTION: Add "--keep-tmp-files" flag to hsc2hs. Run the intermediate C program over on alpha-osf3.
Yes.
3. Liveness bitmaps are of different width (32 bits vs 64 bits) between platforms, and the compiler generates different HC code based on the width.
SOLUTION: Make the compiler generate platform-independent HC code that uses newly defined preprocessor macros to switch between 32-bit and 64-bit liveness bitmaps at C compilation time.
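A minimal sketch of what such macros could look like, assuming the generated code always supplies the bitmap as two 32-bit halves. All names here (SKETCH_BITMAP and friends) are hypothetical and are not the macros in the actual patch:

```c
#include <stdint.h>

/* Hypothetical macros: the same source line expands differently
   depending on the word size of the platform doing the C compilation. */
#if UINTPTR_MAX > 0xFFFFFFFFu
#define SKETCH_WORD_BITS 64
#define SKETCH_BITMAP_NWORDS 1
/* 64-bit target: both halves fit in a single word. */
#define SKETCH_BITMAP(lo32, hi32) \
    (((uintptr_t)(hi32) << 32) | (uintptr_t)(lo32))
#else
#define SKETCH_WORD_BITS 32
#define SKETCH_BITMAP_NWORDS 2
/* 32-bit target: the halves become two consecutive words. */
#define SKETCH_BITMAP(lo32, hi32) (uintptr_t)(lo32), (uintptr_t)(hi32)
#endif

/* Platform-independent "generated" code: bits 0-3 and bit 32 are set. */
const uintptr_t sketch_bitmap[SKETCH_BITMAP_NWORDS] = {
    SKETCH_BITMAP(0x0000000Fu, 0x00000001u)
};

/* Test bit n of a bitmap, whatever the word width turned out to be. */
int sketch_bitmap_bit(const uintptr_t *bitmap, unsigned n)
{
    return (int)((bitmap[n / SKETCH_WORD_BITS] >> (n % SKETCH_WORD_BITS)) & 1u);
}
```

The point of the design is that the choice between one 64-bit word and two 32-bit words is deferred to the C compiler on the target, so the emitted text is identical on both platforms.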
yes, we should definitely do this. The .hc files are supposed to be independent of word size, so hopefully this is the only wrinkle.
These fixes (especially #1) make me uneasy about the bootstrapping process. Here's my current limited understanding of the making of HC files:
a. First we use the existing GHC to compile a new compiler (that produces unregisterised code)
b. Second we use the new compiler to compile a new library (keeping the unregisterised HC files) -- This new library is compiled for use on alpha-osf3, not i386-linux, in terms of issues (1) and (2) above.
c. Third we use the new compiler from (a), in conjunction with the new library from (b), to compile a doubly new compiler (that produces unregisterised code) (keeping the unregisterised HC files) -- This doubly new compiler is compiled for use on alpha-osf3, not i386-linux, in terms of issues (1) and (2) above.
d. Finally we ship the HC files kept from steps (b) and (c) for use on the target platform
Note that, in step (c), we run the i386-linux compiler from (a) with the alpha-osf3 library from (b). The library produced in (b) is incorrect as i386-linux code, but that's okay because all we want from (b) are the HC files anyway. Consequently, the doubly new compiler produced in (c) is also incorrect as i386-linux code. That's again okay because all we really want from step (c) are the HC files anyway. Just to make sure, though, could you please confirm the following?
In step (c), the once-new compiler uses (b) only as data, not as code. In other words, even though (b) is incorrect as i386-linux code (and so (c) is incorrect as i386-linux code), the HC files produced in (c) are still perfectly correct as alpha-osf3 code.
This should be the case, but we need to be careful about which settings from the environment are used when compiling the compiler itself. I can see the following dependencies at the moment:
- the compiler has a few #ifdefs for Windows. As long as neither the host nor the target in a cross compilation are Windows machines, we're ok.
- the native code generator is platform-dependent. Doesn't affect unregisterised compilation, so we escape again.
Cheers,
Simon

For better or for worse, this message will be one full of questions...
It took a couple (one? two? I can't remember) iterations of ghc building itself purely on the Alpha before the .hc files reached fixpoint... I suspect it's because the compiler on i386-linux didn't realise that it could fit an entire double in one word.
I've been looking for test suites to run, to make myself more confident of the port. I found two likely suspects in the CVS repository, namely fptools/testsuite and fptools/ghc/tests. Neither of them pass without unexpected failures on a clean i386-linux build, though. Any suggestions?
At what seems now to be a long time ago you said:
Once you have an unregisterised build working (& bootstrapped), you can start trying to get the mangler going for full registerised support. The mangler has Alpha support, but it is old and bound to be rotten to some extent.
Any suggestions for how I should start on this, and what to watch out for? The mangler looks, well, evil. (Time to glorify it, as was done to the driver?) Regarding our wanting to
- add some support for cross-compilation to the build system.
The only cross-compilation support I can see that isn't too hard to add would be documented procedures for shipping the "three wrinkles" between the build and target systems: ghc/includes/config.h, .hs output from hsc2hs, and ghc/compiler/main/Config.hs. Is this kind of support basically what you meant, or did you have something else in mind?
- write down exactly what one needs to do to make this work, and put the instructions in the build system documentation.
I'd be happy to write down what I did in the near future. It's basically the standard steps for creating .hc files, with the abovementioned "three wrinkles". (Provided that the patches I just sent are applied -- prod prod :).
SOLUTION: Modify MachDeps.h to #include the config.h from alpha-osf3, even when compiling on i386-linux.
Yep: for cross compilation of the .hc files, the first thing to do is run ./configure on the target platform and take the output back to the host.
I didn't overwrite the i386 config.h with the Alpha one -- I only changed MachDeps.h and ArrayBase.hs. Should I have taken the more drastic route of overwriting? By the way, why does MachDeps.h #define FLOAT_SIZE_IN_BYTES to be SIZEOF_DOUBLE rather than SIZEOF_FLOAT if SIZEOF_DOUBLE == SIZEOF_VOID_P?
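For reference, the size relationship behind the remarks about doubles and words can be checked with a small C sketch. The function name is hypothetical and this says nothing about how MachDeps.h itself is written:

```c
#include <stddef.h>

/* Hypothetical helper: number of pointer-sized machine words
   needed to hold one double on the platform compiling this code. */
size_t words_per_double(void)
{
    return (sizeof(double) + sizeof(void *) - 1) / sizeof(void *);
}
```

On a typical 32-bit platform with 8-byte doubles this is expected to give 2, and on a typical 64-bit platform 1, which is the kind of difference that a compiler built on i386-linux can bake into its output.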
The assertion in question is: /* make sure the info pointer is into text space */ ASSERT(q && (LOOKS_LIKE_GHC_INFO(GET_INFO(q)) || IS_HUGS_CONSTR_INFO(GET_INFO(q))));
It seems that the GC code is sensitive to the layout of the virtual memory address space. In particular, I had to change HEAP_BASE from 0x50000000 to 0x200000000L in MBlock.h to get GC to work even with -static.
So it doesn't work without -static? A HEAP_BASE change is not unexpected, it all depends where the system puts its shared libraries.
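To make the dependence on address-space layout concrete, here is a deliberately simplified sketch in C. It assumes that "heap allocated" is decided purely by comparing an address against a base; that is an assumption for illustration, not the actual definition of HEAP_ALLOCED or LOOKS_LIKE_GHC_INFO in the RTS, and the function names are hypothetical:

```c
#include <stdint.h>

/* Assumed simplification: everything at or above heap_base counts as heap. */
int sketch_heap_alloced(uintptr_t p, uintptr_t heap_base)
{
    return p >= heap_base;
}

/* An info pointer is expected to point into text space, i.e. not the heap. */
int sketch_looks_like_info(uintptr_t p, uintptr_t heap_base)
{
    return !sketch_heap_alloced(p, heap_base);
}
```

Under this assumption, code or shared libraries that the system maps above the chosen base would be misclassified as heap, so a genuine info pointer would fail the check even though it is valid.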
Okay, so with -static, I easily found the seemingly working setting of HEAP_BASE == 0x180000000L (with the help of some Alpha assembly programming documentation). Without -static, is there some way to (reliably?) know what HEAP_BASE should be set to? I'm not even sure if the HEAP_BASE setting is the problem, but it seems likely. (Specifically: When I looked at the assert failure core dump inside gdb, GET_INFO(q) did in fact look like ghc info to my human eyes; the reason LOOKS_LIKE_GHC_INFO(GET_INFO(q)) was false was that HEAP_ALLOCED(GET_INFO(q)) was true.)
The getMBlocks function in MBlock.c does not check to make sure that the pointer returned by mmap() is the address it asked for. Should it?
An entirely separate question: Why are there both rts/Linker.h and includes/Linker.h?
--
Edit this signature at http://www.digitas.harvard.edu/cgi-bin/ken/sig
Little can be said for Luxembourg.
Participants (2): Ken Shan, Simon Marlow