
Johannes:

I'm happy to report that I was able to do this experiment, and it indeed worked just fine. I compiled a toy program (along the lines of "hello world") using GHC 8.0.1, took the generated binary to a KNL machine, and ran it without any issues.

I then repeated the same with a much bigger interactive Haskell program, and while I didn't test all aspects of it, I was able to start it on the KNL machine as well. (This latter program has quite a few dependencies on various Haskell libraries.)

So, at least from those two experiments, I think there's a lot of hope that you can just copy over a GHC-generated binary and expect it to run unmodified.

The machine I compiled it on has the following characteristics:

    $ ghc --version
    The Glorious Glasgow Haskell Compilation System, version 8.0.1

    $ lscpu
    Architecture:          x86_64
    CPU op-mode(s):        32-bit, 64-bit
    Byte Order:            Little Endian
    CPU(s):                16
    On-line CPU(s) list:   0-15
    Thread(s) per core:    1
    Core(s) per socket:    8
    Socket(s):             2
    NUMA node(s):          2
    Vendor ID:             GenuineIntel
    CPU family:            6
    Model:                 45
    Stepping:              7
    CPU MHz:               1200.000
    BogoMIPS:              5199.87
    Virtualization:        VT-x
    L1d cache:             32K
    L1i cache:             32K
    L2 cache:              256K
    L3 cache:              20480K
    NUMA node0 CPU(s):     0-7
    NUMA node1 CPU(s):     8-15

And the machine I ran it on (which doesn't have ghc installed):

    $ lscpu
    Architecture:          x86_64
    CPU op-mode(s):        32-bit, 64-bit
    Byte Order:            Little Endian
    CPU(s):                68
    On-line CPU(s) list:   0-67
    Thread(s) per core:    1
    Core(s) per socket:    68
    Socket(s):             1
    NUMA node(s):          1
    Vendor ID:             GenuineIntel
    CPU family:            6
    Model:                 87
    Model name:            Intel(R) Xeon Phi(TM) CPU 7250 @ 1.40GHz
    Stepping:              1
    CPU MHz:               1400.000
    BogoMIPS:              2793.32
    Virtualization:        VT-x
    L1d cache:             32K
    L1i cache:             32K
    L2 cache:              1024K
    NUMA node0 CPU(s):     0-67

So, it does appear that Intel's "binary-compatible" claim is indeed holding up. I'd be happy to do some "small" experiments if you're worried about some particular feature; let me know.

Cheers,

-Levent.
On Tue, Feb 21, 2017 at 10:29 AM, Johannes Waldmann <johannes.waldmann@htwk-leipzig.de> wrote:
> Hi Levent,
>
> > The whole point of the Xeon-phi is the availability of large-vector sized floating-point support and many-many cores.
>
> Sure, that's what I'm contemplating - use the many options of writing parallel and concurrent Haskell programs.
>
> So, GHC's RTS should "just work"?
>
> I was hoping someone already had actually seen this on their machine.
>
> - J.
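
For anyone who wants to check the RTS question on their own machine, here is a minimal sketch of a smoke test. It is not one of the programs used in the experiment above; the file name Smoke.hs, the helper chunkSum, and the workload are made up for illustration. It prints how many processors the RTS sees and how many capabilities it is using, then runs one forkIO thread per capability.

```haskell
-- Hypothetical smoke test (Smoke.hs); NOT the program from the experiment.
-- Assumed build:  ghc -O2 -threaded -rtsopts Smoke.hs
-- Assumed run:    ./Smoke +RTS -N -RTS
import Control.Concurrent (forkIO, getNumCapabilities)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)
import GHC.Conc (getNumProcessors)

-- Sum of every n-th integer in 1..1000000, starting at i.
-- For i = 1..n these slices cover each integer exactly once.
chunkSum :: Int -> Int -> Int
chunkSum n i = sum [i, i + n .. 1000000]

main :: IO ()
main = do
  procs <- getNumProcessors    -- processors the RTS detects
  caps  <- getNumCapabilities  -- capabilities in use (set by -N)
  putStrLn ("processors seen by the RTS: " ++ show procs)
  putStrLn ("capabilities in use:        " ++ show caps)
  -- Start one worker thread per capability; each reports via an MVar.
  results <- forM [1 .. caps] $ \i -> do
    done <- newEmptyMVar
    _ <- forkIO $ do
      let s = chunkSum caps i
      s `seq` putMVar done s   -- force the sum inside the worker
    return done
  total <- sum <$> mapM takeMVar results
  putStrLn ("sum of 1..1000000:          " ++ show total)
```

If the threaded RTS works as hoped, the first two lines should both report the machine's core count (68 on the KNL box described above) when run with -N, and the last line should read 500000500000 regardless of the number of capabilities.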