
Dear Cafe - would a program compiled by ghc, or ghc itself, run as-is on Intel Xeon Phi (KNL)? I found this reference http://stackoverflow.com/questions/22253311/running-haskell-on-xeon-phi but it seems to be about the pre-KNL version. Thanks - J.

Johannes Waldmann wrote:
would a program compiled by ghc, or ghc itself, run as-is on Intel Xeon Phi (KNL)?
I found this reference http://stackoverflow.com/questions/22253311/running-haskell-on-xeon-phi but it seems to be about the pre-KNL version.
Xeon Phi is effectively a different architecture, in the same way that the ARM architecture is different from the x86_64 architecture. Currently GHC does not support the Xeon Phi architecture directly. It may, however (as the SO response suggests), be possible to generate C code from GHC and compile that C code with a C compiler that can generate Xeon Phi binaries. However, I am sure that route would be a significant yak-shaving exercise.

Erik
--
Erik de Castro Lopo http://www.mega-nerd.com/

Erik de Castro Lopo wrote:
Xeon Phi is effectively a different architecture, in the same way that the ARM architecture is different from the x86_64 architecture.
The Wikipedia page on the Xeon Phi, https://en.wikipedia.org/wiki/Xeon_Phi, suggests that maybe it isn't a different architecture after all.

Erik
--
Erik de Castro Lopo http://www.mega-nerd.com/

On Tue, Feb 21, 2017 at 3:35 AM, Erik de Castro Lopo wrote:
It may, however (as the SO response suggests), be possible to generate C code from GHC and compile that C code with a C compiler that can generate Xeon Phi binaries.
ghc hasn't generated C code for a while, aside from unregisterised builds. Xeon Phi is not truly a different architecture, but a reorganization of the standard architecture. Unfortunately, ghc doesn't currently make good use of the key Xeon Phi components even in the standard architecture; packages that want to make use of them generally use -fllvm, because LLVM is better at using them, even given that LLVM isn't very good at understanding what ghc feeds it. This suggests that -fllvm might be useful in taking advantage of the Xeon Phi architecture with ghc.

--
brandon s allbery kf8nh
sine nomine associates
allbery.b@gmail.com ballbery@sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad
http://sinenomine.net

Johannes:

KNL is "x86-binary-compatible": https://www.computer.org/cms/Computer.org/ComputingNow/issues/2016/06/mmi201...

KNL supports instruction sets "up to" AVX-512, which makes it capable of executing binaries that are compiled for, say, regular Xeon machines. The only notable exception is the TSX instructions (the instructions for transactional memory), which I don't believe the GHC compilation pipeline generates anyhow. So, in theory, you can take an arbitrary binary compiled for a modern x86 machine (say, any of the Core line) and run it unmodified on the KNL.

Of course, the issue is going to be the software stack: binaries don't exist in isolation; you also need dynamically loaded libraries. So you might have issues with, say, GMP or other libs if they are not yet ported to KNL. (Static linking might be a huge headache.)

In practice, however, this would be rather wasteful: the whole point of the Xeon Phi is the availability of wide-vector floating-point support and many, many cores. If you're running a binary that makes no use of those instructions and is single-threaded, then you will not gain anything. In fact, the single-threaded performance might suffer compared to a regular Xeon machine. Of course, this all depends on what you want to do.

Projects like DPH, however, can take great advantage of the Xeon Phi architecture by parallelizing number-crunching algorithms and distributing them over many cores (https://wiki.haskell.org/GHC/Data_Parallel_Haskell). However, I'm not familiar enough with the current status of DPH and related projects to say whether they aim to target AVX-512 and the many cores afforded by the Xeon Phi. I'd love to hear if anyone has more recent info on that.

-Levent.

On Mon, Feb 20, 2017 at 4:27 AM, Johannes Waldmann < johannes.waldmann@htwk-leipzig.de> wrote:
Dear Cafe -
would a program compiled by ghc, or ghc itself, run as-is on Intel Xeon Phi (KNL)?
I found this reference http://stackoverflow.com/questions/22253311/running-haskell-on-xeon-phi but it seems to be about the pre-KNL version.
Thanks - J.

Hi Levent,
The whole point of the Xeon-phi is the availability of large-vector sized floating-point support and many-many cores.
Sure, that's what I'm contemplating - use the many options of writing parallel and concurrent Haskell programs. So, GHC's RTS should "just work"? I was hoping someone already had actually seen this on their machine. - J.
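
A minimal sketch of the kind of program in question, using only `base`. It is not from the thread: the names `chunks`, `sumSquares` and `parSumSquares` are made up for illustration, and the build line in the comment assumes a stock GHC with the threaded RTS. It splits a sum over one worker thread per RTS capability, so on a many-core machine the interesting observation would be whether all cores get busy.

```haskell
-- Sketch only. Assumed build and run lines:
--   ghc -O2 -threaded -rtsopts ParSum.hs
--   ./ParSum +RTS -N
import Control.Concurrent (forkIO, getNumCapabilities)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)
import Data.List (foldl')

-- Hypothetical helper: split the inclusive range [lo, hi] into at most
-- n contiguous chunks.
chunks :: Int -> Int -> Int -> [(Int, Int)]
chunks n lo hi
  | hi < lo   = []
  | otherwise = [ (a, min hi (a + size - 1)) | a <- [lo, lo + size .. hi] ]
  where
    n'    = max 1 n
    total = hi - lo + 1
    size  = max 1 ((total + n' - 1) `div` n')

-- Sum of squares over an inclusive range, accumulated strictly.
sumSquares :: (Int, Int) -> Int
sumSquares (a, b) = foldl' (\acc x -> acc + x * x) 0 [a .. b]

-- Fork one thread per chunk; each thread forces its partial sum before
-- handing it back through an MVar. The partial sums are then added up.
parSumSquares :: Int -> Int -> Int -> IO Int
parSumSquares workers lo hi = do
  vars <- forM (chunks workers lo hi) $ \c -> do
    v <- newEmptyMVar
    _ <- forkIO (putMVar v $! sumSquares c)
    return v
  sum <$> mapM takeMVar vars

main :: IO ()
main = do
  n <- getNumCapabilities
  r <- parSumSquares n 1 1000000
  putStrLn ("capabilities: " ++ show n)
  putStrLn ("sum of squares 1..10^6: " ++ show r)
```

Without `+RTS -N` the RTS uses a single capability, so the program still runs but only one core does the work.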

Johannes:

I'm happy to report that I was able to do this experiment, and it indeed worked just fine. I compiled a toy program (along the lines of "hello world") using GHC-8.0.1, took the generated binary to a KNL machine, and ran it without any issues.

I then repeated the same with a much bigger interactive Haskell program, and while I didn't test all aspects of it, I was able to start it on the KNL machine as well. (This latter program has quite a few dependencies on various Haskell libraries.)

So, at least from those two experiments, I think there's a lot of hope that you can just copy over a GHC-generated binary and expect it to run unmodified.

The machine I compiled it on has the following characteristics:

    $ ghc --version
    The Glorious Glasgow Haskell Compilation System, version 8.0.1

    $ lscpu
    Architecture:          x86_64
    CPU op-mode(s):        32-bit, 64-bit
    Byte Order:            Little Endian
    CPU(s):                16
    On-line CPU(s) list:   0-15
    Thread(s) per core:    1
    Core(s) per socket:    8
    Socket(s):             2
    NUMA node(s):          2
    Vendor ID:             GenuineIntel
    CPU family:            6
    Model:                 45
    Stepping:              7
    CPU MHz:               1200.000
    BogoMIPS:              5199.87
    Virtualization:        VT-x
    L1d cache:             32K
    L1i cache:             32K
    L2 cache:              256K
    L3 cache:              20480K
    NUMA node0 CPU(s):     0-7
    NUMA node1 CPU(s):     8-15

And the machine I ran it on (which doesn't have ghc installed):

    $ lscpu
    Architecture:          x86_64
    CPU op-mode(s):        32-bit, 64-bit
    Byte Order:            Little Endian
    CPU(s):                68
    On-line CPU(s) list:   0-67
    Thread(s) per core:    1
    Core(s) per socket:    68
    Socket(s):             1
    NUMA node(s):          1
    Vendor ID:             GenuineIntel
    CPU family:            6
    Model:                 87
    Model name:            Intel(R) Xeon Phi(TM) CPU 7250 @ 1.40GHz
    Stepping:              1
    CPU MHz:               1400.000
    BogoMIPS:              2793.32
    Virtualization:        VT-x
    L1d cache:             32K
    L1i cache:             32K
    L2 cache:              1024K
    NUMA node0 CPU(s):     0-67

So, it does appear that Intel's "binary-compatible" claim is indeed holding up. I'd be happy to do some "small" experiments if you're worried about some particular feature; let me know.

Cheers,
-Levent.

On Tue, Feb 21, 2017 at 10:29 AM, Johannes Waldmann < johannes.waldmann@htwk-leipzig.de> wrote:
Hi Levent,
The whole point of the Xeon-phi is the availability of large-vector sized floating-point support and many-many cores.
Sure, that's what I'm contemplating - use the many options of writing parallel and concurrent Haskell programs.
So, GHC's RTS should "just work"?
I was hoping someone already had actually seen this on their machine.
- J.
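
For anyone repeating the copy-the-binary experiment described above, a small probe along these lines might be more informative than a plain "hello world". This is a sketch, not the program used in the thread; the `report` helper and its output wording are invented here. It relies only on `System.Info`, `Data.Version` and `GHC.Conc` from `base`.

```haskell
import Data.Version (showVersion)
import GHC.Conc (getNumCapabilities, getNumProcessors)
import System.Info (arch, compilerName, compilerVersion, os)

-- Hypothetical helper: format the facts worth comparing between the
-- build machine and the machine the binary is copied to.
report :: String -> String -> String -> Int -> Int -> [String]
report a o c procs caps =
  [ "arch (build-time): " ++ a
  , "os (build-time): " ++ o
  , "compiler: " ++ c
  , "processors seen at run time: " ++ show procs
  , "RTS capabilities: " ++ show caps
  ]

main :: IO ()
main = do
  -- arch, os and the compiler version are fixed when the binary is built;
  -- the processor count is queried on the machine the binary runs on.
  procs <- getNumProcessors
  caps  <- getNumCapabilities
  let compiler = compilerName ++ "-" ++ showVersion compilerVersion
  mapM_ putStrLn (report arch os compiler procs caps)
```

Only the last two lines of output should differ between the two machines; the processor count is what shows the RTS actually sees the target's cores.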

participants (4)
- Brandon Allbery
- Erik de Castro Lopo
- Johannes Waldmann
- Levent Erkok