Re: [GHC] #7367: Optimiser / Linker Problem on amd64

#7367: Optimiser / Linker Problem on amd64
--------------------------------------------+------------------------------
        Reporter:  wurmli                   |            Owner:
            Type:  bug                      |           Status:  new
        Priority:  normal                   |        Milestone:  7.8.1
       Component:  Build System             |          Version:  7.6.1
      Resolution:                           |         Keywords:
Operating System:  Linux                    |     Architecture:  x86_64
 Type of failure:  Runtime performance bug  |                    (amd64)
       Test Case:                           |       Difficulty:  Unknown
        Blocking:                           |       Blocked By:
 Related Tickets:                           |
--------------------------------------------+------------------------------

Comment (by carter):

Let me preface this by saying I'm very likely not following this thread, so please view my remarks below as also being questions for clarification. As I understand it:

1. The issue initially was that there is overly aggressive let floating? I believe the way Manuel addresses this in his Repa code is by using the touch function to prevent let floating, right?

2. Currently it's a question of having a more systematic way of soundly handling the cost model of let floating, and of when to do it?

@Hans / Wurmli: As a Haskell programmer, you can absolutely write great, performant low-level code in Haskell (or cheaply FFI out if need be). It does require really understanding how GHC works and how it compiles code. Really performant Haskell does not treat the compiler as a black box, but rather as a partner in a conversation. I have some fun examples of bit-fiddling Haskell code that turns into exactly the straight-line register manipulations I'd hope any language would generate. But to really write HPC-grade code, you have to really understand your tools!

The "standard" way to systematically write very, very performant code in Haskell is to first design a library with the "right" abstractions and, in tandem, have a "dialogue" in which you figure out how to give the library internals a representation that GHC can aggressively fuse / simplify away to make things fast.
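For reference, the touch trick mentioned in point 1 above can be sketched as follows. This is roughly how the primitive package defines Control.Monad.Primitive.touch on top of GHC's touch# primop (a minimal sketch, not Repa's actual code):

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
-- Minimal sketch: 'touch' acts as an optimisation barrier in the IO
-- state thread, keeping its argument alive (and computations on it in
-- place) up to this point.  Mirrors Control.Monad.Primitive.touch.
import GHC.Exts (touch#)
import GHC.IO (IO (..))

touch :: a -> IO ()
touch x = IO (\s -> case touch# x s of s' -> (# (), s' #))
```

Placed after code that reads from a buffer, touch buf tells GHC the buffer must stay live to that point, which also pins down where the surrounding computation can be floated to.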
The Vector library does this with stream fusion, and the GPU library Accelerate and the CPU libraries Repa 3 / Repa 4 all have very nice views on some parts of this methodology (I have my own, different take in progress that adds some interesting twists too). In some respects it's also an ongoing exploratory engineering and research effort to make it better and better.

Point being: there is no magical compiler, merely compilers that can "collaborate" with the library author. GHC does a great (and, over time, ever better) job of this. If you have specific optimizations you'd want, please illustrate what the "input" and "result" code from the optimization would be! Humans, given enough time, are often the best optimizers, so the best a compiler can do is support library authors in writing easy-to-optimize libraries!

Importantly: GHC currently doesn't pass much aliasing information to the code generators, though for numerical / bit-fiddling code, LLVM can do some tremendously amazing optimizations. There will also be great support for some basic SIMD code in 7.8. That said, after the 7.8 release and before 7.10 lands, I think it's fair to say that a lot of great work will be happening to better support GHC having a good numerical story. If nothing else, it's something that I (time permitting) want to improve / help with.

In the meantime, it's OK to have "fat primops" written in C that you FFI out to, while keeping all your application / numerical logic and memory management on the Haskell side. I'm actually doing that quite a bit in my own code, and while there's plenty of room for even better performance, even with that naive approach I'm able to get temptingly close to ye olde ancient-but-really-really-fast Fortran-grade performance with fairly little effort on my part.
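As a concrete sketch of that "fat primop" pattern (the kernel name dot_product and its C implementation are hypothetical, not from this thread): the hot inner loop lives in C, while buffer ownership and driver logic stay in Haskell.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
-- Hypothetical "fat primop": the hot loop is a C function
--   double dot_product(const double *xs, const double *ys, int n);
-- compiled and linked alongside the Haskell code.
import Foreign.Ptr (Ptr)
import qualified Data.Vector.Storable as VS

foreign import ccall unsafe "dot_product"
  c_dotProduct :: Ptr Double -> Ptr Double -> Int -> IO Double

-- Memory management stays on the Haskell side: the storable vectors
-- own the buffers; the C kernel only borrows raw pointers for the call.
dot :: VS.Vector Double -> VS.Vector Double -> IO Double
dot xs ys =
  VS.unsafeWith xs $ \px ->
    VS.unsafeWith ys $ \py ->
      c_dotProduct px py (min (VS.length xs) (VS.length ys))
```

An unsafe ccall import keeps per-call overhead tiny, which is what makes a coarse-grained C kernel competitive; it's only appropriate when the C function neither blocks nor calls back into Haskell.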
I hope I'm contributing to this thread with these questions and remarks :)

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/7367#comment:16>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler