[GHC] #15250: Add support for _mm512_shuffle_epi8 intrinsic

#15250: Add support for _mm512_shuffle_epi8 intrinsic
-------------------------------------+-------------------------------------
           Reporter:  newhoggy       |             Owner:  (none)
               Type:  bug            |            Status:  new
           Priority:  normal         |         Milestone:  8.6.1
          Component:  Compiler       |           Version:  8.4.3
           Keywords:                 |  Operating System:  Unknown/Multiple
       Architecture:                 |   Type of failure:  None/Unknown
  Unknown/Multiple                   |
          Test Case:                 |        Blocked By:
           Blocking:                 |   Related Tickets:
Differential Rev(s):                 |         Wiki Page:
-------------------------------------+-------------------------------------
 {{{#!c
 __m512i _mm512_shuffle_epi8 (__m512i a, __m512i b)
 }}}

 See:
 https://software.intel.com/sites/landingpage/IntrinsicsGuide/#expand=765,3914,2929,4754,4757&text=_mm512_shuffle_epi8

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15250
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
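For readers unfamiliar with the requested intrinsic, the following is a minimal scalar sketch of what a byte shuffle of this family computes: each output byte is picked from the same 16-byte lane of `a`, using the low four bits of the corresponding byte of `b`, and is zeroed when that control byte has its high bit set. The helper name `shuffle_epi8_ref` and its pointer-plus-length signature are assumptions made for illustration; it is not a GHC primop or an Intel API, and the Intel Intrinsics Guide remains the authoritative description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar reference model of a per-lane byte shuffle (illustration only).
 * shuffle_epi8_ref is a hypothetical name, not part of any real API.
 * nbytes is 16, 32 or 64 for the 128-, 256- and 512-bit variants.
 * dst must not alias a, because a is read after dst is written. */
static void shuffle_epi8_ref(const uint8_t *a, const uint8_t *b,
                             uint8_t *dst, size_t nbytes)
{
    assert(nbytes % 16 == 0);              /* whole 128-bit lanes only */
    for (size_t i = 0; i < nbytes; i++) {
        size_t lane_base = i & ~(size_t)15; /* first byte of this lane */
        if (b[i] & 0x80)
            dst[i] = 0;                     /* high bit set: zero the byte */
        else
            dst[i] = a[lane_base + (b[i] & 0x0F)]; /* index within the lane */
    }
}
```

Note that indices never cross a 16-byte lane boundary, so the 512-bit form behaves like four independent 128-bit shuffles rather than one 64-entry table lookup.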

#15250: Add support for _mm512_shuffle_epi8 intrinsic
-------------------------------------+-------------------------------------
           Reporter:  newhoggy       |             Owner:  (none)
               Type:  bug            |            Status:  new
           Priority:  normal         |         Milestone:  8.8.1
          Component:  Compiler       |           Version:  8.4.3
         Resolution:                 |          Keywords:
   Operating System:  Unknown/Multiple |      Architecture:
                                     |  Unknown/Multiple
    Type of failure:  None/Unknown   |         Test Case:
         Blocked By:                 |          Blocking:
    Related Tickets:                 |  Differential Rev(s):
          Wiki Page:                 |
-------------------------------------+-------------------------------------

Comment (by carter):

 Repeating my remark from Phab:

 Hmm, I suspect the AVX512 stuff may not be terribly performant, and some
 (or many) Intel CPUs in consumer and dev machines won't support it. Have
 you tested using it via the C FFI for your inner loops first?

 This is definitely an example of an instruction where the shuffle data can
 be supplied at runtime, so I'd suggest first making sure an inner loop or
 subroutine in C has satisfactory perf.

 What's the context driving this effort?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15250#comment:2

Comment (by newhoggy):

 Yes, I've recently discovered that my laptop does not have AVX512 support,
 in which case I'd like to have equivalent support for the AVX2 instruction
 _mm256_shuffle_epi8 and the _mm_shuffle_epi8 instruction, trading some
 performance for compatibility on such platforms.

 I will also need to be able to load and unload packed bytes to and from
 vectors, which will require some additional primops.

 The context is that I would like to implement Data Parallel State Machines,
 which have been documented to run 5x faster than state machines without
 SIMD support, and which also open the way to parallelising state machines
 (that are normally run serially) across multiple cores.

 https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/asplos302-mytkowicz.pdf

 Being able to run state machines with high performance is in itself a
 worthwhile endeavour, but I would like to investigate the possibility of
 using this to build very fast JSON, XML and CSV parsers.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15250#comment:3
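To make the connection between byte shuffles and state machines concrete, here is a rough scalar sketch of the idea as I read it from the linked paper: keep one byte per possible start state, and for each input byte replace every tracked state with its successor by using the current states as indices into that byte's transition row. That update is exactly a 16-byte shuffle with the transition row as the table. The names `shuffle16` and `run_all_starts`, the 16-state limit, and the flat 256 x 16 table layout are all assumptions made for this example, not something taken from the ticket or from GHC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { NUM_STATES = 16 };  /* one byte lane per state, as in a 128-bit vector */

/* Scalar stand-in for a single 128-bit byte shuffle (hypothetical helper):
 * dst[i] = table[idx[i] & 0x0F], or 0 when the high bit of idx[i] is set. */
static void shuffle16(const uint8_t *table, const uint8_t *idx, uint8_t *dst)
{
    for (int i = 0; i < NUM_STATES; i++)
        dst[i] = (idx[i] & 0x80) ? 0 : table[idx[i] & 0x0F];
}

/* run_all_starts (hypothetical name): transitions is a flat 256 * 16 table,
 * where transitions[c * 16 + s] is the state reached from s on input byte c.
 * On return, reached[s0] is the state reached from start state s0 after
 * consuming the whole input. */
static void run_all_starts(const uint8_t *transitions,
                           const uint8_t *input, size_t len,
                           uint8_t *reached)
{
    assert(transitions != NULL && reached != NULL);
    for (int s = 0; s < NUM_STATES; s++)
        reached[s] = (uint8_t)s;            /* identity: nothing consumed yet */
    for (size_t k = 0; k < len; k++) {
        uint8_t next[NUM_STATES];
        /* One shuffle advances all 16 candidate runs by one input byte. */
        shuffle16(transitions + (size_t)NUM_STATES * input[k], reached, next);
        memcpy(reached, next, sizeof next);
    }
}
```

Because the result for a chunk is itself a 16-entry mapping from start state to end state, the mappings for two adjacent chunks can be combined with one more shuffle (`shuffle16(reached_second, reached_first, combined)`), which is what would allow chunks to be processed on separate cores and stitched together afterwards.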

Comment (by newhoggy):

 I was recently inspired by a talk by Gabriel Gonzalez on this topic:
 http://www.youtube.com/watch?v=b4bb8EP_pIE

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15250#comment:4