On 7/15/07, Donald Bruce Stewart <dons@cse.unsw.edu.au> wrote:
Oh, and I forgot you count up by two now. Here's the Haskell
transliteration (again).


    {-# OPTIONS -O2 -optc-O -fbang-patterns #-}

    import Control.Monad.ST
    import Data.Array.ST
    import Data.Array.Base
    import System
    import Control.Monad
    import Data.Bits

    main = print (pureSieve 10000000)

    pureSieve :: Int -> Int
    pureSieve n = runST( sieve n )

    sieve n = do
        a <- newArray (3,n) True :: ST s (STUArray s Int Bool)
        let cutoff = truncate (sqrt (fromIntegral n)) + 1
        go a n cutoff 3 1

    go !a !m cutoff !n !c
      | n >= m    = return c
      | otherwise = do
              e <- unsafeRead a n
              if e then
                if n < cutoff
                    then let loop !j
                              | j < m     = do
                                  x <- unsafeRead a j
                                  when x $ unsafeWrite a j False
                                  loop (j+n)

                              | otherwise = go a m cutoff (n+2) (c+1)

                        in loop ( if n < 46340 then n * n else n `shiftL` 1)
                    else go a m cutoff (n+2) (c+1)

                   else go a m cutoff (n+2) c


Marginally faster:

    $ time ./primes
    664579
    ./primes  0.34s user 0.00s system 89% cpu 0.385 total

Very cache-dependent though, so widely varying runtimes could be
expected.

-- Don

Hi Donald, quick question.  So, one of the things that is very interesting about Haskell is it's potential for automatic threading, ie you write a trivial algorithm that looks like it runs in a single thread, and the runtime splits it across multiple cores automatically.

It's fairly safe to say that maps, foldrs, foldls, and their derivatives are safe to parallelize?  (For example, hand-waving argument, a foldr of (/) on [1,5,7,435,46,2] can be split into a foldr on [1,5,7] and a foldr on [435,46,2], then their results combined).

To what extent is the technology you are using in your algorithm parallizable?  (I actually cant tell, it's a genuine question).  In the case that it is parallelizable, to what extent is it trivial for a runtime to know this?  (Again, I dont have enough information to tell)