Don’t worry about the time to append rather than prepend; it is not “exponential”, nor even cubic.

You are making a list of length N in an O(N^2) algorithm. Appending to the list adds another tiny O(N^2) step, which is dominated by the main algorithm: also O(N^2), it both takes longer and evaluates vastly more values (failures as well as successes).
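To make the cost comparison concrete, here is a small sketch (my own illustration, not code from the thread) of the two accumulation strategies:

```haskell
-- Building in discovery order by repeated appending: each append
-- copies the whole accumulator, so N appends cost O(N^2) in total.
buildByAppend :: [a] -> [a]
buildByAppend = foldl (\ys y -> ys ++ [y]) []

-- Prepending and reversing once at the end costs O(N) in total,
-- at the price of producing nothing until the input is exhausted.
buildByReverse :: [a] -> [a]
buildByReverse = reverse . foldl (\ys y -> y : ys) []
```

Both produce the same list in discovery order; they differ only in cost and in when elements become available.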

Message: 3

Date: Thu, 21 Jan 2016 19:05:17 +0000

From: Rein Henrichs <rein.henrichs@gmail.com>

To: The Haskell-Beginners Mailing List - Discussion of primarily

              beginner-level topics related to Haskell <beginners@haskell.org>,

              masonmlai@gmail.com

Subject: Re: [Haskell-beginners] program not running lazily

Message-ID:

              <CAJp6G8zemeHpYwZ36ShMJ-m_BishVnUJpoqEmK=DosKHRVaqeg@mail.gmail.com>

Content-Type: text/plain; charset="utf-8"

 

s/not/note, sorry

 

On Thu, Jan 21, 2016 at 10:42 AM Rein Henrichs <rein.henrichs@gmail.com>

wrote:

 

> But not that doing so will cause the program to have an exponential

> runtime as each new ys must be repeatedly traversed to append a [y]. The

> alternative is to *unfold* your list by recursing on the right hand side of

> the (:) to add new elements.
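The unfold idea can be sketched as follows (an illustration, with a hypothetical predicate `good` standing in for the 9-mer test; not code from the thread):

```haskell
-- Guarded recursion: each new element is placed on the right-hand
-- side of (:), so results stream out in discovery order and
-- `take n` can stop the search early.
goodOnes :: (a -> Bool) -> [a] -> [a]
goodOnes _    []     = []
goodOnes good (x:xs)
  | good x    = x : goodOnes good xs  -- emit now, recurse lazily
  | otherwise = goodOnes good xs
```

Because the recursive call sits under (:), something like `take 3 (goodOnes even [1..])` terminates even on an infinite input.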

> 

> On Thu, Jan 21, 2016 at 5:43 AM Doug McIlroy <doug@cs.dartmouth.edu>

> wrote:

> 

>> Each time you find another good 9-mer, you add it to

>> the head of the list. This means that the ultimate

>> list will be in reverse order of discovery: the first element

>> to be printed is the last one to be found.  To get

>> output in the order it was discovered, build the

>> output by ys++[y] rather than y:ys.

>> _______________________________________________

>> Beginners mailing list

>> Beginners@haskell.org

>> http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners

>> 

> 


 

------------------------------

 

Message: 4

Date: Thu, 21 Jan 2016 11:34:35 -0800

From: Mason Lai <masonmlai@gmail.com>

To: Rein Henrichs <rein.henrichs@gmail.com>, doug@cs.dartmouth.edu

Cc: The Haskell-Beginners Mailing List - Discussion of primarily

              beginner-level topics related to Haskell <beginners@haskell.org>

Subject: Re: [Haskell-beginners] program not running lazily

Message-ID:

              <CAH1iVpdf1tNsdHakTsM-1u4=2hu6GHpb47=gj+MZuxybUtb2Mw@mail.gmail.com>

Content-Type: text/plain; charset="utf-8"

 

I've replaced the "if minimum < 3" line with an "any (< 3)" line. This has

sped up the performance to be good enough. (I assume that I have to

calculate the Lev. distance of all the 9-mers in order to take the minimum.

I really only care if any element has a Lev. distance less than three, so I

can stop when I find the first.) The rest of this discussion is purely for

fun.
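The short-circuiting can be seen directly (my own illustration): `any` stops at the first witness, while `minimum` has to force every element.

```haskell
-- True as soon as one distance is below 3; the rest of the list
-- (and the Levenshtein computations that would produce it) is
-- never demanded.
closeEnough :: [Int] -> Bool
closeEnough = any (< 3)
```

For instance, `closeEnough (4 : 2 : error "never forced")` returns True without touching the error, whereas `minimum` on the same list would crash.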

 

I've swapped "y:ys" for "ys ++ [y]", and while the output order is reversed, I

still don't appear to be able to take the first n elements. I haven't timed

how long the program takes now to see if it blows up. Rein, I don't quite

understand your answer; I may need you to be more explicit if I don't

figure this out. Part of what confuses me is that the recursion takes place

in merge, and the (:) is in addInto. I think your comment has given me an

idea, so I'm going to take some time to play with this in the afternoon, and

I'll send an update tonight.

 

Thanks for looking at this!

-- Mason

 

On Thu, Jan 21, 2016 at 11:05 AM, Rein Henrichs <rein.henrichs@gmail.com>

wrote:

 

> s/not/note, sorry

> 

> On Thu, Jan 21, 2016 at 10:42 AM Rein Henrichs <rein.henrichs@gmail.com>

> wrote:

> 

>> But not that doing so will cause the program to have an exponential

>> runtime as each new ys must be repeatedly traversed to append a [y]. The

>> alternative is to *unfold* your list by recursing on the right hand side of

>> the (:) to add new elements.

>> 

>> On Thu, Jan 21, 2016 at 5:43 AM Doug McIlroy <doug@cs.dartmouth.edu>

>> wrote:

>> 

>>> Each time you find another good 9-mer, you add it to

>>> the head of the list. This means that the ultimate

>>> list will be in reverse order of discovery: the first element

>>> to be printed is the last one to be found.  To get

>>> output in the order it was discovered, build the

>>> output by ys++[y] rather than y:ys.

>>> _______________________________________________

>>> Beginners mailing list

>>> Beginners@haskell.org

>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners

>>> 

>> 


 

------------------------------

 

Message: 5

Date: Fri, 22 Jan 2016 17:33:33 +1100

From: Thomas Koster <tkoster@gmail.com>

To: beginners@haskell.org

Subject: [Haskell-beginners] Increasing capabilities dramatically

              increases execution time

Message-ID:

<CAG1wH7DDYULMComW8QNntU4s6hcrhU53TAFywKfub+QqJi+BRQ@mail.gmail.com>

Content-Type: text/plain; charset=UTF-8

 

Hi friends,

 

I have encountered a situation in a concurrent program where I see an

unexpected, dramatic increase in execution time when I add additional

capabilities. On a multi-core CPU, "-N1" is fastest by an order of

magnitude and the program increasingly slows for an increasing number

of capabilities (still fewer than the number of real cores, of

course).

 

My question is, why is this so and what can I do to improve this?

 

Some details:

 

I have a shared data structure which I call a "store", which is

basically a strict HashMap String Value. This structure is shared

between Haskell threads using an IORef/MVar pair:

 

data Store = Store (IORef (HashMap Key Value)) (MVar ())

 

Focus on the IORef/MVar pair - the HashMap itself is not important. My

intention is that read-only transactions can take the pure value

straight from the IORef without blocking writers or other readers,

whereas mutating transactions (those that will update the IORef when

they complete) are serialized using the MVar. Alternatively, you can

regard the read-only transaction as working with a private snapshot of

the store that is discarded after it completes.

 

It is my hope that this technique will allow my program to exploit a

multi-core CPU by running several read-only transactions and at most

one mutating transaction in parallel. I am also assuming that this

technique is marginally more efficient than STM for this use case,

especially for write-heavy loads where I am assuming STM would waste

some time on retries (I did not test this).
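For comparison, the STM analogue of the IORef/MVar pair would look roughly like this (my sketch, keyed by String with Int values just to keep it concrete; not code from the post):

```haskell
import Control.Concurrent.STM
  (TVar, atomically, newTVarIO, readTVar, readTVarIO, writeTVar)
import qualified Data.Map.Strict as M

-- One TVar holds the whole map. Readers take a snapshot without
-- blocking; writers are serialized by STM, which may retry a
-- transaction if a conflicting commit lands first.
type StoreSTM = TVar (M.Map String Int)

readStoreSTM :: StoreSTM -> (M.Map String Int -> a) -> IO a
readStoreSTM tv f = f <$> readTVarIO tv

modifyStoreSTM :: StoreSTM -> (M.Map String Int -> (M.Map String Int, a)) -> IO a
modifyStoreSTM tv f = atomically $ do
  m <- readTVar tv
  let (m', x) = f m
  writeTVar tv m'
  return x
```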

 

-- | Execute a read-only transaction that returns a value.

withStore :: Store -> (HashMap Key Value -> a) -> IO a

withStore (Store ioref _) f = do

  store <- readIORef ioref

  return (f store)

 

-- | Execute a transaction that updates the store and returns a value.

modifyStore :: Store -> (HashMap Key Value -> (HashMap Key Value, a)) -> IO a

modifyStore (Store ioref mvar) f =

  modifyMVar mvar $ \ z -> do

    store <- readIORef ioref

    let (store', x) = f store

    store' `seq` writeIORef ioref store'

    return (z, x)

 

Stop me right here if this is a silly way to do this.

 

I have a test that forks 4 threads that each increment a counter in

the store 100000 times, with the expectation that the final answer is

400000. I use the "async" package for this. This test is not

necessarily pathological. I expect simple operations like incrementing

counters and concatenating text to be typical transactions.

 

  threads <- replicateM 4 $ async $

               replicateM_ 100000 $

                 modifyStore store (...increment a counter...)

  forM_ threads wait
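For reference, here is a self-contained version of the snippets above (my reconstruction: it substitutes Data.Map.Strict for the HashMap so it builds with only GHC's boot libraries, and `newStore`/`increment` are hypothetical names, not from the post):

```haskell
import Control.Concurrent.MVar (MVar, modifyMVar, newMVar)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import qualified Data.Map.Strict as M

type Key   = String
type Value = Int

-- IORef for lock-free reads, MVar as the writer lock.
data Store = Store (IORef (M.Map Key Value)) (MVar ())

newStore :: IO Store
newStore = Store <$> newIORef M.empty <*> newMVar ()

-- Read-only transaction: snapshot the map, no locking.
withStore :: Store -> (M.Map Key Value -> a) -> IO a
withStore (Store ioref _) f = f <$> readIORef ioref

-- Mutating transaction: serialized by the MVar, result forced
-- before the IORef is updated.
modifyStore :: Store -> (M.Map Key Value -> (M.Map Key Value, a)) -> IO a
modifyStore (Store ioref mvar) f =
  modifyMVar mvar $ \z -> do
    store <- readIORef ioref
    let (store', x) = f store
    store' `seq` writeIORef ioref store'
    return (z, x)

-- The kind of counter transaction the test elides.
increment :: Key -> Store -> IO Value
increment k store = modifyStore store $ \m ->
  let n = M.findWithDefault 0 k m + 1
  in (M.insert k n m, n)
```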

 

In this test, while any thread is busy modifying the store, the other

three are blocked on the empty MVar, irrespective of how many

capabilities I have. Therefore, I expect the execution time with -N4

to be similar to -N1. I expect the only difference to be attributable

to the runtime's scheduling overheads, which I assume are relatively

insignificant. Instead, I see a dramatic increase in execution time

with increasing capabilities (typical measurements below).

 

By the way, I say I assume scheduling overheads are "relatively

insignificant" compared to incrementing the counter because in my

program, the counter is incremented by interpreting an imperative

EDSL, which I assume is relatively inefficient compared to e.g.

"succ", but perhaps I am mistaken. In a way, my whole question

probably centres around this assumption.

 

I would be grateful if anyone can illuminate the reason for this

dramatic increase in execution time when I increase the number of

capabilities, and any hints on how I can mitigate it.

 

I am using GHC 7.10.2 and compile with -O -threaded. All library

versions are as at Stackage LTS 3.2.

 

A typical measurement, with -N1:

 

                                     Tot time (elapsed)  Avg pause  Max pause

  Gen  0       516 colls,     0 par    0.000s   0.003s     0.0000s    0.0001s

  Gen  1         2 colls,     0 par    0.000s   0.001s     0.0003s    0.0003s

 

  TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1)

 

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

 

  INIT    time    0.000s  (  0.001s elapsed)

  MUT     time    0.136s  (  0.139s elapsed)

  GC      time    0.000s  (  0.003s elapsed)

  EXIT    time    0.000s  (  0.001s elapsed)

  Total   time    0.136s  (  0.144s elapsed)

 

With -N2:

 

                                     Tot time (elapsed)  Avg pause  Max pause

  Gen  0       334 colls,   334 par    0.012s   0.006s     0.0000s    0.0001s

  Gen  1         2 colls,     1 par    0.000s   0.000s     0.0002s    0.0003s

 

  Parallel GC work balance: 39.33% (serial 0%, perfect 100%)

 

  TASKS: 6 (1 bound, 5 peak workers (5 total), using -N2)

 

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

 

  INIT    time    0.000s  (  0.002s elapsed)

  MUT     time    2.032s  (  2.456s elapsed)

  GC      time    0.012s  (  0.007s elapsed)

  EXIT    time    0.000s  (  0.000s elapsed)

  Total   time    2.044s  (  2.465s elapsed)

 

With -N4:

 

                                     Tot time (elapsed)  Avg pause  Max pause

  Gen  0       133 colls,   133 par    0.032s   0.005s     0.0000s    0.0001s

  Gen  1         2 colls,     1 par    0.000s   0.001s     0.0003s    0.0003s

 

  Parallel GC work balance: 41.33% (serial 0%, perfect 100%)

 

  TASKS: 10 (1 bound, 9 peak workers (9 total), using -N4)

 

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

 

  INIT    time    0.000s  (  0.003s elapsed)

  MUT     time    3.516s  (  4.431s elapsed)

  GC      time    0.032s  (  0.005s elapsed)

  EXIT    time    0.000s  (  0.001s elapsed)

  Total   time    3.548s  (  4.439s elapsed)

 

Thanks,

Thomas Koster

 

 

------------------------------

 

Subject: Digest Footer

 

_______________________________________________

Beginners mailing list

Beginners@haskell.org

http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners

 

 

------------------------------

 

End of Beginners Digest, Vol 91, Issue 26

*****************************************