Efficient Factoring Out of Constants

Hi,

I’ve been thinking about factoring constants out of iterations and have spotted another place in my code where I can make use of this.

See the two examples below – the first example iterates over the mcSimulate function – this has changed a little bit but essentially still recurses around passing in
two constants, and two variables that are changed each time it is called – it has the following form:

mcSimulate (startStock, endTime, seedForSeed, runningSum) = ( startStock, endTime, newSeedForSeed, newRunningSum )

I figured I’m passing around the constants startStock and endTime so looked factor these out producing the second example below.

My concern is that although iterate function now only handles two variables, it’s still passing around 1 tuple, which under the bonnet is likely to be represented in machine code as a pointer? Humor me here a little – I know I’m thinking of this in terms of C++, but I’m guessing the final byte code will adhere to this:

Thus each time mcSimulate is called a machine code subroutine will be passed a memory address to the input data. Now, the crux of this is, will it make a COPY of the old input data, BUT with the newSeedForSeed and newRunningSum, to pass to the next iteration? If this is the case each iteration does two useless copies of startStock and endTime? Here the second example should produce better code because nothing constant is copied from 1 iteration to the next. However if the compiler is smarter and simply REPLACES the modified data the second example buys us nothing.

However, again, depending very much on the compiler, the second example may actually be less efficient. Let’s say the compiler is super smart and doesn’t copy around the startStock and endTime on each iteration in the first example. Then we’ve gained nothing. However the second example Will call 3 functions each iteration:

mcWrapper -> mcSimulate -> getStateInfo

In the second example we probably have something like 6 ‘JMP’ statements in machine code – 3 to jump in to each function, and 3 to jump back out. In the first we have 2 – one to jump us into mcSimulate and one to return. So each iteration executes 4 more JMPs in the second example. All others things being equal this will produce slightly less efficient code.

Now I know I’m speculating like crazy, and perhaps I’m drunk with efficiency here, but it would seem to me that whatever way it works there will be a certain critical mass of constant data that you can carry around that once breached (i.e. When the copy operations exceed the CPU time taken for the 4 extra JMPs) you will be better off factoring the constant data out..... That is assuming any of my analysis is correct :-)

If anyone has any insight into how this might looked once compiled down to machine code, or has an opinion one which example below makes for better Haskell, I’d be grateful for any comments, advice or discussion.

Cheers,

Phil.

Note: I recognize the use of getSum and getStateInfo could be polished using data structures instead, and the use of !! will not produce strict evaluation.
-------------

getSum :: (Double, Double, Word64, Double) -> Double
getSum (startStock, endTime, seedForSeed, runningSum) = runningSum

getAveragePayoff :: Double -> Double -> Int -> Word64 -> Double
getAveragePayoff startStock endTime iterations seedForSeed = average
  where
    average = (getSum $ (iterate mcSimulate (startStock, endTime, seedForSeed, 0)) !! iterations ) / fromIntegral iterations

---------------

getStateInfo :: (Double, Double, Word64, Double) -> (Word64,Double)
getStateInfo (startStock, endTime, seedForSeed, runningSum) = (seedForSeed, runningSum)

getAveragePayoff :: Double -> Double -> Int -> Word64 -> Double
getAveragePayoff startStock endTime iterations seedForSeed = average
  where
    average = (snd $ (iterate mcWrapper (seedForSeed,0)) !! iterations ) / fromIntegral iterations
      where
        mcWrapper = \(seed,runningSum) -> getStateInfo $ mcSimulate ( startStock, endTime, seed, runningSum )

On 16/01/2009 01:41, "Phil" <pbeadling@mail2web.com> wrote:

On 16/01/2009 01:28, "Luke Palmer" <lrpalmer@gmail.com> wrote:

Compile-time constants could be handled by simple top-level bindings. This technique is specifically for the case you are after:

mcSimulate :: Double -> Double -> Word64 -> [Double]
mcSimulate startStock endTime seedForSeed = go seedForSeed
where
    go = fst expiryStock : go newSeedForSeed
      where
     expiryStock = iterate evolveUnderlying (startStock, ranq1Init seedForSeed)
                        !! truncate (endTime/timeStep)
     newSeedForSeed = seedForSeed + 246524

See what's going on there?

I don't know about that nested where. In Real Life I would probably use a let instead for expiryStock and newSeedForSeed.

Luke

Ahh, I get it now, that’s pretty neat - ‘go’ is only updating the seedForSeed and the expiryStock, the inner ‘where’ clause keeps everything else constant each time it is called.

Thanks again!

Phil.

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe