Re: [Haskell-cafe] Fast JSON validation - reducing allocations

12 May 2017

      On Fri, 2017-05-12 at 10:27 +0200, Arjen wrote:
...
...
Morning all,

On 11 May 2017 at 20:32, Ben Gamari  wrote:
...
Something that I should have mentioned earlier is that STG has
On Fri, 2017-05-12 at 09:10 +0100, David Turner wrote:
the
...
...
nice
property that all allocation is syntactically obvious: allocated
closures manifest as `let`s. This makes it fairly easy to pick
out
possible allocation sites, even in large dumps.

Ah, that's very useful to know!

Armed with that knowledge, I came to the conclusion that the
allocation was for the sharing of the `nextState` variable.
Inlining
it brings it down to 20us and 22kB per iteration.

https://github.com/DaveCTurner/json-validator/commit/ec994ec9226ca7
bc
2e76f19bef98f42e0b233524

Getting closer, but it looks like waiting for 8.2 is a better
answer.
Looking forward to it!

Cheers,

Maybe this is a silly question, and please let me know why if so,
but:
Has anyone thought about parallelizing it for multiple messages in
order to "produce garbage faster"? While reducing allocation will
make
the single validations faster, doing multiple ones might improve the
throughput per GC ratio. This assumes that the amount of live data in
the heap is small, making GC sort of constant time, and having
multiple
cores available.
I wonder whether a few strategically placed par's and pseq's might
allow you to scale horizontally without requiring invasive changes to
the program's structure. Apologies for not trying to do this myself
first ;-).
Unfortunately, this does not seem to help. I tried it and while I did
see 3.5 and 3.95 CPU's used with parList and parListChunk 8, the
timings show it to be slower than sequential evaluation of a list of
testEvent. Thank you for the problem to try this on :-)

Please let me know if I made a mistake in the testing code below.

kind regards, Arjen

---8<---

benchmarking json-validator/Automaton/testEvent
time                 26.57 μs   (26.07 μs .. 27.12 μs)
                     0.987 R²   (0.973 R² .. 0.995 R²)
mean                 31.01 μs   (28.17 μs .. 44.49 μs)
std dev              15.65 μs   (5.258 μs .. 35.77 μs)
variance introduced by outliers: 99% (severely inflated)

benchmarking json-validator/Automaton-List/testEventList
time                 65.11 ms   (61.54 ms .. 69.54 ms)
                     0.982 R²   (0.951 R² .. 0.996 R²)
mean                 59.18 ms   (56.28 ms .. 63.22 ms)
std dev              5.801 ms   (4.344 ms .. 8.238 ms)
variance introduced by outliers: 32% (moderately inflated)

benchmarking json-validator/Automaton-ParList/testEventList
time                 243.9 ms   (214.9 ms .. 270.6 ms)
                     0.996 R²   (0.986 R² .. 1.000 R²)
mean                 253.7 ms   (243.6 ms .. 260.8 ms)
std dev              10.32 ms   (6.781 ms .. 13.19 ms)
variance introduced by outliers: 16% (moderately inflated)

benchmarking json-validator/Automaton-ParListChunk/testEventList
time                 211.4 ms   (193.1 ms .. 232.3 ms)
                     0.997 R²   (0.990 R² .. 1.000 R²)
mean                 200.3 ms   (193.0 ms .. 206.6 ms)
std dev              9.256 ms   (7.106 ms .. 10.21 ms)
variance introduced by outliers: 14% (moderately inflated)

  19,225,632,224 bytes allocated in the heap
     109,968,376 bytes copied during GC
       2,062,736 bytes maximum residency (20 sample(s))
       2,250,352 bytes maximum slop
              10 MB total memory in use (0 MB lost due to
fragmentation)

                                     Tot time (elapsed)  Avg pause  Max
pause
  Gen  0     28722 colls, 28722
par   17.829s   1.591s     0.0001s    0.0160s
  Gen  1        20 colls,    19
par    0.255s   0.054s     0.0027s    0.0058s

  Parallel GC work balance: 7.41% (serial 0%, perfect 100%)

  TASKS: 13 (1 bound, 12 peak workers (12 total), using -N4)

  SPARKS: 25625 (25570 converted, 0 overflowed, 0 dud, 20 GC'd, 35
fizzled)

  INIT    time    0.001s  (  0.002s elapsed)
  MUT     time   44.403s  ( 22.013s elapsed)
  GC      time   18.084s  (  1.645s elapsed)
  EXIT    time    0.001s  (  0.001s elapsed)
  Total   time   62.490s  ( 23.660s elapsed)

  Alloc rate    432,981,654 bytes per MUT second

  Productivity  71.1% of total user, 93.0% of total elapsed

gc_alloc_block_sync: 25370
whitehole_spin: 0
gen[0].sync: 405
gen[1].sync: 22

---8<---

{-# LANGUAGE OverloadedStrings #-}

module Main where

import           Criterion.Main
import           Data.Aeson
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Encoding   as T
import qualified Data.Text.IO as T
import qualified Data.Text as T (append, pack)
import qualified Automaton
import Control.Parallel.Strategies

testEvent :: Int -> BL.ByteString
testEvent i = BL.fromStrict $ T.encodeUtf8 (T.append
"{\"stream\":\"actuals-
stream\",\"submitter\":{\"type\":\"other\",\"description\":\"redactedre
dac\"},\"driverActivities\":[{\"driverActivity\":{\"journey\":{\"headco
de\":\"1A01\",\"description\":\"redactedredactedredactedredactedredacte
dredacte\"},\"activity\":[{\"arrivalTime\":null,\"sequence\":1,\"tiploc
\":\"REDACTE\",\"stop\":true,\"departureTime\":\"2016-06-
09T18:22:28.000000000000Z\"},{\"arrivalTime\":\"2016-06-
09T18:24:43.000000000000Z\",\"sequence\":2,\"tiploc\":\"REDACTE\",\"sto
p\":true,\"departureTime\":\"2016-06-
09T18:25:51.000000000000Z\"},{\"arrivalTime\":\"2016-06-
09T18:26:58.000000000000Z\",\"sequence\":3,\"tiploc\":\"REDACT\",\"stop
\":true,\"departureTime\":\"2016-06-
09T18:28:08.000000000000Z\"},{\"arrivalTime\":\"2016-06-
09T18:29:57.000000000000Z\",\"sequence\":4,\"tiploc\":\"REDACTE\",\"sto
p\":true,\"departureTime\":null}]},\"activityUserId\":\"521be60a-02f2-
4892-b468-c17d9c1c4fcf\"}],\"submissionTime\":\"2016-06-
09T18:36:45.831486000000Z\",\"type\":\"driverActivityLogged" (T.append
(T.pack (show i)) "\"}"))

data AnyJSON = AnyJSON
  deriving (Show, Eq)

instance FromJSON AnyJSON where
  parseJSON _ = pure AnyJSON

main :: IO ()
main = print (testEventList [0]) >> defaultMain
  [ bgroup "json-validator/Automaton"
    [ bench "testEvent" $ whnf Automaton.isValidJson (testEvent 0)
    ]
  , bgroup "json-validator/Automaton-List"
    [ bench "testEventList" $ whnf (\es -> and (testEventList es
`using` rseq)) [1..1000]
    ]
  , bgroup "json-validator/Automaton-ParList"
    [ bench "testEventList" $ whnf (\es -> and (testEventList es
`using` parList rseq)) [1..1000]
    ]
  , bgroup "json-validator/Automaton-ParListChunk"
    [ bench "testEventList" $ whnf (\es -> and (testEventList es
`using` parListChunk 8 rseq)) [1..1000]
    ]
  ]
  where
    testEventList = map (Automaton.isValidJson . testEvent)