Isn't this basically a clearer, more elegant version of my original implementation? I was somewhat surprised that GHC couldn't turn it into Milan's expression. I'm PERFECTLY HAPPY to go with your "effectivized maybe" version. It's clear, it's asymptotically optimal, and its overall performance is on par with the Milan one.

On Wed, Oct 15, 2014 at 10:56 PM, Kim-Ee Yeoh <ky3@atamo.com> wrote:

On Wed, Oct 15, 2014 at 7:32 AM, Andreas Abel <abela@chalmers.se> wrote:
Rowing back, I first did the proofs for

Variant 1: drop (length ys - length xs) ys == xs

Variant 2: drop (length (drop (length xs) ys)) ys == xs

Fab!

In 10 years once I'm fearsome like Andreas I shall take a tilt at Agdaizing David-Milan.

Meanwhile I wonder if Agda can perform the following proof-to-proof transformation.

Let's let-expand Andreas-dropl like this:

   ns `isSuffixOfA` hs =  let
      delta = dropl ns hs in
      ns == dropl delta hs

Recall dropl ns hs ~ drop (length ns) hs.

But needle longer than haystack is a D.O.A. If only there was a kill-switch we could flip.

Cue Maybe as exceptions lite to provide just the thing:

   ns `isSuffixOfA` hs =  fromMaybe False $ do
      delta <- dropm ns hs
      return $ ns == dropl delta hs

Here's dropm and note how dropl ns hs ~ fromMaybe [] $ dropm ns hs.

dropm :: [b] -> [a] -> Maybe [a]
dropm (_:xs) (_:ys) = dropm xs ys
dropm [] ys = Just ys
dropm _ []  = Nothing   -- xs longer than ys

I put it to you that this new, improved, effectivized version is equivalent to David-Milan. Not forgetting that the right combinators will do-desugar back to a one-liner.

Indeed the benchmarks show that M-A-d still pips out D-M in short-long. Best of all, the gap in everything else has shrunk to small single digits. E.g. short-short has narrowed from 9% to 2%:

benchmarking simple shortNeedle in shortHaystack/Milan
time                 136.3 ns   (136.2 ns .. 136.3 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 136.3 ns   (136.3 ns .. 136.4 ns)
std dev              222.5 ps   (36.51 ps .. 466.3 ps)

benchmarking simple shortNeedle in shortHaystack/Abel
time                 139.7 ns   (139.6 ns .. 139.9 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 139.7 ns   (139.6 ns .. 139.9 ns)
std dev              477.6 ps   (296.8 ps .. 782.9 ps)

benchmarking simple shortNeedle in longHaystack/Milan
time                 44.91 μs   (44.91 μs .. 44.92 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 44.91 μs   (44.91 μs .. 44.92 μs)
std dev              14.94 ns   (12.14 ns .. 20.34 ns)

benchmarking simple shortNeedle in longHaystack/Abel
time                 43.57 μs   (43.57 μs .. 43.58 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 43.58 μs   (43.57 μs .. 43.60 μs)
std dev              32.49 ns   (11.69 ns .. 64.84 ns)

benchmarking simple longNeedle in shortHaystack/Milan
time                 118.3 ns   (118.3 ns .. 118.3 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 118.3 ns   (118.3 ns .. 118.3 ns)
std dev              17.68 ps   (13.54 ps .. 23.11 ps)

benchmarking simple longNeedle in shortHaystack/Abel
time                 120.0 ns   (120.0 ns .. 120.0 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 120.0 ns   (120.0 ns .. 120.0 ns)
std dev              12.54 ps   (8.954 ps .. 16.97 ps)

benchmarking simple longNeedle in longHaystack/Milan
time                 46.64 μs   (46.63 μs .. 46.67 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 46.66 μs   (46.65 μs .. 46.67 μs)
std dev              40.74 ns   (29.39 ns .. 59.83 ns)

benchmarking simple longNeedle in longHaystack/Abel
time                 47.25 μs   (47.23 μs .. 47.26 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 47.25 μs   (47.24 μs .. 47.26 μs)
std dev              22.33 ns   (18.24 ns .. 27.98 ns)
benchmarking simple shortNeedle in shortHaystack/Milan
time                 136.3 ns   (136.2 ns .. 136.3 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 136.3 ns   (136.3 ns .. 136.4 ns)
std dev              222.5 ps   (36.51 ps .. 466.3 ps)

benchmarking simple shortNeedle in shortHaystack/Abel
time                 139.7 ns   (139.6 ns .. 139.9 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 139.7 ns   (139.6 ns .. 139.9 ns)
std dev              477.6 ps   (296.8 ps .. 782.9 ps)

benchmarking simple shortNeedle in longHaystack/Milan
time                 44.91 μs   (44.91 μs .. 44.92 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 44.91 μs   (44.91 μs .. 44.92 μs)
std dev              14.94 ns   (12.14 ns .. 20.34 ns)

benchmarking simple shortNeedle in longHaystack/Abel
time                 43.57 μs   (43.57 μs .. 43.58 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 43.58 μs   (43.57 μs .. 43.60 μs)
std dev              32.49 ns   (11.69 ns .. 64.84 ns)

benchmarking simple longNeedle in shortHaystack/Milan
time                 118.3 ns   (118.3 ns .. 118.3 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 118.3 ns   (118.3 ns .. 118.3 ns)
std dev              17.68 ps   (13.54 ps .. 23.11 ps)

benchmarking simple longNeedle in shortHaystack/Abel
time                 120.0 ns   (120.0 ns .. 120.0 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 120.0 ns   (120.0 ns .. 120.0 ns)
std dev              12.54 ps   (8.954 ps .. 16.97 ps)

benchmarking simple longNeedle in longHaystack/Milan
time                 46.64 μs   (46.63 μs .. 46.67 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 46.66 μs   (46.65 μs .. 46.67 μs)
std dev              40.74 ns   (29.39 ns .. 59.83 ns)

benchmarking simple longNeedle in longHaystack/Abel
time                 47.25 μs   (47.23 μs .. 47.26 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 47.25 μs   (47.24 μs .. 47.26 μs)
std dev              22.33 ns   (18.24 ns .. 27.98 ns)


Whether the 2% difference is due to Maybe's injecting additional bottoms is anyone's guess.

Worth investigating is whether a core-to-core transformation can snap shut the gap, but 2% is already magnificent testimony to the efficacy of compositional, transformational, effectful programming that's uniquely Haskell.


-- Kim-Ee

_______________________________________________
Libraries mailing list
Libraries@haskell.org
http://www.haskell.org/mailman/listinfo/libraries