
I was writing a simple utility and I decided to use regexps to parse
filenames. (I know, now I have two problems :-) )
I was surprised at how slowly it ran, so I did a profiling build. The
profiled build runs reasonably quickly, about 7x faster than the
non-profiled build, which makes it hard to figure out where the
slowdown is happening in the non-profiled code. I’m wondering if I’m
doing something wrong, or if there’s a bug in |regex-tdfa| or in GHC.
I’ve pared my code down to just the following:
|import Text.Regex.TDFA ((=~))

main :: IO ()
main = do
  entries <- map parseFilename . lines <$> getContents
  let check (Right (_, t)) = last t == 'Z'
      check _ = False
  print $ all check entries

parseFilename :: String -> Either String (String, String)
parseFilename fn = case (fn =~ pattern :: [[String]]) of
  [[_, full, _, time]] -> Right $ (full, time)
  _ -> Left fn
  where pattern = "^\\./duplicity-(full|inc|new)(-signatures)?\\.\
                  \([0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]T[0-9][0-9][0-9][0-9][0-9][0-9]Z)\\."
|
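As a point of comparison, here is a minimal sketch of the same program with the regex replaced by a hand-written parser that uses only |base|. It is meant to accept the same filename prefix as the pattern above and return the same pair. The names |parseFilenameManual| and |isTimestamp| are hypothetical and not part of the original code. If the non-profiled build of this version is fast on the same input, that would suggest the time is going into the regex matching rather than the I/O, though it would not say whether the cause is |regex-tdfa| itself or how GHC optimizes it.

```haskell
module Main (main) where

import Data.Char (isDigit)
import Data.List (stripPrefix)
import Data.Maybe (fromMaybe, listToMaybe)

-- Hypothetical regex-free stand-in for parseFilename (same type).
-- Intended to accept the same prefix as the regex:
--   ./duplicity-(full|inc|new)(-signatures)?.<8 digits>T<6 digits>Z.
parseFilenameManual :: String -> Either String (String, String)
parseFilenameManual fn = maybe (Left fn) Right $ do
  rest0 <- stripPrefix "./duplicity-" fn
  -- First of full/inc/new that is a prefix of the remainder.
  (kind, rest1) <- listToMaybe
    [ (k, r) | k <- ["full", "inc", "new"], Just r <- [stripPrefix k rest0] ]
  -- "-signatures" is optional, so fall back to rest1 when it is absent.
  let rest2 = fromMaybe rest1 (stripPrefix "-signatures" rest1)
  rest3 <- stripPrefix "." rest2
  -- The timestamp is 16 characters and must be followed by a dot.
  let (time, rest4) = splitAt 16 rest3
  if isTimestamp time && take 1 rest4 == "."
    then Just (kind, time)
    else Nothing

-- True for exactly 8 ASCII digits, 'T', 6 ASCII digits, 'Z'.
isTimestamp :: String -> Bool
isTimestamp s = case splitAt 8 s of
  (date, 'T' : more) -> case splitAt 6 more of
    (clock, "Z") -> all isDigit date && all isDigit clock
    _            -> False
  _ -> False

main :: IO ()
main = do
  -- fmap instead of <$> so this builds across the whole base range
  -- without importing Control.Applicative.
  entries <- fmap (map parseFilenameManual . lines) getContents
  let check (Right (_, t)) = last t == 'Z'
      check _ = False
  print $ all check entries
```

Feeding both executables the same file list and comparing wall-clock times for the non-profiled builds would be one way to use this.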
The relevant part of my |.cabal| file looks like this:
|executable DuplicityAnalyzer
  main-is:          DuplicityAnalyzer.hs
  build-depends:    base >=4.6 && <4.11,
                    regex-tdfa >= 1.0 && <1.3
  default-language: Haskell2010
  ghc-options:      -Wall -rtsopts
|
To run the profiling, I do:
|cabal clean
cabal configure --enable-profiling
cabal build
dist/build/DuplicityAnalyzer/DuplicityAnalyzer