
Hello Dominik,
I'm not sure what exactly your algorithm is, but one thing that stands out
to me is the use of (Text,Text) index pairs instead of something more
efficient.
Here is an algorithm that I wrote which (without any optimization tricks)
seems to be fast and correct: https://pastebin.com/rDxUFbt3
Dynamic Programming in Haskell is a pleasure due to laziness, especially if
the choice variables are dense in the integers.
In this case, I have a Vector (Vector Bool), where the outer vector is
indexed by position into the lowercase string and the inner vector is
indexed by position into the uppercase string. vec ! i ! j is True iff
there is a solution to the problem for the first i characters in the
lowercase string and the first j characters in the uppercase string.
Because Vectors are lazy (unless you use one of the packed varieties) you
can assign each vector element the value corresponding to its solution -
before you even know what the solution is!
To get the final solution, you simply look at the vector element
corresponding to using the entirety of both strings. This will force
evaluation of the last cell, which will force the evaluation of some other
cells, which will force the evaluation of some other cells, etc. etc. If a
cell is ever accessed more than once, it still only gets computed one time,
so we have memoization.
This form is a little weird and took me a while to get the first time I saw
it, but I was delighted when I fully understood it.
If your choice variables are not dense in the integers, you can do the same
approach using a memo-trie, although there is a constant factor performance
loss compared to vectors.
--Will
On Sun, Jan 14, 2018 at 3:14 PM, Dominik Bollmann wrote: Hello Haskell-Cafe, While playing with dynamic programming problems, I've been trying to
solve the "Abbreviation" problem found on hackerrank.com at
https://www.hackerrank.com/challenges/abbr/problem. Briefly, this problem asks to decide whether a source string s can be
abbreviated into a target string t by capitalizing some of the
characters in s and deleting its afterwards remaining lowercase
characters. For example, the string s = "aBbdD" can be abbreviated as
target t = "BBD", but target t' = "XYZZ" is not an abbreviation for
source s' = "xyz". My solution to this problem is the following memoization-based
function `isAbbreviation`: ```
import Control.Monad
import Control.Monad.State
import Data.Char
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import Data.Maybe
import Data.String
import Data.Text (Text)
import qualified Data.Text as Text
import System.IO type Store = Map (Text, Text) Bool isAbbrMemo :: Text -> Text -> State Store Bool
isAbbrMemo s t
| Text.null t = extend s t $ return (Text.all isLower s)
| Text.null s = extend s t $ return False
| otherwise =
let (a, as) = fromJust $ Text.uncons s
(b, bs) = fromJust $ Text.uncons t
in extend s t $ matches a as b bs
where
matches a as b bs
| isLower a && toUpper a /= b = isAbbrMemo as (b `Text.cons` bs)
| isLower a && toUpper a == b = (||) <$> isAbbrMemo as bs
<*> isAbbrMemo as (b
`Text.cons` bs)
| isUpper a && a /= b = return False
| isUpper a && a == b = isAbbrMemo as bs extend :: Text -> Text -> State Store Bool -> State Store Bool
extend s t m = do
st <- get
case Map.lookup (s,t) st of
Just v -> return v
Nothing -> do
v <- m
modify $ Map.insert (s,t) v
return v isAbbreviation :: Text -> Text -> Bool
isAbbreviation s t = evalState (isAbbrMemo s t) Map.empty main :: IO ()
main = do
queries <- readQueries stdin
let answers = map yesNo $ map (uncurry isAbbreviation) queries
forM_ answers putStrLn yesNo :: Bool -> String
yesNo True = "YES"
yesNo False = "NO" readQueries :: IsString a => Handle -> IO [(a, a)]
readQueries h = do
numQueries <- read <$> hGetLine h :: IO Int
forM [1..numQueries] $ \_qid -> do
s <- hGetLine h
t <- hGetLine h
return (fromString s, fromString t)
``` However, running `isAbbreviation` on Hackerrank's input #13 still
takes around 38 seconds on my machine and is therefore too slow to be an
accepted solution. The input of question is attached as a text file. My question is therefore: Where could I further improve the running time
of the function `isAbbreviation`? Is there any low-hanging fruit to
improve upon? Or is my dynamic-programming based approach somehow
flawed in general? (in which I should rather rethink the problem?) Any observations, remarks, and improvements on the above code snippet
are greatly appreciated :-) Thanks, Dominik. _______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.