
Mirko Rahn wrote:
from the letters of that word. A letter can be used at most as many times as it appears in the input word. So, "letter" can only match words with 0, 1, or 2 t's in them.
frequencies = map (\x -> (head x, length x)) . group . sort superset xs = \ys -> let y = frequencies ys in length y == lx && and (zipWith (\(c,i) (d,j) -> c == d && i >= j) x y) where x = frequencies xs lx = length x
As far as I understand the spec, this algorithm is not correct:
superset "ubuntu" "tun" == False
Is at least one 'b' necessary, yes or no?
Oops, you are indeed right, the answer should be "no". I thought I'd came away without primitive recursion, but here's a correct version superset xs = superset' x . sort ys where x = sort xs _ `superset` [] = True [] `superset` _ = False (x:xs) `superset'` (y:ys) | x == y = xs `superset` ys | x < y = xs `superset` (y:ys) | otherwise = False
If the answer is no, the following algorithm solves the problem and is faster then the one above:
del y = del_acc [] where del_acc _ [] = mzero del_acc v (x:xs) | x == y = return (v++xs) del_acc v (x:xs) = del_acc (x:v) xs
super u = not . null . foldM (flip del) u
main = interact $ unlines . filter ("ubuntu" `super`) . lines
The algorithm is correct but it's not faster, xs `super` ys takes O(n*m) time whereas superset takes O(n * log n + m * log m) time given a proper sorting algorithm. Here, n = length xs and m = length ys. Actually, both algorithms are essentially the same except for the sorting that allows to drop some equality tests. (Note that memoizing x = sort xs over different ys speeds things up a bit for the intended application. This way, (sort "ubuntu") is only computed once and the running time over many ys approaches O(n + m*log m).) Regards, apfelmus PS: Some exercises for the interested reader: 1) Still, the algorithm super has an advantage over superset. Which one? 2) Put xs into a good data structure and achieve a O(m * log n) time for multiple ys. 3) Is this running time always better than the aforementioned O(n + m*log m)? What about very large m > n?