
On 2012-01-20 23:44, Gwern Branwen wrote:
On Fri, Jan 20, 2012 at 1:57 PM, Twan van Laarhoven
wrote: Here is some example code (untested):
Well, you're right that it doesn't work. I tried to fix the crucial function, 'atLeastThisManyDescendants', but it's missing something because varying parts doesn't much affect the results when I try it out on example input - it either returns everything or nothing, it seems: atLeastThisManyDescendants :: Int -> Trie a -> [CommonPrefix a] atLeastThisManyDescendants minD trie@(Trie l d t') | d < minD = [] | null forChildren = [Prefix [] trie] | otherwise = forChildren where forChildren = [ Prefix (x:pfx) nms | (x,t) <- Map.toList t' , Prefix pfx nms <- atLeastThisManyDescendants l t ]
It should be "atLeastThisManyDescendants minD t", minD is a threshold for the minimum numer of descendants, and it stays the same in the recursive call. That's what you get for not testing your code :) With the correct function I get a result like: *Main> mapM_ (print . prefix) $ atLeastThisManyDescendants 4 test1 "gumi-" "luka-" "miku-a" "miku-h" "miku-m" "miku-n" "miku-p" "miku-r" "miku-s" "miku-t" "rin-" Notice that there are lots of "miku-X" prefixes found. This is probably not what you want. What exactly do you want the algorithm to do? For example, "" is obviously a prefix of every string, but it is not very long. On the other hand, each string is a prefix of itself, but that prefix is shared by only one string (usually). By the way, the sort and compare adjacent pairs approach corresponds to "atLeastThisManyDescendants 2". Twan