[Haskell-cafe] Decorating a list of strings

3 Nov 2006

      I have a list of text strings:

 ["Alice", "Bob", "Cindy", "Bob", "Bob", "Dave", "Cindy"]

As you can see, some of the strings occur only once; others appear two
or more times.

I would like to end up with a new list, according to the following
rules:

1) If a string occurs only once in the original list, then that string
appears unchanged in the new list.

2) If a string occurs more than once in the original list, then each
occurrences of that string is "decorated" with a numerical suffix
indicating its relative position among its peers.

In the case of the above list, the result of applying these rules would
be:

 ["Alice", "Bob:1", "Cindy:1", "Bob:2", "Bob:3", "Dave", "Cindy:2"]

So, how do we do this? The straightforward approach is to traverse the
list twice, once to build a table of repeat counts for the different
strings, and a second time to generate the new list, based on the old
list plus the table of repeat counts. Of course, it's possible to
streamline this into a single traversal by building up a list of thunks
at the same time as the table of repeat counts is generated, and then to
resolve those thunks by applying them in one fell swoop to the table.
But while this does _conceptually_ streamline the problem, it doesn't
offer any practical improvement; it takes at least as much processing
effort to resolve the thunks as it would to make a second pass through
the list.

Can we do better? I can't think of a way to do better, and my gut
feeling is that there isn't one (in general, it's impossible to know
whether the first element of the new list will be "Alice" or "Alice:1"
until the entire original list has been traversed), but I thought I'd
put the question out to the community to see if anyone had any brilliant
insights.

Steve Schafer
Fenestra Technologies Corp.
http://www.fenestra.com/