
G'day all.

On Fri, Jul 26, 2002 at 01:27:48AM +0000, Karen Y wrote:
> 1. How would I convert capital letters into small letters?
> 2. How would I remove vowels from a string?

As you've probably found out, these are very hard problems.

Haskell gets it a little wrong here, since the result of some of the UnicodePrims functions (see chapter 9, Haskell 98 library report) should really be locale-dependent and therefore _impure_ if you allow changes of locale. Of course, Haskell currently only supports time and date locale information, so this wouldn't help you anyway.

Glossing over that concern, current implementations don't support the relevant UnicodePrims fully, so to do it properly you'll probably need to parse the case folding files yourself. See:

http://www.unicode.org/unicode/reports/tr21/

Vowels are even harder because I don't think the Unicode standard even defines what a "vowel" is. Removing vowel _marks_ should be straightforward once you expand combining characters, but that doesn't help with the general case. Frankly, I don't like your chances.

Can anyone else think up a good solution?

Cheers,
Andrew Bromage
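For readers wondering what "parse the case folding files yourself" might look like, here is a minimal sketch. It assumes the table is laid out like Unicode's CaseFolding.txt, one entry per line in the form "code; status; mapping; # name", and it keeps only the C (common) and F (full) entries. The names parseFoldingLine, loadFoldings and foldCase are made up for this illustration, and the association list is only meant for small tables.

```haskell
import Data.Char (chr, isSpace)
import Data.Maybe (fromMaybe, mapMaybe)
import Numeric (readHex)

-- One table entry: a source character and the string it folds to.
type Folding = (Char, String)

-- Split a string on a separator character, keeping empty fields.
splitOn :: Char -> String -> [String]
splitOn sep s = case break (== sep) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn sep rest

-- Strip leading and trailing whitespace.
trim :: String -> String
trim = dropWhile isSpace . reverse . dropWhile isSpace . reverse

-- Read a hexadecimal code point such as "0041".
hexChar :: String -> Maybe Char
hexChar s = case readHex s of
  [(n, "")] | n <= 0x10FFFF -> Just (chr n)
  _                         -> Nothing

-- Parse one line in the assumed "code; status; mapping; # name" layout.
-- Comment lines, blank lines and S/T entries yield Nothing.
parseFoldingLine :: String -> Maybe Folding
parseFoldingLine line =
  case splitOn ';' (takeWhile (/= '#') line) of
    (code : status : mapping : _)
      | trim status `elem` ["C", "F"] -> do
          c  <- hexChar (trim code)
          cs <- mapM hexChar (words mapping)
          return (c, cs)
    _ -> Nothing

-- Build the table from the text of the whole file.
loadFoldings :: String -> [Folding]
loadFoldings = mapMaybe parseFoldingLine . lines

-- Fold a string; characters without an entry are left alone.
foldCase :: [Folding] -> String -> String
foldCase table = concatMap (\c -> fromMaybe [c] (lookup c table))
```

Note that a full folding can expand one character into several, which is why the result of a lookup is a String rather than a Char.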

Are we sure that Karen didn't mean "I don't care about Unicode, I just want an example with ASCII codes"? In that case, well... Karen, what did you mean?

Vincenzo
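If the ASCII-only reading is the right one, both questions get short answers. The following is a sketch under that assumption: only 'A' to 'Z' are lower-cased, and "vowel" means just the five letters a, e, i, o, u in either case. The function names are invented for this example.

```haskell
import Data.Char (chr, ord)

-- Lower-case one character, touching only ASCII 'A'..'Z'.
lowerAscii :: Char -> Char
lowerAscii c
  | c >= 'A' && c <= 'Z' = chr (ord c + 32)  -- 'a' is 32 code points after 'A'
  | otherwise            = c

-- Question 1: convert capital letters into small letters.
toLowerAscii :: String -> String
toLowerAscii = map lowerAscii

-- Question 2: drop the ASCII vowels (assumed here to be a, e, i, o, u only).
removeVowels :: String -> String
removeVowels = filter (`notElem` "aeiouAEIOU")
```

For instance, toLowerAscii "Hello, World" gives "hello, world" and removeVowels "Hello, World" gives "Hll, Wrld". Anything outside ASCII passes through both functions unchanged.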

Andrew J Bromage wrote:
> G'day all.

Avagoodweekend.

> On Fri, Jul 26, 2002 at 01:27:48AM +0000, Karen Y wrote:
> > 1. How would I convert capital letters into small letters?
> > 2. How would I remove vowels from a string?
>
> As you've probably found out, these are very hard problems.

I agree, and I also agree that Haskell hasn't got it quite right. Luckily for us, this problem has been around for a while now, and a solution may be found in a C library header known as "ctype.h". It provides this function:

    int tolower(int c);

So with a little foreign import magic, and a bit of map (details left to the reader), we're golden!

Caveat: this doesn't necessarily work for Unicode, but it's a start. And you've got to stay within the IO monad. But them's the breaks.

As for removing vowels, instead of proposing a solution, I choose to dispute that the problem needs solving and claim victory.

Cheers,
Andy

--
Andy Moran                      Ph. (503) 526 3472
Galois Connections Inc.         Fax. (503) 350 0833
3875 SW Hall Blvd.              http://www.galois.com
Beaverton, OR 97005             moran@galois.com
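The "details left to the reader" could be filled in roughly as follows. This is only a sketch: it assumes a compiler with the standard foreign function interface (GHC here), imports tolower with an IO result because its behaviour depends on the C locale, and passes through any character above 255 untouched, since tolower is only defined for values that fit in an unsigned char. The names c_tolower, lowerCharC and lowerViaC are invented for this example.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Data.Char (chr, ord)
import Foreign.C.Types (CInt (..))

-- The C library's tolower; kept in IO because the result depends on the
-- current C locale.
foreign import ccall unsafe "ctype.h tolower"
  c_tolower :: CInt -> IO CInt

-- Lower-case one character through C, leaving anything above 255 alone
-- (tolower is undefined for values that do not fit in an unsigned char).
lowerCharC :: Char -> IO Char
lowerCharC c
  | ord c < 256 = do
      r <- c_tolower (fromIntegral (ord c))
      return (chr (fromIntegral r))
  | otherwise = return c

-- The "bit of map": mapM, since each call lives in IO.
lowerViaC :: String -> IO String
lowerViaC = mapM lowerCharC
```

In the default "C" locale this only changes 'A' to 'Z', so it behaves like an ASCII lower-casing; what happens to the range 128 to 255 depends on the locale in effect.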

On Thu, 2002-07-25 at 19:07, Andrew J Bromage wrote:
> G'day all.
>
> On Fri, Jul 26, 2002 at 01:27:48AM +0000, Karen Y wrote:
> > 1. How would I convert capital letters into small letters?
> > 2. How would I remove vowels from a string?
>
> As you've probably found out, these are very hard problems.
>
> Glossing over that concern, current implementations don't support the relevant UnicodePrims fully, so to do it properly you'll probably need to parse the case folding files yourself. See:
>
> http://www.unicode.org/unicode/reports/tr21/
>
> Vowels are even harder because I don't think the Unicode standard even defines what a "vowel" is. Removing vowel _marks_ should be straightforward once you expand combining characters, but that doesn't help with the general case. Frankly, I don't like your chances.

Shouldn't the solution also take care of languages without upper casing? Clearly the translation problem is easy enough with such languages ("id" will work just fine), but determining (from context?) that the string is in such a language is more than a bit difficult (especially given that numeric codes can correspond to most everything).

Vowels are much more difficult - even given that the language is recognizable, what would happen with languages such as Chinese or Arabic which (I believe) have nothing that even resembles a vowel? Of course, Chinese is a whole problem by itself.

--
jeff putnam -- jefu.jefu@verizon.net -- http://home1.get.net/res0tm0p

Andy Moran wrote (on Fri, 26 Jul 2002 at 16:52):
> As for removing vowels, instead of proposing a solution, I choose to dispute
> that the problem needs solving and claim victory.

Removing vowels from identifiers can be very important if you are writing in certain assemblers from the 1970s. Maybe this could be the killer application we are all looking for?

Peter Hancock

participants (6)
- Andrew J Bromage
- Andy Moran
- Karen Y
- Nick Name
- peter@premise.demon.co.uk
- that jefu guy