
I just want to repeat something somebody suggested, and which I thought was a really neat idea: Have string constants in programs be replaced by (Prelude.fromString "..") or similar, like numerical constants are handled already.
This was suggested in order to simplify the use of PackedString, but I think it might come in handy for translation issues, too.
I find it a little hard to picture this, so let's fill in some details so that we can agree that we're talking about the same thing and also to make the idea more concrete.

Using typeclasses in this way would require us to make the encoding explicit in the type system. So we'd define a bunch of types corresponding to characters and to strings:

  data Char   = ..   -- Unicode
  data Latin1 = ...  -- Latin1
  ...

and we'd define two classes and the basic operations on them.

  class Enum a => Charset a where
    fromChar :: Char -> a

  class Ord a => String a where
    fromString :: String -> a

Why did I define two classes instead of just one? The more obvious design was to have

  class Enum a => Charset a where
    fromChar   :: Char -> a
    fromString :: String -> [a]

but this wouldn't let us make PackedString an instance of it. This could be fixed using multiparameter type classes but splitting the class is easier. (We might revisit this decision if we want operations to convert Charsets to Strings and the like.)

Details:

- We might want to add operations to convert back to Unicode - though that might require additional parameters to fill in details not encoded in the type?

- What should we do if the conversion fails? For example, if I try to convert the Unicode yin-yang character (\u262f) to Latin1?

- We probably want additional operations for strings like map, append, etc.

- fromString should be applied to strings used in patterns.

- This requires a minor change in the report, which states that a string literal is just an abbreviation for a list of characters.

Overall, this looks like it might be a viable approach. The only potential showstopper seems to be what to do when conversion fails.
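To make the failure question concrete, here is a minimal sketch of one possible answer: have the conversion return a Maybe, so that an unrepresentable character is reported rather than silently mangled. Everything in it is an assumption made for illustration: the class is renamed StringLike (a class called String would clash with the Prelude type of the same name), the method is called fromStringMaybe, and Latin1String is a made-up stand-in for a packed Latin1 representation.

```haskell
import Data.Char (ord)
import Data.Word (Word8)

-- Hypothetical Latin1 string type: one byte per character.
newtype Latin1String = Latin1String [Word8]
  deriving (Eq, Ord, Show)

-- Hypothetical wrapper for ordinary Unicode strings.
newtype UnicodeString = UnicodeString String
  deriving (Eq, Ord, Show)

-- Hypothetical class (renamed from 'String' to avoid the name clash).
-- Nothing means the literal cannot be represented in the target type.
class Ord a => StringLike a where
  fromStringMaybe :: String -> Maybe a

-- Unicode strings always convert.
instance StringLike UnicodeString where
  fromStringMaybe = Just . UnicodeString

-- Latin1 conversion fails as soon as one character is above 0xFF.
instance StringLike Latin1String where
  fromStringMaybe s = fmap Latin1String (mapM encode s)
    where
      encode :: Char -> Maybe Word8
      encode c
        | ord c <= 0xFF = Just (fromIntegral (ord c))
        | otherwise     = Nothing
```

With this sketch, converting "\x262f" to Latin1String yields Nothing, while converting it to UnicodeString succeeds. The cost is that every literal now has a Maybe around it, which is probably too heavy for everyday use; an alternative would be to reject such literals at compile time.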
(Naturally, the idea is that Prelude.fromString can be replaced by a function that looks the string up in a translation table, instead of using the default value. Any reason this won't work?)
This goes quite a bit further than what I suggest above but let's try to sketch it out.

1) You have to define a new string type:

     newtype FrenchString = FS String

2) You have to define an instance:

     instance String FrenchString where
       fromString "General Protection Fault" = FS "..."
       fromString "File not found"           = FS "..."
       ...
       fromString _                          = ????

Well, it seems simple enough. Once again though, we have the problem of what to do when the conversion fails. What happens in the real world? Do they print the string in English and hope for the best?

I don't feel entirely comfortable with doing things this way. I think I'd prefer to see an explicit call to a translation function like 'toFrench'.

I presume that the advantage of this approach would be that you could use existing libraries without change? Unfortunately, the way I've sketched it out, the code has to be modified to use the type 'FrenchString' instead of 'String' so we don't achieve this goal.

Overall, this doesn't look like it will work.

-- 
Alastair Reid        alastair@reid-consulting-uk.ltd.uk
Reid Consulting (UK) Limited
http://www.reid-consulting-uk.ltd.uk/alastair/
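
P.S. For comparison, here is a minimal sketch of the explicit 'toFrench' style mentioned above. It assumes the "print the English and hope for the best" fallback when no translation exists. The table name frenchTable and its single entry are hypothetical and only there to make the example run.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical translation table; the entry is illustrative only.
frenchTable :: [(String, String)]
frenchTable =
  [ ("File not found", "Fichier introuvable")
  ]

-- Explicit translation: fall back to the English text
-- when the table has no entry for the message.
toFrench :: String -> String
toFrench s = fromMaybe s (lookup s frenchTable)
```

Here the call site has to say toFrench "File not found" explicitly, so existing libraries still need changing, but nothing is hidden in the meaning of string literals.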