
On Mon, Sep 01, 2008 at 09:54:38PM -0700, Ashley Yakeley wrote:
> These two packages are representations in Haskell of various data in the Unicode 3.2.0 Character Database. Unicode 3.2.0 was the latest version of the Unicode standard at the time I wrote most of the code; later I may move the packages to the latest version (currently 5.1.0).
> The unicode-properties package contains functions to determine general category, case, and a wide range of other properties, as well as to do decomposition and case-folding.
> The unicode-names package contains just one function, getCharacterName, for getting the name of a character. It's separated out because it's a sufficiently large proportion of the total data.

On a minor point, it would probably be better to avoid prefixing the names of constants (e.g. DCVertical). Also, the prefix "get" is usually reserved for functions that have a monadic effect, so a name like decomposition :: Char -> Decomposition would be more usual than getDecomposition. Note that Data.Char already has the functions generalCategory, toUpper, toLower and toTitle, which should work on the full Unicode range. It should probably have majorClass as well.
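
To make the naming suggestion concrete, here is a rough sketch using only what Data.Char in base provides today. The MajorClass type and the majorClass function are hypothetical: they are not in Data.Char or in either of the announced packages, and I am assuming "major class" means the coarse grouping of the general categories (letter, mark, number and so on). Note that the pure function is named majorClass, not getMajorClass.

```haskell
import Data.Char
  ( generalCategory, toLower, toTitle, toUpper
  , isLetter, isMark, isNumber, isPunctuation, isSeparator, isSymbol )

-- Hypothetical type: a coarse grouping of Unicode general categories.
-- Assumption: this is what "major class" is meant to cover.
data MajorClass
  = Letter | Mark | Number | Punctuation | Symbol | Separator | Other
  deriving (Eq, Show)

-- Hypothetical function, built on the existing Data.Char predicates.
-- Anything not matched (control, format, unassigned, ...) falls into Other.
majorClass :: Char -> MajorClass
majorClass c
  | isLetter c      = Letter
  | isMark c        = Mark
  | isNumber c      = Number
  | isPunctuation c = Punctuation
  | isSymbol c      = Symbol
  | isSeparator c   = Separator
  | otherwise       = Other

main :: IO ()
main = do
  -- Existing Data.Char functions, applied to a non-ASCII character.
  let c = '\x00E9' -- LATIN SMALL LETTER E WITH ACUTE
  print (generalCategory c)
  print (toUpper c, toLower c, toTitle c)
  -- The hypothetical majorClass on a few sample characters.
  mapM_ (print . majorClass) "a1 +!\n"
```

A qualified import would do the same job for the constants: a user who wants the disambiguation can write something like Decomposition.Vertical at the use site, so the constructor itself does not need the DC prefix. That module name is only an illustration, not the package's actual layout.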