
On Sat, Feb 20, 2021 at 06:12:58AM -0200, Viktor Dukhovni wrote:
Fortunately, these are also supported:
https://hackage.haskell.org/package/text-icu-0.7.0.1/docs/Data-Text-ICU-Brea...
A complete example (the NFC normalisation may be overkill): {-# LANGUAGE BangPatterns #-} module Main (main) where import qualified Data.Text as T import qualified Data.Text.Lazy as LT import qualified Data.Text.Lazy.Builder as LT import qualified Data.Text.Lazy.IO as LT import qualified Data.Text.Lazy.Builder.Int as LT import Data.Text.ICU.Break import Data.Text.ICU.Normalize import Data.Text.ICU.Types (LocaleName(..)) import System.Environment main :: IO () main = do brkIter <- breakCharacter Current "" getArgs >>= mapM_ (go brkIter . normalize NFC . T.pack) where go :: BreakIterator () -> T.Text -> IO () go b t = do setText b t len <- count b 0 LT.putStrLn $ LT.toLazyText $ LT.fromText t <> LT.fromString " has grapheme length: " <> LT.decimal len where count :: Int -> IO Int count !acc = next b >>= maybe (pure acc) (const $ count $ acc + 1) -- Viktor.