
Hi,
On 15 October 2014 21:23, Niklas Hambüchen wrote:
(I'm trying to improve Sublime Text's Haskell lexer.)
https://www.haskell.org/onlinereport/haskell2010/haskellch10.html says:

    uniSymbol → any Unicode symbol or punctuation

What is meant here? Is "Unicode symbol" literally \p{Symbol} in regex, or more?
So uniSymbol = \p{Symbol} | \p{Punctuation}
Looking at the source of GHC's lexer [1], the relevant part seems to be:

    case generalCategory c of
      [...]
      ConnectorPunctuation -> symbol
      DashPunctuation      -> symbol
      [...]
      OtherPunctuation     -> symbol
      MathSymbol           -> symbol
      CurrencySymbol       -> symbol
      ModifierSymbol       -> symbol
      OtherSymbol          -> symbol
      [...]

[1] https://github.com/ghc/ghc/blob/master/compiler/parser/Lexer.x
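
To experiment with the difference between the two readings, here is a minimal sketch using only Data.Char from base. The names isListedSymbol and isBroadSymbol are made up for illustration and are not taken from GHC. isListedSymbol accepts exactly the categories shown in the excerpt above. Treating everything else as a non-symbol is an assumption, since the elided arms are not shown. isBroadSymbol is the literal reading \p{Symbol} | \p{Punctuation}.

```haskell
import Data.Char (GeneralCategory (..), generalCategory, isPunctuation, isSymbol)

-- Hypothetical helper: True only for the general categories that the
-- quoted excerpt maps to `symbol`. The catch-all case is an assumption,
-- because the excerpt elides the remaining arms.
isListedSymbol :: Char -> Bool
isListedSymbol c = case generalCategory c of
  ConnectorPunctuation -> True
  DashPunctuation      -> True
  OtherPunctuation     -> True
  MathSymbol           -> True
  CurrencySymbol       -> True
  ModifierSymbol       -> True
  OtherSymbol          -> True
  _                    -> False

-- Hypothetical helper: the literal reading of the report's wording,
-- i.e. any character in a Unicode symbol or punctuation category.
isBroadSymbol :: Char -> Bool
isBroadSymbol c = isSymbol c || isPunctuation c
```

For example, filter isListedSymbol "a+(_)" gives "+_", while isBroadSymbol '(' is True because '(' is OpenPunctuation, one of the categories that do not appear among the arms quoted above.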