I personally like this idea. Mathematica allows all sorts of bizarre names and it'd be cool for Haskell to be similar, so that mathematical Haskell scripts and IHaskell notebooks can be just as fancy and incomprehensible as dense Mathematica code!

Since GHC already accepts some Unicode characters in source code, I think extending it in this way would be a great idea.


On Sat, Jun 14, 2014 at 4:58 PM, John Meacham <john@repetae.net> wrote:
I have this feature in jhc, where I have a 'trailing' character class
that can appear at the end of both symbols and ids.

Currently it consists of:

 $trailing = [₀₁₂₃₄₅₆₇₈₉⁰¹²³⁴⁵⁶⁷⁸⁹₍₎⁽⁾₊₋]

 John
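
A rough Haskell sketch of how such a trailing class might be checked, for readers who want to experiment. This is not jhc's actual lexer code: the names trailingChars, isTrailing and isIdWithTrailing are hypothetical, the body rule is a simplification, and only identifiers (not symbols) are handled.

```haskell
import Data.Char (isAlpha, isAlphaNum)

-- The characters listed in the $trailing class quoted above.
trailingChars :: String
trailingChars = "₀₁₂₃₄₅₆₇₈₉⁰¹²³⁴⁵⁶⁷⁸⁹₍₎⁽⁾₊₋"

-- True for any character that belongs to the trailing class.
isTrailing :: Char -> Bool
isTrailing c = c `elem` trailingChars

-- Hypothetical, simplified identifier check (an assumption, not jhc's rule):
-- a non-empty ordinary body followed by zero or more trailing characters.
-- Trailing characters may only appear at the end.
isIdWithTrailing :: String -> Bool
isIdWithTrailing s =
  case span (not . isTrailing) s of
    (c : cs, suffix) ->
      (isAlpha c || c == '_') && all isBodyChar cs && all isTrailing suffix
    _ -> False
  where
    isBodyChar c = isAlphaNum c || c == '_' || c == '\''
```

Under this sketch a name like x₁ is accepted, while x₁y is rejected because a trailing character appears before the end.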

On Sat, Jun 14, 2014 at 7:48 AM, Mikhail Vorozhtsov
<mikhail.vorozhtsov@gmail.com> wrote:
> Hello lists,
>
> As some of you may know, GHC's support for Unicode characters in lexemes is
> rather crude and hence prone to inconsistencies in their handling versus the
> ASCII counterparts. For example, APOSTROPHE is treated differently from
> PRIME:
>
> λ> data a +' b = Plus a b
> <interactive>:3:9:
>     Unexpected type ‘b’
>     In the data declaration for ‘+’
>     A data declaration should have form
>       data + a b c = ...
> λ> data a +′ b = Plus a b
>
> λ> let a' = 1
> λ> let a′ = 1
> <interactive>:10:8: parse error on input ‘=’
>
> Also some rather bizarre looking things are accepted:
>
> λ> let ᵤxᵤy = 1
>
> In the spirit of improving things little by little I would like to propose:
>
> 1. Handle single/double/triple/quadruple Unicode PRIMEs the same way as
> APOSTROPHE, meaning the following alterations to the lexer:
>
> primes -> U+2032 | U+2033 | U+2034 | U+2057
> symbol -> ascSymbol | uniSymbol (EXCEPT special | _ | " | ' | primes)
> graphic -> small | large | symbol | digit | special | " | ' | primes
> varid -> (small { small | large | digit | ' | primes }) (EXCEPT reservedid)
> conid -> large { small | large | digit | ' | primes }
>
> 2. Introduce a new lexer nonterminal "subsup" that would include the Unicode
> sub/superscript[1] versions of numbers, "-", "+", "=", "(", ")", Latin and
> Greek letters. And allow these characters to be used in names and operators:
>
> symbol -> ascSymbol | uniSymbol (EXCEPT special | _ | " | ' | primes | subsup)
> digit -> ascDigit | uniDigit (EXCEPT subsup)
> small -> ascSmall | uniSmall (EXCEPT subsup) | _
> large -> ascLarge | uniLarge (EXCEPT subsup)
> graphic -> small | large | symbol | digit | special | " | ' | primes | subsup
> varid -> (small { small | large | digit | ' | primes | subsup }) (EXCEPT reservedid)
> conid -> large { small | large | digit | ' | primes | subsup }
> varsym -> (symbol (EXCEPT :) {symbol | subsup}) (EXCEPT reservedop | dashes)
> consym -> (: {symbol | subsup}) (EXCEPT reservedop)
>
> If this proposal is received favorably, I'll write a patch for GHC based on
> my previous stab at the problem[2].
>
> P.S. I'm CC-ing Cafe for extra attention, but please keep the discussion to
> the GHC users list.
>
> [1] https://en.wikipedia.org/wiki/Unicode_subscripts_and_superscripts
> [2] https://ghc.haskell.org/trac/ghc/ticket/5108
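
To make item 1 of the quoted proposal concrete, here is a minimal Haskell sketch of the proposed varid rule, in which the four Unicode PRIME characters are treated like APOSTROPHE. It is an illustration only, not GHC's lexer: the function names are hypothetical, the reservedid exclusion is omitted, and digit is approximated by the ASCII-only Data.Char.isDigit.

```haskell
import Data.Char (isDigit, isLower, isUpper)

-- primes -> U+2032 | U+2033 | U+2034 | U+2057
isPrimeChar :: Char -> Bool
isPrimeChar c = c `elem` ['\x2032', '\x2033', '\x2034', '\x2057']

-- small includes the underscore, as in the Haskell report.
isSmall :: Char -> Bool
isSmall c = isLower c || c == '_'

isLarge :: Char -> Bool
isLarge = isUpper

-- Characters allowed after the first one:
-- small | large | digit | ' | primes
isIdTail :: Char -> Bool
isIdTail c = isSmall c || isLarge c || isDigit c || c == '\'' || isPrimeChar c

-- varid -> small { small | large | digit | ' | primes }
-- (the EXCEPT reservedid part is left out of this sketch)
isVarId :: String -> Bool
isVarId (c : cs) = isSmall c && all isIdTail cs
isVarId []       = False
```

With this rule both a' and a′ are classified as variable identifiers, which is the consistency the proposal asks for; a name that starts with a PRIME is still rejected.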



--
John Meacham - http://notanumber.net/
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe