
OK people, it's random statistics time! Haskell '98 apparently features 25 reserved words. (Not counting "forall" and "mdo" and so on, which AFAIK are not in Haskell '98.) So how does that compare to other languages? C: 32 C++: 62 Borland Turbo Pascal: ~50 [without the OOP extensions added later] Eiffel: 59 VB: The source I checked listed in excess of 120 reserved words, but I'm dubious as to how "reserved" they really are. (Is CInt really reserved? I doubt it!) It also depends wildly on which of the bazillion VB dialects you mean. Java: 50 JavaScript: 36 Smalltalk: 0 Lisp: AFAIK, there are no truly reserved words in Lisp, only predefined functions. (??) Python: 31 Ruby: 38 Tcl: Same analysis as for Lisp I believe. As you can see, this conclusively proves... something. Hmm, I wonder if there's some way to compare the size of the language specification documents? :-} PS. It comes as absolutely no surprise to me that C++ has the most keywords. But then, if I were to add AMOS Professional, that had well over 800 keywords at the last count...

Haskell '98 apparently features 25 reserved words. (Not counting "forall" and "mdo" and so on, which AFAIK are not in Haskell '98.)
21 actually. case, class, data, default, deriving, do, else, if, import, in, infix, infixl, infixr, instance, let, module, newtype, of, then, type, where. There's also three special words that can still be used as identifiers, so aren't reserved: as, qualified, hiding. /Niklas

Niklas Broberg wrote:
Haskell '98 apparently features 25 reserved words. (Not counting "forall" and "mdo" and so on, which AFAIK are not in Haskell '98.)
21 actually. case, class, data, default, deriving, do, else, if, import, in, infix, infixl, infixr, instance, let, module, newtype, of, then, type, where. There's also three special words that can still be used as identifiers, so aren't reserved: as, qualified, hiding.
OK. Well the list I saw was for Haskell plus extensions, and I visually filtered out the inapplicable stuff. Apparently I missed something. Also, the number varies depending on whether you consider "reversed words" or "keywords", and I suspect the situation is subtly different for each possible language. I was going vaguely for anything that's hard-wired into a language, and not just part of the standard libraries. (E.g., "return" is definitely NOT any kind of reversed word or keyword.)

Am Dienstag, den 12.01.2010, 22:22 +0000 schrieb Andrew Coppin:
Niklas Broberg wrote:
Haskell '98 apparently features 25 reserved words. (Not counting "forall" and "mdo" and so on, which AFAIK are not in Haskell '98.)
21 actually. case, class, data, default, deriving, do, else, if, import, in, infix, infixl, infixr, instance, let, module, newtype, of, then, type, where. There's also three special words that can still be used as identifiers, so aren't reserved: as, qualified, hiding.
OK. Well the list I saw was for Haskell plus extensions, and I visually filtered out the inapplicable stuff. Apparently I missed something.
Also, the number varies depending on whether you consider "reversed words" or "keywords", and I suspect the situation is subtly different
"reversed words"? There are some in sh for example, namely 'fi' and 'esac', but other than that they are not that common... ;-)
for each possible language. I was going vaguely for anything that's hard-wired into a language, and not just part of the standard libraries. (E.g., "return" is definitely NOT any kind of reversed word or keyword.)
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On Jan 12, 2010, at 17:38 , Michael Hartl wrote:
Also, the number varies depending on whether you consider "reversed words" or "keywords", and I suspect the situation is subtly different
"reversed words"? There are some in sh for example, namely 'fi' and 'esac', but other than that they are not that common... ;-)
Algol used them; the Bourne shell picked them up from Algol 60. pointless-geekery> -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

On 12 Jan 2010, at 22:22, Andrew Coppin wrote:
Niklas Broberg wrote:
Haskell '98 apparently features 25 reserved words. (Not counting "forall" and "mdo" and so on, which AFAIK are not in Haskell '98.)
21 actually. case, class, data, default, deriving, do, else, if, import, in, infix, infixl, infixr, instance, let, module, newtype, of, then, type, where. There's also three special words that can still be used as identifiers, so aren't reserved: as, qualified, hiding.
OK. Well the list I saw was for Haskell plus extensions, and I visually filtered out the inapplicable stuff. Apparently I missed something.
Also, the number varies depending on whether you consider "reversed words" or "keywords",
Aye, there's a subtle distinction between keywords and reserved words, but I think for the purposes of this discussion, they're the same thing. Martin

Am Dienstag 12 Januar 2010 23:12:20 schrieb Niklas Broberg:
Haskell '98 apparently features 25 reserved words. (Not counting "forall" and "mdo" and so on, which AFAIK are not in Haskell '98.)
21 actually. case, class, data, default, deriving, do, else, if, import, in, infix, infixl, infixr, instance, let, module, newtype, of, then, type, where. There's also three special words that can still be used as identifiers, so aren't reserved: as, qualified, hiding.
Okay, 'as' is easy. But can you find a situation where 'qualified' or 'hiding' would be natural choices for an identifier? I'd love to see those in some code :)
/Niklas

Daniel Fischer wrote:
Am Dienstag 12 Januar 2010 23:12:20 schrieb Niklas Broberg:
Haskell '98 apparently features 25 reserved words. (Not counting "forall" and "mdo" and so on, which AFAIK are not in Haskell '98.)
21 actually. case, class, data, default, deriving, do, else, if, import, in, infix, infixl, infixr, instance, let, module, newtype, of, then, type, where. There's also three special words that can still be used as identifiers, so aren't reserved: as, qualified, hiding.
Okay, 'as' is easy. But can you find a situation where 'qualified' or 'hiding' would be natural choices for an identifier? I'd love to see those in some code :)
I quite often try to use "in-file" and "out-file" abbreviated to "if" and "of" - which obviously fails miserably. And "as" is pretty simple, as you point out. It's weird that us Haskell people complain about there being only 26 letters in the alphabet when all other known programming languages emphasize the use of long, descriptive names like "log_file" instead of "lf"...

Colin Paul Adams wrote:
Andrew> It's weird that us Haskell people complain about there Andrew> being only 26 letters in the alphabet
Which alphabet? You have plenty of choice in Unicode.
Er... I was under the impression that Haskell source code uses the ASCII character set, not Unicode. (And even if that's not the case, I've yet to find a way to type in the Unicode characters which are hypothetically possible.)

On Jan 13, 2010, at 14:25 , Andrew Coppin wrote:
Colin Paul Adams wrote:
Andrew> It's weird that us Haskell people complain about there Andrew> being only 26 letters in the alphabet
Which alphabet? You have plenty of choice in Unicode.
Er... I was under the impression that Haskell source code uses the ASCII character set, not Unicode.
The Report would beg to differ with you; see section 2.1. "Haskell uses the Unicode [11] character set. However, source programs are currently biased toward the ASCII character set used in earlier versions of Haskell ." ("Currently" at the time being 1998. Unicode is more prevalent these days.)
(And even if that's not the case, I've yet to find a way to type in the Unicode characters which are hypothetically possible.)
That's a problem with your editor/development environment. -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

Brandon S. Allbery KF8NH wrote:
On Jan 13, 2010, at 14:25 , Andrew Coppin wrote:
Colin Paul Adams wrote:
Andrew> It's weird that us Haskell people complain about there Andrew> being only 26 letters in the alphabet
Which alphabet? You have plenty of choice in Unicode.
Er... I was under the impression that Haskell source code uses the ASCII character set, not Unicode.
The Report would beg to differ with you; see section 2.1. "Haskell uses the Unicode [11] character set. However, source programs are currently biased toward the ASCII character set used in earlier versions of Haskell ." ("Currently" at the time being 1998. Unicode is more prevalent these days.)
So... how would GHC tell which of the hundreds of millions of possible character encodings is in use?
(And even if that's not the case, I've yet to find a way to type in the Unicode characters which are hypothetically possible.)
That's a problem with your editor/development environment.
Or rather, the problem with every computer system known to man?

On Jan 13, 2010, at 14:42 , Andrew Coppin wrote:
Brandon S. Allbery KF8NH wrote:
On Jan 13, 2010, at 14:25 , Andrew Coppin wrote:
Colin Paul Adams wrote:
Andrew> It's weird that us Haskell people complain about there Andrew> being only 26 letters in the alphabet
Which alphabet? You have plenty of choice in Unicode.
Er... I was under the impression that Haskell source code uses the ASCII character set, not Unicode.
The Report would beg to differ with you; see section 2.1. "Haskell uses the Unicode [11] character set. However, source programs are currently biased toward the ASCII character set used in earlier versions of Haskell ." ("Currently" at the time being 1998. Unicode is more prevalent these days.)
So... how would GHC tell which of the hundreds of millions of possible character encodings is in use?
That's left to the compiler implementation. I'm not spotting an official statement in the GHC manual, but in practice GHC uses UTF-8. (It might support Windows standard UTF-16 as well; if do, it probably requires the first character of the source file to be a UTF-16 byte order mark.)
(And even if that's not the case, I've yet to find a way to type in the Unicode characters which are hypothetically possible.)
That's a problem with your editor/development environment.
Or rather, the problem with every computer system known to man?
s/man/you/ The existence of -XUnicodeSyntax ( http://www.haskell.org/ghc/docs/latest/html/users_guide/syntax-extns.html#un... ) suggests that at least some other GHC users don't have your problem. -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

On Wed, Jan 13, 2010 at 1:28 PM, Brandon S. Allbery KF8NH
On Jan 13, 2010, at 14:25 , Andrew Coppin wrote:
(And even if that's not the case, I've yet to find a way to type in the Unicode characters which are hypothetically possible.)
That's a problem with your editor/development environment.
It's not just one's editor (I use emacs, and it's actually not that hard to type a decent subset of interesting Unicode characters in emacs with the tex input mode), but readability. The ASCII characters are universal and easily recognized (assuming you have a decent monochrome font); having to notice potentially significant differences involving diacritics alone (not to mention all the various mathematical symbols) in identifiers would drive me mad. It's the same reason we try to limit lines of code to ~80 characters — our editors are *capable* of more, sure, but are we?

"Tom" == Tom Tobin
writes:
Tom> readability. The ASCII characters are universal and easily Tom> recognized No they are not. My wife is Chinese. When she was learning pinyin as a child, she asked her father for help with some homework. He replied that he didn't understand them. -- Colin Adams Preston Lancashire

On Thu, Jan 14, 2010 at 12:45 AM, Colin Paul Adams
"Tom" == Tom Tobin
writes: Tom> readability. The ASCII characters are universal and easily Tom> recognized
No they are not. My wife is Chinese. When she was learning pinyin as a child, she asked her father for help with some homework. He replied that he didn't understand them.
I should have said "The ASCII characters are universal and easily recognized *for programmers*." Of course someone who hasn't come in contact with the Latin alphabet, let alone programming, isn't going to recognize ASCII — but I think we'd have an amazingly hard time finding a programmer who wasn't familiar with it, regardless of their native language.

Wow, that's kind of cute: {-# LANGUAGE UnicodeSyntax #-} (*) = (*) (/) = (/) 公式 高 中 低=高*中*低/整數 整數 = 123 Oddly, if I change the order of these definitions I get syntax errors. Very mysterious. Nice how it knows that * is a symbol, but I'm not sure how I'm supposed to name a type. It certainly spells the end of of camelCase arguments though, and can take single-character variable names to dizzying new heights :) I suppose number literals can't be overridden though. Not to mention = and ().
It's not just one's editor (I use emacs, and it's actually not that hard to type a decent subset of interesting Unicode characters in emacs with the tex input mode), but readability. The ASCII characters are universal and easily recognized (assuming you have a decent monochrome font); having to notice potentially significant differences involving diacritics alone (not to mention all the various mathematical symbols) in identifiers would drive me mad. It's the same reason we try to limit lines of code to ~80 characters -- our editors are *capable* of more, sure, but are we?
Unicode identifiers are fun but this is a good point. The line has to
be somewhere, so it might as well be in the historical position unless
there are widely agreed on benefits to moving it.
On Wed, Jan 13, 2010 at 10:45 PM, Colin Paul Adams
My wife is Chinese. When she was learning pinyin as a child, she asked her father for help with some homework. He replied that he didn't understand them.
But that's all kind of beside the point because you already need to learn quite a bit of specialized knowledge to be writing in haskell in the first place. It's real hard to get to that stage without already recognizing ascii. If the problem was just pinyin and not latin letters in general, then most Taiwanese wouldn't understand it either.

2010/1/14 Evan Laforge
Wow, that's kind of cute:
{-# LANGUAGE UnicodeSyntax #-} (*) = (*) (/) = (/) 公式 高 中 低 = 高 * 中 * 低 / 整數 整數 = 123
That code snippet is also perfectly legal Haskell without the UnicodeSyntax language extension. You use UnicodeSyntax if you want to write code like this: {-# LANGUAGE UnicodeSyntax, ScopedTypeVariables #-} swap ∷ ∀ α β. (α, β) → (β, α) swap = uncurry $ flip (,)
Oddly, if I change the order of these definitions I get syntax errors. Very mysterious. Nice how it knows that * is a symbol, but I'm not sure how I'm supposed to name a type.
I was a bit surprised that you could use * as an operator since it is a punctuation character. Maybe there are some corner cases with fullwidth characters or with composition of characters.
Unicode identifiers are fun but this is a good point. The line has to be somewhere, so it might as well be in the historical position unless there are widely agreed on benefits to moving it.
I have already crossed that line: http://hackage.haskell.org/package/base-unicode-symbols http://hackage.haskell.org/package/containers-unicode-symbols But I am aware that there is a point beyond which unicode symbols only make your code harder to understand. So I try to be conservative in my use of them. Still, there are a lot of useful and acceptable symbols which are not part of the historic ASCII set: ∈, ≤, ∪, ∧, ¬, ∘ to name a few.

Am Donnerstag 14 Januar 2010 11:38:57 schrieb Roel van Dijk:
I was a bit surprised that you could use * as an operator since it is a punctuation character. Maybe there are some corner cases with fullwidth characters or with composition of characters.
Thus speaketh the report (http://haskell.org/onlinereport/lexemes.html):
symbol -> ascSymbol | uniSymbol

Thus speaketh the report (http://haskell.org/onlinereport/lexemes.html):
symbol -> ascSymbol | uniSymbol
ascSymbol -> ! | # | $ | % | & | * | + | . | / | < | = | > | ? | @ | \ | ^ | | | - | ~ uniSymbol -> any Unicode symbol or punctuation Punctuation characters are legitimate for operators.
Aha, didn't know that (or forgot it). Also kind of obvious when you consider that '.' and ':' are punctuation characters. I think it is time for an Obfuscated Haskell Contest :-)

On Thu, Jan 14, 2010 at 12:47 PM, Colin Paul Adams
"Roel" == Roel van Dijk
writes: Roel> I think it is time for an Obfuscated Haskell Contest :-)
Are you allowed to use obsolete scripts for your identifiers? :-)
Sure, I'll consider bonus points if you write your program entirely in cuneiform.

Unicode identifiers are fun but this is a good point. The line has to be somewhere, so it might as well be in the historical position unless there are widely agreed on benefits to moving it.
I have already crossed that line:
Ha, well haskell programmers wouldn't be haskell programmers if they weren't already a bunch of line crossers :)

On 13 Jan 2010, at 22:25, Andrew Coppin wrote:
Colin Paul Adams wrote:
Andrew> It's weird that us Haskell people complain about there Andrew> being only 26 letters in the alphabet
Which alphabet? You have plenty of choice in Unicode.
Er... I was under the impression that Haskell source code uses the ASCII character set, not Unicode.
(And even if that's not the case, I've yet to find a way to type in the Unicode characters which are hypothetically possible.)
module Main where import Prelude hiding (putStrLn) import System.IO.UTF8 main = let é = "Andrew Coppin снова оказался неправ" ç = что_то_там where что_то_там = undefined in putStrLn é

On Tue, Jan 12, 2010 at 11:26 PM, Daniel Fischer
Okay, 'as' is easy. But can you find a situation where 'qualified' or 'hiding' would be natural choices for an identifier? I'd love to see those in some code :)
module LordsOfMidnight.Character(Character) where data Character = C { name :: String, location :: (Int,Int), facing :: Direction, hour :: Int, energy :: Int, fear :: Int, riders :: Int, soldiers :: Int, hiding :: Bool }

Fraser Wilson
module LordsOfMidnight.Character(Character) where
data Character = C { name :: String, location :: (Int,Int), facing :: Direction, hour :: Int, energy :: Int, fear :: Int, riders :: Int, soldiers :: Int, hiding :: Bool }
Daniel Fisher thinks again... (With apologies for the rather obscure reference.) -k -- If I haven't seen further, it is by standing in the footprints of giants

On Jan 12, 2010, at 17:12 , Niklas Broberg wrote:
Haskell '98 apparently features 25 reserved words. (Not counting "forall" and "mdo" and so on, which AFAIK are not in Haskell '98.)
21 actually. case, class, data, default, deriving, do, else, if, import, in, infix, infixl, infixr, instance, let, module, newtype, of, then, type, where. There's also three special words that can still be used as identifiers, so aren't reserved: as, qualified, hiding.
Are we counting the FFI annex ("foreign")? -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

Brandon S. Allbery KF8NH wrote:
On Jan 12, 2010, at 17:12 , Niklas Broberg wrote:
Haskell '98 apparently features 25 reserved words. (Not counting "forall" and "mdo" and so on, which AFAIK are not in Haskell '98.)
21 actually. case, class, data, default, deriving, do, else, if, import, in, infix, infixl, infixr, instance, let, module, newtype, of, then, type, where. There's also three special words that can still be used as identifiers, so aren't reserved: as, qualified, hiding.
Are we counting the FFI annex ("foreign")?
Strictly, wasn't that added *after* the Haskell 98 report was written? I.e., if you wanted to be ultra-technical about it, it's not part of the original Haskell '98? At any rate, no I didn't count the FFI. (Since I almost never use it.)

On Jan 13, 2010, at 14:29 , Andrew Coppin wrote:
Brandon S. Allbery KF8NH wrote:
On Jan 12, 2010, at 17:12 , Niklas Broberg wrote:
Haskell '98 apparently features 25 reserved words. (Not counting "forall" and "mdo" and so on, which AFAIK are not in Haskell '98.)
21 actually. case, class, data, default, deriving, do, else, if, import, in, infix, infixl, infixr, instance, let, module, newtype, of, then, type, where. There's also three special words that can still be used as identifiers, so aren't reserved: as, qualified, hiding.
Are we counting the FFI annex ("foreign")?
Strictly, wasn't that added *after* the Haskell 98 report was written? I.e., if you wanted to be ultra-technical about it, it's not part of the original Haskell '98?
That would be the import of "annex", yes. It was not part of the original standard, but is considered part of the working Haskell '98 standard. -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

Brandon S. Allbery KF8NH wrote:
On Jan 13, 2010, at 14:29 , Andrew Coppin wrote:
Brandon S. Allbery KF8NH wrote:
Are we counting the FFI annex ("foreign")?
Strictly, wasn't that added *after* the Haskell 98 report was written? I.e., if you wanted to be ultra-technical about it, it's not part of the original Haskell '98?
That would be the import of "annex", yes. It was not part of the original standard, but is considered part of the working Haskell '98 standard.
Of course, you could always use Haskell 2010. It exists now, apparently...

Niklas Broberg wrote:
Haskell '98 apparently features 25 reserved words. (Not counting "forall" and "mdo" and so on, which AFAIK are not in Haskell '98.)
21 actually. case, class, data, default, deriving, do, else, if, import, in, infix, infixl, infixr, instance, let, module, newtype, of, then, type, where. There's also three special words that can still be used as identifiers, so aren't reserved: as, qualified, hiding.
Since you can define operators in Haskell, would it make sense to include '=', '--', ':', ',' etc. as "reserved names" since those can't be used as operator names?
Illegal binding of built-in syntax: : Illegal binding of built-in syntax: (,) parse error (possibly incorrect indentation)
Martijn.

Since you can define operators in Haskell, would it make sense to include '=', '--', ':', ',' etc. as "reserved names" since those can't be used as operator names?
They are indeed reserved operators in the report. 11 of those: .. : :: = \ | <- -> @ ~ => To be fair, _ is also a reserved identifier, so 22 and not 21 as I said previously. So a total of 33 reserved "names". /Niklas

Martijn van Steenbergen wrote:
Niklas Broberg wrote:
21 actually. case, class, data, default, deriving, do, else, if, import, in, infix, infixl, infixr, instance, let, module, newtype, of, then, type, where. There's also three special words that can still be used as identifiers, so aren't reserved: as, qualified, hiding.
Since you can define operators in Haskell, would it make sense to include '=', '--', ':', ',' etc. as "reserved names" since those can't be used as operator names?
Makes sense to me... It's merely more difficult to catelogue this information for a half-dozen different languages. Looking up the reserved word list is usually only a Google search away. Somebody suggested to me that the best metric for "how difficult" a language is to learn is "the number of orthogonal concepts you need to learn". Of course, measuring THAT is going to be no picknick!

On Jan 14, 2010, at 8:38 PM, Andrew Coppin wrote:
Martijn van Steenbergen wrote:
Niklas Broberg wrote:
21 actually. case, class, data, default, deriving, do, else, if, import, in, infix, infixl, infixr, instance, let, module, newtype, of, then, type, where. There's also three special words that can still be used as identifiers, so aren't reserved: as, qualified, hiding.
Since you can define operators in Haskell, would it make sense to include '=', '--', ':', ',' etc. as "reserved names" since those can't be used as operator names?
Makes sense to me...
It's merely more difficult to catelogue this information for a half- dozen different languages. Looking up the reserved word list is usually only a Google search away.
Somebody suggested to me that the best metric for "how difficult" a language is to learn is "the number of orthogonal concepts you need to learn". Of course, measuring THAT is going to be no picknick!
I do not think so. More orthogonal concepts may make it more work to learn the language, but I think orthogonality helps to learn a language that has many concepts. For me, a major problem when learning a language is ad-hocness. The Java Language Specification part on Generics (parametric polymorphism) comes to mind. It is full of ad-hoc restrictions, and operational details. Haskell's polymorphism behaves much more predictably because it is much less ad-hoc. Although I do not have any Python programming experience, I got the impression that Python is very un-ad-hoc. Everything behaves in exactly the same way at all possible levels in the language. You need to master only one idea and it applies everywhere. Even if the way it behaves is strange. Jurriaan

Niklas Broberg schrieb:
Haskell '98 apparently features 25 reserved words. (Not counting "forall" and "mdo" and so on, which AFAIK are not in Haskell '98.)
21 actually. case, class, data, default, deriving, do, else, if, import, in, infix, infixl, infixr, instance, let, module, newtype, of, then, type, where. There's also three special words that can still be used as identifiers, so aren't reserved: as, qualified, hiding.
Recently I added 'export' to my NEdit highlight patterns in order to support FFI statements more completely.

Andrew Coppin wrote:
OK people, it's random statistics time!
Haskell '98 apparently features 25 reserved words. (Not counting "forall" and "mdo" and so on, which AFAIK are not in Haskell '98.) So how does that compare to other languages?
C: 32 C++: 62 Borland Turbo Pascal: ~50 [without the OOP extensions added later] Eiffel: 59 VB: The source I checked listed in excess of 120 reserved words, but I'm dubious as to how "reserved" they really are. (Is CInt really reserved? I doubt it!) It also depends wildly on which of the bazillion VB dialects you mean. Java: 50 JavaScript: 36 Smalltalk: 0 Lisp: AFAIK, there are no truly reserved words in Lisp, only predefined functions. (??) Python: 31 Ruby: 38 Tcl: Same analysis as for Lisp I believe.
As you can see, this conclusively proves... something.
Hmm, I wonder if there's some way to compare the size of the language specification documents? :-}
PS. It comes as absolutely no surprise to me that C++ has the most keywords. But then, if I were to add AMOS Professional, that had well over 800 keywords at the last count...
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Java has 53 reserved words. -- Tony Morris http://tmorris.net/

Hello Andrew, Wednesday, January 13, 2010, 1:54:44 AM, you wrote:
(The material I quoted from had notes about which version of Java added certain of the words. I guess it was outdated.)
you would be more respected in this list if you will compare haskell 1.0 with java'2010 or better '2020 ;) -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Le mardi 12 janvier 2010 à 21:25 +0000, Andrew Coppin a écrit : Hi Andrew,
As you can see, this conclusively proves... something.
What, exactly? Take Eiffel in its last version: I have identified 11 keywords that are either used for Design By Contract or source-code documentation. These are software engineering tasks that in the other languages you cite are not supported at the grammar level and, if ever, fulfilled by enriching the code using comments, annotations or compiler extensions. Removed, this make Eiffel the less "verbose" object-oriented language of your list, excepted Smalltalk. In any case the less verbose statically typed compiled OO language. One factor is how rich is the set of OO constructs supported by the language. Visibility rule keywords may shrink or extend the list of reserved words. Same for math or logical operators: some languages define as keyword what belongs to libraries in another. Same for exception mechanism. Some languages defines the code block constructs using keywords while other use tokens that are not considered reserved words. Contrast for example the following Ada code (4 keywords): procedure Hello is begin ... end Hello; against the equivalent C (1 keyword): void HelloWorld() { ... } or the equivalent Haskell (0 keywords): helloworld = ... Ada in its 71 reserved words features keywords for concurrent programming that doesn't exist at the same level in the other languages in your list, and an explicit "pragma" keyword to allow the programmer to specify things like the alignement used in the generated binary. So a significant factor in programming language "verbosity" is how much control the language intends to give to the programmer on the implementation level - or said otherwise how abstracted is the language from its run-time environment. Let me order your list: Smalltalk: 0 Lisp: 0 Tcl: 0 Haskell: 21 * Python: 31 C: 32 * JavaScript: 36 Ruby: 38 --- Borland Turbo Pascal: ~50 Java: 53 Eiffel: 59 C++: 62 Interestingly enough, interpreted languages tend to need less keywords, which support my observation above. Let ignore C. Noticeable is Haskell, with little reserved words while being efficiently compiled. It reflects the fact that it does not have elaborated object-oriented constructs, that code is structured using layout rules, that the exception and concurrency mechanisms are provided at the library level, that there is no way to control the code generated, and that there is no built-in support for software engineering like the sort Eiffel provides. It reflects as well the "expressiveness" of Haskell and how good at abstracting the description of a computation from its run-time environment it is. But if you really wanted to compare apples to apples you would, for instance, add GHC pragmas and "magic" things like `par` to the mix. I wonder if the picture would change much?
Hmm, I wonder if there's some way to compare the size of the language specification documents? :-} Maybe comparing the grammars in a standardized form (BNF) ?
Cheers, Sylvain

sylvain
Let me order your list:
Smalltalk: 0 Lisp: 0 Tcl: 0
If you count reserved tokens, I guess Lisp reserves parentheses and whitespace?
Haskell: 21 * Python: 31 C: 32 * JavaScript: 36 Ruby: 38 --- Borland Turbo Pascal: ~50 Java: 53 Eiffel: 59 C++: 62
Interestingly enough, interpreted languages tend to need less keywords, which support my observation above.
I can't help but notice that the top three are untyped (all right, "dynamically typed") languages. Static typing seems to require at least a few reserved words (does it make sense to redefine 'data' or 'type' in Haskell?)
But if you really wanted to compare apples to apples you would, for instance, add GHC pragmas and "magic" things like `par` to the mix. I wonder if the picture would change much?
Looking for a minimal subset that everything else can be implemented in terms of? Still, having 'par' as a user redefinable token lets you replace it with your own implementation (par = seq, for instance :-). So I think there's a benefit, even if it is normally implemented using magic. -k -- If I haven't seen further, it is by standing in the footprints of giants

sylvain wrote:
Le mardi 12 janvier 2010 à 21:25 +0000, Andrew Coppin a écrit :
Hi Andrew,
As you can see, this conclusively proves... something.
What, exactly?
Not a lot. As you so elegantly point out, the number of keywords in a language is a fairly crude measurement of how complex the language is or is not. On the other hand, like lines of code, it's something that's easy to measure. ;-) Other such ways include taking the size of a gzip of a typical block of source code, like the Shootout does. This doesn't really prove anything though, because any benchmark can be implemented in more than one way. You could also try using how many pages it takes to explain the syntax of the language - but that also depends on how you explain it. Simplicity is really in the eye of the beholder, after all...

Andrew Coppin wrote:
OK people, it's random statistics time!
OK, my version of meaningless statistics: C++ (ISO/IEC 14882:1998(E)): 325 pages (712 including standard libraries) C# (ECMA-334): 505 pages (language only) Java: 450 pages (language only?) Scala (2.7): 125 pages (157 including standard library) Eiffel (ECMA-367): 160 pages (language only) ANSI SQL-92: 685 pages (language only) Haskell-98: 77 pages (247 including Prelude) Erlang (4.7.3) 162 pages (251 including builtin functions) Scheme (R5RS): 17 pages (45 including standard procedures) -- View this message in context: http://old.nabble.com/Language-simplicity-tp27134989p27137827.html Sent from the Haskell - Haskell-Cafe mailing list archive at Nabble.com.

On Wed, Jan 13, 2010 at 12:55 AM, Eduard Sergeev
Andrew Coppin wrote:
OK people, it's random statistics time!
OK, my version of meaningless statistics:
C++ (ISO/IEC 14882:1998(E)): 325 pages (712 including standard libraries) C# (ECMA-334): 505 pages (language only) Java: 450 pages (language only?) Scala (2.7): 125 pages (157 including standard library) Eiffel (ECMA-367): 160 pages (language only) ANSI SQL-92: 685 pages (language only) Haskell-98: 77 pages (247 including Prelude) Erlang (4.7.3) 162 pages (251 including builtin functions) Scheme (R5RS): 17 pages (45 including standard procedures)
Oberon: 16 pages, including table of contents and Appendix (containing EBNF grammar). -- Sebastian Sylvan

Eduard Sergeev schrieb:
Andrew Coppin wrote:
OK people, it's random statistics time!
OK, my version of meaningless statistics:
C++ (ISO/IEC 14882:1998(E)): 325 pages (712 including standard libraries) C# (ECMA-334): 505 pages (language only) Java: 450 pages (language only?) Scala (2.7): 125 pages (157 including standard library) Eiffel (ECMA-367): 160 pages (language only) ANSI SQL-92: 685 pages (language only) Haskell-98: 77 pages (247 including Prelude) Erlang (4.7.3) 162 pages (251 including builtin functions) Scheme (R5RS): 17 pages (45 including standard procedures)
Modula-3 is advertised for needing only 60 pages for the language definition.

Andrew Coppin wrote:
OK people, it's random statistics time!
OK, my version of meaningless statistics:
Java: 450 pages (language only?) Which version is this? The version of the Java Language Specification (version 3.0, 2005) I am currently reading has 684 pages. I'd prefer to read only 450.
Jur

Eduard Sergeev wrote:
OK, my version of meaningless statistics:
C++ (ISO/IEC 14882:1998(E)): 325 pages (712 including standard libraries) C# (ECMA-334): 505 pages (language only) Java: 450 pages (language only?) Scala (2.7): 125 pages (157 including standard library) Eiffel (ECMA-367): 160 pages (language only) ANSI SQL-92: 685 pages (language only) Haskell-98: 77 pages (247 including Prelude) Erlang (4.7.3) 162 pages (251 including builtin functions) Scheme (R5RS): 17 pages (45 including standard procedures)
Interesting. So Scheme is the shortest by a mile, followed by Haskell '98, followed by another big gap. Now is that because Haskell is simple? Or is it because the Report assumes that you already know what functional programming and Milner-Hindley type inference are?

On 12 Jan 2010, at 21:25, Andrew Coppin wrote:
OK people, it's random statistics time!
Haskell '98 apparently features 25 reserved words. (Not counting "forall" and "mdo" and so on, which AFAIK are not in Haskell '98.) So how does that compare to other languages?
C: 32 C++: 62 Borland Turbo Pascal: ~50 [without the OOP extensions added later] Eiffel: 59 VB: The source I checked listed in excess of 120 reserved words, but I'm dubious as to how "reserved" they really are. (Is CInt really reserved? I doubt it!) It also depends wildly on which of the bazillion VB dialects you mean. Java: 50 JavaScript: 36 Smalltalk: 0
There are six singleton pseudo-variables that act as reserved words: true,false, nil, self, super and thisContext.
Lisp: AFAIK, there are no truly reserved words in Lisp, only predefined functions. (??)
All Lisps have "special forms" which are evaluated uniquely and differently from function application and are therefore reserved words by another name. For example, Clojure has def, if, do, let, var, quote, fn, loop, recur, throw, try, monitor-enter, monitor-exit, dot, new and set!.
Python: 31 Ruby: 38 Tcl: Same analysis as for Lisp I believe.
COBOL: Over 400 (!)
As you can see, this conclusively proves... something.
Generally speaking, the most widely used languages seem to be near the upper end of the range. I don't think it really tells you that much. Possibly that a little superficial complexity through syntactic sugar can make your language a lot more human-friendly, but that it's possible to go too far and end up like C++. Martin

On Jan 13, 2010, at 03:49 , Martin Coxall wrote:
COBOL: Over 400 (!)
If we're going to go that far, FORTRAN and PL/1 have none. FORTRAN is somewhat infamous for this: "DO 10 I = 1, 400" is a loop start, "DO 10 I = 1. 400" (note typo, "." for ",") parses as the assignment "DO10I = 1.400". (This is often cited as the cause of the failure of Mariner 1, but that's an urban legend. See http://catless.ncl.ac.uk/Risks/9.54.html#subj1.1 for discussion, including the possible origin of this UL.) -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

"Brandon S. Allbery KF8NH"
If we're going to go that far, FORTRAN and PL/1 have none. FORTRAN is somewhat infamous for this:
There's also the option (perhaps this was PL/1?) of writing constructs like: IF THEN THEN IF ELSE THEN etc. Having few reserved words isn't necessarily a benefit. :-) -k -- If I haven't seen further, it is by standing in the footprints of giants

On Jan 13, 2010, at 05:45 , Ketil Malde wrote:
"Brandon S. Allbery KF8NH"
writes: If we're going to go that far, FORTRAN and PL/1 have none. FORTRAN is somewhat infamous for this:
There's also the option (perhaps this was PL/1?) of writing constructs like: IF THEN THEN IF ELSE THEN etc. Having few reserved words isn't necessarily a benefit. :-)
That'd be PL/I, and a prime example of why languages use keywords these days (as if FORTRAN weren't enough). :) -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

PL/I has keywords, they're just not reserved words.
With as many keywords as PL/I has, there something to say for not
making them reserved. :)
On Wed, Jan 13, 2010 at 11:50 AM, Brandon S. Allbery KF8NH
On Jan 13, 2010, at 05:45 , Ketil Malde wrote:
"Brandon S. Allbery KF8NH"
writes: If we're going to go that far, FORTRAN and PL/1 have none. FORTRAN is somewhat infamous for this:
There's also the option (perhaps this was PL/1?) of writing constructs like: IF THEN THEN IF ELSE THEN etc. Having few reserved words isn't necessarily a benefit. :-)
That'd be PL/I, and a prime example of why languages use keywords these days (as if FORTRAN weren't enough). :)
-- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

All Lisps have "special forms" which are evaluated uniquely and differently from function application and are therefore reserved words by another name. For example, Clojure has def, if, do, let, var, quote, fn, loop, recur, throw, try, monitor-enter, monitor-exit, dot, new and set!.
Yes, but the special forms are not distinguishable from user defined macros --- and some Lisp-implemantations special forms are another implementations macros. E.g. you can choose to make `if' a macro that expands to `cond' or vice versa. I do not know whether you are allowed to shadow the name of special-forms.
If you count reserved tokens, I guess Lisp reserves parentheses and whitespace?
Not if you are using Common Lisp. There you can install reader-macros that act on characters in the input-stream. (Most macros act on stuff in the already parsed syntrax tree.) Forth is also a remarkably flexible language in this regard. Matthias.

On 14 Jan 2010, at 14:42, Matthias Görgens wrote:
All Lisps have "special forms" which are evaluated uniquely and differently from function application and are therefore reserved words by another name. For example, Clojure has def, if, do, let, var, quote, fn, loop, recur, throw, try, monitor-enter, monitor-exit, dot, new and set!.
Yes, but the special forms are not distinguishable from user defined macros --- and some Lisp-implemantations special forms are another implementations macros. E.g. you can choose to make `if' a macro that expands to `cond' or vice versa. I do not know whether you are allowed to shadow the name of special-forms.
Clojure's a lot more 'syntaxy' than most Lisps. It has literals for large classes of entities that get represented as lists in most other Lisps. Which I guess is clearly a pragmatic design decision: be as syntax-heavy as is reasonably practicable without sacrificing homoiconicity and ending up like Dylan. Martin

On Thu, 14 Jan 2010 14:42:06 +0000, you wrote:
All Lisps have "special forms" which are evaluated uniquely and differently from function application and are therefore reserved words by another name. For example, Clojure has def, if, do, let, var, quote, fn, loop, recur, throw, try, monitor-enter, monitor-exit, dot, new and set!.
Yes, but the special forms are not distinguishable from user defined macros --- and some Lisp-implemantations special forms are another implementations macros. E.g. you can choose to make `if' a macro that expands to `cond' or vice versa. I do not know whether you are allowed to shadow the name of special-forms.
You can in Scheme; syntactic-keyword bindings can shadow variable bindings, and vice versa: The following is given as an example in R5RS: (let-syntax ((when (syntax-rules () ((when test stmt1 stmt2 ...) (if test (begin stmt1 stmt2 ...)))))) (let ((if #t)) (when if (set! if 'now)) if)) Evaluating the above returns "now." Steve Schafer Fenestra Technologies Corp. http://www.fenestra.com/

As you can see, this conclusively proves... something.
What about brainfuck? 8 different signs are used. -> http://de.wikipedia.org/wiki/Brainfuck#cite_note-0 The first link points to a page saying there is an interpreter 98 bytes in size.. What does this prove? :-) Marc Weber

Marc Weber wrote:
As you can see, this conclusively proves... something.
What about brainfuck? 8 different signs are used. -> http://de.wikipedia.org/wiki/Brainfuck#cite_note-0
The first link points to a page saying there is an interpreter 98 bytes in size..
What does this prove? :-)
Exhibit A: The Iota calculus. It has one value (the Iota function), and one operator (function application). It is Turing-complete. I have literally *no idea* how big an interpretter would be...

Andrew Coppin schrieb:
Hmm, I wonder if there's some way to compare the size of the language specification documents? :-}
PS. It comes as absolutely no surprise to me that C++ has the most keywords. But then, if I were to add AMOS Professional, that had well over 800 keywords at the last count... Because those BASIC dialects had a large set of built-in functions that would have been library functions in other languages. BlitzBasic II had many such functions defined as library functions. GFA-Basic provided many system library functions as built-in functions. There was a famous book about Murphy's law applied to computers where programming languages were advertised by the number of their built-in commands. :-)
participants (25)
-
Andrew Coppin
-
Brandon S. Allbery KF8NH
-
Bulat Ziganshin
-
Colin Paul Adams
-
Daniel Fischer
-
Eduard Sergeev
-
Evan Laforge
-
Fraser Wilson
-
Henning Thielemann
-
jur
-
Ketil Malde
-
Lennart Augustsson
-
Marc Weber
-
Martijn van Steenbergen
-
Martin Coxall
-
Matthias Görgens
-
Michael Hartl
-
Miguel Mitrofanov
-
Niklas Broberg
-
Roel van Dijk
-
Sebastian Sylvan
-
Steve Schafer
-
sylvain
-
Tom Tobin
-
Tony Morris