
On 2/09/2013, at 3:55 PM, Rustom Mody wrote:
> On Mon, Sep 2, 2013 at 5:43 AM, Richard A. O'Keefe wrote:
>> A slogan I have programmed by since I first met C and recognised how vastly superior to PL/I it was for text manipulation _because_ it didn't have a proper string type is "Strings are Wrong!".
> I wonder if you notice the irony in your use of 'C' as exemplar in this context?
In all seriousness, a text editor application was *an order of magnitude* smaller in C than in PL/I *because* of the lack of a string data type.
> C rode to fame on the back of Unix. And Unix's innovation – one of many – is that at the OS level the string type was made common fare – a universal type. So everything from file names to file contents to IPC is a string.
The idea of file names being strings was no innovation. Yes, in crippled monstrosities like TOPS-10 file names were weird records -- I can still remember too many of the details -- and every ruddy TOPS-10 program had to do its own file name parsing, and it seemed as if they all did it differently. But the B6700 MCP interfaces treated file names as strings before UNIX was dreamed of.
File contents in UNIX are *not* strings and never have been -- NUL termination is no part of files, and binary files have been commonplace since the beginning (an a.out file is not a string!). They are *byte arrays*.
As for IPC, since when have System V shared memory, semaphores, or message queues had anything to do with strings? (Hint: the 'name' of a System V shared memory segment is a key_t, and that's an integral type, not a string. Hint: the 'name' of a System V semaphore is also a key_t integer, not a string. Hint: the 'name' of a System V message queue is also a key_t integer, not a string. Hint: messages sent using msgsnd are not strings, they are byte arrays with a separate count parameter.)
Classic UNIX uses strings for file names, and really, that's it. (The command line argv[] is not really an exception, because it was used for file names as well as options, and in fact mixing the two up caused endless problems.) Everything else in V7, S3, or SysV was identified by a *number*. Plan 9 has exit(string) but Unix has exit(byte).
From the perspective of someone who used UNIX v6 in 1979, *POSIX* IPC (whose objects *might* be in the file system but then again might *not* be, so their names are sorta-kinda-like file names but not really) and /proc are recent innovations. The idea that 'string' was even remotely like a "universal type" in UNIX is bizarre. Heck, UNIX never even used 'string' for *lines* in text files!
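To make the string versus byte-array distinction concrete, here is a minimal C sketch. The payload and both function names are invented for illustration and do not come from any UNIX source; the point is only that a NUL-terminated view and a (pointer, count) view of the same bytes disagree as soon as the data contains a NUL.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical payload: 8 bytes with NULs in the middle, the kind of
   content a binary file or a msgsnd() message body may carry. */
const unsigned char payload[8] = { 'a', '.', 'o', 0, 'u', 't', 0, 1 };

/* What a string-based interface sees: everything up to the first NUL. */
size_t length_as_string(const unsigned char *bytes)
{
    return strlen((const char *)bytes);
}

/* What a byte-array interface sees: exactly `count` bytes.
   Counting the NULs among them shows they are ordinary data. */
size_t count_nul_bytes(const unsigned char *bytes, size_t count)
{
    size_t nuls = 0;
    for (size_t i = 0; i < count; i++) {
        if (bytes[i] == 0) {
            nuls++;
        }
    }
    return nuls;
}
```

Called on the payload above, length_as_string reports 3 while sizeof payload is 8 and two of those bytes are NUL, which is why read, write and msgsnd all take an explicit count rather than relying on a terminator.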
> Of course when instructing a beginning programmer your basic premise 'Strings are Wrong!' is most likely right.
No, I'm talking about experienced programmers writing high performance programs.
> However if programs are seen as entities interacting with an 'external' world, the currency at the portals is invariably string.
- The currency at the portals is *not* invariably string. Learn PowerShell.
- "Text" is one thing and "string" is another. This was the B6700 lesson (well, really the B5500 lesson): for many purposes you want a text *stream* not a text *string* at the interface. It's also the what-Smalltalk-got-right-and-Java-got-wrong lesson: the right way to convert objects to text is via a *stream* interface, not a *string* interface.
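A rough C sketch of the stream-interface idea, under the assumption that a plain FILE * stands in for the text stream. The struct and function names are hypothetical, chosen only for this example. Each object writes its own text to whatever stream the caller supplies, and a container reuses the element printer on the same stream, so no intermediate strings are built, sized, or concatenated.

```c
#include <stdio.h>

/* Hypothetical example type, not taken from the thread. */
struct point { int x, y; };

/* Stream interface: write the text form of *p to `out`.
   Returns the number of characters written, or a negative value on error. */
int point_print(FILE *out, const struct point *p)
{
    return fprintf(out, "(%d, %d)", p->x, p->y);
}

/* Print an array of points as "[p0, p1, ...]" on the same stream.
   Returns the total number of characters written, or -1 on error. */
int points_print(FILE *out, const struct point *ps, size_t n)
{
    int total = 0;

    if (fputc('[', out) == EOF) return -1;
    total += 1;

    for (size_t i = 0; i < n; i++) {
        if (i > 0) {
            if (fputs(", ", out) == EOF) return -1;
            total += 2;
        }
        int written = point_print(out, &ps[i]);
        if (written < 0) return -1;
        total += written;
    }

    if (fputc(']', out) == EOF) return -1;
    return total + 1;
}
```

A string-returning version of the same thing would have to guess buffer sizes or allocate and copy at every level of nesting; with the stream version the caller decides where the text goes (stdout, a file, a pipe) and the printers never need to know.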
> And more than just noob programmers have got this wrong – think of the precious one-byte opcodes that Intel wastes on ascii and decimal arithmetic.
Hang on, they are there in order to *support* the "numbers are text" model. You can't have it both ways.
> So while this is true:
>> If bar is predefined, it *isn't* the string 'b':'a':'r':[]. If bar is a number, it *isn't* a string. So "other strings" is quite misleading.
> in the innards of haskell, bar is a string
No, in the innards of Haskell, bar is possibly a number, possibly a pointer to some sort of record, possibly some other data structure, but almost certainly *not* a string.