Claus,

Thanks for your thoughtful response. Let me note that fully abstract semantics for PCF -- a total toy, mind you, just lambda + bools + naturals -- took some 25 years from characterization of the problem to a solution. That would seem to indicate shoe-horning, in my book ;-). Moreover, when i look at proposals to revisit Church versus Parigot encodings for data structures (ala Oliveira's thesis), i think we are still at the very beginnings of an understanding of how data fits into algebraic accounts of computation (such as lambda and ð-calculi). 

Obviously, we've come a long way. The relationship between types and pattern-matching, for example, is now heavily exploited and generally a good thing. But, do we really understand what's at work here -- or is this just another 'shut up and calculate' situation, like we have in certain areas of physics. Frankly, i think we are really just starting to understand what types are and how they relate to programs. This really begs fundamental questions. Can we give a compelling type-theoretic account of the separation of program from data?

The existence of such an account has all kinds of implications, too. For example, the current "classification" of notions of quantity (number) is entirely one of history and accident.
Can we give a type theoretic reconstruction of these notions (of traditional data types) that would unify -- or heaven forbid -- redraw the usual distinctions along lines that make more sense in terms of applications that provide value to users? Conway's ideas of numbers as games is remarkably unifying and captures all numbers in a single elegant data type. Can this (or something like it) be further rationally partitioned to provide better notions of numeric type? Is there a point where such an activity hits a wall and what we thought was data (real numbers; sequences of naturals; ...) are just too close to programs to be well served by data-centric treatments?

Best wishes,

--greg

2008/11/24 Claus Reinke <claus.reinke@talk21.com>
 - i am interested in a first-principles notion of data. Neither lambda

 nor ð-calculus come with a criterion for determining which terms represent
 data and which programs. You can shoe-horn in such notions -- and it is
 clear that practical programming relies on such a separation -- but along
 come nice abstractions like generic programming and the boundary starts
 moving again. (Note, also that one of the reasons i mention ð-calculus is
 because when you start shipping data between processes you'd like to know
 that this term really is data and not some nasty little program...)

I wouldn't call the usual representations of data in lambda shoe-horned
(but perhaps you meant the criterion for distinguishing programs from
data, not the notion of data?). Exposing data structures as nothing but
placeholders for the contexts operating on their components, by making
the structure components parameters to yet-to-be-determined continuations,
seems to be as reduced to first-principles as one can get.

It is also close to the old saying that "data are just dumb programs"
(does anyone know who originated that phrase?) - when storage
was tight, programmers couldn't always afford to fill it with dead
code, so they encoded data in (the state of executing) their routines.
When data was separated from real program code, associating data
with the code needed to interpret it was still common. When high-level
languages came along, treating programs as data (via reflective meta-
programming, not higher order functions) remained desirable in some
communities. Procedural abstraction was investigated as an alternative
to abstract data types. Shipping an interpreter to run stored code was
sometimes used to reduce memory footprint.

If your interest is in security of mobile code, I doubt that you want to
distinguish programs and data - "non-program" data which, when
interpreted by otherwise safe-looking code, does nasty things, isn't
much safer than code that does non-safe things directly. Unless you
rule out code which may, depending on the data it operates on, do
unwanted things, which is tricky without restricting expressiveness.

More likely, you want to distinguish different kinds/types of
programs/data, in terms of what using them together can do (in
terms of pi-calculus, you're interested in typing process communication,
restricting access to certain resources, or limiting communication
to certain protocols). In the presence of suitably expressive type
systems, procedural data abstractions need not be any less "safe"
than dead bits interpreted by a separate program. Or if reasoning
by suitable observational equivalences tells you that a piece of code
can't do anything unsafe, does it matter whether that piece is
program or data?

That may be too simplistic for your purposes, but I thought I'd put
in a word for the representation of data structures in lambda.

Claus

Ps. there have been lots of variation of pi-calculus, including
  some targetting more direct programming styles, such as
  the blue calculus (pi-calculus in direct style, Boudol et al).
  But I suspect you are aware of all that.




--
L.G. Meredith
Managing Partner
Biosimilarity LLC
806 55th St NE
Seattle, WA 98105

+1 206.650.3740

http://biosimilarity.blogspot.com