Re: [Haskell-cafe] Haskell-Cafe Digest, Vol 158, Issue 29

Richard
thanks for your response and the references, which I'll look into; the
Dijkstra sounds particularly relevant. "Procedural-functional" is not
novel, and as you also mentioned, Haskell, Clean and Mercury would qualify,
if you chose to look at them that way, as would other languages. Any novelty
in the note would only ever be in the way that the mix is provided. You
raise salient points about the sort of challenges that languages will need
to confront, although a search has left me still unsure about PGPUs. Can I
ask you to say a bit more about programming styles: what Java can't do,
what others can do, and how that scales?
Regards
Rik
On 27 October 2016 at 13:00,
Today's Topics:
1. Re: A Procedural-Functional Language (WIP) (Rik Howard)
2. Re: A Procedural-Functional Language (WIP) (Rik Howard)
3. Re: A Procedural-Functional Language (WIP) (Richard A. O'Keefe)
4. Proof of concept of explanations for instance resolution (Michael Sloan)
5. Re: A Procedural-Functional Language (WIP) (Joachim Durchholz)
6. Re: Proof of concept of explanations for instance resolution (Tom Ellis)
7. Fwd: How to best display type variables with the same name (Christopher Done)
8. CoALP-Ty'16: Call for Participation (František Farka)
----------------------------------------------------------------------
Message: 1
Date: Wed, 26 Oct 2016 16:48:47 +0100
From: Rik Howard
To: Joachim Durchholz
Cc: haskell-cafe@haskell.org
Subject: Re: [Haskell-cafe] A Procedural-Functional Language (WIP)

Jo
The question is not whether these things are precluded; the question is how you want to tackle them. It's not even stating design goals here.
The section on Types has been amended to note that these questions form a part of the ongoing work.
The salient point of this and some other features is that they make it
easier to reason about a given program's properties, at the expense of making programming harder.
You put that point well.
The basic concept of subtypes is simple, but establishing a definition of "subtype" that is both useful and sound is far from trivial.
I hope that there is nothing that I've said that could be interpreted as me thinking otherwise. As you mentioned elsewhere, though, they look too appealing to ignore.
For example, mutable circles and mutable ellipses are not in a subtype relationship to each other if there is an updating "scale" operation with an x and y scaling factor (you cannot guarantee that a scaled circle stays circular). The design space for dealing with this is far from fully explored.
I'm not sure that the language could support mutable structures but I take your point that there are complications.
Also, subtypes and binary operators do not really mix; google for "parallel type hierarchy". (The core of the problem is that if you make Byte a subtype of Word, declaring the (+) operator in Word as Word -> Word will preclude Byte from being a subtype because you want a covariant signature in Byte but that violates subtyping rules for functions. So you need parametric polymorphism, but now you cannot use the simple methods for subtyping anymore.)
Clearly there is more to be done in this area.
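(For concreteness, a minimal Haskell sketch of the parametric-polymorphism route described in the quoted paragraph; the Plus class and the Byte and Word' newtypes are illustrative names only, not taken from the WIP or from any standard library.)

-- Each instance gets the homogeneous signature  plus :: a -> a -> a,
-- so no covariant-argument problem can arise between the two types.
class Plus a where
  plus :: a -> a -> a

newtype Byte  = Byte  Int deriving Show
newtype Word' = Word' Int deriving Show

instance Plus Byte  where plus (Byte a)  (Byte b)  = Byte  (a + b)
instance Plus Word' where plus (Word' a) (Word' b) = Word' (a + b)

-- A function written once against the class works at both types,
-- without any subtype relation between them.
double :: Plus a => a -> a
double x = x `plus` x

main :: IO ()
main = do
  print (double (Byte 21))   -- Byte 42
  print (double (Word' 21))  -- Word' 42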
The key point to mention is that you want to maintain referential integrity.
The document now mentions this.
Sounds pretty much like the consequences of having the IO monad in Haskell.
That seems fair to me, although I think the broader impact on an entire program would be different.
I think you should elaborate the similarities and differences with how Haskell does IO; that's a well-known standard and it will make the paper easier to read. The same goes for Clean and Mercury.
Something like that is addressed in Related Work. Clean is already on the list but it sounds, from your comments and those of others, as if Mercury may be worth including as well.
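(As a point of reference for that comparison, a minimal sketch of how Haskell confines effects to IO; the WIP's out-vars are not shown because the note's syntax is not given in this thread, and shout is just a hypothetical helper.)

-- Effects live only in IO-typed values; ordinary functions remain pure.
import Data.Char (toUpper)

shout :: String -> String   -- pure: no effects are possible here
shout = map toUpper

main :: IO ()
main = do
  line <- getLine            -- the effectful read; its result is bound to line
  putStrLn (shout line)      -- effects are sequenced by the IO monad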
It's hard to tell whether it is actually new; too many details are missing.
Certainly you have spotted the vagueness in the types; however, I think that issue can be set aside for the moment from the point of view of novelty. The language is purely functional with respect to functions and provides out-vars as the only mechanism for dealing with IO. Let's assume for the moment that that all hangs together: if there is another language that does that, there is no novelty; otherwise, there is.
Once again, your feedback has been useful and stimulating. Many thanks!
Regards
Rik

On 28/10/16 8:41 AM, Rik Howard wrote:
Any novelty in the note would only ever be in the way that the mix is provided. You raise salient points about the sort of challenges that languages will need to confront although a search has left me still unsure about PGPUs. Can I ask you to say a bit more about programming styles: what Java can't do, what others can do, how that scales?
The fundamental issue is that Java is very much an imperative language (although books on concurrent programming in Java tend to strongly recommend immutable data structures whenever practical, because they are safer to share).
The basic computational model of (even concurrent) imperative languages is the RAM: there is a set of threads living in a single address space where all memory is equally and as easily accessible to all threads.
Already that's not true. One of the machines sitting on my desk is a Parallella: 2 ARM cores, 16 RISC cores; there's a single address space shared by the RISC cores, but each of them "owns" a chunk of it and access is not uniform. Getting information between the ARM cores and the RISC cores is not trivial. Indeed, one programming model for the Parallella is OpenCL 1.1, although as they note, "Creating an API for architectures not even considered during the creation of a standard is challenging. This can be seen in the case of Epiphany, which possesses an architecture very different from a GPU, and which supports functionality not yet supported by a GPU. OpenCL as an API for Epiphany is good, but not perfect." The thing is that the Epiphany chip is more *like* a GPU than it is like anything, say, Java might want to run on.
For that matter, there is the IBM "Cell" processor, basically a Power core and a bunch of RISCish cores, not entirely unlike the Epiphany. As the Wikipedia page on the Cell notes, "Cell is widely regarded as a challenging environment for software development".
Again, Java wants a (1) large (2) flat (3) shared address space, and that's *not* what Cell delivers. The memory space available to each "SPE" in a Cell is effectively what would have been L1 cache on a more conventional machine, and transfers between that and main memory are non-trivial. So Cell memory is (1) small (2) heterogeneous and (3) partitioned.
The Science Data Processor for the Square Kilometre Array is still being designed. As far as I know, they haven't committed to a CPU architecture yet, and they probably want to leave that pretty late. Cell might be a candidate, but I suspect they'll not want to spend much of their software development budget on a "challenging" architecture.
Hmm. Scaling.
Here's the issue. It looks as though the future of scaling is *lots* of processors, running *slower* than typical desktops, with things turned down or off as much as possible, so you won't be able to pull the Parallella/Epiphany trick of always being able to access another chip's local memory. Any programming model that relies on large flat shared address spaces is out; message passing that copies stuff is going to be much easier to manage than passing a pointer to memory that might be powered off when you need it; anything that creates tight coupling between the execution orders of separate processors is going to be a nightmare.
We're also looking at more things moving into special-purpose hardware, in order to reduce power costs. It would be nice to be able to do this without a complete rewrite...
Coarray Fortran (in the current standard) is an attempt to deal with the kinds of machines I'm talking about. Whether it's a good attempt I couldn't say; I'm still trying to get my head around it. (More precisely, I think I understand what it's about, but I haven't a clue about how to *use* the feature effectively.) There are people at Rice who think it could be better.
Reverting to the subject of declarative/procedural, I recently came across Lee Naish's "Pawns" language.
Still very much a prototype, and he is interested in the semantics, not the syntax. https://github.com/lee-naish/Pawns http://people.eng.unimelb.edu.au/lee/papers/pawns/
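(To make the message-passing point above concrete, a small Haskell sketch using Control.Concurrent.Chan; since Haskell values are immutable, the runtime is free to share rather than copy them, which is what makes this style safe.)

-- A worker thread communicates over channels instead of touching shared
-- mutable memory.
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (newChan, readChan, writeChan)
import Control.Monad (forever, forM_, replicateM)

main :: IO ()
main = do
  requests  <- newChan
  responses <- newChan
  _ <- forkIO $ forever $ do        -- the worker owns no shared state
         n <- readChan requests
         writeChan responses (n * n :: Int)
  forM_ [1 .. 5 :: Int] (writeChan requests)
  print =<< replicateM 5 (readChan responses)   -- [1,4,9,16,25]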

thanks for the reply. Conceptually I like the idea of a single address space; it can then be a matter of configuration as to whether what you're addressing is another local process, a processor or something more remote.
Some assumptions about what can be expected from local resources need to be
dropped but I believe that it works in other situations. Your point about
not wanting to have to rewrite when the underlying platform evolves seems
relevant. Perhaps that suggests that a language, while needing to be aware
of its environment, oughtn't to shape itself entirely for that
environment. While we're on the subject of rewrites, that is the fate of
the WIP. I was wrong.

All
thank you for the feedback and the bandwidth. It has been invaluable and
is appreciated.
Regards
Rik

On 31/10/16 5:44 AM, Rik Howard wrote:
thanks for the reply. Conceptually I like the idea of a single address space, it can then be a matter of configuration as to whether what you're addressing is another local process, processor or something more remote.
The world doesn't care what you or I happen to like. I completely agree in *liking* a single address space. But it's not true to what is *there*, and if you program for that model, you're going to get terrible performance.
I've just been attending a 1-day introduction to our national HPC system. There are two clusters. One has about 3,300 cores and the other over 6,000. One is POWER+AIX, the other Intel+Linux. One has Xeon Phis (amongst other things), the other does not. Hint: neither of them has a single address space, and while we know about software distributed memory (indeed, one of the people here has published innovative research in that area), it is *not* like a single address space and is not notably easy to use.
It's possible to "tame" single address space. When you start to learn Ada, you *think* you're dealing with a single address space language, until you learn about partitioning programs for distributed execution. For that matter, Occam has the same property (which is one of the reasons why Occam didn't have pointers, so that it would be *logically* a matter of indifference whether two concurrent processors were on the same chip or not).
But even when communication is disguised as memory accessing, it's still communication, it still *costs* like communication, and if you want high performance, you had better *measure* it as communication.
One of the presenters was working with a million lines of Fortran, almost all of it written by other people. How do we make that safe?

Am 31.10.2016 um 05:07 schrieb Richard A. O'Keefe:
But even when communication is disguised as memory accessing, it's still communication, it still *costs* like communication, and if you want high performance, you had better *measure* it as communication.
And you need to control memory coherence, i.e. you need to define what data goes together with what processes. In an ideal world, the compiler would be smart enough to do that for you. I have been reading fantasies that FPLs with their immutable data structures are better suited for this kind of automation; has compiler research progressed enough to make that a realistic option? Without that, you'd code explicit multithreading, which means that communication does not look like memory access at all.

On 1/11/16 9:54 AM, Joachim Durchholz wrote:
And you need to control memory coherence, i.e. you need to define what data goes together with what processes.
At this point I'm puzzled. Languages like Occam, ZPL, and Co-Array Fortran basically say NO! to memory coherence. Of course you say which data goes with what process. If the data need to be available in some other process, there is some sort of fairly explicit communication.
In an ideal world, the compiler would be smart enough to do that for you. I have been reading fantasies that FPLs with their immutable data structures are better suited for this kind of automation;
Memory coherence exists as an issue when data are replicated and one of the copies gets mutated, so that the copies are now inconsistent. With immutable data this cannot be a problem. For what it's worth, the "Clean" programming language used to be called "Concurrent Clean" because it was set up to run on a cluster of Macintoshes.
Without that, you'd code explicit multithreading, which means that communication does not look like memory access at all.
I believe my argument was that it *shouldn't* look like memory access.

Am 01.11.2016 um 01:37 schrieb Richard A. O'Keefe:
On 1/11/16 9:54 AM, Joachim Durchholz wrote:
And you need to control memory coherence, i.e. you need to define what data goes together with what processes.
At this point I'm puzzled. Languages like Occam, ZPL, and Co-Array Fortran basically say NO! to memory coherence.
Sure, but unrelated.
Of course you say which data goes with what process.
The hope with FPLs was that you do not need to explicitly specify it anymore, because the compiler can manage that. Or maybe the considerably weaker scenario: that the programmer still explicitly defines what computation with what data forms a unit, but that it is easy to move boundaries.
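(Something in the spirit of that weaker scenario already exists in GHC's parallel strategies; a rough sketch, assuming the parallel package and a -threaded runtime, with expensive standing in for real work.)

-- The programmer marks which computations form parallel units; the runtime
-- decides where and when they actually run.
import Control.Parallel.Strategies (parListChunk, rdeepseq, using)

expensive :: Int -> Int
expensive n = sum [1 .. n]

main :: IO ()
main = do
  let results = map expensive [100000 .. 100063]
                  `using` parListChunk 8 rdeepseq
  -- Moving the "boundary" is just a matter of changing the chunk size or
  -- the strategy; the immutable inputs can be shared or copied freely.
  print (sum results)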
If the data need to be available in some other process, there is some sort of fairly explicit communication.
Which means that you do not have a simple function call anymore, but an extra API layer.
In an ideal world, the compiler would be smart enough to do that for you. I have been reading fantasies that FPLs with their immutable data structures are better suited for this kind of automation;
Memory coherence exists as an issue when data are replicated and one of the copies gets mutated, so that the copies are now inconsistent. With immutable data this cannot be a problem.
This still does not tell you where to draw the boundaries inside your system. If anything, it is getting harder with non-strict languages because it is harder to predict what computation will be run at what time.
For what it's worth, the "Clean" programming language used to be called "Concurrent Clean" because it was set up to run on a cluster of Macintoshes.
Clean is strict ;-)
Without that, you'd code explicit multithreading, which means that communication does not look like memory access at all.
I believe my argument was that it *shouldn't* look like memory access.
It's something that some people want(ed). Yours Truly being one of them, actually. It's just that I have become sceptical about the trade-offs. Plus, the more I read about various forms of partitioning computations (not just NUMA but also IPC and networking), the more it seems that hardware moves towards making the barriers higher, not lower (the reason being that this helps making computations within the barrier more efficient). If that's a general trend, that's bad news for network, IPC, or NUMA transparency. Which is going to make programming for these harder, not easier :-(

On 2/11/16 10:28 AM, Joachim Durchholz wrote:
We're still not really communicating, so it may be time to draw this thread to a close soon.
Am 01.11.2016 um 01:37 schrieb Richard A. O'Keefe:
On 1/11/16 9:54 AM, Joachim Durchholz wrote:
And you need to control memory coherence, i.e. you need to define what data goes together with what processes.
At this point I'm puzzled. Languages like Occam, ZPL, and Co-Array Fortran basically say NO! to memory coherence.
Sure, but unrelated.
How is it unrelated? If you don't need, don't want, and don't have "memory coherence", then you don't have to control it.
The hope with FPLs was that you do not need to explicitly specify it anymore, because the compiler can manage that.
The Alan Perlis quote applies: When someone says: "I want a programming language in which I need only say what I wish done", give him a lollipop.
In declarative languages, you give up explicit control over some things in order to make other things easier to express. A compiler has three ways to decide which are the "important" things:
- by analysis
- by being told (there's the famous anecdote about the original Fortran FREQUENCY statement being implemented backwards and nobody noticing)
- by measuring (standard technology in C compilers for years now)
Nearly 20 years ago I was using a C compiler that would shuffle functions around in your executable in order to reduce page faults. That's the "measurement" approach.
If the data need to be available in some other process, there is some sort of fairly explicit communication.
Which means that you do not have a simple function call anymore, but an extra API layer.
Here you have left me behind. WHAT was "a simple function call"? WHAT is "an extra API layer", as opposed to annotations like the distribution annotations in ZPL and HPFortran?
For what it's worth, the "Clean" programming language used to be called "Concurrent Clean" because it was set up to run on a cluster of Macintoshes.
Clean is strict ;-)
That one line provoked this reply. Clean isn't any stricter than Haskell. It does strictness inference, just like GHC. It allows strictness annotations, just like GHC (though not in Haskell 2010).
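(A tiny GHC illustration of such strictness annotations, using bang patterns; mean here is only an example, not something from the thread.)

{-# LANGUAGE BangPatterns #-}
-- The bangs force the accumulators on every step, whether or not
-- strictness analysis would have inferred that on its own.
mean :: [Double] -> Double
mean = go 0 0
  where
    go :: Double -> Int -> [Double] -> Double
    go !s !n []       = if n == 0 then 0 else s / fromIntegral n
    go !s !n (x : xs) = go (s + x) (n + 1) xs

main :: IO ()
main = print (mean [1 .. 1000000])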
Plus, the more I read about various forms of partitioning computations (not just NUMA but also IPC and networking), the more it seems that hardware moves towards making the barriers higher, not lower (the reason being that this helps making computations within the barrier more efficient).
If that's a general trend, that's bad news for network, IPC, or NUMA transparency. Which is going to make programming for these harder, not easier :-(
On the one hand, agreed. On the other hand, while programming in the 1960s still *is* getting harder, it's not clear that we cannot find an approach that will be easier.
To my astonishment, quite a lot of the people using NZ's supercomputer facility are programming in Python (basically glue code hooking up existing applications), or Matlab (serious number-crunching, which they say they've benchmarked against C), or even R (yes, R; NOT one of the world's faster languages, BUT again we're mostly talking about glue code hooking together existing building-blocks). And this facility has people who will help researchers with their supercomputer programming for free.
The Science Data Processor for the Square Kilometre Array will have a multi-level structure:
- compute elements contain multiple cores, probably GPUs, and probably FPGAs. These are roughly the equivalent of laptops.
- compute nodes are tightly integrated clusters containing multiple compute elements + interconnects + extra storage
- compute islands are looser clusters containing multiple compute nodes and management stuff
- the SDP will contain multiple compute islands + massive interconnects (3TB/sec data coming in)
- there will be two SDPs, each sharing a multimegawatt power station with a signal processing system that talks directly to the telescopes.
Managing massive data flows and scheduling the computational tasks is going to be seriously hard work. One of the main tools for managing stuff like this is ISOLATION, and LIMITING communication as much as possible, which is the very opposite of the large flat shared memory model.

Am 02.11.2016 um 01:24 schrieb Richard A. O'Keefe:
The hope with FPLs was that you do not need to explicitly specify it anymore, because the compiler can manage that.
The Alan Perlis quote applies: When someone says: "I want a programming language in which I need only say what I wish done", give him a lollipop.
It does not really apply; the quote talks about how "say what one wishes done" turns into a programming language, albeit a higher-level one. We're talking about assigning data to memory areas. That's a very different story, and one where other previously unthinkable features are applied by GHC today, such as efficient automatic memory reclamation or the elimination of intermediate data structures.
If the data need to be available in some other process, there is some sort of fairly explicit communication.
Which means that you do not have a simple function call anymore, but an extra API layer.
Here you have left me behind. WHAT was "a simple function call"? WHAT is "an extra API layer", as opposed to annotations like the distribution annotations in ZPL and HPFortran?
I wasn't aware of these annotations, so I was talking from the perspective of how you'd do it in one of today's vanilla languages.
Plus, the more I read about various forms of partitioning computations (not just NUMA but also IPC and networking), the more it seems that hardware moves towards making the barriers higher, not lower (the reason being that this helps making computations within the barrier more efficient).
If that's a general trend, that's bad news for network, IPC, or NUMA transparency. Which is going to make programming for these harder, not easier :-(
On the one hand, agreed. On the other hand, while programming in the 1960s still *is* getting harder, it's not clear that we cannot find an approach that will be easier.
Agreed.

As usual, you give me much to ponder. For some reason it pleases that the world is not too concerned with what we happen to like.
But it's not true to what is *there*, and if you program for that model, you're going to get terrible performance.
I heard recently of a type system that captures the complexity of functions in their signatures. With that information available to the machine, perhaps the machine could be equipped with a way to plan an execution such that performance is optimised. Your day with the HPC system sounds fascinating. Do you think that an Ada/Occam-like approach to partitioned distribution could tame the sort of address space that you encountered on the day?
Any programming model that relies on large flat shared address spaces is out; message passing that copies stuff is going to be much easier to manage than passing a pointer to memory that might be powered off when you need it
But there'll still be some call for shared memory? Or maybe only for persistence stores?
One of the presenters was working with a million lines of Fortran, almost all of it written by other people. How do we make that safe?
Ultimately only proof can verify safety. (I'm trying to address something like that in my rewrite, which, given the high quality of feedback from this list, I hope to post soon.)
participants (3)
- Joachim Durchholz
- Richard A. O'Keefe
- Rik Howard