Is Safe Haskell intended to allow segfaults?

We're trying to spend some cycles pushing on Safe Haskell within the stackage packages. (It's looking like a slog.) But we're running up against some basic questions regarding the core packages and Safe Haskell guarantees. The manual currently says: https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/safe_haskell...

*Functions in the IO monad are still allowed and behave as usual.*

As usual? So it is ok to segfault GHC? Elsewhere it says "in the safe language you can trust the types", and I'd always assumed that meant Safe Haskell is a type safe language, even in the IO fragment. Was there an explicit decision to allow segfaults and memory corruption? This can happen not just with FFI calls but with uses of Ptrs within Haskell, for example the following:

```
{-# LANGUAGE Safe #-}
module Main where

import Foreign.Marshal.Alloc
import Foreign.Storable
import Foreign.Ptr
import System.Random

fn :: Ptr Int -> IO ()
fn p = do
  -- This is kosher:
  poke p 3
  print =<< peek p
  -- This should crash the system:
  ix <- randomIO
  pokeElemOff p ix 0xcc

main :: IO ()
main = alloca fn
```

-Ryan

On Mon, Aug 8, 2016 at 1:27 PM, Ryan Newton wrote:
As usual? So it is ok to segfault GHC? Elsewhere it says "in the safe language you can trust the types", and I'd always assumed that meant Safe Haskell is a type safe language, even in the IO fragment.
Pretty sure it's impossible to allow IO without enabling all of it one way or another. And as soon as you allow *any* IO, you're open to various kinds of failures, including segfaults. The only way you will get your type system to prevent that is fully specified dependent typing, both in your program *and in the OS-level interfaces*. I wish you luck on the latter.

--
brandon s allbery kf8nh | sine nomine associates
allbery.b@gmail.com | ballbery@sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad | http://sinenomine.net

Pretty sure it's impossible to allow IO without enabling all of it one way or another. And as soon as you allow *any* IO, you're open to various kinds of failures including segfaults. The only way you will get your type system to prevent that is fully specified dependent typing both in your program *and in the OS level interfaces*. I wish you luck on the latter.
I don't agree. This would seem to imply that type safe / memory safe imperative languages are impossible, and that OCaml/Java/C#/etc. are a false promise! Which APIs and dangers are you thinking of? I'm worried about bounds checks on memory access, and about the capability of foreign code to poke at arbitrary memory. But other APIs like these should be fine:

- IORef API
- MVar API
- STM API
- File IO API
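For illustration, a minimal sketch built only from such APIs (the module name and function are invented): it compiles under {-# LANGUAGE Safe #-}, and by construction nothing in it can address arbitrary memory, unlike the Ptr example above.

```haskell
{-# LANGUAGE Safe #-}
-- Sketch: IO code built only from bounds-safe primitives (IORef, MVar).
-- There is no way to fabricate an address here, so no segfault.
module Main where

import Data.IORef
import Control.Concurrent.MVar

safeCounter :: IO Int
safeCounter = do
  ref  <- newIORef (0 :: Int)
  lock <- newMVar ()                            -- coordination, not raw memory
  withMVar lock $ \_ -> modifyIORef' ref (+ 1)  -- guarded, checked update
  readIORef ref

main :: IO ()
main = safeCounter >>= print
```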

Hello Ryan,

The guarantee that Safe Haskell gives with regard to IO is a little subtle: it is mentioned in Section 3.1 of the paper, and detailed in Section 5.1. Essentially, to use Safe Haskell, you are responsible for defining the type at which untrusted code is to be called. Using an untrusted value at type IO a in main imposes no safety restrictions by design--it's up to the user of Safe Haskell to decide what kind of security properties they need out of user code.

Edward

Excerpts from Ryan Newton's message of 2016-08-08 13:27:16 -0400:

Hi Ed,
Thanks, I went back to the paper and read those sections.
It sounds like this design heavily reflects the untrusted-code use case. I
believe ceding the entire IO monad is too pessimistic, and it's also
against the *spirit* of the type safety guarantee mentioned in the paper:
"Type safety. In Milner’s famous phrase, well-typed programs do not go wrong."
There's not really very much discussion of this "punting on IO" decision in
the paper. The paper mentions not deleting users' files as an example of a
higher-level security policy -- and a reason not to try to lock down IO --
but that's not really the issue here.
Sure, there are many use cases that require an RIO newtype, but the memory
model and memory protection are really internal to the language. My
interest is in a type safe language with a memory model *in the IO monad*.
I think it's quite *reasonable to expect Safe Haskell to be as safe as
Java!* There are hundreds of thousands or millions of lines of code
written in the IO monad [1], and a lot of that code could probably be
memory safe by construction.
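For reference, the RIO pattern referred to above looks roughly like this hand-rolled sketch (module and function names are made up): a newtype over IO whose constructor is hidden, so callers only get the re-exported operations.

```haskell
{-# LANGUAGE Trustworthy #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
-- Hypothetical sketch of the "RIO" pattern: the RIO constructor is not
-- exported, so Safe code can only use what this trusted module offers.
module RIO (RIO, runRIO, rioPutStrLn, rioNewIORef, rioReadIORef) where

import Data.IORef

newtype RIO a = RIO (IO a)
  deriving (Functor, Applicative, Monad)

-- Only the trusted caller gets to run an RIO computation.
runRIO :: RIO a -> IO a
runRIO (RIO io) = io

-- Every permitted operation has to be wrapped and re-exported by hand:
rioPutStrLn :: String -> RIO ()
rioPutStrLn = RIO . putStrLn

rioNewIORef :: a -> RIO (IORef a)
rioNewIORef = RIO . newIORef

rioReadIORef :: IORef a -> RIO a
rioReadIORef = RIO . readIORef
```

The tedium of that last section -- one wrapper per IO function -- is exactly the cost being weighed here.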
One question I have for people is which of the following they would favor:
1. Redefine -XSafe to guarantee something about IO and its memory safety
(and even its memory model)
2. Add a fourth safety level, "SafeExceptIO", that corresponds to the
current guarantees, while extending "Safe" to say something about IO.
3. Leave Safe Haskell as it is.
(2) sounds a bit clunky to me, and I favor (1) most of all.
Best,
-Ryan
[1] 17M lines on hackage total, hard to count how much is in an IO monad or
related monad.

I think what you are proposing is reasonable. I wasn't present when Safe Haskell's design was originally fleshed out, so I don't know why this route wasn't taken.

Edward

Excerpts from Ryan Newton's message of 2016-08-08 16:00:38 -0400:

Thanks for bringing this up Ryan, it's an interesting direction of thought.
It's been a while since we originally designed Safe Haskell, so I can't
remember our thinking very well in this area. I believe it came down
to the following:
* Our primary concern was to be able to 'trust the types', which is
perhaps a little weaker than 'type safety'. So IO can do ugly things,
but at least we know that, and we know what something without IO, or
something using a restricted IO monad, can do.
* Our intuition was that offering stronger guarantees about IO would
be very difficult -- I don't remember how far we explored this
intuition.
* We were also motivated very strongly by the security use case, which
may sadly have blinded us a little to something more ambitious, as you
propose.
If you have the energy, it'd be great to put some of this thinking
into a wiki page (https://ghc.haskell.org/trac/ghc/wiki/SafeHaskell)
and flesh out a first approximation of which IO APIs cause issues. Is
it just Ptr not carrying bounds around with it? Maybe, but then we
need to secure how Ptrs can be created, which excludes FFI functions
returning Ptrs.
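To make the bounds question concrete, a hypothetical sketch of what a bounds-carrying pointer could look like (the names BoundedPtr, allocaBounded, etc. are invented): the constructor stays hidden, and every dereference is range-checked.

```haskell
{-# LANGUAGE Trustworthy #-}
-- Hypothetical sketch of a Ptr that carries its bounds. The constructor
-- is not exported, so the only way to obtain one is via allocaBounded.
module BoundedPtr (BoundedPtr, allocaBounded, pokeElem, peekElem) where

import Foreign.Marshal.Array (allocaArray)
import Foreign.Ptr (Ptr)
import Foreign.Storable (Storable, peekElemOff, pokeElemOff)

data BoundedPtr a = BoundedPtr !(Ptr a) !Int   -- pointer plus element count

allocaBounded :: Storable a => Int -> (BoundedPtr a -> IO b) -> IO b
allocaBounded n k = allocaArray n (\p -> k (BoundedPtr p n))

-- Reject any index outside [0, n); a checked failure, not a segfault.
check :: Int -> BoundedPtr a -> IO ()
check i (BoundedPtr _ n)
  | i < 0 || i >= n = ioError (userError "BoundedPtr: index out of range")
  | otherwise       = pure ()

pokeElem :: Storable a => BoundedPtr a -> Int -> a -> IO ()
pokeElem bp@(BoundedPtr p _) i x = check i bp >> pokeElemOff p i x

peekElem :: Storable a => BoundedPtr a -> Int -> IO a
peekElem bp@(BoundedPtr p _) i = check i bp >> peekElemOff p i
```

As noted, this only helps if FFI code cannot hand back a raw Ptr that gets promoted into a BoundedPtr unchecked.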
I imagine that in Java I can construct an invalid pointer in foreign
code and then cause segfaults without the Java code having any
issues. Just guessing at this, so I'm very interested to know how it's
prevented if I can't.
Securing the memory model may be very challenging. E.g., as Russ Cox
points out, in Go one can use a data race to break type safety. Does
GHC-Haskell have any similar issues? What performance impact would
result from fixing these edge cases?
http://research.swtch.com/gorace
I think your (1) proposal is reasonable and desirable, but we'd really
want to understand the difficulty further to guide us between 1--3.
Cheers,
David

Hi,
On 9 August 2016 at 01:32, David Terei wrote:
I imagine in Java, that I can construct an invalid pointer in foreign code, and then cause segfaults without the Java code having any issues. Just guessing at this, so very interested to know how it's prevented if I can't.
Yes, this can be done with JNI, see e.g. [1]. Additionally, by using sun.misc.Unsafe [2], one can cause segfaults even from pure Java.

[1] https://www.cs.princeton.edu/~appel/papers/safejni.pdf
[2] http://www.inf.usi.ch/faculty/lanza/Downloads/Mast2015a.pdf

* Mikhail Glushenkov:
Yes, this can be done with JNI, see e.g. [1]. Additionally, by using sun.misc.Unsafe [2], one can cause segfaults even from pure Java.
You can also bring the JVM into an unstable state by triggering a VirtualMachineError, basically an out-of-memory condition, a stack overflow, or any other unrecoverable error. You can also exhaust limited resources such as file descriptors pretty easily.

On Mon, Aug 8, 2016 at 8:46 PM, Mikhail Glushenkov wrote:
Yes, this can be done with JNI, see e.g. [1]. Additionally, by using sun.misc.Unsafe [2], one can cause segfaults even from pure Java. [1] https://www.cs.princeton.edu/~appel/papers/safejni.pdf [2] http://www.inf.usi.ch/faculty/lanza/Downloads/Mast2015a.pdf
Ah, I see. I thought that, ruling out FFI, you couldn't segfault with
pure Java code. Good to know about the unsafe interface.
On Mon, Aug 8, 2016 at 7:32 PM, David Terei wrote:
If you have the energy, it'd be great to put some of this thinking into a wiki page (https://ghc.haskell.org/trac/ghc/wiki/SafeHaskell) and flesh out a first approximation of what IO API's cause issues. Is it just Ptr not carrying bounds around with it? Maybe, but then we need to secure how Ptr's can be created, which excludes FFI returning Ptr's.
Yes, we can add a wiki page.

Btw, I was thinking more of kicking FFI/peek/poke outside of the Safe bubble, not changing their operational behavior. First of all, it's nigh impossible to lock down FFI calls.

When someone, e.g. Bob Harper https://existentialtype.wordpress.com/2012/08/14/haskell-is-exceptionally-un..., points out a problem in Haskell, we sometimes respond "hey, *Safe Haskell* is the real objet d'art! It's a safe language." Yet it isn't really a full *language* if people cannot write and run programs in it! (Because every program must ultimately be `main :: IO ()`.) Kicking out segfaulty features would still leave a safe language that people can write complete programs in.

Best,
-Ryan

If you can execute subprocesses, you could always spawn gdb to attach via ptrace() to the parent process and then poke around memory. Yes, this is a "dumb" example, but I think it goes to show how important it is to correctly characterize what the threat model is. A "no-segfault" fragment of Haskell doesn't seem so much a security feature as a "good practices" lint pass.

Edward

Excerpts from Ryan Newton's message of 2016-08-09 10:41:44 -0400:

On Tue, Aug 9, 2016 at 4:19 PM, Edward Z. Yang wrote:
If you can execute subprocesses, you could always spawn gdb to attach via ptrace() to the parent process and then poke around memory.
Don't even need that if you're just talking segfaults; you can always spawn a subprocess "kill -SEGV $PPID" :) Unless you have full control over all the code that could be run in subprocesses, it's not going to be safe, much less Safe.

Heh, ok, segfaults themselves are a red herring. More precisely:
The operational semantics for a SafeIO language should always accurately
model its memory state. The application should not compute (take a step in
the semantics) in a way that exposes corrupt memory or arbitrary undefined
behavior. Nor should it violate the memory model.
Moving immediately to the "Terminated" state is fine, whether it be due to
out-of-memory, kill -SEGV, cosmic rays, or hardware failure. An FFI call
that corrupts memory is bad (may result in arbitrary behavior, not just
termination) as is ptrace'ing.
Naturally, all Unsafe code is part of the TCB, as are the OS and GHC, and
low-level data structure libs and bindings. It's a big TCB. Still, it would
be something to be able to write an app that doesn't automatically get added
to this TCB just by virtue of being an *app* (main :: IO ()).

I've always treated Safe Haskell as "Safe until you allow IO" -- in that
all 'evil' things get tainted by an IO type that you can't get rid of by
the usual means. So if you go to run pure Safe Haskell code in, say,
lambdabot, which doesn't give the user a means to execute IO, it can't
segfault if all of the Trustworthy modules you depend upon actually are
trustworthy.
Trying to shore up segfault safety under Safe in IO seems like a losing
proposition.
-Edward
_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

I'm hearing that Safe Haskell is great for pure use cases (lambdabot).
But that doesn't depend on being able to write arbitrary IO code inside the
Safe bubble, does it? In fact *all* of IO could be outside the safe
boundary for this use case, could it not? Are there any existing cases
where it is important to be able to build up unsafe IO values inside -XSafe
code?
Edward, why does it seem like a losing proposition? Are there further
problems that come to mind? ezyang mentioned the subprocess problem. I
don't have a strong opinion on that one. But I tend to think the safe IO
language *should* allow subprocess calls, and it's a matter of configuring
your OS to not allow ptrace in that situation. This would be part of a set
of requirements for how to compile and launch a complete "Safe Haskell"
*program* in order to get a guarantee.
My primary interest is actually not segfault-freedom per se, but being
able to define a memory model for Safe Haskell (for which I'd suggest
sequential consistency). FFI undermines that, and peek/poke seem like they
should cluster with FFI as unsafe features. I'm not inclined to give a
memory model to peek or FFI -- at that level you get what the architecture
gives you -- but I do want a memory model for IORefs, IOVectors, etc.
We're poking at the Stackage package set now to figure out what pressure
points to push on to increase the percentage of Stackage that is Safe. I'll
be able to say more when we have more data on dependencies and problem
points. Across all of Hackage, Safe Haskell has modest use: of the ~100K
modules on Hackage, ~636 are marked Safe, ~874 Trustworthy, and ~118
Unsafe. It should be easy to check whether any of this Safe code is currently
importing "Foreign.*" or using FFI.
My general plea is that we not give the imperative partition of Haskell
the short end of the stick [1]. There is oodles of code in IO (or
MonadIO), and probably relatively little in "RIO". To my knowledge, we
don't have great ways to coin "RIO" newtypes without having to wrap and
re-export rather a lot of IO functions. Maybe if APIs like MVars or files
were overloaded in a class, then GND (GeneralizedNewtypeDeriving) could do
some of the work...
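To illustrate that last idea, a hypothetical sketch of an MVar API overloaded in a class (the class and module names are invented): an RIO-style newtype then inherits the whole API with one deriving clause instead of one wrapper per function.

```haskell
{-# LANGUAGE Trustworthy #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
-- Hypothetical sketch: an MVar API overloaded in a class, so that a
-- restricted-IO newtype can pick it up via GeneralizedNewtypeDeriving.
module MonadMVar where

import Control.Concurrent.MVar (MVar)
import qualified Control.Concurrent.MVar as M

class Monad m => MonadMVar m where
  newMVar  :: a -> m (MVar a)
  takeMVar :: MVar a -> m a
  putMVar  :: MVar a -> a -> m ()

instance MonadMVar IO where
  newMVar  = M.newMVar
  takeMVar = M.takeMVar
  putMVar  = M.putMVar

-- One deriving clause lifts the whole API onto the restricted monad:
newtype RIO a = RIO (IO a)
  deriving (Functor, Applicative, Monad, MonadMVar)

runRIO :: RIO a -> IO a
runRIO (RIO io) = io
```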
-Ryan
[1] In safety guarantees, in optimizations, primops, whatever... For
instance, I find in microbenchmarks that IO code still runs 2X slower than
pure code, even if no IO effects are performed.

I see three major stories here:
1.) If you remove IO from being able to be compiled inside Safe code _at
all_ most packages I have that bother to expose Safe information will have
to stop bothering. I'd have to cut up too many APIs into too many
fine-grained pieces. This would considerably reduce the utility of Safe
Haskell to me. Many of them expose a few combinators here and there that
happen to live in IO and I can view offering Safe or Trustworthy to users
as a 'the pure stuff looks really pure' guarantee. For the most part it
'just works' and Trustworthy annotations can be put in when I know the
semantics of the hacks I'm using under the hood.
2.) Assuming instead that you're talking about a stronger-than-Safe
additional language extension, say ReallySafe or SafeIO, it all comes down
to what the user is allowed to do in IO, doesn't it? What effects are users
granted access to? We don't have a very fine-grained system for IO-effect
management, and it seems pretty much any choice you pick for what to put in
the sandbox will be wrong for some users, so you'd need some sort of pragma
for each IO operation saying what bins it falls into and to track that
while type checking, etc. At least then you could say what you are safe
with respect to. That all seems to be rather a big mess, roughly equivalent
to modeling an effect system for IO operations, and then retroactively
categorizing everything, putting a big burden on maintainers and requiring
a lot of community buy-in, sight unseen.
3.) On the other hand, someone could _build_ an effect system in Haskell
that happens to sit on top of IO -- holding effects in an HList, an
undischarged nullary class constraint, etc. -- then pull a couple of
Trustworthy modules around it for embedding the effects they want to
permit, and build this today without any compiler support; they'd just
have to make a final application-specific Trustworthy wrapper to run
whatever effects they want to permit into their program. It is more
invasive to the code in question, but it requires zero community
organizing and we've already got all the compiler mojo we need. The
downside is the Trustworthy wrappers at the bottom of the heap, and that
it doesn't interoperate with basically anything already written.
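For illustration, a minimal sketch of this style using an mtl-style effect class rather than an HList (all names are invented): untrusted code is written against the class, and the trusted wrapper chooses the carrier monad, and hence which real IO effects are reachable.

```haskell
{-# LANGUAGE Trustworthy #-}
-- Hypothetical sketch of option (3): effects as a type class on top of
-- IO. Untrusted code can only perform operations the class offers.
module Effect (Teletype(..), echo) where

class Monad m => Teletype m where
  readLn'  :: m String
  writeLn' :: String -> m ()

-- "Untrusted" code, parametric in the carrier monad:
echo :: Teletype m => m ()
echo = readLn' >>= writeLn'

-- One trusted carrier runs the effects in real IO; a sandboxed or
-- logging interpreter could be substituted without touching 'echo'.
instance Teletype IO where
  readLn'  = getLine
  writeLn' = putStrLn
```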
-Edward
On Tue, Aug 9, 2016 at 10:45 PM, Ryan Newton
I'm hearing that Safe Haskell is great for pure use cases (lambda bot). But that doesn't depend on being able to write arbitrary IO code inside the Safe bubble, does it? In fact *all* of IO could be outside the safe boundary for this use case, could it not? Are there any existing cases where it is important to be able to build up unsafe IO values inside -XSafe code?
Edward, why does it seem like a losing proposition? Are there further problems that come to mind? ezyang mentioned the subprocess problem. I don't have a strong opinion on that one. But I tend to think the safe IO language *should* allow subprocess calls, and its a matter of configuring your OS to not allow ptrace in that situation. This would be part of a set of requirements for how to compile and launch a complete "Safe Haskell" *program* in order to get a guarantee.
My primary interest is actually not segfault-freedom per se, but being able to define a memory model for Safe Haskell (for which I'd suggest sequential consistency). FFI undermines that, and peek/poke seems like it should cluster with FFI as an unsafe feature. I'm not inclined to give a memory model to peek or FFI -- at that level you get what the architecture gives you -- but I do want a memory model for IORefs, IOVectors, etc.
We're poking at the Stackage package set now to figure out what pressure point to push on to increase the percentage of Stackage that is Safe. I'll be able to say more when we have more data on dependencies and problem points. Across all of Hackage, Safe Haskell has modest use: of the ~100K modules on Hackage, ~636 are marked Safe, ~874 Trustworthy, and ~118 Unsafe. It should be easy to check if any of this Safe code is currently importing "Foreign.*" or using FFI.
My general plea is that we not give the imperative partition of Haskell too much the short end of the stick [1]. There are oodles of code in IO (or MonadIO), and probably relatively little in "RIO". To my knowledge, we don't have great ways to coin "RIO" newtypes without having to wrap and reexport rather a lot of IO functions. Maybe if APIs like MVars or files were overloaded in a class then GND could do some of the work...
-Ryan
[1] In safety guarantees, in optimizations, primops, whatever... For instance, I find in microbenchmarks that IO code still runs 2X slower than pure code, even if no IO effects are performed.
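The class-plus-GND idea in the paragraph above can be sketched today. MonadMVar below is a hypothetical class invented for this example, not an existing API:

```
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
-- Hedged sketch: if MVar operations were overloaded in a class,
-- GeneralizedNewtypeDeriving could coin an "RIO" newtype without
-- re-wrapping each function by hand. MonadMVar is hypothetical.
import Control.Concurrent.MVar

class Monad m => MonadMVar m where
  newMV  :: a -> m (MVar a)
  takeMV :: MVar a -> m a
  putMV  :: MVar a -> a -> m ()

instance MonadMVar IO where
  newMV  = newMVar
  takeMV = takeMVar
  putMV  = putMVar

-- The restricted monad gets the whole MVar API for free via GND.
newtype RIO a = RIO (IO a)
  deriving (Functor, Applicative, Monad, MonadMVar)

runRIO :: RIO a -> IO a
runRIO (RIO io) = io

main :: IO ()
main = do
  x <- runRIO $ do
    mv <- newMV (1 :: Int)
    v  <- takeMV mv
    putMV mv (v + 2)
    takeMV mv
  print x
```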
On Tue, Aug 9, 2016 at 5:13 PM, Edward Kmett wrote:
I've always treated Safe Haskell as "Safe until you allow IO" -- in that all 'evil' things get tainted by an IO type that you can't get rid of by the usual means. So if you go to run pure Safe Haskell code in, say, lambdabot, which doesn't give the user a means to execute IO, it can't segfault if all of the Trustworthy modules you depend upon actually are trustworthy.
Trying to shore up segfault safety under Safe in IO seems like a losing proposition.
-Edward
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Right - Safe Haskell provides the minimum that you need to be able to
safely run untrusted code: the ability to trust the type system and the
module system. Defining a safe subset of IO is usually an
application-specific decision, e.g. do you want to allow access to the
filesystem but without allowing openFile "/dev/mem"? It's a minefield.
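A toy illustration of why this is application-specific: even a tiny whitelist-a-directory policy (the SafeIO and readUnder names are hypothetical) immediately runs into questions about symlinks, "..", and special files, which is the minefield:

```
-- Hedged sketch of an application-specific "safe IO": file reads are
-- permitted only under a whitelisted directory. Even this simple
-- policy has holes (symlinks, /proc, hard links), illustrating why a
-- one-size-fits-all safe subset of IO is hard to define.
import Data.List (isInfixOf, isPrefixOf)

newtype SafeIO a = SafeIO { runSafeIO :: IO a }

readUnder :: FilePath -> FilePath -> SafeIO (Either String String)
readUnder root path
  | not (root `isPrefixOf` path) = SafeIO (pure (Left "outside sandbox"))
  | ".." `isInfixOf` path        = SafeIO (pure (Left "path traversal"))
  | otherwise                    = SafeIO (Right <$> readFile path)

main :: IO ()
main = do
  -- openFile "/dev/mem" is rejected by the prefix check...
  r <- runSafeIO (readUnder "/tmp" "/dev/mem")
  print r
```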
Ryan, if you want to give an operational semantics for memory in IO, why
not start from the subset of IO that you can accurately model - the basic
IO structure together with a set of primitives - and define the semantics
for that? That's typically what we do when we're talking about something
in the IO monad.
Cheers
Simon
On 10 August 2016 at 04:58, Edward Kmett wrote:
I see three major stories here:
1.) If you remove IO from being able to be compiled inside Safe code _at all_ most packages I have that bother to expose Safe information will have to stop bothering. I'd have to cut up too many APIs into too many fine-grained pieces. This would considerably reduce the utility of Safe Haskell to me. Many of them expose a few combinators here and there that happen to live in IO and I can view offering Safe or Trustworthy to users as a 'the pure stuff looks really pure' guarantee. For the most part it 'just works' and Trustworthy annotations can be put in when I know the semantics of the hacks I'm using under the hood.
2.) Assuming instead that you're talking about a stronger-than-Safe additional language extension, say ReallySafe or SafeIO, it all comes down to what the user is allowed to do in IO, doesn't it? What effects are users granted access to? We don't have a very fine-grained system for IO-effect management, and it seems pretty much any choice you pick for what to put in the sandbox will be wrong for some users, so you'd need some sort of pragma for each IO operation saying what bins it falls into and to track that while type checking, etc. At least then you could say what you are safe with respect to. That all seems to be rather a big mess, roughly equivalent to modeling an effect system for IO operations, and then retroactively categorizing everything, putting a big burden on maintainers and requiring a lot of community buy-in, sight unseen.
3.) On the other hand, someone could _build_ an effect system in Haskell that happens to sit on top of IO, holding effects in an HList, undischarged nullary class constraint, etc. then pull a couple of Trustworthy modules around it for embedding the effects they want to permit and build this today without any compiler support, they'd just have to make a final application-specific Trustworthy wrapper to run whatever effects they want to permit into their program. It is more invasive to the code in question, but it requires zero community organizing and we've already got all the compiler mojo we need. The downside is the Trustworthy wrappers at the bottom of the heap and that it doesn't interoperate with basically anything already written.
-Edward

Hi Simon,
On Wed, Aug 10, 2016 at 4:24 AM, Simon Marlow wrote:
Defining a safe subset of IO is usually an application-specific decision, e.g. do you want to allow access to the filesystem but without allowing openFile "/dev/mem"? It's a minefield.
I respect your intuition that this is a "minefield" (and Edward's "big mess"), but I still want to unpack this further. I would propose that a MemSafe app is a two-part contract that is half an OS responsibility and half language/runtime responsibility.
1. OS: guarantee that all system calls (including file access) from the process do not violate the process's memory, outside of certain opt-in DMA regions.
2. Lang/runtime: create a binary "ensuring" that all instructions executed in the process maintain the type safety of memory and follow the memory model. (Scare quotes due to large TCB.)
With this division, "/dev/mem" and subprocess/ptrace are definitely the job of the OS or containerization. Blame for a bug would fall to category 1. A MemSafe app needs a launcher/harness to check that category 1 is enforced properly. I'm not proposing that GHC needs to generate that. It should be separate.

Ryan, if you want to give an operational semantics for memory in IO, why not start from the subset of IO that you can accurately model - the basic IO structure together with a set of primitives - and define the semantics for that? That's typically what we do when we're talking about something in the IO monad.
Yes, indeed, that has been the approach for decades (e.g. Concurrent Haskell http://research.microsoft.com/en-us/um/people/simonpj/papers/concurrent-hask... and lazy functional state threads http://homepages.dcc.ufmg.br/~camarao/fp/articles/lazy-state.pdf, and the same thing you and I do with other monads like Par http://research.microsoft.com/en-us/um/people/simonpj/papers/parallel/monad-...).

There's something unsatisfying here though -- we love to build model languages that include the effects we want to talk about *at that time* -- perhaps just MVars, just putChar, or just STM. But of course there's a big leap from these small treatments to GHC, with its multitude of moving parts. That's partly why I think Safe Haskell is so great; it's a full language which aims for serious, statically enforced guarantees. Well, almost a full language ;-). You can't run programs in it unless you *extend* the TCB. It's kind of like Rust if you were required to use an "unsafe" block to write main().

Anyway, here's a draft paper that does propose a small model language. It adds a memory model to Haskell plus IORefs/STRefs/TVars, and explains how, when executed on a machine with relaxed TSO memory (store buffers), the IO writes need to be fenced, but the ST and STM ones don't: http://www.cs.indiana.edu/~rrnewton/papers/sc-haskell_draft.pdf

Do we need a memory model? I concur with David Terei's arguments from 5 years ago: https://mail.haskell.org/pipermail/haskell-cafe/2011-May/091963.html. Note however, that the implementation in this paper is TSO only. Implementation on ARM would be a bit different, but the relationship between IO/ST and IO/STM would stay the same.

Ryan Yates & others, I'm curious if you think the treatment of STM is satisfactory. Like ppopp05, we do not model retrying explicitly, and our goal here is to show the interaction of the different effects.

Cheers, -Ryan

Hi Edward,
On Tue, Aug 9, 2016 at 11:58 PM, Edward Kmett wrote:
1.) If you remove IO from being able to be compiled inside Safe code _at all_ most packages I have that bother to expose Safe information will have to stop bothering.
I definitely wouldn't argue for removing it entirely. But it's good to know that there are instances where IO functions get mixed up in safe modules. I'll try to systematically find all of these on hackage, but in the meantime do you have a sample list of modules? My modest starting proposal is marking certain Foreign.* modules as Unsafe rather than Trustworthy. We'll find all the modules affected. But, again, are there any modules you know of offhand that are affected? They should fall into two categories:
1. Safe modules that must become Trustworthy (if they import Foreign bits, but don't expose the ability to corrupt memory to the clients of their APIs).
2. Safe modules that must become Unsafe or be split further into smaller modules.
Obviously (2) is the biggest source of potential disruption. I wouldn't ask anyone to accept a patch on GHC until we'd explored these impacts pretty thoroughly.
I'd have to cut up too many APIs into too many fine-grained pieces.
Yeah, the module-level business is pretty annoying. 'vector' removed ".Safe" modules and no one has gotten around to adding the ".Unsafe".
2.) Assuming instead that you're talking about a stronger-than-Safe additional language extension, say ReallySafe or SafeIO, it all comes down to what the user is allowed to do in IO, doesn't it? What effects are users granted access to? We don't have a very fine-grained system for IO-effect management, and it seems pretty much any choice you pick for what to put in the sandbox will be wrong for some users, so you'd need some sort of pragma for each IO operation saying what bins it falls into and to track that while type checking, etc.
Well, *maybe* it is a slippery slope that leads to a full effect system. But I'd like to see these issues enumerated. Does memory safety as a goal really involve so many different effects? Do you think there will be 1, 3, 10, or 100 things beyond Foreign.Ptr to worry about?
3.) On the other hand, someone could _build_ an effect system in Haskell that happens to sit on top of IO, holding effects in an HList, undischarged nullary class constraint, etc.
Well, sure, I hope we will continue to aim for this as well. This is effectively what we do with our "LVish" Par monad, where we use Safe Haskell to ensure users cannot break the effect system in -XSafe code. Best, -Ryan

Hi Ryan,
I have similar concerns with safety and STM. In particular, lazy
validation allows for execution after inconsistent reads from TVars. The
obvious problem to avoid is falling into an infinite loop. As long as
-fno-omit-yields is used (on every module?) and maybe some other conditions
like eventually GC happens, the transaction will be validated and killed
off. But other problems can happen. Consider a transactional hash table
with an array and some hashes already computed to be inside the array. If
an execution sees updated hashes, but not the updated array, then using
unsafeRead could lead to a segfault. I don't think this is completely
obvious, especially when people will reach for STM precisely to avoid this
sort of problem. I worry about code that abstracts over mutable variables
that work given sequential execution, but could fail with STM. ByteString
can lead to similar issues. Data.ByteString.replicate can be asked to
allocate very large pinned data leading to immediate heap overflow. But if
the request is from an inconsistent view of data it seems the programmer
has already done their due diligence in preventing this from happening!
Anyway, I would like to work toward reasoning about these things more
precisely.
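A small sketch of the shape of the hazard described above. This demo uses a consistent snapshot, so it is well behaved; the comments mark where an inconsistent snapshot under lazy validation would bite:

```
-- Hedged sketch of the STM hazard: a transaction reads a size TVar
-- and then does an unchecked array read at an index justified only by
-- that size. Under lazy validation an inconsistent snapshot could
-- pair a new (larger) size with an old (smaller) array, and unsafeAt
-- would then read out of bounds.
import Control.Concurrent.STM
import Data.Array
import Data.Array.Base (unsafeAt)

lookupLast :: TVar Int -> TVar (Array Int Int) -> STM Int
lookupLast sizeVar arrVar = do
  n   <- readTVar sizeVar   -- may reflect a newer write...
  arr <- readTVar arrVar    -- ...than the array we see here
  -- unsafeAt skips the bounds check that would catch the mismatch
  pure (unsafeAt arr (n - 1))

main :: IO ()
main = do
  sizeVar <- newTVarIO 3
  arrVar  <- newTVarIO (listArray (0, 2) [10, 20, 30])
  x <- atomically (lookupLast sizeVar arrVar)
  print x
```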

As for a sample list of modules, let's just start with your very first
example, Foreign.Ptr:
In and of itself nothing in Foreign.Ptr is unsafe! It allows a bit of
arithmetic on a type you can't actually use with anything, and provides an
IO action mixed into an otherwise pure module that happens to create a
FunPtr slot from a Haskell function. In fact this module is a textbook
example of an otherwise perfectly cromulent Trustworthy module today that
happens to have a single IO action in it.
I can grab Ptr from it, use its Storable instance to make a default
signature for other safe code and still be perfectly safe.
It gives no tools for manipulating the contents of the Ptr. It is no more
dangerous than an Int with a phantom type argument.
You could randomly declare that this module is Unsafe because it combines
badly with APIs that would be safe if you could rely on any Ptr T actually
pointing to a T, and that users could then be granted the power to ferry
them around, but we don't trust a user to be able to do that today.
It's the combinators that read/write to a Ptr that are the dangerous
bits, not the pure math.
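For illustration, everything below type-checks under -XSafe today, because Foreign.Ptr alone only permits address arithmetic; nothing is ever dereferenced:

```
{-# LANGUAGE Safe #-}
-- Foreign.Ptr is marked Trustworthy, so Safe code may import it.
-- Without Foreign.Storable's peek/poke there is no way to dereference
-- a Ptr, so this is no more dangerous than arithmetic on an Int with
-- a phantom type argument.
import Foreign.Ptr

main :: IO ()
main = do
  let p = nullPtr :: Ptr Int
      q = p `plusPtr` 64 :: Ptr Int
  print (q `minusPtr` p)        -- pure pointer arithmetic
  print (ptrToIntPtr q == 64)   -- just a number; memory never touched
```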
-Edward

Yes, it is peek and poke that are dangerous. It was Foreign.Storable that
I wanted to mark as Unsafe.
But we do sometimes run into examples where there's an A and a B, and if
you import both, you can make A+B which blows up. So preventing access to
A+B may mean arbitrarily marking one or the other (or both) as Unsafe.
What I was hoping for are examples of modules you have that are Safe
and that import Foreign.Storable.

What about consuming Storable Vectors carefully, or simply working
parameterized over the vector type, where Storable vectors are one of
the options?
-Edward

On 2016-08-12 20:37, Edward Kmett wrote:
What about consuming Storable Vectors carefully, or simply working parameterized over vector type, where Storable vectors are one of the options?
There was actually a great paper about a very similar thing (i.e. "here's the usual interface" vs. "here's how it should be done") recently: http://ozark.hendrix.edu/~yorgey/pub/twisted.pdf

It may be of interest -- at least as far as an "improved Ptr/Storable" goes.

Cheers,
participants (10)
- Bardur Arantsson
- Brandon Allbery
- David Terei
- Edward Kmett
- Edward Z. Yang
- Florian Weimer
- Mikhail Glushenkov
- Ryan Newton
- Ryan Yates
- Simon Marlow