Is Safe Haskell intended to allow segfaults?

We're trying to spend some cycles pushing on Safe Haskell within the stackage packages. (It's looking like a slog.) But we're running up against some basic questions regarding the core packages and Safe Haskell guarantees. The manual currently says: https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/safe_haskell...

*Functions in the IO monad are still allowed and behave as usual.*

As usual? So it is ok to segfault GHC? Elsewhere it says "in the safe language you can trust the types", and I'd always assumed that meant Safe Haskell is a type safe language, even in the IO fragment. Was there an explicit decision to allow segfaults and memory corruption? This can happen not just with FFI calls but with uses of Ptrs within Haskell, for example the following:

```
{-# LANGUAGE Safe #-}
module Main where

import Foreign.Marshal.Alloc
import Foreign.Storable
import Foreign.Ptr
import System.Random

fn :: Ptr Int -> IO ()
fn p = do
  -- This is kosher:
  poke p 3
  print =<< peek p
  -- This should crash the system:
  ix <- randomIO
  pokeElemOff p ix 0xcc

main :: IO ()
main = alloca fn
```

-Ryan

On Mon, Aug 8, 2016 at 1:27 PM, Ryan Newton wrote:
As usual? So it is ok to segfault GHC? Elsewhere it says "in the safe language you can trust the types", and I'd always assumed that meant Safe Haskell is a type safe language, even in the IO fragment.
Pretty sure it's impossible to allow IO without enabling all of it one way or another. And as soon as you allow *any* IO, you're open to various kinds of failures, including segfaults. The only way you will get your type system to prevent that is fully specified dependent typing, both in your program *and in the OS-level interfaces*. I wish you luck on the latter.

--
brandon s allbery kf8nh | sine nomine associates
allbery.b@gmail.com | ballbery@sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad | http://sinenomine.net

Pretty sure it's impossible to allow IO without enabling all of it one way or another. And as soon as you allow *any* IO, you're open to various kinds of failures including segfaults. The only way you will get your type system to prevent that is fully specified dependent typing both in your program *and in the OS level interfaces*. I wish you luck on the latter.
I don't agree. This would seem to imply that type safe / memory safe imperative languages are impossible, and that OCaml/Java/C#/etc. are a false promise! Which APIs and dangers are you thinking of? I'm worried about bounds checks on memory access, and about the capability of foreign code to poke at arbitrary memory. But other APIs like these should be fine:

- IORef API
- MVar API
- STM API
- File IO API
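For illustration, a minimal sketch built only from such APIs (the module name and function are invented): it compiles under {-# LANGUAGE Safe #-}, and by construction nothing in it can address arbitrary memory, unlike the Ptr example above.

```haskell
{-# LANGUAGE Safe #-}
-- Sketch: IO code built only from bounds-safe primitives (IORef, MVar).
-- There is no way to fabricate an address here, so no segfault.
module Main where

import Data.IORef
import Control.Concurrent.MVar

safeCounter :: IO Int
safeCounter = do
  ref  <- newIORef (0 :: Int)
  lock <- newMVar ()                            -- coordination, not raw memory
  withMVar lock $ \_ -> modifyIORef' ref (+ 1)  -- guarded, checked update
  readIORef ref

main :: IO ()
main = safeCounter >>= print
```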

Hello Ryan,

The guarantee that Safe Haskell gives with regard to IO is a little subtle: it is mentioned in Section 3.1 of the paper, and detailed in Section 5.1. Essentially, to use Safe Haskell, you are responsible for defining the type at which untrusted code is to be called. Using an untrusted value at type IO a in main imposes no safety restrictions by design--it's up to the user of Safe Haskell to decide what kind of security properties they need out of user code.

Edward

Excerpts from Ryan Newton's message of 2016-08-08 13:27:16 -0400:

Hi Ed,
Thanks, I went back to the paper and read those sections.
It sounds like this design heavily reflects the untrusted-code use case. I
believe ceding the entire IO monad is too pessimistic, and it's also
against the *spirit* of the type safety guarantee mentioned in the paper:
"Type safety. In Milner’s famous phrase, well-typed programs do not go wrong."
There's not really very much discussion of this "punting on IO" decision in
the paper. The paper mentions not deleting users' files as an example of a
higher-level security policy -- and a reason not to try to lock down IO --
but that's not really the issue here.
Sure, there are many use cases that require an RIO newtype, but the memory
model and memory protection are really internal to the language. My
interest is in a type safe language with a memory model *in the IO monad*.
I think it's quite *reasonable to expect Safe Haskell to be as safe as
Java!* There are hundreds of thousands or millions of lines of code
written in the IO monad [1], and a lot of that code could probably be
memory safe by construction.
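For reference, the RIO pattern referred to above looks roughly like this hand-rolled sketch (module and function names are made up): a newtype over IO whose constructor is hidden, so callers only get the re-exported operations.

```haskell
{-# LANGUAGE Trustworthy #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
-- Hypothetical sketch of the "RIO" pattern: the RIO constructor is not
-- exported, so Safe code can only use what this trusted module offers.
module RIO (RIO, runRIO, rioPutStrLn, rioNewIORef, rioReadIORef) where

import Data.IORef

newtype RIO a = RIO (IO a)
  deriving (Functor, Applicative, Monad)

-- Only the trusted caller gets to run an RIO computation.
runRIO :: RIO a -> IO a
runRIO (RIO io) = io

-- Every permitted operation has to be wrapped and re-exported by hand:
rioPutStrLn :: String -> RIO ()
rioPutStrLn = RIO . putStrLn

rioNewIORef :: a -> RIO (IORef a)
rioNewIORef = RIO . newIORef

rioReadIORef :: IORef a -> RIO a
rioReadIORef = RIO . readIORef
```

The tedium of that last section -- one wrapper per IO function -- is exactly the cost being weighed here.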
One question I have for people is which of the following they would favor:
1. Redefine -XSafe to guarantee something about IO and its memory safety
(and even its memory model)
2. Add a fourth safety level, "SafeExceptIO", that corresponds to the
current guarantees, while extending "Safe" to say something about IO.
3. Leave Safe Haskell as it is.
(2) sounds a bit clunky to me, and I favor (1) most of all.
Best,
-Ryan
[1] 17M lines on hackage total, hard to count how much is in an IO monad or
related monad.

I think what you are proposing is reasonable. I wasn't present when Safe Haskell's design was originally fleshed out, so I don't know why this route wasn't taken.

Edward

Excerpts from Ryan Newton's message of 2016-08-08 16:00:38 -0400:

Thanks for bringing this up Ryan, it's an interesting direction of thought.
It's been a while since we originally designed Safe Haskell, so I can't
remember our thinking very well in this area. I believe it came down
to the following:
* Our primary concern was to be able to 'trust the types', which is
perhaps a little weaker than 'type safety'. So IO can do ugly things,
but at least we know that, and we know what something without IO, or
something using a restricted IO monad, can do.
* Our intuition was that offering stronger guarantees about IO would
be very difficult -- I don't remember how far we explored this
intuition.
* We were also motivated very strongly by the security use case, which
may sadly have blinded us a little to something more ambitious, as you
propose.
If you have the energy, it'd be great to put some of this thinking
into a wiki page (https://ghc.haskell.org/trac/ghc/wiki/SafeHaskell)
and flesh out a first approximation of which IO APIs cause issues. Is
it just Ptr not carrying bounds around with it? Maybe, but then we
need to secure how Ptrs can be created, which excludes FFI functions
returning Ptrs.
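To make the bounds question concrete, a hypothetical sketch of what a bounds-carrying pointer could look like (the names BoundedPtr, allocaBounded, etc. are invented): the constructor stays hidden, and every dereference is range-checked.

```haskell
{-# LANGUAGE Trustworthy #-}
-- Hypothetical sketch of a Ptr that carries its bounds. The constructor
-- is not exported, so the only way to obtain one is via allocaBounded.
module BoundedPtr (BoundedPtr, allocaBounded, pokeElem, peekElem) where

import Foreign.Marshal.Array (allocaArray)
import Foreign.Ptr (Ptr)
import Foreign.Storable (Storable, peekElemOff, pokeElemOff)

data BoundedPtr a = BoundedPtr !(Ptr a) !Int   -- pointer plus element count

allocaBounded :: Storable a => Int -> (BoundedPtr a -> IO b) -> IO b
allocaBounded n k = allocaArray n (\p -> k (BoundedPtr p n))

-- Reject any index outside [0, n); a checked failure, not a segfault.
check :: Int -> BoundedPtr a -> IO ()
check i (BoundedPtr _ n)
  | i < 0 || i >= n = ioError (userError "BoundedPtr: index out of range")
  | otherwise       = pure ()

pokeElem :: Storable a => BoundedPtr a -> Int -> a -> IO ()
pokeElem bp@(BoundedPtr p _) i x = check i bp >> pokeElemOff p i x

peekElem :: Storable a => BoundedPtr a -> Int -> IO a
peekElem bp@(BoundedPtr p _) i = check i bp >> peekElemOff p i
```

As noted, this only helps if FFI code cannot hand back a raw Ptr that gets promoted into a BoundedPtr unchecked.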
I imagine that in Java I can construct an invalid pointer in foreign
code and then cause segfaults without the Java code having any
issues. Just guessing at this, so I'm very interested to know how it's
prevented if I can't.
Securing the memory model may be very challenging. E.g., as Russ Cox
points out, in Go one can use a data race to break type safety. Does
GHC-Haskell have any similar issues? What performance impact would
result from fixing these edge cases?
http://research.swtch.com/gorace
I think your (1) proposal is reasonable and desirable, but we'd really
want to understand the difficulty further to guide us between 1--3.
Cheers,
David

Hi,
On 9 August 2016 at 01:32, David Terei wrote:
I imagine in Java, that I can construct an invalid pointer in foreign code, and then cause segfaults without the Java code having any issues. Just guessing at this, so very interested to know how it's prevented if I can't.
Yes, this can be done with JNI, see e.g. [1]. Additionally, by using sun.misc.Unsafe [2], one can cause segfaults even from pure Java.

[1] https://www.cs.princeton.edu/~appel/papers/safejni.pdf
[2] http://www.inf.usi.ch/faculty/lanza/Downloads/Mast2015a.pdf

* Mikhail Glushenkov:
Yes, this can be done with JNI, see e.g. [1]. Additionally, by using sun.misc.Unsafe [2], one can cause segfaults even from pure Java.
You can also bring the JVM into an unstable state by triggering a VirtualMachineError, basically an out-of-memory condition, a stack overflow, or any other unrecoverable error. You can also exhaust limited resources such as file descriptors pretty easily.

On Mon, Aug 8, 2016 at 8:46 PM, Mikhail Glushenkov wrote:
Yes, this can be done with JNI, see e.g. [1]. Additionally, by using sun.misc.Unsafe [2], one can cause segfaults even from pure Java. [1] https://www.cs.princeton.edu/~appel/papers/safejni.pdf [2] http://www.inf.usi.ch/faculty/lanza/Downloads/Mast2015a.pdf
Ah, I see. I thought that, ruling out FFI, you couldn't segfault with
pure Java code. Good to know about the unsafe interface.
On Mon, Aug 8, 2016 at 7:32 PM, David Terei wrote:
If you have the energy, it'd be great to put some of this thinking into a wiki page (https://ghc.haskell.org/trac/ghc/wiki/SafeHaskell) and flesh out a first approximation of what IO API's cause issues. Is it just Ptr not carrying bounds around with it? Maybe, but then we need to secure how Ptr's can be created, which excludes FFI returning Ptr's.
Yes, we can add a wiki page.

Btw, I was thinking more of kicking FFI/peek/poke outside of the Safe bubble, not changing their operational behavior. First of all, it's nigh impossible to lock down FFI calls.

When someone, e.g. Bob Harper https://existentialtype.wordpress.com/2012/08/14/haskell-is-exceptionally-un..., points out a problem in Haskell, we sometimes respond "hey, *Safe Haskell* is the real objet d'art! It's a safe language." Yet it isn't really a full *language* if people cannot write and run programs in it! (Because every program must ultimately be `main :: IO ()`.) Kicking out segfaulty features would still leave a safe language that people can write complete programs in.

Best,
-Ryan

If you can execute subprocesses, you could always spawn gdb to attach via ptrace() to the parent process and then poke around memory. Yes, this is a "dumb" example, but I think it goes to show how important it is to correctly characterize what the threat model is. A "no-segfault" fragment of Haskell doesn't seem so much a security feature as a "good practices" lint pass.

Edward

Excerpts from Ryan Newton's message of 2016-08-09 10:41:44 -0400:

On Tue, Aug 9, 2016 at 4:19 PM, Edward Z. Yang wrote:
If you can execute subprocesses, you could always spawn gdb to attach via ptrace() to the parent process and then poke around memory.
Don't even need that if you're just talking segfaults; you can always spawn a subprocess "kill -SEGV $PPID" :) Unless you have full control over all the code that could be run in subprocesses, it's not going to be safe, much less Safe.

Heh, ok, segfaults themselves are a red herring. More precisely:
The operational semantics for a SafeIO language should always accurately
model its memory state. The application should not compute (take a step in
the semantics) in a way that exposes corrupt memory or arbitrary undefined
behavior. Nor should it violate the memory model.
Moving immediately to the "Terminated" state is fine, whether it be due to
out-of-memory, kill -SEGV, cosmic rays, or hardware failure. An FFI call
that corrupts memory is bad (may result in arbitrary behavior, not just
termination) as is ptrace'ing.
Naturally, all Unsafe code is part of the TCB, as are the OS and GHC, and
low-level data structure libs and bindings. It's a big TCB. Still, it would
be something to be able to write an app that doesn't automatically get added
to this TCB just by virtue of being an *app* (main :: IO ()).

I've always treated Safe Haskell as "Safe until you allow IO" -- in that
all 'evil' things get tainted by an IO type that you can't get rid of by
the usual means. So if you go to run pure Safe Haskell code in, say,
lambdabot, which doesn't give the user a means to execute IO, it can't
segfault if all of the Trustworthy modules you depend upon actually are
trustworthy.
Trying to shore up segfault safety under Safe in IO seems like a losing
proposition.
-Edward
_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

I'm hearing that Safe Haskell is great for pure use cases (lambdabot).
But that doesn't depend on being able to write arbitrary IO code inside the
Safe bubble, does it? In fact *all* of IO could be outside the safe
boundary for this use case, could it not? Are there any existing cases
where it is important to be able to build up unsafe IO values inside -XSafe
code?
Edward, why does it seem like a losing proposition? Are there further
problems that come to mind? ezyang mentioned the subprocess problem. I
don't have a strong opinion on that one. But I tend to think the safe IO
language *should* allow subprocess calls, and it's a matter of configuring
your OS to not allow ptrace in that situation. This would be part of a set
of requirements for how to compile and launch a complete "Safe Haskell"
*program* in order to get a guarantee.
My primary interest is actually not segfault-freedom per se, but being
able to define a memory model for Safe Haskell (for which I'd suggest
sequential consistency). FFI undermines that, and peek/poke seem like they
should cluster with FFI as unsafe features. I'm not inclined to give a
memory model to peek or FFI -- at that level you get what the architecture
gives you -- but I do want a memory model for IORefs, IOVectors, etc.
We're poking at the Stackage package set now to figure out what pressure
points to push on to increase the percentage of Stackage that is Safe. I'll
be able to say more when we have more data on dependencies and problem
points. Across all of Hackage, Safe Haskell has modest use: of the ~100K
modules on Hackage, ~636 are marked Safe, ~874 Trustworthy, and ~118
Unsafe. It should be easy to check whether any of this Safe code is currently
importing "Foreign.*" or using FFI.
My general plea is that we not give the imperative partition of Haskell
the short end of the stick [1]. There is oodles of code in IO (or
MonadIO), and probably relatively little in "RIO". To my knowledge, we
don't have great ways to coin "RIO" newtypes without having to wrap and
re-export rather a lot of IO functions. Maybe if APIs like MVars or files
were overloaded in a class, then GND (GeneralizedNewtypeDeriving) could do
some of the work...
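To illustrate that last idea, a hypothetical sketch of an MVar API overloaded in a class (the class and module names are invented): an RIO-style newtype then inherits the whole API with one deriving clause instead of one wrapper per function.

```haskell
{-# LANGUAGE Trustworthy #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
-- Hypothetical sketch: an MVar API overloaded in a class, so that a
-- restricted-IO newtype can pick it up via GeneralizedNewtypeDeriving.
module MonadMVar where

import Control.Concurrent.MVar (MVar)
import qualified Control.Concurrent.MVar as M

class Monad m => MonadMVar m where
  newMVar  :: a -> m (MVar a)
  takeMVar :: MVar a -> m a
  putMVar  :: MVar a -> a -> m ()

instance MonadMVar IO where
  newMVar  = M.newMVar
  takeMVar = M.takeMVar
  putMVar  = M.putMVar

-- One deriving clause lifts the whole API onto the restricted monad:
newtype RIO a = RIO (IO a)
  deriving (Functor, Applicative, Monad, MonadMVar)

runRIO :: RIO a -> IO a
runRIO (RIO io) = io
```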
-Ryan
[1] In safety guarantees, in optimizations, primops, whatever... For
instance, I find in microbenchmarks that IO code still runs 2X slower than
pure code, even if no IO effects are performed.

I see three major stories here:
1.) If you remove IO from being able to be compiled inside Safe code _at
all_ most packages I have that bother to expose Safe information will have
to stop bothering. I'd have to cut up too many APIs into too many
fine-grained pieces. This would considerably reduce the utility of Safe
Haskell to me. Many of them expose a few combinators here and there that
happen to live in IO and I can view offering Safe or Trustworthy to users
as a 'the pure stuff looks really pure' guarantee. For the most part it
'just works' and Trustworthy annotations can be put in when I know the
semantics of the hacks I'm using under the hood.
2.) Assuming instead that you're talking about a stronger-than-Safe
additional language extension, say ReallySafe or SafeIO, it all comes down
to what the user is allowed to do in IO, doesn't it? What effects are users
granted access to? We don't have a very fine-grained system for IO-effect
management, and it seems pretty much any choice you pick for what to put in
the sandbox will be wrong for some users, so you'd need some sort of pragma
for each IO operation saying what bins it falls into and to track that
while type checking, etc. At least then you could say what you are safe
with respect to. That all seems to be rather a big mess, roughly equivalent
to modeling an effect system for IO operations, and then retroactively
categorizing everything, putting a big burden on maintainers and requiring
a lot of community buy-in, sight unseen.
3.) On the other hand, someone could _build_ an effect system in Haskell
that happens to sit on top of IO -- holding effects in an HList, an
undischarged nullary class constraint, etc. -- then pull a couple of
Trustworthy modules around it for embedding the effects they want to
permit, and build this today without any compiler support; they'd just
have to make a final application-specific Trustworthy wrapper to run
whatever effects they want to permit into their program. It is more
invasive to the code in question, but it requires zero community
organizing and we've already got all the compiler mojo we need. The
downside is the Trustworthy wrappers at the bottom of the heap, and that
it doesn't interoperate with basically anything already written.
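For illustration, a minimal sketch of this style using an mtl-style effect class rather than an HList (all names are invented): untrusted code is written against the class, and the trusted wrapper chooses the carrier monad, and hence which real IO effects are reachable.

```haskell
{-# LANGUAGE Trustworthy #-}
-- Hypothetical sketch of option (3): effects as a type class on top of
-- IO. Untrusted code can only perform operations the class offers.
module Effect (Teletype(..), echo) where

class Monad m => Teletype m where
  readLn'  :: m String
  writeLn' :: String -> m ()

-- "Untrusted" code, parametric in the carrier monad:
echo :: Teletype m => m ()
echo = readLn' >>= writeLn'

-- One trusted carrier runs the effects in real IO; a sandboxed or
-- logging interpreter could be substituted without touching 'echo'.
instance Teletype IO where
  readLn'  = getLine
  writeLn' = putStrLn
```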
-Edward
On Tue, Aug 9, 2016 at 10:45 PM, Ryan Newton
I'm hearing that Safe Haskell is great for pure use cases (lambda bot). But that doesn't depend on being able to write arbitrary IO code inside the Safe bubble, does it? In fact *all* of IO could be outside the safe boundary for this use case, could it not? Are there any existing cases where it is important to be able to build up unsafe IO values inside -XSafe code?
Edward, why does it seem like a losing proposition? Are there further problems that come to mind? ezyang mentioned the subprocess problem. I don't have a strong opinion on that one. But I tend to think the safe IO language *should* allow subprocess calls, and its a matter of configuring your OS to not allow ptrace in that situation. This would be part of a set of requirements for how to compile and launch a complete "Safe Haskell" *program* in order to get a guarantee.
My primary interest is actually not segfault-freedom per se, but being able to define a memory model for Safe Haskell (for which I'd suggest sequential consistency). FFI undermines that, and peek/poke seems like it should cluster with FFI as an unsafe feature. I'm not inclined to give a memory model to peek or FFI -- at that level you get what the architecture gives you -- but I do want a memory model for IORefs, IOVectors, etc.
We're poking at the Stackage package set now to figure out what pressure point to push on to increase the percentage of Stackage that is Safe. I'll be able to say more when we have more data on dependencies and problem points. Across all of Hackage, Safe Haskell has modest use: of the ~100K modules on Hackage, ~636 are marked Safe, ~874 Trustworthy, and ~118 Unsafe. It should be easy to check if any of this Safe code is currently importing "Foreign.*" or using FFI.
My general plea is that we not give the imperative partition of Haskell too much the short end of the stick [1]. There are oodles of code in IO (or MonadIO), and probably relatively little in "RIO". To my knowledge, we don't have great ways to coin "RIO" newtypes without having to wrap and reexport rather a lot of IO functions. Maybe if APIs like MVars or files were overloaded in a class then GND could do some of the work...
-Ryan
[1] In safety guarantees, in optimizations, primops, whatever... For instance, I find in microbenchmarks that IO code still runs 2X slower than pure code, even if no IO effects are performed.
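The class-plus-GND idea in the paragraph above can be sketched today. MonadMVar below is a hypothetical class invented for this example, not an existing API:

```
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
-- Hedged sketch: if MVar operations were overloaded in a class,
-- GeneralizedNewtypeDeriving could coin an "RIO" newtype without
-- re-wrapping each function by hand. MonadMVar is hypothetical.
import Control.Concurrent.MVar

class Monad m => MonadMVar m where
  newMV  :: a -> m (MVar a)
  takeMV :: MVar a -> m a
  putMV  :: MVar a -> a -> m ()

instance MonadMVar IO where
  newMV  = newMVar
  takeMV = takeMVar
  putMV  = putMVar

-- The restricted monad gets the whole MVar API for free via GND.
newtype RIO a = RIO (IO a)
  deriving (Functor, Applicative, Monad, MonadMVar)

runRIO :: RIO a -> IO a
runRIO (RIO io) = io

main :: IO ()
main = do
  x <- runRIO $ do
    mv <- newMV (1 :: Int)
    v  <- takeMV mv
    putMV mv (v + 2)
    takeMV mv
  print x
```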
On Tue, Aug 9, 2016 at 5:13 PM, Edward Kmett wrote:
I've always treated Safe Haskell as "Safe until you allow IO" -- in that all 'evil' things get tainted by an IO type that you can't get rid of by the usual means. So if you go to run pure Safe Haskell code in, say, lambdabot, which doesn't give the user a means to execute IO, it can't segfault if all of the Trustworthy modules you depend upon actually are trustworthy.
Trying to shore up segfault safety under Safe in IO seems like a losing proposition.
-Edward
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Right - Safe Haskell provides the minimum that you need to be able to
safely run untrusted code: the ability to trust the type system and the
module system. Defining a safe subset of IO is usually an
application-specific decision, e.g. do you want to allow access to the
filesystem but without allowing openFile "/dev/mem"? It's a minefield.
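A toy illustration of why this is application-specific: even a tiny whitelist-a-directory policy (the SafeIO and readUnder names are hypothetical) immediately runs into questions about symlinks, "..", and special files, which is the minefield:

```
-- Hedged sketch of an application-specific "safe IO": file reads are
-- permitted only under a whitelisted directory. Even this simple
-- policy has holes (symlinks, /proc, hard links), illustrating why a
-- one-size-fits-all safe subset of IO is hard to define.
import Data.List (isInfixOf, isPrefixOf)

newtype SafeIO a = SafeIO { runSafeIO :: IO a }

readUnder :: FilePath -> FilePath -> SafeIO (Either String String)
readUnder root path
  | not (root `isPrefixOf` path) = SafeIO (pure (Left "outside sandbox"))
  | ".." `isInfixOf` path        = SafeIO (pure (Left "path traversal"))
  | otherwise                    = SafeIO (Right <$> readFile path)

main :: IO ()
main = do
  -- openFile "/dev/mem" is rejected by the prefix check...
  r <- runSafeIO (readUnder "/tmp" "/dev/mem")
  print r
```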
Ryan, if you want to give an operational semantics for memory in IO, why
not start from the subset of IO that you can accurately model - the basic
IO structure together with a set of primitives - and define the semantics
for that? That's typically what we do when we're talking about something
in the IO monad.
Cheers
Simon
On 10 August 2016 at 04:58, Edward Kmett wrote:
I see three major stories here:
1.) If you remove IO from being able to be compiled inside Safe code _at all_ most packages I have that bother to expose Safe information will have to stop bothering. I'd have to cut up too many APIs into too many fine-grained pieces. This would considerably reduce the utility of Safe Haskell to me. Many of them expose a few combinators here and there that happen to live in IO and I can view offering Safe or Trustworthy to users as a 'the pure stuff looks really pure' guarantee. For the most part it 'just works' and Trustworthy annotations can be put in when I know the semantics of the hacks I'm using under the hood.
2.) Assuming instead that you're talking about a stronger-than-Safe additional language extension, say ReallySafe or SafeIO, it all comes down to what the user is allowed to do in IO, doesn't it? What effects are users granted access to? We don't have a very fine-grained system for IO-effect management, and it seems pretty much any choice you pick for what to put in the sandbox will be wrong for some users, so you'd need some sort of pragma for each IO operation saying what bins it falls into and to track that while type checking, etc. At least then you could say what you are safe with respect to. That all seems to be rather a big mess, roughly equivalent to modeling an effect system for IO operations, and then retroactively categorizing everything, putting a big burden on maintainers and requiring a lot of community buy-in, sight unseen.
3.) On the other hand, someone could _build_ an effect system in Haskell that happens to sit on top of IO, holding effects in an HList, undischarged nullary class constraint, etc. then pull a couple of Trustworthy modules around it for embedding the effects they want to permit and build this today without any compiler support, they'd just have to make a final application-specific Trustworthy wrapper to run whatever effects they want to permit into their program. It is more invasive to the code in question, but it requires zero community organizing and we've already got all the compiler mojo we need. The downside is the Trustworthy wrappers at the bottom of the heap and that it doesn't interoperate with basically anything already written.
-Edward

Hi Simon,
On Wed, Aug 10, 2016 at 4:24 AM, Simon Marlow wrote:
Defining a safe subset of IO is usually an application-specific decision, e.g. do you want to allow access to the filesystem but without allowing openFile "/dev/mem"? It's a minefield.
I respect your intuition that this is a "minefield" (and Edward's "big mess"), but I still want to unpack this further. I would propose that a MemSafe app is a two-part contract that is half an OS responsibility and half language/runtime responsibility.
1. OS: guarantee that all system calls (including file access) from the process do not violate the process's memory, outside of certain opt-in DMA regions.
2. Lang/runtime: create a binary "ensuring" that all instructions executed in the process maintain the type safety of memory and follow the memory model. (Scare quotes due to large TCB.)
With this division, "/dev/mem" and subprocess/ptrace are definitely the job of the OS or containerization. Blame for a bug would fall to category 1. A MemSafe app needs a launcher/harness to check that category 1 is enforced properly. I'm not proposing that GHC needs to generate that. It should be separate.

Ryan, if you want to give an operational semantics for memory in IO, why not start from the subset of IO that you can accurately model - the basic IO structure together with a set of primitives - and define the semantics for that? That's typically what we do when we're talking about something in the IO monad.
Yes, indeed, that has been the approach for decades (e.g. Concurrent Haskell http://research.microsoft.com/en-us/um/people/simonpj/papers/concurrent-hask... and lazy functional state threads http://homepages.dcc.ufmg.br/~camarao/fp/articles/lazy-state.pdf, and the same thing you and I do with other monads like Par http://research.microsoft.com/en-us/um/people/simonpj/papers/parallel/monad-...).

There's something unsatisfying here though -- we love to build model languages that include the effects we want to talk about *at that time* -- perhaps just MVars, just putChar, or just STM. But of course there's a big leap from these small treatments to GHC, with its multitude of moving parts. That's partly why I think Safe Haskell is so great; it's a full language which aims for serious, statically enforced guarantees. Well, almost a full language ;-). You can't run programs in it unless you *extend* the TCB. It's kind of like Rust if you were required to use an "unsafe" block to write main().

Anyway, here's a draft paper that does propose a small model language. It adds a memory model to Haskell plus IORefs/STRefs/TVars, and explains how, when executed on a machine with relaxed TSO memory (store buffers), the IO writes need to be fenced, but the ST and STM ones don't: http://www.cs.indiana.edu/~rrnewton/papers/sc-haskell_draft.pdf

Do we need a memory model? I concur with David Terei's arguments from 5 years ago: https://mail.haskell.org/pipermail/haskell-cafe/2011-May/091963.html. Note however, that the implementation in this paper is TSO only. Implementation on ARM would be a bit different, but the relationship between IO/ST and IO/STM would stay the same.

Ryan Yates & others, I'm curious if you think the treatment of STM is satisfactory. Like ppopp05, we do not model retrying explicitly, and our goal here is to show the interaction of the different effects.

Cheers, -Ryan

Hi Edward,
On Tue, Aug 9, 2016 at 11:58 PM, Edward Kmett wrote:
1.) If you remove IO from being able to be compiled inside Safe code _at all_ most packages I have that bother to expose Safe information will have to stop bothering.
I definitely wouldn't argue for removing it entirely. But it's good to know that there are instances where IO functions get mixed up in safe modules. I'll try to systematically find all of these on hackage, but in the meantime do you have a sample list of modules? My modest starting proposal is marking certain Foreign.* modules as Unsafe rather than Trustworthy. We'll find all the modules affected. But, again, are there any modules you know of offhand that are affected? They should fall into two categories:
1. Safe modules that must become Trustworthy (if they import Foreign bits, but don't expose the ability to corrupt memory to the clients of their APIs).
2. Safe modules that must become Unsafe or be split further into smaller modules.
Obviously (2) is the biggest source of potential disruption. I wouldn't ask anyone to accept a patch on GHC until we'd explored these impacts pretty thoroughly.
I'd have to cut up too many APIs into too many fine-grained pieces.
Yeah, the module-level business is pretty annoying. 'vector' removed ".Safe" modules and no one has gotten around to adding the ".Unsafe".
2.) Assuming instead that you're talking about a stronger-than-Safe additional language extension, say ReallySafe or SafeIO, it all comes down to what the user is allowed to do in IO, doesn't it? What effects are users granted access to? We don't have a very fine-grained system for IO-effect management, and it seems pretty much any choice you pick for what to put in the sandbox will be wrong for some users, so you'd need some sort of pragma for each IO operation saying what bins it falls into and to track that while type checking, etc.
Well, *maybe* it is a slippery slope that leads to a full effect system. But I'd like to see these issues enumerated. Does memory safety as a goal really involve so many different effects? Do you think there will be 1, 3, 10, or 100 things beyond Foreign.Ptr to worry about?
3.) On the other hand, someone could _build_ an effect system in Haskell that happens to sit on top of IO, holding effects in an HList, undischarged nullary class constraint, etc.
Well, sure, I hope we will continue to aim for this as well. This is effectively what we do with our "LVish" Par monad, where we use Safe Haskell to ensure users cannot break the effect system in -XSafe code. Best, -Ryan

Hi Ryan,
I have similar concerns with safety and STM. In particular, lazy
validation allows for execution after inconsistent reads from TVars. The
obvious problem to avoid is falling into an infinite loop. As long as
-fno-omit-yields is used (on every module?) and maybe some other conditions
like eventually GC happens, the transaction will be validated and killed
off. But other problems can happen. Consider a transactional hash table
with an array and some hashes already computed to be inside the array. If
an execution sees updated hashes, but not the updated array, then using
unsafeRead could lead to a segfault. I don't think this is completely
obvious, especially when people will reach for STM precisely to avoid this
sort of problem. I worry about code that abstracts over mutable variables
that work given sequential execution, but could fail with STM. ByteString
can lead to similar issues. Data.ByteString.replicate can be asked to
allocate very large pinned data leading to immediate heap overflow. But if
the request is from an inconsistent view of data it seems the programmer
has already done their due diligence in preventing this from happening!
Anyway, I would like to work toward reasoning about these things more
precisely.
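A small sketch of the shape of the hazard described above. This demo uses a consistent snapshot, so it is well behaved; the comments mark where an inconsistent snapshot under lazy validation would bite:

```
-- Hedged sketch of the STM hazard: a transaction reads a size TVar
-- and then does an unchecked array read at an index justified only by
-- that size. Under lazy validation an inconsistent snapshot could
-- pair a new (larger) size with an old (smaller) array, and unsafeAt
-- would then read out of bounds.
import Control.Concurrent.STM
import Data.Array
import Data.Array.Base (unsafeAt)

lookupLast :: TVar Int -> TVar (Array Int Int) -> STM Int
lookupLast sizeVar arrVar = do
  n   <- readTVar sizeVar   -- may reflect a newer write...
  arr <- readTVar arrVar    -- ...than the array we see here
  -- unsafeAt skips the bounds check that would catch the mismatch
  pure (unsafeAt arr (n - 1))

main :: IO ()
main = do
  sizeVar <- newTVarIO 3
  arrVar  <- newTVarIO (listArray (0, 2) [10, 20, 30])
  x <- atomically (lookupLast sizeVar arrVar)
  print x
```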

As for a sample list of modules, let's just start with your very first
example, Foreign.Ptr:
In and of itself nothing in Foreign.Ptr is unsafe! It allows a bit of
arithmetic on a type you can't actually use with anything, and provides an
IO action mixed into an otherwise pure module that happens to create a
FunPtr slot from a Haskell function. In fact this module is a textbook
example of an otherwise perfectly cromulent Trustworthy module today that
happens to have a single IO action in it.
I can grab Ptr from it, use its Storable instance to make a default
signature for other safe code and still be perfectly safe.
It gives no tools for manipulating the contents of the Ptr. It is no more
dangerous than an Int with a phantom type argument.
You could randomly declare that this module is Unsafe because it combines
badly with APIs that would be safe if you could rely on any Ptr T actually
pointing to a T, and that users could then be granted the power to ferry
them around, but we don't trust a user to be able to do that today.
It's the combinators that read/write to a Ptr that are the dangerous
bits, not the pure math.
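For illustration, everything below type-checks under -XSafe today, because Foreign.Ptr alone only permits address arithmetic; nothing is ever dereferenced:

```
{-# LANGUAGE Safe #-}
-- Foreign.Ptr is marked Trustworthy, so Safe code may import it.
-- Without Foreign.Storable's peek/poke there is no way to dereference
-- a Ptr, so this is no more dangerous than arithmetic on an Int with
-- a phantom type argument.
import Foreign.Ptr

main :: IO ()
main = do
  let p = nullPtr :: Ptr Int
      q = p `plusPtr` 64 :: Ptr Int
  print (q `minusPtr` p)        -- pure pointer arithmetic
  print (ptrToIntPtr q == 64)   -- just a number; memory never touched
```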
-Edward

Yes, it is peek and poke that are dangerous. It was Foreign.Storable that
I wanted to mark as Unsafe.
But we do sometimes run into examples where there's an A and a B, and if
you import both, you can make A+B which blows up. So preventing access to
A+B may mean arbitrarily marking one or the other (or both) as Unsafe.
What I was hoping for are examples of modules you have that are Safe
and that import Foreign.Storable.

What about consuming Storable Vectors carefully, or simply working
parameterized over the vector type, where Storable vectors are one of
the options?
-Edward

On 2016-08-12 20:37, Edward Kmett wrote:
What about consuming Storable Vectors carefully, or simply working parameterized over vector type, where Storable vectors are one of the options?
There was actually a great paper about a very similar thing (i.e. "here's the usual interface" vs. "here's how it should be done") recently: http://ozark.hendrix.edu/~yorgey/pub/twisted.pdf

It may be of interest -- at least as far as an "improved Ptr/Storable" goes.

Cheers,
participants (10)
- Bardur Arantsson
- Brandon Allbery
- David Terei
- Edward Kmett
- Edward Z. Yang
- Florian Weimer
- Mikhail Glushenkov
- Ryan Newton
- Ryan Yates
- Simon Marlow