
Hi café,

Do you use Safe Haskell? Do you know someone who does? If you do, which of Safe Haskell's guarantees do you rely on?

Here, a user of Safe Haskell is someone who relies on any guarantees that Safe Haskell provides, not someone who makes sure to have the right pragmas, etc., in their library so that users can import it Safely.

Context: Safe Haskell is not lightweight to support within GHC and the ecosystem. Despite being a formidable research project with a (in my opinion) quite worthwhile goal, it's unclear which of Safe Haskell's purported guarantees are actually guaranteed by GHC. (The lack of unsafeCoerce is not actually guaranteed: https://gitlab.haskell.org/ghc/ghc/-/issues/9562.) Recent design questions about what should be Safe and what shouldn't be (somehow cannot find the discussion after a few minutes of searching; perhaps fill this in) have been answered only by stabs in the dark. The status quo is causing pain: https://gitlab.haskell.org/ghc/ghc/-/issues/19590. There are hundreds (maybe thousands) of lines of delicate logic within GHC to support Safe Haskell. These parts of GHC have to be read, understood, and maintained by people with limited time.

I thus wonder about deprecating and eventually removing Safe Haskell. I don't have a concrete plan for how to do this yet, but I'm confident we could come up with a migration strategy.

The set of people who would win by removing Safe Haskell is easy enough to discover. But this email is intended to discover who would be harmed by doing so. If you know, speak up. Otherwise, I expect I will write up a GHC proposal to remove the feature.

Thanks,
Richard
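(For readers unfamiliar with the pragmas in question, here is a minimal sketch; the module name is invented:)

    {-# LANGUAGE Safe #-}
    -- A library author opts in by marking a module Safe. GHC then
    -- rejects imports of unsafe modules (such as System.IO.Unsafe)
    -- and forbids unsafe features within this module. The related
    -- pragmas are Trustworthy (the author vouches for safety) and
    -- Unsafe (explicitly opting out).
    module Pure.Sums (total) where

    total :: [Int] -> Int
    total = sum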

I do not object!
I did use SafeHaskell long ago, to provide a server that executed student
code in a class. It lasted for a year or so, and caused me great anxiety
because I was never entirely sure that it was really safe. At a minimum,
it required a bunch of extra logic at build time, careful monitoring of
resource usage, etc. I now believe that this use case is far better served
by virtualization, which is now a quite well-supported feature across all
major operating systems. It would be a big mistake for someone today to
try to accomplish with SafeHaskell what is better accomplished with a
virtual machine.
On Fri, Apr 16, 2021 at 3:07 PM Richard Eisenberg wrote:
Hi café,
Do you use Safe Haskell? Do you know someone who does? If you do, which of Safe Haskell's guarantees do you rely on?
Here, a user of Safe Haskell is someone who relies on any guarantees that Safe Haskell provides, not someone who makes sure to have the right pragmas, etc., in their library so that users can import it Safely.
Context: Safe Haskell is not lightweight to support within GHC and the ecosystem. Despite being a formidable research project with a (in my opinion) quite worthwhile goal, it's unclear which of Safe Haskell's purported guarantees are actually guaranteed by GHC. (The lack of unsafeCoerce is not actually guaranteed: https://gitlab.haskell.org/ghc/ghc/-/issues/9562.) Recent design questions about what should be Safe and what shouldn't be (somehow cannot find the discussion after a few minutes of searching; perhaps fill this in) have been answered only by stabs in the dark. The status quo is causing pain: https://gitlab.haskell.org/ghc/ghc/-/issues/19590. There are hundreds (maybe thousands) of lines of delicate logic within GHC to support Safe Haskell. These parts of GHC have to be read, understood, and maintained by people with limited time.
I thus wonder about deprecating and eventually removing Safe Haskell. I don't have a concrete plan for how to do this yet, but I'm confident we could come up with a migration strategy.
The set of people who would win by removing Safe Haskell is easy enough to discover. But this email is intended to discover who would be harmed by doing so. If you know, speak up. Otherwise, I expect I will write up a GHC proposal to remove the feature.
Thanks,
Richard

On Fri, Apr 16, 2021 at 03:31:20PM -0400, Chris Smith wrote:
I did use SafeHaskell long ago, to provide a server that executed student code in a class ... I now believe that this use case is far better served by virtualization, which is now a quite well-supported feature across all major operating systems.
Tangentially I'd like to plug an excellent resource for this that I recently discovered:
NsJail is a process isolation tool for Linux. It utilizes Linux namespace subsystem, resource limits, and the seccomp-bpf syscall filters of the Linux kernel.
https://github.com/google/nsjail/#overview

It's an extremely impressive program! It allows isolation of system resources at a very fine-grained level.

Tom

Richard Eisenberg wrote:
Hi café,
Do you use Safe Haskell? Do you know someone who does? If you do, which of Safe Haskell's guarantees do you rely on?
Yes. lambdabot's evaluation mechanism is essentially designed around Safe Haskell: expressions being evaluated are wrapped in `show`, so there's nothing triggering IO actions from outside.

Safe Haskell (barring bugs or evil libraries) ensures that there's no unsafePerformIO nor unsafeCoerce to break the type system, so there's no way to perform arbitrary IO actions inside pure code. The libraries are curated, so evil libraries have not been an issue. As for bugs, there have been holes in Typeable in the past, but currently I believe they're closed, except for the use of MD5 as a hash function (but that may require ~2^64 hash operations to exploit, because it's hashing UTF-16 data, rendering the existing differential-path collision attacks useless... as far as I know, nobody has done this yet).

Preventing the use of Template Haskell is another aspect that lambdabot relies on.

Nowadays, lambdabot (as deployed on Freenode) also uses a sandbox for evaluation, but I'm thinking of that as a second line of defense rather than the primary mechanism for keeping things safe.

So I'd be sad to see SafeHaskell go away.

Cheers,
Bertram
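(A rough sketch of the wrapping Bertram describes; the helper below is invented, not lambdabot's actual code. The point is that the user-supplied expression is forced through `show` and the generated module is compiled with -XSafe against a curated package set:)

    module EvalSketch (wrapExpression) where

    -- Build the source of a throwaway module around a user-supplied
    -- expression. Compiling it with -XSafe is what keeps
    -- unsafePerformIO and unsafeCoerce out of scope, so evaluating
    -- `result` cannot trigger arbitrary IO.
    wrapExpression :: String -> String
    wrapExpression userExpr = unlines
      [ "{-# LANGUAGE Safe #-}"
      , "module Eval (result) where"
      , "result :: String"
      , "result = show (" ++ userExpr ++ ")"
      ]

For example, wrapExpression "map (*2) [1..3]" yields a module whose result is "[2,4,6]" once compiled and evaluated.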

Hi Bertram,

Thanks for speaking up here. I feel like I'm missing something I should know, but how does Safe help you? Looking at the lambdabot docs, users cannot import their own modules, and you describe the libraries as curated. So, presumably, that's enough to keep unsafeCoerce and unsafePerformIO from being in scope. Along similar lines, I don't see a way in lambdabot to enable extensions, so Template Haskell is not an issue for you (I believe).

Maybe the role of Safe is in helping you curate your libraries? That is, you can use the Safety of a module in determining whether or not it should be imported. That is indeed helpful. Is that it, though? Does enabling -XSafe when compiling user-supplied code catch some scenarios that would be uncaught otherwise?

Thanks for educating me about this -- it's important to know how the feature is being used if we are going to maintain it.

Richard
On Apr 17, 2021, at 9:03 AM, Bertram Felgenhauer via Haskell-Cafe
wrote:
Richard Eisenberg wrote:
Hi café,
Do you use Safe Haskell? Do you know someone who does? If you do, which of Safe Haskell's guarantees do you rely on?
Yes. lambdabot's evaluation mechanism is essentially designed around Safe Haskell: expressions being evaluated are wrapped in `show`, so there's nothing triggering IO actions from outside.
Safe Haskell (barring bugs or evil libraries) ensures that there's no unsafePerformIO nor unsafeCoerce to break the type system, so there's no way to perform arbitrary IO actions inside pure code.
The libraries are curated, so evil libraries have not been an issue.
As for bugs, there have been holes in Typeable in the past, but currently I believe they're closed, except for the use of MD5 as a hash function (but that may require ~2^64 hash operations to exploit because it's hashing UTF-16 data, rendering the existing differential path collision attacks useless... as far as I know, nobody has done this yet).
Preventing the use of Template Haskell is another aspect that lambdabot relies on.
Nowadays, lambdabot (as deployed on Freenode) also uses a sandbox for evaluation, but I'm thinking of that as a second line of defense rather than the primary mechanism for keeping things safe.
So I'd be sad to see SafeHaskell go away.
Cheers,
Bertram

Dear Richard,
Thanks for speaking up here. I feel like I'm missing something I should know, but how does Safe help you? Looking at the lambdabot docs, users cannot import their own modules, and you describe the libraries as curated. So, presumably, that's enough to keep unsafeCoerce and unsafePerformIO from being in scope. Along similar lines, I don't see a way in lambdabot to enable extensions, so Template Haskell is not an issue for you (I believe).
Lambdabot maintains a Haskell source file (the pristine version can be seen here, for example): https://silicon.int-e.eu/lambdabot/State/Pristine.hs

There's an @let command that can add to that source file, and while the name suggests that it is for adding definitions, it can also add imports. So one can try

    @let import System.IO.Unsafe

which is currently blocked by SafeHaskell:

    <lambdabot> .L.hs:140:1: error:
    <lambdabot>     System.IO.Unsafe: Can't be safely imported!

The use of @let for importing modules is probably not documented anywhere, but it is occasionally demonstrated on the #haskell IRC channel.

You're right about Template Haskell though. @let doesn't allow adding new language pragmas.
Maybe the role of Safe is in helping you curate your libraries? That is, you can use the Safety of a module in determining whether or not it should be imported. That is indeed helpful. Is that it, though? Does enabling -XSafe when compiling user-supplied code catch some scenarios that would be uncaught otherwise?
There are two ways in which Safe Haskell helps with curation. First, the import restriction that helps @let also helps with adding new permanent imports to Pristine.hs in pretty much the same way; a module that can be safely imported should not break the type system or allow "pure" functions that do arbitrary IO.

But Safe Haskell also helps on a package level, since we have two tiers of trust:

- For untrusted packages, modules are only regarded as safe if their safety is inferred, and the inference mechanism is strong enough to propagate the two safety criteria lambdabot cares about (type safety, and lack of arbitrary IO a la unsafePerformIO). So just installing a package without trusting it requires little review (one can still worry about all the things a package can do during installation, through configure, Setup.hs or Template Haskell...). For trivial packages this can be good enough. (Sometimes it isn't, for stupid reasons like IsList only being available through the Unsafe GHC.Exts module.)

- For trusted packages, at least one can focus one's attention on modules marked Trustworthy. How much this helps depends very much on the package; all imports of a Trustworthy package may have to be reviewed as well.

Because of this, the real curation happens at the level of packages to be trusted; adding an untrusted package requires far less attention.

Cheers,
Bertram
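(To make the trusted tier concrete, a minimal sketch; the module is invented, but -fpackage-trust, -trust, and ghc-pkg trust are GHC's package-trust machinery:)

    {-# LANGUAGE Trustworthy #-}
    -- The author vouches that the exported interface is safe even
    -- though unsafePerformIO is used internally. With -fpackage-trust
    -- this vouching only counts if the containing package is itself
    -- trusted (via ghc-pkg trust, or -trust on the command line);
    -- otherwise importing this module Safely is rejected.
    module Trusted.Environment (processUser) where

    import System.Environment (lookupEnv)
    import System.IO.Unsafe (unsafePerformIO)

    -- Read USER once and treat it as a per-process constant;
    -- unsafe inside, observably pure outside.
    processUser :: Maybe String
    processUser = unsafePerformIO (lookupEnv "USER")
    {-# NOINLINE processUser #-}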

I think Safe Haskell is largely meant for exactly stuff like lambdabot.
One gotcha that I’ve seen come up when people use it ... safe Haskell
modules disable all rewrite rules (trustworthy doesn’t). And I’ve seen
folks mark a module in their benchmarks as safe... and I suppose they
didn’t understand the implications of that.
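(A minimal sketch of that gotcha; the module and rule are invented. Under -XSafe, GHC drops user-written RULES with a warning, so the rewrite below silently stops firing, and a benchmark of upDown measures something different than it would in a Trustworthy module:)

    {-# LANGUAGE Safe #-}
    module BenchSketch (upDown) where

    -- Under Safe, this RULES pragma is ignored; under Trustworthy it
    -- would rewrite upDown's body to the identity.
    {-# RULES "upDown/id" forall xs. myReverse (myReverse xs) = xs #-}

    myReverse :: [a] -> [a]
    myReverse = foldl (flip (:)) []
    {-# NOINLINE myReverse #-}

    upDown :: [Int] -> [Int]
    upDown xs = myReverse (myReverse xs)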
On Sat, Apr 17, 2021 at 10:59 AM Richard Eisenberg wrote:
Hi Bertram,
Thanks for speaking up here. I feel like I'm missing something I should know, but how does Safe help you? Looking at the lambdabot docs, users cannot import their own modules, and you describe the libraries as curated. So, presumably, that's enough to keep unsafeCoerce and unsafePerformIO from being in scope. Along similar lines, I don't see a way in lambdabot to enable extensions, so Template Haskell is not an issue for you (I believe).
Maybe the role of Safe is in helping you curate your libraries? That is, you can use the Safety of a module in determining whether or not it should be imported. That is indeed helpful. Is that it, though? Does enabling -XSafe when compiling user-supplied code catch some scenarios that would be uncaught otherwise?
Thanks for educating me about this -- it's important to know how the feature is being used if we are going to maintain it.
Richard
On Apr 17, 2021, at 9:03 AM, Bertram Felgenhauer via Haskell-Cafe <haskell-cafe@haskell.org> wrote:
Richard Eisenberg wrote:
Hi café,
Do you use Safe Haskell? Do you know someone who does? If you do, which of Safe Haskell's guarantees do you rely on?
Yes. lambdabot's evaluation mechanism is essentially designed around Safe Haskell: expressions being evaluated are wrapped in `show`, so there's nothing triggering IO actions from outside.
Safe Haskell (barring bugs or evil libraries) ensures that there's no unsafePerformIO nor unsafeCoerce to break the type system, so there's no way to perform arbitrary IO actions inside pure code.
The libraries are curated, so evil libraries have not been an issue.
As for bugs, there have been holes in Typeable in the past, but currently I believe they're closed, except for the use of MD5 as a hash function (but that may require ~2^64 hash operations to exploit because it's hashing UTF-16 data, rendering the existing differential path collision attacks useless... as far as I know, nobody has done this yet).
Preventing the use of Template Haskell is another aspect that lambdabot relies on.
Nowadays, lambdabot (as deployed on Freenode) also uses a sandbox for evaluation, but I'm thinking of that as a second line of defense rather than the primary mechanism for keeping things safe.
So I'd be sad to see SafeHaskell go away.
Cheers,
Bertram

Hi. Let me describe two hypothetical use cases for Safe Haskell.

1. Isabelle ( http://isabelle.in.tum.de ) is written in SML. Input for Isabelle (theory files) is allowed to contain embedded SML source code for extending Isabelle. If someone sends you an Isabelle theory and you open it in your Isabelle IDE, this embedded code starts to execute automatically. I don't know whether the Isabelle authors put in place any restrictions to deny arbitrary code execution. So, as you probably guessed, I think it is possible to write a similar prover in Haskell. Its theory files would be allowed to contain embedded Haskell code to extend the prover, and SafeHaskell would be used to deny arbitrary code execution.

2. Imagine an OS written entirely in Haskell, both kernel and user space programs. In such an OS we don't need any hardware methods of restricting user programs, i.e. we don't need an MMU, context switches, or the separation of ring 0 and ring 3. Security would be achieved by using SafeHaskell: a user would be allowed to run code in user space only if it was compiled by a trusted Haskell compiler with SafeHaskell enabled. After compilation the code would be loaded directly into the kernel address space and executed in ring 0. It would be very easy to implement capability-based security. Benefits of such a model:
a) We don't have MMU and context-switch overhead; a system call is just a function call.
b) User space programs would not be able to read/write memory directly, so they would not be able to exploit Spectre/Meltdown, so there would be no need for the costly mitigations for these vulnerabilities. (I'm not sure about b).)
c) Programs live in one address space, so they can pass each other data directly; IPC is just a function call.
d) The speed-up caused by a), b) and c) could be so huge that it outweighs the slowdown caused by using Haskell instead of C, so the usual "Haskell is too slow" arguments would go away.

== Askar Safin
https://github.com/safinaskar

Hi again, I wrote:
So I'd be sad to see SafeHaskell go away.
I've slept on this. The above is still true, but only because I happen to have a use case that fits SafeHaskell pretty much perfectly. The bigger picture is bleaker, and SafeHaskell may well be holding back safe Haskell from reaching its full potential. Its main drawbacks are:

1) SafeHaskell imposes a single notion of safety upon the whole world. First, safety is expressed through language pragmas, which are Booleans attached to source code; basically whatever interpretation the base library gives to safety is the one that everyone has to use. On top of that, library maintainers are supposed to do the work of declaring safety of their code, even if they do not care about SafeHaskell at all. Secondly, the mechanism for inferring safety is baked into the compiler, but different notions of safety may require different inference mechanisms. Unless the use case for which SafeHaskell was designed is common (and the replies here indicate that it's not), this is hard to justify.

2) Since SafeHaskell is integrated in GHC, it's hard to improve, and SafeHaskell is lacking in some regards. For example, safety is declared at the module level, rather than at the level of exported symbols. This is unfortunate, because the real cost is not in adding annotations to files, but in reviewing individual definitions; the more definitions can be inferred as safe the better, and a finer granularity would help there.

In principle, most of SafeHaskell's functionality could be covered by external tools, including managing lists of safe/trusted/unsafe modules (or exports) and tracking safety dependencies for inferred safety or unsafety. There could be a standard file format for such lists, so they can be shared easily. (The information could even be expressed in source code pragmas and be extracted from there, so there doesn't need to be any loss compared to the existing SafeHaskell.)

The parts that require compiler support are inference of safety, and checking language extensions and imports against a whitelist. For these things, compiler plugins should work [*]. Maybe a set of utility functions or even an abstraction layer over the GHC API can be found to support these tasks.

Ideally, we could have a nice suite of tools for code auditing (define a desired property, chase it through a code base) that supports checking functions automatically through compiler plugins. And if enough people care about a particular property and trust each other's analysis, results could be shared. And of course, the original SafeHaskell use case a la lambdabot would still be covered, since it boils down to only allowing code whose safety can be inferred automatically.

There would inevitably be a mess of different safety notions. But that is already the case now. The difference is that currently, only one notion is blessed and supported by GHC and all the others are not.

Cheers,
Bertram

[*] I'm a bit worried that I'm being too naive there. Why is SafeHaskell such a burden within GHC? Is it because it touches many parts, from language pragmas to interface files to actually checking imports? Or am I forgetting something about SafeHaskell that requires deep integration, say, into the type checker?
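(To gauge the plugin idea, a minimal sketch of an import whitelist as a compiler plugin, assuming the GHC 9.0 plugin API; the whitelist contents are invented, the exact home modules of these types shift between GHC releases, and a real tool would report errors through GHC's diagnostics rather than an exception:)

    module WhitelistPlugin (plugin) where

    import Control.Exception (ErrorCall (..), throwIO)
    import Control.Monad (forM_, unless)
    import Control.Monad.IO.Class (liftIO)
    import GHC.Hs (hsmodImports, ideclName)
    import GHC.Plugins

    plugin :: Plugin
    plugin = defaultPlugin
      { parsedResultAction = checkImports
      , pluginRecompile    = purePlugin
      }

    -- The notion of "safe" lives here, outside the compiler proper.
    allowedModules :: [String]
    allowedModules = ["Prelude", "Data.List", "Data.Maybe"]

    -- Inspect the parsed module and reject any import that is not on
    -- the whitelist.
    checkImports :: [CommandLineOption] -> ModSummary
                 -> HsParsedModule -> Hsc HsParsedModule
    checkImports _opts _summary parsed = do
      let imports = map (unLoc . ideclName . unLoc)
                        (hsmodImports (unLoc (hpm_module parsed)))
      forM_ imports $ \m ->
        unless (moduleNameString m `elem` allowedModules) $
          liftIO . throwIO . ErrorCall $
            "import of " ++ moduleNameString m ++ " is not whitelisted"
      pure parsed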

On Sun, Apr 18, 2021 at 01:31:08PM +0200, Bertram Felgenhauer via Haskell-Cafe wrote:
In principle, most of SafeHaskell's functionality could be covered by external tools, including managing lists of safe/trusted/unsafe modules (or exports) and tracking safety dependencies for inferred safety or unsafety. [...] [*] I'm a bit worried that I'm being too naive there. Why is SafeHaskell such a burden within GHC? Is it because it touches many parts, from language pragmas to interface files to actually checking imports? Or am I forgetting something about SafeHaskell that requires deep integration, say, into the type checker?
To this point, I think it's worth noting what aspect of SafeHaskell is difficult to maintain in GHC yet would be easy to maintain outside of GHC, if anything.

Bertram Felgenhauer via Haskell-Cafe wrote:
Unless the use case for which SafeHaskell was designed is common (and the replies here indicate that it's not), this is hard to justify.
The feedback here is not wholly representative. There's a reddit thread [1] where djdlc points out

https://uniprocess.org/effects.html

This is interesting because it demonstrates that the notion of safety can be *refined* from its use by the `base` library in the context of DSLs, because one can express which notion of safety applies through types, and confine the code that is ultimately executed through the type system.

Obviously this will still break down when the type system is subverted as in https://gitlab.haskell.org/ghc/ghc/-/issues/9562 which Richard pointed out, or https://gitlab.haskell.org/ghc/ghc/-/issues/19287 which wz1000 demonstrated on IRC. But these are terrible bugs anyway; it's just that SafeHaskell boosts their impact from code that people shouldn't write to a potential security issue. Is anybody maintaining a list of these type system unsoundness issues?

Apparently some people also enjoy the extra code discipline that producing Safe code requires (link by gentauro (=djdlc) on Freenode):

http://blog.stermon.com/articles/2019/02/21/the-main-reason-i-use-safe-haske...

Cheers,
Bertram

[1] https://reddit.com/r/haskell/comments/msa3oq/safe_haskell/ or https://teddit.net/r/haskell/comments/msa3oq/safe_haskell/
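(A minimal sketch of that refinement idea, with invented names: the notion of safety is expressed in types by exporting a restricted effect monad instead of IO, and Safe Haskell's role is only to stop unsafePerformIO/unsafeCoerce from breaking out of the abstraction:)

    {-# LANGUAGE Safe #-}
    module Effects.Restricted
      ( Restricted        -- abstract: the constructor is NOT exported
      , logLine
      , runRestricted
      ) where

    -- User code is written against Restricted, not IO, so the only
    -- effects available are the ones exported below.
    newtype Restricted a = Restricted { unRestricted :: IO a }

    instance Functor Restricted where
      fmap f (Restricted io) = Restricted (fmap f io)

    instance Applicative Restricted where
      pure = Restricted . pure
      Restricted f <*> Restricted x = Restricted (f <*> x)

    instance Monad Restricted where
      Restricted x >>= f = Restricted (x >>= unRestricted . f)

    -- The entire effect vocabulary: writing a log line, nothing else.
    logLine :: String -> Restricted ()
    logLine = Restricted . putStrLn

    -- Only the host application calls this.
    runRestricted :: Restricted a -> IO a
    runRestricted = unRestricted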

After keeping up with this thread, the reddit thread, and a Twitter thread, I started to write a GHC proposal to remove Safe Haskell. However, a conversation with Krzysztof Gogolewski (aka monoidal) and a post on reddit made me change my mind: we need to renovate Safe Haskell, not remove it. And doing so only really makes sense in the context of a larger security overhaul.

We are as a loose encampment in an open field with a few night sentries. Safe Haskell is a slightly-crumbling earthen rampart along two sides of the encampment. As such, it's really just an obstacle, and does little (but not nothing) to protect us. I was thinking to clear away the obstacle. But of course the better solution is to reinforce the rampart, build two more sides of it, and create a proper defensive position. This will be hard, and I do not propose to take charge of such an act now. But I recognize that the existing structure naturally forms part of this greater whole.

See also https://www.reddit.com/r/haskell/comments/msa3oq/safe_haskell/gv8vph9/, where I make similar, if not as evocative, comments.

Thanks much for the input here!
Richard
On Apr 20, 2021, at 9:19 AM, Bertram Felgenhauer via Haskell-Cafe
wrote:
Bertram Felgenhauer via Haskell-Cafe wrote:
Unless the use case for which SafeHaskell was designed is common (and the replies here indicate that it's not), this is hard to justify.
The feedback here is not wholly representative.
There's a reddit thread [1] where djdlc points out
https://uniprocess.org/effects.html
This is interesting because it demonstrates that the notion of safety can be *refined* from its use by the `base` library in the context of DSLs, because one can express which notion of safety applies through types, and confine the code that is ultimately executed through the type system.
Obviously this will still break down when the type system is subverted as in
https://gitlab.haskell.org/ghc/ghc/-/issues/9562
which Richard pointed out, or
https://gitlab.haskell.org/ghc/ghc/-/issues/19287
which wz1000 demonstrated on IRC. But these are terrible bugs anyway; it's just that SafeHaskell boosts their impact from code that people shouldn't write to a potential security issue. Is anybody maintaining a list of these type system unsoundness issues?
Apparently some people also enjoy the extra code discipline that producing Safe code requires (link by gentauro (=djdlc) on Freenode):
http://blog.stermon.com/articles/2019/02/21/the-main-reason-i-use-safe-haske...
Cheers,
Bertram
[1] https://reddit.com/r/haskell/comments/msa3oq/safe_haskell/ or https://teddit.net/r/haskell/comments/msa3oq/safe_haskell/

On 2021-04-20 4:50 p.m., Richard Eisenberg wrote:
We are as a loose encampment in an open field with a few night sentries. Safe Haskell is a slightly-crumbling earthen rampart along two sides of the encampment. As such, it's really just an obstacle, and does little (but not nothing) to protect us. I was thinking to clear away the obstacle. But of course the better solution is to reinforce the rampart, build two more sides of it, and create a proper defensive position. This will be hard, and I do not propose to take charge of such an act now. But I recognize that the existing structure naturally forms part of this greater whole.
If your main goal is a defensive position you shouldn't be in an open field to begin with. You should encamp on high ground, with plenty of nearby lumber to build ramparts, and a secure source of fresh water to withstand a siege. That kind of position limits your maneuverability, admittedly, but it's much easier to protect. Sorry, I couldn't resist your analogy.

I'm glad that Safe Haskell continues to be maintained, but if you start with the goal of security, Haskell is really not the right place to start from. No general-purpose language (open field) is. You want to design from scratch, starting with a secure core language (high ground). You can use Haskell as an inspiration; Marlowe and probably some other blockchain languages do.

On Wed, Apr 21, 2021 at 2:55 PM Mario wrote:
[...] No general-purpose language (open field) is. You want to design from scratch, starting with a secure core language (high ground). You can use Haskell as an inspiration; Marlowe and probably some other blockchain languages do.
That's not totally correct: You can use anything you like when you have a sandbox while executing it. This is even safer than relying on a language (which can have conceptual and/or implementation bugs) alone. The attack surface of any non-trivial language, its implementation and its runtime is just too big for anything serious. Sandboxes are complex, too, but less so, and you implement them once and you can use them for many things.

Having said that, my personal view is that Safe Haskell has almost no valid use case anymore, given the various sandboxing technologies available today. But that's just my 2c...

Cheers,
S.

On Apr 21, 2021, at 3:36 PM, Sven Panne wrote:
That's not totally correct: You can use anything you like when you have a sandbox while executing it. This is even much more safe than relying on a language (which can have conceptual and/or implementation bugs) alone.
Actually, sandboxes have an attack surface just as complicated as languages do. I also believe they are different domains. Secure languages deal with making guarantees about what a program does and, more importantly, what it does not do. So you can control things like IO effects and capabilities, and the language can even make these guarantees statically. A sandbox deals more with API objects. This is a much more complicated surface because today's APIs tend to be large, complex, and interact in unexpected fashions; on the other hand, it is much nearer to the actual objects being protected. I.e. I believe the two approaches, while they have some overlap, serve different purposes and need to complement each other.
The attack surface of any non-trivial language, its implementation and its runtime is just too big for anything serious. Sandboxes are complex, too, but less so,
I believe the opposite is true. APIs change over time. Languages do that, too, but to a much lesser extent, and type system guarantees tend to hold for decades. Even filesystem APIs are less stable than that (think NFS, or filesystem-dependent ACLs). Regards, Jo

On Wed, Apr 21, 2021 at 3:57 PM Joachim Durchholz <jo@durchholz.org> wrote:
Actually Sandboxes have a just as complicated attack surface as languages.
Perhaps we are talking about different kinds of sandboxes: I'm talking about sandboxes at the syscall level. Compared to higher level stuff they are relatively simple. They are not small (tons of syscalls exist nowadays), but each syscall is relatively simple compared to stuff built upon it. Size != complexity.

And the SW engineering point is: You program your sandbox SW once and can use it for every language/program, without any need to follow the latest and greatest hype in languages, compilers, type systems etc. This means a *much* higher ROI compared to a language-specific solution. Of course you won't get a PhD for a syscall sandbox...
I also believe they are different domains. Secure languages deal with making guarantees about what a program does and, more importantly, what it does not do. So you can control things like IO effects, Capabilities, and the language can even make these guarantees statically.
Only if you trust your language itself, your implementation and its RTS. And given the complexity of each of these aspects, basically any language has security holes (see e.g. the recent discussion here about various GHC bugs). A sandbox has a much easier job, it can e.g. always err on the safe side.
A sandbox deals more with API objects. This is a much more complicated surface because today's APIs tend to be large, complex, and interact in unexpected fashions; on the other hand, it is much nearer to the actual objects being protected. I.e. I believe the two approaches, while they have some overlap, they serve different purposes and need to complement each other.
A syscall sandbox needs no help whatsoever from a language.
I believe the opposite is true. APIs change over time.
The syscall interface is extremely stable, at least compared to the rest of all APIs in existence.
Languages do that, too, but to a much lesser extent, and type system guarantees tend to hold for decades.
There is no such thing as a type system in the machine code actually running, so you can't trust any higher-level guarantees.
Even filesystem APIs are less stable than that (think NFS, or filesystem-dependent ACLs).
Do you really see the filesystem implementation on the syscall level? Unless you're doing serious fcntl()/ioctl()-Kung-Fu or stuff like that, I don't think so, but that's just a guess.

On Apr 21, 2021, at 6:39 PM, Sven Panne wrote:
On Wed, Apr 21, 2021 at 3:57 PM Joachim Durchholz <jo@durchholz.org> wrote:
Actually, sandboxes have an attack surface just as complicated as languages do.
Perhaps we are talking about different kinds of sandboxes: I'm talking about sandboxes at the syscall level. Compared to higher level stuff they are relatively simple. They are not small (tons of syscalls exist nowadays), but each syscall is relatively simple compared to stuff built upon it. Size != complexity.
True, but the semantics behind each syscall can be horrendously complex. The syscall for opening a file is straightforward, but its semantics depends on a ton of factors, like whether it's in /proc, on a ramdisk, a remote file system, and the semantics of whatever filesystem is plugged in behind it. That's why you can have a sandbox and it still doesn't protect you from symlink timing attacks on /tmp, for example (wait for a system process to create a temporary file, delete the file, and insert a symlink to a file of your choice; this will work with low probability if there's a window between file creation and file opening. That's why mkstemp and friends were built to make this atomic, and we hope(!) that all security-relevant programs have been updated to use them instead of creating a temp file and opening it later).
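(In Haskell terms, a small sketch of the difference; paths and names invented:)

    import System.IO (hClose, hPutStrLn, openTempFile)

    -- Racy pattern: pick a predictable name, then create it. An
    -- attacker who wins the race can plant a symlink at that path
    -- and redirect the write to a file of their choice.
    racyTemp :: IO ()
    racyTemp = writeFile "/tmp/myapp.tmp" "scratch data"

    -- Atomic pattern: openTempFile chooses an unpredictable name and
    -- creates + opens the file in one step (O_CREAT|O_EXCL under the
    -- hood), leaving no window for a symlink swap.
    safeTemp :: IO FilePath
    safeTemp = do
      (path, h) <- openTempFile "/tmp" "myapp.tmp"
      hPutStrLn h "scratch data"
      hClose h
      pure path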
And the SW engineering point is: You program your sandbox SW once and can use it for every language/program, without any need to follow the latest and greatest hype in languages, compilers, type systems etc.
Instead you have to make sure that all software uses mkstemp instead of doing non-atomic file creation and opening. Which isn't secure by design; you still have to double-check all software.
This means a *much* higher ROI compared to a language-specific solution. Of course you won't get a PhD for a syscall sandbox...
Except that there is no such thing as an inherently safe syscall interface; there are unsafe ways to use it. And that's where language-based safety can help.
I also believe they are different domains. Secure languages deal with making guarantees about what a program does and, more importantly, what it does not do. So you can control things like IO effects, Capabilities, and the language can even make these guarantees statically.
Only if you trust your language itself, your implementation and its RTS.
Again, trusting an operating system isn't a smaller surface. After all, you need to trust the OS as a whole, namely that all the syscalls do what they're supposed to do. Not a win, actually. Operating systems are vastly more complex than language runtimes.

It's even stronger. The compiler code to emit useful error messages is complex, because it needs to infer the programmer's intentions. The compiler code for type inference is pretty complex as well - it ranges from pretty easy for pure Hindley-Milner typing to arbitrary complexity for dependent types and other things. However, the code that checks whether the program type checks after all type inference has been applied - that's actually trivial. It's the moral equivalent of a proof checker for first-order predicate logic, and that's something that you can do at university (I actually did such a thing for my Diploma); *finding* a proof is hard but *checking* one is trivial, and that translates to type systems and things like Safe Haskell.

There's still another challenge: Set up a system of axioms, i.e. security properties and how they apply when going through the various primitives, and find a set that's (a) interesting and (b) understandable for programmers. This is pretty hard to do for syscalls, because their semantics was never designed to be easily provable. It's easier for languages if you restrict syscall use to a safe subset (for whatever definition of "safe", resp. for whatever set of security-relevant properties you're interested in). Such a restriction is good for an easy first prototype, but it falls flat as soon as you allow arbitrary syscalls, or linking arbitrary libraries.

So for system security regardless of what programming language people are using, you probably need something at the syscall level - sandboxes, jails, cgroups, whatever else. None of these are perfect - kernel bugs still exist, and some syscalls have well-known security issues - so you *still* need defense in depth.
And given the complexity of each of these aspects, basically any language has security holes (see e.g. the recent discussion here about various GHC bugs). A sandbox has a much easier job, it can e.g. always err on the safe side.
No sandbox does that. Well, the new ones do. Then people start demanding additional features, and off you go with new features, new bugs, and new unforeseen interactions. I'm speaking from experience with the JVM sandboxes, which are an in-language mechanism, but the problems I saw there could happen regardless of whether the sandboxing is done inside a runtime or at the syscall level.
A sandbox deals more with API objects. This is a much more complicated surface because today's APIs tend to be large, complex, and interact in unexpected fashions; on the other hand, it is much nearer to the actual objects being protected. I.e. I believe the two approaches, while they have some overlap, they serve different purposes and need to complement each other.
A syscall sandbox needs no help whatsoever from a language.
Well, I hope I just argued why a syscall sandbox is never going to be complete. An additional argument: syscall sandboxes can't prevent data exfiltration. There are just too many channels. An in-language security check could prevent exfiltration of user passwords (with the right axioms and inference rules).
I believe the opposite is true.APIs change over time.
The syscall interface is extremely stable, at least compared to the rest of all APIs in existence.
But first-order predicate logic is immutable, and that's all you need for a checker. (The axioms and inference rules can depend on the API, but that's just the same as with syscalls.) BTW syscalls aren't immutable. They're backwards-compatible. That's irrelevant for a programmer, but it is a world of difference for security - a programmer is interested in what can be done, security is interested in what can *not* be done, and this changes the picture completely.
Languages do that, too, but to a much lesser extent, and type system guarantees tend to hold for decades.
There is no such thing as a type system in the machine code actually running, so you can't trust any higher-level guarantees.
You can't trust the compiler, you can't trust the runtime, and you can't trust the kernel. There's really no big difference there. You just happen to know the ghc loopholes better than the syscall loopholes (for me it's the other way round).
Even filesystem APIs are less stable than that (think NFS, or filesystem-dependent ACLs).
Do you really see the filesystem implementation on the syscall level? Unless you're doing serious fcntl()/ioctl()-Kung-Fu or stuff like that, I don't think so, but that's just a guess.
No, it's much worse. Some syscalls are atomic on a local filesystem and non-atomic on NFS. So if you're writing security-relevant code, you have to know whether you're on NFS or not, and I'm not even sure you can reliably detect whether the file you're trying to create and open is local or on NFS.

Then there are tons of interesting games you can play with symlinks and hardlinks, further complicated by questions like what happens if a symlink is cross-filesystem (it may well be that all atomicity guarantees go out of the window). Do you even write your code in a way that detects if a physical file was turned into a symlink behind your back? Even if your code is running inside a VM and might be subject to suspend/resume? (Cloud management software like OpenShift does things like suspend your VM, move it to different hardware, and resume; filesystem boundaries can indeed change, and it's easy to confuse OpenShift enough to trigger that "oh, we need to relocate to another hardware" reaction.)

Quite a while ago, attacks started shifting from exploiting operating system bugs to exploiting application programmer expectations and making these fail. And a syscall sandbox will do nothing to stop that. A language-based security framework cannot fully stop that either, but it can catch another class of errors.

So I still stand by my assessment that the approaches don't replace each other; they complement each other and you should pursue both.

That's my 5 cents - YMMV.

Regards,
Jo

On Thu, Apr 22, 2021 at 9:29 PM Joachim Durchholz <jo@durchholz.org> wrote:
True, but the semantics behind each syscall can be horrendously complex. [...]
That's correct, but a sandbox doesn't need to implement all of it. Checking that e.g. only something below a given directory can be opened (perhaps only for reading) is relatively easy, as is prohibiting the creation of sockets, limiting the amount of memory that can be mmapped (and how), etc. I can hardly imagine what could be considered "safe" for a program which can use all of e.g. POSIX.
That's why you can have a sandbox and this still doesn't protect you from symlink timing attacks on /tmp [...]
Well, if it is *your* sandbox and some processes from outside the sandbox can change its contents arbitrarily, then you have more security issues than simple symlink attacks.
Instead you have to make sure that all software uses mktemp instead of doing nonatomic file creation&opening. [...]
Nope, one just has to make sure that the sandboxes are isolated. This is relatively easy on the syscall level (e.g. simulating "your" own /tmp etc.).

Except that there is no such thing as an inherently safe syscall
interface, there are unsafe ways to use it.
And that's exactly the reason why you don't give the full power of all syscalls to a sandboxed program.
And that's where language-based safety can help. [...]
Only if *all* of your program is written in that single language, which is hardly ever the case for a non-toy program: Sooner or later you call out to a C library, and then all bets are off.

In general: I think all security-related discussions are futile unless one precisely defines what is considered a threat and what is considered to be safe. And I think we agree to disagree here. :-)

On Apr 22, 2021, at 10:36 PM, Sven Panne wrote:
On Thu, Apr 22, 2021 at 9:29 PM Joachim Durchholz <jo@durchholz.org> wrote:
True, but the semantics behind each syscall can be horrendously complex. [...]
That's correct, but a sandbox doesn't need to implement all of it. Checking that e.g. only something below a given directory can be opened (perhaps e.g. only for reading) is relatively easy,
That's exactly what I mean: the API is deceptively simple, but the actual semantics is pretty complicated, even open-ended. Things that complicate the picture:
- symlinks
- mount -o remount /dir
- prohibiting a subdirectory on NFS that happens to live on the local machine, so you have to remember to prohibit both paths
- NFS in general can allow mapping
- then there's also VFS, which offers even more mapping options
prohibiting creation
of sockets, limiting the amount of memory which can be mmapped (and how), etc. etc.
These things can indeed be managed at the OS level. Though in practice it's surprisingly hard to close all loopholes. And attackers think in terms of loopholes.
I can hardly imagine what could be considered "safe" for a program which can use all of e.g. POSIX.
Well, that's exactly my point: sandboxes live at the Posix level (actually, at the level of the operating system), and that's a huge and complicated surface to harden.
That's why you can have a sandbox and this still doesn't protect you from symlink timing attacks on /tmp [...]
Well, if it is *your* sandbox and some processes from outside the sandbox can change its contents arbitrarily, then you have more security issues than simple symlink attacks.
I'm not sure what you mean. The sandbox can run such a symlink attack on its own - assuming it officially has access to /tmp. Of course, sandbox makers have been made aware of this attack and are hardening their sandboxes against it. The point isn't this particular attack; it's that seemingly simple APIs can offer very unexpected loopholes, just by not providing atomicity.

I simply don't believe it's possible to build a reliable sandbox. 20 years of JavaScript hardening attempts have proved that it's possible to make attacks harder, but we still see Pwn2Own successes.
Except that there is no such thing as an inherently safe syscall interface, there are unsafe ways to use it.
And that's exactly the reason why you don't give the full power of all syscalls to a sandboxed program.
Which is exactly the point at which sandbox makers get pressured into adding yet another feature to work around a restriction, on the grounds that "but THIS way of doing it is safe" - which it often enough isn't. The semantics is too complex, it's riddled with aliasing and atomicity problems.
And that's where language-based safety can help. [...]
Only if *all* of your program is written in that single language, which is hardly the case for every non-toy program: Sooner or later you call out to a C library, and then all bets are off.
That's why the approaches are complementary. They can't replace each other.
In general: I think all security-related discussions are futile unless one precisely defines what is considered a threat and what is considered to be safe. And I think we agree to disagree here. :-)
Actually I agree with that. I just disagree with the idea that making syscall-level sandboxes has a better ROI than making language checkers.

On Fri, Apr 23, 2021 at 05:50:37PM +0200, Joachim Durchholz wrote:
I just disagree with the idea that making syscall-level sandboxes has a better ROI than making language checkers.
I'm curious whether there's anyone in this thread who takes a different point of view, in absolute terms. The point of contention for me (and I would guess for others too) is whether meagre resources at our disposal should be put towards SafeHaskell and other Haskell-based language checkers, or we should just use what the (comparatively) large and experienced Linux, *BSD, etc.. developers are already providing and many users are already using for hardening efforts. Tom

On Mon, 26 Apr 2021, Tom Ellis wrote:
I'm curious whether there's anyone in this thread who takes a different point of view, in absolute terms.
The point of contention for me (and I would guess for others too) is whether meagre resources at our disposal should be put towards SafeHaskell and other Haskell-based language checkers, or we should just use what the (comparatively) large and experienced Linux, *BSD, etc.. developers are already providing and many users are already using for hardening efforts.
When SafeHaskell came out I found it a good way to mark modules as Safe in order to be warned by GHC if they are actually unsafe. Unfortunately I did not mark many modules this way. If GHC features are too complex to give such safety guarantees, then I think this is a problem in its own right. How would we evaluate code by other authors?

Today, I would add Safe to all modules of a critical package and watch where that fails and why. This use case cannot be managed by any sandboxing, container or virtualization technique.

On Mon, Apr 26, 2021 at 8:02 PM Tom Ellis <tom-lists-haskell-cafe-2017@jaguarpaw.co.uk> wrote:
[...] The point of contention for me (and I would guess for others too) is whether meagre resources at our disposal should be put towards SafeHaskell and other Haskell-based language checkers, or we should just use what the (comparatively) large and experienced Linux, *BSD, etc.. developers are already providing and many users are already using for hardening efforts.
I think my POV is clear by now. ;-)

The current Haskell ecosystem is a bit obscure: There is already a GHC 9.2 alpha, while stack has just an alpha supporting 9.0, and Stackage & Haskell Language Server are still stuck at 8.10.4, so effectively only GHC <= 8.10.4 is usable for quite a few people, I guess. Setting up an up-to-date Haskell development environment is still a difficult process, involving various non-trivial and only lightly documented steps. (Example: Given a fully working Spacemacs, but no Haskell SW at all, try to set up your system for a stack-based workflow with GHC 9.0, including a formatter/linter/code completion/...)

In the meantime, people are discussing more and more esoteric type system extensions for the next GHC release and the advantages of relatively niche features like SafeHaskell. I'm totally aware of the fact that there are different people working on the different parts; nevertheless the overall emphasis is a bit... strange. :-/

Tom Ellis wrote:
On Fri, Apr 23, 2021 at 05:50:37PM +0200, Joachim Durchholz wrote:
I just disagree with the idea that making syscall-level sandboxes has a better ROI than making language checkers.
I'm curious whether there's anyone in this thread who takes a different point of view, in absolute terms.
The point of contention for me (and I would guess for others too) is whether meagre resources at our disposal should be put towards SafeHaskell and other Haskell-based language checkers, or we should just use what the (comparatively) large and experienced Linux, *BSD, etc.. developers are already providing and many users are already using for hardening efforts.
Surely the whole point of Haskell is that it does things differently from other languages. Right from the beginning static checks were valued over runtime ones, and I’m sad to see that this aspect of the language seems to be undervalued these days. -- Jón Fairbairn Jon.Fairbairn@cl.cam.ac.uk

On 2021-04-21 9:36 a.m., Sven Panne wrote:
On Wed, Apr 21, 2021 at 2:55 PM Mario <blamario@rogers.com> wrote:
[...] No general-purpose language (open field) is. You want to design from scratch, starting with a secure core language (high ground). You can use Haskell as an inspiration; Marlowe and probably some other blockchain languages do.
That's not totally correct: You can use anything you like when you have a sandbox while executing it. This is even safer than relying on a language (which can have conceptual and/or implementation bugs) alone. The attack surface of any non-trivial language, its implementation and its runtime is just too big for anything serious. Sandboxes are complex, too, but less so, and you implement them once and you can use them for many things. Having said that, my personal view is that Safe Haskell has almost no valid use case anymore, given the various sandboxing technologies available today. But that's just my 2c...
I'm disappointed you haven't continued the encampment analogy. That's especially unforgivable because it actually does provide an insight in this case. Allow me to analogize your argument: Instead of worrying about digging ditches and raising ramparts, why don't we just march into this friendly fort nearby, surrounded by a moat with a drawbridge? Let's just frolic inside.

To which the age-old answer is: why not both? Sure, use the ready fortifications if you can find them. But that doesn't mean you can just drop your guard, because the enemy could infiltrate any outer defense. Even assuming the fortifications are impenetrable, you'll have to open your gates and lower the drawbridge occasionally. So you still want to keep patrols, guards, inner security, etc. Defense in depth.

Analogy over: no sandbox will protect you from attacks like SQL injections or application-layer denial-of-service attacks. A type system designed for the purpose can protect you from both.

On Wed, Apr 21, 2021 at 08:45:03AM -0400, Mario wrote:
If your main goal is a defensive position you shouldn't be in an open field to begin with. You should encamp on a high ground, with plenty of nearby lumber to build ramparts, and a secure source of fresh water to withstand a siege. That kind of position limits your maneuverability, admittedly, but it's much easier to protect.
On the topic of whether safety can/should be in the language or be implemented via containerisation, much depends on the details of how the language goes about sandboxing.

Some decades back I've used Safe Tcl to limit what untrusted scripts can execute in a server process. With Safe Tcl one populates the untrusted interpreter with a very limited set of verbs and optional aliases into wrapper verbs that run in a separate trusted interpreter. The wrapper verbs can inspect the arguments of restricted commands and allow only appropriate access. Similar features are likely available by embedding Lua or another similar interpreter where it is possible to explicitly expose only a specific language dialect. Such designs tend to be robust.

On the other hand, I've always been sceptical of the sandboxing in Java; it was much too complex to be obviously correct, and the long history of bugs in Java applet sandboxing justified the scepticism. Fortunately we no longer use Java applets, though JavaScript security in browsers continues to be challenging.

So it should not be surprising that I'd also be sceptical of any strong safety claims from Safe Haskell. It seems unlikely to be simple enough to be "obviously correct". This does not mean that it can't be a useful element of defense in depth, but it is difficult to envision how it would be sufficient on its own.

The challenge with Safe Haskell, unlike with Safe Tcl, is that there is no hard separation between the untrusted and trusted components of the program; they're only separated by static analysis, not by separation into distinct execution engines with a controlled communication channel as with Safe Tcl and slave interpreters. For Safe Haskell to be safe, I'd want to see internal virtualisation, with untrusted code using a stripped down RTS that is incapable of supporting external I/O, and all unsafe operations proxied into a separate trusted RTS running separately compiled code. All unsafe operations in the untrusted RTS would then need to be RPC calls into the trusted RTS.

The difficulty, though, is that unlike the case with Tcl, many basic Haskell libraries that export safe pure interfaces internally use unsafe interfaces (e.g. raw memory access) for performance reasons. So it is rather unclear how to expose a usable untrusted subset of the language except through static analysis (type-level guarantees) of the sort that I'm reluctant to trust as a result of high complexity.

So while I take no position on whether Safe Haskell should or should not be abandoned, I agree that it is a valid question, because the problem is rather non-trivial, and it is not clear whether it is actually possible to implement a sufficiently robust design.

-- Viktor.

Am 21.04.21 um 20:24 schrieb Viktor Dukhovni: > On the other hand, I've always been sceptical of the sandboxing in Java, > it was much too complex to be obviously correct, and the long history of > bugs in Java applet sandboxing justified the scepticism. Fortunately we > longer use Java applets, though JavaScript security in browsers > continues to be challenging. Same here. > So it should not be surprising that I'd also be sceptical of any > strong safety claims from Safe Haskell. It seems unlikely to be > simple enough to be "obviously correct". This does not mean that it > can't be a useful element of defense in depth, but it is difficult to > envision how it would be sufficient on its own. > > The challenge with Safe Haskell, unlike with safe Tcl, is that there > is no hard separation between the untrusted and trusted components of > the program, they're only separated by static analysis, not > separation into distinct execution engines with a controlled > communication channel as with Safe Tcl and slave interpreters. For Safe Haskell, the substrate language is much easier to reason about - there's no way for static analysis to work well with Java (aliasing alone will preven that), but such a thing *is* possible with Haskell. Heck, ghc is doing deforestation and similar things, no such thing would be even remotely thinkable for any imperative language! So while I agree in principle, I believe Safe Haskell could be safer than the experience with sandboxing in an imperative language would be. The other thing is that the Java sandbox pretty obviously spent more time on writing enabling code than on validating that a new feature doesn't influence existing ones, and you need the opposite approach when doing sandboxes. I.e. I believe the Java sandbox failed not just because it's inherently more difficult, but also because the budget was too limited. > Haskell libraries that export safe pure interfaces, internally use > unsafe interfaces (e.g. raw memory access) for performance reasons. > So it is rather unclear how to expose a usable untrusted subset of the > language except through static analysis (type level guarantess) of the > sort that I'm reluctant to trust as a result of high complexity. Validating that an inference chain is correct is much easier. You need to have pretty simple axioms to make sure you don't get an inconsistent system, or one that doesn't prove the things you believe it proves. (This did indeed happen, IIRC the $ operator turned out to be much less safe than originally believed sometime around 2000.) > So while I take no position on whether Safe Haskell should or should not > be abandoned, I agree that it is a valid question because the problem is > rather non-trivial, and it is not clear whether it is actually possible to > implement a sufficiently robust design. Fully agreed. Not because the inference engine cannot be trusted, but because it's hard to set up a good set of axioms if you have things like unsafe operations and still want to prove that their use in a particular piece of is safe under Haskell's semantics. OTOH this is one of the things that I would have liked to see happen two decades ago and it didn't materialze; quite the opposite, the attempts at putting up a formal semantics for Haskell, while promising initially, were given up because it would be too much work, and that means any inference engine is walking on shifting ground. I do not feel very confident in *that* regard, I have to say. 
Still, it could be more trustworthy than syscall sandboxing, if Safe Haskell is usefully strict enough about its inferencing. I just don't have enough insight into Safe Haskell to make any predictions there :-)

Regards,
Jo

Hi Richard,
To add another place I think Safe Haskell is useful:
GHC's STM performs lazy validation, which means it allows execution of
"zombie" (or "doomed") transactions: a transaction that has observed
inconsistent reads from `TVar`s but has not yet detected this. Given
that, unsafe operations inside a transaction can introduce failures that
could not happen in a sequential execution. I argued in my thesis that
Safe Haskell is sufficient to prevent this; that is, libraries that are
Safe are safe to use in transactions.
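A minimal sketch of the hazard (illustrative code, not taken from the thesis; compile with -threaded to make the interleaving likely): the writer preserves the invariant a + b == 10 in every commit, yet a concurrently running zombie transaction can observe reads taken from two different commits before GHC validates it.

import Control.Concurrent (forkIO)
import Control.Concurrent.STM
import Control.Monad (forever, replicateM_, unless)
import System.IO.Unsafe (unsafePerformIO)

main :: IO ()
main = do
  a <- newTVarIO (5 :: Int)
  b <- newTVarIO (5 :: Int)
  _ <- forkIO . forever . atomically $ do   -- writer: keeps a + b == 10
    x <- readTVar a
    y <- readTVar b
    writeTVar a (x + 1)
    writeTVar b (y - 1)
  replicateM_ 100000 . atomically $ do      -- reader
    x <- readTVar a
    y <- readTVar b
    -- A doomed transaction may see x + y /= 10. A pure exception here
    -- would be harmless: GHC validates before propagating it and simply
    -- retries. But an unsafe effect runs *before* validation and cannot
    -- be rolled back:
    unless (x + y == 10) $
      return $! unsafePerformIO (putStrLn "effect escaped a doomed transaction")
  putStrLn "done"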
Thanks for your work getting this discussion rolling,
Ryan
On Tue, Apr 20, 2021 at 4:54 PM Richard Eisenberg
After keeping up with this thread, the reddit thread, and a Twitter thread, I started to write a GHC proposal to remove Safe Haskell. However, a conversation with Krzysztof Gogolewski (aka monoidal) and a post on reddit made me change my mind: we need to renovate Safe Haskell, not remove it. And doing so only really makes sense in the context of a larger security overhaul.
We are like a loose encampment in an open field with a few night sentries. Safe Haskell is a slightly-crumbling earthen rampart along two sides of the encampment. As such, it's really just an obstacle, and does little (but not nothing) to protect us. I was thinking to clear away the obstacle. But of course the better solution is to reinforce the rampart, build two more sides of it, and create a proper defensive position. This will be hard, and I do not propose to take charge of such an act now. But I recognize that the existing structure naturally forms part of this greater whole.
See also https://www.reddit.com/r/haskell/comments/msa3oq/safe_haskell/gv8vph9/, where I make similar, if not as evocative, comments.
Thanks much for the input here! Richard
On Apr 20, 2021, at 9:19 AM, Bertram Felgenhauer via Haskell-Cafe <haskell-cafe@haskell.org> wrote:
Bertram Felgenhauer via Haskell-Cafe wrote:
Unless the use case for which SafeHaskell was designed is common (and the replies here indicate that it's not), this is hard to justify.
The feedback here is not wholly representative.
There's a reddit thread [1] where djdlc points out
https://uniprocess.org/effects.html
This is interesting because it demonstrates that the notion of safety can be *refined* from its use by the `base` library in the context of DSLs, because one can express which notion of safety applies through types, and confine the code that is ultimately executed through the type system.
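A minimal sketch of the idea (module and names illustrative, not taken from the linked post): an abstract type whose constructor is not exported confines importing code to exactly the effects its author chose to expose, and Safe Haskell's role is to guarantee the abstraction cannot be broken with unsafeCoerce or unsafePerformIO.

{-# LANGUAGE Safe #-}
-- The Confined constructor stays private, so code written against this
-- module can only perform the two effects listed below.
module Confined (Confined, readInput, writeOutput, runConfined) where

newtype Confined a = Confined (IO a)

instance Functor Confined where
  fmap f (Confined io) = Confined (fmap f io)
instance Applicative Confined where
  pure = Confined . pure
  Confined f <*> Confined x = Confined (f <*> x)
instance Monad Confined where
  Confined x >>= f = Confined (x >>= \a -> case f a of Confined y -> y)

-- The entire effect vocabulary available to untrusted code:
readInput :: Confined String
readInput = Confined getLine

writeOutput :: String -> Confined ()
writeOutput = Confined . putStrLn

-- Only trusted code should call this.
runConfined :: Confined a -> IO a
runConfined (Confined io) = io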
Obviously this will still break down when the type system is subverted as in
https://gitlab.haskell.org/ghc/ghc/-/issues/9562
which Richard pointed out, or
https://gitlab.haskell.org/ghc/ghc/-/issues/19287
which wz1000 demonstrated on IRC. But these are terrible bugs anyway; it's just that SafeHaskell boosts their impact, from "code that people shouldn't write" to a potential security issue. Is anybody maintaining a list of these type-system unsoundness issues?
Apparently some people also enjoy the extra code discipline that producing Safe code requires (link by gentauro (=djdlc) on Freenode):
http://blog.stermon.com/articles/2019/02/21/the-main-reason-i-use-safe-haske...
Cheers,
Bertram
[1] https://reddit.com/r/haskell/comments/msa3oq/safe_haskell/ or https://teddit.net/r/haskell/comments/msa3oq/safe_haskell/

There's really only one situation where I've personally felt I needed
something like Safe Haskell—and Safe Haskell doesn't deliver it.
Specifically, it's very hard to know whether it's safe to use a particular
class method at a particular type.
On Fri, Apr 16, 2021, 3:03 PM Richard Eisenberg

I've been considering using it for safety-critical software, to prevent things similar to the event-stream fiasco, where someone took over maintenance of an npm library that was a transitive dependency of a bitcoin wallet application and injected malware that stole the users' secret keys and money. https://blog.npmjs.org/post/180565383195/details-about-the-event-stream-inci... Would Safe Haskell be effective against those kinds of attacks? It should allow using a large number of transitive dependencies without having to manually verify the safety of anything but the core trusted packages, right?
On 17 Apr 2021, at 03:02, Richard Eisenberg
wrote:

On Sun, Apr 18, 2021 at 05:43:47PM +0800, Andreas Källberg wrote:
Sounds unlikely, unless you're willing never to run an IO action:
Safe Haskell does not ensure that code inferred Safe, but living in IO, cannot perform arbitrary IO.
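A minimal sketch of why (module and paths made up): the following compiles under -XSafe without complaint, because nothing in it is unsafe in Safe Haskell's sense; it is merely malicious.

{-# LANGUAGE Safe #-}
-- No unsafePerformIO, no unsafeCoerce, no FFI: perfectly Safe. Yet any
-- application that runs `greet` in IO gets robbed. (Paths illustrative.)
module Innocent (greet) where

greet :: IO ()
greet = do
  putStrLn "hello!"
  keys <- readFile "/home/user/.wallet/secret-key"  -- hypothetical target
  writeFile "/tmp/exfiltrated" keys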

On Sun, 18 Apr 2021, Tom Ellis wrote:
On Sun, Apr 18, 2021 at 05:43:47PM +0800, Andreas Källberg wrote:
I've been considering using it for safety-critical software to prevent things similar to the event-stream fiasco from happening, where someone took over maintenance of an npm library that was a transitive dependency of a bitcoin wallet application and injected malware that stole the users' secret keys and money. https://blog.npmjs.org/post/180565383195/details-about-the-event-stream-inci...
Would Safe Haskell be effective against those kinds of attacks? It should allow using a large amount of transitive dependencies, without having to manually verify the safety of anything but the core trusted packages, right?
Sounds unlikely unless you're willing to never run an IO action:
In safety-critical code you might not use bare IO, but rather a wrapper or a type class with a trusted set of primitive methods.
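A minimal sketch of that pattern (all names illustrative): untrusted code is written polymorphically against a class of vetted primitives, so it can reach no effects beyond the class methods; trusted code chooses the instance that touches real IO.

-- The only effects untrusted code can use are the methods of SafeOps.
class Monad m => SafeOps m where
  logLine   :: String -> m ()
  loadValue :: String -> m Int

-- Trusted interpreter (in a real design this would live in a module
-- untrusted code cannot import).
newtype App a = App { runApp :: IO a }

instance Functor App where
  fmap f (App io) = App (fmap f io)
instance Applicative App where
  pure = App . pure
  App f <*> App x = App (f <*> x)
instance Monad App where
  App x >>= f = App (x >>= runApp . f)

instance SafeOps App where
  logLine = App . putStrLn
  loadValue key = App (pure (length key))  -- stand-in for a vetted lookup

-- Untrusted code: polymorphic in m, so only SafeOps methods are reachable.
untrusted :: SafeOps m => m ()
untrusted = loadValue "balance" >>= logLine . show

main :: IO ()
main = runApp untrusted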

Maybe the problem is really that Safe Haskell is not taken to be a default good practice?

For example, it is easy to know that I should be setting `-Wall`, but hard to know that I should also enable some other warnings not included in `-Wall`. I know the latter because I read some blog posts somewhere and figured there is a consensus that these additional warnings are good, so I enable them by default.

For Safe Haskell, I have not read any such blog posts and I have no idea whether it is a good practice to enable it. This feature is not discoverable. This may be a marketing problem.

I have a template for my Haskell projects that enables a bunch of extra warnings, language extensions that I like to have enabled by default, dependencies that I wish were in `base`, and so on. Should Safe Haskell go into that template?

I don't think adding {-# LANGUAGE Safe #-} (as we do with -Wall) is a good idea. The culprit is Safe-inference. While the authors of the original Safe Haskell paper argue that it makes more modules Safe without the package maintainer knowing about Safe Haskell, problems on Hackage have shown that this is not the case. We simply cannot rely on Safe-inferred modules (continuing) to be Safe when new versions are released, or worse, when their transitive dependencies change. There are various examples; I have made such mistakes myself ([1] is the most recent one), and [5] is an example of a "dependency" change, luckily caught by explicit pragmas.

I have implemented -Winferred-safe-imports [2] to warn about such cases, and also -Wmissing-safe-haskell-mode [3], and used them successfully to clean up some Safe Haskell mess in Edward Kmett's packages [4]. (If you rely on older versions of the listed packages being Safe: don't, you are asking for trouble.)

Another alternative, which IIRC Edward used previously, is to just stamp {-# LANGUAGE Trustworthy #-} everywhere. But that undermines the whole idea in my opinion; it's almost as bad as saying nothing, since every Trustworthy module has to be audited. Yet you have to use Trustworthy if your package uses Data.Coerce, or depends on the vector package, or...

So while I and others have put many hours into making more of the ecosystem Safe, I feel it is work with a very bad ROI. (I wouldn't have done a scan like [4] if there weren't also another reason to do it.)

While people may freely discuss whether Safe Haskell or sandboxes are the better setup for running untrusted code, Safe Haskell has been around for almost a decade. If lambdabot is really the only application people can publicly talk about, I just don't feel that's good enough motivation to keep the language feature and add extra burden on package maintainers. It is one more thing to know about, as if Haskell package authoring and maintenance weren't complicated enough already. As I mentioned, Safe-inference just doesn't work in practice, so package maintainers have to be aware of it. (Compare with the good practice of having type signatures for top-level bindings: safety is a part of the module's signature.)

TL;DR Safe Haskell requires buy-in from every maintainer, but there are barely any users. For how much longer do we need to run this "academic experiment"?

- Oleg

[1]: https://github.com/ekmett/free/pull/204
[2]: https://downloads.haskell.org/ghc/latest/docs/html/users_guide/exts/safe_has...
[3]: https://downloads.haskell.org/ghc/latest/docs/html/users_guide/exts/safe_has...
[4]: https://github.com/ekmett/lens/issues/959
[5]: https://github.com/haskell/time/compare/1.11.1...1.11.1.1
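As a minimal illustration of the fragility (an illustrative module, not from any of the linked packages): the module below carries no pragma, so GHC infers it Safe today; nothing in the module itself changes when that silently stops being true.

-- No Safe Haskell pragma: GHC infers this module Safe because all of its
-- imports are Safe-importable today. If a new release of a dependency
-- stops being Safe-inferred (or someone swaps in System.IO.Unsafe), the
-- module silently stops being Safe, with no warning unless
-- -Wmissing-safe-haskell-mode or an explicit {-# LANGUAGE Safe #-}
-- pragma is in play.
module Counter (next) where

import Data.IORef (IORef, atomicModifyIORef')

next :: IORef Int -> IO Int
next r = atomicModifyIORef' r (\n -> (n + 1, n))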

On Apr 28, 2021, at 11:51 AM, Oleg Grenrus
wrote: TL;DR Safe Haskell requires buy-in from every maintainer, but there are barely any users. For how much longer do we need to run this "academic experiment"?
This is a good way to put it. The reason I've swung around in favor of keeping what we have is that I think *some* structure like Safe Haskell is needed for proper security. The idea is that an author should have to trust only a few packages (like `bytestring`), and these packages can advertise their wide level of trust in a central location, like Hackage. This trust system exists now, but it's not widely advertised. If we were to remove Safe Haskell, this trust system would disappear, only (likely) to be replaced by a very similar trust system required by the successor to Safe Haskell. So the real costs associated with discussing how best to remove the feature, then removing it, then having libraries remove it, seem not quite worth paying.

On the other hand, perhaps you've implicitly suggested something: just disable Safe-inference. That is, every module starts off Unsafe. This could easily be overridden at the package level with default-extensions: Safe in a cabal file. If we had no Safe-inference, would that solve the library-level problems? It would certainly remove a good deal of the complexity within GHC!

Richard

Removing inference: yes, IMO it would be better than the current state of affairs. Please write a proposal.

However, `default-extensions: Safe` wouldn't just work. Often some modules are just Trustworthy, or even plain Unsafe (e.g. in bytestring). A file with both

{-# LANGUAGE Safe #-}
{-# LANGUAGE Trustworthy #-}

errors with ghc:

SH.hs:2:14-24: Incompatible Safe Haskell flags! (Safe, Trustworthy)

(the same if you ghc -XSafe TrustworthyModule.hs).

We can argue back and forth about whether we should be able to override the Safe Haskell status:

- Overriding would make sense, as -XHaskell2010 -XGHC2021 -XHaskell2010 works.
- OTOH, safety is peculiar, so there probably was a discussion justifying the current behavior, which is to prohibit overriding. (Probably one argument is that if the extensions are in the module, then sorting them shouldn't change the result! With Safe Haskell such a change is harder to spot - you have to write tests - than with mixing up Haskell98/2010/GHC2021.)

Yet, I'll be happy with any option.

- Oleg

On 28.4.2021 21.07, Richard Eisenberg wrote:

On 03/05/2021 12:53, Oleg Grenrus wrote:
> ...
> - OTOH, safety is peculiar, so there probably was a discussion
>   justifying the current behavior, which is to prohibit overriding.
>   (Probably one argument is that if the extensions are in the module,
>   then sorting them shouldn't change the result! With Safe Haskell
>   such a change is harder to spot - you have to write tests - than
>   with mixing up Haskell98/2010/GHC2021.)
I'm just a user, but I would be quite surprised if sorting the list of language extension pragmas in a module changes the semantics -- particularly for something safety-related. I would accept this behaviour for the Haskell98/... flags because if you have multiple of those explicitly listed in a file, you're doing something wrong anyway. But perhaps I'm being ignorant of other existing cases where this already matters, and I've been living in an idealised world until now. - Tom

On May 3, 2021, at 7:40 AM, Tom Smeding
wrote: But perhaps I'm being ignorant of other existing cases where this already matters, and I've been living in an idealised world until now.
Sad to say it, but order does matter here.

In the very simple case, {-# LANGUAGE FlexibleContexts, NoFlexibleContexts #-} is different from {-# LANGUAGE NoFlexibleContexts, FlexibleContexts #-} -- later extensions override earlier ones. This problem becomes more confounding when we recognize that some extensions imply others. For example, {-# LANGUAGE TypeFamilies, NoMonoLocalBinds #-} means something different from {-# LANGUAGE NoMonoLocalBinds, TypeFamilies #-}, because TypeFamilies implies MonoLocalBinds. Perhaps even worse, {-# LANGUAGE CUSKs, StandaloneKindSignatures #-} differs from {-# LANGUAGE StandaloneKindSignatures, CUSKs #-}, because StandaloneKindSignatures implies NoCUSKs.

Returning to Safe Haskell: it's true that Safe cannot be overridden locally. This is implemented by the fact that NoSafe does not exist. To me, this design makes sense, because it means that compiling with `ghc -XSafe` is guaranteed to use Safe Haskell. So we would need something like a default-safety field in Cabal that could be overridden locally. But, still, this may be easier than the status quo.

Do we think this would work? Specifically:

* Introduce a new flag -fdefault-safety={safe,trustworthy,unsafe} that changes the module-level default. This default names the safety level in effect for any module that declares none of Safe, Trustworthy, or Unsafe.
* If -fdefault-safety is not specified at the command line, it is as if the user wrote -fdefault-safety=unsafe.

And that's it.

Consequence: Safe-inference would never take place, because every module would have a declared level of safety. The Safe-inference code could thus be removed.

Further work: introduce default-safety in Cabal, but that's not really necessary to make the changes above.

What do we think?

Richard
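To see the TypeFamilies/MonoLocalBinds interaction concretely, here is a small illustrative program that one pragma order accepts and the other rejects:

{-# LANGUAGE TypeFamilies, NoMonoLocalBinds #-}
-- With this order, NoMonoLocalBinds wins (later pragmas override), so
-- the local binding g is generalized and the module compiles. Flip the
-- order to {-# LANGUAGE NoMonoLocalBinds, TypeFamilies #-} and
-- TypeFamilies re-enables MonoLocalBinds: g, which is not closed (it
-- mentions y), stays monomorphic, and (g 'c', g True) is rejected.
module OrderMatters where

f :: a -> (Char, Bool)
f y = let g x = const x y in (g 'c', g True)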

On 2021-05-03 10:25 a.m., Richard Eisenberg wrote:
It's true that Safe cannot be overridden locally. This is implemented by the fact that NoSafe does not exist. To me, this design makes sense, because it means that compiling with `ghc -XSafe` is guaranteed to use Safe Haskell. So we would need something like a default-safety field in Cabal, that could be overridden locally.
But, still, this may be easier than the status quo.
I don't see why this should be easier. I do understand how type checking is easier than type inference, but in this case the "type" of the module (and every declaration in it) is a single bit: Safe or Unsafe. How can checking whether a declaration is safe be easier than inferring whether it is? I must be misunderstanding your proposal.

Default safety isn’t quite what you want in normal packages, because that disables all userland rewrite rules! It may have other implications too, but short of augmenting ghc with a proof system for correctness of rewrite rules, default SafeHaskell is at odds with optimized builds.
On Mon, May 3, 2021 at 10:28 AM Richard Eisenberg

The GHC manual says [1]:

    RULES — Rewrite rules defined in a module M compiled with Safe are
    dropped. Rules defined in Trustworthy modules that M imports are
    still valid and will fire as usual.

So rules are still in use, and e.g. list fusion works. Library authors cannot define their own RULES in Safe modules, but they are (always, no need for -Wall) warned that these rules are ignored; see the sketch below. I don't see a possibility for silent performance regressions.

- Oleg

[1]: https://downloads.haskell.org/ghc/9.0.1/docs/html/users_guide/exts/safe_hask...
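A made-up module to illustrate: compiled with -XSafe, GHC warns that the user-defined rule below will be ignored, while base's own fusion rules, which live in Trustworthy modules, keep firing for the sum/map pipeline.

{-# LANGUAGE Safe #-}
module Fused (sumSquares) where

-- Under -XSafe, GHC warns that this rule will be ignored and drops it;
-- the fusion rules from base are unaffected, because they are defined
-- in Trustworthy modules that this module imports.
{-# RULES "sumSquares/double" forall xs. sumSquares (xs ++ xs) = 2 * sumSquares xs #-}

sumSquares :: [Int] -> Int
sumSquares xs = sum (map (^ (2 :: Int)) xs)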

Are you sure? It also says: "The use of Safe to compile Danger restricts the features of Haskell that can be used to a safe subset. This includes disallowing unsafePerformIO, Template Haskell, pure FFI functions, RULES and restricting the operation of Overlapping Instances."
On Tue, May 4, 2021 at 8:32 AM Oleg Grenrus

Hrmm:

    RULES — Rewrite rules defined in a module M compiled with Safe
    (https://downloads.haskell.org/ghc/latest/docs/html/users_guide/exts/safe_has...)
    are dropped. Rules defined in Trustworthy modules that M imports are
    still valid and will fire as usual.

Huh, I suppose I'm wrong! I thought it was different, or more subtle than that.
On Tue, May 4, 2021 at 2:38 PM Carter Schonwald
participants (18)

- Andreas Källberg
- Askar Safin
- Bertram Felgenhauer
- Carter Schonwald
- Chris Smith
- David Feuer
- Henning Thielemann
- Ignat Insarov
- Joachim Durchholz
- Jon Fairbairn
- Mario
- Oleg Grenrus
- Richard Eisenberg
- Ryan Yates
- Sven Panne
- Tom Ellis
- Tom Smeding
- Viktor Dukhovni