
Hi!
I just got an idea for a Hackage feature. All functions/modules listed there could carry a mark if they, or any function/module they use, use an unsafe* function. Of course this will probably mark almost everything as unsafe, but that is the idea: to raise awareness, so that you can prefer one function/implementation over another.
Of course marking/tagging everything as unsafe is not really useful. Because of this I propose that the community then votes/vouches on the correctness/stability of implementations, and that this influences how unsafe a given function really is (or is according to the community, to be more precise). Of course it would be even better if every function using unsafe* also had a formal proof, but as we cannot expect to prove everything in the foreseeable future, we could maybe opt for such a "crowd intelligence" approach. We cannot have a Turing machine, but maybe we can have a crowd. ;-)
(Of course a low number of found bugs and good unit test code coverage can then positively influence the crowd, so authors would be motivated to ensure that.)
Comments? Opinions?
Because I really hate that I try to keep my code pure and separate IO from everything else, and then somewhere deep in there some unsafe* lurks. (Ah, yes, a side effect of this tagging/marking would be that you could see where all those unsafe* calls are for a given function, so you could quickly jump (with a link) to a given line in the code and evaluate the circumstances in which that unsafe* call is made. And then vote/vouch once you discover that it is probably pretty safe.)
Mitar

On 16 September 2010 16:04, Mitar wrote:
Hi!
I just got an idea for hackage feature. All functions/modules listed there could have some mark if they or any function/module they use uses an unsafe* function. Of course this will make probably almost everything marked as unsafe, but this is the idea - to raise awareness about that so that you can prefer some function/implementation over another. ...
The problem with this is: unsafe* functions would be better called "yesIGuaranteeThatUsingThisFunctionDoesResultInAReferentiallyTransparentEntityAndItsOKForMeToUseIt*". They are "unsafe" in that you shouldn't use them blindly.
Seeing as how lazy IO relies on various unsafe* functions, as do bytestrings, this means that any program that uses them is subsequently "tainted".
A much better idea would be to have some kind of compilation warning unless you can prove that you're using the unsafe* function in a safe fashion, but such a proof is unlikely to be easy to write rigorously or to check mechanically (and checking it would slow down compilation).
-- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

As we were discussing in #haskell, it would have to be more involved than just a taint bit. A listing showing the "taint sources" of a given package would give you confidence in its good behavior. For example, if my nice, pure package's taint list showed that my only taint sources were through my dependencies on base and bytestring, people could trust my program not to launch missiles behind their backs.
This would essentially involve traversing transitive dependencies and looking for any module that imports one of the unsafe modules and uses any function from them (including GHC.Prim). This seems like it could be done fairly efficiently, if you just try to do it on the module or package level. At the function level, it would be a lot more computationally intensive, but would be more or less the same idea.
I like the idea, but I don't think any formal proof of unsafeness is feasible. We already can't prove arbitrary properties about even our pure code, and proving stuff about impure code would require modeling the outside world and proving your code safe under that model. Anyone reading the proof would have to accept your model of the outside world as well as verifying your proof.
But maybe one day we'll have way more than just "Stability: experimental; Version: 0.0.1" on hackage, but instead:
Stability: experimental
Version: 0.0.1
Test coverage: 98%
User stability rating: 86%
User API quality rating: 56%
Local sources of impurity: none
Transitive sources of impurity: bytestring, base
Used by: 37 packages [click to see them]
But that's just a dream, and the impurity measures seem like a decent goal in the meantime :)
Dan
On Thu, Sep 16, 2010 at 8:21 AM, Ivan Lazar Miljenovic <ivan.miljenovic@gmail.com> wrote:
The problem with this is: unsafe* functions would be better called "yesIGuaranteeThatUsingThisFunctionDoesResultInAReferentiallyTransparentEntityAndItsOKForMeToUseIt*". They are "unsafe" in that you shouldn't use them blindly. ...
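The package-level traversal described in Dan's message could be sketched roughly as follows. This is only an illustration under stated assumptions: the `PackageInfo` record, its `usesUnsafe` flag, and the package names `my-package` and `my-pure-lib` are hypothetical, and a real index would have to derive the flag from the sources instead of taking it on trust.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type Package = String

-- | Hypothetical per-package facts that a Hackage-like index would record.
data PackageInfo = PackageInfo
  { dependsOn  :: [Package]  -- direct dependencies
  , usesUnsafe :: Bool       -- does the package itself call an unsafe* function?
  }

type Index = Map.Map Package PackageInfo

-- | Every package reachable from the root through its dependencies.
transitiveDeps :: Index -> Package -> Set.Set Package
transitiveDeps index root = go Set.empty (directDeps root)
  where
    directDeps p = maybe [] dependsOn (Map.lookup p index)
    go seen [] = seen
    go seen (p : ps)
      | p `Set.member` seen = go seen ps
      | otherwise           = go (Set.insert p seen) (directDeps p ++ ps)

-- | "Local sources of impurity": does the package itself use unsafe*?
localImpurity :: Index -> Package -> Bool
localImpurity index p = maybe False usesUnsafe (Map.lookup p index)

-- | "Transitive sources of impurity": dependencies that use unsafe*.
transitiveImpurity :: Index -> Package -> Set.Set Package
transitiveImpurity index root =
  Set.filter (localImpurity index) (transitiveDeps index root)

-- | Toy index. The flags are assumptions made for illustration only.
exampleIndex :: Index
exampleIndex = Map.fromList
  [ ("base",        PackageInfo [] True)
  , ("bytestring",  PackageInfo ["base"] True)
  , ("my-pure-lib", PackageInfo ["base"] False)
  , ("my-package",  PackageInfo ["bytestring", "my-pure-lib"] False)
  ]
```

With this toy index, `transitiveImpurity exampleIndex "my-package"` lists base and bytestring, while `localImpurity exampleIndex "my-package"` is `False`, which matches the "Local sources of impurity: none" line in the mock-up. Doing the same at the function level would need a call graph in place of the dependency map.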

On 16 September 2010 17:00, Daniel Peebles wrote:
But maybe one day we'll have way more than just "Stability: experimental; Version: 0.0.1" on hackage, but instead:
Stability: experimental
Version: 0.0.1
Test coverage: 98%
User stability rating: 86%
User API quality rating: 56%
Local sources of impurity: none
Transitive sources of impurity: bytestring, base
Used by: 37 packages [click to see them]
But that's just a dream, and the impurity measures seem like a decent goal in the mean time :)
Problem is: whilst we might be able to derive the impurity stuff, and the usage figure can be obtained by examining reverse dependencies, I'm not sure how you would categorise the stability and quality. Furthermore, the test coverage bit presumably requires that developers enable HPC when building tests, and if those tests are optional then wouldn't HPC's figures be slightly off, since the test suite is now included in the amount of source you have compared to what most people see?
-- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Yeah, those other things were part of the "bigger picture" that I hope hackage will get some day: two axes of user rating, plus optional support for visualizing things like test coverage and regression tests that you include in your cabal file (cabal test was being worked on during one of this year's GSOC for example). I was just trying to show what this would get us one step closer to :P
On Thu, Sep 16, 2010 at 11:47 AM, Ivan Lazar Miljenovic <ivan.miljenovic@gmail.com> wrote:
Problem is: whilst we might be able to derive the impurity stuff, and the usage is done by examining reverse dependencies, I'm not sure how you would categorise the stability and quality. ...

Yeah, those other things were part of the "bigger picture" that I hope hackage will get some day: two axes of user rating, plus optional support for visualizing things like test coverage and regression tests that you include in your cabal file (cabal test was being worked on during one of this year's GSOC for example). I was just trying to show what this would get us one step closer to :P
I'd be happy even if there was a mandatory "what's new" field on packages. I see version numbers flashing by, but that doesn't really tell me what's happening.
V

On Thu, Sep 16, 2010 at 1:43 PM, Ville Tirronen wrote:
I'd be happy even if there was a mandatory "what's new" field on packages. I see version numbers flashing by, but that doesn't really tell me what's happening.
See:
http://hackage.haskell.org/trac/hackage/ticket/299
http://hackage.haskell.org/trac/hackage/ticket/244
I try to add NEWS files to my packages' source repositories[1], but it sure would be nice if this file was directly shown on hackage.
Regards,
Bas
[1] http://code.haskell.org/~basvandijk/

On 9/16/10 9:44 AM, Bas van Dijk wrote:
I try to add NEWS files to my packages' source repositories[1], but it sure would be nice if this file was directly shown on hackage.
Agreed. For now I sometimes put the most recent release notes in the description, like http://hackage.haskell.org/package/rss2irc-0.4 .

Ivan Lazar Miljenovic wrote:
On 16 September 2010 16:04, Mitar wrote:
Hi!
I just got an idea for hackage feature. All functions/modules listed there could have some mark if they or any function/module they use uses an unsafe* function. Of course this will make probably almost everything marked as unsafe, but this is the idea - to raise awareness about that so that you can prefer some function/implementation over another. ...
The problem with this is: unsafe* functions would be better called "yesIGuaranteeThatUsingThisFunctionDoesResultInAReferentiallyTransparentEntityAndItsOKForMeToUseIt*". They are "unsafe" in that you shouldn't use them blindly.
I think such a long and descriptive name would be helpful, since there seem to be many programmers who do not know that functions using unsafePerformIO must be referentially transparent.
My suggestion is to move the Unsafe modules to a new package 'unsafe'. Then you can easily spot all "dirty" packages by looking at the reverse dependencies of 'unsafe'. However, I see the problem that there are two kinds of uses of 'unsafePerformIO':
1. "Final" usage that turns some IO code into something that looks like non-IO code and thus must behave that way.
2. "Interim" usage that provides new kinds of unsafe* functions, which delegate the proof obligation to their users.
Packages that provide functions of type 2 must be treated like the 'unsafe' package itself, which complicates the reverse dependency lookup.
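To make the distinction between the two kinds of usage concrete, here is a minimal sketch. The function names `sumViaIORef` and `unsafeReadRef` are invented for this illustration and do not come from any real package.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import System.IO.Unsafe (unsafePerformIO)

-- 1. "Final" usage: the IO is sealed inside. The mutable reference never
-- escapes, and the result depends only on the argument, so the author can
-- discharge the proof obligation once, here.
sumViaIORef :: [Int] -> Int
sumViaIORef xs = unsafePerformIO $ do
  ref <- newIORef 0
  mapM_ (\x -> modifyIORef' ref (+ x)) xs
  readIORef ref

-- 2. "Interim" usage: a new unsafe* function. Whether a call is
-- referentially transparent depends on what the caller does with the
-- reference, so the proof obligation is passed on to every user.
unsafeReadRef :: IORef a -> a
unsafeReadRef ref = unsafePerformIO (readIORef ref)
```

A package exporting only `sumViaIORef` gives its users nothing dangerous, whereas a package exporting `unsafeReadRef` would have to be treated like the 'unsafe' package itself in a reverse dependency lookup. (In real code the first function would more naturally be written with `runST`; it is used here only to show a sealed `unsafePerformIO`.)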

On 17 September 2010 03:18, Henning Thielemann
Ivan Lazar Miljenovic wrote:
The problem with this is: unsafe* functions would be better called "yesIGuaranteeThatUsingThisFunctionDoesResultInAReferentiallyTransparentEntityAndItsOKForMeToUseIt*". They are "unsafe" in that you shouldn't use them blindly.
I think such a long and descriptive name would be helpful, since there seem to be many programmers who do not know that functions using unsafePerformIO must be referentially transparent.
Maybe we need more documentation then? A better question is "why are you using unsafePerformIO?" (Then again, I know a first-year student who used it to get random numbers out of System.Random for an assignment, when randomness wasn't needed, because he didn't understand the explicit passing stuff we do and didn't bother asking.)
My suggestion is to move the Unsafe modules to a new package 'unsafe'. Then you can easily spot all "dirty" packages by looking at reverse dependencies of 'unsafe'.
Hooray, yet another supposedly stand-alone library that GHC will depend on and thus can't be upgraded anyway, so there's no real advantage of making it stand-alone (after all, doesn't base use unsafeInterleaveIO or something for lazy IO?).
-- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

On Fri, Sep 17, 2010 at 1:44 AM, Ivan Lazar Miljenovic
On 17 September 2010 03:18, Henning Thielemann
My suggestion is to move the Unsafe modules to a new package 'unsafe'. Then you can easily spot all "dirty" packages by looking at reverse dependencies of 'unsafe'.
Hooray, yet another supposedly stand-alone library that GHC will depend on and thus can't be upgraded anyway, so there's no real advantage of making it stand-alone (after all, doesn't base use unsafeInterleaveIO or something for lazy IO?).
Well, it's not like we plan on regularly fiddling with that API :)
The clever thing about this suggestion is that most packages don't *export* equivalent power to unsafePerformIO even if they import it (inlinePerformIO from bytestring is a notable exception) so you can easily see from a library's *immediate* dependencies whether it could potentially do anything naughty or not. Also, it's implementable entirely with existing technology, although we'll probably want a major base version bump to remove the modules.
When discussing this sort of "taint" I think it's important not to forget that the FFI can be just as bad. For a start, one common use of unsafe functions is to provide a pure API to a foreign library (as is done in RWH), and clearly in such cases the proof of correctness cannot exist in the language because it depends on properties of libraries, which may not even be linked until runtime. Secondly, FFI imports are almost as bad safety-wise as System.IO.Unsafe, and twice as impossible to prove correct. So your taint measure should take into account use of that extension, too.
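As a small illustration of why the FFI needs to be counted too: a foreign import can be given a pure type directly, with no unsafe* function anywhere in the source. The sketch below imports the C `sin` function; the Haskell names `c_sin` and `pureSin` are chosen for this example only.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CDouble (..))

-- The type signature is a promise made by the programmer. The compiler
-- cannot check that the C function really is free of side effects.
foreign import ccall "math.h sin" c_sin :: CDouble -> CDouble

-- A pure wrapper around the foreign function, with no mention of
-- unsafePerformIO or System.IO.Unsafe.
pureSin :: Double -> Double
pureSin = realToFrac . c_sin . realToFrac
```

For `sin` the promise happens to be reasonable, but the same declaration form would be accepted for a C function that writes to a global variable or to disk. A taint listing based only on imports of the Unsafe modules would not notice this module at all.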

On 17 September 2010 10:12, Ben Millwood wrote:
On Fri, Sep 17, 2010 at 1:44 AM, Ivan Lazar Miljenovic wrote:
On 17 September 2010 03:18, Henning Thielemann wrote:
My suggestion is to move the Unsafe modules to a new package 'unsafe'. Then you can easily spot all "dirty" packages by looking at reverse dependencies of 'unsafe'.
Hooray, yet another supposedly stand-alone library that GHC will depend on and thus can't be upgraded anyway, so there's no real advantage of making it stand-alone (after all, doesn't base use unsafeInterleaveIO or something for lazy IO?).
Well, it's not like we plan on regularly fiddling that API :)
The clever thing about this suggestion is that most packages don't *export* equivalent power to unsafePerformIO even if they import it (inlinePerformIO from bytestring is a notable exception) so you can easily see from a library's *immediate* dependencies whether it could potentially do anything naughty or not. Also, it's implementable entirely with existing technology, although we'll probably want a major base version bump to remove the modules.
Couldn't that information be discovered by Hackage simply grepping the sources? Surely if all you want to know is whether a package calls unsafePerformIO directly, that is the simplest way.
Grepping would also find callers of inlinePerformIO, which would be far more useful than tainting every package that depends on bytestring just because it might call that function.
Conrad.
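A rough sketch of what such a source scan might look like is given below. It is purely textual, the `flagged` list is an illustrative and incomplete assumption, and the names `tokens` and `findUnsafe` are made up for this example.

```haskell
import Data.Char (isAlphaNum)
import Data.List (nub)

-- | Identifiers to flag. An illustrative, incomplete list.
flagged :: [String]
flagged = ["unsafePerformIO", "unsafeInterleaveIO", "inlinePerformIO"]

-- | Split a line of source text into identifier-like tokens.
tokens :: String -> [String]
tokens s = case dropWhile (not . isIdentChar) s of
  "" -> []
  s' -> let (tok, rest) = span isIdentChar s' in tok : tokens rest
  where
    isIdentChar c = isAlphaNum c || c == '_' || c == '\''

-- | Line numbers (1-based) paired with the flagged identifier found there.
findUnsafe :: String -> [(Int, String)]
findUnsafe src =
  [ (n, tok)
  | (n, line) <- zip [1 ..] (lines src)
  , tok <- nub (tokens line)
  , tok `elem` flagged
  ]
```

Because the line numbers are kept, the result could also back the "jump to the unsafe* call" links from the original proposal. Being textual, the scan also reports occurrences in comments and string literals, and it misses unsafe functions re-exported under a different name, so it approximates what a real dependency analysis would find.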
participants (9)
- Bas van Dijk
- Ben Millwood
- Conrad Parker
- Daniel Peebles
- Henning Thielemann
- Ivan Lazar Miljenovic
- Mitar
- Simon Michael
- Ville Tirronen