Addition to unix: raw ByteString APIs

I propose to commit the attached patch to the unix package and release
it with GHC 7.4.1. The commit log is reproduced below. Comments please!
The unix version number will of course be bumped appropriately.
Cheers,
Simon
commit d5e43be90d3c6f8869dd2b0c65800c9a6dd0ac70
Author: Simon Marlow

I'm in favor. Two comments:
* System/Posix/ByteString/FilePath.hsc sticks out a bit as it's the only module that doesn't follow the Foo.Bar.ByteString pattern (i.e. ByteString as the leaf module).
* Should we newtype RawSystemPath? I cannot come up with a really good argument for it, but every time we don't hide our representations we end up getting screwed (see String, FilePath). We could provide an IsString instance and toPath/fromPath (or similarly named) helpers.
-- Johan
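
To make the newtype suggestion above concrete, here is a minimal sketch of what such a wrapper could look like. The names RawPath, toPath and fromPath are hypothetical (the message only says "or similarly named"), and nothing here is part of the proposed patch, which uses a plain type alias.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as BC
import Data.String (IsString (..))

-- Hypothetical opaque wrapper around the raw bytes of a path.
newtype RawPath = RawPath ByteString
  deriving (Eq, Ord, Show)

-- Lets string literals be used as paths under OverloadedStrings.
-- BC.pack truncates characters above 0xFF, so this is only a sketch.
instance IsString RawPath where
  fromString = RawPath . BC.pack

-- Hypothetical conversion helpers.
toPath :: ByteString -> RawPath
toPath = RawPath

fromPath :: RawPath -> ByteString
fromPath (RawPath bs) = bs

main :: IO ()
main = do
  let p = "/etc/passwd" :: RawPath
  BC.putStrLn (fromPath p)
  print (toPath (fromPath p) == p)
```

The trade-off is the usual one: callers can no longer apply ByteString functions to a path without going through fromPath, which hides the representation at the cost of some convenience.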

- There is a new function
System.Posix.ByteString.getArgs :: [ByteString]
returning the raw untranslated arguments as passed to exec() when the program was started.
Is this one similar to the [String] getArgs in that it drops unix's argv[0]? I was recently surprised by that in the standard getArgs because I wanted a program to restart itself. I can't figure out how to do that without access to argv[0]. I suppose for consistency the ByteString version should have the same behaviour, so maybe this is just an opportunity to wonder why it does that in the first place.

On 11 November 2011 18:16, Evan Laforge wrote:
- There is a new function
System.Posix.ByteString.getArgs :: [ByteString]
returning the raw untranslated arguments as passed to exec() when the program was started.
Is this one similar to the [String] getArgs in that it drops unix's argv[0]? I was recently surprised by that in the standard getArgs because I wanted a program to restart itself. I can't figure out how to do that without access to argv[0].
I suppose for consistency the ByteString version should have the same behaviour, so maybe this is just an opportunity to wonder why it does that in the first place.
System.Environment exports:
getProgName :: IO String
maybe System.Posix.ByteString should export a similar function:
getProgName :: IO ByteString
Bas

System.Environment exports:
getProgName :: IO String
maybe System.Posix.ByteString should export a similar function:
getProgName :: IO ByteString
Yeah, that's actually not the same as argv[0], it has the path to the binary stripped. So you can't really use it to restart yourself because you have no way to know what directory the binary was in. It's frustrating because you can see in the source that it's going to some effort to intentionally strip off information that you can't get elsewhere. Anyway, it probably would make sense to have the ByteString version since it's hand-in-hand with getArgs and is a FilePath.
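
The behaviour described above can be checked with a small sketch that uses only the existing String API from System.Environment. The helper name report is hypothetical and exists only to format the output.

```haskell
module Main where

import System.Environment (getArgs, getProgName)

-- Hypothetical helper: formats the two values for display.
report :: String -> [String] -> [String]
report name args =
  [ "getProgName: " ++ name       -- program name with the directory part stripped
  , "getArgs: " ++ show args      -- argv[1..], without argv[0]
  ]

main :: IO ()
main = do
  name <- getProgName
  args <- getArgs
  mapM_ putStrLn (report name args)
```

Running the binary through a path such as ./dist/tool -v shows that neither value carries the ./dist/ part. Later releases of base (4.6 and up) also export System.Environment.getExecutablePath, which postdates this thread.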

On Fri, Nov 11, 2011 at 7:59 PM, Evan Laforge wrote:
System.Environment exports:
getProgName :: IO String
maybe System.Posix.ByteString should export a similar function:
getProgName :: IO ByteString
Yeah, that's actually not the same as argv[0], it has the path to the binary stripped. So you can't really use it to restart yourself because you have no way to know what directory the binary was in. It's frustrating because you can see in the source that it's going to some effort to intentionally strip off information that you can't get elsewhere.
FYI, there are at least two libraries out there trying to solve this problem:
http://hackage.haskell.org/package/executable-path
http://hackage.haskell.org/package/FindBin
Unfortunately, there is no standardized way on different unix systems to access the path of the executable running (it's not even fully clear what it means in the presence of symlinks, etc). Actually it seems to be impossible to do this (without argv[0]) on certain BSD systems.
Balazs

On Mon, Nov 14, 2011 at 12:05, Balazs Komuves wrote:
Unfortunately, there is no standardized way on different unix systems to access the path of the executable running (it's not even fully clear what it means in the presence of symlinks, etc). Actually it seems to be impossible to do this (without argv[0]) on certain BSD systems.
Also note:
- argv[0] won't be a full pathname if the program was found via $PATH search
- it is possible for users to pass arbitrary argv[0] to the exec() family of system calls
- some programs use special argv[0] values (this probably doesn't practically matter), notably shells look for a leading "-" (which is normally provided by "login" or "sshd" etc.) to indicate a login shell that should source ~/.profile etc.
- there are various other special cases, such as a number of Unixlikes implementing setuid shell scripts securely by passing a /dev/fd/* reference as argv[0] to avoid symlink attacks. Again, you *probably* don't need to care about this one, but there may be others on various systems.
In short, argv[0] should not be relied on as the executable name. (The usual way this is managed is that the real executable is something like foo.real and foo is a shell script which passes in the path to foo.real as a parameter. During installation/configuration the shell script is modified as necessary to provide the correct path.)
-- brandon s allbery

On Mon, Nov 14, 2011 at 9:45 AM, Brandon Allbery wrote:
On Mon, Nov 14, 2011 at 12:05, Balazs Komuves wrote:
Unfortunately, there is no standardized way on different unix systems to access the path of the executable running (it's not even fully clear what it means in the presence of symlinks, etc). Actually it seems to be impossible to do this (without argv[0]) on certain BSD systems.
Also note:
- argv[0] won't be a full pathname if the program was found via $PATH search
Well yes, granted it's not reliable under all possible circumstances and all possible unixes. But it works fine for a personal tool run under controlled circumstances. I wound up just always calling it as '$program $(dirname $program)' which is a bit noisy but works fine.
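
A minimal sketch of the workaround described above, assuming the program is always started as '$program $(dirname $program)' so that its first argument is the directory holding the binary. The helper selfPath and the "--restarted" marker are hypothetical and only keep the example from restarting forever; executeFile comes from System.Posix.Process in the unix package and (</>) from the filepath package.

```haskell
module Main where

import System.Environment (getArgs, getProgName)
import System.FilePath ((</>))
import System.IO (hFlush, stdout)
import System.Posix.Process (executeFile)

-- Hypothetical helper: rebuild the binary's path from the directory passed
-- on the command line and the name reported by getProgName.
selfPath :: FilePath -> String -> FilePath
selfPath dir name = dir </> name

main :: IO ()
main = do
  args <- getArgs
  name <- getProgName
  case args of
    [dir] -> do
      putStrLn "first run, restarting once"
      hFlush stdout  -- exec replaces the process, so flush buffered output first
      -- False: do not search $PATH; Nothing: inherit the current environment.
      executeFile (selfPath dir name) False [dir, "--restarted"] Nothing
    [_, "--restarted"] -> putStrLn "restarted"
    _ -> putStrLn "usage: tool DIR"
```

As noted earlier in the thread, this only works under controlled circumstances, since the caller is trusted to pass the right directory.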

On Fri, 2011-11-11 at 16:23 +0000, Simon Marlow wrote:
I propose to commit the attached patch to the unix package and release it with GHC 7.4.1. The commit log is reproduced below. Comments please!
+1 :-) Just one minor thing:
- There is a new type: RawFilePath = ByteString
Can't that be made a proper type (e.g. via a newtype) instead of being a mere type-alias?
-- hvr

On 11/11/2011 20:55, Herbert Valerio Riedel wrote:
On Fri, 2011-11-11 at 16:23 +0000, Simon Marlow wrote:
I propose to commit the attached patch to the unix package and release it with GHC 7.4.1. The commit log is reproduced below. Comments please!
+1 :-)
Just one minor thing:
- There is a new type: RawFilePath = ByteString
Can't that be made a proper type (e.g. via a newtype) instead of being a mere type-alias?
I'd rather *not* do that:
- The unix library doesn't generally make newtypes - take a look at all the other types it exports.
- It's a low-level API, abstraction is not the goal here.
- We know from the POSIX spec that a path is a sequence of bytes and nothing more. This interface makes that explicit.
Cheers,
Simon
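
One practical consequence of the alias, sketched below: since RawFilePath is just ByteString, ordinary ByteString functions apply to paths with no unwrapping. This assumes the module layout as it was later released in the unix package (System.Posix.ByteString exporting RawFilePath and fileExist); the helper hasExt is hypothetical.

```haskell
module Main where

import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import System.Posix.ByteString (RawFilePath, fileExist)

-- Hypothetical helper: a plain ByteString suffix test used directly on a path.
hasExt :: B.ByteString -> RawFilePath -> Bool
hasExt ext path = ext `B.isSuffixOf` path

main :: IO ()
main = do
  let path = BC.pack "/tmp" :: RawFilePath
  exists <- fileExist path  -- the bytes are handed to the OS untranslated
  print (exists, hasExt (BC.pack ".hs") path)
```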

On 11/11/2011 04:23 PM, Simon Marlow wrote:
I propose to commit the attached patch to the unix package and release it with GHC 7.4.1. The commit log is reproduced below. Comments please!
The unix version number will of course be bumped appropriately.
That's great! +1
Out of curiosity, is this a step toward abstracting FilePath away from String, or is that too complicated compatibility-wise?
-- Vincent
participants (9)
- Balazs Komuves
- Bas van Dijk
- Brandon Allbery
- Evan Laforge
- Gregory Collins
- Herbert Valerio Riedel
- Johan Tibell
- Simon Marlow
- Vincent Hanquez