
Hello,

To implement high-performance web servers, I would like to remove the overhead of String and use ByteString instead. Unfortunately, WAI uses FilePath (String) in ResponseFile:

    ResponseFile Status ResponseHeaders FilePath (Maybe FilePart)

In my experience, the third argument is created from rawPathInfo (ByteString) in many cases, so using FilePath is overhead.

Since the new Haskell Platform has been released, I think I can implement a new simple-sendfile which provides Network.Sendfile.ByteString:

    sendfile :: Socket -> RawFilePath -> FileRange -> IO () -> IO ()

Note that RawFilePath is a type synonym for ByteString.

What do you think of this?

A1) It's too late.
A2) We can change FilePath to RawFilePath in ResponseFile.
A3) We can provide a new constructor: ResponseRawFile
A4) Others

--Kazu
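
A minimal sketch of what option A3 might look like, for illustration only. The `Status`, `ResponseHeaders`, and `FilePart` types below are simplified stand-ins rather than the real WAI or http-types definitions, and `rawFileResponse` and `stringFileResponse` are hypothetical helpers that do not sanitise the request path.

```haskell
import qualified Data.ByteString.Char8 as B

-- Simplified stand-ins for the WAI / http-types types (assumptions).
type Status = Int
type ResponseHeaders = [(B.ByteString, B.ByteString)]
type RawFilePath = B.ByteString

data FilePart = FilePart { partOffset :: Integer, partByteCount :: Integer }
  deriving (Eq, Show)

data Response
  = ResponseFile Status ResponseHeaders FilePath (Maybe FilePart)
    -- ^ existing shape: the path is a String
  | ResponseRawFile Status ResponseHeaders RawFilePath (Maybe FilePart)
    -- ^ hypothetical new constructor (option A3): the path stays a ByteString
  deriving (Eq, Show)

-- Build a file response from a raw request path without going through String.
rawFileResponse :: RawFilePath -> B.ByteString -> Response
rawFileResponse docRoot rawPath =
  ResponseRawFile 200 [] (B.append docRoot rawPath) Nothing

-- The String-based equivalent has to unpack the request path first.
stringFileResponse :: FilePath -> B.ByteString -> Response
stringFileResponse docRoot rawPath =
  ResponseFile 200 [] (docRoot ++ B.unpack rawPath) Nothing
```

The only difference between the two helpers is the `B.unpack` call, which is the conversion the proposal wants to avoid.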

Hello Kazu,

Wednesday, June 13, 2012, 11:41:33 AM, you wrote:

> To implement high-performance web servers, I would like to remove overhead of String and want to use ByteString instead

1. When dealing with files involves a system call, the overhead of converting String back and forth to ByteString is probably small compared to the syscall itself.

2. On the Haskell mailing lists, there have been huge discussions about the nature of filenames. On Windows, a filename is actually a list of Unicode characters (usually passed to Win32 APIs in UTF-16 encoding); on Linux it's a byte string in some partition-specific encoding. All the details are dealt with the hard way by the standard libraries. Running ahead of them may be a very difficult job.

-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
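
To make the second point concrete, here is a small sketch that encodes one filename in the two forms described above. It assumes the `text` package is available and that the POSIX side uses UTF-8, which is only one possible partition or locale encoding; `posixBytes` and `windowsBytes` are names invented for this example.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Bytes a POSIX system call would receive, assuming a UTF-8 encoding.
posixBytes :: String -> BS.ByteString
posixBytes = TE.encodeUtf8 . T.pack

-- Bytes of the UTF-16 (little-endian) form used by wide-character Win32 APIs.
windowsBytes :: String -> BS.ByteString
windowsBytes = TE.encodeUtf16LE . T.pack
```

The same name produces different byte sequences on each side, so a ByteString path carries no meaning unless the platform and encoding are known.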

Actually, the standard libraries still don't deal with file paths well enough, which is why we are moving towards using system-filepath. Would that work better for you, Kazu, or is it still not fast enough?

On Thu, Jun 14, 2012 at 10:32 AM, Bulat Ziganshin wrote:
> all the details are dealt in the hard way by the standard libraries. running ahead of them may be very difficult job
_______________________________________________ web-devel mailing list web-devel@haskell.org http://www.haskell.org/mailman/listinfo/web-devel

On 14/06/2012 18:53, Greg Weber wrote:
> actually, the standard libraries still don't deal with filepaths well enough, which is why we are moving towards using system-filepath.

Are there tickets open about the problems? I'd rather sort any problems out in the standard libraries than force people to use a separate package.

Cheers, Simon

Simon,

>> actually, the standard libraries still don't deal with filepaths well enough, which is why we are moving towards using system-filepath.
>
> Are there tickets open about the problems? I'd rather sort any problems out in the standard libraries than force people to use a separate package.

System.FilePath.ByteString is missing. I will open a ticket. Which Trac should I use for this ticket?

--Kazu

On 15/06/2012 09:00, Kazu Yamamoto (山本和彦) wrote:
> System.FilePath.ByteString is missing.
> I will open a ticket. Which trac should I use for this ticket?

System.FilePath.ByteString would be POSIX-specific (FilePaths are not ByteStrings on Windows). So I think it would need to be System.FilePath.Posix.ByteString.

You can open a ticket on the GHC Trac, and also email the author of filepath (Neil Mitchell).

Cheers, Simon
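
As a rough idea of what a POSIX-only, ByteString-based path module might contain, here is a sketch of three primitives. The module does not exist in filepath; the functions only imitate the names used by System.FilePath and hard-code '/' as the separator, which is exactly why such code would be POSIX-specific.

```haskell
import qualified Data.ByteString.Char8 as B

type RawFilePath = B.ByteString

-- Join two paths with a single '/'; an absolute second path wins.
(</>) :: RawFilePath -> RawFilePath -> RawFilePath
a </> b
  | B.null a        = b
  | B.null b        = a
  | B.head b == '/' = b
  | B.last a == '/' = B.append a b
  | otherwise       = B.concat [a, B.singleton '/', b]

-- Everything after the last '/'.
takeFileName :: RawFilePath -> RawFilePath
takeFileName = snd . B.breakEnd (== '/')

-- The last extension of the file name, including the dot, or empty.
takeExtension :: RawFilePath -> RawFilePath
takeExtension p =
  let (pre, ext) = B.breakEnd (== '.') (takeFileName p)
  in if B.null pre then B.empty else B.cons '.' ext
```

None of these functions decode the bytes, so they work for any encoding in which '/' and '.' are single ASCII bytes. That holds for POSIX paths but not for Windows wide-character paths.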

Hello,

> System.FilePath.ByteString would be POSIX-specific (FilePaths are not ByteStrings on Windows). So I think it would need to be System.FilePath.Posix.ByteString.

Ah. Now I really understand why people are starting to use system-filepath... Now I'm not sure whether I should open a ticket.

--Kazu

On Fri, Jun 15, 2012 at 12:25 AM, Simon Marlow wrote:
> Are there tickets open about the problems? I'd rather sort any problems out in the standard libraries than force people to use a separate package.

The ticket would be to switch the standard libraries to use system-filepath, which probably requires a mailing list discussion somewhere, a lot of boring changes of code, and a more inconvenient API for end users. I will wait for Text to become standardized first :)

On Fri, Jun 15, 2012 at 2:41 PM, Greg Weber wrote:
> The ticket would be to switch the standard libraries to use system-filepath, which probably requires a mailing list discussion somewhere, a lot of boring changes of code, and a more inconvenient API for end users. I will wait for Text to become standardized first :)

The FilePath issue has been discussed at length before. The problem is that different OSes define file paths to be different things (i.e. Unicode or bytes). The FilePath type somehow has to bridge that, which will be difficult. My main gripe with FilePath right now is that it's not an abstract type.

-- Johan

Considering that we are talking about a web server, it would be safe to assume that most people would not run it on Windows for production use. So I say WAI should switch to ByteString, and let the Windows users be penalized just a little bit. You can't please everyone.

On Friday, June 15, 2012 02:47:15 PM Johan Tibell wrote:
> The FilePath issue has been discussed at length before. The problem is that different OS:es define file paths to be different things (i.e. Unicode or bytes.)

Hello Vagif,

Saturday, June 16, 2012, 2:00:14 AM, you wrote:

> Considering that we are talking about the web server, it would be safe to assume that most people would not run it on windows for production use.
> So i say WAI should switch to ByteString, and let the windows users be penalized just a little bit. You can't please everyone.

1. Currently, filenames are stored as String, requiring serialization to UTF-16 on Windows and (usually) to UTF-8 on Unixes. With Filename=ByteString it will be `id` on Unixes and a UTF-to-UTF conversion on Windows. Actually, it will become much faster on BOTH platforms.

2. System calls usually have such a large overhead that even conversion from String doesn't have any serious impact on performance. For example, I can open about 10k files/sec and find about 100k files/sec.

3. The real difference will be, though, in operations that involve filenames but do not require system calls. For example, file caching.

4. We need not only to store filenames, but also to perform various operations on them. So it's probably better to store filenames in a way that is efficient and easy to operate on, leaving all conversions to the syscall wrappers. Since from the user's perspective filenames are just strings, the String or Text type looks like the best fit.

5. Making a dedicated file manipulation library for the needs of web servers looks like too expensive a solution. If we really need this faster library, it may be proposed as a general-purpose Haskell library.

6. Ideally, we should have Unix/Windows/...-specific libraries dealing with files using the native filename representation, and a higher-level universal library hiding OS differences and, in particular, converting native filenames from/to the String and Text filename representations.

-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
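
Point 3 can be illustrated with a small in-memory cache keyed by the raw request path. This is only a sketch: `FileInfo`, `FileCache`, `lookupFile`, and `rememberFile` are hypothetical names, and it assumes the `containers` package.

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Data.Map.Strict as M

-- Hypothetical cached metadata for a file.
data FileInfo = FileInfo { fileSize :: Integer, mimeType :: B.ByteString }
  deriving (Eq, Show)

-- The cache is keyed by the path exactly as it arrives in the request.
type FileCache = M.Map B.ByteString FileInfo

-- Look up a request path; no String conversion and no system call.
lookupFile :: B.ByteString -> FileCache -> Maybe FileInfo
lookupFile = M.lookup

-- Record metadata that was obtained earlier, e.g. from a stat-like call.
rememberFile :: B.ByteString -> FileInfo -> FileCache -> FileCache
rememberFile = M.insert
```

On a cache hit the path is only compared, never converted. This is the kind of operation where the path representation could matter more than it does around a system call.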

On 15/06/2012 22:47, Johan Tibell wrote:
> The FilePath issue has been discussed at length before. The problem is that different OS:es define file paths to be different things (i.e. Unicode or bytes.) The FilePath type somehow has to bridge that, that will be difficult. My main grip with FilePath right now is that it's not an abstract type.

FYI, in case it wasn't clear, the system-filepath package provides such an abstract FilePath type, which I imagine is why some people prefer to use it even though it means adding some extra conversions when using the System.IO API (conversions which are fragile because we keep changing the meaning of the String version of FilePath, sigh).

system-filepath looks slightly odd to me because I would have expected the representation internally to be platform-specific (e.g. either ByteString or Text for Unix or Windows respectively), but it just uses String.

Cheers, Simon

> FYI, in case it wasn't clear, the system-filepath package provides such an abstract FilePath type, which I imagine is why some people prefer to use it even though it means adding some extra conversions when using the System.IO API (conversions which are fragile because we keep changing the meaning of the String version of FilePath, sigh).

Users of system-filepath should use system-fileio and have less need for System.IO, although system-fileio uses System.IO APIs. Internally, system-filepath has conversion rules that can vary from one GHC version to the next: http://hackage.haskell.org/packages/archive/system-filepath/0.4.6/doc/html/s...

> system-filepath looks slightly odd to be because I would have expected the representation internally to be platform-specific (e.g. either ByteString or Text for Unix or Windows respectively), but it just uses String.

I am kind of surprised to see an internal representation using strings as well. I imagine that is because paths are usually used with system-fileio, which needs strings for the existing Haskell APIs (system-fileio calls encodeString, which is OS-specific), and because the internal code base is easier to maintain with one type. The encode/decode functions do return a platform-specific type.

Greg Weber

On Mon, Jun 25, 2012 at 6:53 AM, Greg Weber wrote:
> I am kind of suprised to see an internal representation using strings also. I imagine that is because paths are usually used with system-fileio which needs strings for the existing Haskell APIs. system-fileio calls encodeString, which is OS specific, and that the internal code base is easier to maintain with 1 type. The encode/decode functions do return a platform specific type.
Having a single internal representation lets me avoid having to write all the path manipulation primitives twice. FilePaths use String so they can use the GHC 7.4 path encoding, which supports round-tripping for Unix and Windows paths. It has used other encodings (tried both Text and ByteString) in the past, but String+7.4 works well and is easier to use.
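
The design trade-off described above can be sketched as follows. This is not system-filepath's actual code: `Path`, `appendPath`, `encodePosix`, and `encodeWindows` are hypothetical names, the `text` package is assumed, and the POSIX encoder simply uses UTF-8 rather than the round-tripping GHC 7.4 path encoding.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- One internal representation shared by every platform (hypothetical type).
newtype Path = Path String
  deriving (Eq, Show)

-- Path manipulation is written once, against the single representation.
appendPath :: Path -> Path -> Path
appendPath (Path a) (Path b)
  | null a        = Path b
  | last a == '/' = Path (a ++ b)
  | otherwise     = Path (a ++ "/" ++ b)

-- Only the boundary functions return a platform-specific type.
encodePosix :: Path -> BS.ByteString
encodePosix (Path s) = TE.encodeUtf8 (T.pack s)   -- assumes UTF-8 on disk

encodeWindows :: Path -> T.Text
encodeWindows (Path s) = T.pack s
```

With this layout a second platform adds one encode/decode pair instead of a second copy of every manipulation function. The price is a conversion at each system boundary.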
participants (7)
- Bulat Ziganshin
- Greg Weber
- Johan Tibell
- John Millikin
- Kazu Yamamoto
- Simon Marlow
- Vagif Verdi