[ANN] stdio-0.1.0.0 - A simple and high performance IO toolkit for Haskell - Haskell-Cafe


[ANN] stdio-0.1.0.0 - A simple and high performance IO toolkit for Haskell

寒东

16 Feb 2019

12:23 p.m.

Dear Haskellers!

We're pleased to announce the first public version of stdio, a simple and high-performance IO toolkit for Haskell. The package includes packed vectors, Unicode text, sockets, file system access, timers and more! The Hackage link is http://hackage.haskell.org/package/stdio-0.1.0.0.

This project started as an experiment in combining libuv with the GHC runtime. The result, recorded in our Haskell 2018 paper A High-Performance Multicore IO Manager Based on libuv (Experience Report) https://github.com/haskell-stdio/stdio/blob/master/docs/A%20High-Performance... , is pretty good compared to the old IO manager in base. The IO manager part is now stable and fast.

The package also provides new vector and text types based on the ByteArray# primitive, which are not limited to pinned memory. Since the GHC 8.6 release added unaligned access primitives for ByteArray#, we think it's a good time to release our work. The performance of the new vector and text types (text uses UTF-8 internally) is comparable to that of bytestring and text, and we strive to provide a similar API. Some Unicode processing, such as normalization and case folding, is also provided, based on the C Unicode library utf8rewind https://bitbucket.org/knight666/utf8rewind.

To make the package more useful, we rebuilt the Builder and Parser types from the ground up and added TCP socket and file system support, so that users can start using it for simple tasks such as parsing a CSV file, or starting a TCP server and communicating over a protocol. We also provide high-performance timer and logger modules, which are useful in practical engineering tasks.

For an installation guide and examples, please see the project's README. As neither of us (Tao He and I) is a native English speaker, the documentation is not as good as it could be; please help! If you run into installation problems, bugs, or questions, please file an issue on GitHub: https://github.com/haskell-stdio/stdio.

Happy hacking as always!

Dong Han, Tao He
Feb 16, 2019
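To make the pinned versus unpinned distinction in the announcement concrete, here is a minimal sketch that uses only GHC's own primops from GHC.Exts. It does not use stdio's API, and `allocIsPinned` is a hypothetical helper name chosen for illustration. It allocates a small byte array either way and asks the runtime whether the array is guaranteed not to move during garbage collection.

```haskell
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}
module Main (main) where

import GHC.Exts
  ( Int (I#)
  , isMutableByteArrayPinned#
  , isTrue#
  , newByteArray#
  , newPinnedByteArray#
  )
import GHC.IO (IO (IO))

-- | Hypothetical helper (not part of stdio): allocate a byte array of the
-- given size, pinned or not, and report whether the runtime treats it as
-- pinned, i.e. guaranteed not to be moved by the garbage collector.
allocIsPinned :: Bool -> Int -> IO Bool
allocIsPinned wantPinned (I# n)
  | wantPinned = IO (\s0 -> case newPinnedByteArray# n s0 of
      (# s1, mba #) -> (# s1, isTrue# (isMutableByteArrayPinned# mba) #))
  | otherwise = IO (\s0 -> case newByteArray# n s0 of
      (# s1, mba #) -> (# s1, isTrue# (isMutableByteArrayPinned# mba) #))

main :: IO ()
main = do
  unpinned <- allocIsPinned False 64
  pinned <- allocIsPinned True 64
  putStrLn ("newByteArray# 64 pinned?       " ++ show unpinned)
  putStrLn ("newPinnedByteArray# 64 pinned? " ++ show pinned)
```

Small arrays from `newByteArray#` can be moved and compacted by the garbage collector, which is the property an unpinned vector type relies on. Very large arrays may be reported as pinned regardless, because the runtime does not move large objects.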


Joachim Durchholz

16 Feb

2:30 p.m.


On 16.02.19 at 13:23, 寒东 wrote:

...

Some unicode processing such as normalization and casefolding is also provided, based on a C unicode library utf8rewind https://bitbucket.org/knight666/utf8rewind.

FWIW the rest looks fine, but committing to a specific UTF-8 implementation is risky. Unicode is a large and complicated standard, and constantly evolving; I am sceptical that a one-man library like utf8rewind can keep up with that, and I'd wrap a mature Unicode library (such as ICU) rather than place my bet on a one-man show like utf8rewind.

In particular, I'd avoid utf8rewind because the author believes that deviating from a standard improves security. (See his comment in https://bitbucket.org/knight666/utf8rewind/issues/8/length-function-should-n... .) I do not think that's a well-considered policy, and it certainly does not make me think that his code is well-audited. Including such a thing in such a basic library as stdio seems unwise to me.

...

To make our package more useful, we rebuild Builder and Parser type from ground, add TCP socket and filesystem support, so that user can start using it to do some simple task, such as parsing a CSV file or starting a TCP server and communicate in protocols. We also provide high performance timer and logger module, which is useful in practical engineering tasks.

You should split the library into stuff that does fast byte shoving and stuff that does fast byte processing. That way, the two parts can improve and evolve independently.

...

For installation guide and examples, please see the project's README. As we (I and Tao He) both are not native english speakers, the document quality is not as satisfying as it can be, please help!

Judging from this message, your English seems pretty good actually :-)

Regards,
Jo

Brandon Allbery

4:36 p.m.


I'd like to point out that using unpinned memory for your ByteString alternative means you aren't providing the same API: as most string-using foreign functions expect something more like a bytestring than a Haskell string or text type, ByteString is pinned specifically to support that use case directly and many users of ByteString expect that support. This will include the ByteString variants of various functions in the "posix" package, which are thereby providing "raw" versions of system calls, ensuring the Haskell program can get exactly what the OS provides (POSIX interfaces don't use Unicode, thus various Unicode encodings can be unable to provide the exact OS level representation of e.g. file names).

On Sat, Feb 16, 2019 at 9:30 AM Joachim Durchholz wrote:

...

_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.

-- brandon s allbery kf8nh allbery.b@gmail.com
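The point about file names can be illustrated with a small, self-contained sketch. POSIX file names are byte sequences (any byte except `/` and NUL is allowed), so a name may not be well-formed UTF-8 at all. The function below, `isValidUtf8`, is a hypothetical helper written for this note, not part of bytestring, text, or stdio; it checks a list of bytes against the well-formed UTF-8 byte ranges.

```haskell
module Main (main) where

import Data.Word (Word8)

-- | Hypothetical helper: True when the bytes are well-formed UTF-8, i.e.
-- no stray continuation bytes, no overlong forms, no surrogates, and
-- nothing above U+10FFFF.
isValidUtf8 :: [Word8] -> Bool
isValidUtf8 [] = True
isValidUtf8 (b0 : rest)
  | b0 <= 0x7F = isValidUtf8 rest
  | b0 >= 0xC2 && b0 <= 0xDF = continue 1 0x80 0xBF
  | b0 == 0xE0 = continue 2 0xA0 0xBF -- excludes overlong 3-byte forms
  | b0 == 0xED = continue 2 0x80 0x9F -- excludes UTF-16 surrogates
  | b0 >= 0xE1 && b0 <= 0xEF = continue 2 0x80 0xBF
  | b0 == 0xF0 = continue 3 0x90 0xBF -- excludes overlong 4-byte forms
  | b0 >= 0xF1 && b0 <= 0xF3 = continue 3 0x80 0xBF
  | b0 == 0xF4 = continue 3 0x80 0x8F -- excludes values above U+10FFFF
  | otherwise = False
  where
    -- Expect n continuation bytes: the first within [lo, hi], the
    -- remaining ones within [0x80, 0xBF]; then validate what follows.
    continue :: Int -> Word8 -> Word8 -> Bool
    continue n lo hi =
      case splitAt n rest of
        (c1 : cs, rest')
          | length cs == n - 1
          , c1 >= lo && c1 <= hi
          , all isCont cs -> isValidUtf8 rest'
        _ -> False
    isCont c = c >= 0x80 && c <= 0xBF

main :: IO ()
main = do
  print (isValidUtf8 [0x66, 0x6F, 0x6F]) -- the ASCII name "foo"
  print (isValidUtf8 [0xE5, 0xAF, 0x92]) -- U+5BD2 encoded as UTF-8
  print (isValidUtf8 [0x66, 0x6F, 0xFF]) -- a legal POSIX name, but not UTF-8
```

A text type that insists on valid UTF-8 cannot represent the third name exactly, which is why raw byte interfaces matter for system calls.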

Dong Han

17 Feb

2:58 a.m.


...

I'd like to point out that using unpinned memory for your ByteString alternative means you aren't providing the same API: as most string-using foreign functions expect something more like a bytestring than a Haskell string or text type, ByteString is pinned specifically to support that use case directly and many users of ByteString expect that support. This will include the ByteString variants of various functions in the "posix" package, which are thereby providing "raw" versions of system calls, ensuring the Haskell program can get exactly what the OS provides (POSIX interfaces don't use Unicode, thus various Unicode encodings can be unable to provide the exact OS level representation of e.g. file names).

We built a dedicated CBytes https://hackage.haskell.org/package/stdio-0.1.0.0/docs/Std-Data-CBytes.html module to do that job: a wrapper for an immutable null-terminated string, backed by either a literal Addr# or a pinned heap ByteArray#. It is not intended to carry any particular encoding; we just assume UTF-8 for literals and for display. All file paths in stdio use CBytes instead of Bytes (an alias for PrimVector Word8).

Cheers~
Dong

On Sun, Feb 17, 2019 at 12:37 AM Brandon Allbery wrote:

...
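As a rough illustration of why a null-terminated, non-moving buffer is what foreign functions want, here is a minimal sketch that uses only base's Foreign.C.String and the C standard library's `strlen`. It does not show how CBytes is implemented; `cLength` is a hypothetical helper name. `withCString` copies the string into a temporary NUL-terminated buffer and passes its address to the C function.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CSize (..))

-- C's strlen: counts bytes up to, but not including, the first NUL byte.
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- | Hypothetical helper: the length in bytes of a string as C sees it,
-- after marshalling it into a temporary NUL-terminated buffer.
cLength :: String -> IO Int
cLength s = withCString s (fmap fromIntegral . c_strlen)

main :: IO ()
main = do
  n <- cLength "stdio.cabal"
  putStrLn ("strlen reports " ++ show n ++ " bytes")
```

The copy made by `withCString` on every call is the cost that a type holding an already null-terminated buffer can avoid.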

