Re: [Haskell-cafe] Haskell not ready for Foo [was: Re: Hypothetical Haskell job in New York]

My statements refer not to the FFI, but as I said, to "FFI code". FFI-based libraries seldom compile without excessive amounts of work, they're often poorly documented, and in general they seem to be maintained much less than pure Haskell libraries. The FFI is necessary, of course, but in general I view it as a bootstrapping process leading to pure Haskell libraries -- a crutch you have to live with until you can afford to pay the price of walking.
Regards,
John
On Jan 8, 2009, at 3:15 PM, John Goerzen wrote:
On Thu, Jan 08, 2009 at 11:14:18AM -0700, John A. De Goes wrote:
But really, what's the point? FFI code is fragile, often uncompilable and unsupported, and doesn't observe the idioms of Haskell nor take advantage of its powerful language features. Rather than coding through
That is an extraordinarily cruel, and inaccurate, sweep of FFI.
I've worked with C bindings to several high-level languages, and I must say that I like FFI the best of any I've used. It's easy to use correctly, stable, and solid. If anything, it suffers from under-documentation.
The whole point of FFI is to bring other languages into the Haskell fold. So you can, say, talk to a database using its C library and wind up putting the strongly-typed HaskellDB atop it. Or you can write an MD5 algorithm in C and make it look like a regular Haskell function.
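As a minimal sketch of "make it look like a regular Haskell function", the example below binds the C standard library's `strlen` instead of an MD5 routine, so that it needs no extra C source. The names `c_strlen` and `cLength` are hypothetical and chosen only for illustration; a real binding would normally also need Cabal metadata for the C library it wraps.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CSize (..))
import System.IO.Unsafe (unsafePerformIO)

-- Raw binding to strlen from the C standard library.
-- (c_strlen is a hypothetical name used only in this sketch.)
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- Pure-looking wrapper: marshal the String to a temporary C string,
-- call the C function, and convert the result to an Int.
-- unsafePerformIO is assumed acceptable here because strlen has no
-- side effects and its result depends only on its argument.
cLength :: String -> Int
cLength s = unsafePerformIO $
  withCString s $ \p -> fromIntegral <$> c_strlen p

main :: IO ()
main = print (cLength "haskell-cafe")
```

Callers of `cLength` see an ordinary `String -> Int` function; the marshalling and the foreign call stay hidden inside the wrapper.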
You can indeed fit a square peg in a round hole, if you pound hard enough. That doesn't mean it's a good thing to do.
And with that, I fully agree.
-- Joh

John A. De Goes wrote:
My statements refer not to the FFI, but as I said, to "FFI code". FFI-based libraries seldom compile without excessive amounts of work, they're often poorly documented, and in general they seem to be
Examples? I maintain a couple of FFI libraries, and strive to have them just work with one line of calling cabal. There are plenty of other FFI libraries that also work that way. In fact, I don't really notice a difference these days, at least with the FFI libraries I've used. In any case, apt-get usually grabs stuff I want anyhow. I don't really see why the underlying details of an implementation (C or Haskell) would necessarily correlate with levels of documentation, and it doesn't seem to for me anyhow.
maintained much less than pure Haskell libraries. The FFI is necessary, of course, but in general I view it as a bootstrapping process leading to pure Haskell libraries -- a crutch you have to live with until you can afford to pay the price of walking.
Well, you pretty much always have to get down to the C level on a *nix platform at some point, anyhow. You've got to make syscalls somewhere. I don't think FFI is so evil. There is value in avoiding wheel reinvention, too. If zlib already works great, why re-invent it when you can easily just use what's there?
-- John

On Jan 9, 2009, at 8:23 AM, John Goerzen wrote:
Well, you pretty much always have to get down to the C level on a *nix platform at some point, anyhow. You've got to make syscalls somewhere.
Take a language like Ruby or Python (or Java, or C#, etc.). The vast majority of code written in these languages does not "get down to the C level". When I say, "vast majority", I'm referring to > 99.999%. That's because the standard libraries provide sufficiently comprehensive platform-agnostic abstractions to do what most people need to do. As a result, libraries for these languages are built on the standard libraries, and do not require native code.
I don't think FFI is so evil. There is value in avoiding wheel reinvention, too. If zlib already works great, why re-invent it when you can easily just use what's there?
There are lots of reasons:
1. If there's a bug in a library, Haskellers are more likely to fix the bug if the library is written in Haskell.
2. Haskellers are more likely to improve code that is written in Haskell.
3. A chain is only as strong as its weakest link -- libraries with more dependencies are more fragile, more likely to break, and less likely to work across platforms.
4. Haskell-only libraries are easier to build, easier to use, and easier to include in your program (this is subjective and we don't agree on this one, so ignore it if you like).
5. Haskell libraries are generally more commercial-friendly than the GNU-licensed libraries that inevitably back FFI-based libraries.
6. Haskell libraries can more easily offer tight integration with Haskell code, and take advantage of features unique to Haskell, such as purity and laziness, and a declarative coding style.
A shining role model is the Java ecosystem. No platform has as many open source, commercial-friendly, robust, feature-rich, and community-supported libraries as Java does. These libraries are, in the vast majority of cases, written in 100% Java, work identically on all platforms, and are as easy to use as adding a single file to your project (Java also has Maven, which functions similarly to Cabal). That's where I'd like Haskell to be in 5 years.
Regards,
John

On Jan 10, 2009, at 4:11 PM, Donn Cave wrote:
Quoth "John A. De Goes":
Take a language like Ruby or Python (or Java, or C#, etc.). The vast majority of code written in these languages does not "get down to the C level". When I say, "vast majority", I'm referring to > 99.999%. That's because the standard libraries provide sufficiently comprehensive platform-agnostic abstractions to do what most people need to do. As a result, libraries for these languages are built on the standard libraries, and do not require native code.
Maybe I haven't been paying enough attention, but I see Python and Haskell in about the same position on this, especially in light of how different they are (Haskell's FFI is a lot easier!) Plenty of Python software depends on C library modules and foreign code. The particular examples you mention - DB and UI - are great examples where it's sort of crazy to do otherwise for just the reasons you go on to list.
Python has pure interfaces to all the major databases. While it's true there are no "native" GUI libraries, there are pure Python libraries for just about everything else. Haskell is not yet to this point.
The arguments you list in favor of native code are reasonable (though some of them cut both ways - in particular it's a bold assertion that bug fixes and general development are more likely in a Haskell implementation, compared to a widely used C implementation that it parallels.)
I don't think it's a bold assertion. If I'm using a Haskell library that wraps a C library, and find a bug in it, my chances of tracking down the bug in C code and submitting a patch to whatever group maintains it are exactly zero. On the other hand, if it's a pure Haskell library, I'll at least take a look. What would you do?
But each case has its own merits, and it's perilous to generalize. It would have been absurd for Python to take the approach that Java takes (lacking the major corporate backing), and probably so also for Haskell. (Though Haskell may in the end need it for APIs that involve I/O, the way things seem to be going in GHC.)
Safe, composable IO needs to be pushed into the core (ideally, into the standard). And it needs to be powerful enough to handle the different use cases: text parsing, binary data, random IO, and interactive IO. The currently exposed semantics are neither safe nor composable.
Regards,
John

On Thu, Jan 15, 2009 at 09:17:55AM -0700, John A. De Goes wrote:
On Jan 10, 2009, at 4:11 PM, Donn Cave wrote:
Maybe I haven't been paying enough attention, but I see Python and Haskell in about the same position on this, especially in light of how different they are (Haskell's FFI is a lot easier!) Plenty of Python software depends on C library modules and foreign code. The particular examples you mention - DB and UI - are great examples where it's sort of crazy to do otherwise for just the reasons you go on to list.
Python has pure interfaces to all the major databases. While it's true there are no "native" GUI libraries, there are pure Python libraries for just about everything else. Haskell is not yet to this point.
By "pure" do you mean "containing python code only"? I'm looking through a few, and:
PostgreSQL - psycopg - C
PostgreSQL - pgsql - C
PostgreSQL - pygresql - C
MySQL - mysqldb - C
MS SQL Server - pymssql - C
And any interface that uses ODBC will, by necessity, be calling to C, because ODBC is defined as a C API and not a network protocol.
Where are all these pure-Python drivers?
I don't think it's a bold assertion. If I'm using a Haskell library that wraps a C library, and find a bug in it, my chances of tracking down the bug in C code and submitting a patch to whatever group maintains it are exactly zero. On the other hand, if it's a pure Haskell library, I'll at
Why?
least take a look. What would you do?
You have to balance that against the chances of there being bugs in code that is used by far fewer people. I don't care what language you're talking about -- I'm going to expect the C PostgreSQL library to be less buggy than a pure reimplementation in any other language, and will have less concern about its maintenance and stability in the future. It's a lot of wheel reinvention to try to re-implement a database protocol in n languages instead of doing it in 1 and binding to it everywhere else. AFAIK, the only language where that sort of wheel reinvention is popular is Java. But then Java seems to encourage wheel reinvention anyhow ;-)
-- John

On Jan 15, 2009, at 9:31 AM, John Goerzen wrote:
By "pure" do you mean "containing python code only"? I'm looking through a few, and:
Search for "pure python mysql" or "pure python postgresql" and you'll see at least two implementations. In addition, there are plenty of pure Python databases for those who want and are able to stay strictly within Python.
I don't think it's a bold assertion. If I'm using a Haskell library that wraps a C library, and find a bug in it, my chances of tracking down the bug in C code and submitting a patch to whatever group maintains it are exactly zero. On the other hand, if it's a pure Haskell library, I'll at
Why?
I'm tired of C. I'm not going to use any unpaid time writing or maintaining anything written in C. I assume if C were my favorite language, I'd be hanging around c-cafe instead of haskell-cafe. :-)
You have to balance that against the chances of there being bugs in code that is used by far fewer people.
That's true.
It's a lot of wheel reinvention to try to re-implement a database protocol in n languages instead of do it in 1 and bind to it everywhere else.
Why is wheel reinvention a bad thing? A combination of cooperation and competition is best for every endeavor. We have lots of companies in the business of making tires, each trying to outdo the other, but for any given company, they are all united behind the goal of producing the best possible tire. The consumers benefit.
AFAIK, the only language where that sort of wheel reinvention is popular is Java. But then Java seems to encourage wheel reinvention anyhow ;-)
The Java reinventions look and feel like Java, because they're native implementations. This is even more important in Haskell, where the difference between Haskell and C is about as large as you can get.
Regards,
John

john:
On Jan 15, 2009, at 9:31 AM, John Goerzen wrote:
By "pure" do you mean "containing python code only"? I'm looking through a few, and:
Search for "pure python mysql" or "pure python postgresql" and you'll see at least two implementations. In addition, there are plenty of pure Python databases for those who want and are able to stay strictly within Python.
FWIW, there are pure Haskell storage APIs. See e.g. HAppS-State and TCache.
-- Don

John Goerzen wrote:
Python has pure interfaces to all the major databases. While it's true there are no "native" GUI libraries, there are pure Python libraries for just about everything else. Haskell is not yet to this point.
By "pure" do you mean "containing python code only"? I'm looking through a few, and:
PostgreSQL - psycopg - C
PostgreSQL - pgsql - C
PostgreSQL - pygresql - C
MySQL - mysqldb - C
MS SQL Server - pymssql - C
And any interface that uses ODBC will, by necessity, be calling to C, because ODBC is defined as a C API and not a network protocol.
Where are all these pure-Python drivers?
Some time ago, I implemented a client for the network protocol used by PostgreSQL: http://hg.mperillo.ath.cx/twisted/pglib/ It covers almost all the protocol features (only extended queries are not supported). It is implemented using Twisted. I would like to reimplement it in Haskell, sometime in the future.
I tried to implement the MySQL network protocol, too, but it is a *mess*, so I gave up (and, at that time, there were strange claims about copyright).
It is also possible to support MSSQL and Sybase by implementing a client for the TDS (Tabular Data Stream) protocol. TDS, too, is a mess (well, if you compare it with the PostgreSQL protocol), and the last time I studied it, the freeTDS project only had reverse-engineered protocol documentation; now Microsoft has made the TDS variant used by MSSQL public: http://msdn.microsoft.com/en-us/library/cc448435.aspx
So, in theory, it should not really be a problem to implement native and robust support for PostgreSQL, MySQL, MSSQL and Sybase. One benefit of these implementations would be built-in support for concurrency [1]. For PostgreSQL, a native implementation can also be useful for listening for notifications.
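To give a rough idea of what a native PostgreSQL client starts with, here is a minimal sketch that only encodes the StartupMessage of version 3.0 of the frontend/backend protocol. It uses the `bytestring` package that ships with GHC. The helper names `cstring` and `startupMessage` are hypothetical, the user name is made up, and the sketch does not open a socket, authenticate, or parse replies.

```haskell
module Main where

import qualified Data.ByteString.Lazy as BL
import Data.ByteString.Builder
  (Builder, int32BE, lazyByteString, string7, toLazyByteString, word8)

-- A null-terminated string, as used for parameter names and values.
-- string7 keeps only the low 7 bits of each character, so this sketch
-- assumes ASCII-only parameters.
cstring :: String -> Builder
cstring s = string7 s <> word8 0

-- StartupMessage layout (protocol 3.0):
--   Int32  total length in bytes, including the length field itself
--   Int32  protocol version, 196608 (major 3 in the high 16 bits, minor 0)
--   name/value pairs as null-terminated strings
--   one zero byte terminating the parameter list
startupMessage :: [(String, String)] -> BL.ByteString
startupMessage params =
  let body = toLazyByteString $
        int32BE 196608
          <> foldMap (\(k, v) -> cstring k <> cstring v) params
          <> word8 0
      len = fromIntegral (BL.length body) + 4
   in toLazyByteString (int32BE len <> lazyByteString body)

main :: IO ()
main = print (BL.unpack (startupMessage [("user", "alice")]))
```

Building the body first and prepending its length keeps the length field consistent with the payload, which is the only subtle part of this message.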
[...]
[1] The libpq API *does* have an async API, but it is not complete (nor as well tested as the sync API, IMHO). For example, there is no support for async function calls, although "The Function Call sub-protocol is a legacy feature that is probably best avoided in new code. Similar results can be accomplished by setting up a prepared statement that does SELECT function($1, ...). The Function Call cycle can then be replaced with Bind/Execute."
P.S.: the PostgreSQL protocol is really well designed.
Regards
Manlio Perillo
participants (5)
- Don Stewart
- Donn Cave
- John A. De Goes
- John Goerzen
- Manlio Perillo