
On Feb 2, 2007, at 3:06 PM, Paul Johnson wrote:
As a rule, storing functions along with data is a can of worms. Either you actually store the code as a BLOB or you store a pointer to the function in memory. Either way you run into problems when you upgrade your software and expect the stored functions to work in the new context.
ACache does not store code in the database. You cannot read the database unless you have your original class code. ACache may store the "schema", i.e. the parent class names, slot names, etc.
Erlang also has a very disciplined approach to code updates, which presumably helps a lot when functions are stored.
No storing of code here either. What you store in Erlang is just tuples, so there's no schema or class definition. No functions are stored since any Erlang code can fetch the tuples from Mnesia. You do need to have the original record definition around, but this is just to be able to refer to tuple elements by field name rather than by field position.
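A rough Haskell analogue of that setup, just for comparison (the type and field names here are made up):

-- A bare tuple is what actually goes into the table; no class or
-- schema travels with it.
type StoredRow = (Int, String, Double)

-- The record definition lives only in the code that reads the rows,
-- so fields can be referred to by name instead of by position.
data Trade = Trade { tradeId :: Int, symbol :: String, price :: Double }

fromRow :: StoredRow -> Trade
fromRow (i, s, p) = Trade i s p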
I very much admire Mnesia, even though I'm not an Erlang programmer. It would indeed be really cool to have something like that. But Mnesia is built on the Erlang OTP middleware. I would suggest that Haskell needs a middleware with the same sort of capabilities first. Then we can build a database on top of it.
Right. That would be a prerequisite.
The real headache is type safety. Erlang is entirely dynamically typed, so untyped schemas with column values looked up by name at run-time fit right in, and it's up to the programmer to manage schema and code evolution to prevent errors. Doing all this in a statically type-safe way is another layer of complexity and checking.
I believe Lambdabot does schema evolution.
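As a minimal sketch of what statically checked evolution could look like (types and names invented for illustration): each schema version is its own type, and migration is an ordinary total function, so the compiler flags any field the migration forgets.

-- Each schema version is a distinct type; the migration must account
-- for every field or it won't compile.
data PersonV1 = PersonV1 { nameV1 :: String }
data PersonV2 = PersonV2 { nameV2 :: String, ageV2 :: Maybe Int }

migrate :: PersonV1 -> PersonV2
migrate (PersonV1 n) = PersonV2 n Nothing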
Alternatively the protocol can be defined in a special purpose protocol module P, and A and B then import P. This is the approach taken by CORBA with IDL. However what happens if P is updated to P'? Does this mean that both A and B need to be recompiled and restarted simultaneously? Requiring this is a Bad Thing; imagine if every bank in the world had to upgrade and restart its computers simultaneously in order to upgrade a common protocol.
I would go for the middle ground and sidestep the issue entirely. Let's be practical here: when a binary protocol is updated, all code using the protocol needs to be updated. This would be good enough. It would suit me just fine too, as I'm not yearning for CORBA; I just want to build a trading infrastructure entirely in Haskell.
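Concretely (with made-up message names), the shared-module approach is just this, and any change to it means rebuilding both endpoints together:

-- Protocol.hs: the one module both endpoints import. Change it and
-- both sides must be recompiled; there is no run-time negotiation.
module Protocol where

data Msg
    = Order  { qty :: Int, sym :: String }
    | Cancel { orderId :: Int }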
There is still the possibility of a run-time failure at the protocol negotiation stage, of course, if it transpires that the two processes have no common protocol.
So no protocol negotiation!
However there is a wrinkle here: what about "pass through" processes which don't interpret the data but just store and forward it. Various forms of protocol adapter fit this scenario, as does the database you originally asked about.
Any packet traveling over the wire would need to have a size, followed by a body. Any pass-through protocol can just take the binary blob and re-send it.
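Roughly this, as a sketch using Data.Binary (error handling and incremental reads are left out):

import qualified Data.ByteString.Lazy as L
import Data.Binary.Get (getWord32be, runGet)
import Data.Binary.Put (putLazyByteString, putWord32be, runPut)

-- Prefix the body with its size as a 32-bit big-endian length.
frame :: L.ByteString -> L.ByteString
frame body = runPut $ do
    putWord32be (fromIntegral (L.length body))
    putLazyByteString body

-- A pass-through process peels off one frame and forwards the body
-- untouched, without ever interpreting it.
unframe :: L.ByteString -> (L.ByteString, L.ByteString)
unframe s = L.splitAt n (L.drop 4 s)
  where
    n = fromIntegral (runGet getWord32be s)

Thanks, Joel

--
http://wagerlabs.com/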