
At Thu, 23 Jun 2011 00:40:38 -0700 (PDT), oleg@okmij.org wrote:
How would you make this safe for dynamic loading? Would you have to keep track of all the package/module names that have been linked into the program and/or loaded, and not allow duplicates? Is dynamic unloading of code ever possible? Or at least is re-loading possible, since that appears to be key for debugging web servers and such?
I must read up on Andreas Rossberg's work and check AliceML (which is, alas, no longer developed) to answer the questions on safe dynamic loading. AliceML is probably the most ambitious system to support type-safe loading. See, for example
http://www.mpi-sws.org/~rossberg/papers/Rossberg%20-%20The%20Missing%20Link....
and other papers on his web page. He now works at Google, btw.
I wish I knew ML better, but from looking at that paper, I can't figure out the key piece of information, which is how to detect overlapping type instance declarations. (Does ML even have something equivalent?)

In the absence of type families, I'm okay using something like haskell-plugins that returns Dynamic, and Safe Haskell ought to be able to guarantee that typereps are all unique, making cast safe in such a situation. (Even if somehow two different pieces of code end up with conflicting functional dependencies, this may muck with the expected behavior of the program, as I get a different method from the one I was expecting, but it won't undermine type safety.) If there were a way to use Baars & Swierstra instead of Dynamic, that would be even better.

With type families, I'm at a loss to figure out how this could even work. You would need to have a runtime representation of every type family that has ever been instantiated, and you would need to check this against all the type families in the dynamic module you are loading. That seems very complicated.
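Concretely, the Dynamic approach would look something like the sketch below. (loadSymbol is a hypothetical stand-in for a plugin-loading API in the spirit of haskell-plugins; the real API differs. Only Data.Dynamic is standard here.)

    import Data.Dynamic (Dynamic, fromDynamic)

    -- Hypothetical: some plugin loader hands back a Dynamic for a named
    -- symbol, rather than a value at a trusted type.
    loadSymbol :: FilePath -> String -> IO Dynamic
    loadSymbol = error "stand-in for a real dynamic loader"

    -- As long as typereps are guaranteed unique, fromDynamic either
    -- yields a value at the requested type or fails cleanly.
    main :: IO ()
    main = do
      dyn <- loadSymbol "Handler.o" "handler"
      case fromDynamic dyn :: Maybe (String -> String) of
        Just f  -> putStrLn (f "request")
        Nothing -> putStrLn "handler has the wrong type"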
MetaOCaml deals with this by not doing anything about it, letting the garbage code accumulate. It is hard to know when all the code pointers into a DLL are gone. GCs do not usually track code pointers; one would have to modify the GC, which is quite an undertaking. I suppose one could hack around this by registering each (potential) entry point as a ForeignPtr and doing reference counting in a finalizer. Since a DLL may produce lots of closures, each code pointer in those closures would have to be counted as an entry point.
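A minimal sketch of that reference-counting hack, assuming a hypothetical library handle (DLL, dllUnload, and wrapEntry are illustrative names, not part of any existing loader API):

    import Control.Monad (when)
    import Data.IORef (IORef, atomicModifyIORef')
    import Foreign.Concurrent (newForeignPtr)
    import Foreign.ForeignPtr (ForeignPtr)
    import Foreign.Ptr (Ptr)

    -- Hypothetical handle for a loaded library: a count of live entry
    -- points plus an action that actually unloads the code.
    data DLL = DLL { dllRefs :: IORef Int, dllUnload :: IO () }

    -- Wrap one (potential) entry point.  The finalizer decrements the
    -- count and unloads the library once no wrapped pointers remain
    -- reachable.  Every code pointer the DLL hands out, including those
    -- hidden in closures it produces, would have to go through this.
    wrapEntry :: DLL -> Ptr a -> IO (ForeignPtr a)
    wrapEntry dll p = do
      atomicModifyIORef' (dllRefs dll) (\n -> (n + 1, ()))
      newForeignPtr p $ do
        n <- atomicModifyIORef' (dllRefs dll) (\k -> (k - 1, k - 1))
        when (n == 0) (dllUnload dll)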
It would be very cool to GC the code, but that's more than I'm hoping for.
Finally, how do you express constraints in contexts using type families? For example, say I wanted to implement folds over tuples. With fundeps (and undecidable instances, etc.), I would use a proxy type of class Function2 to represent the function:
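Something along these lines (a sketch with illustrative names: Function2 is the proxy class, TupleFold does the fold, and tuples are taken as right-nested pairs ending in ()):

    {-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies,
                 FlexibleInstances, UndecidableInstances #-}

    -- Proxy class for type-level functions of two arguments:
    -- applying f to a and b yields c.
    class Function2 f a b c | f a b -> c where
      apply2 :: f -> a -> b -> c

    -- Fold a function (represented by proxy f) across a nested tuple.
    class TupleFold f z t r | f z t -> r where
      tupleFold :: f -> z -> t -> r

    instance TupleFold f z () z where
      tupleFold _ z () = z

    instance (TupleFold f z t r', Function2 f a r' r)
        => TupleFold f z (a, t) r where
      tupleFold f z (a, t) = apply2 f a (tupleFold f z t)

    -- Example proxy: addition on Int.
    data Plus = Plus
    instance Function2 Plus Int Int Int where
      apply2 _ = (+)

    example :: Int
    example = tupleFold Plus (0 :: Int) (1 :: Int, (2 :: Int, ()))  -- 3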
Your code can be re-written to use type functions almost mechanically (the method definitions stay the same -- modulo uncurrying, which I used for generality). I admit there is a bit of repetition (repeating the instance head). Perhaps a better syntax could be found.
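The same structure with associated type families might look like the following (again with illustrative names; the instance heads repeated in each FoldResult equation are the repetition referred to above):

    {-# LANGUAGE TypeFamilies, FlexibleInstances,
                 FlexibleContexts, UndecidableInstances #-}

    class Function2 f a b where
      type Apply2 f a b
      apply2 :: f -> a -> b -> Apply2 f a b

    class TupleFold f z t where
      type FoldResult f z t
      tupleFold :: f -> z -> t -> FoldResult f z t

    instance TupleFold f z () where
      type FoldResult f z () = z
      tupleFold _ z () = z

    instance (TupleFold f z t, Function2 f a (FoldResult f z t))
        => TupleFold f z (a, t) where
      type FoldResult f z (a, t) = Apply2 f a (FoldResult f z t)
      tupleFold f z (a, t) = apply2 f a (tupleFold f z t)

Note that UndecidableInstances is still needed here: the right-hand side of FoldResult for pairs nests one type-family application inside another.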
This is great. And in fact I was almost sold on your idea, until I realized that GHC does not accept the following program:

    {-# LANGUAGE TypeFamilies #-}

    x :: ()
    x = const () (eq a b)
      where
        a :: S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(Z)))))))))))))))))))))
        a = undefined
        b :: S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(S(Z)))))))))))))))))))))
        b = undefined

    eq :: a -> b -> NatEq a b
    eq _ _ = undefined

Notice that no undecidable instances are required to compile this function (or the portions of your TTypeable that this function depends on). Yet GHC is still limiting my context stack (to 21 by default). Obviously, any kind of Unicode encoding of type names is going to run into the same sort of problem.

So we seem to have come full circle: undecidable instances are problematic because program compilation depends on an arbitrary context-stack depth parameter. One way to avoid UndecidableInstances is to assign a unique type-level Nat to each type. But now the comparison of types depends on this same context stack, likely worsening the problem.

A few thoughts:

First, obviously, if we went base 2 instead of unary for the encoding, types would get a lot shorter. If we took a SHA-256 hash of the type name and encoded that, we could then pick some default context depth (e.g., 21 + 256) that would make this work most of the time.

Second, I believe it is kind of unfair of GHC to reject the above program. Without UndecidableInstances, GHC ought to be able to figure out that the type checker will terminate. Moreover, in the real world, GHC only has a finite amount of time and memory, so there is no difference between divergence and really big types. I can already crash GHC by feeding it a doubly-exponential-sized type such as:

    a = let f0 x = (x, x)
            f1 x = f0 (f0 x)
            f2 x = f1 (f1 x)
            f3 x = f2 (f2 x)
            f4 x = f3 (f3 x)
            f5 x = f4 (f4 x)
        in f5 ()

So given that decidable inputs already exist on which GHC wedges my machine for a while and then crashes, what is the point of artificially limiting compilation of other programs whose inputs aren't known to be decidable, since often they are in fact decidable?

Finally, I still think most of the "magic" in everything we've been talking about boils down to being able to have a type variable that can take on any type *except* a certain handful. This would allow us to avoid overlapping instances of both classes and type families, and would allow us to define TypeEq, etc. Moreover, it would be a more direct expression of what programmers intend, and so would make code easier to read. Such a constraint cannot be part of the context, because instance selection would need to depend on it, so some new syntax would be necessary.

David