Re: [Haskell-cafe] My Unification Sorrows

I seem to have did an accidental reply-to-sender at first: On Wed, Apr 04, 2007 at 01:37:44PM -0700, Paul Berg wrote:
Ok, so I decided to implement an algorithm to build Strongly Typed Genetic Programming trees in Haskell, in an effort to learn Haskell, and I'm way over my head on this unification problem.
The unification seems to work well, as long as I include the occurs check. growFullTree returns a lazy list of all possible syntax trees in random order that are well typed. So, with the occurs check:
head $ growFullTree 4 TInt 0 [] (mkStdGen 5)
Gives me a nice, random, well typed tree. So far so good, although I am wary of the efficiency... perhaps I need some applySubst calls in there somewhere to keep the constraint list down?
Anyway now, we comment out the occurs check in the unify function and do the same thing again. Now we get what appears to be an infinite constraint! The unify is supposed to be terminating without the occurs check, so something is very wrong here!
Worse, now we run:
head $ growFullTree 5 TInt 0 [] (mkStdGen 7)
This one never even begins printing! Although it appears the bug is tail recursive in this case. Other random number seeds will cause it to blow stack.
I have gone over and over this code and cannot find the issue due to my lack of experience with both Haskell and Unification algorithms in general.
I was hoping someone here could give pointers on where the bug might lie, eg. what is my algorithm doing wrong that makes it a mostly, but not completely correct unification algorithm, as well as give me pointers on how this code could be made cleaner and/or more concise, as I'm trying very hard to get my brain around this language, and this problem.
Here's the basic code, which should be fully runnable:
The problem is with this type: data Type = TInt | TReal | TVar TVName | Type :-> Type Because it is impossible to observe sharing in Haskell values, we are forced to explore the entire type; but without the occurs check the type can be infinite! In my infinite-types unifier, I used this: data Bin = Sum | Fun | Prod deriving (Eq, Show, Ord) data Zen = ZTup deriving (Eq,Show,Ord) data DictElt = EZen Zen | EBin Bin Type Type deriving (Eq, Show, Ord) type Dict = M.Map Int DictElt Changing to use Paul's set of types: data DictElt = TInt | TReal | TVar | Int :-> Int type Dict = M.Map Int DictElt IE, we use Ints explicitly as pointers. However this representation costs a lot in terms of clarity, since everything must be explicitly indirected through the 'heap'. Also fixtypes was a very old project of mine; were I to do it again I would probably use explicit mutable references: data Tinfo s = TInt | TReal | TVar | Type s :-> Type s type Type s = STRef s (Tinfo s) This may conflict with your method of backtracking tree generation; I haven't read that far in the code yet. Stefan ----- End forwarded message -----

I believe I may have found a solution (I *think* it's correct): The occurs check needs to stay, but be modified for infinite types. When the occurs check is true, instead of failing, we should keep the constraint, but skip performing substitution on the rest of the list of constraints. This allows us to not fall into the trap of an infinite substitution loop. So, unify becomes, in the case of infinite types: -- | Given a list of constraints, unify those constraints, -- finding values for the type variables unify :: (Monad m, Functor m) => [Constraint] -> m [Constraint] unify [] = return [] unify ((t1 , t2) :rest) | t1 == t2 = unify rest unify ((tyS, tyX@(TVar _)) :rest) -- | tyX `occursIn` tyS = fail "Infinite Type" | tyX `occursIn` tyS = fmap ((tyX,tyS):) (unify rest) | otherwise = fmap ((tyX,tyS):) (unify $ substinconstr tyX tyS rest) unify ((tyX@(TVar _), tyT) :rest) = unify $ (tyT,tyX) : rest unify ((tyS1 :-> tyS2,tyT1 :-> tyT2):rest) = unify $ (tyS1,tyT1) : (tyS2,tyT2) : rest unify _ = fail "Unsolvable"

On Wed, Apr 04, 2007 at 07:16:35PM -0700, Paul Berg wrote:
I believe I may have found a solution (I *think* it's correct):
The occurs check needs to stay, but be modified for infinite types. When the occurs check is true, instead of failing, we should keep the constraint, but skip performing substitution on the rest of the list of constraints. This allows us to not fall into the trap of an infinite substitution loop. So, unify becomes, in the case of infinite types:
-- | Given a list of constraints, unify those constraints, -- finding values for the type variables unify :: (Monad m, Functor m) => [Constraint] -> m [Constraint] unify [] = return [] unify ((t1 , t2) :rest) | t1 == t2 = unify rest unify ((tyS, tyX@(TVar _)) :rest) -- | tyX `occursIn` tyS = fail "Infinite Type" | tyX `occursIn` tyS = fmap ((tyX,tyS):) (unify rest) | otherwise = fmap ((tyX,tyS):) (unify $ substinconstr tyX tyS rest) unify ((tyX@(TVar _), tyT) :rest) = unify $ (tyT,tyX) : rest unify ((tyS1 :-> tyS2,tyT1 :-> tyT2):rest) = unify $ (tyS1,tyT1) : (tyS2,tyT2) : rest unify _ = fail "Unsolvable"
I honestly don't understand the algorithm. (And I don't have tapl!!) So, ... quickcheck! You've seen my occurs-free unifier; now write an Arbitrary instance for trees and check that they always give the same result. "The chance of a common bug is neglible.". Stefan

This solution is invalid for some edge cases.
Better than quickcheck, I can map eval across all possible trees of
depth n, since growFullTree returns a lazy list of all trees.
On 4/4/07, Stefan O'Rear
On Wed, Apr 04, 2007 at 07:16:35PM -0700, Paul Berg wrote:
I believe I may have found a solution (I *think* it's correct):
The occurs check needs to stay, but be modified for infinite types. When the occurs check is true, instead of failing, we should keep the constraint, but skip performing substitution on the rest of the list of constraints. This allows us to not fall into the trap of an infinite substitution loop. So, unify becomes, in the case of infinite types:
-- | Given a list of constraints, unify those constraints, -- finding values for the type variables unify :: (Monad m, Functor m) => [Constraint] -> m [Constraint] unify [] = return [] unify ((t1 , t2) :rest) | t1 == t2 = unify rest unify ((tyS, tyX@(TVar _)) :rest) -- | tyX `occursIn` tyS = fail "Infinite Type" | tyX `occursIn` tyS = fmap ((tyX,tyS):) (unify rest) | otherwise = fmap ((tyX,tyS):) (unify $ substinconstr tyX tyS rest) unify ((tyX@(TVar _), tyT) :rest) = unify $ (tyT,tyX) : rest unify ((tyS1 :-> tyS2,tyT1 :-> tyT2):rest) = unify $ (tyS1,tyT1) : (tyS2,tyT2) : rest unify _ = fail "Unsolvable"
I honestly don't understand the algorithm. (And I don't have tapl!!) So, ... quickcheck! You've seen my occurs-free unifier; now write an Arbitrary instance for trees and check that they always give the same result. "The chance of a common bug is neglible.".
Stefan

Also, the random backtracker isn't truly random.. it clusters
solutions. the head should be completely random each time though.
On 4/4/07, Stefan O'Rear
On Wed, Apr 04, 2007 at 07:16:35PM -0700, Paul Berg wrote:
I believe I may have found a solution (I *think* it's correct):
The occurs check needs to stay, but be modified for infinite types. When the occurs check is true, instead of failing, we should keep the constraint, but skip performing substitution on the rest of the list of constraints. This allows us to not fall into the trap of an infinite substitution loop. So, unify becomes, in the case of infinite types:
-- | Given a list of constraints, unify those constraints, -- finding values for the type variables unify :: (Monad m, Functor m) => [Constraint] -> m [Constraint] unify [] = return [] unify ((t1 , t2) :rest) | t1 == t2 = unify rest unify ((tyS, tyX@(TVar _)) :rest) -- | tyX `occursIn` tyS = fail "Infinite Type" | tyX `occursIn` tyS = fmap ((tyX,tyS):) (unify rest) | otherwise = fmap ((tyX,tyS):) (unify $ substinconstr tyX tyS rest) unify ((tyX@(TVar _), tyT) :rest) = unify $ (tyT,tyX) : rest unify ((tyS1 :-> tyS2,tyT1 :-> tyT2):rest) = unify $ (tyS1,tyT1) : (tyS2,tyT2) : rest unify _ = fail "Unsolvable"
I honestly don't understand the algorithm. (And I don't have tapl!!) So, ... quickcheck! You've seen my occurs-free unifier; now write an Arbitrary instance for trees and check that they always give the same result. "The chance of a common bug is neglible.".
Stefan
participants (2)
-
Paul Berg
-
Stefan O'Rear