
Hi, I'm new to Haskell, and I'm writing a class for evolutionary trees. Since I'm new to Haskell, trying to encode information at the type level is new to me. In C++ I'd use runtime checks to ask whether (for example) a tree is rooted. Or I'd have the getRoot function return Maybe NodeID and then make the methods for checking if a branch points rootward throw an exception if the tree is unrooted. I'm trying to avoid doing this the C++ way, but this raises a lot of question. Let me start with 2 of them: 1. Is there any advice on how to represent data structures that may or may not have a list of attributes? But what I have in Haskell right now looks like this: data Graph (IntMap Node) (IntMap Edge) data WithBranchLengths t = WithBranchLengths t (IntMap Double) data WithLabels t l = WithLabels t (IntMap (Maybe l)) data Forest = Forest Graph {- Assert that the Graph is a Forest -} data WithRoots t = WithRoots t [NodeId] data WithNodeTimes t = WithNodeTimes t (IntMap Double) data WithBranchRates t = WithBranchRates t (IntMap Double) data Tree = Tree Forest {- Assert that the Forest is a Tree -} A tree with labels, roots, and branch lengths could have following types: WithLabels (WithRoots (WithBranchLengths Tree)) Text WithLabels (WithBranchLengths (WithRoots Tree)) Text WithRoots (WithLabels (WithBranchLengths Tree) Text) WithRoots (WithBranchLengths (WithLabels Tree Text)) .. etc etc.. The fact that there are multiple ways to represent the same thing is already not great. Also, the fact that you can do WithBranchLengths (WithBranchLengths Tree) indicates that the construction of types should be obeying certain rules that are not expressed. Only the outer WithBranchLengths is going to count, and there really should not be two of the same attribute. 2. I'm wondering if it is possible to do run-time selection on whether a tree is rooted or not. (I've considered trying to move all checks to the type-level, but this inhibits putting models with different properties into the same list -- see below) class IsGraph g where ... class IsGraph g => IsDirectedGraph g where isForward :: g -> EdgeId -> Bool class IsDirectedGraph g => IsDirectedAcyclicGraph g class IsGraph f => IsForest f where type family Unrooted f type family Rooted f unroot :: f -> Unrooted f makeRooted :: f -> Rooted f class (IsDirectedAcyclicGraph t, IsForest t) => HasRoots t where isRoot :: t -> NodeId -> Bool roots :: t -> [NodeId] isRoot f n = isSource f n roots f = filter (isRoot f) (getNodes f) At first I was thinking that since rooting is at the type level, then I can't do any runtime branching on whether a tree is rooted. But then it occured to me that a GADT can package dictionaries with a constructor, so that in theory I could do something like: likelihood s t = if isEquilibriumReversible s then equilibriumReversibleLikelihood s t else case isRooted t of Unrooted -> error "tree must be rooted, since model is not reversible or not at equilibrium!" Rooted -> rootedTreeLikelihood s t It seems like this might work if I could make a GADT with a signature something like: data MaybeRooted t = Unrooted | HasRoots t => Rooted -- Q: should this use a forall type? In that case the Rooted constructor would package the HasRoots t dictionary needed to run the rootedTreeLikelihood function. But I'm not sure how I would actually write the `isRooted` function that would construct the MaybeRooted t data type. I've considered trying to encode the EquilibriumReversible predicate at the type level to avoid runtime checks, but that has the problem that I couldn't put models with different properties into the the same list of type [m]. So I guess the questions are: Q2a: is this a reasonable way of trying to do runtime selection on whether a tree is rooted? Q2b: how would I write the `isRooted t` function? Feedback much appreciated. -BenRI P.S. Haskell code is here: https://github.com/bredelings/BAli-Phy/tree/master/haskell