Runtime selection on types while capturing a dictionary?

Hi, I'm new to Haskell, and I'm writing a class for evolutionary trees. Since I'm new to Haskell, trying to encode information at the type level is new to me. In C++ I'd use runtime checks to ask whether (for example) a tree is rooted. Or I'd have the getRoot function return Maybe NodeID and then make the methods for checking if a branch points rootward throw an exception if the tree is unrooted. I'm trying to avoid doing this the C++ way, but this raises a lot of question. Let me start with 2 of them: 1. Is there any advice on how to represent data structures that may or may not have a list of attributes? But what I have in Haskell right now looks like this: data Graph (IntMap Node) (IntMap Edge) data WithBranchLengths t = WithBranchLengths t (IntMap Double) data WithLabels t l = WithLabels t (IntMap (Maybe l)) data Forest = Forest Graph {- Assert that the Graph is a Forest -} data WithRoots t = WithRoots t [NodeId] data WithNodeTimes t = WithNodeTimes t (IntMap Double) data WithBranchRates t = WithBranchRates t (IntMap Double) data Tree = Tree Forest {- Assert that the Forest is a Tree -} A tree with labels, roots, and branch lengths could have following types: WithLabels (WithRoots (WithBranchLengths Tree)) Text WithLabels (WithBranchLengths (WithRoots Tree)) Text WithRoots (WithLabels (WithBranchLengths Tree) Text) WithRoots (WithBranchLengths (WithLabels Tree Text)) .. etc etc.. The fact that there are multiple ways to represent the same thing is already not great. Also, the fact that you can do WithBranchLengths (WithBranchLengths Tree) indicates that the construction of types should be obeying certain rules that are not expressed. Only the outer WithBranchLengths is going to count, and there really should not be two of the same attribute. 2. I'm wondering if it is possible to do run-time selection on whether a tree is rooted or not. (I've considered trying to move all checks to the type-level, but this inhibits putting models with different properties into the same list -- see below) class IsGraph g where ... class IsGraph g => IsDirectedGraph g where isForward :: g -> EdgeId -> Bool class IsDirectedGraph g => IsDirectedAcyclicGraph g class IsGraph f => IsForest f where type family Unrooted f type family Rooted f unroot :: f -> Unrooted f makeRooted :: f -> Rooted f class (IsDirectedAcyclicGraph t, IsForest t) => HasRoots t where isRoot :: t -> NodeId -> Bool roots :: t -> [NodeId] isRoot f n = isSource f n roots f = filter (isRoot f) (getNodes f) At first I was thinking that since rooting is at the type level, then I can't do any runtime branching on whether a tree is rooted. But then it occured to me that a GADT can package dictionaries with a constructor, so that in theory I could do something like: likelihood s t = if isEquilibriumReversible s then equilibriumReversibleLikelihood s t else case isRooted t of Unrooted -> error "tree must be rooted, since model is not reversible or not at equilibrium!" Rooted -> rootedTreeLikelihood s t It seems like this might work if I could make a GADT with a signature something like: data MaybeRooted t = Unrooted | HasRoots t => Rooted -- Q: should this use a forall type? In that case the Rooted constructor would package the HasRoots t dictionary needed to run the rootedTreeLikelihood function. But I'm not sure how I would actually write the `isRooted` function that would construct the MaybeRooted t data type. I've considered trying to encode the EquilibriumReversible predicate at the type level to avoid runtime checks, but that has the problem that I couldn't put models with different properties into the the same list of type [m]. So I guess the questions are: Q2a: is this a reasonable way of trying to do runtime selection on whether a tree is rooted? Q2b: how would I write the `isRooted t` function? Feedback much appreciated. -BenRI P.S. Haskell code is here: https://github.com/bredelings/BAli-Phy/tree/master/haskell

On Mon, Oct 07, 2024 at 05:38:30PM -0400, Benjamin Redelings wrote:
I'm new to Haskell, and I'm writing a class for evolutionary trees. Since I'm new to Haskell, trying to encode information at the type level is new to me. In C++ I'd use runtime checks to ask whether (for example) a tree is rooted. Or I'd have the getRoot function return Maybe NodeID and then make the methods for checking if a branch points rootward throw an exception if the tree is unrooted.
I think I'd just do data AugmentedTree roots lengths labels = MkAugmentedTree { tree :: Tree, labels :: labels, roots :: roots, lengths :: lenghts } and then just do stuff like type FullyAugmentedTree = AugmentedTree [NodeId] (IntMap Double) (IntMap (Maybe l)) type PartiallyAugmentedTree = AugmentedTree [NodeId] () (IntMap (Maybe l)) For rootedness-independence I strongly recommend against any type class shenanigans. Instead you can do data PossiblyRootedTree lengths labels where RootedTree :: AugmentedTree [NodeId] lengths labels UnrootedTree :: AugmentedTree () lengths labels and then isRooted :: PossiblyRootedTree lengths labels -> Bool isRooted = \case RootedTree {} -> True UnrootedTree {} -> False (but `isRooted` is not really very useful compared to just pattern matching every where you need to know.) Tom

On 8 Oct 2024, at 00:14, Tom Ellis
wrote: data PossiblyRootedTree lengths labels where RootedTree :: AugmentedTree [NodeId] lengths labels UnrootedTree :: AugmentedTree () lengths labels
Isn’t that invalid syntax? Wouldn’t you need either data PossiblyRootedTree lengths labels where RootedTree :: AugmentedTree [NodeId] lengths labels -> PossiblyRootedTree lengths labels UnrootedTree :: AugmentedTree () lengths labels -> PossiblyRootedTree lengths labels or simply (avoiding GADT syntax altogether): data PossiblyRootedTree lengths labels = RootedTree (AugmentedTree [NodeId] lengths labels) | UnrootedTree (AugmentedTree () lengths labels)

On Wed, Oct 09, 2024 at 09:33:30AM +0200, Andreas Källberg wrote:
On 8 Oct 2024, at 00:14, Tom Ellis
wrote: data PossiblyRootedTree lengths labels where RootedTree :: AugmentedTree [NodeId] lengths labels UnrootedTree :: AugmentedTree () lengths labels Isn’t that invalid syntax? Wouldn’t you need either
data PossiblyRootedTree lengths labels where RootedTree :: AugmentedTree [NodeId] lengths labels -> PossiblyRootedTree lengths labels UnrootedTree :: AugmentedTree () lengths labels -> PossiblyRootedTree lengths labels
or simply (avoiding GADT syntax altogether):
data PossiblyRootedTree lengths labels = RootedTree (AugmentedTree [NodeId] lengths labels) | UnrootedTree (AugmentedTree () lengths labels)
Yes, you're right, thanks (and I'm not sure why I thought that required a GADT). Tom
participants (3)
-
Andreas Källberg
-
Benjamin Redelings
-
Tom Ellis