
On Tue, Aug 3, 2010 at 8:01 AM, Ivan Lazar Miljenovic wrote:
Felipe Lessa
writes: 'hierarchical-clustering' provides a function to create a dendrogram from a list of items and a distance function between them. The most common linkage types are available: single linkage, complete linkage and UPGMA. An item can be anything, for example a DNA sequence, so this may be used to create a phylogenetic tree.
What actual clustering algorithm are you using here?
A naïve O(n^2) algorithm using a distance matrix. This can be improved without changing the API, however.
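For readers who want to see the general shape of such an algorithm, here is a minimal, self-contained sketch of naive agglomerative clustering. It is not the package's implementation: the types `Dendrogram` and `Linkage` and the function `naiveDendrogram` are hypothetical names defined only for this example, and for brevity it recomputes cluster distances on every step instead of caching them in a distance matrix.

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)

-- Hypothetical types for this sketch; not the package's own definitions.
data Dendrogram a
  = Leaf a
  | Branch Double (Dendrogram a) (Dendrogram a)
  deriving (Show, Eq)

data Linkage = Single | Complete
  deriving (Show, Eq)

-- All items stored under a dendrogram.
elements :: Dendrogram a -> [a]
elements (Leaf x)       = [x]
elements (Branch _ l r) = elements l ++ elements r

-- Distance between two clusters: the smallest (Single) or largest
-- (Complete) item-to-item distance across the two clusters.
clusterDist :: Linkage -> (a -> a -> Double)
            -> Dendrogram a -> Dendrogram a -> Double
clusterDist link dist c1 c2 =
  pick [dist x y | x <- elements c1, y <- elements c2]
  where
    pick = case link of
      Single   -> minimum
      Complete -> maximum

-- Repeatedly merge the closest pair of clusters until one remains.
-- Returns Nothing for an empty input list.
naiveDendrogram :: Linkage -> (a -> a -> Double) -> [a] -> Maybe (Dendrogram a)
naiveDendrogram _ _ [] = Nothing
naiveDendrogram link dist xs = Just (go (map Leaf xs))
  where
    go [c] = c
    go cs  = go (Branch d ci cj : rest)
      where
        indexed = zip [0 :: Int ..] cs
        -- Every unordered pair of clusters with its linkage distance.
        pairs = [ (clusterDist link dist a b, p, q)
                | (p, a) <- indexed, (q, b) <- indexed, p < q ]
        (d, i, j) = minimumBy (comparing (\(x, _, _) -> x)) pairs
        ci   = cs !! i
        cj   = cs !! j
        rest = [c | (k, c) <- indexed, k /= i, k /= j]
```

In GHCi, `naiveDendrogram Single (\x y -> abs (x - y)) [1, 2, 6, 8]` yields a tree whose root merges at distance 4; with `Complete` the root merges at distance 7. Because the interface is just "items plus a distance function", a faster algorithm could replace `go` without callers noticing, which is the point made above about the API.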
Also, would it be possible to have some more documentation there in general? At the very least, in your next release explain what a dendrogram is and why someone would want to use your package (I had to do some quick Wikipedia looking to refresh my memory on what dendrograms, etc. were to get an understanding of what it does).
Documentation is always good, but I didn't want to take the time to explain everything from the beginning. I guess most people coming to this package will already know that they want a dendrogram. But if they don't, a quick googling is very effective. Hmm, I guess some diagrams would be nice. I took the time only to explain why there is a "UPGMA" and a "FakeAverageLinkage", because that distinction isn't easy to find on the web. Actually, I still haven't found anyone talking about it, just people using either one under the same name, "average linkage". =) Cheers, -- Felipe.
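As a rough illustration of how two rules can both end up being called "average linkage", the sketch below contrasts a size-weighted update with a plain mean. It assumes that the distinction is the usual one between UPGMA and the unweighted variant often called WPGMA; whether that matches the package's `FakeAverageLinkage` exactly is an assumption, and the function names `upgmaUpdate` and `unweightedUpdate` are hypothetical.

```haskell
-- Both functions compute the distance from a merged cluster (A ∪ B) to
-- another cluster C, given the old distances d(A,C) and d(B,C).

-- Size-weighted update (UPGMA): equals the mean over all item pairs,
-- so larger clusters contribute proportionally more.
upgmaUpdate :: Int -> Int -> Double -> Double -> Double
upgmaUpdate sizeA sizeB dAC dBC =
  (fromIntegral sizeA * dAC + fromIntegral sizeB * dBC)
    / fromIntegral (sizeA + sizeB)

-- Plain mean of the two old distances, ignoring cluster sizes.
unweightedUpdate :: Double -> Double -> Double
unweightedUpdate dAC dBC = (dAC + dBC) / 2
```

With |A| = 3, |B| = 1, d(A,C) = 2 and d(B,C) = 10, `upgmaUpdate 3 1 2 10` gives 4, while `unweightedUpdate 2 10` gives 6. The two rules agree only when the merged clusters have the same size, so they can produce different dendrograms on the same data.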