
Most of the well-known algorithms are first-order, in the sense that their input and output are "plain" data. Some are second-order in a trivial way, for example sorting, hashtables, or the map and fold functions: they are parameterized by a function, but they don't really do anything interesting with it except invoke it on pieces of other input data.

Some are also second-order but somewhat more interesting:
* Fingertrees parameterized by monoids
* Splitting a fingertree on a monotone predicate
* Prefix sum algorithms, again usually parameterized by a monoid or a predicate
etc.

Finally, some are "truly" higher-order in the sense that is most interesting to me:
* The Y combinator
* Difference lists

Do there exist other nontrivial higher-order algorithms and datastructures? Is the field of higher-order algorithms indeed as unexplored as it seems?

I mean that not only are higher-order facilities used, but the essence of the algorithm is some non-trivial higher-order manipulation. For example, parser combinators are not so interesting: they are a bunch of relatively orthogonal (by their purpose) combinators, each of which is by itself quite trivial, plus not-quite-higher-order backtracking at the core.

In contrast, the Y combinator and difference lists are interesting: the Y combinator builds a function from a function in a highly non-trivial way; difference lists are a data structure built entirely from functions and manipulated using higher-order mechanisms.

-- Eugene Kirpichov
Senior Software Engineer, Grid Dynamics
http://www.griddynamics.com/
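For concreteness, here are minimal Haskell sketches of the two "truly" higher-order examples above (one possible rendering; the names are illustrative):

    -- Difference lists: a list represented as the function that prepends
    -- it, so append is function composition and costs O(1).
    newtype DList a = DList ([a] -> [a])

    fromList :: [a] -> DList a
    fromList xs = DList (xs ++)

    toList :: DList a -> [a]
    toList (DList f) = f []

    append :: DList a -> DList a -> DList a
    append (DList f) (DList g) = DList (f . g)  -- no traversal of either list

    -- The Y combinator, in its typed guise as a fixed-point combinator:
    -- it builds a recursive function from a non-recursive one.
    fix :: (a -> a) -> a
    fix f = f (fix f)

    factorial :: Integer -> Integer
    factorial = fix (\rec n -> if n <= 0 then 1 else n * rec (n - 1))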

2010/8/23 Eugene Kirpichov
[snip] Do there exist other nontrivial higher-order algorithms and datastructures? Is the field of higher-order algorithms indeed as unexplored as it seems? [snip]
Hi,

I'm thinking of HOAS (higher-order abstract syntax) representations.

Cheers,
Thu
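A tiny sketch of the HOAS idea (one rendering, not from any particular library): binders of the object language are represented by functions of the metalanguage, so substitution comes for free from application.

    data Term
      = Lam (Term -> Term)   -- the binder is a Haskell function
      | App Term Term

    -- Weak-head reduction: beta reduction is just function application,
    -- with no environments or capture-avoiding substitution to manage.
    whnf :: Term -> Term
    whnf (App f a) = case whnf f of
      Lam body -> whnf (body a)
      g        -> App g a
    whnf t = t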

2010/8/23 Eugene Kirpichov
For example, parser combinators are not so interesting: they are a bunch of relatively orthogonal (by their purpose) combinators, each of which is by itself quite trivial, plus not-quite-higher-order backtracking at the core.
This is only if you don't consider generalizing parser combinators to non-backtracking algorithms. The CYK algorithm [1] does not backtrack; it merges partial parsing results. When I thought about it, I figured that parser combinators become even more restricted than they are in arrow parsers.

[1] http://en.wikipedia.org/wiki/CYK_algorithm

PS CYK is interesting because it provides parallel parsing opportunities: it can parse many parts of the text in parallel and then merge bags of successful parses into other successful parses. Because CYK does not care about the start of the sequence, it has also been used to parse grammars on hypergraphs: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.47.6425

PPS I haven't thought fully about CYK parser combinators yet, but I think CYK could be an example of something unusual in the familiar field of parsing.
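To make the merging concrete, here is a minimal CYK sketch for a grammar in Chomsky normal form (my own illustration; the Grammar type and names are invented for this example). Each cell holds the set of nonterminals deriving a span, and adjacent spans are merged without any backtracking:

    import qualified Data.Set as Set

    type NT = String

    data Grammar = Grammar
      { binaryRules   :: [(NT, (NT, NT))]  -- A -> B C
      , terminalRules :: [(NT, Char)]      -- A -> 'a'
      }

    -- cyk g s (i, len) = the nonterminals deriving the substring of s
    -- starting at i with length len. The lazily built table memoizes
    -- the recursion over spans.
    cyk :: Grammar -> String -> (Int, Int) -> Set.Set NT
    cyk g s = table
      where
        n = length s
        memo = [ [ cell i len | len <- [0 .. n] ] | i <- [0 .. n] ]
        table (i, len) = memo !! i !! len
        cell _ 0 = Set.empty
        cell i 1 = Set.fromList [ a | (a, c) <- terminalRules g, c == s !! i ]
        cell i len = Set.fromList
          [ a
          | k <- [1 .. len - 1]                     -- split point
          , (a, (b, c)) <- binaryRules g
          , b `Set.member` table (i, k)             -- left partial parse
          , c `Set.member` table (i + k, len - k)   -- right partial parse
          ]

A full parse of the input succeeds when the start symbol is in cyk g s (0, length s).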

Eugene Kirpichov wrote:
Do there exist other nontrivial higher-order algorithms and datastructures? Is the field of higher-order algorithms indeed as unexplored as it seems?
Many algorithms in natural language processing can be captured by higher-order algorithms parameterized by the choice of semiring (or module space).

For example, consider the inference problem for hidden Markov models (which are often used for things like determining the part-of-speech tags for some sentence in natural language). To figure out the total probability that the HMM is in some state at some time, you use the Forward algorithm.[1] To figure out the probability of the most likely state sequence that has a specific state at some time, you use the Viterbi algorithm. To figure out not only the probability of the most likely state sequence but also what that tag sequence actually is, you can modify Viterbi to store back pointers.

All of these are the same algorithm, just with different (augmented) semirings. In order to prevent underflow for very small probabilities, we usually run these algorithms with probabilities in the log-domain. Those variants are also the same algorithm, just taking the image of the semiring under the logarithm functor:

    Forward     : FW ([0,1], +, 0, *, 1)

    Log Forward : FW ([-Inf,0], <+>, -Inf, +, 0)
        where
        -- Ignoring infinities...
        x <+> y
            | x >= y    = x + log (1 + exp (y-x))
            | otherwise = y + log (1 + exp (x-y))

    Viterbi     : FW ([0,1], max, 0, *, 1)

    Log Viterbi : FW ([-Inf,0], max, -Inf, +, 0)

    ViterbiBP Q : FW (Maybe([0,1], Maybe Q), argmax, Nothing, <*>, Just(1,Nothing))
        where
        -- Q = the type of the states in your HMM
        mx <*> my = do
            (px,x) <- mx
            (py,y) <- my
            return (px*py, y `mappend` x)

    Log (ViterbiBP Q) : FW ( Maybe([-Inf,0], Maybe Q)
                           , argmax, Nothing
                           , <+>, Just(0,Nothing))
        where
        mx <+> my = do
            (px,x) <- mx
            (py,y) <- my
            return (px+py, y `mappend` x)

Using augmented semirings we can simplify the backpointer version significantly in order to incorporate the optimizations usually encountered in practice. That is, the Maybes are required to make it a semiring, but we can optimize both of them away in practice, yielding an augmented semiring over (Prob,Q) or (Log Prob, Q).

We get the same sort of thing for variants of the Backward algorithm used in the Forward--Backward algorithm. Of course, there's nothing special about HMMs here. We can extend the Forward--Backward algorithm to operate over tree structures instead of just list structures; that version is called the Inside--Outside algorithm. And semirings show up all over the place in other algorithms too.

Of course, in hindsight this makes perfect sense: the powerset of the free semiring over S is the set of all (automata-theoretic) languages over S. So semirings capture languages exactly, in the same way that commutative monoids capture multisets and monoids capture sequences. This insight also extends to cover things like weighted logic programming languages, since we can use any semiring we like, not just the Boolean or probability semirings. Automata-theoretic languages are everywhere.

[1] Or you combine the Forward and Backward algorithms, depending on what exactly you want. Same goes for the others.

-- Live well, ~wren
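A minimal sketch of the parameterization (my own rendering, not wren's code; the Semiring record and names are invented here):

    data Semiring w = Semiring
      { splus  :: w -> w -> w  -- <+>
      , szero  :: w            -- <0>
      , stimes :: w -> w -> w  -- <*>
      , sone   :: w            -- <1>
      }

    -- One Forward algorithm for every semiring: with (+, *) on [0,1] it
    -- is Forward; with (max, *) it is Viterbi; with the log-domain
    -- operations above it is their underflow-safe variants.
    forward
      :: Semiring w
      -> [s]            -- states
      -> (s -> w)       -- initial weights
      -> (s -> s -> w)  -- transition weights
      -> (s -> o -> w)  -- emission weights
      -> [o]            -- observations (assumed non-empty)
      -> w              -- total weight over all state sequences
    forward sr states initial trans emit (o0 : os) =
        sumW (foldl step alpha0 os)
      where
        alpha0 = [ initial s .*. emit s o0 | s <- states ]
        step alphas o =
          [ sumW [ a .*. trans s s' | (a, s) <- zip alphas states ]
              .*. emit s' o
          | s' <- states ]
        x .*. y = stimes sr x y
        sumW    = foldr (splus sr) (szero sr)

For example, the probability and Viterbi instances would be Semiring (+) 0 (*) 1 and Semiring max 0 (*) 1 respectively.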

On 8/24/10 12:29 AM, wren ng thornton wrote:
All of these are the same algorithm, just with different (augmented) semirings. In order to prevent underflow for very small probabilities, we usually run these algorithms with probabilities in the log-domain. Those variants are also the same algorithm, just taking the image of the semiring under the logarithm functor:
Forward : FW ([0,1], +, 0, *, 1)
Technically, the semiring is (E, <+>, <0>, <*>, <1>) where E is an event space, <+> is union of events[1], <0> is the impossible event, <*> is intersection of events[2], and <1> is the event of certainty. But we can simplify things from the event space to a probability space, given the assumptions made by the Forward algorithm.

Just in case anyone cared :)

[1] Pr(x) <+> Pr(y) = Pr(x) + Pr(y) - Pr(x,y)
[2] Pr(x) <*> Pr(y) = Pr(x,y)

-- Live well, ~wren

Interesting. I've come across this general idea/algorithm in the factor graph / sum-product algorithm papers[1], but I was wondering if you knew of any implementations of it in Haskell? I wrote one a while back, but it was fairly ugly and not as general as I'd have liked, so I never released it.
Thanks,
Dan
[1] http://cba.mit.edu/events/03.11.ASE/docs/Loeliger.pdf
On Tue, Aug 24, 2010 at 9:25 AM, wren ng thornton wrote:
[snip]

On 8/24/10 11:10 AM, Daniel Peebles wrote:
Interesting. I've come across this general idea/algorithm in the factor graph / sum-product algorithm papers[1], but I was wondering if you knew of any implementations of it in Haskell? I wrote one a while back, but it was fairly ugly and not as general as I'd have liked, so I never released it.
Yeah, factor graphs and graphical models also see this kind of thing. Basically, anything that can be thought of as collecting or combining paths through a graph is likely to work for arbitrary semirings (again, because of the connection between languages and free semirings). Versions of Dijkstra's algorithm for weighted vs. unweighted graphs are the same thing, for example.

As for Haskell implementations: this summer I've been working on a generalized forward--backward algorithm as well as an anytime n-best algorithm, though I haven't released the code just yet. One of the main aims of the project is to explore incremental, on-line, and interactive algorithms for HMMs, and to make sure the implementation is efficient enough for real-time use. I think the code is pretty attractive, for all that, though there are always a few rough edges.

Curiously enough, I ran into some difficulties when trying to make the algorithm general over different semirings. Basically, GHC was having problems figuring out that two required class instances should be the same one. That's the big thing holding back a public release right now. After doing the final report for this summer, I think I've figured out a new way of tackling it, which I hope will allow GHC to resolve the types. Once I get that figured out I'll throw it up on Hackage and make an announcement.

HMMs, including higher-order HMMs, hit a nice sweet spot when it comes to implementing things efficiently. Trying to do it for arbitrary factor graphs or graphical models is going to make the implementation bog down. For instance, you can perform both passes of the forward--backward algorithm in parallel, because the chain structure of an HMM ensures that the "forward" and "backward" halves of the graph are completely severed. When generalizing this to tree structures you get the inside--outside algorithm, but there the outside pass requires the results of the inside pass, so you can't do them in parallel.

-- Live well, ~wren
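As a sketch of the path-collecting view (mine, not wren's code): a Floyd-Warshall-style closure over an arbitrary semiring. With (min, +) it computes shortest paths, with (||, &&) reachability, with (max, min) widest paths. (Dijkstra proper needs an ordered semiring; this closure form is the easiest to state generally, and it assumes the diagonal entries carry the semiring's one, e.g. distance 0.)

    import Data.Array

    closure
      :: (w -> w -> w)        -- semiring addition: choose between paths
      -> (w -> w -> w)        -- semiring product: extend a path by a path
      -> Array (Int, Int) w   -- edge weights, indexed by (from, to)
      -> Array (Int, Int) w   -- best path weights
    closure plus times edges = foldl relax edges [lo .. hi]
      where
        ((lo, _), (hi, _)) = bounds edges
        -- additionally allow paths through intermediate vertex k
        relax d k = listArray (bounds d)
          [ (d ! (i, j)) `plus` ((d ! (i, k)) `times` (d ! (k, j)))
          | (i, j) <- range (bounds d) ]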

On 23 August 2010 14:03, Eugene Kirpichov
Do there exist other nontrivial higher-order algorithms and datastructures? Is the field of higher-order algorithms indeed as unexplored as it seems?
Aren't higher-order algorithms "functional pearls"? :-)

You might find Olivier Danvy and Michael Spivey's "On Barron and Strachey's Cartesian Product Function" (subtitled "Possibly the world's first functional pearl") an interesting read - BRICS Tech Report RS-07-14. Olivier Danvy has a lot of work on defunctionalization and refunctionalization which may be relevant at the "meta level".

http://www.brics.dk/~danvy/
http://www.brics.dk/RS/07/14/BRICS-RS-07-14.pdf
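For reference, one modern Haskell rendering of the Cartesian product function (the report traces Barron and Strachey's original formulation, which expresses the inner loops as folds as well):

    -- cp [[1,2],[3],[4,5]] == [[1,3,4],[1,3,5],[2,3,4],[2,3,5]]
    cp :: [[a]] -> [[a]]
    cp = foldr (\xs yss -> [ x : ys | x <- xs, ys <- yss ]) [[]]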

Automatic differentiation can also be seen this way. In a sense, it transforms a function to compute f(x) into a function to compute f'(x), where f' is the derivative of f.
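A minimal forward-mode sketch via dual numbers (illustrative, not from the thread): carrying the derivative alongside the value turns a program for f into a program for f'.

    data D = D Double Double  -- a value and its derivative

    instance Num D where
      D x x' + D y y' = D (x + y) (x' + y')
      D x x' * D y y' = D (x * y) (x' * y + x * y')  -- product rule
      negate (D x x') = D (negate x) (negate x')
      abs    (D x x') = D (abs x) (x' * signum x)
      signum (D x x') = D (signum x) 0
      fromInteger n   = D (fromInteger n) 0

    -- diff transforms "compute f x" into "compute f' x"
    diff :: (D -> D) -> Double -> Double
    diff f x = let D _ x' = f (D x 1) in x'

    -- e.g. diff (\x -> x*x + 3*x) 2 == 7.0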
--
Dan
On Mon, Aug 23, 2010 at 6:03 AM, Eugene Kirpichov wrote:
[snip]

Eugene Kirpichov wrote:
[snip]
For example, parser combinators are not so interesting: they are a bunch of relatively orthogonal (by their purpose) combinators, each of which is by itself quite trivial, plus not-quite-higher-order backtracking at the core.
Aww, and there I thought that a famous function of 6th order for combining parsers would be to your liking:

Chris Okasaki. Even Higher-Order Functions for Parsing, or Why Would Anyone Ever Want To Use a Sixth-Order Function?
http://www.eecs.usma.edu/webs/people/okasaki/jfp98.ps

Regards,
Heinrich Apfelmus

-- http://apfelmus.nfshost.com
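The order-counting comes from representing parsers in continuation-passing style. A rough sketch of the shape (mine, simplified rather than taken from the paper): the parser type is already third-order, so combinators over parsers, and combinators over those, climb higher still.

    {-# LANGUAGE RankNTypes #-}

    -- A parser takes the input, a success continuation, and a failure
    -- continuation.
    newtype Parser a = Parser
      { runParser :: forall r. String -> (a -> String -> r) -> r -> r }

    item :: Parser Char
    item = Parser $ \s ks kf -> case s of
      (c:cs) -> ks c cs
      []     -> kf

    -- try p; on failure, rerun q on the original input (backtracking)
    orElse :: Parser a -> Parser a -> Parser a
    orElse p q = Parser $ \s ks kf ->
      runParser p s ks (runParser q s ks kf)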

The functional image style, always described as "from Point to ..." [insert Picture, Bitmap, Texture, ...], is also inherently higher-order. Examples are Conal Elliott's Pan, Jerzy Karczmarczuk's Clastic, and Peter Henderson's images. In Clastic and Pan, the higher-order image seems like a characteristic function extended to generate a colour rather than a boolean test; in Peter Henderson's work (I think) the images are always described with vectors displacing an initial point, and so have similarities to the "coordinate-free style" of graphics pioneered by Tony DeRose and others.
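A tiny sketch of the style (one rendering in the spirit of Pan, not its actual API):

    type Point   = (Double, Double)
    type Image a = Point -> a   -- a colour image is Image Colour,
                                -- a boolean region just Image Bool

    -- a checkerboard as a characteristic-function image
    checker :: Image Bool
    checker (x, y) = even (floor x + floor y)

    -- pointwise lifting: operations on values become operations on images
    lift2 :: (a -> b -> c) -> Image a -> Image b -> Image c
    lift2 f im1 im2 p = f (im1 p) (im2 p)

    -- spatial transforms are precomposition: higher-order throughout
    translate :: (Double, Double) -> Image a -> Image a
    translate (dx, dy) im (x, y) = im (x - dx, y - dy)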

One very nice example of a "higher-order" algorithm is the notion of region (i.e. Point -> Bool) defined in Hudak's paper - that is, using functions as data structures...

http://delivery.acm.org/10.1145/250000/242477/a196-hudak.html?key1=242477&key2=4611513821&coll=GUIDE&dl=GUIDE&CFID=99830619&CFTOKEN=16057768
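A minimal sketch of the region idea (illustrative, following the Point -> Bool notion rather than the paper's exact code):

    type Point  = (Double, Double)
    type Region = Point -> Bool  -- the function is the data structure

    circle :: Double -> Region
    circle r (x, y) = x*x + y*y <= r*r

    -- set operations on regions are higher-order combinators
    outside :: Region -> Region
    outside reg p = not (reg p)

    intersect, union :: Region -> Region -> Region
    intersect r1 r2 p = r1 p && r2 p
    union     r1 r2 p = r1 p || r2 p

    -- e.g. an annulus, built purely by composing functions
    annulus :: Double -> Double -> Region
    annulus inner outer = intersect (circle outer) (outside (circle inner))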
On Mon, Aug 23, 2010 at 6:03 AM, Eugene Kirpichov wrote:
[snip]
participants (9)
- Dan Piponi
- Daniel Peebles
- Eugene Kirpichov
- Heinrich Apfelmus
- Serguey Zefirov
- Stephen Tetley
- Vinod Grover
- Vo Minh Thu
- wren ng thornton