Haskell idioms for hierarchical memory modeling ?

26 May 2008

      *** How FP in general and Haskell in particular can help in modeling memory
with neuron-like structures? ***

The particular model discussed here - latter in this text I call it Memory
Tree - is a type of neural network (NN) where the network consists of a
collection of data processing nodes arranged in a tree-shaped hierarchy.
Memory Tree (MT) is a hierarchical classifier, that infers recurring data
patterns observed in its input during some period of time T. Time period is
a set of discreet times (or moments) T = [t1, t2, ...tN].

Each node reads inputs from all its children and then builds from these
inputs a single input vector as a concatenation of children outputs. Node
then classifies this newly built input vector assigning it to one of the
category groups that node creates. Next node outputs group number that
current input belongs to. This output goes to the node parent.
All nodes at all all levels do these steps in turn, inputs flowing from
bottom level nodes up to a top node through all nodes at all intermidiate
levels.
Next I illustrate main ideas of MT with a simple example.

*** Simple MT Example

Primitive MT classifier, used in this example, will (I hope :) discern, for
example, these images:

1)
########
########
########
########

2)
########
#......#
#......#
########

3)
###..###
#.####.#
#.####.#
###..###

('#' - black dot, '.' - white dot)

In my case at each moment MT uses bottom-level nodes to read input from a
data file.
Input data is a collection of strings. Each string is a binary vector
representing data read by some sensor at time tj.

For example:

v1(t1) : 11111111111111111111111111111111
v2(t2) : 11111111100000011000000111111111
v3(t3) : 11100111101111011011110111100111
...
...
vN(tN) : ...

In this example input is 32 bit vector read at consecutive times t1 < t2 <
t3 ... tN
This vector represents a rectangular view of a camera - 8 x 4 matrix of
black (1) and white (0) dots. Shown above images are encoded in 32 bit
vectors, one image - one vector, where black dot '#' becomes 1 and white dot
'.' becomes 0.

In this case MT may have the following topology:

-- One Top Level Node: T
-- Two Middle Level Nodes: M1, M2
-- Four Bottom Level Nodes: B1,B2,B3,B4
-- Input Vectors: v1(t1), v2(t2), v3(t3), ... vN(tN)

To show how nodes in this tree are connected I will use brackets. In my
'bracket notation' parent node name is followed by its child node names in
brackets:

(T (M1 (B1, B2), M2(B3, B4)))

This describes topology where:
-- T node has two children: M1 and M2
-- M1 node has two children: B1 and B2
-- M2 node has two children: B3 and B4

(Sorry for poor illustrations, unfortunately I can't use graphics are
supported in this post, including ascii art :)

For this topology each input vector, for example v1(t2), may be split in
four equal parts of 8 bits. Chunks will then represent four regions of the
camera image.

Each 8 bit chunk is sent as input to the corresponding Bottom Level nodes:
B1,B2,B3,B4

v2(t2)_b31-b24 : 11111111 --> B1
v2(t2)_b23-b16 : 10000001 --> B2
v2(t2)_b15-b8  : 10000001 --> B3
v1(t1)_b7-b0   : 11111111 --> B4

To work with these vectors each Bottom node will have dimension of input
vector BK = 8 so it can classify 2 ** 8 = 256 input vectors.
Let's say Bottom node can classify these 256 input vectors into 2 ** 3 = 8
output groups, so it will have dimension of output vector BL = 3. In turn,
each M1 and M2 node will have input dimension equal to the sum of all its
children (B nodes) output dimensions:

M1K = B1L + B2L = 3 + 3 = 6
M2K = B3L + B4L = 3 + 3 = 6

Our M nodes can classify 2 ** 6 = 64 vectors, and we assume that they can
categorize these 64 vectors in 2 ** 2 = 4 groups, so M node output vector is
2 bit long:

ML = 2

Connecting two M nodes to a single top node T will requre T input dimension
equal to the sum of two M node output dimensions:

TK = M1L + M2L = 2 + 2= 4

T node can classify 2 ** 4 = 16 vectors, and we assume T node can group
these vectors in 2 groups, so T node output vector is 1 bit long:

TL = 1

To summarize:

*** Node Attributes

Each node in this tree has the following attributes:
1) K-bit input vector.
2) L-bit output vector, where  K >= L
3) Node Category Memory (CM) - stores unique category vectors. Each vector
is stored as a tuple (V, C) where V is a vector value and C is a category
number. Both V and C values, as well as (V,C) pairs are unique for the node.
Node can store a limited amount of (V,C) pairs. This amount (CMSize) is
equal to the number of categories that node can differentiate and also equal
to the node output vector dimension.

*** Node Data Processing algorithm:

Every Node runs the following endless loop:
1) Read inputs from children
2) Build input vector X
3) Calculate distance between X and every vector in CM and determine if X
belongs to any existing node category.
If X belongs to existing category
   Then do step 4)
   Else
       If CM is not full create new node category pair (X, C_new)
       Else return
4) Output  vector group number to a higher level node in the tree.

*** Main Questions

1) How to model node vector processing and node state? Node receives input
and produces output and can change its state by adding new category to Node
CM. Will State monad help to 'thread' node state between inputs? I don't see
how.
State monad encapsulates state transition function:

State s a = State (\s -> a, s)

that inputs some state value and outputs a pair (value, newState).
In my case node category memory (CM) could be such a changing state of node
, but what then will be node input vector in State monad?

2) How to traverse classifier tree (MT) so inputs from bottom level nodes
could flow up to a top node through all nodes at all intermediate levels? In
contrast to a 'normal' tree where  all variations of tree traversals start
at root node, classifier tree must be traversed starting from bottom level
nodes.

3) How to build tools that will allow to construct classifier trees with
different topologies? In this case one topology differers from another by a)
number of levels and b) numer of children at each level.
Ideally I would like to have a simple DSL that will allow specify tree
topology declaratively, easy to understand for non-programmer.

4) Most important: In what direction should I look as a Haskell newbie to
find answers to these and similar questions? Should I build my own monad to
traverse this tree, what other standard monads may help? What combinator
libraries should I learn? Will I need Arrows (that at the moment I know only
name and nothing more about)?

Thanks for reading all this and any help!

-- 
Dmitri O. Kondratiev
dokondr@gmail.com
http://www.geocities.com/dkondr

Dmitri O.Kondratiev

Marc Weber

tags

participants (2)