I am trying to understand how can I use Numeric.AD
(automatic differentiation) in Haskell.
I defined a simple matrix type and a scalar function taking an array and two matrices as arguments. I want to use AD to get the gradient of the scoring function with respect to both matrices, but I'm running into compilation problems. Here is the code:
-------------------------------
{-# LANGUAGE DeriveTraversable, DeriveFunctor, DeriveFoldable #-}
import Numeric.AD.Mode.Reverse as R
import Data.Traversable as T
import Data.Foldable as F
--- Non-linear function on "vectors"
logistic x = 1.0 / (1.0 + exp(-x) )
phi v = map logistic v
phi' (x:xs) = x : (phi xs)
--- dot product
dot u v = foldr (+) 0 $ zipWith (*) u v
--- simple matrix type
data Matrix a = M [[a]] deriving (Eq,Show,Functor,F.Foldable,T.Traversable)
--- action of a matrix on a vector
mv _ [] = []
mv (M []) _ = []
mv ( M m ) v = ( dot (head m) v ) : (mv (M (tail m)) v )
--- two matrices
mbW1 = M $ [[1,0,0],[-1,5,1],[1,2,-3]]
mbW2 = M $ [[0,0,0],[1,3,-1],[-2,4,6]]
--- two different scoring functions
sc1 v m = foldr (+) 0 $ (phi' . (mv m) ) v
sc2 :: Floating a => [a] -> [Matrix a] -> a
sc2 v [m1, m2] = foldr (+) 0 $ (phi' . (mv m2) . phi' . (mv m1) ) v
strToInt = read :: String -> Double
strLToIntL = map strToInt
--- testing
main = do
putStrLn $ "mbW1:" ++ (show mbW1)
putStrLn $ "mbW2:" ++ (show mbW2)
rawInput <- readFile "/dev/stdin"
let xin= strLToIntL $ lines rawInput
putStrLn "sc xin mbW1"
print $ sc1 xin mbW1 --- ok. =
putStrLn "grad (sc1 xin) mbW1"
print $ grad ( sc1 xin) mbW1 -- yields an error: expects xin [Reverse s Double] instead of [Double]
putStrLn "grad (sc1 [3,5,7]) mbW1"
print $ grad ( sc1 [3,5,7]) mbW1 --- ok. =
putStrLn "sc2 xin [mbW1,mbW2]"
print $ sc2 xin [mbW1, mbW2]
putStrLn "grad (sc2 [3,5,7) [mbW1,mbW2]"
print $ grad ( sc2 [3,5,7]) [mbW1, mbW2] --- Error: see text
--------------------------------
The last line (grad on sc2) gives the following error:
---------------------------------
Couldn't match type ‘Reverse s (Matrix Double)’
with ‘Matrix (Reverse s (Matrix Double))’
Expected type: [Reverse s (Matrix Double)]
-> Reverse s (Matrix Double)
Actual type: [Matrix (Reverse s (Matrix Double))]
-> Reverse s (Matrix Double)
In the first argument of ‘grad’, namely ‘(sc2 [3, 5, 7])’
In the second argument of ‘($)’, namely
‘grad (sc2 [3, 5, 7]) [mbW1, mbW2]’
---------------------------------
I don't understand where the "Matrix of Matrix" in the actual type seen comes from. I'm feeding the grad
with a curried version of sc2, making it a function on a list of Matrix.
Commenting out the two offending lines runs without problem, i.e., the first gradient works and is correctly calculated (I'm feeding [1,2,3] as input to the program):
-------------------Both errors are an issue. I want to take the gradient of any such
mbW1:M [[1.0,0.0,0.0],[-1.0,5.0,1.0],[1.0,2.0,-3.0]] mbW2:M [[0.0,0.0,0.0],[1.0,3.0,-1.0],[-2.0,4.0,6.0]] sc1 xin mbW1 1 2 3 2.0179800657874893 grad (sc1 [3,5,7]) mbW1 M [[3.0,5.0,7.0],[7.630996942126885e-13,1.2718328236878141e-12,1.7805659531629398e-12],[1.0057130122694228e-3,1.6761883537823711e-3,2.3466636952953197e-3]] sc2 xin [mbW1,mbW2] 1.8733609463863194
-------------------sc2
scoring function, depending on an array of matrices, evaluated at any given "point" xin. Clearly, I'm not yet understanding the AD library well enough. Any help would be appreciated.