Re: [Haskell-cafe] Automatic differentiation (AD) with respect to list of matrices in Haskell

Awesome! Lots of new info! And more importantly, I got it to work.
I still need to work on it a bit more, but things seems to be getting clearer.
On Sun, May 8, 2016 at 2:25 PM, Moritz Kiefer < moritz.kiefer@purelyfunctional.org> wrote:
Miguel A. Santos
writes: Thanks a lot for your quick answer.
First, I can't get your modifications through: What are Scalar & Mode? ghc says type/class not in scope.
Oh sorry you need to import Numeric.AD for that. Mode is a typeclass with an associated type called Scalar. You can just think of it as some magic that allows you to use constants in your function.
Yes. I figured out as well. I was only importing its Mode.Reverse as R.
Some more questions to help me understand your suggestions below.
On Sun, May 8, 2016 at 1:14 PM, Moritz Kiefer < moritz.kiefer@purelyfunctional.org> wrote:
There are two orthogonal errors going on here:
1. You need to use auto to embed constants. See the comments at the
code
below on how to use it.
Ah, that solves indeed the first problem. Thanks!
2. You need to wrap [Matrix a] in a newtype. grad requires you to provide a function f a -> a, however sc2 _ has the type [Matrix a] -> a so if you replace [] by f you end up with f (Matrix a) -> a which results in the error you’re seeing.
I've problems following you here. For example this is legit and works grad (\[x,y] -> x^2y^3) [5,7]
That is, grad here is taking a function f [a,a] -> a. That seems to work as well with sc1 in my example: I do get the rigth gradient.
That function has type [a] -> a which can be rewritten as [] a -> a and if you unify [] and f you have f a -> a which is exactly what you need.
However in sc2 you have a function of type [Matrix a] -> a which can be rewritten as [] (Matrix a) -> a and if you unify f and [] you end up with f (Matrix a) -> a. However Matrix a and a are not the same type so you get an error.
I got confused by the expression "[the] function f a -> a": I though the word 'function' was referring to the symbol 'f'.
Yet, I still can't follow things completely...uhm, issuing :type grad in ghci I get grad :: (Num a, Traversable f) => (forall s. Data.Reflection.Reifies s Numeric.AD.Internal.Reverse.Tape => f (Numeric.AD.Internal.Reverse.Reverse s a) -> Numeric.AD.Internal.Reverse.Reverse s a) -> f a -> f a
which I'd say could be simplified to grad :: Num a => ( f Reverse s a -> Reverse s a) -> f a -> f a
Now, It's not only that I don't see your "grad expects a 'f a -> a' ", but I don't even see how something like grad (\[x,y]-> x*y) [2,3] fits...or do I? The notation 'f' seems was misleading me. I'd say what's important is its type, name Traversable f. [] seems to fit there, isn't? So the first argument of grad more or less would be what you were referring to ( f a -> a) , the second it the point to calculate the gradient at (of course a [a], or, [] a, hence, f a) and it does yield a "vector" (on paper). Interestingly, its implementation means that grad's output is the same type the input of the function we want to take its gradient of.
I think it starts making sense to me now.
Miguel A. Santos
writes: mv _ [] = [] mv (M []) _ = [] mv ( M m ) v = ( dot (head m) v ) : (mv (M (tail m)) v ) --- two matrices
You need explicit type annotations to make this polymorphic here because of the monomorphism restriction.
Just for the sake of learning how to use that jargon in this context: which/what is that monomorphism restriction are you referring to? Also, when you say my mbW1 is polymorphic, do you mean, given my definition of it, ghc cannot adscribe it one _unique_ type signature, but that will vary depending on the context?
I’m going to refer you to the haskell report [1] for information on the monomorphism restriction. By polymorphic I’m referring to the choice of a which is now left open.
Thanks!!
mbW1 :: Num a => Matrix a
mbW1 = M $ [[1,0,0],[-1,5,1],[1,2,-3]] mbW2 = M $ [[0,0,0],[1,3,-1],[-2,4,6]] --- two different scoring functions
-- Provide a type signature and map auto over the constants sc1 :: (Floating b, Mode b) => [Scalar b] -> Matrix b -> b sc1 v m = foldr (+) 0 $ (phi' . (mv m) ) (map auto v)
-- Provide a type signature, use auto and use the newtype newtype MatrixList a = MatrixList [Matrix a] deriving (Functor,Foldable,Traversable)
sc2 :: (Floating a,Mode a) => [Scalar a] -> MatrixList a -> a sc2 v [m1, m2] = foldr (+) 0 $ (phi' . (mv m2) . phi' . (mv m1) ) (map auto v)
strToInt = read :: String -> Double strLToIntL = map strToInt--- testing main = do putStrLn $ "mbW1:" ++ (show mbW1) putStrLn $ "mbW2:" ++ (show mbW2) rawInput <- readFile "/dev/stdin" let xin= strLToIntL $ lines rawInput putStrLn "sc xin mbW1"
-- That needs an explicit type annotation because mbW1 is polymorphic print $ sc1 xin (mbW1 :: Matrix Double) --- ok. =
uhm...but it doesn't seem to give any problem, especially not after using auto...I've still problems getting an intuition on when ghc does need "my help" and when it can walk on it own.
Hm maybe you don’t need it after all. I thought I got an error locally. Sorry I don’t have any good tip on how to get an intuition. I add them when I’m unsure about the type of something myself to help me to figure it out and when I get an error.
putStrLn "grad (sc1 xin) mbW1" print $ grad ( sc1 xin) mbW1 -- yields an error: expects
xin
[Reverse s Double] instead of [Double] putStrLn "grad (sc1 [3,5,7]) mbW1" print $ grad ( sc1 [3,5,7]) mbW1 --- ok. = putStrLn "sc2 xin [mbW1,mbW2]" print $ sc2 xin [mbW1, mbW2] putStrLn "grad (sc2 [3,5,7) [mbW1,mbW2]"
-- Use the newtype defined above print $ grad ( sc2 [3,5,7]) (MatrixList [mbW1, mbW2]) --- Error: see text
Also as a general recommendation, write your type signatures explicitely at least for top level definitions.
What do you mean by "top level" definitions? --again, just a side comment to fine-tune my jargon registers.
Top level definitions are the definitions in your module that are not in a let or where. So basically everything that’s accessible from other definitions outside of the current one.
I still have to learn my way through all types. Sometimes this rule makes my life a bit too hard, specially when I just want to quickly test some ideas, like here: logistic expects a Floating a but if I don't make things explicit, ghc was finding its way around my use of Num a and this Floating requirement of the exp function.
I'll keep this rule in mind.
Thanks a lot!!
Regards, MA
PD: Forgot to reply to the list before...
Cheers Moritz
[1] https://www.haskell.org/onlinereport/decls.html#sect4.5.5

Sorry, while that all did work before, it looks like I screwed up again. Tried playing around varying things trying to understand your fix better, but I seem not to be able to figured out how to fix it now. This type issues feel bizarre! The code below doesn't compile. The error is now on a line that before gave absolutely no problem before, name the evaluation of sc1 xin mbW1. The error is now, basically, "expected [Scalar a] ; Actual [Double a]" and explicitly stating types doesn't help either. That is, explicitly declaring sc1 type as "sc1 :: (Floating a, Mode a) => [Scalar a] -> Matrix a -> a" still gives that error. Yet, there is no problem in calculating its gradient, namely "grad (sc1 xin) mbW1" works ok !? I'm at lost here. What's wrong/missing now? Evaluation of sc0 or sc2 on the input list xin works, but no for sc1!? Somehow, from the detailed error message (see below) it looks like ghc can't really figure out the type of 'a' in '[Scalar a]'. Some lack of injectivity issues or so? And I can only imaging that the type Double it sees may be induced by the use of the exponential implicit in the definition of `phi'` which is used in that of sc1, but I have no clue how to tweak `phi'`. As I said before, I still don't grasp Haskell's type system. Usually it seems easier to not give explicitly a type and let ghc guess it. Unfortunately this doesn't work in the present case. I apologize in advance if I happen to be missing something elementary. I just don't see it. Regards, MA PD: Without the offending line evaluating sc1, the output of the program is (again the 1,2,3 is my arbitrary input to the program): -------- mbW1:M [[1,0,0],[-1,5,1],[1,2,-3]] mbW2:M [[0,0,0],[1,3,-1],[-2,4,6]] --- sc0: --- sc0 xin mbW1 1 2 3 9.0 grad (sc0 [3,5,7]) mbW1 M [[3,5,7],[3,5,7],[3,5,7]] --- sc1: --- grad (sc1 xin) mbW1 M [[1.0,2.0,3.0],[6.1441368513331755e-6,1.2288273702666351e-5,1.8432410553999525e-5],[1.7662706213291118e-2,3.5325412426582235e-2,5.298811863987335e-2]] grad (sc1 [3,5,7]) mbW1 M [[3.0,5.0,7.0],[7.630996942126885e-13,1.2718328236878141e-12,1.7805659531629398e-12],[1.0057130122694228e-3,1.6761883537823711e-3,2.3466636952953197e-3]] --- sc2: --- sc2 xin [mbW1,mbW2] 1.8733609463863194 --- sc3: --- grad (sc3 xin) [mbW1,mbW2] MatrixList [M [[-0.1752205960584877,-0.3504411921169754,-0.525661788175463],[2.7052661672554392e-6,5.4105323345108785e-6,8.115798501766318e-6],[9.919472739879849e-3,1.9838945479759697e-2,2.9758418219639544e-2]],M [[1.0,0.9999938558253978,1.798620996209156e-2],[1.79718498433056e-2,1.7971739421122238e-2,3.232454646888768e-4],[9.659622295089665e-2,9.659562944683693e-2,1.7373999475398345e-3]]] -------- and the code producing that output is: -------- {-# LANGUAGE DeriveTraversable, DeriveFunctor, DeriveFoldable #-} import Numeric.AD import Numeric.AD.Mode.Reverse as R import Data.Traversable as T import Data.Foldable as F --- Non-linear function on "vectors" logistic x = 1.0 / (1.0 + exp(-x) ) phi v = map logistic v phi' (x:xs) = x : (phi xs) --- dot product dot u v = foldr (+) 0 $ zipWith (*) u v --- simple matrix type data Matrix a = M [[a]] deriving (Eq,Show,Functor,F.Foldable,T.Traversable) --- action of a matrix on a vector mv _ [] = [] mv (M []) _ = [] mv ( M m ) v = ( dot (head m) v ) : (mv (M (tail m)) v ) --- two matrices mbW1,mbW2 :: Num a => Matrix a mbW1 = M $ [[1,0,0],[-1,5,1],[1,2,-3]] mbW2 = M $ [[0,0,0],[1,3,-1],[-2,4,6]] --- different scoring functions sc0 v m = foldr (+) 0 $ mv m v --sc1 :: (Floating a, Mode a) => [Scalar a] -> Matrix a -> a sc1 v m = foldr (+) 0 $ (phi' . (mv m) ) (map auto v) --sc2 :: Floating a => [a] -> [Matrix a] -> a sc2 v [m1, m2] = foldr (+) 0 $ (phi' . (mv m2) . phi' . (mv m1) ) v -- Provide a type signature, use auto and use the newtype newtype MatrixList a = MatrixList [Matrix a] deriving (Show, Functor,Foldable,Traversable) sc3 :: (Floating a, Mode a) => [Scalar a] -> MatrixList a -> a sc3 v (MatrixList [m1, m2]) = foldr (+) 0 $ (phi' . (mv m2) . phi' . (mv m1) ) (map auto v) strToInt = read :: String -> Double strLToIntL = map strToInt --- testing main = do putStrLn $ "mbW1:" ++ (show mbW1) putStrLn $ "mbW2:" ++ (show mbW2) rawInput <- readFile "/dev/stdin" let xin= strLToIntL $ lines rawInput --- putStrLn "---\nsc0:\n---" --- --putStrLn "sc0 [3,5,7] mbW1" --print $ sc0 [3,5,7] mbW1 putStrLn "sc0 xin mbW1" print $ sc0 xin mbW1 putStrLn "grad (sc0 [3,5,7]) mbW1" print $ grad ( sc0 [3,5,7]) mbW1 --print $ grad ( sc0 xin) mbW1 --- putStrLn "---\nsc1:\n---" --- --putStrLn "sc1 xin mbW1" --print $ sc1 xin mbW1 --- ok. = NOT OK anymore using map auto!? : Expected [Scalar a0] ; Actual [Double] putStrLn "grad (sc1 xin) mbW1" print $ grad ( sc1 xin) mbW1 -- ok now just with auto !? Was: yields an error: expects xin [Reverse s Double] instead of [Double] putStrLn "grad (sc1 [3,5,7]) mbW1" print $ grad ( sc1 [3,5,7]) mbW1 --- ok. = --- putStrLn "---\nsc2:\n---" --- putStrLn "sc2 xin [mbW1,mbW2]" print $ sc2 xin [mbW1, mbW2] --- putStrLn "---\nsc3:\n---" --- putStrLn "grad (sc3 xin) [mbW1,mbW2]" print $ grad ( sc3 xin) (MatrixList [mbW1, mbW2]) -------- When trying to evaluate 'sc1 xin mbW1', the he precise error message is: -------- Couldn't match type ‘Scalar r0’ with ‘Double’ The type variable ‘r0’ is ambiguous Expected type: [Scalar r0] Actual type: [Double] In the first argument of ‘sc1’, namely ‘xin’ In the second argument of ‘($)’, namely ‘sc1 xin mbW1’ -- Public key ID: E8FE60D7 Public key server: see, e.g., hkp://keys.gnupg.net
participants (1)
-
Miguel A. Santos