New subject: Automatic differentiation (AD) with respect to list of matrices in Haskell

8 May 2016


      Awesome! Lots of new info! And more importantly, I got it to work.
...
I still need to work on it a bit more, but things seems to be getting
clearer.
On Sun, May 8, 2016 at 2:25 PM, Moritz Kiefer <
moritz.kiefer@purelyfunctional.org> wrote:
...
Miguel A. Santos  writes:
...
Thanks a lot for your quick answer.
First, I can't get your modifications through: What are Scalar & Mode?
ghc
says type/class not in scope.
Oh sorry you need to import Numeric.AD for that. Mode is a typeclass
with an associated type called Scalar. You can just think of it as some
magic that allows you to use constants in your function.
Yes. I figured out as well. I was only importing its Mode.Reverse as R.
...
...
Some more questions to help me understand your suggestions below.
On Sun, May 8, 2016 at 1:14 PM, Moritz Kiefer <
moritz.kiefer@purelyfunctional.org> wrote:
...
There are two orthogonal errors going on here:
1. You need to use auto to embed constants. See the comments at the
code
...
below on how to use it.
Ah, that solves indeed the first problem. Thanks!
...
2. You need to wrap [Matrix a] in a newtype. grad requires you to
provide a function f a -> a, however sc2 _ has the type [Matrix a] -> a
so if you replace [] by f you end up with f (Matrix a) -> a which
results in the error you’re seeing.
I've problems following you here. For example this is legit and works
grad (\[x,y] -> x^2y^3) [5,7]
That is, grad here is taking a function f [a,a] -> a. That seems to
work as
well with sc1 in my example: I do get the rigth gradient.
That function has type [a] -> a which can be rewritten as [] a -> a and
if you unify [] and f you have f a
-> a which is exactly what you need.
However in sc2 you have a function of type [Matrix a] -> a which can be
rewritten as [] (Matrix a) -> a and if you unify f and [] you end up
with f (Matrix a) -> a. However Matrix a and a are not the same type so
you get an error.
I got confused by the expression "[the] function f a -> a": I though the
word 'function' was referring to the symbol 'f'.
Yet, I still can't follow things completely...uhm, issuing :type grad in
ghci I get
grad
  :: (Num a, Traversable f) =>
     (forall s.
      Data.Reflection.Reifies s Numeric.AD.Internal.Reverse.Tape =>
      f (Numeric.AD.Internal.Reverse.Reverse s a)
      -> Numeric.AD.Internal.Reverse.Reverse s a)
     -> f a -> f a
which I'd say could be simplified to
grad :: Num a => ( f Reverse s a -> Reverse s a) -> f a -> f a
Now, It's not only that I don't see your "grad expects a 'f a -> a' ", but
I don't even see how something like grad (\[x,y]-> x*y) [2,3] fits...or do
I? The notation 'f' seems was misleading me. I'd say what's important is
its type, name Traversable f. [] seems to fit there, isn't? So the first
argument of grad more or less would be what you were referring to ( f a ->
a) , the second it the point to calculate the gradient at (of course a [a],
or, [] a, hence, f a) and it does yield a "vector" (on paper).
Interestingly, its implementation means that grad's output is the same type
the input of the function we want to take its gradient of.
I think it starts making sense to me now.
...
...
...
Miguel A. Santos  writes:
...
mv _ [] = []
mv (M []) _ = []
mv ( M m ) v = ( dot (head m)  v ) :  (mv (M (tail m)) v )
--- two matrices
You need explicit type annotations to make this polymorphic here
because
of the monomorphism
restriction.
Just for the sake of learning how to use that jargon in this context:
which/what is that monomorphism restriction are you referring to? Also,
when you say my mbW1 is polymorphic, do you mean, given my definition of
it, ghc cannot adscribe it one _unique_ type signature, but that will
vary
depending on the context?
I’m going to refer you to the haskell report [1] for information on the
monomorphism restriction. By polymorphic I’m referring to the choice of
a which is now left open.
Thanks!!
...
...
...
mbW1 :: Num a => Matrix a
...
mbW1 = M $ [[1,0,0],[-1,5,1],[1,2,-3]]
mbW2 = M $ [[0,0,0],[1,3,-1],[-2,4,6]]
--- two different scoring functions
-- Provide a type signature and map auto over the constants
sc1 :: (Floating b, Mode b) => [Scalar b] -> Matrix b -> b
sc1 v m = foldr (+) 0 $ (phi' . (mv m) ) (map auto v)
...
-- Provide a type signature, use auto and use the newtype
newtype MatrixList a = MatrixList [Matrix a] deriving
(Functor,Foldable,Traversable)
sc2 :: (Floating a,Mode a) => [Scalar a] -> MatrixList a -> a
sc2 v [m1, m2] = foldr (+) 0 $ (phi' . (mv m2) . phi' . (mv m1) ) (map
auto v)
...
strToInt = read :: String -> Double
strLToIntL = map strToInt--- testing
main = do
        putStrLn $ "mbW1:" ++ (show mbW1)
        putStrLn $ "mbW2:" ++ (show mbW2)
        rawInput <-  readFile "/dev/stdin"
        let xin= strLToIntL $ lines rawInput
        putStrLn "sc xin mbW1"
-- That needs an explicit type annotation because mbW1 is polymorphic
         print $ sc1 xin (mbW1 :: Matrix Double) --- ok. =
uhm...but it doesn't seem to give any problem, especially not after
using
auto...I've still problems getting an intuition on when ghc does need
"my
help" and when it can walk on it own.
Hm maybe you don’t need it after all. I thought I got an error
locally. Sorry I don’t have any good tip on how to get an
intuition. I add them when I’m unsure about the type of something myself
to help me to figure it out and when I get an error.
...
...
...
putStrLn "grad (sc1 xin) mbW1"
        print $ grad ( sc1 xin) mbW1   -- yields an error: expects
xin
...
[Reverse s Double] instead of [Double]
        putStrLn "grad (sc1 [3,5,7]) mbW1"
        print $ grad ( sc1 [3,5,7]) mbW1   --- ok. =
        putStrLn "sc2 xin [mbW1,mbW2]"
        print $ sc2 xin [mbW1, mbW2]
        putStrLn "grad (sc2 [3,5,7) [mbW1,mbW2]"
-- Use the newtype defined above
         print $ grad ( sc2 [3,5,7]) (MatrixList [mbW1, mbW2])  ---
Error:
see text
Also as a general recommendation, write your type signatures
explicitely
at least for top level definitions.
What do you mean by "top level" definitions? --again, just a side
comment
to fine-tune my jargon registers.
Top level definitions are the definitions in your module that are not in
a let or where. So basically everything that’s accessible from other
definitions outside of the current one.
I still have to learn my way through all types. Sometimes this rule makes
my life a bit too hard, specially
when I just want to quickly test some ideas, like here: logistic expects a
Floating a but if I don't  make things
explicit, ghc was finding its way around my use of Num a and this Floating
requirement of the exp function.
I'll keep this rule in mind.
Thanks a lot!!
Regards,
MA
PD: Forgot to reply to the list before...
...
Cheers Moritz
...
[1] https://www.haskell.org/onlinereport/decls.html#sect4.5.5

Re: [Haskell-cafe] Automatic differentiation (AD) with respect to list of matrices in Haskell

Miguel A. Santos

Miguel A. Santos

tags

participants (1)