Don't be led astray by leaky analogies. A Functor is not a container. Some Functor instances are like containers. When this analogy stops working, discard it and think about the problem directly. Like any other typeclass, Functor is just a collection of methods and laws[1]; its instances are just types which have law abiding implementations of the methods. Knowing the type of fmap and its laws, we know what it means for ((->) r) to be an instance: it means that we can define
fmap :: (a -> b) -> f a -> f b
for f = ((->) r) and prove that it satisfies the laws.
Substituting for f, we have:
fmap :: (a -> b) -> (r -> a) -> (r -> b)
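Renaming the type variables (alpha equivalence) and dropping the final pair of parentheses, which is redundant because -> associates to the right, we get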
fmap :: (b -> c) -> (a -> b) -> a -> c
and immediately we find a candidate implementation in function composition, (.) :: (b -> c) -> (a -> b) -> a -> c:
fmap f g = f . g
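To let the compiler check this candidate, here is a minimal sketch. Because base already provides the instance for ((->) r), the sketch wraps functions in a hypothetical newtype, Fn, with an accessor runFn; both names are made up for illustration. Only the body f . g comes from the discussion above.

```haskell
module Main where

-- Hypothetical wrapper around functions from r. It exists only so that we
-- can write our own instance without clashing with the one in base.
newtype Fn r a = Fn { runFn :: r -> a }

-- fmap for functions is composition: run g first, then apply f to its result.
instance Functor (Fn r) where
  fmap f (Fn g) = Fn (f . g)

main :: IO ()
main = do
  -- (* 2) runs first, then (+ 1), so this computes 5 * 2 + 1.
  print (runFn (fmap (+ 1) (Fn (* 2))) (5 :: Int))
  -- The instance that ships with base gives the same answer.
  print (fmap (+ 1) (* 2) (5 :: Int))
```

If this type checks, the candidate at least has the right shape; whether it is law abiding still has to be shown separately.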
Now we must prove that this implementation is law abiding. I'll show a proof for the first law, fmap id = id, with assistance from a few definitions:
1) f . g = \x -> f (g x)
2) id x = x
3) \x -> f x = f
fmap id f
= id . f {- definition of fmap -}
= \x -> id (f x) {- by (1) -}
= \x -> f x {- by (2) -}
= f {- by (3) -}
= id f {- by (2) -}
Thus we have fmap id f = id f and (by eta reduction) fmap id = id. Try to prove the second law for yourself! Once you've proven it, you know that ((->) r) is an instance of Functor where fmap = (.)[2]. If you do the same for Applicative and Monad then you will know exactly how ((->) r) is a Functor, an Applicative, and a Monad.
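Before attempting the proof, it can help to spot-check both laws on a few inputs. The sketch below is a sanity check, not a proof: functions have no Eq instance, so it only compares the two sides pointwise on a small sample. The names agreeOn, identityLaw, compositionLaw and samples are hypothetical helpers, and the choice of Int is arbitrary.

```haskell
module Main where

-- Functions cannot be compared directly, so compare results on sample inputs.
agreeOn :: Eq b => [a] -> (a -> b) -> (a -> b) -> Bool
agreeOn xs f g = all (\x -> f x == g x) xs

-- A small, arbitrary set of test inputs.
samples :: [Int]
samples = [-3 .. 3]

-- First law: fmap id = id, specialised to functions from Int to Int.
identityLaw :: (Int -> Int) -> Bool
identityLaw f = agreeOn samples (fmap id f) (id f)

-- Second law: fmap (g . h) = fmap g . fmap h.
compositionLaw :: (Int -> Int) -> (Int -> Int) -> (Int -> Int) -> Bool
compositionLaw g h f = agreeOn samples (fmap (g . h) f) ((fmap g . fmap h) f)

main :: IO ()
main = do
  print (identityLaw (* 2))
  print (compositionLaw (+ 1) (* 3) (subtract 4))
```

A failing check would point to a bug in the instance; a passing one only means no counterexample turned up in the sample.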
Then you can experiment by applying the typeclass methods to functions to see what the practical value of these definitions is. For example, the Applicative instance lets you plumb a shared argument to a number of functions. Here's a contrived example:
> (++) <$> map toUpper <*> reverse $ "Hello"
"HELLOolleH"
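To see where the shared argument goes, the expression can be unfolded by hand. This sketch relies on the definitions for functions, fmap = (.) and f <*> g = \x -> f x (g x); the names shout and shoutExpanded are made up for illustration.

```haskell
module Main where

import Data.Char (toUpper)

-- The applicative form from the example above.
shout :: String -> String
shout = (++) <$> map toUpper <*> reverse

-- The same function with the argument plumbing written out by hand:
-- the one input s is handed to both map toUpper and reverse.
shoutExpanded :: String -> String
shoutExpanded s = map toUpper s ++ reverse s

main :: IO ()
main = do
  putStrLn (shout "Hello")
  putStrLn (shoutExpanded "Hello")
```

Both lines print the same string, which is the sense in which the instance passes one argument to several functions and combines their results.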
-R
[1] The laws are not really a part of the typeclass proper (i.e., the compiler doesn't know anything about them), but developers need to ensure that their instances are law abiding so that they behave as expected.
[2] Actually, it turns out that one only needs to prove the first law for fmap because the second law is implied by the first (a consequence of parametricity), but that's a topic for another day!