
AC> I'm still puzzled as to what makes the other categories so magical AC> that they cannot be considered sets.
They are just too big. The set of all sets can't exist, you know.
That's news. How come the set of all sets doesn't exist?
Well, you mentioned that you have some knowledge of group theory, so let me give you three examples of adjoint functors (don't worry, it won't hurt) - two from the group theory and one related to Haskell.
1) Consider the category Set of all sets - it's objects are sets and it's morphisms (=arrows) are functions between sets. Also, there is a category Grp of groups - with groups as objects and homomorphisms as morphisms. Then, the trivial mapping G: Grp -> Set, which maps each group to it's base set (and each homomorphisms to itself - as a function from one base set to another) is a functor (it is called "forgetting functor", I guess, since it "forgets" the group structure).
There is a very natural functor F: Set -> Grp. Namely, F maps each set X to the free group, with generators corresponding to elements of X. By definition, each function f:X->H, where H is a group, can be extended to the homomorphism f*:F(X)->H. That means that there is a natural bijection between functions X->H (more precisely, from X to G(H), since these functions aren't related to the group structure on H) and homomorphisms F(X)->H:
Set(X,G(H)) ~ Grp(F(X),H)
Here by Grp(H1,H2) I denote the set of morphisms (=homomorphisms) from H1 to H2; the same notation is used for Set.
That means exactly that F is LEFT ADJOINT to G (or, equivalently, G is right adjoint to F).
Their composition GF is a monad on the category Set. GF(X) is the set of all elements of the free group, generated by X. For a in X, return(x) is an element of the free group, corresponding to x. And if we have a map X->GF(Y) and an element of GF(X), we can remember that GF(Y) carries some group structure, so our map is in fact a map from X to some group, which extends to a homomorphism from F(X) to F(Y), which is a map from GF(X) to GF(Y), which maps our chosen element of GF(X) to an element of GF(Y) - that gives us (>>=).
That almost made sense most of the way through... but... ooouch... x_x
2) There is a category Ab of abelian groups (and homomorphisms). Of course, there is a trivial functor G: Ab -> Grp, which maps each abelian group to itself. There is also a functor F: Grp -> Ab; it maps each group H to it's "abelianization": F(H) = H/[H,H]. F is also left adjoint to G: there is a bijection between homomorphisms H -> A, where A is abelian, and homomorphisms H/[H,H] -> A. Again, there composition GF: Grp -> Grp is a monad (on Grp this time).
There are many other constructions that happen to be adjoint functors; and that is a kind of generalization that makes mathematics so useful and exciting. These constructions include all kinds of "free" structures - free modules, algebras etc.; discrete and codiscrete topological spaces, and many others.
3) Let X be a set. I'll denote here the set of all functions from X to Y by X->Y, and the product XxY (small x stands here for the "times" sign) by (X,Y), sticking to the Haskell notation. Then the functor (X ->): Set -> Set which maps each set Y to the set X->Y, is right adjoint to the functor (,X), mapping each set Y to (Y,X): there is a bijection between functions from Z to (X->Y) and functions from (Z,X) to Y. This bijection is called "currying". The composition of this functors is - as always - monad: it maps Y to X->(Y,X). And this is a kind of monad we are familiar with: it's the State monad. Summarizing, we have the following: State monad exists because of currying.
Again... that almost makes some sort of sense... but this is REALLY making my head hurt!