Indeed! Repa and Accelerate are really designed as showcases for pushing the cutting edge of fusion-based optimization. Sorting algorithms, fast matrix multiplication, and other locality-sensitive algorithms are precisely their weakest spot, to the point where you really shouldn't try to use them for such things!

I second the vector-algorithms recommendation. Be prepared for some long compile times: those sorting algorithms do a lot of inlining to give you good performance.
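
For concreteness, here is a rough sketch of the round trip Chaddaï describes in the quoted message: pull the unboxed vector out of the Repa array, sort a copy with vector-algorithms via modify, and wrap the result back up. It assumes an unboxed (U) one-dimensional array and the introsort from Data.Vector.Algorithms.Intro; the name sortRepa is mine, not something from either library, and I haven't benchmarked it.

```haskell
import qualified Data.Array.Repa as R
import Data.Array.Repa (Array, DIM1, U)
import qualified Data.Vector.Unboxed as VU
import qualified Data.Vector.Algorithms.Intro as Intro

-- Hypothetical helper (not part of repa or vector-algorithms):
-- sort a manifest, unboxed, one-dimensional Repa array.
sortRepa :: (VU.Unbox a, Ord a) => Array U DIM1 a -> Array U DIM1 a
sortRepa arr =
  -- toUnboxed exposes the underlying unboxed vector without copying;
  -- modify runs the in-place MVector sort on a fresh copy;
  -- fromUnboxed rewraps the sorted vector with the original shape.
  R.fromUnboxed (R.extent arr) (VU.modify Intro.sort (R.toUnboxed arr))

main :: IO ()
main = do
  let arr = R.fromListUnboxed (R.ix1 5) [3, 1, 4, 1, 5] :: Array U DIM1 Int
  print (R.toList (sortRepa arr))  -- prints [1,1,3,4,5]
```

If the array is delayed rather than manifest, it would need to be computed to an unboxed array first (e.g. with computeUnboxedS) before this applies.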

On Thursday, January 23, 2014, Chaddaï Fouché <chaddai.fouche@gmail.com> wrote:
Convert to a vector (that should be efficient, since I understand it is the underlying representation anyway), use the sort algorithms available in the vector-algorithms package, then convert back. You'll need to use modify to apply the sort (since it works on MVector).

I don't think there's a simpler or better solution right now.
-- 
Jedaï