If you want to use bit fiddly mutable vector stuff to make the classic Sieve of Eratosthenes fast and compact, I think it makes a lot of sense to use the bitvec package instead of doing the bit fiddling by hand.
On the other hand, I think the O'Neill prime sieve makes an excellent example, much prettier than a mutable-vector-based sieve. Her paper is at
https://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf and her actual implementation (much better than the one described directly in the paper) is at
https://hackage.haskell.org/package/NumberSieves/docs/Math-Sieve-ONeill.html . It can be optimized in various ways, most obviously by specializing from Integral to Word, but probably also by switching from a tree-based heap to one based on a mutable vector. I'm not sure how a really carefully optimized version would compare to Eratosthenes.