
Perhaps I am just missing something, but a major piece of efficient array functionality seems to be missing. Namely the ability to efficiently copy spans of arrays into one another and/or compare spans of memory. (basically memcpy and memcmp from C). any particular reason these basic building blocks were left out? I am trying to write an efficient packed string library and having to go through intermediate lists kills the garbage collector as well as the overall speed of my program. Ideally concatination should be a simple alloc + pair of memcpys. (yeah, I know they are both of linear order, but the constant factor on a system tuned assembler memcpy vs. chugging of the STG-machine is absurdly different when cache is taken into account) So, my question is also, as a workaround for the moment, how might I implement these rountines (without resorting to rewriting all the array functionality with Foreign)? Looking at the implementations of arrays, they all seem to be built on ByteArray#'s, if I could get a (Ptr a) out of them, that would be ideal, but I imagine that might be problematic without some mechanism to temporarily pin the address in place so the GC doesn't move it. I am thinking a family of routines. (with psuedosignatures) copySpan: range -> MArray -> whereto -> MArray -> m () extractSpan : range -> IArray -> IArray extractSpanM : range -> MArray -> m IArray saveSpan : range -> IArray -> whereto -> MArray -> m () hopefully the meaning of all is clear... perhaps better names can be come up with though. A similar set of routines which perform 'memcmp' would also be useful, but less vital at the moment. Sorry if I am just blind and missing these rountines... John -- John Meacham - ⑆repetae.net⑆john⑈

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 15 juil. 04, at 13:26, John Meacham wrote:
Perhaps I am just missing something, but a major piece of efficient array functionality seems to be missing. Namely the ability to efficiently copy spans of arrays into one another and/or compare spans of memory. (basically memcpy and memcmp from C).
memcpy is available in Foreign.Marshal.Utils: copyBytes :: Ptr a -> Ptr a -> Int -> IO () Copies the given number of bytes from the second area (source) into the first (destination);the copied areas may not overlap Here is the result of a quick try to implement fast copy using it and Data.Array.Storable: module FastArrrayCopy where import Data.Array.Storable import Foreign.Ptr import Foreign.Storable import Foreign.Marshal.Utils ( copyBytes ) fastArrayCopy :: (Storable e1, Ix i, Ix i1) => StorableArray i1 e1 -> i1 -> StorableArray i e -> i -> Int -> IO () fastArrayCopy src srcStart dest destStart count | destOffset + count > rangeSize (bounds dest) = error "Out of bounds" | otherwise = withStorableArray src $ \ pSrc -> withStorableArray dest $ \ pDest -> do dummy <- peek pSrc copyBytes (pDest `plusPtr` (destOffset * sizeOf dummy)) (pSrc `plusPtr` (srcOffset * sizeOf dummy)) (count * sizeOf dummy) where srcOffset = index (bounds src) srcStart destOffset = index (bounds dest) destStart main :: IO () main = do a <- newListArray (0, 100) ([0..] :: [Int]) a' <- newArray (0, 50) (42 :: Int) getElems a' >>= print copyRange a 10 a' 15 20 getElems a' >>= print It seems to work on GHC 6.2.1, though I did not made further tests than this main. Hope this helps, Jérémy. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (Darwin) iD8DBQFA9pK8JhPEcwATZDwRAhSnAJ9BELp+L/L2rFaYwFFzg/axQjEJ8wCcC+YV iN+XPdynHWROb3x27eVa5wE= =f5Vo -----END PGP SIGNATURE-----

--- J�r�my_Bobbio
memcpy is available in Foreign.Marshal.Utils:
copyBytes :: Ptr a -> Ptr a -> Int -> IO ()
Copies the given number of bytes from the second area (source) into the first (destination);the copied areas may not overlap
Here is the result of a quick try to implement fast copy using it and Data.Array.Storable:
module FastArrrayCopy where
import Data.Array.Storable import Foreign.Ptr import Foreign.Storable import Foreign.Marshal.Utils ( copyBytes )
fastArrayCopy :: (Storable e1, Ix i, Ix i1) => StorableArray i1 e1 -> i1 -> StorableArray i e -> i -> Int -> IO () fastArrayCopy src srcStart dest destStart count | destOffset + count > rangeSize (bounds dest) = error "Out of bounds" | otherwise = withStorableArray src $ \ pSrc -> withStorableArray dest $ \ pDest -> do dummy <- peek pSrc copyBytes (pDest `plusPtr` (destOffset * sizeOf dummy)) (pSrc `plusPtr` (srcOffset * sizeOf dummy)) (count * sizeOf dummy) where srcOffset = index (bounds src) srcStart destOffset = index (bounds dest) destStart
main :: IO () main = do a <- newListArray (0, 100) ([0..] :: [Int]) a' <- newArray (0, 50) (42 :: Int) getElems a' >>= print copyRange a 10 a' 15 20 getElems a' >>= print
This should be in a FAQ somewhere for those of us damaged by years of coding in C. __________________________________ Do you Yahoo!? New and Improved Yahoo! Mail - 100MB free storage! http://promotions.yahoo.com/new_mail

On Thu, Jul 15, 2004 at 04:20:38PM +0200, Jérémy Bobbio wrote:
memcpy is available in Foreign.Marshal.Utils:
copyBytes :: Ptr a -> Ptr a -> Int -> IO ()
Copies the given number of bytes from the second area (source) into the first (destination);the copied areas may not overlap
Here is the result of a quick try to implement fast copy using it and Data.Array.Storable:
Yeah, I know I can copy areas of memory allocated via the foreign library or C code around. What I am looking for is an efficient way to work with the standard Arrays as provided by Data.Array. Also, in my tests, arrays implemented via ByteArray# or Ptr a seem to be signifigantly faster than those implemented via ForeignPtr. Is this expected? John -- John Meacham

On Thu, Jul 15, 2004 at 04:26:27AM -0700, John Meacham wrote:
I am thinking a family of routines. (with psuedosignatures)
copySpan: range -> MArray -> whereto -> MArray -> m () extractSpan : range -> IArray -> IArray extractSpanM : range -> MArray -> m IArray saveSpan : range -> IArray -> whereto -> MArray -> m ()
I would like to have these, too. I would especially like to have saveSpan :: (Int, Int) -> UArray Int Word8 -> (Int, Int) -> IOUArray Int Word8 -> IO () copySpan :: (Int, Int) -> IOUArray Int Word8 -> (Int, Int) -> IOUArray Int Word8 -> IO () to use these in conjunction with hGetArray and hPutArray. In the hope of advancing the discussion, I have taken up the trivial task of implementing the above on top of the usual array functions. (I have not done any testing, though.) Maybe some kind soul with knowledge of the ghc innards can provide an implementation that reduces to a memcpy call, at least for some specializations. Greetings, Carsten -- Carsten Schultz (2:38, 33:47), FB Mathematik, FU Berlin http://carsten.codimi.de/ PGP/GPG key on the pgp.net key servers, fingerprint on my home page.

On Tue, Jul 20, 2004 at 02:56:15AM +0200, Carsten Schultz wrote:
On Thu, Jul 15, 2004 at 04:26:27AM -0700, John Meacham wrote:
I am thinking a family of routines. (with psuedosignatures)
copySpan: range -> MArray -> whereto -> MArray -> m () extractSpan : range -> IArray -> IArray extractSpanM : range -> MArray -> m IArray saveSpan : range -> IArray -> whereto -> MArray -> m ()
I would like to have these, too. I would especially like to have
saveSpan :: (Int, Int) -> UArray Int Word8 -> (Int, Int) -> IOUArray Int Word8 -> IO () copySpan :: (Int, Int) -> IOUArray Int Word8 -> (Int, Int) -> IOUArray Int Word8 -> IO ()
to use these in conjunction with hGetArray and hPutArray.
In the hope of advancing the discussion, I have taken up the trivial task of implementing the above on top of the usual array functions. (I have not done any testing, though.) Maybe some kind soul with knowledge of the ghc innards can provide an implementation that reduces to a memcpy call, at least for some specializations.
Cool, I can build more efficient versions on these now that I know the ByteArray# as Addr# trick. I am curious what the best way to go about writing specialized versions is, placing the copying functions in a class, with (slow) default methods for everything and special instances for (IO)UArrays? or relying on RULES pragmas to do the appropriate specialization? anyone have any intuition on which would be a better approach to take? The class is more straightforward, but since we are not changing behavior, just doing a pure type based optimization, perhaps RULES is the better way to go. (can RULES even be used in this way? it looks like they are already to implement unsafeFreeze and unsafeThaw) John -- John Meacham - ⑆repetae.net⑆john⑈

(I am not sure, if keeping the cc to libraries is ok, apologies in case it is not.) On Mon, Jul 19, 2004 at 06:23:21PM -0700, John Meacham wrote:
I am curious what the best way to go about writing specialized versions is, placing the copying functions in a class, with (slow) default methods for everything and special instances for (IO)UArrays? or relying on RULES pragmas to do the appropriate specialization?
I would vote for RULEs.
anyone have any intuition on which would be a better approach to take? The class is more straightforward, but since we are not changing behavior, just doing a pure type based optimization, perhaps RULES is the better way to go.
Exactly.
(can RULES even be used in this way?
I think so: http://www.haskell.org/ghc/docs/latest/html/users_guide/pragmas.html#SPECIAL... Greetings, Carsten -- Carsten Schultz (2:38, 33:47), FB Mathematik, FU Berlin http://carsten.codimi.de/ PGP/GPG key on the pgp.net key servers, fingerprint on my home page.
participants (4)
-
Carsten Schultz
-
John Meacham
-
Jérémy Bobbio
-
Shawn Garbett