how to simplify ressources access and pipes steam

Hello, I am writing a software which do some data treatement from hdf5 files. In order to access the data which are stored in arrays, I need to open and closes ressources. This is the purpose of the withHdf5PathP function. given a file f and a UhvPath I end up with this withUhvPathP :: (MonadSafe m, Location l) => l -> UhvPath -> ((Int -> IO Geometry) -> m r) -> m r withUhvPathP f (UhvPath m o d g w) gg = withHdf5PathP f m $ \m' -> withHdf5PathP f o $ \o' -> withHdf5PathP f d $ \d'-> withHdf5PathP f g $ \g' -> withHdf5PathP f w $ \w' -> gg (\j -> do mu <- get_position m' j omega <- get_position o' j delta <- get_position d' j gamma' <- get_position g' j wavelength <- getValueWithUnit w' 0 angstrom let positions = Data.Vector.Storable.fromList [mu, omega, delta, gamma'] let source = Source wavelength pure $ Geometry Uhv source positions Nothing) then I use this like this forever $ do (Chunk fp from to) <- await withFileP (openH5 fp) $ \f -> withHdf5PathP f imgs $ \dimgs -> withUhvPathP f dif $ \getDiffractometer -> withSamplePathP f samp $ \getSample -> forM_ [from..to-1] (\j -> yield =<< liftIO (DataFrameHkl <$> pure j <*> get_image' det dimgs j <*> getDiffractometer j <*> getSample j)) so once I open the resources, I use the getDiffratometer ((Int -> IO Geometry) -> m r) function give the position on the stream and it return the value at the j position. the purpose of this is to open the ressources only once at the begining and then send the values in a Pipe via the yield function of pipes package. I hope, I was clear. 1) I find the withUhvPathP sort of ugly, and I would like to know if it could be written more elegantly, because I will need to write a lot's of these function for different types. 2) It is the right way to design a solution to my problem (open ressources and send each values into a stream). thanks for you attention. Frederic

Maybe fork another thread, keep reading data and putting into a shared `MVar` / `TMVar`, then current thread keep taking from that var?
On 2020-05-20, at 14:23, PICCA Frederic-Emmanuel
wrote: Hello, I am writing a software which do some data treatement from hdf5 files.
In order to access the data which are stored in arrays, I need to open and closes ressources. This is the purpose of the withHdf5PathP function.
given a file f and a UhvPath I end up with this
withUhvPathP :: (MonadSafe m, Location l) => l -> UhvPath -> ((Int -> IO Geometry) -> m r) -> m r withUhvPathP f (UhvPath m o d g w) gg = withHdf5PathP f m $ \m' -> withHdf5PathP f o $ \o' -> withHdf5PathP f d $ \d'-> withHdf5PathP f g $ \g' -> withHdf5PathP f w $ \w' -> gg (\j -> do mu <- get_position m' j omega <- get_position o' j delta <- get_position d' j gamma' <- get_position g' j wavelength <- getValueWithUnit w' 0 angstrom let positions = Data.Vector.Storable.fromList [mu, omega, delta, gamma'] let source = Source wavelength pure $ Geometry Uhv source positions Nothing)
then I use this like this
forever $ do (Chunk fp from to) <- await withFileP (openH5 fp) $ \f -> withHdf5PathP f imgs $ \dimgs -> withUhvPathP f dif $ \getDiffractometer -> withSamplePathP f samp $ \getSample -> forM_ [from..to-1] (\j -> yield =<< liftIO (DataFrameHkl <$> pure j <*> get_image' det dimgs j <*> getDiffractometer j <*> getSample j))
so once I open the resources, I use the getDiffratometer ((Int -> IO Geometry) -> m r) function give the position on the stream and it return the value at the j position. the purpose of this is to open the ressources only once at the begining and then send the values in a Pipe via the yield function of pipes package.
I hope, I was clear.
1) I find the withUhvPathP sort of ugly, and I would like to know if it could be written more elegantly, because I will need to write a lot's of these function for different types.
2) It is the right way to design a solution to my problem (open ressources and send each values into a stream).
thanks for you attention.
Frederic _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

Do you mean it's ugly because of the nested withHdf4PathP? If so, have you
considered ResourceT?
But I saw something else that seems more noteworthy. Why is withUhvPathP
being used to repeatedly open and close the same files using the filenames
contained in dif? (I suppose withHdf5PathP opens and closes HDF5 files.)
Why don't you open the files once in the beginning of your loop and close
them at the end? You can do this by wrapping everything in a ResourceT and
use allocate.
On Wed, May 20, 2020 at 6:25 PM Compl Yue
Maybe fork another thread, keep reading data and putting into a shared `MVar` / `TMVar`, then current thread keep taking from that var?
On 2020-05-20, at 14:23, PICCA Frederic-Emmanuel < frederic-emmanuel.picca@synchrotron-soleil.fr> wrote:
Hello, I am writing a software which do some data treatement from hdf5 files.
In order to access the data which are stored in arrays, I need to open and closes ressources. This is the purpose of the withHdf5PathP function.
given a file f and a UhvPath I end up with this
withUhvPathP :: (MonadSafe m, Location l) => l -> UhvPath -> ((Int -> IO Geometry) -> m r) -> m r withUhvPathP f (UhvPath m o d g w) gg = withHdf5PathP f m $ \m' -> withHdf5PathP f o $ \o' -> withHdf5PathP f d $ \d'-> withHdf5PathP f g $ \g' -> withHdf5PathP f w $ \w' -> gg (\j -> do mu <- get_position m' j omega <- get_position o' j delta <- get_position d' j gamma' <- get_position g' j wavelength <- getValueWithUnit w' 0 angstrom let positions = Data.Vector.Storable.fromList [mu, omega, delta, gamma'] let source = Source wavelength pure $ Geometry Uhv source positions Nothing)
then I use this like this
forever $ do (Chunk fp from to) <- await withFileP (openH5 fp) $ \f -> withHdf5PathP f imgs $ \dimgs -> withUhvPathP f dif $ \getDiffractometer -> withSamplePathP f samp $ \getSample -> forM_ [from..to-1] (\j -> yield =<< liftIO (DataFrameHkl <$> pure j <*> get_image' det dimgs j <*> getDiffractometer j <*> getSample j))
so once I open the resources, I use the getDiffratometer ((Int -> IO Geometry) -> m r) function give the position on the stream and it return the value at the j position. the purpose of this is to open the ressources only once at the begining and then send the values in a Pipe via the yield function of pipes package.
I hope, I was clear.
1) I find the withUhvPathP sort of ugly, and I would like to know if it could be written more elegantly, because I will need to write a lot's of these function for different types.
2) It is the right way to design a solution to my problem (open ressources and send each values into a stream).
thanks for you attention.
Frederic _______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

Do you mean it's ugly because of the nested withHdf4PathP? If so, have you considered ResourceT?
This is more about the complex signature of the function.
But I saw something else that seems more noteworthy. Why is withUhvPathP being used to repeatedly open and close the same files using the filenames contained in dif? (I suppose withHdf5PathP opens and closes HDF5 files.) Why don't you open the > files once in the beginning of your loop and close them at the end? You can do this by wrapping everything in a ResourceT and use allocate.
withHdf5Path already us the MonadSafe from pipes, so the resources are opened only once during the process. (once per thread, because I process chunk of the data, one per thread).

Maybe fork another thread, keep reading data and putting into a shared `MVar` / `TMVar`, then current thread keep taking from that var?
The code already use async in order to read only a chunk of the data. But is true, that I could have a thread whcih read the data and post the messag into a queue. then workers take these frames and produce a monoid. then all monoid are reduced into the final one.

On Wed, May 20, 2020 at 06:23:31AM +0000, PICCA Frederic-Emmanuel wrote:
1) I find the withUhvPathP sort of ugly, and I would like to know if it could be written more elegantly, because I will need to write a lot's of these function for different types.
At first glance seems like you should be able to convert to ContT and use do notation.
participants (4)
-
Compl Yue
-
PICCA Frederic-Emmanuel
-
Tom Ellis
-
☂Josh Chia (謝任中)