Data.List / Map: simple serialization?

Hello,
Please advise on existing serialization libraries. I need a simple way to serialize Data.List and Data.Map to plain text files.
Thanks,
Dmitri

On 9 June 2011 17:23, Dmitri O.Kondratiev wrote:
Hello, Please advise on existing serialization libraries. I need a simple way to serialize Data.List and Data.Map to plain text files.
Well, the obvious solution is to just use show and read... though if you want something more efficient, have a look at binary and cereal on Hackage.
--
Ivan Lazar Miljenovic
Ivan.Miljenovic@gmail.com
IvanMiljenovic.wordpress.com

Binary should be pretty easy to use (and more advisable if you need
performance), since it defines Binary instances for all the basic types,
including of course Map.
I don't know about cereal, but I suppose it will be pretty much the same.
The major difference is that binary offers lazy serialization while cereal
is strict.
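A minimal sketch of the binary route, assuming the `binary` and `containers` packages are available. The names `sampleCounts`, `roundTripBinary`, `saveCounts` and `loadCounts` are made up for illustration, and the resulting file is binary rather than plain text:

```haskell
import Data.Binary (decode, decodeFile, encode, encodeFile)
import qualified Data.Map as Map

-- Hypothetical sample data, not taken from the thread.
sampleCounts :: Map.Map (Int, Int) Int
sampleCounts = Map.fromList [((1, 2), 3), ((2, 3), 1)]

-- Pure round trip through a lazy ByteString.
roundTripBinary :: Map.Map (Int, Int) Int -> Map.Map (Int, Int) Int
roundTripBinary = decode . encode

-- Write the map to a file; the path is chosen by the caller.
saveCounts :: FilePath -> Map.Map (Int, Int) Int -> IO ()
saveCounts = encodeFile

-- Read the map back; the result type selects the Binary instance.
loadCounts :: FilePath -> IO (Map.Map (Int, Int) Int)
loadCounts = decodeFile
```

The existing Binary instances for Map, tuples and Int do all the work here, so no instance has to be written by hand.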
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

If you want plain text serialization, "writeFile "output.txt" . show"
and "fmap read (readFile "output.txt")" should suffice...
Max
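The show/read suggestion above can be sketched roughly as follows. The helper names `saveText` and `loadText` and the sample values are assumptions made for illustration; the approach relies only on the Show and Read instances that lists and Data.Map already have:

```haskell
import qualified Data.Map as Map

-- Serialize any value with a Show instance to a plain text file.
saveText :: Show a => FilePath -> a -> IO ()
saveText path = writeFile path . show

-- Read the value back. The caller fixes the result type, for example
-- with a type signature or by how the result is used later.
loadText :: Read a => FilePath -> IO a
loadText path = fmap read (readFile path)

-- Hypothetical sample values.
sampleList :: [Int]
sampleList = [10, 20, 30]

sampleMap :: Map.Map String Int
sampleMap = Map.fromList [("a", 1), ("b", 2)]
```

For example, `saveText "list.txt" sampleList` followed by `xs <- loadText "list.txt" :: IO [Int]` should give back the original list.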

On Thu, Jun 9, 2011 at 11:31 AM, Max Bolingbroke wrote:
If you want plain text serialization, "writeFile "output.txt" . show"
and "fmap read (readFile "output.txt")" should suffice...
Max
This is really a simple way that I like, thanks. Do I understand this right,
that in this case Haskell run-time will do all the necessary data buffering
as needed?
I am trying to solve the following task:
1) Parse a log of 30 000 lines, where each line is about 200 chars. Parsing
means extracting two text fields from every line and converting them to a
GUID. Then build a list of GUIDs from this log and save it to a text file
for future use. The resulting list will have 30 000 integer elements.
2) Read the GUID list and traverse it several times from E(s) to E(N), where
E(s) is the start and E(N) the end element of a traversal, and s = 1 .. N-1,
so I have the following traversals:
1 .. N
2 .. N
3 .. N
...
N-1 .. N
3) As a result of these traversals I build a Data.Map whose keys are pairs of
list elements and whose values are the number of times each pair occurs in
the list. I need to save this map in a text file for future use.
4) Read this map back from the text file into a Data.Map, build another list
from it, save that to a text file, etc.
I wonder how Haskell will distribute memory between the buffer for
sequential element access (list elements, map tree nodes) and memory for
computation while reading in list, Data.Map from file?
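Steps 2 to 4 of the task could be sketched along these lines. The message does not say exactly which pairs are counted, so this assumes a pair is (start element of a traversal, any later element in that traversal) and that GUIDs are represented as Int; `pairCounts`, `savePairCounts` and `loadPairCounts` are hypothetical names:

```haskell
import Data.List (foldl', tails)
import qualified Data.Map.Strict as Map

-- Count pairs over all traversals s .. N.
-- Assumption: each suffix of the list is one traversal, and its head is
-- paired with every element that follows it.
pairCounts :: Ord a => [a] -> Map.Map (a, a) Int
pairCounts xs = foldl' bump Map.empty pairs
  where
    pairs = [ (x, y) | (x : rest) <- tails xs, y <- rest ]
    bump m k = Map.insertWith (+) k 1 m

-- Save the map as plain text with show.
savePairCounts :: FilePath -> Map.Map (Int, Int) Int -> IO ()
savePairCounts path = writeFile path . show

-- Load the map back with read.
loadPairCounts :: FilePath -> IO (Map.Map (Int, Int) Int)
loadPairCounts path = fmap read (readFile path)
```

Under this reading a list of N elements yields N(N-1)/2 pairs, so 30 000 elements means about 450 million insertions, although the map itself only holds the distinct pairs.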

Hi Dmitri,
On 9 June 2011 09:13, Dmitri O.Kondratiev wrote:
I wonder how Haskell will distribute memory between the buffer for sequential element access (list elements, map tree nodes) and memory for computation while reading in list, Data.Map from file?
Your list only has 30,000 elements. From the description of the problem, you traverse the list several times, so GHC will create an in-memory linked list that persists for the duration of all the traversals. This is OK, because the number of elements in the list is small.
For the construction of the Map, it sounds like in the worst case you will have 30,000*30,000 = 900,000,000 elements in the Map, which you may not want to keep in memory. Assuming "show", "read" and list creation are lazy enough, and as long as you use the Map linearly GHC should be able to GC parts of it to keep the working set small. You should experiment and see what happens.
My advice is just write the program the simple way (with show and read, and not worrying about memory) and see what happens. If it turns out that it uses too much memory you can come back to the list with your problematic program and ask for advice.
Max

On Thu, Jun 9, 2011 at 7:23 PM, Max Bolingbroke
My advice is just write the program the simple way (with show and read, and not worrying about memory) and see what happens. If it turns out that it uses too much memory you can come back to the list with your problematic program and ask for advice.
Max
Yes, that's what I will try first - simple serialization with show and read, not worrying about memory. Thanks!

On Thu, Jun 9, 2011 at 11:31 AM, Max Bolingbroke wrote:
If you want plain text serialization, "writeFile "output.txt" . show"
and "fmap read (readFile "output.txt")" should suffice...
Max
This code works:
main = do
    let xss = [[1,2,3],[4,5,6],[7,8],[9]]
    writeFile "output.txt" (show xss)
    line <- readFile "output.txt"
    let xss2 = read line :: [[Int]]
    print xss2
Since the complete file is returned as a single line, using 'fmap' does
not make sense here:
    line <- readFile "output.txt"
    let xss2 = fmap read line
When to use 'fmap'?

On Friday 10 June 2011, 13:49:23, Dmitri O.Kondratiev wrote:
When to use 'fmap'?
    xss2 <- fmap read (readFile "output.txt")
or
    xss2 <- read `fmap` readFile "output.txt"
But it might be necessary to tell the compiler which type xss2 ought to have, so it knows which `read' to invoke, if it can't infer that from later use.

On Fri, Jun 10, 2011 at 4:13 PM, Daniel Fischer < daniel.is.fischer@googlemail.com> wrote:
xss2 <- fmap read (readFile "output.txt")
or
xss2 <- read `fmap` readFile "output.txt"
But it might be necessary to tell the compiler which type xss2 ought to have, so it knows which `read' to invoke, if it can't infer that from later use.
Two questions:
1) Why to use 'fmap' at all if a complete file is read in a single line of text?
2) Trying to use 'fmap' illustrates 1) producing an error (see below):
main = do
    let xss = [[1,2,3],[4,5,6],[7,8],[9]]
    writeFile "output.txt" (show xss)
    xss2 <- fmap read (readFile "output.txt") :: [[Int]]
    print xss2
== Error:
Couldn't match expected type `[String]' with actual type `IO String'
In the return type of a call of `readFile'
In the second argument of `fmap', namely `(readFile "output.txt")'
In a stmt of a 'do' expression:
    xss2 <- fmap read (readFile "output.txt") :: [[Int]]

"Dmitri O.Kondratiev"
    xss2 <- read `fmap` readFile "output.txt"
Two questions: 1) Why to use 'fmap' at all if a complete file is read in a single line of text?
Because it's not 'map', it's more generalized. So the argument ('read' here) is applied to whatever is "inside" the second argument ('readFile ...'). Here
    xss2 <- read `fmap` readFile "output.txt"
is equivalent to
    xss2 <- return . read =<< readFile "output.txt"
or
    tmp <- readFile "output.txt"
    let xss2 = read tmp
2) Trying to use 'fmap' illustrates 1) producing an error (see below):
main = do
    let xss = [[1,2,3],[4,5,6],[7,8],[9]]
    writeFile "output.txt" (show xss)
    xss2 <- fmap read (readFile "output.txt") :: [[Int]]
    print xss2
fmap read (readFile "output.txt") is of type IO [[Int]], not [[Int]].
-k
--
If I haven't seen further, it is by standing in the footprints of giants
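The three equivalent spellings mentioned above can be written out as standalone functions to compare them. This is only a sketch; the function names are invented for illustration and the result type is pinned to [[Int]] by the signatures:

```haskell
-- Variant 1: apply read "inside" the IO action with fmap.
loadWithFmap :: FilePath -> IO [[Int]]
loadWithFmap path = read `fmap` readFile path

-- Variant 2: the same thing spelled with return and (=<<).
loadWithBind :: FilePath -> IO [[Int]]
loadWithBind path = return . read =<< readFile path

-- Variant 3: bind the file contents to a name, then apply read in a let.
loadWithLet :: FilePath -> IO [[Int]]
loadWithLet path = do
  tmp <- readFile path
  let xss = read tmp
  return xss
```

All three have the type FilePath -> IO [[Int]], which is why the annotation in the failing program has to mention IO.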

On Friday 10 June 2011, 14:25:59, Dmitri O.Kondratiev wrote:
Two questions: 1) Why to use 'fmap' at all if a complete file is read in a single line of text?
Well, it's a matter of taste whether to write
    foo <- fmap read (readFile "bar")
    stuffWithFoo
or
    text <- readFile "bar"
    let foo = read text
    stuffWithFoo
The former saves one line of code (big deal).
2) Trying to use 'fmap' illustrates 1) producing an error (see below):
main = do
    let xss = [[1,2,3],[4,5,6],[7,8],[9]]
    writeFile "output.txt" (show xss)
    xss2 <- fmap read (readFile "output.txt") :: [[Int]]
That type signature doesn't refer to xss2, but to the action to the right of the "<-", `fmap read (readFile "output.txt")'
    readFile "output.txt" :: IO String
so
    fmap foo (readFile "output.txt") :: IO bar
supposing
    foo :: String -> bar
You want read at the type `String -> [[Int]]', so the signature has to be
    xss2 <- fmap read (readFile "output.txt") :: IO [[Int]]
    print xss2
== Error:
Couldn't match expected type `[String]' with actual type `IO String'
In the return type of a call of `readFile'
In the second argument of `fmap', namely `(readFile "output.txt")'
In a stmt of a 'do' expression:
    xss2 <- fmap read (readFile "output.txt") :: [[Int]]
Looking at the line
    xss2 <- fmap read someStuff :: [[Int]]
the compiler sees that
fmap read someStuff should have type [[Int]]
Now, fmap :: Functor f => (a -> b) -> f a -> f b
and [] is a Functor, so the fmap here is map, hence
    map read someStuff :: [[Int]]
means
    someStuff :: [String]
That's the expected type of (readFile "output.txt"), but the actual type is of course IO String, which is the reported error.
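To make the two readings of fmap concrete, here is a small sketch contrasting the list functor with the IO functor, followed by the program from the thread with the annotation placed on the action. The names `parseEach`, `parseResult` and `correctedProgram` are assumptions introduced for this example:

```haskell
-- fmap at the list functor is map: read is applied to each String.
parseEach :: [String] -> [Int]
parseEach = fmap read

-- fmap at the IO functor: read is applied to the String the action yields.
parseResult :: IO String -> IO [[Int]]
parseResult = fmap read

-- The program from the thread, annotated with IO [[Int]] as explained above.
correctedProgram :: IO [[Int]]
correctedProgram = do
  let xss = [[1, 2, 3], [4, 5, 6], [7, 8], [9]] :: [[Int]]
  writeFile "output.txt" (show xss)
  xss2 <- fmap read (readFile "output.txt") :: IO [[Int]]
  return xss2
```

Annotating the action with [[Int]] instead makes the compiler pick the list instance, as in `parseEach`, which is where the expected type `[String]' in the error message comes from.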

Thanks for the excellent explanation!
On Fri, Jun 10, 2011 at 4:49 PM, Daniel Fischer <daniel.is.fischer@googlemail.com> wrote:
participants (6)
- Daniel Fischer
- Dmitri O.Kondratiev
- Ivan Lazar Miljenovic
- Ketil Malde
- Max Bolingbroke
- Yves Parès