Re: [Haskell-cafe] Don't make 'show*' functions

Quoth brian

Jeff Heard proclaimed:
There are multiple distinct reasons people use Show, and this gets confusing.
A parser function applied to a Haskell string representation of the second
This is exactly what I was getting at. I see four uses being discussed:
1) Programmer readable / compiler parsable. For this we have 'Show' and 'Read', but the community does not use these classes consistently.
Suggestion 1: Have "a == (read . show) a" as a mandatory QC test for all exported data structures that have Read and Show instances within the Haskell Platform packages.
2) Human readable format ('pretty'). The 'Pretty' type class is hidden in Text.PrettyPrint.HughesPJClass - much as I admire the giants of FP, I feel we shouldn't encode people's names in modules. The whole 'Pretty' module hierarchy seems backward to me: the guts are on the bottom (Text.PrettyPrint) and the high level API is hidden.
Suggestion 2: Alter the Pretty packages just a little, combine them, and encourage people to use the result.
3) "Network ready" string packing as mentioned by Donn Cave. This one surprised me as I hadn't thought of a function termed 'str' or 'show' being used for binary encoding. Obviously, we have Binary for just this.
4) Casts comment: form --- still (somewhat) readable, but also acceptable to ghci as input. This odd middle ground is the only place I see package specific 'show*' functions as appropriate. OTOH, if it's still valid Haskell I don't see why this still can't be the Show type class, but the derivation will have to be done manually or with a patched compiler.
forall m. (Mind m, View m ~ Maybe (Email Substance)) => Email a -> m -> View m
(people's views may vary and better be deterministic)
Cheers, Tom

Thomas DuBuisson wrote:
Jeff Heard proclaimed:
There are multiple distinct reasons people use Show, and this gets confusing.
This is exactly what I was getting at. I see four uses being discussed:
Indeed, though I think the situation is even worse. It seems to me that there are a number of cross-purposes for the Read+Show classes. The two main questions at stake are:
A) Who/what is the audience?
Common answers:
1) The human user on the other side of the terminal
2) The human developer trying to debug their work
3) The compiler, a la cut&paste
4) A program on the other end of the wire/disk/flux capacitor
B) What is the resolution/detail?
Common answers:
1) All the gory details
2) Enough for a human to get the big picture
3) Enough for a computer to get the right value
The @a == (read . show) a@ interpretation is in A3/B1 territory, or A3/B3 if smart constructors are used. The GHCi and Hugs REPL are generally in A2/B1, or A1/B2 if we're trying to abstract away from the concrete implementation. The showTrie function mentioned at the outset of this thread is firmly in the A2/B2 category and is not intended for end users at all.
Clearly not all of these combinations can be serviced by a single class or pair of classes. Many of the combinations can themselves be broken down further (how much detail is "enough" to get the big picture?).
One of the things which has always disappointed me about the Read+Show classes is that they elide the profound difference between A1 and A2. Printing out a value for human consumption is entirely different than printing it out for debugging. The A3 compromise only serves to muddy things further and brings on the spectre of A4.
For complex datastructures like Map, IntMap, and Trie there are many details stored in the structure which end users need not or should not know about; but these details are essential to the developers to ensure their code is doing what they think it should be. Similarly, for these large datastructures there is a profound difference in resolution between a derived Show class and an ASCII-art rendering of the tree. For small values ---and when you don't trust your tree drawer--- the derived instance is just what you want, but it quickly becomes unreadable for all but the most trivial examples.
Given the enormous design space involved here, there is no tractable answer that will cover everything. People have been working on user interfaces and data visualization for years, and no perfect answer has been found. But that doesn't mean we can't make progress. I agree with Thomas DuBuisson's suggestions and have rephrased them below, along with three new ones of my own.
Proposal 1: Combine Read and Show into a single class, "enforcing" @a == (read . show) a@ with the intention of capturing A3/B1 or A3/B3. While this may be serviceable for A1 or A2 uses, the intention of the class should be made clear that it is for A3. Presumably some solution should be found for types which can be read or shown but not both.
Proposal 2: Clean up Text.PrettyPrint.HughesPJ and market it heavily for covering code-oriented aspects of A1. Other visualizations like charts, graphs, or trees should be relegated elsewhere.
Proposal 3: Add a generic Lisp pretty printer function to convert from the output of Proposal 1 towards output like Proposal 2. Dealing with operators makes it trickier than Lisp, but most datastructures lack operators. I'm sure this has already been written in Haskell by Yi enthusiasts, and it should reduce the cost of getting people into using Pretty.
Proposal 4: Write a generic function for taking recursive types and printing them as a tree. Users should only need to serialize the "here" content of each node, leaving it up to the generic function to add spines and adequate spacing between nodes. This is for targeting A2/B2. While it's a time-honored tradition to implement specialized versions of this function to introduce people to recursion, making a standard version for visualizing large datastructures would alleviate some of the burden of what Show should be doing.
Proposal 5: Currently a main consumer of Show is the GHCi or Hugs REPL. However, those are used both by end users and by developers, which leads to the elision between A1 and A2. Provided the previous four proposals are taken to heart, it would be nice if GHCi and Hugs had commands to select which viewing mode (show, prettyShow, lispShow, treeShow) should be used for each type. By default, when available prettyShow should be favored over lispShow which is favored over show; but this behavior could be changed on a type-by-type basis. There are complications here regarding how to recurse for each element of a type, but they seem soluble.
-- Live well, ~wren

wren ng thornton schrieb:
Thomas DuBuisson wrote:
Jeff Heard proclaimed:
There are multiple distinct reasons people use Show, and this gets confusing.
This is exactly what I was getting at. I see four uses being discussed:
Indeed, though I think the situation is even worse. It seems to me that there are a number of cross-purposes for the Read+Show classes. The two main questions at stake are:
A) Who/what is the audience?
Common answers: 1) The human user on the other side of the terminal
no
2) The human developer trying to debug their work
yes
3) The compiler, a la cut&paste
yes
4) A program on the other end of the wire/disk/flux capacitor
no
B) What is the resolution/detail?
Common answers: 1) All the gory details
maybe
2) Enough for a human to get the big picture
no
3) Enough for a computer to get the right value
yes
For complex datastructures like Map, IntMap, and Trie there are many details stored in the structure which end users need not or should not know about; but these details are essential to the developers to ensure their code is doing what they think it should be.
Hm, yes, there seems to be demand for debug-levels, even different ones for different libraries
Proposal 1: Combine Read and Show into a single class, "enforcing" @a == (read . show) a@ with the intention of capturing A3/B1 or A3/B3. While this may be serviceable for A1 or A2 uses, the intention of the class should be made clear that it is for A3. Presumably some solution should be found for types which can be read or shown but not both.
functions - If you have only one field in a record, which is of function type, you may use a dummy 'show' for it, but you cannot define a 'read'.
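A minimal sketch of the round-trip law from Proposal 1 and of the function-field case just described. The names roundTrips, Point, Handler and samples are hypothetical and chosen only for illustration; a QuickCheck property would use the same predicate, but this version sticks to the Prelude and checks a fixed list of samples instead.

```haskell
-- Hypothetical helper: the law  a == (read . show) a  as a plain predicate.
roundTrips :: (Eq a, Read a, Show a) => a -> Bool
roundTrips x = read (show x) == x

-- A record whose derived instances satisfy the law.
data Point = Point { px :: Int, py :: Int }
  deriving (Eq, Show, Read)

-- A record with a single field of function type.
newtype Handler = Handler { run :: Int -> Int }

-- A dummy Show is possible, but the text discards the function itself,
-- so no Read instance could rebuild the original value from it.
instance Show Handler where
  showsPrec _ _ = showString "Handler { run = <function> }"

-- Illustrative sample values, including a negative field.
samples :: [Point]
samples = [Point 0 0, Point (-3) 7, Point 1000 (-1)]
```

Evaluating `all roundTrips samples` checks the law on the samples; there is no way to write the same check for Handler, since it has neither Eq nor Read.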
Proposal 2: Clean up Text.PrettyPrint.HughesPJ and market it heavily for covering code-oriented aspects of A1. Other visualizations like charts, graphs, or trees should be relegated elsewhere.
... TeX, HTML, ...
Proposal 3: Add a generic Lisp pretty printer function to convert from the output of Proposal 1 towards output like Proposal 2. Dealing with operators makes it trickier than Lisp, but most datastructures lack operators. I'm sure this has already been written in Haskell by Yi enthusiasts, and it should reduce the cost of getting people into using Pretty.
Proposal 4: Write a generic function for taking recursive types and printing them as a tree. Users should only need to serialize the "here" content of each node, leaving it up to the generic function to add spines and adequate spacing between nodes. This is for targeting A2/B2. While it's a time-honored tradition to implement specialized versions of this function to introduce people to recursion, making a standard version for visualizing large datastructures would alleviate some of the burden of what Show should be doing.
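A rough sketch of what the generic function in Proposal 4 might look like, assuming the caller supplies a function that splits a node into its "here" label and its children. The names drawWith, BST, viewBST and example are hypothetical, and the spine characters are an arbitrary choice.

```haskell
-- Hypothetical generic renderer: 'view' gives the label of a node and its
-- children; the renderer adds the spines and the indentation.
drawWith :: (a -> (String, [a])) -> a -> [String]
drawWith view = go
  where
    go node =
      let (label, kids) = view node
      in label : children kids

    -- The last child gets a closing corner; earlier children keep a
    -- vertical spine running down to their later siblings.
    children []       = []
    children [k]      = shift "`- " "   " (go k)
    children (k : ks) = shift "+- " "|  " (go k) ++ children ks

    -- Prefix the first line with 'first' and every later line with 'other'.
    shift first other = zipWith (++) (first : repeat other)

-- An illustrative recursive type: only the "here" content is serialized.
data BST = Leaf | Node BST Int BST

viewBST :: BST -> (String, [BST])
viewBST Leaf         = (".", [])
viewBST (Node l x r) = (show x, [l, r])

example :: BST
example = Node (Node Leaf 1 Leaf) 2 Leaf
```

Printing the result with `putStr (unlines (drawWith viewBST example))` shows 2 at the root, with the subtree rooted at 1 and an empty leaf hanging off it.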
Proposal 5: Currently a main consumer of Show is the GHCi or Hugs REPL. However, those are used both by end users and by developers, which leads to the elision between A1 and A2.
I think that a user who uses GHCi becomes a developer. For me a user is someone who calls compiled Haskell programs.
Provided the previous four proposals are taken to heart, it would be nice if GHCi and Hugs had commands to select which viewing mode (show, prettyShow, lispShow, treeShow) should be used for each type. By default, when available prettyShow should be favored over lispShow which is favored over show; but this behavior could be changed on a type-by-type basis. There are complications here regarding how to recurse for each element of a type, but they seem soluble.
Yes, it would be nice if the showing in GHCi could be changed, such that e.g. matrices are shown as grids. However, this can currently be done by passing the expression to a function which does the required formatting. I remember GSLMatrix had a function like
(//) :: Matrix -> Precision -> IO ()
which let you write
GSLMatrix> matrix // 2
/1.00 0.00\
\0.00 1.00/
I would not be surprised if GHCi can already be configured to attach the (//2) automatically.
I always think we need a type class which allows giving type-specific options. However, the option type would functionally depend on the type of the shown value, and thus the type class must be a multiparameter type class with functional dependencies. This way you can specify formatting options of the matrix (kind of parentheses) and its elements (precision, exponential style etc.).
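The multiparameter class described above could be sketched roughly as follows. This is only an illustration: ShowWith, Precision, Matrix and MatrixOpts are hypothetical names defined here, not the GSLMatrix API, and the matrix is modelled as a plain list of rows.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}

import Data.List (intercalate)
import Numeric (showFFloat)

-- Hypothetical class: the type of the shown value determines its option type.
class ShowWith a opts | a -> opts where
  showWith :: opts -> a -> String

-- Element options: number of digits after the decimal point.
newtype Precision = Precision Int

instance ShowWith Double Precision where
  showWith (Precision p) x = showFFloat (Just p) x ""

-- Illustrative matrix type: a list of rows.
newtype Matrix = Matrix [[Double]]

-- Matrix options: the kind of brackets, plus the options for the elements.
data MatrixOpts = MatrixOpts
  { brackets :: (String, String)
  , element  :: Precision
  }

instance ShowWith Matrix MatrixOpts where
  showWith opts (Matrix rows) = intercalate "\n" (map row rows)
    where
      (open, close) = brackets opts
      -- Each element is formatted with the element options.
      row xs = open ++ unwords (map (showWith (element opts)) xs) ++ close
```

With these definitions, `showWith (MatrixOpts ("|", "|") (Precision 2)) (Matrix [[1, 0], [0, 1]])` formats the identity matrix with two decimals per element, one bracketed row per line.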

Henning Thielemann wrote:
I think that a user who uses GHCi becomes a developer. For me a user is someone who calls compiled Haskell programs.
GHCi makes for a great calculator. I agree that, to a first approximation, "users" call compiled binaries and "developers" use GHCi. But there's a big gap between those two which is not devoid of life. I think there are many more people who use GHCi as an interactive shell session than most folks give credit for. More particularly, I posit that the prevalence of these people is part of what muddies the questions of who the audience of Show should be.
Even if we chose to lump them in with "developers" there's still the very real problem of resolution. Many people develop by bricolage, which means that over time they transition from being "interactive users" or "casual developers" or "scripters" into being "'real' developers". Just as often 'real' developers transition into interaction when they want to do active debugging. We should be able to visualize data differently at each of these different stages, but it is just as important ---if not moreso--- to be able to transition between different resolutions easily. So far, the Show class is very much a one-size-fits-all solution which doesn't fit anyone very well.
IMO, trying to refine or redefine the intended semantics of Show is wrongheaded, because the space between "users" and "developers" is far closer to being continuous than discrete. A single type-class that looks like Show cannot possibly resolve this mismatch, no matter what the intended semantics are. The continuous space between "users" and "developers" requires some solution which takes the variability of audiences/resolutions into consideration. Trying to shove everyone into the same bucket has not been working for some time already. We should acknowledge the real problem behind that.
-- Live well, ~wren
participants (4): Donn Cave, Henning Thielemann, Thomas DuBuisson, wren ng thornton