Re: [web-devel] xml-types IsString instance for Name causes crashes

10 Jun 2011


      2011/6/9 John Millikin :
...
On Thu, Jun 9, 2011 at 01:27, Yitzchak Gale  wrote:
...
...
...
Since the whole idea of xml-types is for it to be a
unifying standard, I'd like to see it usable in that kind
of environment, too.
...
It is usable in such an environment -- simply do not
use the IsString instance.
Once you have the Name type rather than just Text -
which is useful for people needing namespaces -
the IsString instance is important to keep code from
becoming really awkward.
I don't think the code is awkward without IsString. Compare:
-------------------------------------------------
bar :: Name
bar = "{urn:foo}bar
bar :: Name
bar = Name "bar" (Just "urn:foo") Nothing
-- optional: utility function
name :: Text -> Text -> Name
name ns local = Name local (Just ns) Nothing
bar :: Name
bar = name "urn:foo" "bar"
-------------------------------------------------
Using the Name constructor directly is more verbose, yes, but not very
much. If you know that you'll always have names of a certain form
(such as "has namespace, no prefix") then you can define a simple,
type-safe utility function. The IsString instance is a purely optional
syntactic sugar.
...
Perhaps the real problem here is including Clark
notation in the IsString instance. Clark notation is
very nice, but it doesn't really belong in the IsString
instance. Clark notation could be a function, or a
quasi-quoter. Perhaps the client library should be
allowed to decide.
Is there an alternative syntax you'd prefer?
IMO, IsString is only useful if it allows including both a namespace
and local name. To do so, it has to parse the input, which can fail.
...
But if it is included in the IsString instance, it
absolutely cannot raise an asynchronous exception.
That is a serious bug.
To me, the choice is between raising an exception or removing IsString.
IsString without namespaces is pointless. IsString without input
checking is dangerous. If fromString cannot fail on invalid input,
then it shouldn't be defined.
...
The Name type already produces invalid XML. A client library
that wants to avoid invalid names must already check for them.
If a typo in Clark notation were another way to create invalid
XML that wouldn't change anything.
You're right -- it is already possible for Names to be invalid. There
should probably be stricter input checking on names, to ensure they
match the XML spec. Something like this:
----------------------------------------------------------------------
-- Constructor isn't exported
data Name = Name Text (Maybe Text) (Maybe Text)
-- exported
name :: Text -> Maybe Text -> Maybe Text -> Maybe Name
name = -- validates input
-- exported, raises exception on failure
name_ :: Text -> Maybe Text -> Maybe Text -> Name
name_ local ns prefix = case name local ns prefix of
   Just n -> n
   Nothing -> error ("invalid name: error msg here")
instance IsString Name where
 fromString = name_ . Data.Text.pack
----------------------------------------------------------------------
...
Right, and it shouldn't. IsString is just a way of giving a different
string type, like ByteString or Text, to Haskell's string literal syntax.
There is no parsing to do beyond what the compiler already does.
Any IsString instance should just take the contents of the string
literal and incorporate it directly into the string type.
IMO, the ByteString and Text instances for IsString are broken.
ByteString should raise an exception if any (\c -> ord c > 255)
Text should raise an exception if any of the characters are invalid Unicode
I have had real failures in some of my programs as a result of these
overly-liberal instances, which could have been caught *much* sooner
if they'd simply raised an exception instead of silently returning a
corrupt value.
...
There is no language I've ever heard of where you can cause a
program to crash at *runtime* because of the particular characters
you include in a string literal in your source code. Please don't make
Haskell the first. That would be very ironic.
Again, this cannot cause crashes. Name's fromString throws exceptions,
which can be caught. If your software needs to be reliable, it should
*already* be catching and handling/logging unexpected exceptions,
since they can be raised from literally any point in the code.
Incorrect string literals can cause runtime crashes or exceptions in
many languages; consider [[ char c = "abc"[3]; ]]
...
...
if you're very concerned about it, I could add a flag like
"NoIsString" which disables that particular instance. You could enable
it in your build scripts.
I wouldn't use that flag. I need and use the IsString instance.
The instance must be defined in the module that defines the type.
Nobody *needs* the IsString instance. It's a trivial utility for use
in small scripts and other simple programs. It saves you from typing a
dozen characters in a utility module somewhere, nothing more.
If you're concerned that someone might accidentally use it, and thus
introduce errors into your program, I can offer the choice of
disabling it. The normal way to construct names will still exist.
_______________________________________________
web-devel mailing list
web-devel@haskell.org
http://www.haskell.org/mailman/listinfo/web-devel
Note that the OverloadedStringLiterals together with the 'IsString'
instances gives you more than just a nice syntactic abbreviation: the
defined string literals are also lifted out of their context and
defined as CAFs. Therefore, you can share the conversion from String
to <your-type> during the whole run of a program. We exploit that in
blaze-html to do escaping and UTF-8 encoding only once. Hence, in some
contexts the IsString instances are necessary.

Re: [web-devel] xml-types IsString instance for Name causes crashes

Simon Meier