
Hi. I am working on some practice programming problems, and one is the Roman numeral problem: write a program that converts Roman numerals into their (arabic) numeral equivalent. I imagine I could hack something together, but I was trying to think about the problem a bit more deeply. I don't know much about parsing, but I was wondering if this problem matched up with some kind of parsing or grammar or other generic approach to thinking about the problem. (I once read something about Context Free Grammars, which was rather interesting.) I can do more of my own research if I can get some initial guidance. -- frigidcode.com

Hi Christopher,
I made a small library to convert between strings and roman numerals [1].
It didn't use much abstraction. I mainly used some type-classes so multiple
string-like types can be parsed.
The numerals themselves are basically a concatenation of value symbols. The
order from high to low is not even strictly necessary in order to parse a
roman numeral. One insight was to handle exceptions like "IV", "IX", "XL"
etc. as separate symbols.
It would be interesting if you could parse roman numerals using a dedicated
parsing library and come up with something shorter and/or more
elegant/readable than my little library.
Regards,
Roel
1 - http://hackage.haskell.org/package/roman-numerals
On 24 June 2013 08:43, Christopher Howard wrote: Hi. I am working on some practice programming problems, and one is the
Roman numeral problem: write a program that converts Roman numerals into
their (arabic) numeral equivalent. I imagine I could hack something
together, but I was trying to think about the problem a bit more deeply.
I don't know much about parsing, but I was wondering if this problem
matched up with some kind of parsing or grammar or other generic
approach to thinking about the problem. (I once read something about
Context Free Grammars, which was rather interesting.) I can do more of
my own research if I can get some initial guidance. --
frigidcode.com _______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

An important question here is whether you want to notice when a Roman numeral is invalid, e.g., iix, or not. From a parsing point of view context-free grammars are not ideal. We have the patterns i{1,3} | iv | vi{1,3} | ix units x{1,3} | xl | lx{1,3} | xc tens c{1,3} | cd | dc{1,3} | cm hundreds m{1,3} | mↁ | ↁm{1,3} | mↂ thousands so we want to write a grammar rule like roman(N) --> group('m', 'ↁ', 'ↂ', M), group('c', 'd', 'm', C), group('x', 'l', 'c', X), group('i', 'v', 'x', I), {N is M*1000 + C*100 + X*10 + I}. group(U, F, T, N) --> ( [U,T] -> {N = 9} ; [U,F] -> {N = 4} ; [U,U,U] -> {N = 3} ; [U,U] -> {N = 2} ; [U] -> {N = 1} ; [F,U,U,U] -> {N = 8} ; [F,U,U] -> {N = 7} ; [F,U] -> {N = 6} ; [F] -> {N = 5} ). using DCG notation. This is not context-free. It is an attribute grammar with attributes passed down the parse tree (U, F, T) and passed up the parse tree (N). Parser combinators are the easy way to do this in Haskell; see 'parsec' or any number of parser combinator tutorials. For that matter, it's pretty trivial to do this particular one with plain code that pretty much mirrors the structure shown above. Just write something that takes (as many arguments as you want) then a list of characters coming in and returns a pair with a computed answer and the remaining characters. But wait! That _is_ a parser combinator! I guess parser combinators can't be that hard after all (:-). Good luck finding the five thousand character (ↁ) or the ten thousand character (ↂ) or any of the other Unicode "I'm not a letter, no, honest, I'm really not, I'm a Roman numeral" characters on your keyboard.

Coincidence. I blogged about this today
http://kusimari.blogspot.in/2013/06/roman-numerals-using-parser-combinators.....
I can share the haskelish python code offline, assuming this is not needed
to pass the thoughtworks code submission.
It more or less resembles what Richard has written.
-Santhosh
On Mon, Jun 24, 2013 at 12:54 PM, Richard A. O'Keefe
An important question here is whether you want to notice when a Roman numeral is invalid, e.g., iix, or not.
From a parsing point of view context-free grammars are not ideal. We have the patterns i{1,3} | iv | vi{1,3} | ix units x{1,3} | xl | lx{1,3} | xc tens c{1,3} | cd | dc{1,3} | cm hundreds m{1,3} | mↁ | ↁm{1,3} | mↂ thousands so we want to write a grammar rule like
roman(N) --> group('m', 'ↁ', 'ↂ', M), group('c', 'd', 'm', C), group('x', 'l', 'c', X), group('i', 'v', 'x', I), {N is M*1000 + C*100 + X*10 + I}.
group(U, F, T, N) --> ( [U,T] -> {N = 9} ; [U,F] -> {N = 4} ; [U,U,U] -> {N = 3} ; [U,U] -> {N = 2} ; [U] -> {N = 1} ; [F,U,U,U] -> {N = 8} ; [F,U,U] -> {N = 7} ; [F,U] -> {N = 6} ; [F] -> {N = 5} ).
using DCG notation. This is not context-free. It is an attribute grammar with attributes passed down the parse tree (U, F, T) and passed up the parse tree (N).
Parser combinators are the easy way to do this in Haskell; see 'parsec' or any number of parser combinator tutorials.
For that matter, it's pretty trivial to do this particular one with plain code that pretty much mirrors the structure shown above. Just write something that takes (as many arguments as you want) then a list of characters coming in and returns a pair with a computed answer and the remaining characters. But wait! That _is_ a parser combinator! I guess parser combinators can't be that hard after all (:-).
Good luck finding the five thousand character (ↁ) or the ten thousand character (ↂ) or any of the other Unicode "I'm not a letter, no, honest, I'm really not, I'm a Roman numeral" characters on your keyboard.
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
participants (4)
-
Christopher Howard
-
kusimari share
-
Richard A. O'Keefe
-
Roel van Dijk