How to make this more functional?

I wrote a 10-line program[1] for converting from org-mode format to smsn-mode format. Both formats use indentation to indicate hierarchy. In org, a line at level k (levels are positive integers) starts with k asterisks, followed by a space, followed by the text of the line. In smsn-mode, a line at level k starts with 4*(k-1) spaces, followed by an asterisk, followed by a space.

I feel like there ought to be an intermediate step where it converts the data to something other than string -- for instance,

    data IndentedLine = IndentedLine Int String | BadLine

and then generates the output from that.

It feels like it needs a parser. But when I resurrect my parser code that I understood many moons ago, it looks like a lot of machinery.

Thanks.

[1] https://github.com/synchrony/smsn-mode/blob/develop/org-to-smsn-mode.hs

-- 
Jeff Brown | Jeffrey Benjamin Brown
Website https://msu.edu/~brown202/ | Facebook https://www.facebook.com/mejeff.younotjeff | LinkedIn https://www.linkedin.com/in/jeffreybenjaminbrown (spammy, so I often miss messages here) | Github https://github.com/jeffreybenjaminbrown

On 2017-07-18 06:10, Jeffrey Brown wrote:
> I wrote a 10-line program[1] for converting from org-mode format to smsn-mode format. Both formats use indentation to indicate hierarchy. In org, a line at level k (levels are positive integers) starts with k asterisks, followed by a space, followed by the text of the line. In smsn-mode, a line at level k starts with 4*(k-1) spaces, followed by an asterisk, followed by a space.
> I feel like there ought to be an intermediate step where it converts the data to something other than string -- for instance,
>     data IndentedLine = IndentedLine Int String | BadLine
> and then generates the output from that.

I think generating an intermediate data structure makes a lot of sense.

To make your program more 'functional', I'd start by factoring out the IO part as early as possible. I.e. consider your program to be a function of type 'String -> String': it consumes a string and yields a string:

    reformat :: String -> String

Now, reformatting the input means splitting it into lines, converting each line, and then merging the lines into a single string again, i.e. we can define 'reformat' as:

    reformat input = unlines (map convertLine (lines input))

To make this type-check, you need some functions with the types

    lines       :: String -> [String]
    convertLine :: String -> String
    unlines     :: [String] -> String

As it happens, the first and the last function are part of the standard library (as is 'map', which applies 'convertLine' to every line), so we only need to worry about 'convertLine'. Converting a line means parsing the input line and then serialising the parsed data to the output format, i.e.

    convertLine line = serialiseToOutput (parseLine line)

At this point, some sort of data structure to pass from parseLine to serialiseToOutput would be useful. You could certainly go for the 'IndentedLine' type you sketched, i.e. the parseLine function can be declared to be of type

    parseLine :: String -> IndentedLine

I'll skip defining this function, but the 'span' function defined in the Data.List module might be useful here. With that at hand, you only need to define the serialiseToOutput function which (in order to make this program type-check) needs to be of type

    serialiseToOutput :: IndentedLine -> String

Again, I'll omit the definition here (but the 'replicate' function would probably be useful).

At this point, you should have your 'reformat' function fully defined and usable from within 'ghci', i.e. you can nicely test it with some manual input.
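
For readers following along, here is one way the two omitted definitions could be filled in. This is a minimal sketch, not the code from the linked repository. It assumes that a valid org line is one or more asterisks followed by a single space, and that a 'BadLine' is emitted as an empty line; since 'BadLine' carries no text, the original content of such a line is lost. The thread does not specify either behaviour. Note that 'convertLine' is applied to each line with 'map'.

```haskell
-- Intermediate representation suggested in the thread.
data IndentedLine
  = IndentedLine Int String  -- nesting level (>= 1) and the text of the line
  | BadLine                  -- anything that is not "<stars><space><text>"
  deriving (Eq, Show)

-- Split off the leading asterisks with 'span', then require a single space.
parseLine :: String -> IndentedLine
parseLine line =
  case span (== '*') line of
    (stars@(_ : _), ' ' : text) -> IndentedLine (length stars) text
    _                           -> BadLine

-- Level k becomes 4*(k-1) spaces, an asterisk, a space, then the text.
serialiseToOutput :: IndentedLine -> String
serialiseToOutput (IndentedLine level text) =
  replicate (4 * (level - 1)) ' ' ++ "* " ++ text
serialiseToOutput BadLine = ""  -- assumption: bad lines become empty lines

convertLine :: String -> String
convertLine line = serialiseToOutput (parseLine line)

reformat :: String -> String
reformat input = unlines (map convertLine (lines input))

-- Small demonstration on a fixed sample; prints:
-- * top
--     * child
--         * grandchild
main :: IO ()
main = putStr (reformat "* top\n** child\n*** grandchild\n")
```

If dropping the text of malformed lines is not acceptable, one option would be to let 'BadLine' carry the original string, at the cost of deviating from the type sketched in the question.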
What's missing is to use it in a real program - you could of course plug it into your existing program calling 'readFile', but as a last idea I'd like to mention the standard 'interact' function which, given a function of type 'String -> String', yields an IO action which reads some input from stdin, applies the given function to it, and then prints the output to stdout. A useful helper for defining UNIX-style filter programs.

I believe one lesson to take from this is to not think about how the program does something ('count the number of * characters', etc.) but rather think about _what_ the program does - in this case, in a top-down fashion. Also, in Haskell, this type-driven development works quite nicely to yield programs which you can tinker with very early on.

-- 
Frerich Raabe - raabe@froglogic.com
www.froglogic.com - Multi-Platform GUI Testing
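
As an illustration of the 'interact' idea described above, here is a minimal sketch of a stdin-to-stdout filter. The function 'shout' is a hypothetical stand-in for 'reformat'; any pure function of type 'String -> String' can be plugged in the same way.

```haskell
import Data.Char (toUpper)

-- Hypothetical stand-in for 'reformat': upper-cases its whole input.
shout :: String -> String
shout = map toUpper

-- 'interact' reads stdin, applies the pure function, and writes the
-- result to stdout, so the program behaves like a UNIX filter.
main :: IO ()
main = interact shout
```

Saved under an assumed file name such as Shout.hs, this can be tried with something like 'echo hello | runghc Shout.hs'. All of the IO lives in the one-line 'main', while the transformation itself stays a pure function that can be tested in ghci.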

interact is slick! And I love that there's a word for unlines. Thanks, Frerich!
_______________________________________________
Beginners mailing list
Beginners@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners