
Hi everyone,

I'm hoping someone can point me in the right direction for a project I'm working on. Essentially I would like to represent a grid of data (much like a spreadsheet) in pure code. In this sense, one would need functions to operate on the concepts of "rows" and "columns". A simple "cell" might be represented like this:

data Cell = CellStr Text
          | CellInt Integer
          | CellDbl Double
          | CellEmpty

The spreadsheet analogy isn't too literal, as I'll be using this for data with a more regular structure. For instance, one grid might have 3 columns where every item in column one is a CellStr, every item in column two a CellStr, and every item in column three a CellDbl, but within a given grid there won't be surprise rows with extra columns, or columns that contain some cell strings, some cell ints, etc.

Representing cells in a matrix makes the most sense to me, in order to facilitate access by columns or rows or both, and I'd like to know if there's a particular matrix library that would work well with this idea. However, I'm certainly open to any other data structures that may be better suited to the task.

Thanks!
Eric
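[A minimal sketch of the Cell type described above, with simple row/column accessors. The list-of-rows Grid representation, the accessor names, and the use of String instead of Text (to keep the example dependency-free) are illustrative choices, not from the original message.]

```haskell
-- The Cell type from the question, using String rather than Text
-- so the sketch needs no external packages.
data Cell = CellStr String
          | CellInt Integer
          | CellDbl Double
          | CellEmpty
          deriving (Show, Eq)

-- An assumed representation: a grid is a list of rows.
type Grid = [[Cell]]

-- Fetch a row by index (partial, like (!!)).
row :: Int -> Grid -> [Cell]
row i g = g !! i

-- Fetch a column by index across all rows.
column :: Int -> Grid -> [Cell]
column j = map (!! j)

-- A grid matching the shape described: CellStr, CellStr, CellDbl.
example :: Grid
example =
  [ [CellStr "apples",  CellStr "fuji",  CellDbl 1.99]
  , [CellStr "oranges", CellStr "navel", CellDbl 2.49]
  ]
```

With this shape, `column 2 example` yields every price in the grid; a real library (e.g. a matrix package) would mainly buy better indexing performance over plain lists.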

Hi Eric

A spreadsheet is an indexed / tabular structure, which doesn't map well to Haskell's built-in way of defining data - algebraic types - which are trees built from sums and products. Wolfram Kahl has a paper on modelling tables in Haskell, "Compositional Syntax and Semantics of Tables", which might be interesting / useful: tables look like they have strong similarities to spreadsheets, and the implementation is included in the appendix. Unfortunately the code is very complicated - I say this intending no criticism or judgement of Wolfram's work, just that it takes a lot of type system power to get over the representation mismatch between trees and tables.

Wolfram Kahl - Compositional Syntax and Semantics of Tables
http://www.cas.mcmaster.ca/sqrl/papers/sqrl15.pdf

Best wishes
Stephen

Hi,

Eric Rasmussen wrote:
The spreadsheet analogy isn't too literal as I'll be using this for data with a more regular structure. For instance, one grid might have 3 columns where every item in column one is a CellStr, every item in column two a CellStr, and every item in column 3 a CellDbl, but within a given grid there won't be surprise rows with extra columns or columns that contain some cell strings, some cell ints, etc.
Sounds more like a database than like a spreadsheet.

Tillmann

Stephen, thanks for the link! The paper was an interesting read and definitely gave me some ideas.

Tillmann -- you are correct in that it's very similar to a database.

I frequently go through this process:

1) Receive a flat file (various formats) of tabular data
2) Create a model of the data and a parser for the file
3) Code utilities that allow business users to filter/query/accumulate/compare the files

The models are always changing, so one option would be to inspect a user-supplied definition, build a SQLite database to match, and use Haskell to feed in the data and run queries. However, I'm usually dealing with files that can easily be loaded into memory, and generally they aren't accessed with enough frequency to justify persisting them in a separate format.

It's actually worked fine in the past to code a custom data type with record syntax (or sometimes just tuples) and simply build a list of them, but the challenge in taking this to a higher level is reading in a user-supplied definition, perhaps translated as 'the first column should be indexed by the string "Purchase amount" and contains a Double', and then performing calculations on those doubles based on further user input. I'm trying to get over bad object-oriented habits of assigning attributes at runtime and inspecting types to determine which functions can be applied to which data, and I'm not sure what concepts of functional programming better address these requirements.

On Fri, May 27, 2011 at 12:33 PM, Tillmann Rendel <rendel@informatik.uni-marburg.de> wrote:
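[One way to make the "user-supplied definition" idea above concrete is to parse the definition into a list of column specs, then validate each incoming row against it. Everything here - the type names, ColumnSpec shape, and parseRow helper - is an illustrative sketch, not from the thread.]

```haskell
import Text.Read (readMaybe)

-- Same Cell type as in the original question (String in place of Text).
data Cell = CellStr String
          | CellInt Integer
          | CellDbl Double
          | CellEmpty
          deriving (Show, Eq)

-- A hypothetical runtime description of one column, as might be read
-- from a user-supplied definition file.
data ColType = TStr | TInt | TDbl deriving (Show, Eq)

data ColumnSpec = ColumnSpec
  { colName :: String
  , colType :: ColType
  }

-- Parse one raw field according to its declared column type.
parseCell :: ColType -> String -> Maybe Cell
parseCell TStr s = Just (CellStr s)
parseCell TInt s = CellInt <$> readMaybe s
parseCell TDbl s = CellDbl <$> readMaybe s

-- A row parses only if it has exactly the declared columns and
-- every field parses at its declared type.
parseRow :: [ColumnSpec] -> [String] -> Maybe [Cell]
parseRow specs fields
  | length specs /= length fields = Nothing
  | otherwise = sequence (zipWith (parseCell . colType) specs fields)
```

For example, with a spec like `[ColumnSpec "Purchase amount" TDbl]`, a row `["1.99"]` parses to `Just [CellDbl 1.99]` while `["oops"]` yields `Nothing` - the ill-typed row is rejected at load time rather than discovered mid-query.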

On Fri, May 27, 2011 at 3:11 PM, Eric Rasmussen wrote:
Stephen, thanks for the link! The paper was an interesting read and definitely gave me some ideas.
Tillmann -- you are correct in that it's very similar to a database.
I frequently go through this process:
1) Receive a flat file (various formats) of tabular data
2) Create a model of the data and a parser for the file
3) Code utilities that allow business users to filter/query/accumulate/compare the files
The models are always changing, so one option would be to inspect a user-supplied definition, build a SQLite database to match, and use Haskell to feed in the data and run queries. However, I'm usually dealing with files that can easily be loaded into memory, and generally they aren't accessed with enough frequency to justify persisting them in a separate format.
"Worth it" in what terms? You're either going to have to encode the relationships yourself, or else automate the process.
It's actually worked fine in the past to code a custom data type with record syntax (or sometimes just tuples) and simply build a list of them, but the challenge in taking this to a higher level is reading in a user-supplied definition, perhaps translated as 'the first column should be indexed by the string "Purchase amount" and contains a Double', and then performing calculations on those doubles based on further user input. I'm trying to get over bad object-oriented habits of assigning attributes at runtime and inspecting types to determine which functions can be applied to which data, and I'm not sure what concepts of functional programming better address these requirements.
My intuition is to use some kind of initial algebra to create a list-like structure /for each record/. For example, with GADTs:

data Field a = Field { name :: String }
data Value a = Value { value :: a }

Presumably, your data definition will parse into:

data RecordScheme where
  NoFields :: RecordScheme
  AddField :: Field a -> RecordScheme -> RecordScheme

And then, given a record scheme, you can construct a Table by running the appropriate queries for the scheme and Populating its Records:

data Record where
  EndOfRecord :: Record
  Populate    :: Value a -> Record -> Record

type Table = [Record]
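[A self-contained, compilable version of the sketch above. Note that Populate quantifies the value type existentially, so nothing can be done with a stored value unless a constraint is carried along; the Show constraint and the showRecord helper below are additions for that purpose, not part of the original suggestion.]

```haskell
{-# LANGUAGE GADTs #-}

-- A field name tagged with a phantom type for the column's element type.
data Field a = Field { fieldName :: String }

data Value a = Value { value :: a }

-- The schema: a type-level-heterogeneous list of fields.
data RecordScheme where
  NoFields :: RecordScheme
  AddField :: Field a -> RecordScheme -> RecordScheme

-- A record: a list of values of possibly different types.
-- The Show constraint (an addition) lets us recover something
-- from the existentially hidden value.
data Record where
  EndOfRecord :: Record
  Populate    :: Show a => Value a -> Record -> Record

type Table = [Record]

-- Render each value in a record, whatever its type.
showRecord :: Record -> [String]
showRecord EndOfRecord       = []
showRecord (Populate v rest) = show (value v) : showRecord rest

-- A record mixing a String and a Double, as in the grids discussed.
example :: Record
example = Populate (Value "fuji")
            (Populate (Value (1.99 :: Double)) EndOfRecord)
```

The design trade-off: the GADT makes ill-typed records unrepresentable, but every operation on stored values must be routed through constraints (or a closed universe of types like the original Cell sum) since the concrete type is hidden.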


Thanks! I think GADTs may work nicely for this project, so I'm going to start building it out.

On Fri, May 27, 2011 at 4:16 PM, Alexander Solla wrote:
participants (4)

- Alexander Solla
- Eric Rasmussen
- Stephen Tetley
- Tillmann Rendel