Sparse records/ADTs - Haskell-Cafe - Haskell.org

newer
Overload function application?...

Sparse records/ADTs

older
Type-directed functions with data...

Jon Fairbairn

24 Oct 2012 24 Oct '12

6:08 a.m.

Is there a convenient way of handling a data structure with lots of fields of different types that may or may not be filled in? Something equivalent to data D = D {a::Maybe A, b::Maybe B, c::Maybe C, …} but with better space efficiency and a more convenient empty object. An easy alternative is data E = Ea A | Eb B | Ec C | … type R = [E] which has a straightforward empty object, but one then must define getA e = listToMaybe [a | Ea a <- e] for each field, which is tedious (and O(n)). Obviously Templates would help, but is there an alternative I’ve missed? -- Jón Fairbairn Jon.Fairbairn@cl.cam.ac.uk

Reply

Sign in to reply online Use email software

Show replies by date

Roman Cheplyaka

24 Oct 24 Oct

6:55 a.m.

* Jon Fairbairn [2012-10-24 11:08:29+0100]

Is there a convenient way of handling a data structure with lots of fields of different types that may or may not be filled in?

Something equivalent to

data D = D {a::Maybe A, b::Maybe B, c::Maybe C, …}

but with better space efficiency and a more convenient empty object.

An easy alternative is

data E = Ea A | Eb B | Ec C | … type R = [E]

which has a straightforward empty object, but one then must define

getA e = listToMaybe [a | Ea a <- e]

for each field, which is tedious (and O(n)). Obviously Templates would help, but is there an alternative I’ve missed?

For runtime efficiency it's better to use Data.Map. For keys of this map you have two alternatives: either define data Key = Ka | Kb | Kc | ... or, to prevent this duplication at the cost of less convenient notation, you can do something like -- Identity combinator; or use one from Control.Monad.Identity newtype I a = I a -- Constant combinator newtype K a b = K a deriving (Eq, Ord) data E c = Ea (c A) | Eb (c B) deriving instance Eq (E (K ())) deriving instance Ord (E (K ())) type R = Map.Map (E (K ())) (E I) When you set a value, you infer the key from the value. (This inference needs to be written manually or generated.) When you get a value by the key, you check that the returned constructor is what you expect and throw an error otherwise (but the latter should never happen if you maintain the invariant). Roman

Reply

Sign in to reply online Use email software

Roman Cheplyaka

7:06 a.m.

* Roman Cheplyaka [2012-10-24 13:55:17+0300]

* Jon Fairbairn [2012-10-24 11:08:29+0100]

...
Is there a convenient way of handling a data structure with lots of fields of different types that may or may not be filled in?

Something equivalent to

data D = D {a::Maybe A, b::Maybe B, c::Maybe C, …}

but with better space efficiency and a more convenient empty object.

An easy alternative is

data E = Ea A | Eb B | Ec C | … type R = [E]

which has a straightforward empty object, but one then must define

getA e = listToMaybe [a | Ea a <- e]

for each field, which is tedious (and O(n)). Obviously Templates would help, but is there an alternative I’ve missed?

For runtime efficiency it's better to use Data.Map.

Actually, you can use Data.IntMap for even better performance, if you define an Enum instance for your keys. Roman

Reply

Sign in to reply online Use email software

Daniel Trstenjak

7:02 a.m.

Hi Jon On Wed, Oct 24, 2012 at 11:08:29AM +0100, Jon Fairbairn wrote:

for each field, which is tedious (and O(n)). Obviously Templates would help, but is there an alternative I’ve missed?

perhaps something like: data Type = Ta | Tb | Tc ... data E = Ea A | Eb B | Ec C | ... type D = HashMap Type E get :: Type -> D -> Maybe E get = lookup Greetings, Daniel

Reply

Sign in to reply online Use email software

Twan van Laarhoven

11:10 a.m.

On 24/10/12 12:08, Jon Fairbairn wrote:

Is there a convenient way of handling a data structure with lots of fields of different types that may or may not be filled in?

Not sure about convenience, but here is a type safe solution with O(log n) lookups and updates. The idea is to define a GADT tree type with a fixed layout: -- define the structure type MyT = TBranch (TLeaf A) (TBranch (TLeaf B) (TLeaf C)) -- a value level tree that uses that structure type My = GTree MyT You still have to define the paths to the members pa = GL GH pb = GR (GL GH) pc = GR (GR GH) But once you have that you can perform lookups and updates: *Main> glookup pc (gupdate pa (Just A) (gupdate pc (Just C) gempty)) Just C It shouldn't be too hard to make a template haskell function that generates these paths. Or perhaps the corresponding lenses. Twan

Reply

Sign in to reply online Use email software

Jon Fairbairn

26 Oct 26 Oct

11:17 a.m.

Twan van Laarhoven writes:

On 24/10/12 12:08, Jon Fairbairn wrote:

...
Is there a convenient way of handling a data structure with lots of fields of different types that may or may not be filled in?

Not sure about convenience, but here is a type safe solution with O(log n) lookups and updates. The idea is to define a GADT tree type with a fixed layout:

Thanks for your reply (and for all the others). Since type safe is something that (for me) goes without saying, this is the best solution, but it doesn’t really satisfy the convenience aspect. (I had already looked at solutions using Map and contemplated a tree structure, but didn’t like anything I had come up with). In short, it looks like the answer to my question is “No.” :-/ -- Jón Fairbairn Jon.Fairbairn@cl.cam.ac.uk

Reply

Sign in to reply online Use email software

Yuri de Wit

11:25 a.m.

Would this be relevant? https://github.com/jonsterling/Data.Records On Fri, Oct 26, 2012 at 11:17 AM, Jon Fairbairn wrote:

Twan van Laarhoven writes:

...
On 24/10/12 12:08, Jon Fairbairn wrote:

...
Is there a convenient way of handling a data structure with lots of fields of different types that may or may not be filled in?

Not sure about convenience, but here is a type safe solution with O(log n) lookups and updates. The idea is to define a GADT tree type with a fixed layout:

Thanks for your reply (and for all the others). Since type safe is something that (for me) goes without saying, this is the best solution, but it doesn’t really satisfy the convenience aspect. (I had already looked at solutions using Map and contemplated a tree structure, but didn’t like anything I had come up with). In short, it looks like the answer to my question is “No.” :-/

-- Jón Fairbairn Jon.Fairbairn@cl.cam.ac.uk

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply

Sign in to reply online Use email software

Jon Fairbairn

27 Oct 27 Oct

6:04 a.m.

Yuri de Wit writes:

Would this be relevant?

https://github.com/jonsterling/Data.Records

That looks promising, thanks. It does use rather a lot of extensions, so it’ll take me a while to understand it. -- Jón Fairbairn Jon.Fairbairn@cl.cam.ac.uk

Reply

Sign in to reply online Use email software

Sjoerd Visscher

6:19 a.m.

Maybe the vault package works for you? http://hackage.haskell.org/package/vault Sjoerd Visscher On Oct 26, 2012, at 5:17 PM, Jon Fairbairn wrote:

Twan van Laarhoven writes:

...
On 24/10/12 12:08, Jon Fairbairn wrote:

...
Is there a convenient way of handling a data structure with lots of fields of different types that may or may not be filled in?

Not sure about convenience, but here is a type safe solution with O(log n) lookups and updates. The idea is to define a GADT tree type with a fixed layout:

Thanks for your reply (and for all the others). Since type safe is something that (for me) goes without saying, this is the best solution, but it doesn’t really satisfy the convenience aspect. (I had already looked at solutions using Map and contemplated a tree structure, but didn’t like anything I had come up with). In short, it looks like the answer to my question is “No.” :-/

-- Jón Fairbairn Jon.Fairbairn@cl.cam.ac.uk

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply

Sign in to reply online Use email software

AntC

24 Oct 24 Oct

7:10 p.m.

Jon Fairbairn writes:

Is there a convenient way of handling a data structure with lots of fields of different types that may or may not be filled in?

Hi Jon, if your question had appeared in a database forum, the answer would be ... Sounds like you need to do normal-form analysis: - what are the functional dependencies between the fields? - is your data structure something like a 'universal relation'? - is the structure better expressed as several records? (vertically partitioned) Relevant treatments could be "How to Handle missing information without using NULL" http://www.dcs.warwick.ac.uk/~hugh/TTM/Missing-info-without-nulls.pdf I appreciate that SQL's NULL is not the same as Haskell's Maybe/Nothing (and SQL's semantics for NULL are utterly horrible). Nevertheless, if you're concerned to avoid the wasted space of sparse records, it could be a helpful discipline to design data structures without needing Maybe's. AntC

Reply

Sign in to reply online Use email software

David Thomas

8:30 p.m.

You've got a bunch of great answers, if there's no rhyme or reason to which fields are missing. If, on the other hand, they will tend to be present or absent in groups, you could decompose your data-structure a bit, for fast lookups, good space efficiency, and maybe even slightly more interesting checks from the type system. On Wed, Oct 24, 2012 at 3:08 AM, Jon Fairbairn wrote:

Is there a convenient way of handling a data structure with lots of fields of different types that may or may not be filled in?

Something equivalent to

data D = D {a::Maybe A, b::Maybe B, c::Maybe C, …}

but with better space efficiency and a more convenient empty object.

An easy alternative is

data E = Ea A | Eb B | Ec C | … type R = [E]

which has a straightforward empty object, but one then must define

getA e = listToMaybe [a | Ea a <- e]

for each field, which is tedious (and O(n)). Obviously Templates would help, but is there an alternative I’ve missed?

-- Jón Fairbairn Jon.Fairbairn@cl.cam.ac.uk

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply

Sign in to reply online Use email software

4589

Age (days ago)

4592

Last active (days ago)

Download

10 comments

8 participants

tags

participants (8)

AntC
Daniel Trstenjak
David Thomas
Jon Fairbairn
Roman Cheplyaka
Sjoerd Visscher
Twan van Laarhoven
Yuri de Wit