How to correctly define data types?

Hello (first post), I have spent so many time learning typeclasses and thinking it was part of Haskell essence... and suddenly I read the discussion "[Haskell-beginners] Can fields in a record be optional?". So typeclasses are not always recomended? I'm implementing a Machine Learning framework and I am in a sort of related dilemma. I found three ways of implementing the same distance function between "examples" (aka "attribute vectors" or simply "Float vectors" for mere mortals :) ): [obs: "Example" datatype will be added more fields later] --------------first------------------------------------ module ML where data Example = Example [Float] deriving (Show) class ExampleClass a where (distance) :: a → a → Float instance ExampleClass Example where (Example atts1) distance (Example atts2) = sqrt $ sum $ map (λ(x, y) → (x-y)↑2) $ zip atts1 atts2 ================================= --------------second------------------------------------ module ML where data Example = Example {attributes :: [Float]} deriving (Show) distance :: Example → Example → Float distance ex1 ex2 = sqrt $ sum $ map (λ(x, y) → (x-y)↑2) $ zip (attributes ex1) (attributes ex2) ================================= --------------third------------------------------------ module ML where data Example = Example [Float] deriving (Show) distance :: Example → Example → Float distance (Example att1) (Example att2) = sqrt $ sum $ map (λ(x, y) → (x-y)↑2) $ zip (att1) (att2) ================================= All three reserves the word "distance" for itself and the second reserves also the word "attributes". How could I implement the module ML and which would be the best way to set "attributes" outside the module? Thanks Davi

On Mon, Jul 18, 2011 at 02:26:33PM -0300, Davi Santos wrote:
Hello (first post), I have spent so many time learning typeclasses and thinking it was part of Haskell essence... and suddenly I read the discussion "[Haskell-beginners] Can fields in a record be optional?". So typeclasses are not always recomended?
I'm implementing a Machine Learning framework and I am in a sort of related dilemma.
I found three ways of implementing the same distance function between "examples" (aka "attribute vectors" or simply "Float vectors" for mere mortals :) ):
[obs: "Example" datatype will be added more fields later]
--------------first------------------------------------ module ML where
data Example = Example [Float] deriving (Show)
class ExampleClass a where (distance) :: a → a → Float
instance ExampleClass Example where (Example atts1) distance (Example atts2) = sqrt $ sum $ map (λ(x, y) → (x-y)↑2) $ zip atts1 atts2 =================================
This is an unnecessary use of type classes, unless you plan to make additional instances of ExampleClass later.
--------------second------------------------------------ module ML where
data Example = Example {attributes :: [Float]} deriving (Show)
distance :: Example → Example → Float distance ex1 ex2 = sqrt $ sum $ map (λ(x, y) → (x-y)↑2) $ zip (attributes ex1) (attributes ex2) =================================
--------------third------------------------------------ module ML where
data Example = Example [Float] deriving (Show)
distance :: Example → Example → Float distance (Example att1) (Example att2) = sqrt $ sum $ map (λ(x, y) → (x-y)↑2) $ zip (att1) (att2) =================================
I'd say these are about the same style-wise, it's a matter of preference. But if more fields will be added to Example later, using record labels may be a good idea.
All three reserves the word "distance" for itself and the second reserves also the word "attributes". How could I implement the module ML and which would be the best way to set "attributes" outside the module?
I'm not sure what you mean by "set 'attributes' outside the module", can you clarify? -Brent

Brent, thanks for answering. Sorry, by "set" I meant "access" [the field "attributes"]. But you already answered, it depends on the ML module implementation choice. One more question about elegance: How to make Examples "sortable" by distance to a reference point "referenceEx"? For instance, if referenceEx was not needed: ----------------------------- listExs = [Example {att=[2,5,3]}, Example {att=[0,2,3]}, Example {att=[2,0,9]}] sortedExamples = sort listExs ================== With referenceEx, I tried this, but it is getting ugly: ------------------------------ sortedExamples = sort $ map (ExamplesPair referenceEx) listExs ================ ------ML.hs------------------------------------------ data ExamplesPairClass = ExamplesPair {ex1 :: Example, ex2 :: Example} deriving (Show, Eq) instance Ord ExamplesPairClass where (ExamplesPair ex1 referenceEx) ≤ (ExamplesPair ex2 referenceEx) = (ex1 distance referenceEx) ≤ (ex2 distance referenceEx) ================================== Sorry that amount of code, it is terrible to read with no colorizing. Davi

On Tue, Jul 19, 2011 at 01:46:19PM -0300, Davi Santos wrote:
Brent, thanks for answering. Sorry, by "set" I meant "access" [the field "attributes"]. But you already answered, it depends on the ML module implementation choice.
One more question about elegance: How to make Examples "sortable" by distance to a reference point "referenceEx"?
To sort by distance to a given reference point you could use something like import Data.List (sortBy) import Data.Ord (comparing) ... sortedExamples = sortBy (comparing (distance referenceEx)) listExs -Brent

Ok, it seems a better. thanks Davi To sort by distance to a given reference point you could use something like import Data.List (sortBy) import Data.Ord (comparing) ... sortedExamples = sortBy (comparing (distance referenceEx)) listExs
participants (2)
-
Brent Yorgey
-
Davi Santos