New subject: RFC: Handling marginally bad data via logging "soft-errors"

12 Apr 2017

Cross-posted from
https://www.reddit.com/r/haskell/comments/64ymup/rfc_handling_marginally_bad...

Hi Everyone,

I would like to hear some perspectives from people who have more
experience in handling production code in Haskell.

SCENARIO: You're loading data from a DB (or any other external source) and
are expecting certain fields to be present. You end up loading a record/row
which has all the strictly necessary fields, but a not-so-necessary field
is missing. Example: loading a full name, stored as the following HSTORE in
PG:

    {'title' => '...', 'first_name' => '...', 'last_name' => '...'}

For whatever legacy reasons, there is a possibility that some records may
not have the title field, or may not have the last_name field. Now, the
skies are not going to fall if you display the name with some components
missing, so, you don't want to throw an error and terminate the entire
operation. It's a "soft error" which you want to log, and proceed with a
default value.

OPTION 1: Use unsafePerformIO to log, so that an otherwise pure function
does not become "impure" in its type, i.e.

    -- uses unsafePerformIO to log soft-errors
    parseName :: HStore -> FullName

OPTION 2: Put the data transformation function in IO or the app-wide monad
AppM because it will call the logging function, i.e.

    parseName :: HStore -> AppM FullName
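
To make Option 2 concrete, here is a minimal sketch using only base. Everything beyond the parseName signature is an assumption made for illustration: HStore is modelled as an association list, AppM is just an alias for IO, and logSoftError and fieldOr are hypothetical helpers. It also falls back to a default for every field, including first_name, purely to keep the sketch short.

```haskell
import System.IO (hPutStrLn, stderr)

-- Hypothetical stand-ins: the post does not define these types.
type HStore = [(String, String)]
type AppM   = IO  -- a real AppM would likely be a transformer stack

data FullName = FullName
  { title     :: String
  , firstName :: String
  , lastName  :: String
  } deriving (Eq, Show)

-- Hypothetical logger: writes the soft-error to stderr.
logSoftError :: String -> AppM ()
logSoftError msg = hPutStrLn stderr ("soft-error: " ++ msg)

-- Look up a key; if it is missing, log and fall back to the default.
fieldOr :: String -> String -> HStore -> AppM String
fieldOr def key hs = case lookup key hs of
  Just v  -> pure v
  Nothing -> do
    logSoftError ("missing field: " ++ key)
    pure def

-- The parser now lives in AppM only because it needs to log.
parseName :: HStore -> AppM FullName
parseName hs =
  FullName <$> fieldOr "" "title" hs
           <*> fieldOr "" "first_name" hs
           <*> fieldOr "" "last_name" hs
```

Note that fieldOr only knows the key that was missing, not which row it came from, unless the caller passes that down explicitly.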

OPTION 3: Use a wrapper over Writer monad, called ErrorCollector, that
forces the calling function to log while accessing the underlying value,
i.e.

    parseName :: HStore -> ErrorCollector FullName
    logErrorAndExtract extraLoggingContext (parseName hstoreVal)
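
A rough sketch of what Option 3 could look like, using only base. This is one possible reading of the idea, not a finished design: the Writer is hand-rolled as a pair, HStore is modelled as an association list, and softError, fieldOr, runCollector and the logger argument to logErrorAndExtract are all hypothetical names. As a simplification, every field (including first_name) falls back to a default.

```haskell
-- Hypothetical stand-ins: the post does not define these types.
type HStore = [(String, String)]

data FullName = FullName
  { title     :: String
  , firstName :: String
  , lastName  :: String
  } deriving (Eq, Show)

-- Writer-like pair: accumulated soft-errors plus the parsed value.
-- In a real module the constructor would not be exported, so callers
-- could only reach the value through logErrorAndExtract.
newtype ErrorCollector a = ErrorCollector ([String], a)

instance Functor ErrorCollector where
  fmap f (ErrorCollector (es, a)) = ErrorCollector (es, f a)

instance Applicative ErrorCollector where
  pure a = ErrorCollector ([], a)
  ErrorCollector (es1, f) <*> ErrorCollector (es2, a) =
    ErrorCollector (es1 ++ es2, f a)

-- Record a soft-error without aborting the computation.
softError :: String -> ErrorCollector ()
softError msg = ErrorCollector ([msg], ())

-- Look up a key; if it is missing, record a soft-error and use the default.
fieldOr :: String -> String -> HStore -> ErrorCollector String
fieldOr def key hs = case lookup key hs of
  Just v  -> pure v
  Nothing -> def <$ softError ("missing field: " ++ key)

parseName :: HStore -> ErrorCollector FullName
parseName hs =
  FullName <$> fieldOr "" "title" hs
           <*> fieldOr "" "first_name" hs
           <*> fieldOr "" "last_name" hs

-- Pure escape hatch, included here only so the sketch can be inspected.
runCollector :: ErrorCollector a -> ([String], a)
runCollector (ErrorCollector p) = p

-- The caller supplies a logger and its own context (e.g. a primary key);
-- every collected soft-error is logged with that context prepended.
logErrorAndExtract :: (String -> IO ()) -> String -> ErrorCollector a -> IO a
logErrorAndExtract logger ctx (ErrorCollector (es, a)) = do
  mapM_ (\e -> logger (ctx ++ ": " ++ e)) es
  pure a
```

A caller that knows the row's primary key might then write something like `logErrorAndExtract putStrLn "customer id=42" (parseName row)`, which is where the extra context comes from while parseName itself stays pure.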

THOUGHTS:

-- The downside of Option 1 is that (a) it's unsafe, and (b) the log won't
have surrounding context, for example it won't have the primary key of the
row that was being loaded within which this error occurred.
-- The downside of Option 2 is that it introduces AppM into the function
signature, and even that log won't have enough surrounding context to help
in debugging.
-- The downside of Option 3 is that it pushes some logging responsibility
to the calling functions, but gives the ability to put extra context when a
soft-error occurs.

How do others deal with this problem?

-- Saurabh.
