Re: [Haskell-cafe] [ANN] Laborantin: experimentation framework

31 Dec 2013

Hi Adam, thanks for your input.

2013/12/31 adam vogt 
...
Hello Lucas,
Am I correct to say that laborantin only does full factorial
experiments? Perhaps there is a straightforward way for users to
specify which model parameters should be confounded in a fractional
factorial design. Another extension would be to move towards
sequential designs, where the trials to run depend on the results so
far. Then more time is spent on the "interesting" regions of the
parameter space.
Actually, the parameters specified in the DSL are "indicative" values for a
full-factorial default.
At this point, a command-line handler is responsible for exploring the
parameter space and executing scenarios. This command-line handler has a
way to specify fractional factorial designs by evaluating a query like:
"(@sc.param 'foo' > @sc.param 'bar') and @sc.param 'baz' in [1,2,3,'toto']".
This small query language was my first attempt at "expression parsing and
evaluation" and the code might be ugly, but it works and fits most of my
current needs. Bonus: with this design, the algorithm to "explore" the
satisfiable parameter space is easy to express.
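
To make the exploration idea concrete, here is a minimal sketch, not Laborantin's actual code. It assumes a hypothetical `Value` type and represents the query above as a plain Haskell predicate instead of a parsed expression; `fullFactorial`, `wanted`, `explore`, and `exampleSpace` are names made up for this illustration.

```haskell
import qualified Data.Map as M

-- Hypothetical value type: a parameter is either a number or a string.
data Value = Num Double | Str String
  deriving (Eq, Ord, Show)

-- One candidate scenario: a binding of parameter names to values.
type Point = M.Map String Value

-- Full-factorial expansion: every combination of the indicative values.
-- sequenceA over the list applicative yields the cartesian product.
fullFactorial :: M.Map String [Value] -> [Point]
fullFactorial = sequenceA

-- The example query, hand-translated into a predicate:
-- (foo > bar) and baz in [1,2,3,'toto']
wanted :: Point -> Bool
wanted p = fooGtBar && bazOk
  where
    fooGtBar = case (M.lookup "foo" p, M.lookup "bar" p) of
      (Just (Num f), Just (Num b)) -> f > b
      _                            -> False
    bazOk = maybe False (`elem` allowed) (M.lookup "baz" p)
    allowed = [Num 1, Num 2, Num 3, Str "toto"]

-- Exploring the satisfiable space is then a filter over the expansion.
explore :: (Point -> Bool) -> M.Map String [Value] -> [Point]
explore keep = filter keep . fullFactorial

-- A small made-up parameter space for experimenting in GHCi.
exampleSpace :: M.Map String [Value]
exampleSpace = M.fromList
  [ ("foo", [Num 1, Num 2])
  , ("bar", [Num 1])
  , ("baz", [Num 1, Str "other"])
  ]
```

In GHCi, `explore wanted exampleSpace` keeps only the points that satisfy the predicate. Enumerating first and filtering afterwards is the simplest possible strategy; it is shown only because it is easy to read, not because it scales to large spaces.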

One direction to enrich this small query language would be to express that
a parameter takes a continuous value in a range or should fulfill a
boolean test function. Then we could use techniques such as rapidly
exploring random trees to explore "exotic feasibility regions".

Another direction to improve the query language is to require
ScenarioDescriptions to have a sort of "cost/fitness function" so that we
can later build a parameter-space explorer that performs an optimization.
We could even extend the query language to bind a parameter to a value
that optimizes another experiment.
...
I think getVar/param could be re-worked to give errors at compile
time. Now you get a runtime error if you typo a parameter or get the
type wrong. Another mistake is to include parameters in the experiment
that do not have any effect on the `run` action, unless those
parameters are there for doing replicates.
Those might be addressed by doing something like:
    a <- parameter "destination" $ do ...
    run $ print =<< param a
Where the types are something like:
    param :: Data.Tagged.Tagged a Text -> M a
    values :: [T a] -> M (Tagged a Text)
    str :: Text -> T Text
    num :: Double -> T Double
with M being whatever state monad you currently use, and param does
the same thing it always has, except now it knows which type you put
in the values list, and it cannot be called with any string. The third
requirement might be met by requiring -fwarn-unused-matches.
That's one thing I am torn about. From my experience, it is sometimes
handy to branch on whether a value is a number or a string (e.g., to say
things like 1, 2, 3, or "all"). Somehow, tagged values do not prevent this
either. Similarly, I don't know whether I should let users specify any type
for their ParameterDescription at the cost of writing
serializers/deserializers boilerplate (although we could provide some
default useful types, as is the case now).
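
For illustration, the typed-handle idea and the "number or string" concern can coexist in a small sketch. Everything below is an assumption made for the example and is not Laborantin's API: `Param` is a hand-rolled phantom-typed handle (standing in for `Data.Tagged`), `String` replaces `Text` to stay within base and containers, and `IsValue`, `declare`, and `Env` are hypothetical names.

```haskell
import qualified Data.Map as M

-- Hypothetical untyped storage format for parameter values.
data Value = Num Double | Str String
  deriving (Eq, Show)

-- A handle carrying the parameter name and a phantom type 'a'
-- that records which Haskell type the values were declared with.
newtype Param a = Param String

-- Types that can be stored as a Value and read back.
class IsValue a where
  toValue   :: a -> Value
  fromValue :: Value -> Maybe a

instance IsValue Double where
  toValue = Num
  fromValue (Num d) = Just d
  fromValue _       = Nothing

-- Declaring a parameter at type Value keeps the freedom to branch on
-- "number or string" (e.g. 1, 2, 3, or "all").
instance IsValue Value where
  toValue   = id
  fromValue = Just

type Env = M.Map String Value

-- Declare a parameter: returns the typed handle and the stored values.
declare :: IsValue a => String -> [a] -> (Param a, (String, [Value]))
declare name vs = (Param name, (name, map toValue vs))

-- Look a parameter up through its handle. The result type is fixed by
-- the handle, so asking for the wrong type is a compile-time error.
param :: IsValue a => Param a -> Env -> Maybe a
param (Param name) env = M.lookup name env >>= fromValue
```

With `(size, _) = declare "size" [1, 2 :: Double]`, the expression `param size env` can only be used as a `Maybe Double`. A parameter declared with `[Num 1, Str "all"]` yields a `Param Value` whose result can still be pattern-matched on `Num` versus `Str`. The lookup stays partial because the environment itself remains untyped in this sketch.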

An alternative strategy is to change your type Step into an algebraic
data type with a function to convert it into what it is currently.
Before the experiment happens, you can have a function go through that
data to make sure it will succeed with its getVar/param. This is
called a deep embedding:
http://www.haskell.org/haskellwiki/Embedded_domain_specific_language.
That is an idea. I haven't gone that far yet, but I'll keep an eye on it.
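
As a rough sketch of what such a deep embedding could look like (the constructors below are invented for the example and do not match Laborantin's real Step type): the experiment is first built as plain data, a checker walks that data before anything runs, and an interpreter turns it back into an action.

```haskell
import qualified Data.Map as M
import qualified Data.Set as S

-- Hypothetical instruction set: a Step is data, not an opaque action.
data Step
  = GetParam String   -- read a declared parameter by name
  | Log String        -- emit a message
  | Seq [Step]        -- run steps in order

-- Collect every parameter name the step tree refers to.
usedParams :: Step -> S.Set String
usedParams (GetParam n) = S.singleton n
usedParams (Log _)      = S.empty
usedParams (Seq steps)  = S.unions (map usedParams steps)

-- Static check run before the experiment:
-- (used but never declared, declared but never used).
check :: S.Set String -> Step -> (S.Set String, S.Set String)
check declared step =
  (used `S.difference` declared, declared `S.difference` used)
  where
    used = usedParams step

-- Interpreter: converts the data back into an executable action.
interpret :: M.Map String String -> Step -> IO ()
interpret env (GetParam n) =
  putStrLn (n ++ " = " ++ M.findWithDefault "<missing>" n env)
interpret _   (Log msg)    = putStrLn msg
interpret env (Seq steps)  = mapM_ (interpret env) steps
```

Here `check` reports both a mistyped name and a declared parameter that has no effect on the run, which are the two mistakes mentioned earlier in the thread. The price is that every operation a step may perform must become a constructor.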

Best wishes for this happy new year,
--Lucas

Regards,
...
Adam
On Mon, Dec 23, 2013 at 4:27 AM, lucas di cioccio wrote:
...
Dear all,
I am happy to announce Laborantin. Laborantin is a Haskell library and
DSL for running and analyzing controlled experiments.
Repository: https://github.com/lucasdicioccio/laborantin-hs
Hackage page: http://hackage.haskell.org/package/laborantin-hs
Laborantin's opinion is that running proper experiments is a non-trivial
and often overlooked problem. Therefore, we should provide good tools to
assist experimenters. The hope is that, with Laborantin, experimenters
will spend more time on their core problem while racing through the
menial tasks of editing scripts because one data point is missing in a
plot. At the same time, Laborantin is also an effort within the broad
open-science movement. Indeed, Laborantin's DSL separates boilerplate
from the actual experiment implementation. Thus, Laborantin could reduce
the friction for code and data reuse.
One family of experiments that fits Laborantin well is benchmarks with
tedious setup and teardown procedures (for instance starting,
configuring, and stopping remote machines). Analyses that require
measurements from a variety of data points in a multi-dimensional
parameter space also fall in the scope of Laborantin.
When using Laborantin, the experimenter:
* Can express experimental scenarios using a readable and familiar DSL.
  This feature, albeit subjective, was confirmed by non-Haskeller
  colleagues.
* Saves time on boilerplate such as writing command-line parsers or
  encoding dependencies between experiments and analysis results in a
  Makefile.
* Benefits from auto-documentation and result introspection features when
  one comes back to a project, possibly months or weeks later.
* Harnesses the power of Haskell's type system to catch common errors at
  compile time.
If you had to read one story to understand the pain points that
Laborantin tries to address, it should be Section 5 of "Strategies for
Sound Internet Measurement" (V. Paxson, IMC 2004).
I'd be glad to take questions and comments (or, even better, code reviews
and pull requests).
Kind regards,
--Lucas DiCioccio (@lucasdicioccio on GitHub/Twitter)
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe