Hi Adam, thanks for your input.

2013/12/31 adam vogt <vogt.adam@gmail.com>
Hello Lucas,

Am I correct to say that laborantin only does full factorial
experiments? Perhaps there is a straightforward way for users to
specify which model parameters should be confounded in a fractional
factorial design. Another extension would be to move towards
sequential designs, where the trials to run depend on the results so
far. Then more time is spent on the "interesting" regions of the
parameter space.

Actually, the parameters specified in the DSL are "indicative" values for a full-factorial default.
At this point, a command-line handler is responsible for exploring the parameter space and executing scenarios. This handler lets you specify fractional factorial designs by evaluating a query like: "(@sc.param 'foo' > @sc.param 'bar') and @sc.param 'baz' in [1,2,3,'toto']" .
This small query language was my first attempt at "expression parsing and evaluation"; the code might be ugly, but it works and fits most of my current needs. Bonus: with this design, the algorithm to "explore" the satisfiable parameter space is easy to express.
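To make the semantics concrete, here is a minimal, self-contained sketch of what "filter a full-factorial space with a query predicate" means. The names (PVal, Axis, explore) and the representation are illustrative stand-ins, not Laborantin's actual internals:

```haskell
-- A simplified parameter value; the real library supports more types.
data PVal = N Int | S String deriving (Eq, Ord, Show)

-- One axis of the parameter space: a name and its candidate values.
type Axis = (String, [PVal])

-- Full-factorial expansion: the Cartesian product of all axes.
fullFactorial :: [Axis] -> [[(String, PVal)]]
fullFactorial =
  foldr (\(name, vs) acc -> [(name, v) : rest | v <- vs, rest <- acc]) [[]]

-- A "query" is a predicate over one assignment of parameters; the real
-- DSL parses it from a string, which is elided here.
type Query = [(String, PVal)] -> Bool

-- Fractional design: keep only the points satisfying the query.
explore :: Query -> [Axis] -> [[(String, PVal)]]
explore q = filter q . fullFactorial
```

In this model, a query such as "@sc.param 'foo' > @sc.param 'bar'" would be evaluated as `\env -> lookup "foo" env > lookup "bar" env`.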

One direction to enrich this small query language would be to express that a parameter takes a continuous value in a range or should fulfill a boolean test function. Then we could use techniques such as rapidly exploring random trees to explore "exotic feasibility regions".
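A rough sketch of what such a parameter description could look like (the names are invented for illustration, and the naive grid probe is only a stand-in for smarter strategies like rapidly exploring random trees):

```haskell
-- A parameter given by a range plus an arbitrary feasibility test,
-- instead of an explicit list of values.
data Continuous = Continuous
  { lowerBound :: Double
  , upperBound :: Double
  , feasible   :: Double -> Bool
  }

-- Naive exploration: probe n+1 evenly spaced candidates in the range
-- and keep the feasible ones.
probe :: Int -> Continuous -> [Double]
probe n (Continuous a b ok) =
  filter ok [a + fromIntegral i * (b - a) / fromIntegral n | i <- [0 .. n]]
```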

Another direction to improve the query language is to require ScenarioDescriptions to have a sort of "cost/fitness function" so that we can later build a parameter-space explorer that performs an optimization. We could even extend the query language to bind a parameter to a value which optimizes another experiment.
 
I think getVar/param could be re-worked to give errors at compile
time. Now you get a runtime error if you typo a parameter or get the
type wrong. Another mistake is to include parameters in the experiment
that do not have any effect on the `run` action, unless those
parameters are there for doing replicates.

Those might be addressed by doing something like:

    a <- parameter "destination" $ do ...
    run $ print =<< param a

Where the types are something like:

    param :: Data.Tagged.Tagged a Text -> M a
    values :: [T a] -> M (Tagged a Text)
    str :: Text -> T Text
    num :: Double -> T Double

with M being whatever state monad you currently use, and param does
the same thing it always has, except now it knows which type you put
in the values list, and it cannot be called with any string. The third
requirement might be met by requiring -fwarn-unused-matches.
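A stripped-down model of the typed-handle idea (not Laborantin's API, and simpler than the Tagged/M sketch above — `Param`, `values`, and `param` here are illustrative stand-ins):

```haskell
-- The handle carries the parameter's name together with the type of
-- its values, so `param` applied at the wrong type is rejected at
-- compile time, and a typo'd handle is an unbound-variable error.
data Param a = Param { paramName :: String, paramValues :: [a] }

-- Declare a parameter and get a typed handle back.
values :: String -> [a] -> Param a
values = Param

-- Read the parameter's value for one point of the design; here the
-- "execution environment" is just an index into the declared values.
param :: Param a -> Int -> a
param p i = paramValues p !! i
```

With `destination = values "destination" ["here", "there"]`, the expression `param destination 0` has type String, so applying a numeric function to it no longer type-checks.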

That's one thing I am torn about. From my experience, it is sometimes handy to branch on whether a value is a number or a string (e.g., to say things like 1, 2, 3, or "all"), and tagged values do not prevent this either. Similarly, I don't know whether I should let users specify any type for their ParameterDescription at the cost of writing serializer/deserializer boilerplate (although we could provide some useful default types, as is the case now).
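For what it's worth, one way to keep that flexibility under a typed scheme is to make the union explicit as an ADT and branch by pattern match (a sketch with invented names):

```haskell
-- The "1, 2, 3, or \"all\"" idiom as an explicit sum type.
data Count = Exactly Int | All deriving (Eq, Show)

-- Example consumer: how many repetitions of a measurement to keep.
keepRuns :: Count -> [a] -> [a]
keepRuns All         xs = xs
keepRuns (Exactly n) xs = take n xs
```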

An alternative strategy is to change your type Step, into an algebraic
data type with a function to convert it into what it is currently.
Before the experiment happens, you can have a function go through that
data to make sure it will succeed with its getVar/param. This is
called a deep embedding:
<http://www.haskell.org/haskellwiki/Embedded_domain_specific_language>.

That could be an idea. I didn't go that far yet, but I'll keep an eye on it.
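As a note to self, a rough sketch of what that deep embedding could look like (constructor names invented for illustration, not Laborantin's actual Step type):

```haskell
-- Steps are plain data first, so a checker can walk them before
-- anything runs.
data Step
  = GetParam String   -- read a declared parameter by name
  | Print String
  | Seq Step Step

-- Statically collect every parameter a scenario will read ...
usedParams :: Step -> [String]
usedParams (GetParam n) = [n]
usedParams (Print _)    = []
usedParams (Seq a b)    = usedParams a ++ usedParams b

-- ... and report the ones that were never declared, before running.
missingParams :: [String] -> Step -> [String]
missingParams declared s = [n | n <- usedParams s, n `notElem` declared]
```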

Best wishes for this happy new year,
--Lucas

Regards,
Adam

On Mon, Dec 23, 2013 at 4:27 AM, lucas di cioccio
<lucas.dicioccio@gmail.com> wrote:
> Dear all,
>
> I am happy to announce Laborantin. Laborantin is a Haskell library and DSL
> for
> running and analyzing controlled experiments.
>
> Repository: https://github.com/lucasdicioccio/laborantin-hs
> Hackage page: http://hackage.haskell.org/package/laborantin-hs
>
> Laborantin's opinion is that running proper experiments is a non-trivial and
> often overlooked problem. Therefore, we should provide good tools to assist
> experimenters. The hope is that, with Laborantin, experimenters will spend
> more
> time on their core problem while racing through the menial tasks of editing
> scripts because one data point is missing in a plot. At the same time,
> Laborantin is also an effort within the broad open-science movement. Indeed,
> Laborantin's DSL separates boilerplate from the actual experiment
> implementation. Thus, Laborantin could reduce the friction for code and
> data-reuse.
>
> One family of experiments that fits Laborantin well is benchmarks with
> tedious
> setup and teardown procedures (for instance starting, configuring, and
> stopping
> remote machines). Analyses that require measurements from a variety of data
> points in a multi-dimensional parameter space also fall in the scope of
> Laborantin.
>
> When using Laborantin, the experimenter:
>
> * Can express experimental scenarios using a readable and familiar DSL.
>   This feature, albeit subjective, was confirmed by non-Haskeller
> colleagues.
> * Saves time on boilerplate such as writing command-line parsers or
>   encoding dependencies between experiments and analysis results in a
> Makefile.
> * Benefits from auto-documentation and result introspection features when
> one
>   comes back to a project, possibly months or weeks later.
> * Harnesses the power of Haskell's type system to catch common errors at
> compile time.
>
> If you had to read one story to understand the pain points that Laborantin
> tries to address, it should be Section 5 of "Strategies for Sound Internet
> Measurement" (V. Paxson, IMC 2004).
>
> I'd be glad to take questions and comments (or, even better, code reviews and
> pull requests).
>
> Kind regards,
> --Lucas DiCioccio (@lucasdicioccio on GitHub/Twitter)
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>