
I assume that your proper goal is not to structure pipeline processes in full generality, but to simplify the current one at hand.
No, I'm looking for full generality. ;) I have dozens of these kinds of "quasi-pipelines," all similar in overall appearance, but different in detail.
Ah, the names help a lot and they confirm my uneasy feeling about the quasi-pipeline: I think it's ad-hoc. What I want to say is that I suggest refactoring the pipeline to become expressible as a point-free function concatenation by decoupling data dependencies instead of trying to find a way to express arbitrary interlinked quasi-pipelines. So I'd eliminate the problem by switching to a different one :) Of course, this decoupling needs the concrete names and types, that's why I wanted to know them. So, let's have look on the data dependencies, taking the information from your web-site into account. I'll formulate some guesses and questions, you don't need to comment on them; they're just meant as hints. The point is that its *type* and not so much its name should guide what a functions does. This way, dependencies on unnecessary parameters automatically go away.
process :: Item -> MediaKind -> MediaSize -> Language -> SFO "Item" doesn't tell me anything. Seems to be an XML-File containing the questions and such.
process item mediaKind mediaSize language = let pagemaster = loadPagemaster item mediaKind mediaSize;
Mh, I cannot guess what a "pagemaster" might do, but from its arguments, it looks like the "ink guy" responsible for actual printing (or on-screen display). So he might know about graphics, colors and inches but not about content.
validateQuestionContent :: [Question] -> [Question] questions = stripUndisplayedQuestions mediaKind $
Ok, mediaKind is indispensable because on-screen and print forms are utterly different. Maybe one should write filter willBeDisplayedQuestion $ instead, but I think the name 'stripUndisplayedQuestions' says it all.
appendEndQuestions item pagemaster $
Uh, why do questions depend on pagemaster and thus on mediaSize? Are these some floating questions appearing on every page, like the name of the guy to be questioned? Those should be treated somewhere else. Here, one could also write (++ endquestions ...) $ so that everybody sees what's going on, but again 'appendEnd' is adequate. The dependency on the full item is too much, I think.
coalesceParentedQuestions :: [Question] -> [Question] coalesceParentedQuestions $
This makes me suspicious whether [Question] is the right type. Apparently, data Question = GroupedQuestions [String] or something like that, so that a Question may well be a tree of questions. Guessing that this function resolves some tree structure that got specified by explicitly naming nodes, I'd suggest a type '[QuestionTaggedWithLevel] -> Tree Question' instead. Note that one also has fold and filter on Trees for further processing.
validateQuestionContent :: [Question] -> [Question] validateQuestionContent $
Uh, I think the type is plain wrong. Doesn't the name suggest 'Question -> Bool' and a fatal error when a question content is invalid?
loadQuestions item;
'loadQuestions' is a strange name (too imperative) but that's personal taste and maybe a hint to the fact that 'item' is stored inside a file.
(numberedQuestions,questionCategories) = numberQuestions pagemaster questions;
Yet again the pagemaster. I don't think that mere numbering should depend on mediaSize, not even implicitly. Why must questionCategories be collected? Aren't they inherent in 'Tree Question', so that every Branch has a unique category? Automatic numbering is fine, though.
numberedQuestions' = coalesceNAQuestions numberedQuestions;
Does 'NA' mean not answered? Isn't that either a fatal error or a Maybe-Answer? 'coalesce' makes me suspicious, I could live with a 'filter'.
(bands,sequenceLayouts) = buildLayout mediaKind language numberedQuestions';
Ah, there's no pagemaster, only mediaKind and language, although the pagemaster would be tempting here. I guess that layout builds for 'endless paper' (band).
bands' = resolveCrossReferences bands;
Mh, cross reference for thing x on page y? But there aren't any pages yet. Likely that I just don't know what bands are.
groupedBands = groupBands bands';
(can't guess on that)
pages = paginate item mediaKind mediaSize pagemaster groupedBands;
Now, the dependence on mediaSize is fine. But it's duplicated by pagemaster.
pages' = combineRows pages; sfo = createSFO pages' sequenceLayouts; in sfo
(can't guess on that)
In summary, I think that the dependencies on the pagemaster are not adequate, he mixes too many concerns that should be separated. If he goes, there's much more function concatenation possible. If really necessary, the MediaKind stuff can go into a MonadReader. Btw, IMHO the explicitely named version is much clearer than any diagram with boxes and arrows. At least, you should keep names like 'questions', 'bands' and 'pages'. Only cumbersome derivations like 'groupedBands' or names with additional ticks are really redundant. Regards, apfelmus