Re: [Haskell-cafe] Are explicit exports and local imports desirable in a production application?

Dear Ignat,

have you seen
https://wiki.haskell.org/Import_modules_properly
https://wiki.haskell.org/Qualified_names

I find the arguments convincing. Even in my own packages I sometimes get lost where a certain function was imported from. When neither exports nor imports are done explicitly, you usually have only two choices:
1. search all sources (e.g. with grep -l)
2. rely on the haddock index

Maybe your IDE can do that for you, but you can't expect all downstream users or all your colleagues to do the same.

-- Olaf
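For readers who want the two import styles side by side, here is a minimal sketch using only `base`. The module name `Vocabulary` and the functions `lookupOr` and `vocabulary` are made up for illustration; they do not come from the thread or from the wiki pages.

```haskell
-- Hypothetical module, for illustration only.
module Vocabulary (lookupOr, vocabulary) where

-- Explicit import list: the import line itself says where fromMaybe comes from.
import Data.Maybe (fromMaybe)
-- Qualified imports: every use site names the originating module.
import qualified Data.Char as Char
import qualified Data.List as List

-- | Look up a key in an association list, falling back to a default.
lookupOr :: Eq k => v -> k -> [(k, v)] -> v
lookupOr def key = fromMaybe def . lookup key

-- | Distinct words of a text, lower-cased and sorted.
vocabulary :: String -> [String]
vocabulary = List.sort . List.nub . map (map Char.toLower) . words
```

With this layout, a reader who meets `List.nub` or `fromMaybe` can find its origin from the import block alone, without grep or an IDE.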

Hard +1 to what Olaf said - this was (still is if I can't get ghcide running) one of the most annoying things when exploring a new codebase for me.

======
Georgi

I don't think that it's unreasonable in general to expect people to explore
a codebase via IDE tooling. But given Haskell's current situation on that
front, I currently agree with your approach to Haskell imports/exports.
Ignat, I agree with you that explicit imports/exports involve unnecessary
typing. I call this "busywork". Explicit exports still seem valuable for
encapsulation, avoiding name clashes, and in the case of GHC they unlock a
bit more optimisation.
In this case I think that we should automate that busywork, and hopefully
the recent Haskell IDE work gives us a path in that direction.
On Fri, 18 Sep 2020, 3:54 am Olaf Klinke wrote:
| Dear Ignat,
| have you seen https://wiki.haskell.org/Import_modules_properly https://wiki.haskell.org/Qualified_names
| I find the arguments convincing. Even in my own packages I sometimes get lost where a certain function was imported from. When neither exports nor imports are done explicitly, you usually have only two choices: 1. search all sources (e.g. with grep -l) 2. rely on the haddock index Maybe your IDE can do that for you, but you can't expect all downstream users or all your colleagues to do the same.
| -- Olaf

On Thu, Sep 17, 2020 at 3:08 PM Isaac Elliott wrote:
| I don't think that it's unreasonable in general to expect people to explore a codebase via IDE tooling. But given Haskell's current situation on that front, I currently agree with your approach to Haskell imports/exports.
| Ignat, I agree with you that explicit imports/exports involve unnecessary typing. I call this "busywork". Explicit exports still seem valuable for encapsulation, avoiding name clashes, and in the case of GHC they unlock a bit more optimisation.
| In this case I think that we should automate that busywork, and hopefully the recent Haskell IDE work gives us a path in that direction.

I mention this every time it comes up, but you can automate it right now, and I've been doing it for the last 10 years or so, no IDE needed. The tool I wrote is called fix-imports, but there are a number of others floating around. So I don't agree that writing imports is busywork: you never had to write that stuff in the first place if you really didn't want to.

Another benefit of qualifications for navigation is that they can disambiguate tags. Fancier IDE-like tools could do that without the qualification, but tags are here now and I think they actually work better. Actually, on further thought, the same thing that disambiguates based on qualification could also easily disambiguate without it, so maybe this is not a real benefit after all. I just happened to set up the former and not the latter :) My first step when looking at any third-party code is to tag the whole lot, but for whatever reason I still much prefer qualifications. Few people use them though.

For explicit exports, I often leave them off for convenience during development, but put them in when it settles down. I don't think it unlocks any optimization in code generation, but it does make rebuilds faster because it won't change the hi file if you changed a non-exported non-inlined function. You also get unused warnings.

When I add the export list, I often append

    #ifdef TEST
    , module This.Module
    #endif

so that tests still have total visibility. I prefer this to the Internal module approach because I don't like zillions of modules with the same name, I don't want to have to structure code to the whims of tests, and I like to get unused symbol warnings from ghc without having to go to weeder.

One benefit to explicit exports that surprises me is the trivial detection of unused functions. On several occasions I have done extra work or even just extra thinking to try to preserve a caller, only to find out that due to the changes I just made, it has no other users and I could have just deleted it without thinking hard. Yes, a simple grep would have revealed that, but I will instantly notice ghc saying unused symbol and I might not think to insert manual greps into my planning process. Often it's a whole chain of callers that can be deleted, and ghc will find those all immediately.
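The conditional export described above can be sketched as a complete module. This is only an illustration under assumptions: the module name `Pricing` and its functions are hypothetical, and the `TEST` macro is assumed to be supplied by the test build (for example by passing `-DTEST` to GHC).

```haskell
{-# LANGUAGE CPP #-}
-- Hypothetical module; all names are illustrative only.
module Pricing
    ( totalPrice
#ifdef TEST
    , module Pricing
#endif
    ) where

-- | Price of one order line: quantity times unit price (in cents).
-- Hidden from importers in a normal build, visible when TEST is defined.
lineTotal :: (Int, Int) -> Int
lineTotal (quantity, unitPrice) = quantity * unitPrice

-- | The one function that is always exported: sum of all order lines.
totalPrice :: [(Int, Int)] -> Int
totalPrice = sum . map lineTotal
```

In a normal build only `totalPrice` is exported, so GHC can still report top-level definitions that nothing uses. When `TEST` is defined, `module Pricing` re-exports every top-level name, so a test suite can call `lineTotal` directly.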

| For explicit exports, I often leave them off for convenience during
| development, but put them in when it settles down. I don't think it
| unlocks any optimization in code generation
Actually, it does make a difference to optimisation. If a function is known not to be exported, then GHC knows every one of its call sites. For example:
* It may be called only once, and can be inlined (regardless of size)
  at that call site.
* If we get a worker/wrapper split, we'll inline the wrapper at all the
  call sites. If it's not exported, GHC can discard the wrapper.
* CalledArity analysis can be much more aggressive when it can see all
  call sites.
I don't know anyone who has measured the perf or binary-size benefits of limiting export lists. It's probably not huge. But it's not zero.
Simon
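As a small sketch of the first bullet: in the module below, the helper is absent from the export list and has exactly one call site, so the compiler can see every use of it. The names `Stats`, `mean` and `sumAndCount` are hypothetical, and whether GHC actually inlines the helper depends on the optimisation level and flags in use; inspecting the output of `-ddump-simpl` is one way to check.

```haskell
-- Hypothetical module, for illustration only.
module Stats (mean) where

-- | Not exported and called from exactly one place, so all of its
-- call sites are inside this module.
sumAndCount :: [Double] -> (Double, Int)
sumAndCount = foldr step (0, 0)
  where
    step x (s, n) = (s + x, n + 1)

-- | The only exported function, and the only caller of sumAndCount.
mean :: [Double] -> Maybe Double
mean [] = Nothing
mean xs = Just (total / fromIntegral count)
  where
    (total, count) = sumAndCount xs
```

Adding `sumAndCount` to the export list would not change what the code computes, but GHC would then have to assume callers it cannot see.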

I would like to summarize the conversation so far. Surely I am not free from bias, so please object if my representation of some of the opinions I am re-stating is unfair or incomplete to their disadvantage. I shall present arguments and counter-arguments in the shape of a tree, with markup and indentation.

### Many people said that they like qualified imports.

Note that my opening message did not speak against that. I suppose I should have stated openly that I am all for appropriate use of qualified imports with descriptive names, such as `Text.pack`, `Set.fromList` and even `List.sort`, although `Maybe.fromMaybe` may be too much. I think there is wide agreement about this.

## Imports.

### Reasons why explicit imports may be preferable:

* Resolution of names to the originating modules.
  - Note that this still leaves it to the reader to resolve modules to packages. That is, unless one also insists on using package imports, which I see very rarely.
  - Modern tools can resolve names as if by magic, qualified or not. Surely we can expect more progress in this direction.
    * Can we expect online, multilingual code repositories to do this for us?

### Cases when explicit imports are required:

None so far. _(Disambiguation can be done with qualified imports.)_

## Exports.

### Reasons why explicit exports may be preferable:

* There may be a helper function that is used by several exported functions, so it cannot be put in a `where` clause; if it is absent from the explicit export list, GHC will be able to optimize better.
* Haddock magic: changing the order of exports, adding headers and sections.
  - Most if not all of these effects can be accomplished without export lists.
* Detection of dead code by GHC.
  - Tools exist to detect dead code without the use of explicit exports.

### Cases when explicit exports are required:

* Exporting the type constructor of an abstract type, but not its data constructors.
* Re-exporting modules.
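The two cases where an export list is required can be sketched in one small module. Everything here is an assumption made for illustration: the module name `Temperature`, the type `Celsius` and its functions are hypothetical, and `Data.Maybe` is re-exported only to show the syntax.

```haskell
-- Hypothetical module, for illustration only.
module Temperature
    ( Celsius            -- type only: no (..), so the data constructor stays hidden
    , mkCelsius
    , getCelsius
    , module Data.Maybe  -- re-export what this module imports from Data.Maybe
    ) where

import Data.Maybe (fromMaybe)

-- | Temperature in degrees Celsius, never below absolute zero.
newtype Celsius = Celsius Double
    deriving (Eq, Show)

-- | Smart constructor: the only way for importers to build a Celsius.
mkCelsius :: Double -> Maybe Celsius
mkCelsius degrees
    | degrees >= (-273.15) = Just (Celsius degrees)
    | otherwise            = Nothing

-- | Read the wrapped value back out.
getCelsius :: Celsius -> Double
getCelsius (Celsius degrees) = degrees
```

An importing module can mention the type `Celsius` in signatures and obtain `fromMaybe` through `Temperature`, but it cannot apply or pattern match on the `Celsius` data constructor, so the invariant checked in `mkCelsius` cannot be bypassed.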

On Fri, Sep 18, 2020 at 12:48 AM Simon Peyton Jones wrote:
| | For explicit exports, I often leave them off for convenience during
| | development, but put them in when it settles down. I don't think it
| | unlocks any optimization in code generation
| Actually, it does make a difference to optimisation. If a function is known not to be exported, then GHC knows every one of its call sites.

I stand corrected! I recall reading somewhere that, for the purposes of inlining, all symbols inside the module are fully visible, so the only difference the export makes is whether or not the non-inlined/specialized version is also kept around for possible external callers. So maybe that's just wrong, or I made it up somehow, or maybe it's right but just for inlining and not for those other things?

I suppose it could theoretically have been right if GHC were willing to duplicate all exported functions and perhaps analyze them twice, but perhaps it's not willing to do that? I'm not saying it should, especially if it would hurt compile time, just curious.

| I suppose it could theoretically have been right if GHC were willing
| to duplicate all exported functions and perhaps analyze them twice,
| but perhaps it's not willing to do that? I'm not saying it should,
| especially if it would hurt compile time, just curious.
For non-exported functions, GHC inlines one regardless of size (and discards the definition) if it has exactly one occurrence, because that doesn't duplicate. For all other functions (local and exported) GHC will inline (and hence duplicate) small ones, and not inline (thereby avoiding duplicating) big ones.
There are flags to control what "big" means.
Does that answer your question?
Simon

On Mon, Sep 21, 2020 at 12:41 AM Simon Peyton Jones wrote:
| | I suppose it could theoretically have been right if GHC were willing
| | to duplicate all exported functions and perhaps analyze them twice,
| | but perhaps it's not willing to do that? I'm not saying it should,
| | especially if it would hurt compile time, just curious.
| For non-exported functions, GHC inlines one regardless of size (and discards the definition) if it has exactly one occurrence, because that doesn't duplicate. For all other functions (local and exported) GHC will inline (and hence duplicate) small ones, and not inline (thereby avoiding duplicating) big ones.
| There are flags to control what "big" means.
| Does that answer your question?

Yes, I think it does. To rephrase, we could say the distinction is not so much exported or not, but 1 caller vs. >1 caller. But of course it happens that exported forces you to assume >1 caller.

I was thinking GHC could theoretically be willing to duplicate an exported oversized function if it only occurs once within the module, because this is a max of 1 additional copy, not an unbounded number. But perhaps it makes sense that it refuses to copy at all if it's over the size limit, because who's to say how far over the limit it is!

Inlining something with exactly 1 caller is guaranteed safe. It also makes it worry-free to extract an expression to a where, or promote a where to the toplevel, so long as it remains called only once. This is as it should be, but it's nice to know for sure!

On Fri, Sep 18, 2020 at 08:06:14AM +1000, Isaac Elliott wrote:
| I don't think that it's unreasonable in general to expect people to explore a codebase via IDE tooling.

Implicit imports prevent people easily understanding code that is presented on GitHub, for example. I think this is the main reason I dislike implicit imports, my own inconvenience coming a close second.

On Fri, 18 Sep 2020, Isaac Elliott wrote:
| I don't think that it's unreasonable in general to expect people to explore a codebase via IDE tooling. But given Haskell's current situation on that front, I currently agree with your approach to Haskell imports/exports.

I have already lost many hours exploring orphaned codebases. Imagine a five-year-old package, or say a ten-year-old one, that you cannot compile anymore. You do not know with what version of GHC it worked, and even if you knew, you would not be able to get this GHC version running anymore. You do not know which library versions the package used, because Build-Depends is missing version ranges. Even if you knew, you would not be able to get old library versions compiled with new GHC versions anymore. This orphaned package uses identifiers like "prim" and you have to search whether this is a local identifier or whether it is external, and then from which package? No IDE will help with code that is missing so much information.
participants (8)
- Evan Laforge
- Georgi Lyubenov
- Henning Thielemann
- Ignat Insarov
- Isaac Elliott
- Olaf Klinke
- Simon Peyton Jones
- Tom Ellis