Re: [Haskell-cafe] What features should an (fictitious) IDE for Haskell have?

very cool feature would be if I could select a program phrase and let it find /similar/ phrases, where a similarity metric could be edit-distance with respect to language tokens ...
I have often wanted a tool that finds (nearly) duplicate AST sub-trees in a large code base, and suggests refactorings.
Of course, in an IDE, it could alert me on-the-fly that I'm typing some code that's already present elsewhere.
How might one go about implementing this? Actual (approximate) sub-tree matching seems the easy part; but I have no clear idea about whether this should just use syntax or needs types as well (my guess is: yes), or what libraries there are to provide the (annotated) ASTs, etc.
- J.W.
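The token-level edit distance mentioned above could be sketched roughly as follows. This is only an illustration under loud assumptions: `tokens` leans on the Prelude's `lex`, which is a crude approximation of a real Haskell lexer, and the names `tokens`, `editDistance`, `tokenDistance` and `similarTo` are made up for this sketch. It is purely syntactic and says nothing about the "does it need types" question.

```haskell
import Data.List (sortOn)

-- Naive tokenizer built on the Prelude's `lex`. It stops at the first
-- lexeme that `lex` cannot read; a real tool would use a proper lexer.
tokens :: String -> [String]
tokens s = case lex s of
  ((tok, rest) : _) | not (null tok) -> tok : tokens rest
  _ -> []

-- Levenshtein distance over arbitrary token lists, computed row by row.
editDistance :: Eq a => [a] -> [a] -> Int
editDistance xs ys = last (foldl nextRow [0 .. length ys] xs)
  where
    nextRow [] _ = []  -- unreachable: every row has length ys + 1 entries
    nextRow prev@(p : ps) x = scanl step (p + 1) (zip3 ys prev ps)
      where
        -- left: insertion, up: deletion, diag: match or substitution
        step left (y, diag, up) =
          minimum [left + 1, up + 1, diag + fromEnum (x /= y)]

-- Distance between two program phrases, measured in tokens.
tokenDistance :: String -> String -> Int
tokenDistance a b = editDistance (tokens a) (tokens b)

-- Rank candidate phrases by their token distance to the selected phrase.
similarTo :: String -> [String] -> [(Int, String)]
similarTo query candidates =
  sortOn fst [(tokenDistance query c, c) | c <- candidates]
```

Working on tokens rather than characters means that renaming one identifier costs a single edit regardless of its length, which is probably closer to what "similar phrase" should mean.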

hlint does this to some extent (at least I have some copy-pasted code it keeps on pleading me to remove duplication).

Doug

Speaking of this detection by hlint, I cannot figure out how I'm supposed to improve the code when the duplication is in identical or nearly identical where clauses, e.g.

```log
Reduce duplication
Found:
  !eac = el'context eas
  diags = el'ctx'diags eac
  returnAsExpr = el'Exit eas exit $ EL'Expr xsrc
Why not:
  Combine with /xxx/Analyze.hs:741:7-27
hlint(Reduce duplication)
```

Yes, that's a verbatim copy of a where clause, but how do I *combine* them?

```hs
  where
    !eac = el'context eas
    diags = el'ctx'diags eac
    returnAsExpr = el'Exit eas exit $ EL'Expr xsrc
```

Regards,
Compl
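One possible way to combine such where clauses, sketched under heavy assumptions: lift the repeated bindings into a single top-level helper that returns them in a record, and destructure that record where they are needed. Every type and function below (`Context`, `AnalysisState`, `Shared`, `shared`, `countDiags`, `firstDiag`) is a hypothetical stand-in, not the code base from the message above, and whether this reads better than the duplication is a judgment call.

```haskell
-- Hypothetical stand-ins for the real analysis types.
newtype Context = Context { contextDiags :: [String] }
newtype AnalysisState = AnalysisState { stateContext :: Context }

-- The bindings that both where clauses repeated, gathered in one record.
-- The strict field plays the role of the bang pattern in the original.
data Shared = Shared
  { sharedContext :: !Context
  , sharedDiags :: [String]
  }

-- Single definition of what used to be copy-pasted.
shared :: AnalysisState -> Shared
shared eas = Shared eac (contextDiags eac)
  where
    eac = stateContext eas

-- Two hypothetical call sites that previously carried the same where clause.
countDiags :: AnalysisState -> Int
countDiags eas = length diags
  where
    Shared {sharedDiags = diags} = shared eas

firstDiag :: AnalysisState -> Maybe String
firstDiag eas = case diags of
  [] -> Nothing
  (d : _) -> Just d
  where
    Shared {sharedDiags = diags} = shared eas
```

A binding like `returnAsExpr`, which also depends on local arguments (`exit`, `xsrc`), would presumably become a top-level function taking those arguments rather than a record field.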
On 2020-12-07, at 21:32, Doug Burke wrote:
hlint does this to some extent (at least I have some copy-pasted code it keeps on pleading me to remove duplication).

On 7 December 2020 at 14:25, "Johannes Waldmann" wrote:
I often wanted a tool that finds (nearly) duplicate AST sub-trees in a large code base, and suggests refactorings.
Of course, in an IDE, it could alert me on-the-fly that I'm typing some code that's already present elsewhere.
How might one go about implementing this? Actual (approximate) sub-tree matching seems the easy part; but I have no clear idea about whether this should just use syntax or needs types as well (my guess is: yes), or what libraries there are to provide the (annotated) ASTs, etc.
- J.W.
Hi,

Last year I developed a small tool for redundancy detection in OCaml called asak [1], which can be easily adapted for Haskell. The idea is pretty simple: use an intermediate language of the compilation pipeline to normalize the code and remove sugar (for Haskell, one can take Core), inline everything, and hash the tree bottom-up (abstracting constants), keeping the hashes of intermediate trees. Then you just compare the hashes of trees against pre-computed hashes of, let's say, the whole Hackage ecosystem.

The technique is pretty efficient and scalable. I ran asak against the whole OPAM repository and got some results (like the detection of `map_opt` 140 times under 32 different names).

There are some drawbacks: the code needs to be compiled down to Core, and one has to maintain a database of available hashes. The first one seems legitimate, and the other is inherent to any such tool.

I started developing an editor plugin for Emacs, but never got further due to a lack of time. Anyway, the approach is language-agnostic and could be easily adapted to Haskell. Moreover, the "inline everything" step, which is not possible in OCaml (due to effects), would be highly valuable in Haskell.

Best,
-- Alexandre Moine

[1]: https://github.com/nobrakal/asak
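The bottom-up hashing step described above might look roughly like this. It is a toy sketch, not asak's implementation: the `Expr` type is a made-up stand-in for a desugared intermediate language such as Core, bound variables are assumed to be de Bruijn indices so that local names do not matter, the FNV-style `mix` function is an arbitrary choice, and the inlining step is left out entirely.

```haskell
import Data.Bits (xor)
import Data.Char (ord)
import Data.List (intersect, nub)
import Data.Word (Word64)

-- Toy stand-in for a desugared intermediate language (assumption).
data Expr
  = Var Int          -- bound variable as a de Bruijn index
  | Lit Integer      -- constant; its value is ignored when hashing
  | Global String    -- reference to a top-level name or primitive
  | Lam Expr
  | App Expr Expr

type Hash = Word64

-- FNV-1a style mixing step; any reasonable hash combiner would do.
mix :: Hash -> Hash -> Hash
mix h x = (h `xor` x) * 1099511628211

fnvOffset :: Hash
fnvOffset = 14695981039346656037

hashString :: String -> Hash
hashString = foldl (\h c -> mix h (fromIntegral (ord c))) fnvOffset

-- Hash a tree bottom-up. Returns the hash of the root together with the
-- hashes of all sub-trees (root included), i.e. the intermediate trees.
hashTree :: Expr -> (Hash, [Hash])
hashTree e = case e of
  Var i -> leaf (node 1 [fromIntegral i])
  Lit _ -> leaf (node 2 [])  -- constants are abstracted away
  Global n -> leaf (node 3 [hashString n])
  Lam b ->
    let (hb, sub) = hashTree b
        h = node 4 [hb]
     in (h, h : sub)
  App f x ->
    let (hf, subF) = hashTree f
        (hx, subX) = hashTree x
        h = node 5 [hf, hx]
     in (h, h : subF ++ subX)
  where
    leaf h = (h, [h])
    -- Combine a constructor tag with the hashes of the children.
    node tag = foldl mix (mix fnvOffset tag)

-- Sub-tree hashes that two expressions have in common. A real tool would
-- look hashes up in a pre-computed database instead of comparing lists.
sharedSubtrees :: Expr -> Expr -> [Hash]
sharedSubtrees a b = nub (snd (hashTree a) `intersect` snd (hashTree b))
```

With this encoding, `Lam (App (App (Global "add") (Var 0)) (Lit 1))` and the same tree with `Lit 2` get the same root hash, since only the shape and the referenced globals contribute to it.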
participants (4)
- Alexandre Moine
- Doug Burke
- Johannes Waldmann
- YueCompl