better generation of vi ctags in ghci

Hi GHC and VI users,
I got frustrated with vi tags not working after some unrelated code is edited in a source file. Moreover non-exported top level declarations were not available in vi tags file. Here is an attempt to fix it: http://www.hck.sk/users/peter/pub/ghc/betterCTags.patch
Why would you want the new implementation of the :ctags ghci command?
* Tags are searched based on the line content. This is what Exuberant Ctags does for other languages and it is the other posix way to do it. This makes the positioning work well even when the source code was edited (in places unrelated to the tag location). More complicated Ex statements can be used to improve it even more but then it does not work well with :tselect (it screws up tag kinds, at least with my version of vim 7.2.65).
* All top level symbols defined in a module are added to the tags file. Even the non-exported ones. These are marked as static (file:) so the default tag selection (Ctrl-]) works fine based on the file you started the search from.
* Tags get kinds added so you can select whether you want to get to a type constructor or a data constructor (that is if you share names between the two).
* In general it is a nice addition to vim haskellmode. If you search for help on symbols in libraries then opening haddock is cool. If you search for help on a symbol in your project then opening the tag in a preview window (Ctrl-W} or ptselect) is cool.
Problems:
* It needs somebody to check that emacs tags were not broken. I'm not an emacs user but some tag generation code is shared for vim and emacs. I tried to keep emacs tags exactly the way they were (only the exported symbols, original file format).
* If your code happens to have definitions on lines whose text occurs more than once in one source file then it may put you at an incorrect location. I doubt it will ever happen but if anybody thinks it is really bad we can keep the original format of vim tags too. Then e.g. :ctags would generate tags with line numbers and :ctags! would generate tags with search expressions.
If there is any support for this and the GHC team decides to merge it, I can provide a darcs patch. I can do the changes needed to get it merged.
Peter.
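The second problem above (a tagged line whose text occurs more than once in a file) can be checked mechanically. The following is only a minimal sketch under stated assumptions: the function name `ambiguous_tag_lines`, the sample source, and the tag positions are all hypothetical and are not taken from the patch.

```python
from collections import Counter


def ambiguous_tag_lines(source_lines, tag_line_numbers):
    """Return the 1-based numbers of tagged lines whose exact text also
    appears elsewhere in the file, so that a whole-line search for them
    could stop at the wrong occurrence."""
    counts = Counter(source_lines)
    return [n for n in tag_line_numbers if counts[source_lines[n - 1]] > 1]


# Hypothetical source: the same signature line appears in both CPP branches.
source = [
    "#ifdef DEBUG",
    "logMsg :: String -> IO ()",
    "logMsg = putStrLn",
    "#else",
    "logMsg :: String -> IO ()",
    "logMsg _ = return ()",
    "#endif",
    "main :: IO ()",
]

# Hypothetical tag positions (lines 2, 5 and 8).
print(ambiguous_tag_lines(source, [2, 5, 8]))  # prints [2, 5]
```

A tag generator could, for instance, fall back to a line-number address for exactly these lines; whether that is worth the complexity is a design question the message leaves open.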

> I got frustrated with vi tags not working after some unrelated code is edited in a source file. Moreover non-exported top level declarations were not available in vi tags file. Here is an attempt to fix it: http://www.hck.sk/users/peter/pub/ghc/betterCTags.patch
I'm all in favour of ctags improvements in general! Thanks for investing the effort. Haven't looked at your patch in detail yet, just a few comments based on your message:
> Why would you want the new implementation of the :ctags ghci command?
> * Tags are searched based on the line content. This is what Exuberant Ctags does for other languages and it is the other posix way to do it. This makes the positioning work well even when the source code was edited (in places unrelated to the tag location). More complicated Ex statements can be used to improve it even more but then it does not work well with :tselect (it screws up tag kinds, at least with my version of vim 7.2.65).
Haskell isn't like other languages. If you search on source lines of definitions, that'll break every time you change a pattern, parameter name, parameter order, clause order, ..; if you search on less than the full line, you get additional misleading matches, especially because you can only search forward or backward, not both (also, the fallback of /^<tag>/ won't work for infix operators, classes, data, ..).
Ideally, one (well, I at least;-) would like a mixed approach: start with source line, figure out if the line is right, if yes, done, if not, then search (starting from the old position). But that seemed to be beyond posix tags, so I stuck with line numbers. If the definition you're looking for isn't close to where it was, you'd better regenerate the tags anyway; if the definition is still close, line numbers degrade gracefully; search patterns tend to break completely (on the other hand, line number changes affect more tags than definition changes do, so it would be useful to have both options).
Note that emacs always does a search, but it does search outwards in both directions, from the old location as the start position, so it can make do with an underspecified search pattern, such as the tag itself (if I recall correctly, the etags position might be off by one anyway; emacs doesn't have lines/columns, it counts bytes from the start or something like that, which meant that the etags and ctags parts couldn't share all their code).
> * All top level symbols defined in a module are added to the tags file. Even the non-exported ones. These are marked as static (file:) so the default tag selection (Ctrl-]) works fine based on the file you started the search from.
Thanks, I had meant to do this, don't know why I didn't (you use the new static tag format, not the old, I assume?).
> * Tags get kinds added so you can select whether you want to get to a type constructor or a data constructor (that is if you share names between the two).
You mean 'kind' in the tags file sense, using it to record namespace, similar to haddock's t/v distinction? The extended tag line format is somewhat underspecified (only file: and kind: are mentioned in Vim's docs), so I could not make up my mind what to use kind: for (namespace? type? kind? type/data/class? package? ..) and I couldn't figure out whether any editors would actually use the extra info if I were to add extra fields to add all the useful info. In Vim, that info could be accessed via scripts, so it would be useful to add all of it, but some standard would help; also some of the info that :ctags could record in the tags file can be had by other means, including :type and :info output, and those other means will often be more up to date than the tags file, so I added no extra info to the tags file.
> * In general it is a nice addition to vim haskellmode. If you search for help on symbols in libraries then opening haddock is cool. If you search for help on a symbol in your project then opening the tag in a preview window (Ctrl-W} or ptselect) is cool.
That's why I suggested the addition in the first place!-) Thanks for taking it further. The one real show-stopper is files that GHCi can't handle: because the ctags/etags patch borrowed all the interesting functionality from GHCi, it inherited its limitations as well (there is a ghctags program somewhere which circumvents that issue by not trying to generate code when generating tags, so it can handle tags for GHC's sources, which GHCi :ctags couldn't, last time I tried). A secondary issue was what to do with non-interpreted modules (probably: just skip, and rely on them having their own tags files). (btw, I also use '_si' more since I switched it to open a preview window for the :info output, and of course, one can get the types without having to record them in the tags file).
> Problems:
> * It needs somebody to check that emacs tags were not broken. I'm not an emacs user but some tag generation code is shared for vim and emacs. I tried to keep emacs tags exactly the way they were (only the exported symbols, original file format).
I'm not an emacs user, either, I just added both versions because I saw no reason not to (and I had been looking into the curious world of emacs terminology for other reasons). Simon M did quite a bit of cleanup on my original patch, if I recall, perhaps other emacs users care to take a look?
> * If your code happens to have definitions on lines whose text occurs more than once in one source file then it may put you at an incorrect location. I doubt it will ever happen but if anybody thinks it is really bad we can keep the original format of vim tags too. Then e.g. :ctags would generate tags with line numbers and :ctags! would generate tags with search expressions.
See above for other things that can go wrong with search-based tags, so I'd prefer to have both options.
Claus
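The mixed approach described in this reply (trust the recorded line if its text still matches, otherwise search outwards in both directions from the old position) can be sketched as follows. This is an illustration only, assuming exact whole-line matching; the function `locate` and the sample data are hypothetical, and nothing here claims that a plain tags-file address can express this behaviour.

```python
def locate(lines, recorded_line, expected_text):
    """Find the 1-based line holding expected_text.

    The recorded line number is tried first. If the text there no longer
    matches, the search moves outwards in both directions from that
    position and returns the nearest exact match (the earlier line wins
    a tie), or None if the text is not found at all.
    """
    if not lines:
        return None
    # Clamp the recorded position to a valid 0-based index.
    start = min(max(recorded_line, 1), len(lines)) - 1
    for distance in range(len(lines)):
        for index in (start - distance, start + distance):
            if 0 <= index < len(lines) and lines[index] == expected_text:
                return index + 1
    return None


# The tag for g was recorded at line 5; two lines were later inserted above it.
edited = [
    "module M where",
    "-- new comment",
    "import Data.List",
    "f :: Int -> Int",
    "f x = x + 1",
    "g :: Int",
    "g = 2",
]
print(locate(edited, 5, "g = 2"))  # prints 7
```

Because the search starts at the old position, an underspecified pattern is less likely to land on a distant false match than a search that only runs forward from the top of the file.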

Claus Reinke wrote:
> Haskell isn't like other languages. If you search on source lines of definitions, that'll break every time you change a pattern, parameter name, parameter order, clause order, ..
This is what I do. The whole line is searched to avoid as many false positives as possible. So far I have not had any false positives in my code with this approach.
> ; if you search on less than the full line, you get additional misleading matches, especially because you can only search forward or backward, not both (also, the fallback of /^<tag>/ won't work for infix operators, classes, data, ..).
> Ideally, one (well, I at least;-) would like a mixed approach: start with source line, figure out if the line is right, if yes, done, if not, then search (starting from the old position). But that seemed to be beyond posix tags, so I stuck with line numbers.
I tried some form of that (setting the position before the expected location before searching). It could be done better but then the tag file would get big (inline all the code for each tag, or we would need some functions defined outside of the tags file). Anyway the reason I dropped this was that my vim did not parse this file correctly and tag kinds (as defined in the vim tags file) were not shown in the :tselect list. So I decided it is not worth it for now. It would be cool to have something better but I do not know whether it is possible aside from just writing an interactive compiler (something like the yi people are trying; I do not know how far they got).
>> * All top level symbols defined in a module are added to the tags file. Even the non-exported ones. These are marked as static (file:) so the default tag selection (Ctrl-]) works fine based on the file you started the search from.
> Thanks, I had meant to do this, don't know why I didn't (you use the new static tag format, not the old, I assume?).
I use the latest format: {tagname} {TAB} {tagfile} {TAB} {tagaddress} {term} {field} ..
Where {tagaddress} is a search expression looking for whole lines.
>> * Tags get kinds added so you can select whether you want to get to a type constructor or a data constructor (that is if you share names between the two).
> You mean 'kind' in the tags file sense, using it to record namespace, similar to haddock's t/v distinction?
I mean kind in the tags file sense. I do not know about haddock's t/v distinction so I cannot compare to it. So far I use:
* v (varId),
* t (typeCon),
* d (dataCon),
* c (class).
>> * In general it is a nice addition to vim haskellmode. If you search for help on symbols in libraries then opening haddock is cool. If you search for help on a symbol in your project then opening the tag in a preview window (Ctrl-W} or ptselect) is cool.
> That's why I suggested the addition in the first place!-) Thanks for taking it further.
NP, it is great! I did not know there is at least one more person wanting better tags. I did not even know about ghctags. Otherwise I would probably just use them if they are any good (generate non-exported symbols too and use search patterns instead of the line numbers ... so they are at least a bit useful after some edits).
Ok, the reason I care about tags being useful after some edits is that for some of my files it takes some time to compile them in ghci. And I don't want to think about regenerating them regularly. Now I need to regenerate them only when the tag cannot be located or it found an incorrect place (which has not happened to me yet).
> for GHC's sources, which GHCi :ctags couldn't, last time I tried). A secondary issue was what to do with non-interpreted modules (probably: just skip, and rely on them having their own tags files).
Skipping is fine with me.
>> * If your code happens to have definitions on lines whose text occurs more than once in one source file then it may put you at an incorrect location. I doubt it will ever happen but if anybody thinks it is really bad we can keep the original format of vim tags too. Then e.g. :ctags would generate tags with line numbers and :ctags! would generate tags with search expressions.
> See above for other things that can go wrong with search-based tags, so I'd prefer to have both options.
Ok, I can add it. Generating line numbers instead of search patterns will be quicker too. For big projects, the time difference may be noticeable. So what about UI? :ctags would generate tags with line numbers and :ctags! would generate tags with search patterns? Or will we add an argument to :ctags to specify what kind of tags we want? This option would break ghci UI backward compatibility.
Peter.
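To make the tag line format mentioned in this message concrete, here is a small sketch that renders one line with either a whole-line search address or a line-number address. It follows the extended format as documented in Vim's tags-file-format help ({term} is the two characters `;"`, and an empty `file:` field marks a static tag). The helper names, the sample file path, and the escaping rule (backslash and slash only, assuming the pattern is interpreted without 'magic') are assumptions for illustration, not a description of what the patch emits.

```python
def search_address(line_text):
    """Build a whole-line forward search address such as /^foo = 1$/.

    Backslashes and slashes are escaped. This assumes the editor runs
    the pattern with 'magic' off, as Vim's tags documentation states.
    """
    escaped = line_text.replace("\\", "\\\\").replace("/", "\\/")
    return "/^" + escaped + "$/"


def tag_line(name, path, address, kind, static=False):
    """Render one extended-format tag line: name, file and address
    separated by TABs, then the ;" terminator and the extra fields."""
    fields = ["kind:" + kind]
    if static:
        fields.append("file:")  # empty value: tag is local to its file
    return "\t".join([name, path, address + ';"'] + fields)


# A non-exported value (kind v) addressed by a search pattern.
print(tag_line("helper", "src/M.hs", search_address("helper x = x / 2"), "v", static=True))

# An exported type constructor (kind t) addressed by line number.
print(tag_line("Config", "src/M.hs", "12", "t"))
```

The two calls correspond to the two address styles under discussion: the first survives edits elsewhere in the file, the second is cheaper to generate and degrades gradually when lines shift.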

On Wed, 17 Jun 2009 13:59:24 +0200, Peter Hercek wrote:
>>> * If your code happens to have definitions on lines whose text occurs more than once in one source file then it may put you at an incorrect location. I doubt it will ever happen but if anybody thinks it is really bad we can keep the original format of vim tags too. Then e.g. :ctags would generate tags with line numbers and :ctags! would generate tags with search expressions.
>> See above for other things that can go wrong with search-based tags, so I'd prefer to have both options.
> Ok, I can add it. Generating line numbers instead of search patterns will be quicker too. For big projects, the time difference may be noticeable. So what about UI? :ctags would generate tags with line numbers and :ctags! would generate tags with search patterns? Or will we add an argument to :ctags to specify what kind of tags we want? This option would break ghci UI backward compatibility.
Bump! Looks like nobody cares enough to respond. Do we have at least a "general agreement" of two for :ctags[!] user interface? Is "general agreement" of two and nobody caring enough to respond good enough for a merge? If nobody responds to this I'll assume there is no general agreement and I'll maintain the patch only for myself. Peter.

On 17/06/2009 10:14, Peter Hercek wrote:
> Hi GHC and VI users,
> I got frustrated with vi tags not working after some unrelated code is edited in a source file. Moreover non-exported top level declarations were not available in vi tags file. Here is an attempt to fix it: http://www.hck.sk/users/peter/pub/ghc/betterCTags.patch
I'm an infrequent etags user, and I never use ctags. So while I can't give you any useful comments on your patch, if there seems to be general agreement we'll be happy to incorporate it. Cheers, Simon

Simon Marlow wrote:
> I'm an infrequent etags user, and I never use ctags.
The problem is I do not know whether I should try to improve etags too. So far I have tried to keep them the same as they were. The only difference I know about is that if several tags happen to exist on the same source line then they may come in a different order (but only within the group of symbols on that line). This is probably because a different GHC interface is used and the symbols arrive in a different order. Vim can accept emacs tags too and they work fine as generated by the new code. So the questions for emacs users are:
* Should the non-exported top level symbols be added to the emacs TAGS file? As far as I could find on the internet, emacs does not have a notion of "static" symbols as vim has, so it may not be a good idea. But if emacs prefers jumping to a symbol in the local file over symbols in other files, it may work well enough to be worth it.
* The last data field of the tag definition is "byte_offset". But in the past (and I kept it that way, so this still holds when the patch is applied) this was actually byteOffset-numberOfLines*sizeOfLineDelimiter. The size of the newline delimiters was ignored (and moreover it is system dependent). This does not matter that much since, based on Claus's information, emacs allows some fuzz in the position information. The question is whether to keep this as it was, add 1 for each line, or somehow try to detect the platform and add the correct number of bytes for each line.
If no answers come I will just keep it as it is and hope for support of the vim related ctags changes only :)
Peter.
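The byte_offset arithmetic in the second question can be worked through on a small example. The sketch below compares an offset that ignores line delimiters with one that counts them. It assumes UTF-8 encoded source and that "numberOfLines" means the number of lines before the tagged one; both function names are hypothetical and this is not the code from the patch.

```python
def offset_ignoring_delimiters(lines, line_number):
    """Sum of the encoded lengths of all lines before line_number
    (1-based), counting no bytes for the line delimiters."""
    return sum(len(line.encode("utf-8")) for line in lines[: line_number - 1])


def true_byte_offset(lines, line_number, delimiter="\n"):
    """Byte offset of the start of line_number when every earlier line
    is followed by the given delimiter."""
    preceding = line_number - 1
    delimiter_size = len(delimiter.encode("utf-8"))
    return offset_ignoring_delimiters(lines, line_number) + preceding * delimiter_size


lines = ["module M where", "", "f :: Int", "f = 1"]

print(offset_ignoring_delimiters(lines, 4))  # prints 22
print(true_byte_offset(lines, 4))            # prints 25 (one byte per newline)
print(true_byte_offset(lines, 4, "\r\n"))    # prints 28 (two bytes per newline)
```

The gap between the two values grows by one delimiter per preceding line, so the error is largest for tags near the end of long files, which is where any fuzz in the editor's position matching matters most.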
participants (3): Claus Reinke, Peter Hercek, Simon Marlow