How to check if two Haskell files are the same?

Hi, I would like to write a Haskell pretty-printer, using standard libraries for that. How can I check if the original and the pretty-printed versions are the same? For instance, is there a file generated by GHC at the compilation pipe that is always guaranteed to have the same MD5 hash when it comes from equivalent source? Thanks, Maurício

On Tue, Sep 16, 2008 at 9:30 AM, Mauricio wrote:
> Hi,
> I would like to write a Haskell pretty-printer, using standard libraries for that. How can I check if the original and the pretty-printed versions are the same? For instance, is there a file generated by GHC at the compilation pipe that is always guaranteed to have the same MD5 hash when it comes from equivalent source?
I don't know the answer to your question, but if you're looking for inspiration for your project, you should check out the following two packages:
http://hackage.haskell.org/cgi-bin/hackage-scripts/package/haskell-src
http://hackage.haskell.org/cgi-bin/hackage-scripts/package/haskell-src-exts
-Antoine

On 2008 Sep 16, at 10:30, Mauricio wrote:
> I would like to write a Haskell pretty-printer, using standard libraries for that. How can I check if the original and the pretty-printed versions are the same? For instance, is there a file generated by GHC at the compilation pipe that is always guaranteed to have the same MD5 hash when it comes from equivalent source?
Compare .hi files?
--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com
system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH

>> I would like to write a Haskell pretty-printer, using standard libraries for that. How can I check if the original and the pretty-printed versions are the same? For instance, is there a file generated by GHC at the compilation pipe that is always guaranteed to have the same MD5 hash when it comes from equivalent source?
> Compare .hi files?
That was my first thought, but can I be sure .hi files are going to be exactly the same, i.e., isn't there some kind of information (timestamps?) that can change without changes in the code?
Maurício

On Wed, Sep 17, 2008 at 7:04 PM, Mauricio wrote:
>>> I would like to write a Haskell pretty-printer, using standard libraries for that. How can I check if the original and the pretty-printed versions are the same? For instance, is there a file generated by GHC at the compilation pipe that is always guaranteed to have the same MD5 hash when it comes from equivalent source?
>> Compare .hi files?
> That was my first thought, but can I be sure .hi files are going to be exactly the same, i.e., isn't there some kind of information (timestamps?) that can change without changes in the code?
For that matter, the code can change without the .hi file doing so, e.g. if a function marked with a NOINLINE pragma is altered without changing its type or strictness, or, for all I know, a function that the optimizer decides is pointless to try inlining.

On Wed, Sep 17, 2008 at 1:03 AM, Brandon S. Allbery KF8NH wrote:
> On 2008 Sep 16, at 10:30, Mauricio wrote:
>> I would like to write a Haskell pretty-printer, using standard libraries for that. How can I check if the original and the pretty-printed versions are the same? For instance, is there a file generated by GHC at the compilation pipe that is always guaranteed to have the same MD5 hash when it comes from equivalent source?
> Compare .hi files?
You can also compare the resulting object files.

On 2008 Sep 17, at 14:17, Alfonso Acosta wrote:
> On Wed, Sep 17, 2008 at 1:03 AM, Brandon S. Allbery KF8NH wrote:
>> On 2008 Sep 16, at 10:30, Mauricio wrote:
>>> I would like to write a Haskell pretty-printer, using standard libraries for that. How can I check if the original and the pretty-printed versions are the same? For instance, is there a file generated by GHC at the compilation pipe that is always guaranteed to have the same MD5 hash when it comes from equivalent source?
>> Compare .hi files?
> You can also compare the resulting object files
On ELF systems (the majority) you have to watch out for the timestamp in the ELF header. I know there is code in the gcc source that does object comparisons to verify that stage3 builds match stage2, omitting the header.
--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com
system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH

On Tue, Sep 16, 2008 at 7:30 AM, Mauricio wrote:
> Hi,
> I would like to write a Haskell pretty-printer, using standard libraries for that. How can I check if the original and the pretty-printed versions are the same? For instance, is there a file generated by GHC at the compilation pipe that is always guaranteed to have the same MD5 hash when it comes from equivalent source?
I don't know, but you can parse the pretty-printed output (the resulting concrete syntax) and compare its abstract syntax with the abstract syntax of the original.
> Thanks, Maurício
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
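
The parse-and-compare suggestion above could be sketched roughly as follows. This is only an illustration, not something posted in the thread: it assumes a recent version of the haskell-src-exts package (its `parseModule` and annotated `Module` type), and the helper names `parseBare` and `sameAst` are made up for the example. Comments are not part of the resulting tree, so they are ignored by the comparison.

```haskell
import Data.Functor (void)
import Language.Haskell.Exts (Module, ParseResult (..), parseModule)

-- Parse a source string and drop every source-location annotation,
-- so that only the shape of the syntax tree remains.
-- (parseBare is a hypothetical helper name.)
parseBare :: String -> Either String (Module ())
parseBare src = case parseModule src of
  ParseOk m           -> Right (void m)
  ParseFailed loc msg -> Left (show loc ++ ": " ++ msg)

-- Right True when both sources parse to the same location-free AST,
-- Right False when the trees differ, Left when either fails to parse.
sameAst :: String -> String -> Either String Bool
sameAst original printed = (==) <$> parseBare original <*> parseBare printed
```

With this, `sameAst original prettyPrinted` can serve as the check: a pretty-printer that only changes formatting should leave the stripped trees equal, while any change to names, literals or structure shows up as a difference.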

Before you reinvent the wheel, have you looked at Language.Haskell.Pretty?
http://haskell.org/ghc/docs/latest/html/libraries/haskell-src/Language-Haske...
On Tue, Sep 16, 2008 at 10:30 AM, Mauricio wrote:
> Hi,
> I would like to write a Haskell pretty-printer, using standard libraries for that. How can I check if the original and the pretty-printed versions are the same? For instance, is there a file generated by GHC at the compilation pipe that is always guaranteed to have the same MD5 hash when it comes from equivalent source?
> Thanks, Maurício
-- /jve

2008/9/16 Mauricio:
> Hi,
> I would like to write a Haskell pretty-printer, using standard libraries for that. How can I check if the original and the pretty-printed versions are the same? For instance, is there a file generated by GHC at the compilation pipe that is always guaranteed to have the same MD5 hash when it comes from equivalent source?
There is not, though I have a suggestion. Am I correct in assuming that by "equivalent source" you mean that only the formatting (and possibly {;}, as a consequence of layout) differs?
Then the sequence of tokens from the source ought to do the trick, as long as you delete the location information (map unLoc) and transform ITvocurly ("virtual" braces for layout-induced blocks) into ITocurly (real braces for no-layout blocks), and likewise ITvccurly into ITccurly (it's just another map). If only the formatting differs, the two token sequences should be identical.
The current GHC doesn't give you direct access to the token stream, but the next release should contain the functions I wrote to support this (for HaRe). You could also do this with the AST, but the necessary extractions and comparisons would be more complicated.
-- Jedaï
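
The token-stream idea above can be illustrated with a small self-contained sketch. Note the assumptions: the `Token` and `Located` types below are simplified stand-ins written for this example (only the constructor names and `unLoc` echo the GHC lexer names mentioned in the message), and the token lists are hand-written rather than produced by a real lexer.

```haskell
-- Hypothetical, simplified token type; NOT GHC's real Token type.
data Token
  = ITvocurly        -- virtual open brace inserted by the layout rule
  | ITvccurly        -- virtual close brace inserted by the layout rule
  | ITocurly         -- explicit '{'
  | ITccurly         -- explicit '}'
  | ITother String   -- any other token, kept as its text
  deriving (Eq, Show)

-- Hypothetical stand-in for a located token: (line, column) plus payload.
data Located a = L (Int, Int) a

-- Forget the position.
unLoc :: Located a -> a
unLoc (L _ x) = x

-- Turn virtual (layout) braces into explicit ones.
normalise :: Token -> Token
normalise ITvocurly = ITocurly
normalise ITvccurly = ITccurly
normalise t         = t

-- Two token streams are "the same" if they agree after dropping
-- positions and normalising braces.
sameTokens :: [Located Token] -> [Located Token] -> Bool
sameTokens xs ys = prepare xs == prepare ys
  where prepare = map (normalise . unLoc)

-- Hand-written streams for "where x = 1", with layout and with braces.
layoutVersion, explicitVersion, changedVersion :: [Located Token]
layoutVersion =
  [ L (1, 7) (ITother "where"), L (2, 3) ITvocurly, L (2, 3) (ITother "x")
  , L (2, 5) (ITother "="), L (2, 7) (ITother "1"), L (3, 1) ITvccurly ]
explicitVersion =
  [ L (1, 7) (ITother "where"), L (1, 13) ITocurly, L (1, 15) (ITother "x")
  , L (1, 17) (ITother "="), L (1, 19) (ITother "1"), L (1, 21) ITccurly ]
changedVersion =
  [ L (1, 7) (ITother "where"), L (1, 13) ITocurly, L (1, 15) (ITother "x")
  , L (1, 17) (ITother "="), L (1, 19) (ITother "2"), L (1, 21) ITccurly ]
```

Here `sameTokens layoutVersion explicitVersion` holds because the streams differ only in positions and brace kind, while `sameTokens layoutVersion changedVersion` does not, since a literal changed.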

Chaddaï Fouché wrote:
> 2008/9/16 Mauricio:
>> Hi,
>> I would like to write a Haskell pretty-printer, using standard libraries for that. How can I check if the original and the pretty-printed versions are the same? For instance, is there a file generated by GHC at the compilation pipe that is always guaranteed to have the same MD5 hash when it comes from equivalent source?
> There is not, though I have a suggestion. Am I correct in assuming that by "equivalent source" you mean that only the formatting (and possibly {;}, as a consequence of layout) differs?
Exactly! And with comments removed, since the last time I checked, Language.Haskell.* did not preserve them.
> Then the sequence of tokens from the source ought to do the trick, as long as you delete the location information (map unLoc) and transform ITvocurly ("virtual" braces for layout-induced blocks) into ITocurly (real braces for no-layout blocks), and likewise ITvccurly into ITccurly (it's just another map). If only the formatting differs, the two token sequences should be identical.
Good idea. I think that's all that I need. I can write a hash function that filters and transforms like that.
Thanks, Maurício
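
A hash function of the kind described above might look roughly like this. It is a sketch under stated assumptions: the `Tok` type is hypothetical, the tokens are assumed to have had positions dropped and braces normalised already, and a 64-bit FNV-1a-style hash over character codes is swapped in for MD5 only to keep the example free of external packages.

```haskell
import Data.Bits (xor)
import Data.Char (ord)
import Data.List (foldl')
import Data.Word (Word64)

-- Hypothetical token type: comments are kept apart so they can be filtered.
data Tok
  = TComment String   -- a comment, ignored by the fingerprint
  | TOther String     -- any other token, kept as its text
  deriving (Eq, Show)

-- Text of every non-comment token, in order.
significant :: [Tok] -> [String]
significant ts = [s | TOther s <- ts]

-- FNV-1a-style 64-bit hash, applied to character codes instead of bytes.
fnv1a :: String -> Word64
fnv1a = foldl' step 14695981039346656037
  where step h c = (h `xor` fromIntegral (ord c)) * 1099511628211

-- Fingerprint of a token stream; one token per line avoids confusing
-- the tokens "ab" "c" with "a" "bc".
fingerprint :: [Tok] -> Word64
fingerprint = fnv1a . unlines . significant
```

Two sources whose non-comment tokens agree get the same fingerprint; the converse holds only up to hash collisions, so a mismatch is conclusive while a match is merely strong evidence.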
participants (8)
- Alfonso Acosta
- Antoine Latter
- Brandon S. Allbery KF8NH
- Chaddaï Fouché
- John Van Enk
- Mauricio
- Philip Weaver
- Svein Ove Aas