darcs patch: Figure out timezone offset from timezone name

Sat Feb 2 12:33:17 CET 2008 David Leuschner

On Sat, Feb 02, 2008 at 11:36:33AM -0800, Donald Bruce Stewart wrote:
david:
Sat Feb 2 12:33:17 CET 2008 David Leuschner
* Figure out timezone offset from timezone name
This should be forwarded to the time library maintainer, Bjorn Bringert.
It's Ashley Yakeley

On Sat, 2008-02-02 at 11:36 -0800, Don Stewart wrote:
david:
Sat Feb 2 12:33:17 CET 2008 David Leuschner
* Figure out timezone offset from timezone name
This should be forwarded to the time library maintainer, Bjorn Bringert.
If that is the case then he should probably update the email address that darcs uses:
http://darcs.haskell.org/packages/time/_darcs/prefs/email
It is currently set to this list.
Duncan

On Feb 2, 2008 8:49 PM, Duncan Coutts wrote:
On Sat, 2008-02-02 at 11:36 -0800, Don Stewart wrote:
david:
Sat Feb 2 12:33:17 CET 2008 David Leuschner
* Figure out timezone offset from timezone name
This should be forwarded to the time library maintainer, Bjorn Bringert.
If that is the case then he should probably update the email address that darcs uses:
http://darcs.haskell.org/packages/time/_darcs/prefs/email
It is currently set to this list.
Duncan
I also thought that Ashley Yakeley was the maintainer. I have written the code that this patch concerns though. The patch seems nice, but there are some problems:
- Time zone information is not static.
- Time zone names are sometimes ambiguous.
Shouldn't we really use the OS time zone database for this?
/Björn

On Feb 2, 2008 9:17 PM, Bjorn Bringert wrote:
On Feb 2, 2008 8:49 PM, Duncan Coutts wrote:
On Sat, 2008-02-02 at 11:36 -0800, Don Stewart wrote:
david:
Sat Feb 2 12:33:17 CET 2008 David Leuschner
* Figure out timezone offset from timezone name
This should be forwarded to the time library maintainer, Bjorn Bringert.
If that is the case then he should probably update the email address that darcs uses:
http://darcs.haskell.org/packages/time/_darcs/prefs/email
It is currently set to this list.
Duncan
I also thought that Ashley Yakeley was the maintainer. I have written the code that this patch concerns though. The patch seems nice, but there are some problems:
- Time zone information is not static.
- Time zone names are sometimes ambiguous.
Shouldn't we really use the OS time zone database for this?
David and I have discussed this, and it seems like this hard-coded timezone database is in any case better than the current incorrect behavior. In the majority of cases it will do the right thing, instead of being almost always incorrect as it is now. I have pushed the patch now, but there are some improvements that could be made:
- There are no test cases that test this. The Arbitrary TimeZone instance in test/TestParseTime.hs doesn't generate time zone names, so time zone name handling doesn't get tested.
- The timezone table is rather long, and uses plain 'lookup'. OTOH, introducing a dependency on the 'containers' package seems excessive just to do faster lookup.
/Bjorn
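For illustration only, here is a minimal sketch of the trade-off in the second point. The table contents and the names zoneTable, offsetByList, zoneMap and offsetByMap are made up for this example and are not taken from the patch; only Prelude's 'lookup' and Data.Map from 'containers' are real.

```haskell
import qualified Data.Map as Map

-- Hypothetical excerpt of a name table: (abbreviation, offset from UTC in minutes).
zoneTable :: [(String, Int)]
zoneTable =
  [ ("UT",     0)
  , ("GMT",    0)
  , ("CET",   60)
  , ("CEST", 120)
  , ("EST", -300)
  ]

-- Linear scan with Prelude's 'lookup': O(n) per query, no extra dependency.
offsetByList :: String -> Maybe Int
offsetByList name = lookup name zoneTable

-- The same table as a balanced tree: O(log n) per query,
-- at the cost of depending on the 'containers' package.
zoneMap :: Map.Map String Int
zoneMap = Map.fromList zoneTable

offsetByMap :: String -> Maybe Int
offsetByMap name = Map.lookup name zoneMap
```

Both functions give the same answers; the only difference is lookup cost versus an additional package dependency.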

Bjorn Bringert wrote:
- The timezone table is rather long, and uses plain 'lookup'. OTOH, introducing a dependency on the 'containers' package seems excessive just to do faster lookup.
it doesn't seem excessive to me -- one of the main things 'containers' implements is fast(ish) lookup. But you mean the way it's looking up amongst a fixed/hard-coded set of keys?
-Isaac

Bjorn Bringert wrote:
David and I have discussed this, and it seems like this hard-coded timezone database is in any case better than the current incorrect behavior. In the majority of cases it will do the right thing, instead of being almost always incorrect as it is now. I have pushed the patch now, but there are some improvements that could be made:
I think this is wrong. Time-zones change quite frequently, and now any program compiled with 'time' will be incorrect the next time some government changes its zone. It's for a similar reason that no leap-second table is included in the package. I always prefer to provide no information rather than misinformation.
That said, this file is Bjorn's part of the package.
Is there an alternative approach? Is there some way we can query the tz database? I looked into this earlier but didn't find an obvious way.
-- Ashley Yakeley

David and I have discussed this, and it seems like this hard-coded timezone database is in any case better than the current incorrect behavior. In the majority of cases it will do the right thing, instead of being almost always incorrect as it is now. I have pushed the patch now, but there are some improvements that could be made:
I think this is wrong. Time-zones change quite frequently, and now any program compiled with 'time' will be incorrect the next time some government changes its zone.
Which time zone is used at what time indeed changes frequently, but I am not aware of any case where the meaning of a time zone _name_ changed. CEST always refers to +0200, although the periods during which this time zone is used change. The patch doesn't try to figure out which time zone should be used for a certain place at a certain time. It only maps the names that are typically used to refer to time zones to their time offsets.
Is there an alternative approach? Is there some way we can query the tz database? I looked into this earlier but didn't find an obvious way.
Querying the time zone database doesn't help in this case. It only provides information about which time zone was (or should be) used at a certain place at a certain time. This patch changes only the way time strings that include a time zone name are parsed. Ambiguity really is a problem (and for that reason it would be best not to use time zone names at all). Quoting from a mail to Björn:
Ambiguity of names really is a problem because PST can refer to "Pacific Standard Time", which is -08:00, as well as "Pakistan Standard Time", which is +05:00. But even here detailed time zone information doesn't help, as the meaning doesn't depend on where the computer doing the decoding is located but on where the formatted timestamp originated (and in which context it appeared; for example, RFC822 clearly indicates that PST should mean -08:00). To accommodate this, all parsing functions should probably return lists of valid interpretations, but that would break the library interface.
At the moment the library doesn't even indicate that it could not interpret the time zone and silently returns nonsense like
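As a rough sketch of what "lists of valid interpretations" could look like (this is not the time library's API; ambiguousTable and interpretations are hypothetical names, and the table is a tiny made-up excerpt), using the real TimeZone record from Data.Time.LocalTime:

```haskell
import Data.Time.LocalTime (TimeZone(..))

-- Hypothetical excerpt; an abbreviation may appear more than once.
ambiguousTable :: [(String, TimeZone)]
ambiguousTable =
  [ ("PST",  TimeZone (-480) False "PST")   -- Pacific Standard Time, UTC-08:00
  , ("PST",  TimeZone 300    False "PST")   -- Pakistan Standard Time, UTC+05:00
  , ("CEST", TimeZone 120    True  "CEST")  -- Central European Summer Time
  ]

-- Every interpretation of an abbreviation, in table order.
-- An empty list means the name is unknown.
interpretations :: String -> [TimeZone]
interpretations name = [tz | (n, tz) <- ambiguousTable, n == name]
```

With this shape the caller, who knows where the timestamp came from, picks among the candidates instead of the parser guessing.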
TimeZone 0 False "CEST"
which is just wrong because "CEST" has an offset of 120 minutes.
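To make the complaint concrete, a small sketch of a lookup that reports failure instead of silently returning offset 0. The names knownZones and zoneByName are hypothetical and the table is a made-up excerpt; only the TimeZone record itself comes from Data.Time.LocalTime.

```haskell
import Data.Time.LocalTime (TimeZone(..))

-- Hypothetical excerpt of a name table.
knownZones :: [(String, TimeZone)]
knownZones =
  [ ("CET",  TimeZone 60  False "CET")
  , ("CEST", TimeZone 120 True  "CEST")
  ]

-- Returns Nothing for an unknown name rather than inventing
-- a zone with offset 0 that merely carries the name along.
zoneByName :: String -> Maybe TimeZone
zoneByName name = lookup name knownZones
```

Here "CEST" resolves to a zone with timeZoneMinutes = 120, and an unrecognised name yields Nothing, so the caller can tell that the zone was not understood.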
As a final note, PostgreSQL has excellent time and time zone handling and uses exactly this approach. They deal with ambiguity by allowing the database administrator to use localised mapping tables, see
http://www.postgresql.org/docs/8.1/static/datetime-keywords.html
Cheers, David
--
David Leuschner Meisenweg 7 79211 Denzlingen Tel. 07666/912466

Which time zone is used at what time indeed changes frequently but I am not aware of any case where the meaning of a time zone _name_ changed.
What's your source for time-zone names?
I just realised I forgot to add a comment in the patch. It's the table from the PostgreSQL manual
http://www.postgresql.org/docs/8.1/static/datetime-keywords.html
Cheers, David
--
David Leuschner Meisenweg 7 79211 Denzlingen Tel. 07666/912466

On Tue, 2008-02-05 at 09:23 +0100, David Leuschner wrote:
Which time zone is used at what time indeed changes frequently but I am not aware of any case where the meaning of a time zone _name_ changed.
What's your source for time-zone names?
I just realised I forgot to add a comment in the patch. It's the table from the PostgreSQL manual
http://www.postgresql.org/docs/8.1/static/datetime-keywords.html
OK, we should certainly include a link in the source. But why this particular set? Where did PostgreSQL get them from?
I Googled and found this:
http://www.worldtimezone.com/wtz-names/timezonenames.html
http://www.worldtimezone.com/wtz-names/wtz-vet.html
Venezuela Time certainly has changed offset. The question is, is "VET" somehow less standard as an abbreviation than the ones in the PostgreSQL list? Alternatively, are the ones listed less liable to change?
-- Ashley Yakeley

I just realised I forgot to add a comment in the patch. It's the table from the PostgreSQL manual
http://www.postgresql.org/docs/8.1/static/datetime-keywords.html
OK, we should certainly include a link in the source. But why this particular set? Where did PostgreSQL get them from?
I Googled and found this: http://www.worldtimezone.com/wtz-names/timezonenames.html http://www.worldtimezone.com/wtz-names/wtz-vet.html
Venezuela Time certainly has changed offset. The question is, is "VET" somehow less standard as an abbreviation than the ones in the PostgreSQL list?
The meaning of some abbreviations is specified in RFCs (for example, RFC822 defines EST, EDT, CST, CDT, MST, MDT, PST, PDT, GMT and UT). I think the important point is that the patch doesn't break anything. Previously the time module returned wrong time offsets for EVERY timezone name except UT and GMT. The patch improves on this by returning a time zone offset that is most probably correct. Due to ambiguity and the evidently very rare changes in the meaning of time zone names (the ones in the list probably won't be affected anyway), the new code might also return a wrong value. But in all those cases the old code would have returned a wrong value anyway. I think it's better to get it right most of the time than to always return the wrong value. It would be even better if we returned lists of possible interpretations and let the user decide which interpretation to choose. But that would require a major API change.
Cheers, David
--
David Leuschner Meisenweg 7 79211 Denzlingen Tel. 07666/912466
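For reference, the named zones that RFC 822 defines can be written down as a small table. This is only a sketch: rfc822Zones and rfc822Offset are hypothetical names, not part of the patch or the time library, and the single-letter military zones and numeric offsets that the RFC also allows are left out.

```haskell
-- Named zones from RFC 822, as (abbreviation, offset from UTC in minutes).
rfc822Zones :: [(String, Int)]
rfc822Zones =
  [ ("UT",     0), ("GMT",    0)
  , ("EST", -300), ("EDT", -240)  -- Eastern:  -5 / -4 hours
  , ("CST", -360), ("CDT", -300)  -- Central:  -6 / -5 hours
  , ("MST", -420), ("MDT", -360)  -- Mountain: -7 / -6 hours
  , ("PST", -480), ("PDT", -420)  -- Pacific:  -8 / -7 hours
  ]

-- Offset in minutes for an RFC 822 zone name; Nothing if the RFC
-- does not define that name.
rfc822Offset :: String -> Maybe Int
rfc822Offset name = lookup name rfc822Zones
```

Within a context that is known to follow RFC 822 (mail headers, for instance), a table like this removes the ambiguity for these ten names.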
participants (7)
- Ashley Yakeley
- Bjorn Bringert
- David Leuschner
- Don Stewart
- Duncan Coutts
- Ian Lynagh
- Isaac Dupree