Comments (5)
It's not limited to parse():
>>> import pendulum
>>> pendulum.create(2016, 11, 12, 2, 9, 39, 594000, 'America/Panama').microsecond
593999
After checking, it seems that the wrong value comes from the C extension that calculates offsets. If you install pendulum without the C extensions, it works properly:
PENDULUM_EXTENSIONS=0 pip install pendulum
from pendulum.
It seems the difference between the Python and C versions is that the Python version rounds the value before casting to int, which the C version does not.
C
microsecond = (int64_t) (unix_time * 1000000) % 1000000;
if (microsecond < 0) {
microsecond += 1000000;
}
Python
microsecond = int(round(unix_time % 1, 6) * 1e6)
Using the 594000 microseconds from above, we can see that the modulo result varies because of the way floats are handled:
>>> 1.594000 % 1
0.5940000000000001
>>> 123.594000 % 1
0.5939999999999941
Without the round() call, the Python version would behave like the C version:
>>> int((123.594000 % 1 ) * 1e6)
593999
>>> int(round(123.594000 % 1, 6) * 1e6)
594000
As @dokai pointed out, it's a floating-point truncation error, and I think it is similar to what I described in #71. Always rounding instead of truncating would work around the issue, but I think Pendulum should never use floating-point numbers internally unless the result itself has to be a float.
In this case, it goes back to the timezone _normalize method (pendulum.tz.Timezone._normalize), which calculates the unix timestamp as:
unix_time = tr.unix_time - (tr.pre_time - dt).total_seconds()
unix_time = tr.unix_time + (dt - tr.time).total_seconds()
The total_seconds method returns a floating-point number. Every call to delta.total_seconds() internal to Pendulum should be replaced by delta.days * 86400 + delta.seconds, with the microseconds part handled elsewhere (an extra local_time parameter in both the C and Python implementations). This way, nothing in between is a float.
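The integer-only computation suggested above could look roughly like the sketch below. Note that `split_delta` is a hypothetical helper name, not part of Pendulum, and the transition values are made up for illustration.

```python
from datetime import datetime, timedelta


def split_delta(delta):
    """Return (whole_seconds, microseconds) of a timedelta as integers.

    Hypothetical helper, not part of Pendulum.
    """
    # timedelta normalizes its fields so that 0 <= seconds < 86400 and
    # 0 <= microseconds < 1000000, even for negative deltas.
    return delta.days * 86400 + delta.seconds, delta.microseconds


# Made-up transition values, for illustration only.
tr_time = datetime(1908, 4, 22, 5, 19, 36)
tr_unix_time = -1946918424

dt = datetime(2016, 11, 12, 2, 9, 39, 594000)
seconds, microsecond = split_delta(dt - tr_time)
unix_time = tr_unix_time + seconds  # stays an int, no float involved

print(type(unix_time).__name__, microsecond)
```

Because the microseconds never pass through a float, they come back exactly as they went in.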
The given example:
>>> import pendulum
>>> from datetime import datetime
>>> dt = datetime(2016, 11, 12, 2, 9, 39, 594000)
>>> tz = pendulum.timezone("America/Panama")
>>> tr = tz.transitions[-1]
This method would be internally called on creation:
>>> tz._normalize(dt, "post")
(2016, 11, 12, 2, 9, 39, 593999, <TimezoneInfo [America/Panama, -18000, False]>)
And it does this:
>>> unix_time = tr.unix_time + (dt - tr.time).total_seconds()
>>> offset = tz._tzinfos[tr._transition_type_index].offset
>>> pendulum._extensions._helpers.local_time(unix_time, offset)
(2016, 11, 12, 2, 9, 39, 593999)
But for very high year values, the microsecond is simply lost (i.e., the unix_time value itself isn't valid), regardless of the Python/C implementation:
>>> dt = datetime(2316, 11, 12, 2, 9, 39, 857)
>>> unix_time = tr.unix_time + (dt - tr.time).total_seconds()
>>> "%.18f" % unix_time # Rounding/truncating wouldn't be enough
'10945955379.000856399536132812'
>>> dt = datetime(2222, 11, 12, 2, 9, 39, 1454)
>>> unix_time = tr.unix_time + (dt - tr.time).total_seconds()
>>> "%.18f" % unix_time # Neither rounding nor truncating would do it, again
'7979584179.001453399658203125'
>>> dt = datetime(2180, 7, 4, 13, 16, 8, 12)
>>> unix_time = tr.unix_time + (dt - tr.time).total_seconds()
>>> "%.18f" % unix_time
'6643016168.000011444091796875'
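A short sketch of why neither rounding nor truncating can be relied on here: it prints the gap between adjacent doubles at timestamps of these magnitudes. It assumes Python 3.9+ for `math.ulp`, and the timestamps are the whole-second parts of values shown in this thread.

```python
import math

# Whole-second unix times for the years 2016, 2180 and 2316.
timestamps = (1478934579.0, 6643016168.0, 10945955379.0)

# math.ulp (Python 3.9+) returns the gap to the next representable float;
# multiply by 1e6 to express that gap in microseconds.
spacings = {t: math.ulp(t) * 1e6 for t in timestamps}

for t, spacing_us in spacings.items():
    print("%.0f: adjacent floats are %.3f microseconds apart" % (t, spacing_us))
```

Once the spacing approaches or exceeds one microsecond, there is little or no room left for rounding to recover the original value, especially after intermediate arithmetic adds its own error.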
@danilobellini I agree that working with floating-point numbers is prone to errors and approximations.
I will check if I can come up with a better implementation to be sure we don't lose information.
Commit 184b94a on the develop branch fixes the issue:
>>> import pendulum
>>> dt = pendulum.parse('2016-11-12T02:09:39.594000', 'America/Panama')
>>> dt.isoformat()
'2016-11-12T02:09:39.594000-05:00'
>>> dt.microsecond
594000
>>> dt = pendulum.create(2316, 11, 12, 2, 9, 39, 857, 'America/Panama')
>>> dt.isoformat()
'2316-11-12T02:09:39.000857-05:00'
>>> dt.microsecond
857
Basically, microseconds are now treated separately to avoid having to round the value.
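As a rough illustration of treating microseconds separately (not the code from the commit), the sketch below uses a hypothetical `local_time` variant that takes whole seconds as integers and carries the microsecond through untouched. The function name, signature and use of `timedelta` are assumptions made for this example.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def local_time(unix_time, utc_offset, microsecond):
    """Hypothetical integer-only local-time helper (not Pendulum's code).

    unix_time and utc_offset are whole seconds (ints); microsecond is
    passed through as-is instead of being derived from a float.
    """
    local = EPOCH + timedelta(seconds=unix_time + utc_offset)
    return (local.year, local.month, local.day,
            local.hour, local.minute, local.second, microsecond)


# 2316-11-12T02:09:39-05:00 expressed as whole seconds since the epoch.
result = local_time(10945955379, -18000, 857)
print(result)  # (2316, 11, 12, 2, 9, 39, 857)
```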