Comments (16)
The actual format is part of the standard (https://tools.ietf.org/html/rfc7049#section-2.4.1), so there's probably no wiggle room to change it to the nicer `UTCTime` day/time format. I tried writing a faster encoder, but the time conversions are slow. I will probably have to newtype with a custom serialise :/
from cborg.
You could newtype it and use the same serialisation format as the `binary` package?
It'd be interesting to see where you got your `Binary` instances from. I imagine you took an orphan (or newtyped it), something like this I assume?

    instance Binary UTCTime where
      put (UTCTime a b) = put a >> put b
      get = UTCTime <$> get <*> get

    instance Binary Day where
      put (ModifiedJulianDay d) = put d
      get = ModifiedJulianDay <$> get

    instance Binary DiffTime where
      put = put . fromEnum
      get = toEnum <$> get
    ...

or whatever? I can definitely believe that the time formatting functions are simply slow no matter what, since they have to roundtrip through `String` and probably a trillion other things.
Second, if you can file a synthetic benchmark or something with our `criterion` setup, that'd be really nice! It'd be interesting to investigate this (it's a bit hard to say exactly what could be improved without looking at a profile).
If this is an insurmountable problem, it would probably be OK to provide another function like `reallyFastUTCEncoding :: UTCTime -> Encoding` that does something similar, and just mention very prominently in the tutorial and documentation that any time a `UTCTime` is needed, you should carefully consider which one you need. This is a really horrible workaround, though, and it gets very unusable if you ever want some function overloaded on `Serialise` (you can't just call `encode`, so you'd have to take an `Encoding` instead or something). FWIW: a literal translation of any `binary` instance should always be faster, basically without exception, but this is a case where one of our instances has to be different from upstream for the stated behavior.
That said, providing the canonical instance is, IMO, correct (we don't want orphans for this). It sucks that it's so slow, though.
@thoughtpolice - yea, that's pretty close: https://hackage.haskell.org/package/binary-orphans-0.1.4.0/docs/src/Data.Binary.Orphans.html
I'm trying to add a `UTCTime` benchmark but can't get the benchmark to compile. GHC (7.10.3) has been stuck for over an hour now compiling `PkgAesonGeneric` - have you seen this?

    [16 of 28] Compiling Macro.PkgAesonGeneric ( bench/Macro/PkgAesonGeneric.hs, dist/build/vs-other-libs/vs-other-libs-tmp/Macro/PkgAesonGeneric.o )
    23247 me 20 0 4878848 3.542g 22980 S 310.3 46.1 15:06.19 ghc
Removed that benchmark for now while testing. See #52. I'm totally unclear on whether that benchmark actually does what I want (with `force`) ..
The benchmark compilation performance is a known problem - see #33.
I'll take a closer look at #52. Ugh, I need to get our 32 bit support online...
Oh yes, and Generic might be especially bad due to some GHC bugs. I'll test for a bit.
Sigh. It's unfortunate that it's so slow. So we have two choices:
- use the standard CBOR representation for greater compatibility (e.g. cbor2json tools will know how to render them) and try to make the parser/printer faster, or
- go for a different representation in terms of the day + time of day.
Of course we only have to handle the one standard date/time format, unlike the `time` lib with its general format string interpretation.
I suspect this is a problem for our application as well, which serializes large datasets full of `UTCTime`s. Would it be worth working on a higher-performance RFC 3339 parser/printer (if such a thing doesn't exist already)? That might be a nice little project that could be useful in other contexts.
Alternatively, I notice that the CBOR standard permits POSIX-style timestamps as well (tag 1). Though those don't seem to be supported by the `Serialise UTCTime` instance...
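For illustration only, here is a minimal sketch of the value a tag 1 (epoch-based) timestamp would carry, using only the `time` package. The names `toEpochSeconds` and `fromEpochSeconds` are hypothetical and not part of cborg or serialise, and no CBOR encoding is performed here. Note the trade-off: a `Double` payload cannot represent the full picosecond resolution of `UTCTime`.

```haskell
import Data.Time.Calendar (fromGregorian)
import Data.Time.Clock (UTCTime (..), picosecondsToDiffTime)
import Data.Time.Clock.POSIX (posixSecondsToUTCTime, utcTimeToPOSIXSeconds)

-- | Seconds since 1970-01-01T00:00:00Z as a Double (hypothetical helper).
-- This is the kind of number a CBOR tag 1 item would wrap; sub-second
-- precision is limited by what a Double can hold.
toEpochSeconds :: UTCTime -> Double
toEpochSeconds = realToFrac . utcTimeToPOSIXSeconds

-- | Inverse of 'toEpochSeconds', up to Double rounding (hypothetical helper).
fromEpochSeconds :: Double -> UTCTime
fromEpochSeconds = posixSecondsToUTCTime . realToFrac

main :: IO ()
main = do
  let t = UTCTime (fromGregorian 2017 1 1) (picosecondsToDiffTime 0)
  print (toEpochSeconds t)
  -- Checks whether a whole-second timestamp survives the round trip.
  print (fromEpochSeconds (toEpochSeconds t) == t)
```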
Looks like `aeson` might have some useful prior art.
Right, so Aeson uses ISO-8601 for encoding `UTCTime`, which is also exactly what CBOR uses. The writer bit should be pretty easy to steal, since it already uses a `Builder` to write out the encoded format. But the parser will be trickier, as we'll have to eliminate the `attoparsec` dependency, I think. But the initial results of reusing that code seem promising at a glance.
We are going to move to a new, more reasonable encoding for the `UTCTime` instance for the release. However, this can be done in a backwards compatible way as it is tagged.
One possible encoding is two integers: days since the 0 epoch and picoseconds.
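As a rough sketch of the two-integer idea, using only the `time` package: the names `toDayPico` and `fromDayPico` are hypothetical, and the day count here is assumed to be the Modified Julian Day number that `time` uses internally, which may not be the epoch intended above. The actual encoding that landed in the library is not shown.

```haskell
import Data.Time.Calendar (Day (..), fromGregorian)
import Data.Time.Clock (UTCTime (..), diffTimeToPicoseconds, picosecondsToDiffTime)

-- | Split a UTCTime into (day number, picoseconds since midnight).
-- Assumption: the day number is the Modified Julian Day from the time
-- package; no string formatting or parsing is involved.
toDayPico :: UTCTime -> (Integer, Integer)
toDayPico (UTCTime day dt) = (toModifiedJulianDay day, diffTimeToPicoseconds dt)

-- | Rebuild the UTCTime from the two integers.
fromDayPico :: (Integer, Integer) -> UTCTime
fromDayPico (d, ps) = UTCTime (ModifiedJulianDay d) (picosecondsToDiffTime ps)

main :: IO ()
main = do
  let t = UTCTime (fromGregorian 2017 1 1) (picosecondsToDiffTime 123456789)
  print (toDayPico t)
  -- The conversion is lossless, so this compares equal.
  print (fromDayPico (toDayPico t) == t)
```

Both integers would then be written with ordinary integer encoders, which avoids the `String` round trip that makes the RFC 3339 path slow.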
It looks like this will go into 0.2 regardless at this rate, so I'm queuing this for the initial release.
I have submitted an IANA request for the new tags, but it will take some time for the process to take its course. I don't think we should hold the release for this.
Fixed by d093bb6.
Errr, actually fixed by cad4c72! (Ben accidentally force pushed over the last commit)