tinylib / msgp
A Go code generator for MessagePack / msgpack.org[Go]
License: MIT License
When one simply wants to create a set of custom types to be used as a field of type interface{}, you really just want to register a type number with a normal struct.
Currently, you must add four boilerplate functions to satisfy the Extension interface.
Is there a reason for making the extension interface not use the same function names as (Un)Marshaller and add the additional ExtensionType function?
Example code I have copy/pasted (and renamed struct) in my code:
func (i *Identity) ExtensionType() int8 {
	return IdentityT
}
func (i *Identity) Len() int {
	return i.Msgsize()
}
func (i *Identity) MarshalBinaryTo(b []byte) error {
	_, err := i.MarshalMsg(b)
	return err
}
func (i *Identity) UnmarshalBinary(b []byte) error {
	_, err := i.UnmarshalMsg(b)
	return err
}
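One way to avoid repeating those four functions per type is a single adapter, sketched below. Everything here is an assumption made for illustration: the msgCodec interface and its signatures are inferred from how the snippet above calls the generated methods, ext is a hypothetical wrapper, and the toy Identity methods stand in for generated code. Passing b[:0] is a choice of this sketch so that an appending MarshalMsg fills the caller's buffer in place.

```go
package main

import "fmt"

// msgCodec lists the generated methods the snippet above relies on.
// The signatures are assumptions inferred from how the snippet calls them.
type msgCodec interface {
	Msgsize() int
	MarshalMsg(b []byte) ([]byte, error)
	UnmarshalMsg(b []byte) ([]byte, error)
}

// ext is a hypothetical adapter that supplies the four extension methods
// once, for any value that already has the generated methods.
type ext struct {
	typ int8
	v   msgCodec
}

func (e ext) ExtensionType() int8 { return e.typ }
func (e ext) Len() int            { return e.v.Msgsize() }

// MarshalBinaryTo passes b[:0] so an appending MarshalMsg fills b in place.
func (e ext) MarshalBinaryTo(b []byte) error {
	_, err := e.v.MarshalMsg(b[:0])
	return err
}

func (e ext) UnmarshalBinary(b []byte) error {
	_, err := e.v.UnmarshalMsg(b)
	return err
}

// Identity is a toy type; its methods stand in for generated code.
type Identity struct{ Name string }

func (i *Identity) Msgsize() int { return len(i.Name) }

func (i *Identity) MarshalMsg(b []byte) ([]byte, error) {
	return append(b, i.Name...), nil
}

func (i *Identity) UnmarshalMsg(b []byte) ([]byte, error) {
	i.Name = string(b)
	return nil, nil
}

func main() {
	src := ext{typ: 5, v: &Identity{Name: "alice"}}
	buf := make([]byte, src.Len())
	if err := src.MarshalBinaryTo(buf); err != nil {
		panic(err)
	}
	dst := &Identity{}
	if err := (ext{typ: 5, v: dst}).UnmarshalBinary(buf); err != nil {
		panic(err)
	}
	fmt.Println(src.ExtensionType(), dst.Name)
}
```

The adapter costs one extra interface call per method, which may or may not matter for a given workload.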
There should probably be a Reset() method, because by default ReadFrom doesn't zero out fields before reading.
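A small sketch of why a Reset() matters when a value is reused across reads. It uses encoding/json purely as a stand-in decoder (an assumption of this example, not msgp code): like the behaviour described above, json.Unmarshal leaves fields that are absent from the input untouched. The record type and decodeReused helper are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type record struct {
	Name string   `json:"name"`
	Tags []string `json:"tags"`
}

// Reset zeroes the receiver so that a reused value carries nothing over.
func (r *record) Reset() { *r = record{} }

// decodeReused decodes two messages into the same value and reports how many
// tags remain after the second one; reset selects whether Reset is called.
func decodeReused(reset bool) int {
	var r record
	if err := json.Unmarshal([]byte(`{"name":"a","tags":["x","y"]}`), &r); err != nil {
		panic(err)
	}
	if reset {
		r.Reset()
	}
	// The second message has no "tags" key at all.
	if err := json.Unmarshal([]byte(`{"name":"b"}`), &r); err != nil {
		panic(err)
	}
	return len(r.Tags)
}

func main() {
	fmt.Println(decodeReused(false)) // stale tags from the first message survive
	fmt.Println(decodeReused(true))  // Reset cleared them
}
```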
(*Writer).WriteString and (*Writer).WriteByte collide with io.ByteWriter and io.StringWriter.
Right now there's no documentation on the "rules" for encoding/decoding. They should be explicit.
There is a lot of work done inside the templates that is more easily expressed in Go.
This would (will) be a major overhaul of /gen.
$ cat x.go
package x
//go:generate msgp -io=false
type X int
$ go generate
...
>>> Formatted "x_gen.go"
>>> Done.
$ go build
# _/tmp
./x_gen.go:13: z.Msgsize undefined (type X has no field or method Msgsize)
If you rename DecodeMsg and EncodeMsg to ReadFrom and WriteTo and return (int64, error) you will be satisfying io.ReaderFrom and io.WriterTo.
At the moment msgp seems to ignore unexported types and does not generate any code for them.
More often than not, I just want to parse some msgpack data and not export the type parsed into. Is there a reason why msgp ignores unexported types?
Dropping the ast.FileExports calls in parse/getast.go enables processing of unexported types and so far I haven't hit any ill effects.
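For reference, this is what ast.FileExports does to a parsed file, shown with the standard library only. The source string and the typeNames helper are made up for this sketch; it does not reproduce the code in parse/getast.go.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

const src = `package x

type Public struct{ A int }
type private struct{ B int }
`

// typeNames returns the names of the type declarations in src, optionally
// after trimming the file with ast.FileExports.
func typeNames(exportsOnly bool) []string {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "x.go", src, 0)
	if err != nil {
		panic(err)
	}
	if exportsOnly {
		// Removes unexported top-level declarations in place.
		ast.FileExports(f)
	}
	var names []string
	for _, d := range f.Decls {
		gd, ok := d.(*ast.GenDecl)
		if !ok {
			continue
		}
		for _, s := range gd.Specs {
			if ts, ok := s.(*ast.TypeSpec); ok {
				names = append(names, ts.Name.Name)
			}
		}
	}
	return names
}

func main() {
	fmt.Println(typeNames(false)) // both declarations are visible
	fmt.Println(typeNames(true))  // only the exported one is left
}
```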
As I am rather new to golang, it would be nice to have the complete wiki example as compilable code in the repository, including the msgpack data and the JSON output.
Kind regards,
Jerry
The shim funcs to encode/decode third party types work beautifully, and to encode/decode slices of structs including these third party types I just have to define explicit slice types.
These slice types usually use the struct type encode/decode funcs to encode/decode their elements, but if the struct type is small enough (which I guess is the reason) the slice type will just inline the field encoding/decoding.
In these cases, the shim methods are not used for the encoding/decoding, and the code won't compile.
go vet beeps on the generated code
xxx_gen.go:400: github.com/philhofer/msgp/enc.ArrayError composite literal uses unkeyed fields
exit status 1
if asz != 2 {
	err = enc.ArrayError{asz, 2}
	return
}
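The vet warning goes away when the composite literal names its fields. A minimal sketch follows; ArrayError here is a local stand-in, and the field names Wanted and Got (and which value belongs in which) are assumptions, not taken from the enc package. Note that vet only reports unkeyed literals for struct types imported from another package, so this local type would not trigger it either way.

```go
package main

import "fmt"

// ArrayError is a local stand-in for the error type in the report above.
// The field names are assumptions made for this sketch.
type ArrayError struct {
	Wanted uint32
	Got    uint32
}

func (e ArrayError) Error() string {
	return fmt.Sprintf("wanted array of size %d; got %d", e.Wanted, e.Got)
}

// checkSize mirrors the generated check, but with a keyed literal.
func checkSize(asz uint32) error {
	if asz != 2 {
		return ArrayError{Wanted: 2, Got: asz}
	}
	return nil
}

func main() {
	fmt.Println(checkSize(2))
	fmt.Println(checkSize(3))
}
```

Keyed fields also keep the generated code compiling if the struct ever gains or reorders fields.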
Thanks for your hard work.
I found a 404 on a doc page.
At the bottom of https://github.com/tinylib/msgp/wiki/Using-the-Code-Generator there is a link to Godoc which goes to http://godoc.org/github.com/philhofer/msgp/msgp and returns a 404.
Some third party packages (e.g. https://cloud.google.com/appengine/docs/go/datastore/) use proprietary types (e.g. https://cloud.google.com/appengine/docs/go/datastore/reference#Key) in ways that don't allow developers to transparently replace them with other types (see https://cloud.google.com/appengine/docs/go/datastore/reference where *datastore.Key has a fairly prominent role, as opposed to a simple type Key interface {...} which I think would have been preferable).
Sometimes these types (like datastore.Key) don't have any exported fields, and are thus usually impossible to serialize with encoders that follow the practice of only encoding exported fields.
To still allow encoding they usually implement http://golang.org/pkg/encoding/#BinaryMarshaler or (in this case) http://golang.org/pkg/encoding/gob/#GobEncoder.
If msgp honoured the above interfaces and used them to encode these types, msgp could be used as a drop-in replacement for encoding/gob in scenarios where you want to serialize your App Engine entities (e.g. when implementing https://cloud.google.com/appengine/docs/go/memcache/reference#Codec).
Currently only map[string]string and map[string]interface{} are supported.
We can, in theory, support any map[string]IDENT type, where IDENT is any gen.BaseElem. The code will have to be generated explicitly, as opposed to the existing static implementations.
We can't use testing/quick to populate structs for testing and benchmarking because of go issue #8818.
I'm open to short-term solutions to this problem until the issue is fixed (go1.5).
The generated code for fixed-size byte arrays fails to compile. For example:
$ cat types.go
//go:generate msgp -file types.go
package foo
type X struct {
	Items [32]byte `msg:"items"`
}
$ go generate
[...]
$ go build
[...]
./types_gen.go:62: dc.ReadByte undefined (type *msgp.Reader has no field or method ReadByte)
./types_gen.go:133: undefined: msgp.ByteSize
While this error seems to be specific to byte and the generated code for other types compiles, the generated test-suite fails for at least string and the int variants. For example:
$ cat types.go
package foo
//go:generate msgp -file types.go
type X struct {
	Items [32]int32 `msg:"items"`
}
$ go generate
[...]
$ go test
--- FAIL: TestXMarshalUnmarshal (0.00s)
types_gen_test.go:87: 1 bytes left over after Skip(): "\x00"
FAIL
exit status 1
[...]
Hello.
I'm experiencing a problem while trying to encode a complex object.
The structure is the following:
type MyStructA struct {
	Field interface{}
}
type MyStructB struct {
	Field MyStructC
}
type MyStructC struct {
	Field string
}
I need to encode an object of type MyStructA which contains an object of type MyStructB containing an object of type MyStructC:
package main
myObj := MyStructA{
	Field: MyStructB{
		Field: MyStructC{
			Field: "string",
		},
	},
}
After I generate the file using go generate (including this message: inlining methods for MyStructC into MyStructB) and build using go build successfully with no errors or warnings, I get the following runtime error:
msgp: type "main.MyStructB" not supported
Now, if I change MyStructB to this:
type MyStructB struct {
	Field string // Changed from MyStructC to primitive type string
}
This works. Am I doing it wrong? As far as I can see in the code, it fails when checking the type of Field inside MyStructB.
We're using msgp a lot lately, and like it. Thank you!
We are marshalling and unmarshalling 40GB of data every 15-60 minutes, and have been doing a lot of this recently: https://xkcd.com/303/. Speeds are roughly 250MB/s from msgp. Which kicks ass. BUT, our disk storage runs at something like 900MB/s and we have a huge RAM cache, which makes me suspect we could get at least a 4x speedup with some extra cores thrown at this. That would be great.
I've been thinking over whether one could automatically goroutinize a large load, and I'm curious for your thoughts about it. We could reinstrument our marshal/unmarshal functions to split data into files and recombine after load, but it would be really great to shard out reading from slices leaving one file which could be read single threaded or multicore. Any thoughts on doing this?
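One possible shape for this, sketched with the standard library only: write the data as length-prefixed chunks in a single file, find the chunk boundaries in one cheap sequential pass, then decode the chunks on separate goroutines while preserving order. The framing, the helper names and the decode callback are all assumptions of this sketch; msgp itself is not involved, and a real decoder would unmarshal each chunk instead of measuring it.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"sync"
)

// frame appends one chunk to dst, preceded by a 4-byte big-endian length.
func frame(dst, chunk []byte) []byte {
	var hdr [4]byte
	binary.BigEndian.PutUint32(hdr[:], uint32(len(chunk)))
	return append(append(dst, hdr[:]...), chunk...)
}

// split walks the length prefixes and returns the chunk bodies without copying.
func split(buf []byte) ([][]byte, error) {
	var chunks [][]byte
	for len(buf) > 0 {
		if len(buf) < 4 {
			return nil, fmt.Errorf("truncated header")
		}
		n := binary.BigEndian.Uint32(buf)
		buf = buf[4:]
		if uint64(n) > uint64(len(buf)) {
			return nil, fmt.Errorf("truncated chunk")
		}
		chunks = append(chunks, buf[:n])
		buf = buf[n:]
	}
	return chunks, nil
}

// decodeAll runs decode on every chunk in its own goroutine; each goroutine
// writes only its own slot of out, so the results keep the input order.
func decodeAll(chunks [][]byte, decode func([]byte) int) []int {
	out := make([]int, len(chunks))
	var wg sync.WaitGroup
	for i, c := range chunks {
		wg.Add(1)
		go func(i int, c []byte) {
			defer wg.Done()
			out[i] = decode(c)
		}(i, c)
	}
	wg.Wait()
	return out
}

func main() {
	var file []byte
	file = frame(file, []byte("abc"))
	file = frame(file, []byte("de"))
	chunks, err := split(file)
	if err != nil {
		panic(err)
	}
	// The toy "decoder" just measures each chunk.
	fmt.Println(decodeAll(chunks, func(b []byte) int { return len(b) }))
}
```

The same file can still be read single-threaded by decoding the chunks in a plain loop. For many small chunks a bounded worker pool would likely be a better fit than one goroutine per chunk.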
The generator tree (gen.Elem) can be used as input for JSON method templates.
There's a bunch of boilerplate that has to be written, but we should be able to use at least ~60% of the existing code.
I haven't tried it, so forgive me if I'm incorrect.
gen (another code generation thingy for Go) also uses the _gen naming convention for generated files. Are these two tools going to clobber each other if e.g. I want both an Objects struct AND a msgpack serialization of Object?
Because most encoding frameworks allow encoding either values or pointers to values (even if they only allow decoding to pointers for obvious reasons), it would ease integration with msgp if the encoder funcs were bound to values instead of pointers.
As in:
func (t Type) MarshalMsg ...
vs
func (t *Type) MarshalMsg ...
Maybe you have good reasons for having pointer receivers, but since I guess most fields of any type get accessed in the encode func anyway, the copy shouldn't (without me having benchmarked this at all :) be slower than doing lots of dereferencing in a pointer receiver func.
Also, the value receiver func will work even when you have a *t, but not the reverse, so it's easier on the user :D
Does this make sense, or am I missing something?
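The method-set rule behind this request can be checked directly. In the sketch below the marshaler interface and both toy types are hypothetical; only the MarshalMsg name is borrowed from the discussion above.

```go
package main

import "fmt"

// marshaler is a local stand-in interface for this sketch.
type marshaler interface {
	MarshalMsg(b []byte) ([]byte, error)
}

// byValue declares the method on the value type.
type byValue struct{ N byte }

func (v byValue) MarshalMsg(b []byte) ([]byte, error) { return append(b, v.N), nil }

// byPointer declares the method on the pointer type.
type byPointer struct{ N byte }

func (p *byPointer) MarshalMsg(b []byte) ([]byte, error) { return append(b, p.N), nil }

// implements reports whether the dynamic value stored in v has MarshalMsg
// in its method set.
func implements(v interface{}) bool {
	_, ok := v.(marshaler)
	return ok
}

func main() {
	fmt.Println(implements(byValue{}), implements(&byValue{}))     // true true
	fmt.Println(implements(byPointer{}), implements(&byPointer{})) // false true
}
```

A value receiver is callable through both T and *T, while a pointer receiver is only in the method set of *T. The trade-off is that every call on a value receiver copies the struct.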
Hi!
I'm trying to recover a slice of a byte array (encoded msgpack).
The msgpacked string looks good and I'm able to recover a slice from the root depth, but not more.
Here is the structure:
type MyStructA struct {
	Field1 interface{}
}
type MyStructB struct {
	Field2 interface{}
}
type MyStructC struct {
	Field3 string
}
data := MyStructA{
	Field1: &MyStructB{
		Field2: &MyStructC{
			Field3: "String",
		},
	},
}
I want to isolate Field2 to decode it into a MyStructC variable to avoid dealing with a map[string]interface{} object since my fields are of type interface{}.
Using the raw content (the entire msgpack from data) I can access Field1 and decode it to a proper structure but not Field2 because msgp.Locate("Field2", raw) returns an empty byte array.
Like gogoprotobuf, we should generate tests and benchmarks automatically.
This is more of a question about desired functionality. Should msgp create *_gen.go if I don't have any structs annotated for generation or if those structs don't have any exported fields? I feel like it shouldn't because this results in a file with imports but no other code, obviously causing a compilation error for the package.
App Engine doesn't allow importing unsafe, so right now it's impossible to use this lib there :/
In #31 a nice functionality was implemented to generate shims for generic types.
Unfortunately, when I run the code now, it seems the generator generates functions with the shimmed type as receiver.
The entire point of the shim (for me, at least) was to be able to encode third party types that I can't create new functions for.
Example:
//go:generate msgp
//msgp:shim datastore.Key as:string using:utils.EncodeKey/utils.DecodeKey
package user
import "appengine/datastore"
type User struct {
	Id   *datastore.Key
	Name string
}
When I run go generate with this code, something like
// MarshalMsg implements msgp.Marshaler
func (z *datastore.Key) MarshalMsg(b []byte) (o []byte, err error) {
	o = msgp.Require(b, z.Msgsize())
	o = msgp.AppendString(o, common.EncodeKey((*z)))
	return
}
will be generated, but won't compile of course.
Things get rather difficult if we don't want to generate code for arbitrary types, but still want to support something like:
type Chunk [128]byte
type Block struct {
	Meta string
	Data Chunk
}
Right now, Chunk is interpreted as IDENT, even though we could conceivably generate the appropriate code for it in-situ. However, generating the appropriate methods for Chunk is simpler than transitively applying its type information across every location in which it is used.
TL;DR generate code for all valid *ast.TypeDecl nodes.
There's not a single example of how to use the msgp package. Not even a reference to other source for documentation.
Take the following declaration:
type MyType struct {
	Blah *Thing
}
The current implementation assumes that *Thing implements the io.WriterTo and io.ReaderFrom interfaces. In order to determine the underlying type of the identifier, we may have to do full-package identifier resolution.
Issue #45 is fixed, but the solution involves re-reading/writing the whole generated file.
A better solution (from an I/O standpoint) is to keep track of required imports in the generated file and insert them as necessary. However, this may end up being a lot of extra code to write/maintain, so a good solution should be fairly pithy.
Right now a field like [3]float64 won't work. Since the array length can be a constant, we'll have to support constant expressions along with it, e.g. [LEN]float64.
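To illustrate what the parser sees for each case, here is a small go/ast sketch. The source string and the lengthKinds helper are invented for this example and are unrelated to the generator's own code; resolving a constant such as LEN to its value would need a further step (for instance type-checking the package), which this sketch does not attempt.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

const src = `package x

const LEN = 3

type T struct {
	A [3]float64
	B [LEN]float64
	C []float64
}
`

// lengthKinds maps each field name in src to a description of its array
// length expression as it appears in the AST.
func lengthKinds() map[string]string {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "x.go", src, 0)
	if err != nil {
		panic(err)
	}
	kinds := make(map[string]string)
	ast.Inspect(f, func(n ast.Node) bool {
		field, ok := n.(*ast.Field)
		if !ok {
			return true
		}
		arr, ok := field.Type.(*ast.ArrayType)
		if !ok {
			return true
		}
		var kind string
		switch l := arr.Len.(type) {
		case nil:
			kind = "slice" // no length expression at all
		case *ast.BasicLit:
			kind = "literal " + l.Value
		case *ast.Ident:
			kind = "constant " + l.Name
		default:
			kind = "expression"
		}
		for _, name := range field.Names {
			kinds[name.Name] = kind
		}
		return true
	})
	return kinds
}

func main() {
	k := lengthKinds()
	fmt.Println(k["A"])
	fmt.Println(k["B"])
	fmt.Println(k["C"])
}
```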
msgp fails when GOPATH has multiple entries
A better way to grab the base path of the tmpl files is to start from the path of the current go file.
runtime.Caller exposes this.
Sending a pull request in a few minutes.
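A minimal sketch of the runtime.Caller approach. The sourceDir helper and the "tmpl" directory name are hypothetical. One caveat worth keeping in mind: the path is the one recorded when the binary was built, so this only helps while the sources are still at that location.

```go
package main

import (
	"fmt"
	"path/filepath"
	"runtime"
)

// sourceDir returns the directory of the source file that contains this
// function, as recorded by the compiler.
func sourceDir() (string, bool) {
	_, file, _, ok := runtime.Caller(0)
	if !ok {
		return "", false
	}
	return filepath.Dir(file), true
}

func main() {
	dir, ok := sourceDir()
	if !ok {
		panic("caller information unavailable")
	}
	// "tmpl" is a hypothetical directory name used only for illustration.
	fmt.Println(filepath.Join(dir, "tmpl"))
}
```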
We need a way for decoders to skip over fields with no corresponding element in the struct. dc.Skip() would go in the default switch case of the struct decoding template.
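The shape of such a decoder, sketched with encoding/json as a stand-in for the msgpack reader (an assumption of this example; none of this is msgp code). The default case consumes the unknown value, including any nesting, so the stream stays aligned for the next key.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

type point struct{ X, Y int }

// decodePoint reads one object key by key and skips values whose key has no
// matching field. It also returns the skipped keys for inspection.
func decodePoint(data []byte) (p point, skipped []string, err error) {
	dec := json.NewDecoder(bytes.NewReader(data))
	if _, err = dec.Token(); err != nil { // opening '{'
		return
	}
	for dec.More() {
		var tok json.Token
		if tok, err = dec.Token(); err != nil {
			return
		}
		key, _ := tok.(string)
		switch key {
		case "x":
			err = dec.Decode(&p.X)
		case "y":
			err = dec.Decode(&p.Y)
		default:
			// The equivalent of a Skip call: read the whole value and drop it.
			var discard json.RawMessage
			err = dec.Decode(&discard)
			skipped = append(skipped, key)
		}
		if err != nil {
			return
		}
	}
	_, err = dec.Token() // closing '}'
	return
}

func main() {
	p, skipped, err := decodePoint([]byte(`{"x":1,"extra":{"a":[1,2]},"y":2}`))
	fmt.Println(p, skipped, err)
}
```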
At least the decode implementation usually has a few error cases, and it would be nice if those errors could be properly propagated.
Using io.WriterTo and io.ReaderFrom was a mistake. Those implementations in the standard library read until EOF; the generated methods do not.
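The read-until-EOF convention is easy to see with bytes.Buffer, whose ReadFrom is a standard io.ReaderFrom implementation. In this sketch the two "messages" are just strings and drain is a hypothetical helper; the point is that a caller expecting ReadFrom to stop after one message would lose the second.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// drain feeds s to bytes.Buffer.ReadFrom and reports how many bytes were
// consumed and how many are left unread in the source.
func drain(s string) (int64, int) {
	stream := strings.NewReader(s)
	var buf bytes.Buffer
	n, err := buf.ReadFrom(stream) // consumes the source until EOF
	if err != nil {
		panic(err)
	}
	return n, stream.Len()
}

func main() {
	// Two 4-byte messages back to back on one stream.
	n, left := drain("msg1msg2")
	fmt.Println(n, left) // all 8 bytes are consumed, nothing is left
}
```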
Shims seem to be ignored for time.Time.
package tt
import "time"
//go:generate msgp
//msgp:shim time.Time as:string using:timetostr/strtotime
type T struct {
	T time.Time
}
func timetostr(t time.Time) string {
	return t.Format(time.RFC3339)
}
func strtotime(s string) time.Time {
	t, _ := time.Parse(time.RFC3339, s)
	return t
}
The generated code still calls dc.ReadTime, which expects a TimeExtension.
Hi!
Would it be possible to add a go generate argument to generate implementation for empty structures?
If yes, how easy is it to implement that feature?
time.Time requires an allocation for encoding, because right now we're using the MarshalBinary() method.
We should probably define a different encoding such that we can do this without allocating.
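One candidate encoding, as a rough sketch: a fixed 12 bytes holding Unix seconds (int64) and nanoseconds (int32), appended to a caller-supplied buffer so that nothing is allocated when the buffer has capacity. The layout and the function names are assumptions of this example and not a description of what msgp does. The time zone is dropped, so decoded values come back in UTC.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"time"
)

// appendTime appends 12 bytes to b: big-endian Unix seconds (8 bytes)
// followed by big-endian nanoseconds (4 bytes).
func appendTime(b []byte, t time.Time) []byte {
	var tmp [12]byte
	binary.BigEndian.PutUint64(tmp[:8], uint64(t.Unix()))
	binary.BigEndian.PutUint32(tmp[8:], uint32(t.Nanosecond()))
	return append(b, tmp[:]...)
}

// readTime decodes one value written by appendTime and returns the rest of b.
func readTime(b []byte) (time.Time, []byte, error) {
	if len(b) < 12 {
		return time.Time{}, b, fmt.Errorf("short buffer: %d bytes", len(b))
	}
	sec := int64(binary.BigEndian.Uint64(b[:8]))
	nsec := int64(binary.BigEndian.Uint32(b[8:12]))
	return time.Unix(sec, nsec).UTC(), b[12:], nil
}

// roundTrip encodes and decodes one instant and returns the decoded seconds,
// the decoded nanoseconds and the encoded length.
func roundTrip(sec, nsec int64) (int64, int64, int) {
	enc := appendTime(make([]byte, 0, 12), time.Unix(sec, nsec))
	t, _, err := readTime(enc)
	if err != nil {
		panic(err)
	}
	return t.Unix(), int64(t.Nanosecond()), len(enc)
}

func main() {
	fmt.Println(roundTrip(1420167845, 6))
}
```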
e.g. in pseudo code:
//msgp:shim string as:[]byte using:[]byte/string
//msgp:tuple MyTuple generate:DecodeMsg
type MyTuple struct {
	A, B string
}
would cause only the DecodeMsg method to be generated (but not any of the marshal-related methods or EncodeMsg).
The use case here is again MSGPACK RPC. I want to be able to generate a decoder for one struct (the args to the method) and an encoder for another (the return values). Then combine those two structs into a wrapper struct along the following lines:
//msgp:tuple MethodArgs generate:DecodeMsg
type MethodArgs struct {
	I int
}
//msgp:tuple MethodRetVals generate:EncodeMsg
type MethodRetVals struct {
	S string
}
type MethodWrapper struct {
	*MethodArgs
	*MethodRetVals
}
I can then unambiguously call DecodeMsg on an instance of the MethodWrapper.
Thoughts?
The following declaration breaks the current ast parser:
type Thing struct {
	A, B, C float64
}
We need to break up multi-name lines as multiple fields.
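In go/ast a declaration like A, B, C float64 arrives as a single *ast.Field whose Names slice has three entries, so flattening means looping over Names. A minimal sketch follows; the source string and fieldNames helper are invented for illustration. Embedded fields have an empty Names slice and would need separate handling.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

const src = `package x

type Thing struct {
	A, B, C float64
	D       string
}
`

// fieldNames returns one entry per declared name in the struct found in src,
// regardless of how the names are grouped on source lines.
func fieldNames() []string {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "x.go", src, 0)
	if err != nil {
		panic(err)
	}
	var names []string
	ast.Inspect(f, func(n ast.Node) bool {
		st, ok := n.(*ast.StructType)
		if !ok {
			return true
		}
		for _, field := range st.Fields.List {
			// One *ast.Field may declare several names of the same type.
			for _, name := range field.Names {
				names = append(names, name.Name)
			}
		}
		return false
	})
	return names
}

func main() {
	fmt.Println(fieldNames())
}
```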
If I have a struct such as:
type A struct {
	C *time.Time
}
msgp does not add a time import into the generated code. This raises a compilation error as in Unmarshal/Decode such a field has the following code generated for it:
if identifier == nil {
	identifier = new(qualified.Type)
}
I don't currently have the time to provide a patch for this, but I will when I can find some.
Firstly, great package. Thanks very much.
I've seen in the README:
Maps must have string keys
Would you be open to making this configurable in some way?
Reason being, the MSGPACK I'm consuming sends all strings as bin because the strings themselves might be encoded in a different format to UTF8.
Even though the generated code is gofmted, it can still look pretty gross, and that's mostly due to inconsistent line spacing.
+1 internets to anyone who can figure out how to make gofmt remove empty lines within function calls.
msgp seems to work pretty well and performance is good, thanks! Question: Is it possible to serialize a value whose type is defined in a different package?
//go:generate msgp -file types.go
package foo
type X struct {
	Items [][32]int `msg:"items"`
}
The generated code fails to compile due to an unused variable in Msgsize():
./types_gen.go:167: xvk declared and not used
165 func (z *X) Msgsize() (s int) {
166 s = msgp.MapHeaderSize + msgp.StringPrefixSize + 5 + msgp.ArrayHeaderSize
167 for xvk := range z.Items {
168 s += msgp.ArrayHeaderSize + (32 * (msgp.IntSize))
169 }
170 return
171 }
Line 167 should probably read for _ = range z.Items or (similar to how it is done in sizeGen.gMap) have _ = xvk added to the loop body. Something like 04c6662 would fix this, but introduces a lot of unnecessary _ = idx statements.
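For comparison, a compilable sketch of the loop with no loop variable at all. The size constants are placeholders chosen for this example, not the msgp values, and the bare for range form needs Go 1.4 or later.

```go
package main

import "fmt"

// Placeholder sizes for this sketch; they are not the msgp constants.
const (
	arrayHeaderSize = 5
	intSize         = 9
)

type X struct {
	Items [][32]int
}

// Msgsize adds a fixed cost per element without naming a loop variable,
// so there is nothing for the compiler to flag as unused.
func (z *X) Msgsize() (s int) {
	s = arrayHeaderSize
	for range z.Items {
		s += arrayHeaderSize + 32*intSize
	}
	return
}

func main() {
	x := &X{Items: make([][32]int, 2)}
	fmt.Println(x.Msgsize())
}
```

Since every element contributes the same amount here, the loop could also be replaced by a multiplication with len(z.Items).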
It could be a nice feature to be able to specify a directory to generate code for instead of just a single file - this would mean the pkg flag could also be omitted.