planetscale / vtprotobuf

A Protocol Buffers compiler that generates optimized marshaling & unmarshaling Go code for ProtoBuf APIv2

License: BSD 3-Clause "New" or "Revised" License

Go 97.14% Makefile 2.49% Shell 0.37%
go protobuf grpc codegen vitess

vtprotobuf's Introduction

vtprotobuf, the Vitess Protocol Buffers compiler

This repository provides the protoc-gen-go-vtproto plug-in for protoc, which is used by Vitess to generate optimized marshal & unmarshal code.

The code generated by this compiler is based on the optimized code generated by gogo/protobuf, although this package is not a fork of the original gogo compiler, as it has been implemented to support the new ProtoBuf APIv2 packages.

Available features

vtprotobuf is implemented as a helper plug-in that must be run alongside the upstream protoc-gen-go generator, as it generates fully-compatible auxiliary code to speed up (de)serialization of Protocol Buffer messages.

The following features can be generated:

  • size: generates a func (p *YourProto) SizeVT() int helper that behaves identically to calling proto.Size(p) on the message, except the size calculation is fully unrolled and does not use reflection. This helper function can be used directly, and it'll also be used by the marshal codegen to ensure the destination buffer is properly sized before ProtoBuf objects are marshalled to it.

  • equal: generates the following helper methods

    • func (this *YourProto) EqualVT(that *YourProto) bool: this function behaves almost identically to calling proto.Equal(this, that) on messages, except the equality calculation is fully unrolled and does not use reflection. This helper function can be used directly.

    • func (this *YourProto) EqualMessageVT(thatMsg proto.Message) bool: this function behaves like the above this.EqualVT(that), but allows comparing against arbitrary proto messages. If thatMsg is not of type *YourProto, false is returned. The uniform signature provided by this method allows accessing this method via type assertions even if the message type is not known at compile time. This allows implementing a generic func EqualVT(proto.Message, proto.Message) bool without reflection.

  • marshal: generates the following helper methods

    • func (p *YourProto) MarshalVT() ([]byte, error): this function behaves identically to calling proto.Marshal(p), except the actual marshalling has been fully unrolled and does not use reflection or allocate memory other than the destination buffer. This function simply allocates a properly sized buffer by calling SizeVT on the message and then uses MarshalToSizedBufferVT to marshal to it.

    • func (p *YourProto) MarshalToVT(data []byte) (int, error): this function can be used to marshal a message to an existing buffer. The buffer must be large enough to hold the marshalled message, otherwise this function will panic. It returns the number of bytes marshalled. This function is useful e.g. when using memory pooling to re-use serialization buffers.

    • func (p *YourProto) MarshalToSizedBufferVT(data []byte) (int, error): this function behaves like MarshalToVT but expects that the input buffer has the exact size required to hold the message, otherwise it will panic.

  • marshal_strict: generates the following helper methods

    • func (p *YourProto) MarshalVTStrict() ([]byte, error): this function behaves like MarshalVT, except fields are marshalled in strict order of the field numbers declared in the .proto file.

    • func (p *YourProto) MarshalToVTStrict(data []byte) (int, error): this function behaves like MarshalToVT, except fields are marshalled in strict order of the field numbers declared in the .proto file.

    • func (p *YourProto) MarshalToSizedBufferVTStrict(data []byte) (int, error): this function behaves like MarshalToSizedBufferVT, except fields are marshalled in strict order of the field numbers declared in the .proto file.

  • unmarshal: generates a func (p *YourProto) UnmarshalVT(data []byte) error that behaves similarly to calling proto.Unmarshal(data, p) on the message, except the unmarshalling is performed by unrolled codegen without using reflection and allocating as little memory as possible. If the receiver p is not fully zeroed-out, the unmarshal call will actually behave like proto.Merge(data, p). This is because the proto.Unmarshal in the ProtoBuf API is implemented by resetting the destination message and then calling proto.Merge on it. To ensure proper Unmarshal semantics, ensure you've called proto.Reset on your message before calling UnmarshalVT, or that your message has been newly allocated.

  • unmarshal_unsafe: generates a func (p *YourProto) UnmarshalVTUnsafe(data []byte) error that behaves like UnmarshalVT, except it unsafely casts slices of data to bytes and string fields instead of copying them to newly allocated arrays, so that it performs fewer allocations. Data received from the wire has to be left untouched for the lifetime of the message. Otherwise, the message's bytes and string fields can be corrupted.

  • pool: generates the following helper methods

    • func (p *YourProto) ResetVT(): this function behaves similarly to proto.Reset(p), except it keeps as much memory as possible available on the message, so that further calls to UnmarshalVT on the same message will need to allocate less memory. This is an API meant to be used with memory pools and does not need to be used directly.

    • func (p *YourProto) ReturnToVTPool(): this function returns message p to a local memory pool so it can be reused later. It clears the object properly with ResetVT before storing it on the pool. This method should only be used on messages that were obtained from a memory pool by calling YourProtoFromVTPool. Using p after calling this method will lead to undefined behavior.

    • func YourProtoFromVTPool() *YourProto: this function returns a YourProto message from a local memory pool, or allocates a new one if the pool is currently empty. The returned message is always empty and ready to be used (e.g. by calling UnmarshalVT on it). Once the message has been processed, it should be returned to the memory pool by calling ReturnToVTPool() on it. Returning the message to the pool is not mandatory (it does not leak memory), but if you don't return it, that defeats the whole point of memory pooling.

  • clone: generates the following helper methods

    • func (p *YourProto) CloneVT() *YourProto: this function behaves similarly to calling proto.Clone(p) on the message, except the cloning is performed by unrolled codegen without using reflection. If the receiver p is nil a typed nil is returned.

    • func (p *YourProto) CloneMessageVT() proto.Message: this function behaves like the above p.CloneVT(), but provides a uniform signature in order to be accessible via type assertions even if the type is not known at compile time. This allows implementing a generic func CloneVT(proto.Message) without reflection. If the receiver p is nil, a typed nil pointer of the message type will be returned inside a proto.Message interface.
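The generic helper mentioned for EqualMessageVT can be sketched with a plain type assertion. This is a minimal illustration, not generated code: the Message interface, the Ping type and the fallback parameter are hypothetical stand-ins (real code would use proto.Message, a generated message and proto.Equal), and only the method names are taken from the feature list.

```go
package main

import "fmt"

// Message is a local stand-in for proto.Message so that this sketch needs
// no third-party imports (assumption, not the real interface).
type Message interface{}

// equalerVT is the uniform signature described for the equal feature.
type equalerVT interface {
	EqualMessageVT(that Message) bool
}

// Ping is a hand-written stand-in for a generated message type.
type Ping struct{ ID int64 }

// EqualVT mimics the typed comparison helper.
func (this *Ping) EqualVT(that *Ping) bool {
	if this == that {
		return true
	}
	if this == nil || that == nil {
		return false
	}
	return this.ID == that.ID
}

// EqualMessageVT mimics the untyped helper: a different type yields false.
func (this *Ping) EqualMessageVT(thatMsg Message) bool {
	that, ok := thatMsg.(*Ping)
	if !ok {
		return false
	}
	return this.EqualVT(that)
}

// EqualVT dispatches through the uniform signature and only uses the
// caller-supplied fallback for messages without the helper.
func EqualVT(a, b Message, fallback func(a, b Message) bool) bool {
	if eq, ok := a.(equalerVT); ok {
		return eq.EqualMessageVT(b)
	}
	return fallback(a, b)
}

func main() {
	never := func(a, b Message) bool { return false }
	fmt.Println(EqualVT(&Ping{ID: 1}, &Ping{ID: 1}, never))
	fmt.Println(EqualVT(&Ping{ID: 1}, "not a Ping", never))
}
```

The fallback keeps the helper usable for messages that were generated without the equal feature.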

Usage

  1. Install protoc-gen-go-vtproto:

    go install github.com/planetscale/vtprotobuf/cmd/protoc-gen-go-vtproto@latest
    
  2. Ensure your project is already using the ProtoBuf v2 API (i.e. google.golang.org/protobuf). The vtprotobuf compiler is not compatible with APIv1 generated code.

  3. Update your protoc generator to use the new plug-in. Example from Vitess:

    for name in $(PROTO_SRC_NAMES); do \
        $(VTROOT)/bin/protoc \
        --go_out=. --plugin protoc-gen-go="${GOBIN}/protoc-gen-go" \
        --go-grpc_out=. --plugin protoc-gen-go-grpc="${GOBIN}/protoc-gen-go-grpc" \
        --go-vtproto_out=. --plugin protoc-gen-go-vtproto="${GOBIN}/protoc-gen-go-vtproto" \
        --go-vtproto_opt=features=marshal+unmarshal+size \
        proto/$${name}.proto; \
    done
    

    Note that the vtproto compiler runs like an auxiliary plug-in to the protoc-gen-go in APIv2, just like the new GRPC compiler plug-in, protoc-gen-go-grpc. You need to run it alongside the upstream generator, not as a replacement.

  4. (Optional) Pass the features that you want to generate as --go-vtproto_opt. If no features are given, all the codegen steps will be performed.

  5. (Optional) If you have enabled the pool option, you need to manually specify which ProtoBuf objects will be pooled.

    • You can tag messages explicitly in the .proto files with option (vtproto.mempool):
    syntax = "proto3";
    
    package app;
    option go_package = "app";
    
    import "github.com/planetscale/vtprotobuf/vtproto/ext.proto";
    
    message SampleMessage {
        option (vtproto.mempool) = true; // Enable memory pooling
        string name = 1;
        optional string project_id = 2;
        // ...
    }
    • Alternatively, you can enumerate the pooled objects with --go-vtproto_opt=pool=<import>.<message> flags passed via the CLI:
        $(VTROOT)/bin/protoc ... \
            --go-vtproto_opt=features=marshal+unmarshal+size+pool \
            --go-vtproto_opt=pool=vitess.io/vitess/go/vt/proto/query.Row \
            --go-vtproto_opt=pool=vitess.io/vitess/go/vt/proto/binlogdata.VStreamRowsResponse \
    
  6. (Optional) If you want to selectively compile the generated vtprotobuf files, the --go-vtproto_opt=buildTag=<tag> option can be used.

    When using this option, the generated code will only be compiled in if a build tag is provided.

    It is recommended, but not required, to use vtprotobuf as the build tag if this is desired, especially if your project is imported by others. This will reduce the number of build tags a user will need to configure if they are importing multiple libraries following this pattern.

    When using this option, it is strongly recommended to make your code compile with and without the build tag. This can be done with type assertions before using vtprotobuf generated methods. The grpc.Codec{} object (discussed below) shows an example.

  7. Compile the .proto files in your project. You should see _vtproto.pb.go files next to the .pb.go and _grpc.pb.go files that were already being generated.

  8. (Optional) Switch your RPC framework to use the optimized helpers (see following sections)

vtprotobuf package and well-known types

Your generated _vtproto.pb.go files will have a dependency on this Go package to access some helper functions as well as the optimized code for ProtoBuf well-known types. vtprotobuf will detect these types embedded in your own Messages and generate optimized code to marshal and unmarshal them.

Using the optimized code with RPC frameworks

The protoc-gen-go-vtproto compiler does not overwrite any of the default marshalling or unmarshalling code for your ProtoBuf objects. Instead, it generates helper methods that can be called explicitly to opt-in to faster (de)serialization.

vtprotobuf with GRPC

To use vtprotobuf with the new versions of GRPC, you need to register the codec provided by the github.com/planetscale/vtprotobuf/codec/grpc package.

package servenv

import (
	"github.com/planetscale/vtprotobuf/codec/grpc"
	"google.golang.org/grpc/encoding"
	_ "google.golang.org/grpc/encoding/proto"
)

func init() {
	encoding.RegisterCodec(grpc.Codec{})
}

Note that we perform a blank import _ "google.golang.org/grpc/encoding/proto" of the default proto codec that ships with GRPC to ensure it's being replaced by us afterwards. The provided Codec will serialize & deserialize all ProtoBuf messages using the optimized codegen.

Mixing ProtoBuf implementations with GRPC

If you're running a complex GRPC service, you may need to support serializing ProtoBuf messages from different sources, including from external packages that will not have optimized vtprotobuf marshalling code. This is perfectly doable by implementing a custom codec in your own project that serializes messages based on their type. The Vitess project implements a custom codec to support ProtoBuf messages from Vitess itself and those generated by the etcd API -- you can use it as a reference.
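A type-dispatching codec of this kind could look roughly like the sketch below. It is not the Vitess codec: the method set only mirrors the shape of grpc-go's encoding.Codec (Marshal, Unmarshal, Name) without importing grpc, and the fallback hooks and the Blob type are hypothetical stand-ins for proto.Marshal, proto.Unmarshal and a generated message.

```go
package main

import "fmt"

// vtMessage is satisfied by messages that carry the optimized helpers.
type vtMessage interface {
	MarshalVT() ([]byte, error)
	UnmarshalVT(data []byte) error
}

// Codec dispatches on the message type. The fallback hooks are assumptions
// standing in for the reflection-based proto.Marshal / proto.Unmarshal.
type Codec struct {
	FallbackMarshal   func(v interface{}) ([]byte, error)
	FallbackUnmarshal func(data []byte, v interface{}) error
}

// Marshal prefers MarshalVT and falls back for other message types.
func (c Codec) Marshal(v interface{}) ([]byte, error) {
	if m, ok := v.(vtMessage); ok {
		return m.MarshalVT()
	}
	if c.FallbackMarshal != nil {
		return c.FallbackMarshal(v)
	}
	return nil, fmt.Errorf("codec: no marshaller for %T", v)
}

// Unmarshal prefers UnmarshalVT and falls back for other message types.
func (c Codec) Unmarshal(data []byte, v interface{}) error {
	if m, ok := v.(vtMessage); ok {
		return m.UnmarshalVT(data)
	}
	if c.FallbackUnmarshal != nil {
		return c.FallbackUnmarshal(data, v)
	}
	return fmt.Errorf("codec: no unmarshaller for %T", v)
}

// Name reuses the name of the default codec so that it gets replaced.
func (Codec) Name() string { return "proto" }

// Blob is a hand-written stand-in for a generated message.
type Blob struct{ Data []byte }

func (b *Blob) MarshalVT() ([]byte, error) { return append([]byte(nil), b.Data...), nil }

func (b *Blob) UnmarshalVT(data []byte) error {
	b.Data = append(b.Data[:0], data...)
	return nil
}

func main() {
	c := Codec{}
	raw, err := c.Marshal(&Blob{Data: []byte("ping")})
	if err != nil {
		panic(err)
	}
	var out Blob
	if err := c.Unmarshal(raw, &out); err != nil {
		panic(err)
	}
	fmt.Printf("%s via codec %q\n", out.Data, c.Name())
}
```

Because the dispatch is a type assertion, the same codec also compiles when a message type lacks the generated helpers, for example when the generated files sit behind a build tag.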

Twirp

Twirp does not support customizing the Marshalling/Unmarshalling codec by default. In order to support vtprotobuf, you'll need to perform a search & replace on the generated Twirp files after running protoc. Here's an example:

for twirp in $${dir}/*.twirp.go; \
do \
  echo 'Updating' $${twirp}; \
  sed -i '' -e 's/respBytes, err := proto.Marshal(respContent)/respBytes, err := respContent.MarshalVT()/g' $${twirp}; \
  sed -i '' -e 's/if err = proto.Unmarshal(buf, reqContent); err != nil {/if err = reqContent.UnmarshalVT(buf); err != nil {/g' $${twirp}; \
done; \

DRPC

To use vtprotobuf as a DRPC encoding, simply pass github.com/planetscale/vtprotobuf/codec/drpc as the protolib flag in your protoc-gen-go-drpc invocation.

Example:

protoc --go_out=. --go-vtproto_out=. --go-drpc_out=. --go-drpc_opt=protolib=github.com/planetscale/vtprotobuf/codec/drpc

Connect

To use vtprotobuf with Connect, first implement a custom codec in your own project that serializes messages based on their type (see Mixing ProtoBuf implementations with GRPC). This is required because Connect internally serializes some types such as Status that don't have vtprotobuf helpers. Then pass in connect.WithCodec(mygrpc.Codec{}) as a connect option to the client and handler constructors.

package main

import (
	"net/http"

	"github.com/bufbuild/connect-go"
	"github.com/foo/bar/pingv1connect"
	"github.com/myorg/myproject/codec/mygrpc"
)

func main() {
	mux := http.NewServeMux()
	mux.Handle(pingv1connect.NewPingServiceHandler(
		&PingServer{},
		connect.WithCodec(mygrpc.Codec{}), // Add connect option to handler.
	))
	// handler serving ...

	client := pingv1connect.NewPingServiceClient(
		http.DefaultClient,
		"http://localhost:8080",
		connect.WithCodec(mygrpc.Codec{}), // Add connect option to client.
	)
	/// client code here ...
}

Integrating with buf

vtprotobuf generation can be easily automated if your project's Protocol Buffers are managed with buf.

Simply install protoc-gen-go-vtproto (see Usage section) and add it as a plug-in to your buf.gen.yaml configuration:

version: v1
managed:
  enabled: true
  # ...
plugins:
  - plugin: buf.build/protocolbuffers/go
    out: ./
    opt: paths=source_relative
  - plugin: go-vtproto
    out: ./
    opt: paths=source_relative

Running buf generate will now also include the vtprotobuf optimized helpers.

vtprotobuf's People

Contributors

adphi, artemreyt, bhollis, biosvs, bogdandrutu, convto, cristaloleg, cyriltovena, evgfedotov, fenollp, fumesover, gnagel, howardjohn, kruskall, maheeshap-canopus, markus-wa, misberner, nockty, skottler, systay, technicallyty, vmg

vtprotobuf's Issues

cut new release?

Hey all, any chance of cutting a new release?

The latest version is 0.3.0 and a bunch of stuff has been added since then such as some of the documented options (pool and so on). Is the current main stable? Can a new release be made?

Deterministic mode

Upstream proto has a "deterministic" mode: https://github.com/protocolbuffers/protobuf-go/blob/f221882bfb484564f1714ae05f197dea2c76898d/proto/encode.go#L50

I think this is stricter than marshal_strict (but I could be wrong) in that it also sorts maps.

Interestingly, there is a "Stable" field here:

Stable, once, strict bool

but it's never used.

It might be nice to add this option or, if it's already possible, to add documentation around it.

Unable to get pooling methods to gen

I am currently attempting to put together a POC for using vtprotobuf in an application I work on. However the most promising feature (pooling) does not seem to exist in the generated code.

My command line:

protoc \
  -Ipkg/.patched-proto \
  --go_out=paths=source_relative:./pkg/tempopb/ \
  --go-grpc_out=paths=source_relative:./pkg/tempopb/ \
  --go-vtproto_out=paths=source_relative:./pkg/tempopb/ \
  --go-vtproto_opt=features=marshal+unmarshal+size+pool \
  pkg/.patched-proto/trace/v1/trace.proto

The output files seem to be generated correctly and there are no errors (screenshot omitted).

But I'm not seeing ResetVT, ReturnToVTPool, or *FromVTPool generated. I have tried with 0.2.0 as well as tip of main. I have also tried not specifying --go-vtproto_opt without luck.

I am seeing MarshalVT, MarshalToVT, SizeVT, UnmarshalVT, ... generated.

Thanks for your time!

Provide ability to use golang (v1) marshal/unmarshal under the hood

Hello 👋
We have a large protobuf repo, using golang codegen, most of which is based on the V1 go message format. We are now starting to move to using the V2 message format, and are using vtprotobuf for fast (de)serialization. However, we cannot do the migration all at once. Due to this, we have V2 generated code, that can depend on V1 format generated code during the migration. So, we need the vtproto-generated code to be compatible with the older code.

To handle such scenarios, it would be very helpful if there is an ability to substitute google.golang.org/protobuf/proto with github.com/golang/protobuf/proto. This would make our migration a lot smoother.

This could be exposed via an option/flag at codegen time.

cc @euroelessar

Make proto.Marshal use MarshalVT under the hood

This fast-marshaling is cool, but I would like to avoid having the VT suffix in my codebase, and would like to simply continue using proto.Marshal(...).

I read that protoreflect's Message supports an optional ProtoMethods method, and quoting the docs:

ProtoMethods returns optional fast-path implementations of various operations.

Is there a way vtprotobuf's fast-marshaling could be added to those methods, instead of new (Un)MarshalVT methods?

Error compiling generated code with third party imports

Hello,

I was looking to experiment with this and ran into the following error:

type *"google.golang.org/genproto/googleapis/rpc/status".Status has no field or method MarshalToSizedBufferVT

The protobuf files that are complaining import files like:

import "google/rpc/status.proto";

with messages that look like:

message TestMessage {
   google.rpc.Status status = 2;
}

The google/rpc/status.proto file is copied locally for the code generation, but the generated code is importing the Go module from google.golang.org/genproto/googleapis/rpc/status so it's not part of the vtproto generation steps.

Is this an issue that you've had to resolve or any suggestions on how to approach this?

unknown feature: "clone"

Works fine...

--go-vtproto_opt=features=marshal+unmarshal+size+pool \

Making sure to get latest go install github.com/planetscale/vtprotobuf/cmd/protoc-gen-go-vtproto@latest

--go-vtproto_opt=features=marshal+unmarshal+size+pool+clone \

Gives --go-vtproto_out: unknown feature: "clone"

Replace sync.Pool with zeropool

Generated pool code currently uses sync.Pool which is nice because there are no external dependencies.
However, there is a small dependency we use instead that avoids a known allocation issue with sync.Pool (and introduces type safety, but it doesn't really matter since this is generated code).

Question: grpc

One of the features (which is on by default if features are not specified) is grpc. From a quick look at the code, I can't seem to find any difference between the regular grpc plugin and the code generated by vtprotobuf. Is there a plan to extend this to, say, use a pool for object creation in the future?

gRPC and pooling

I have a couple of questions regarding pooling and gRPC that I could not fully understand from existing issues or the readme. (happy to do a PR to clarify in the readme after understanding for the next person)

In #16 (comment) it was mentioned, that memory-pooled objects are automatically unmarshaled using objects from a pool, which I could get to work. How is this intended to work for the other way around though? Does it happen automatically somewhere like during marshaling (if so I can't find the code that does this)? If not, is there a recommended or suggested place where this could be done (maybe in the codec?)

My use case is I have a lot of objects that I read from a key-value store that eventually end up in a gRPC response, but after marshaling the response I would like to return the objects to the pool.

buf.build support?

Unfortunately, when using buf.build, #19 pops up again, except that since buf is doing the work, there's no easy way to just run a single protoc.

I'm not sure if this is something vtprotobuf can work around, or if this requires a feature in buf but it's a shame that these two tools don't play nicely together.

New extension: scrub

Proposal: new extension called scrub which adds a function Scrub() to messages.

Similar to Reset(), except it recursively overwrites all buffers & fields in the message with zeros.

ReturnToVTPool() recursive?

When func (p *YourProto) ReturnToVTPool() is called, children of YourProto that implement method ReturnToVTPool() should also be returned to the pool.

Nil pointer panic marshalling nil oneof field

Hi 👋 Ran into a bit of a weird one. Given the following proto:

syntax = "proto3";

package proto;

option go_package = "github.com/pfouilloux/vttest/proto";

message TestMsg {
  oneof Test {
    string a = 1;
    string b = 2;
    string c = 3;
  }
}

and the following code:

package main_test

import (
	"testing"

	"github.com/planetscale/vtprotobuf/codec/grpc"
	_ "google.golang.org/grpc/encoding/proto"
	"vttest/proto/github.com/pfouilloux/vttest/proto"
)

//go:generate protoc --proto_path=proto --go_out=proto --go_opt=paths=source_relative --go-vtproto_out=proto --go-vtproto_opt=features=marshal+unmarshal+size oneof.proto

func TestMarshal(t *testing.T) {
	test := &proto.TestMsg{Test: getA()}
	_, err := grpc.Codec{}.Marshal(test)
	if err != nil {
		panic(err)
	}
}

func getA() *proto.TestMsg_A {
	return nil
}

I'm seeing the following error:

panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x1239f1d]

goroutine 4 [running]:
testing.tRunner.func1.2({0x126e880, 0x14b3f20})
	/usr/local/opt/go/libexec/src/testing/testing.go:1396 +0x24e
testing.tRunner.func1()
	/usr/local/opt/go/libexec/src/testing/testing.go:1399 +0x39f
panic({0x126e880, 0x14b3f20})
	/usr/local/opt/go/libexec/src/runtime/panic.go:884 +0x212
vttest/proto/github.com/pfouilloux/vttest/proto.(*TestMsg_A).MarshalToSizedBufferVT(0x126e580?, {0x14efc18?, 0xc000057601?, 0x0?})
	/Users/pfouilloux/code/vttest/proto/github.com/pfouilloux/vttest/proto/oneof_vtproto.pb.go:72 +0x1d
vttest/proto/github.com/pfouilloux/vttest/proto.(*TestMsg_A).MarshalToVT(0x1240e01?, {0x14efc18?, 0x0?, 0x123a5a4?})
	/Users/pfouilloux/code/vttest/proto/github.com/pfouilloux/vttest/proto/oneof_vtproto.pb.go:68 +0x6a
vttest/proto/github.com/pfouilloux/vttest/proto.(*TestMsg).MarshalToSizedBufferVT(0xc0001049c0?, {0x14efc18, 0x0, 0x0})
	/Users/pfouilloux/code/vttest/proto/github.com/pfouilloux/vttest/proto/oneof_vtproto.pb.go:58 +0x133
vttest/proto/github.com/pfouilloux/vttest/proto.(*TestMsg).MarshalVT(0xc0001049c0)
	/Users/pfouilloux/code/vttest/proto/github.com/pfouilloux/vttest/proto/oneof_vtproto.pb.go:27 +0x58
github.com/planetscale/vtprotobuf/codec/grpc.Codec.Marshal({}, {0x12a20c0, 0xc0001049c0})
	/Users/pfouilloux/go/pkg/mod/github.com/planetscale/vtprotobuf@<version elided>/codec/grpc/grpc_codec.go:20 +0x42
vttest_test.TestMarshal(0x0?)
	/Users/pfouilloux/code/vttest/oneof_test.go:15 +0x47
testing.tRunner(0xc0000076c0, 0x12c8b20)
	/usr/local/opt/go/libexec/src/testing/testing.go:1446 +0x10b
created by testing.(*T).Run
	/usr/local/opt/go/libexec/src/testing/testing.go:1493 +0x35f


Process finished with the exit code 1

It looks like there is a nil check missing in the implementation of MarshalToVT for *TestMsg_A

func (m *TestMsg_A) MarshalToVT(dAtA []byte) (int, error) {
	size := m.SizeVT()
	return m.MarshalToSizedBufferVT(dAtA[:size])
}

I'm more than happy to raise a PR to address this if you could give me some guidance on where to add the appropriate tests.

Kind regards & thanks for sharing your work with the community!

Provide clarification on reusing serialization buffers

Documentation says:

MarshalToVT() ... This function is useful e.g. when using memory pooling to re-use serialization buffers.

Could you please provide clarification on how to use it? I'm asking because protobufs are typically used with gRPC, and grpc-go's SendMsg() returns before the buffer gets put on the wire. Hence, you cannot reuse it. Here is a relevant issue: grpc/grpc-go#2159. Someone has even attempted this http://www.golangdevops.com/2019/12/31/autopool/ but it's not a solution and the finalizers have poor performance. You could find even more details here thanos-io/thanos#4609.

Are there any examples of the usage of this function? I couldn't find anything.

If I am correct then the recommendation in the README seems dangerous.

Per Msg/Field Features

👋

Are there any plans to support adding features per-msg or per-field?

For example, for some of our string or []byte fields, we prefer to unmarshal them as "unsafe" so as to avoid an allocation if we don't plan to keep the data in memory past the lifetime of the original message.

I was thinking that type of behavior could be added as an annotation in the proto, similar to how the extensions in gogo currently work.

Is that something that is 1.) feasible with this project 2.) something that you would be interested in?

If so, I could try to come up with a POC

Thanks

equal: code doesn't distinguish between oneof fields when zero-valued

Because the comparison logic for oneof fields relies on the getters for the individual fields, it cannot differentiate between a field not being set, and a field being set to the zero value. While the nil checks allow distinguishing protos where one has a oneof field set to a zero value, while the other doesn't have any field in the oneof set, the code fails to distinguish protos where different fields in a oneof are set to the respective zero value.

Test case:

func TestEqualVT_Oneof_AbsenceVsZeroValue(t *testing.T) {
	a := &TestAllTypesProto3{
		OneofField: &TestAllTypesProto3_OneofUint32{
			OneofUint32: 0,
		},
	}
	b := &TestAllTypesProto3{
		OneofField: &TestAllTypesProto3_OneofString{
			OneofString: "",
		},
	}

	aJson, err := protojson.Marshal(a)
	require.NoError(t, err)
	bJson, err := protojson.Marshal(b)
	require.NoError(t, err)

	if a.EqualVT(b) {
		assert.JSONEq(t, string(aJson), string(bJson))
		err := fmt.Errorf("these %T should not be equal:\nmsg = %+v\noriginal = %+v", a, a, b)
		require.NoError(t, err)
	}
}

This is similar to #48 , but applies to oneofs and exercises different paths in the code generation.
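The difference between a getter-based comparison and a comparison on the concrete oneof case can be reproduced with hand-written types. Everything below (Msg, MsgUint32, MsgString and both comparison functions) is a hypothetical stand-in that only imitates the shape of generated oneof code; it is not the code emitted by vtprotobuf.

```go
package main

import "fmt"

// isMsgField imitates the marker interface generated for a oneof.
type isMsgField interface{ isMsgField() }

type MsgUint32 struct{ Value uint32 }
type MsgString struct{ Value string }

func (*MsgUint32) isMsgField() {}
func (*MsgString) isMsgField() {}

type Msg struct{ Field isMsgField }

// Getters return the zero value when a different case (or none) is set.
func (m *Msg) GetUint32() uint32 {
	if x, ok := m.Field.(*MsgUint32); ok {
		return x.Value
	}
	return 0
}

func (m *Msg) GetString() string {
	if x, ok := m.Field.(*MsgString); ok {
		return x.Value
	}
	return ""
}

// equalByGetters cannot tell which case is set once the values are zero.
func equalByGetters(a, b *Msg) bool {
	if (a.Field == nil) != (b.Field == nil) {
		return false
	}
	return a.GetUint32() == b.GetUint32() && a.GetString() == b.GetString()
}

// equalByCase compares the concrete wrapper type before the value.
func equalByCase(a, b *Msg) bool {
	switch x := a.Field.(type) {
	case nil:
		return b.Field == nil
	case *MsgUint32:
		y, ok := b.Field.(*MsgUint32)
		return ok && x.Value == y.Value
	case *MsgString:
		y, ok := b.Field.(*MsgString)
		return ok && x.Value == y.Value
	}
	return false
}

func main() {
	a := &Msg{Field: &MsgUint32{Value: 0}}
	b := &Msg{Field: &MsgString{Value: ""}}
	fmt.Println(equalByGetters(a, b), equalByCase(a, b))
}
```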

wrong pool unmarshal size

wrong pool unmarshal size.

my proto file:

// protoc  --go_out=. --plugin protoc-gen-go="/Users/jie.yang05/go/bin/protoc-gen-go"  --go-vtproto_out=.  --plugin protoc-gen-go-vtproto="/Users/jie.yang05/go/bin/protoc-gen-go-vtproto"  --go-vtproto_opt=features=marshal+unmarshal+size+pool ./lineentry.proto
syntax = "proto3";

package index;

option go_package="./proto";

import "github.com/planetscale/vtprotobuf/vtproto/ext.proto";

message lineEntries {
  option (vtproto.mempool) = true; // Enable memory pooling
  repeated lineEntry lineEntries = 1;
}

message lineEntry {
  uint64 address = 1;
  uint32 line = 2;
  uint32 file = 3;
}

command:

yangjie05-mac:index jie.yang05$ protoc  --go_out=. --plugin protoc-gen-go="/Users/jie.yang05/go/bin/protoc-gen-go"  --go-vtproto_out=.  --plugin protoc-gen-go-vtproto="/Users/jie.yang05/go/bin/protoc-gen-go-vtproto"   -I /Users/jie.yang05/go/pkg/mod/github.com/planetscale/vtprotobuf\@v0.4.0/include -I ./ ./lineentry.proto

generated code:

func (m *LineEntries) ResetVT() {
	for _, mm := range m.LineEntries {
		mm.ResetVT()
	}
	m.Reset()
}

func (m *LineEntries) ReturnToVTPool() {
	if m != nil {
		m.ResetVT()
		vtprotoPool_LineEntries.Put(m)
	}
}

The generated code has a wrong ResetVT.

Question about behavior of `ReturnToVTPool()` and `Reset()`

Using this proto:

message Parent {
  option (vtproto.mempool) = true;
  repeated Child children = 1;
  Child one = 2;
}

message Child {
  option (vtproto.mempool) = true;
  uint32 field = 1;
}

When calling ReturnToVTPool() on Parent it calls ResetVT on all children and then calls m.Reset()

func (m *Parent) ResetVT() {
	for _, mm := range m.Children {
		mm.ResetVT()
	}
	m.One.ReturnToVTPool()
	m.Reset()
}

However m.Reset() allocates a new object and overwrites the existing object entirely:

func (x *Parent) Reset() {
	*x = Parent{}
	// ... (rest of the generated method not shown)
}

This nils out all fields on the parent throwing away the slice for the GC to handle. Am I missing something? Is there some way to put back into the pool, call ResetVT() but not call Reset()?
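One way a reset could keep the slice's backing array is to save the truncated slice, zero the struct, and then restore the slice. The following is only a sketch of that idea with hand-written stand-in types; it is not a claim about what vtprotobuf generates.

```go
package main

import "fmt"

// Child and Parent are hand-written stand-ins for the generated messages.
type Child struct{ Field uint32 }

type Parent struct {
	Children []*Child
	One      *Child
}

// ResetVT clears the message but keeps the capacity of Children, so that a
// later unmarshal can append without reallocating the slice.
func (m *Parent) ResetVT() {
	// Drop the element pointers so the old children are not kept alive by
	// the retained backing array.
	for i := range m.Children {
		m.Children[i] = nil
	}
	kept := m.Children[:0]
	*m = Parent{} // zero every field, as Reset would
	m.Children = kept
}

func main() {
	p := &Parent{Children: make([]*Child, 3, 8), One: &Child{Field: 1}}
	p.ResetVT()
	fmt.Println(len(p.Children), cap(p.Children), p.One == nil)
}
```

Whether the child objects themselves should also be returned to their own pool is a separate design decision that this sketch leaves out.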

equal: presence of bytes fields not honored

All fields in proto2 and fields explicitly marked optional in proto3 have a presence property, i.e., the field not being set is always different from the field being set, even when set to the zero value. The generated code for equal correctly checks for equality of this presence property for optional scalar and message fields (i.e., fields with a pointer Go type), but not for bytes (which, in the proto world, is a scalar type, but in Go maps to a nilable reference type).

It is correct to not differentiate between []byte(nil) and []byte{} for fields without presence (i.e., fields in a oneof, repeated fields, and fields with neither optional nor repeated in proto3), however, for all the other cases, this difference should be taken into account as it indicates presence/absence.

Example test case:

func TestEqualVT_Proto2_BytesPresence(t *testing.T) {
	a := &TestAllTypesProto2{
		OptionalBytes: nil,
	}
	b := &TestAllTypesProto2{
		OptionalBytes: []byte{},
	}

	require.False(t, proto.Equal(a, b))

	aJson, err := protojson.Marshal(a)
	require.NoError(t, err)
	bJson, err := protojson.Marshal(b)
	require.NoError(t, err)

	if a.EqualVT(b) {
		assert.JSONEq(t, string(aJson), string(bJson))
		err := fmt.Errorf("these %T should not be equal:\nmsg = %+v\noriginal = %+v", a, a, b)
		require.NoError(t, err)
	}
}
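The presence rule described in this issue comes down to one extra nil check in front of the content comparison. The two helper names below are hypothetical and the sketch is independent of the actual generated code.

```go
package main

import (
	"bytes"
	"fmt"
)

// equalBytesNoPresence suits fields without presence: bytes.Equal treats a
// nil slice and an empty slice as equal.
func equalBytesNoPresence(a, b []byte) bool {
	return bytes.Equal(a, b)
}

// equalBytesWithPresence suits fields with presence: a nil slice means
// "unset" and an empty non-nil slice means "set to the zero value".
func equalBytesWithPresence(a, b []byte) bool {
	if (a == nil) != (b == nil) {
		return false
	}
	return bytes.Equal(a, b)
}

func main() {
	var unset []byte
	empty := []byte{}
	fmt.Println(equalBytesNoPresence(unset, empty), equalBytesWithPresence(unset, empty))
}
```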

PR coming

Adding support for proto extensions?

I'm using protobuf and trying this plugin, but seems like it's lacking the support for extensionFields 🤔? Am I right, or misunderstanding something here?

Avoid proto reflection for builtin types

The optimized generated code for marshal, unmarshal, clone etc. still resorts to generic, protoreflect-based logic for builtin/well-known types such as google.protobuf.Timestamp, google.protobuf.Duration etc. Since these types are well known, it would be nice if optimized unrolled code could be used for operations on these types as well - especially as there seems to be a significant performance penalty for the "context-switch" to reflection with each individual proto.Clone invocation (benchmark).

I'm happy to send a PR, but one thing I'd ask the library maintainers to chime in on is whether it's acceptable to have the generated code reference global helper functions from a package within this module (e.g., github.com/planetscale/vtprotobuf/support/...), or whether it would be preferable to just generate package-private helper functions for all referenced types on demand.

Cloning messages

"clone" would generate a VTClone() function to duplicate a message (like proto.Clone)

"copy" would generate "VTCopy(dst)" to copy the contents of a message to a target message.

Add optional unsafe operations

Context

For messages that contain many string fields (e.g. repeated string with many elements coming in), UnmarshalVT can spend a lot of CPU time in runtime.slicebytetostring. Indeed, when decoding the []byte data, it does a string(bytes) conversion (e.g. m.Foo1 = string(dAtA[iNdEx:postIndex])). Since []byte is mutable and string is not, this conversion requires an allocation and a copy for safety. This allocation, repeated many times, sometimes turns out to be expensive.

Feature request

We could avoid this by using the unsafe package that allows us to perform this cast without an allocation:

func unsafeBytesToString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

Of course, the user has to be careful because if they overwrite the []byte data they received from the wire, then the string is mutated. So, this feature should be opt-in in my opinion.
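To make the aliasing hazard concrete, here is a minimal sketch of the same conversion written with unsafe.String and unsafe.SliceData, which the standard library provides since Go 1.20. This is only an illustration under that version assumption, not what the generated code does; the variable names are made up for the example.

```go
package main

import (
	"fmt"
	"unsafe"
)

// unsafeBytesToString returns a string that shares memory with b.
// No copy is made, so the string is only valid (and only immutable)
// for as long as nobody writes to b. Requires Go 1.20 or newer.
func unsafeBytesToString(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	return unsafe.String(unsafe.SliceData(b), len(b))
}

func main() {
	wire := []byte("hello") // stands in for data received from the wire
	s := unsafeBytesToString(wire)
	fmt.Println(s)

	// Reusing or overwriting the buffer also changes the string, which
	// breaks the usual guarantee that Go strings never change.
	wire[0] = 'j'
	fmt.Println(s) // the string observes the write to wire
}
```

This is why callers that recycle their read buffers could not safely use such a decoding path.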

This feature is mentioned in another issue: Per Msg/Field Features. I think having per-message / per-field features is not mandatory to implement unsafe casting, though.

Note that this feature applies to bytes fields as well where we could reference data instead of copying it.

Proposal

I think this feature is worth implementing and I'm open to ideas. In my opinion, a simple and pragmatic approach to implement it would be to add unsafe functions, such as UnmarshalVTUnsafe, which perform such operations for all applicable fields. I've actually started such an implementation on my personal fork and it seems to work well (diff for anyone curious, although it doesn't work yet for bytes fields). Now, before spending more time on it, I would love to gauge interest and hear considerations from others!

A few considerations I had on my side:

  • I think having a different function from UnmarshalVT is mandatory because several applications can use the same generated code for a message and we cannot ask them all to be careful not to overwrite data received from the wire.
  • I considered adding a feature but it sounds weird because it would be transversal to other features.
    • Either we could just always generate both the safe and unsafe versions of the function;
    • or I think we could add an --unsafe flag (different from features) to trigger the generation of unsafe functions.

Let me know what you folks think about all this!

Using stack trace feature for returned errors

Hello.
There's a stack trace feature in the github.com/pkg/errors package.
I'd suggest wrapping the errors returned in generated files with errors.WithStack(err) from this package.
In other words, let's turn this generated code:

if err != nil {
	return err
}
if (skippy < 0) || (iNdEx+skippy) < 0 {
	return ErrInvalidLength
}
if (iNdEx + skippy) > l {
	return io.ErrUnexpectedEOF
}

into that:

if err != nil {
	return errors.WithStack(err)
}
if (skippy < 0) || (iNdEx+skippy) < 0 {
	return errors.WithStack(ErrInvalidLength)
}
if (iNdEx + skippy) > l {
	return errors.WithStack(io.ErrUnexpectedEOF)
}

That would be extremely useful for debugging purposes.
I could prepare a pull request if you wish.
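One consideration with any wrapping of returned errors is that callers comparing against sentinel values with == would stop matching. The sketch below illustrates this with the standard library's fmt.Errorf and %w as a stand-in for errors.WithStack (a deliberate substitution to keep the example dependency-free); decode and the two matcher helpers are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// decode stands in for generated unmarshal code (hypothetical). With
// wrap=true it returns a wrapped sentinel, otherwise the bare sentinel.
func decode(wrap bool) error {
	if wrap {
		return fmt.Errorf("decode: %w", io.ErrUnexpectedEOF)
	}
	return io.ErrUnexpectedEOF
}

// matchesByEquality is how older call sites often test for a sentinel.
func matchesByEquality(err error) bool { return err == io.ErrUnexpectedEOF }

// matchesByIs walks the wrap chain and keeps working after wrapping.
func matchesByIs(err error) bool { return errors.Is(err, io.ErrUnexpectedEOF) }

func main() {
	wrapped := decode(true)
	fmt.Println(matchesByEquality(wrapped)) // false
	fmt.Println(matchesByIs(wrapped))       // true
}
```

If wrapping were adopted, existing callers would likely need to move from == comparisons to errors.Is.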

Repeated field with nulls break pool

It appears there are no nil checks for repeated fields when returning to the pool, which causes a panic. Is this not supported, or am I missing how to handle it properly?

Codec with GRPC Server Support

Thought/Question - Why does the GRPC codec not try to ReturnToVTPool on the way out? Is anyone else doing this? My thought is to use this to optimise the return of complex payloads: build them in the handler with ...FromVTPool(), then let the codec release them once the wire-work has finished and the bytes are sent.

Are there reasons for this not being in the Codec, and if not, is there an opening for a PR with a PoolAwareCodec to be added to the project?
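The idea can be sketched without any gRPC dependency as a marshal step that releases the message afterwards. Everything here is hypothetical: marshalAndRelease, the two small interfaces, and fakeMsg are illustrative names, and only the method names MarshalVT and ReturnToVTPool are taken from the generated code discussed above. A real codec would also have to consider cases where the message is still referenced after marshaling.

```go
package main

import "fmt"

// vtMarshaler and vtPooled mirror the method names of generated code;
// the interfaces themselves are assumptions made for this sketch.
type vtMarshaler interface{ MarshalVT() ([]byte, error) }
type vtPooled interface{ ReturnToVTPool() }

// marshalAndRelease serializes v and, if v is pool-backed, hands it back
// to its pool once the bytes have been produced.
func marshalAndRelease(v any) ([]byte, error) {
	m, ok := v.(vtMarshaler)
	if !ok {
		return nil, fmt.Errorf("%T has no MarshalVT method", v)
	}
	data, err := m.MarshalVT()
	if p, ok := v.(vtPooled); ok {
		p.ReturnToVTPool() // the caller must not touch v after this point
	}
	return data, err
}

// fakeMsg is a stand-in for a generated message type.
type fakeMsg struct {
	payload  []byte
	released bool
}

func (f *fakeMsg) MarshalVT() ([]byte, error) {
	return append([]byte(nil), f.payload...), nil
}

func (f *fakeMsg) ReturnToVTPool() { f.released = true }

func main() {
	msg := &fakeMsg{payload: []byte{0x08, 0x01}}
	data, err := marshalAndRelease(msg)
	fmt.Println(data, err, msg.released) // [8 1] <nil> true
}
```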

Support pooling repeated fields

As far as I can tell single fields are pooled correctly but repeated fields are not. Using the following proto:

message Parent {
  option (vtproto.mempool) = true;
  repeated Child children = 1;
  Child one = 2;
}

message Child {
  option (vtproto.mempool) = true;
  uint32 field = 1;
}

I can see that the ResetVT and UnmarshalVT methods correctly handle the "one" field but not the "children" field.

func (m *Parent) ResetVT() {
	for _, mm := range m.Children {
		mm.ResetVT()  // does not return the slice pointers to the pool
	}
	m.One.ReturnToVTPool()   // correctly returns this pointer to the pool
	m.Reset()
}
func (m *Parent) UnmarshalVT(dAtA []byte) error {
...
		switch fieldNum {
		case 1:
...
			if len(m.Children) == cap(m.Children) {
				m.Children = append(m.Children, &Child{}) // allocates new object for slice
			} else {
				m.Children = m.Children[:len(m.Children)+1]
				if m.Children[len(m.Children)-1] == nil {
					m.Children[len(m.Children)-1] = &Child{} // allocates new object for slice
				}
			}
...
		case 2:
...
			if m.One == nil {
				m.One = ChildFromVTPool() // correctly pulls from pool
			}
...

Is there a way to do this that I'm not seeing?
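For discussion, here is a minimal sketch of what returning repeated elements to the pool could look like. It uses hand-written stand-ins for Parent and Child backed by sync.Pool; the pool variable and the method bodies are assumptions and do not reflect the actual generated code. The nil-safe ReturnToVTPool also covers slices that contain nil entries.

```go
package main

import (
	"fmt"
	"sync"
)

// Child and Parent are hand-written stand-ins for the generated types.
type Child struct{ Field uint32 }

type Parent struct {
	Children []*Child
	One      *Child
}

// childPool is a hypothetical pool for this sketch.
var childPool = sync.Pool{New: func() any { return new(Child) }}

func ChildFromVTPool() *Child { return childPool.Get().(*Child) }

// ReturnToVTPool is safe to call on a nil receiver.
func (c *Child) ReturnToVTPool() {
	if c == nil {
		return
	}
	*c = Child{}
	childPool.Put(c)
}

// ResetVT returns every element of the repeated field to the pool,
// clears the slots, and keeps the slice capacity for reuse.
func (p *Parent) ResetVT() {
	for i, c := range p.Children {
		c.ReturnToVTPool()
		p.Children[i] = nil
	}
	p.Children = p.Children[:0]
	p.One.ReturnToVTPool()
	p.One = nil
}

func main() {
	p := &Parent{Children: []*Child{ChildFromVTPool(), nil}, One: ChildFromVTPool()}
	p.ResetVT()
	fmt.Println(len(p.Children), cap(p.Children), p.One == nil) // 0 2 true
}
```

On the decoding side, the matching change would be to take new elements from ChildFromVTPool() instead of allocating &Child{}.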

Twirp Integration

In response to "I actually have no idea of how to switch encoders in Twirp. Maybe it's not even possible.", this can't be done out of the box with the code generation, but can be achieved (server-side) with a simple find and replace.

The changes are below:

proto.Marshal(respContent) >> respContent.MarshalVT()
proto.Unmarshal(buf, reqContent) >> reqContent.UnmarshalVT(buf)

I'm using make to control all code gen, so I added the below as a final step:

for twirp in $${dir}/*.twirp.go; \
do \
  echo 'Updating' $${twirp}; \
  sed -i '' -e 's/respBytes, err := proto.Marshal(respContent)/respBytes, err := respContent.MarshalVT()/g' $${twirp}; \
  sed -i '' -e 's/if err = proto.Unmarshal(buf, reqContent); err != nil {/if err = reqContent.UnmarshalVT(buf); err != nil {/g' $${twirp}; \
done; \

TinyGo support

First off, wanted to say thanks for such a great project. While the issue title may sound like a feature request, the code generated by vtprotobuf already works with TinyGo. You can see an example of that here: https://github.com/kyleconroy/go-wasm-plugins

What I'd like to ask about is making TinyGo support explicit via a CI job. Is this something you'd be interested in supporting? If so I can take a first pass.

Duplicate Functions and Variables in Generated Package

First, just want to say thank you for working on and releasing vtprotobuf. We're working on transitioning out of gogo/protobuf and read your great blog post announcing this alternative.

We've found a slight issue with our use case. We have a few proto packages that have multiple files in them. The generated _vtproto.pb.go files redeclare some utility functions/variables (e.g. sov, skip, ErrInvalidLength, etc.). As an example:

hellopb/service.proto:

syntax = "proto3";

package hellopb;

...

message HelloRequest {
  string q = 1;
}

message HelloResponse {
  string response = 2;
}

service HelloService {
  rpc Hello(HelloRequest) returns (HelloResponse) {}
}

hellopb/db.proto:

syntax = "proto3";

package hellopb;

...

message MessageEntry {
  string q = 1;
  ...
}

We then run:

protoc --proto_path=. --proto_path=../../../  \
  --go_out=../../../ --plugin protoc-gen-go=/go/bin/protoc-gen-go \
  --go-grpc_out=../../../ --plugin protoc-gen-go-grpc=/go/bin/protoc-gen-go-grpc \
  --grpc-gateway_out=../../../ \
  --go-vtproto_out=../../../ --plugin protoc-gen-go-vtproto=/go/bin/protoc-gen-go-vtproto \
  --go-vtproto_opt=features=marshal+unmarshal+size hellopb/service.proto

and

protoc --proto_path=. --proto_path=../../../  \
  --go_out=../../../ --plugin protoc-gen-go=/go/bin/protoc-gen-go \
  --go-grpc_out=../../../ --plugin protoc-gen-go-grpc=/go/bin/protoc-gen-go-grpc \
  --grpc-gateway_out=../../../ \
  --go-vtproto_out=../../../ --plugin protoc-gen-go-vtproto=/go/bin/protoc-gen-go-vtproto \
  --go-vtproto_opt=features=marshal+unmarshal+size hellopb/db.proto

If we run go vet we get:

service_vtproto.pb.go:214:6: encodeVarint redeclared in this block
    db_vtproto.pb.go:125:54: previous declaration
service_vtproto.pb.go:308:6: sov redeclared in this block
    db_vtproto.pb.go:188:23: previous declaration
service_vtproto.pb.go:311:6: soz redeclared in this block
    db_vtproto.pb.go:191:23: previous declaration
service_vtproto.pb.go:694:6: skip redeclared in this block
    db_vtproto.pb.go:545:36: previous declaration
service_vtproto.pb.go:774:2: ErrInvalidLength redeclared in this block
    db_vtproto.pb.go:625:2: previous declaration
service_vtproto.pb.go:775:2: ErrIntOverflow redeclared in this block
    db_vtproto.pb.go:626:2: previous declaration
service_vtproto.pb.go:776:2: ErrUnexpectedEndOfGroup redeclared in this block
    db_vtproto.pb.go:627:2: previous declaration

Is there any way to avoid this? Cleaning this up after running protoc is pretty tough. Is there any way those functions/variables could just be imported from vtprotobuf?

Support unconditional file generation

To support using a protoc plugin in Bazel rules, the plugin must always generate one file per input file.

Would it be fine to add a feature that forces file generation, e.g. a force feature that would unconditionally return true as the result of GenerateFile(...)?

Better support for maps

Hello,
While testing this library a bit in our code base, I noticed that map support with pooling is not perfect:

  • maps themselves are not pooled
  • ResetVT does not return values from a map<key, MessagePooled> to the pool
  • UnmarshalVT allocates a new message instead of using the pool for map<key, MessagePooled>

Making these changes manually in the .pb.go file reduces allocations a lot and speeds up unmarshalling.

I can provide a sample proto file and modifications made if needed.

Pre changes:
BenchmarkUnmarshalStdProto-12 202058 4971 ns/op 3840 B/op 54 allocs/op
BenchmarkUnmarshalVTProto-12 228591 5238 ns/op 3429 B/op 47 allocs/op
BenchmarkUnmarshalVTProtoWithPool-12 238689 4967 ns/op 2605 B/op 44 allocs/op

Post changes:
BenchmarkUnmarshalStdProto-12 203602 5240 ns/op 3840 B/op 54 allocs/op
BenchmarkUnmarshalVTProto-12 199917 5864 ns/op 3433 B/op 47 allocs/op
BenchmarkUnmarshalVTProtoWithPool-12 601562 2009 ns/op 302 B/op 5 allocs/op

Offering assistance and discussing upkeep / releases for vtprotobuf

Hello everyone,

I'd like to discuss the future maintenance of vtprotobuf. As an avid user from Datadog, I've recognized its potential in enhancing protobuf message operations. However, recent activity in the repository has been limited, with the last release dating back to January.

To bolster vtprotobuf's capabilities for developers, I propose exploring a more active maintenance and release cycle. Acknowledging the demands of open-source projects, I'm here to extend a hand, along with some of my coworkers from Datadog, to offer assistance.

Although notable bugs and promising ideas exist in pull requests and issues (like #83 and #54), no PRs have been merged since the beginning of the year. We're enthusiastic about bridging this gap and improving vtprotobuf's efficiency and robustness.

In this regard, I suggest opening a discussion on project maintenance and future releases. Our goal is collaborative growth, not imposition. Whether it's contributing code, managing issues, updating documentation, or handling releases, we are more than willing to step in.

Let's collectively ensure that vtprotobuf remains a valuable resource for developers. Your insights are vital, and I'm excited about the potential improvements we can achieve.

Looking forward to your thoughts and suggestions. Thank you for your time and consideration.

equal: generated code does not check reference equality

Expected: Generated code returns true if this and that instances are the same, i.e. msg.EqualVT(msg) should be super fast regardless of the fields in the message

Actual: Generated code still walks the whole message hierarchy and compares field-by-field

I believe implementing this should be as simple as replacing the currently generated code

if this == nil {
    return that == nil
} else if that == nil {
    return false
}

with

if this == that {
    return true
} else if this == nil || that == nil {
    return false
}

Could not find a contribution guide, so let me know if you'd like a PR for this or if you would rather make the change yourselves. Thanks for the great project.
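As a self-contained illustration of the proposed fast path, here is a sketch on a hand-written stand-in type. Msg and its fields are hypothetical; only the first few lines of EqualVT correspond to the snippet proposed above, and the rest is an ordinary field-by-field comparison.

```go
package main

import "fmt"

// Msg is a hypothetical stand-in for a generated message type.
type Msg struct {
	Name string
	Tags []string
}

// EqualVT starts with the proposed pointer comparison, which also covers
// the case where both sides are nil, then falls back to comparing fields.
func (this *Msg) EqualVT(that *Msg) bool {
	if this == that {
		return true
	} else if this == nil || that == nil {
		return false
	}
	if this.Name != that.Name || len(this.Tags) != len(that.Tags) {
		return false
	}
	for i := range this.Tags {
		if this.Tags[i] != that.Tags[i] {
			return false
		}
	}
	return true
}

func main() {
	m := &Msg{Name: "a", Tags: []string{"x", "y"}}
	var none *Msg

	fmt.Println(m.EqualVT(m))       // true, without walking the fields
	fmt.Println(none.EqualVT(nil))  // true
	fmt.Println(m.EqualVT(nil))     // false
	fmt.Println(m.EqualVT(&Msg{Name: "a", Tags: []string{"x", "y"}})) // true
}
```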

Discussion on the use of codec

Does vtprotobuf support using vtprotobuf-generated messages and standard proto.Message types in the same project (for example, when relying on .pb.go files generated by other projects)?
Our real-world scenario is that our internal proto files use vtprotobuf to improve performance, while some other functionality relies on third-party .pb.go files. The following error occurred during processing:

stream is closed with error: xds: stream.Recv() failed: rpc error: code = Internal desc = grpc: error while marshaling: failed to marshal, message is *envoy_service_discovery_v3.DiscoveryRequest (missing vtprotobuf helpers)" func=goexit 

Marshaling empty message in oneof is incompatible with proto.Marshal

An empty message used in a oneof is not marshaled the same as proto.Marshal().

Steps to reproduce:

syntax = "proto3";
package repro;

message Repro {
  message Empty {}
  oneof str_or_empty {
    string str = 1;
    Empty empty = 2;
  }
}

func TestEmptyOneOf(t *testing.T) {
	m := &pb.Repro{StrOrEmpty: &pb.Repro_Empty_{}}

	protobuf, err := proto.Marshal(m)
	if err != nil {
		t.Fatal(err)
	}

	vtprotobuf, err := m.MarshalVT()
	if err != nil {
		t.Fatal(err)
	}

	fmt.Printf("protobuf: %#v\n", protobuf)
	fmt.Printf("vtprotobuf: %#v\n", vtprotobuf)

	require.True(t, bytes.Equal(protobuf, vtprotobuf))
}

Output:

protobuf: []byte{0x12, 0x0}
vtprotobuf: []byte{}
--- FAIL: TestEmptyOneOf (0.00s)
    main_test.go:129: 
        	Error Trace:	main_test.go:129
        	Error:      	Should be true
        	Test:       	TestEmptyOneOf
FAIL
exit status 1
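The two bytes produced by proto.Marshal follow directly from the protobuf wire format: a tag for field 2 with the length-delimited wire type, followed by a length of zero. The sketch below rebuilds them by hand; encodeEmptyMessageField is a hypothetical helper written for this illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// wireTypeLengthDelimited is the protobuf wire type used for embedded
// messages, strings and bytes.
const wireTypeLengthDelimited = 2

// encodeEmptyMessageField is a hypothetical helper that encodes a set but
// empty embedded message: the tag varint followed by a zero length.
func encodeEmptyMessageField(fieldNum uint64) []byte {
	buf := binary.AppendUvarint(nil, fieldNum<<3|wireTypeLengthDelimited)
	return binary.AppendUvarint(buf, 0)
}

func main() {
	// Field 2 is the "empty" member of the oneof in the repro above.
	fmt.Printf("%#v\n", encodeEmptyMessageField(2)) // []byte{0x12, 0x0}
}
```

Emitting nothing at all, as in the vtprotobuf output above, loses the information about which oneof member was set.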

Performance regression for int32 and sfixed32 lists

While benchmarking vtprotobuf in our projects, we noticed a performance regression in case of lists of int32 and sfixed32 numbers.

Marshaling and unmarshaling both seem to be slower with vtprotobuf in case of repeated int32 fields.

Although unmarshaling is faster with vtprotobuf for repeated sfixed32, marshaling is slower.

This repository contains samples of these microbenchmarks: https://github.com/themreza/vtprotobuf-bench/tree/main

What could be causing this? Is there a way to improve the performance?

It would be helpful to have automated benchmarks for different data types comparing vtprotobuf with the built-in proto.Marshal and proto.Unmarshal.

Proto2 Support

Is it expected that the vtproto files aren't currently being generated for proto2 files or do I have something misconfigured?

Unmarshaling empty messages is incompatible with proto.Unmarshal

When unmarshaling an empty message embedded in another message, vtprotobuf is allocating the message, but proto.Unmarshal is using a typed nil.

Steps to reproduce:

syntax = "proto3";
package repro;

message TopLevel {
  message Empty {}
  Empty embedded = 1;
}

func TestEmbeddedEmpty(t *testing.T) {
	m := &pb.TopLevel{Embedded: &pb.TopLevel_Empty{}}

	// Marshal it with protobuf.
	protobuf, err := proto.Marshal(m)
	if err != nil {
		t.Fatal(err)
	}

	// Unmarshal it with protobuf.
	pbm := &pb.TopLevel{}
	if err := proto.Unmarshal(protobuf, m); err != nil {
		t.Fatal(err)
	}

	// Unmarshal it with vtprotobuf.
	vtm := &pb.TopLevel{}
	if err := vtm.UnmarshalVT(protobuf); err != nil {
		t.Fatal(err)
	}

	fmt.Printf("%#v\n\n", pbm)
	fmt.Printf("%#v\n", vtm)

	require.True(t, pbm.EqualVT(vtm), "EqualVT")
	require.True(t, proto.Equal(pbm, vtm), "proto.Equal")
}

Output:

&proto.TopLevel{state:impl.MessageState{NoUnkeyedLiterals:pragma.NoUnkeyedLiterals{}, DoNotCompare:pragma.DoNotCompare{}, DoNotCopy:pragma.DoNotCopy{}, atomicMessageInfo:(*impl.MessageInfo)(nil)}, sizeCache:0, unknownFields:[]uint8(nil), Embedded:(*proto.TopLevel_Empty)(nil)}

&proto.TopLevel{state:impl.MessageState{NoUnkeyedLiterals:pragma.NoUnkeyedLiterals{}, DoNotCompare:pragma.DoNotCompare{}, DoNotCopy:pragma.DoNotCopy{}, atomicMessageInfo:(*impl.MessageInfo)(nil)}, sizeCache:0, unknownFields:[]uint8(nil), Embedded:(*proto.TopLevel_Empty)(0xc00011fbc0)}

--- FAIL: TestEmbeddedEmpty (0.00s)
    main_test.go:37: 
        	Error Trace:	main_test.go:37
        	Error:      	Should be true
        	Test:       	TestEmbeddedEmpty
        	Messages:   	EqualVT
FAIL
exit status 1

Support DiscardUnknown fields with UnmarshalVT

https://pkg.go.dev/google.golang.org/protobuf/proto#UnmarshalOptions

Currently UnmarshalVT tracks unknownFields:

default:
	iNdEx = preIndex
	skippy, err := skip(dAtA[iNdEx:])
	if err != nil {
		return err
	}
	if (skippy < 0) || (iNdEx+skippy) < 0 {
		return ErrInvalidLength
	}
	if (iNdEx + skippy) > l {
		return io.ErrUnexpectedEOF
	}
	m.unknownFields = append(m.unknownFields, dAtA[iNdEx:iNdEx+skippy]...) // <---- unknown fields are kept here
	iNdEx += skippy

It would be helpful to have an option to discard these.

Inconsistent (de)serialization behavior

I am building an application protocol with protobufs, and I'm using vtprotobuf exclusively to marshal and unmarshal the messages. Currently, I'm experiencing strange behavior I'm not understanding that I think is related to vtprotobuf.

Here are my message definitions:

message Header {
  fixed32 Size = 1; // Size of the next message
  fixed32 Checksum = 2; // Checksum of the serialized message
}

message RaftControlPayload {
  oneof Types {
    GetLeaderIDRequest GetLeaderIdRequest = 1;
    GetLeaderIDResponse GetLeaderIdResponse = 2;
    IdRequest IdRequest = 3;
    IdResponse IdResponse = 4;
    IndexState IndexState = 5;
    ModifyNodeRequest ModifyNodeRequest = 6;
    ReadIndexRequest ReadIndexRequest = 7;
    ReadLocalNodeRequest ReadLocalNodeRequest = 8;
    RequestLeaderTransferResponse RequestLeaderTransferResponse = 9;
    RequestSnapshotRequest RequestSnapshotRequest = 10;
    SnapshotOption SnapshotOption = 12;
    StopNodeResponse StopNodeResponse = 13;
    StopRequest StopRequest = 14;
    StopResponse StopResponse = 15;
    SysOpState SysOpState = 16;
    DBError Error = 17;
  }
  enum MethodName {
      ADD_NODE = 0;
      ADD_OBSERVER = 1;
      ADD_WITNESS = 2;
      GET_ID = 3;
      GET_LEADER_ID = 4;
      READ_INDEX = 5;
      READ_LOCAL_NODE = 6;
      REQUEST_COMPACTION = 7;
      REQUEST_DELETE_NODE = 8;
      REQUEST_LEADER_TRANSFER = 9;
      REQUEST_SNAPSHOT = 10;
      STOP = 11;
      STOP_NODE = 12;
  }
  MethodName Method = 18;
}

This message serializes to 10 bytes, which I send across a network stream as a header for whatever unknown message payload is coming next. This allows me to simply pass raw protobuf messages across a network stream without having to leverage gRPC or other RPC frameworks.
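For reference, the 10 bytes can be reproduced by hand: each fixed32 field is a one-byte tag followed by four little-endian bytes. The sketch below is a manual encoder written for illustration only (appendFixed32Field and encodeHeader are hypothetical helpers), and it assumes the file uses proto3 syntax without optional, in which case zero values are not written to the wire and the header is shorter than 10 bytes.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// wireTypeFixed32 is the protobuf wire type for fixed32 values.
const wireTypeFixed32 = 5

// appendFixed32Field is a hypothetical helper that appends one fixed32
// field. Assuming proto3 implicit presence, a zero value is skipped.
// The single-byte tag is only valid for field numbers below 16.
func appendFixed32Field(buf []byte, fieldNum uint32, v uint32) []byte {
	if v == 0 {
		return buf
	}
	buf = append(buf, byte(fieldNum<<3|wireTypeFixed32))
	return binary.LittleEndian.AppendUint32(buf, v)
}

// encodeHeader mirrors the Header message: Size = 1, Checksum = 2.
func encodeHeader(size, checksum uint32) []byte {
	buf := appendFixed32Field(nil, 1, size)
	return appendFixed32Field(buf, 2, checksum)
}

func main() {
	fmt.Println(len(encodeHeader(5, 1298345897))) // 10
	fmt.Println(len(encodeHeader(0, 1298345897))) // 5
}
```

Under that assumption, a reader that always consumes exactly 10 bytes would misframe the stream whenever either field is zero.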

Sending a message across the network stream is pretty straightforward. I prepare a message, serialize the message, create a header with all of the appropriate values, serialize the header, send the header, then send the message.

idReqPayload := &database.RaftControlPayload{
	Method: database.RaftControlPayload_GET_ID,
	Types: &database.RaftControlPayload_IdRequest{
		IdRequest: &database.IdRequest{},
	},
}
payloadBuf, _ := idReqPayload.MarshalVT()

initialHeader := &transportv1.Header{
	Size: uint32(len(payloadBuf)),
	Checksum: crc32.ChecksumIEEE(payloadBuf),
}
headerBuf, _ := initialHeader.MarshalVT()

stream.Write(headerBuf)
stream.Write(payloadBuf)

Receiving a message on the network stream is also pretty straightforward. I read the header into a buffer, deserialize it, read the next N bytes from the stream based on the Size field in the header message, verify the checksum, then deserialize the byte array into the equivalent message.

headerBuf := make([]byte, 10)
if _, err := io.ReadFull(stream, headerBuf); err != nil {
	logger.Error().Err(err).Msg("cannot readAndHandle raft control header")
	continue
}

// unmarshal the header
header := &transportv1.Header{}
if err := header.UnmarshalVT(headerBuf); err != nil {
	logger.Error().Err(err).Msg("cannot unmarshal header")
	return
}

// prep the message buffer
msgBuf := make([]byte, header.Size)
if _, err := io.ReadFull(stream, msgBuf); err != nil {
	logger.Error().Err(err).Msg("cannot read message payload")
	return
}

// verify the message is intact
checked := crc32.ChecksumIEEE(msgBuf)
if checked != header.GetChecksum() {
	logger.Error().Msg("checksums do not match")
}

// unmarshal the payload
msg := &database.RaftControlPayload{}
if err := msg.UnmarshalVT(msgBuf); err != nil {
	logger.Error().Err(err).Msg("cannot unmarshal payload")
}

Here's where things start to get confusing. When I serialize idReqPayload via MarshalVT() and run a checksum against it, I'll get uint32(1298345897); when I send the header as you see here, the Size field is uint32(5) and Checksum is uint32(1298345897). When the header message gets deserialized on the receiving end of a localhost connection, it looks very different.

The header message gets deserialized with the Size field being uint32(5) and the Checksum field being uint(1). That's the first strange thing.

When I run a checksum against the next 5 bytes of the serialized idReqPayload payload which followed, it checksums to uint32(737000948) even though there was no change to the byte array from the time it was serialized to the time it was received. That's the second strange thing.

When I run an equality check against the value of the deserialised header Checksum field against a local checksum of the serialized idReqPayload payload with checked := crc32.ChecksumIEEE(msgBuf); if checked != header.GetChecksum() { // ... }, it passes an equality check - the deserialized header Checksum field's value is uint(1) whereas the calculated checksum of the received message is uint32(737000948). That's the third strange thing.

When I deserialize the serialized idReqPayload byte array, it deserializes without an error. However, the resulting message contents are wrong. When I serialize a message with this configuration:

idReqPayload := &database.RaftControlPayload{
	Method: database.RaftControlPayload_GET_ID,
	Types: &database.RaftControlPayload_IdRequest{
		IdRequest: &database.IdRequest{},
	},
}

It deserializes into this equivalent:

msg := &database.RaftControlPayload{
	Method: database.RaftControlPayload_ADD_NODE,
	Types: nil,
}

The Method field is reset so the enum is defaulted to 0, and the Types field is nil.

I'm fairly positive this could partially be related to #51, but I updated my local protoc-gen-go-vtproto binary to 0ae748f and the problem still persists. I've also eliminated the network stream as it's a localhost network stream, so nothing is intercepting it or modifying it in transit.

Am I doing something wrong or is this a bug of some kind?

Proto3 optional field support

Hello, I have just tried running this generator on my proto3 files and it failed with:

myproto.proto is a proto3 file that contains optional fields, but code generator protoc-gen-go-vtproto hasn't been updated to support optional fields in proto3. Please ask the owner of this code generator to support proto3 optional.--go-vtproto_out:

Are optional fields supported by this generator?

Thanks

Custom Tag Support

Does vtprotobuf provide custom tag support like gogoprotobuf? I can't seem to find any information about this online.

In a .proto file, gogoprotobuf lets me attach customized tags like this:

import "github.com/gogo/protobuf/gogoproto/gogo.proto";

option go_package                  = "events";
option (gogoproto.unmarshaler_all) = true;
option (gogoproto.sizer_all)       = true;
option (gogoproto.marshaler_all)   = true;

message Event {
  string AuctionID                               = 1 [(gogoproto.moretags) = 'gorm:"column:auction_id;type:VARCHAR(64);primary_key"'];
  int64  CampaignID                              = 2 [(gogoproto.moretags) = 'gorm:"column:campaign_id;type:BIGINT;index"'];
  int64  ImpIndex                                = 3 [(gogoproto.moretags) = 'gorm:"column:imp_index;type:INT"'];
  string DomainKey                               = 5 [(gogoproto.moretags) = 'gorm:"column:domain_key;type:VARCHAR(1024)"'];

Does vtprotobuf provide any support for anything like gogoproto.moretags?

equal: code doesn't correctly differentiate between absence and zero values in maps

The generated code doesn't check for presence in the map in other. This may lead to spuriously returning true if:

  1. both maps are of equal size,
  2. keys present in both maps map to the same respective values, and
  3. keys present in only the first map map to the zero value for the respective type.
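The three conditions above can be reproduced with plain Go maps. The sketch below contrasts a comparison that relies on the zero value returned for missing keys with one that checks presence explicitly; both function names are hypothetical and the code is not the generated EqualVT implementation.

```go
package main

import "fmt"

// mapsEqualBuggy looks keys up with b[k], which yields the zero value for
// a missing key, so an absent key is indistinguishable from a zero entry.
func mapsEqualBuggy(a, b map[int32]int32) bool {
	if len(a) != len(b) {
		return false
	}
	for k, v := range a {
		if b[k] != v {
			return false
		}
	}
	return true
}

// mapsEqual uses the two-value lookup to require that the key exists.
func mapsEqual(a, b map[int32]int32) bool {
	if len(a) != len(b) {
		return false
	}
	for k, va := range a {
		vb, ok := b[k]
		if !ok || va != vb {
			return false
		}
	}
	return true
}

func main() {
	a := map[int32]int32{1: 0, 2: 37}
	b := map[int32]int32{2: 37, 3: 42}

	fmt.Println(mapsEqualBuggy(a, b)) // true (spurious)
	fmt.Println(mapsEqual(a, b))      // false
}
```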

Failing test case:

func TestEqualVT_Map_AbsenceVsZeroValue(t *testing.T) {
	a := &TestAllTypesProto3{
		MapInt32Int32: map[int32]int32{
			1: 0,
			2: 37,
		},
	}
	b := &TestAllTypesProto3{
		MapInt32Int32: map[int32]int32{
			2: 37,
			3: 42,
		},
	}

	aJson, err := protojson.Marshal(a)
	require.NoError(t, err)
	bJson, err := protojson.Marshal(b)
	require.NoError(t, err)

	if a.EqualVT(b) {
		assert.JSONEq(t, string(aJson), string(bJson))
		err := fmt.Errorf("these %T should not be equal:\nmsg = %+v\noriginal = %+v", a, a, b)
		require.NoError(t, err)
	}
}

How to use vtprotobuf with bufbuild/buf should be in the README

A guide to using vtprotobuf with bufbuild/buf:

On macOS

step 1:

set up PATH

export GOBIN=/Users/xxxx/go/bin
export PATH=$PATH:$GOBIN

step 2:

Follow the README to install vtprotobuf:

go install github.com/planetscale/vtprotobuf/cmd/protoc-gen-go-vtproto@latest

step 3:

Follow https://docs.buf.build/installation to install buf. On macOS, for example:

brew install bufbuild/buf/buf

step 4:

add vtprotobuf to buf.gen.yaml

version: v1
managed:
  enabled: true
  go_package_prefix:
    default: github.com/your/grpc-project
plugins:
  - plugin: buf.build/bufbuild/connect-go
    out: ./
    opt: paths=source_relative
  - plugin: buf.build/protocolbuffers/go
    out: ./
    opt: paths=source_relative
  - plugin: go-vtproto
    out: ./
    opt: paths=source_relative

The part that matters is this entry in the plugins list:

  - plugin: go-vtproto
    out: ./
    opt: paths=source_relative

Finally, generate the code, including the vtprotobuf helpers:

buf generate

That's all.

How to generate pool methods from buf cli

So I've got a slightly different method of generating protobufs. I used the buf cli to generate protos and my config file looks like the following:

version: v1
plugins:
  - name: go
    out: ./generated/
    opt: paths=source_relative
  - plugin: buf.build/grpc/go:v1.3.0
    out: ./
    opt:
      - paths=source_relative
  - plugin: go-vtproto
    out: ./
    opt:
      - paths=source_relative
      - features=marshal+unmarshal+size+equal+pool+clone

I have both used and not used features, which ends up having the same outcome. None of the pool methods are generated on the types, such as ReturnToVTPool or ResetVT. All other methods are generated.

I'm using the latest version of the plugin.

I noticed that in a bug, someone was doing:

message Parent {
  option (vtproto.mempool) = true;
  repeated Child children = 1;
  Child one = 2;
}

message Child {
  option (vtproto.mempool) = true;
  uint32 field = 1;
}

with option (vtproto.mempool) = true;

This didn't work for me, giving an error about an unsupported option. Also, the only reference I found was in the bug, I never saw it in regular documentation.

I'm sure I'm missing something simple, but not sure what it is. Any help would be appreciated. And thanks for the hard work on this project. We certainly needed something to take up the slack with gogoproto being deprecated.
