sarvalabs / go-polo
Go Implementation of the POLO Serialization Scheme
License: Apache License 2.0
Unexpected panic occurs when an object is decoded with the following code
package main

import (
	"fmt"
	"log"

	"github.com/sarvalabs/go-polo"
)

type Fruit struct {
	Name string
	Cost *uint8
}

func main() {
	fruit := Fruit{Name: "Orange", Cost: nil}
	wire, err := polo.Polorize(fruit)
	if err != nil {
		log.Println(err)
	}
	fmt.Println(wire)
	decoded := new(Fruit)
	if err = polo.Depolorize(decoded, wire); err != nil {
		log.Println(err)
	}
	fmt.Println(decoded)
}
[14 47 6 96 79 114 97 110 103 101]
panic: reflect.Set: value of type uint64 is not assignable to type uint8 [recovered]
panic: reflect.Set: value of type uint64 is not assignable to type uint8
...
/Users/manish/sdk/go/src/reflect/value.go:3064 +0x210
reflect.Value.Set({0x102e6da00?, 0x14000120bb9?, 0x102e6da00?}, {0x102e6d9c0?, 0x102e452e8?, 0x38746e69752a2a?})
/Users/manish/sdk/go/src/reflect/value.go:2090 +0xcc
github.com/sarvalabs/go-polo.(*Depolorizer).depolorizePointer(0x102e668e0?, {0x102ecc230, 0x102e668e0})
/Users/manish/code/github/sarvalabs/go-polo/depolorizer.go:774 +0xe4
github.com/sarvalabs/go-polo.(*Depolorizer).depolorizeValue(0x0?, {0x102ecc230?, 0x102e668e0})
/Users/manish/code/github/sarvalabs/go-polo/depolorizer.go:812 +0x9a8
github.com/sarvalabs/go-polo.(*Depolorizer).depolorizeStructValue(0x14000059c28?, {0x102ecc230?, 0x102e90ca0})
/Users/manish/code/github/sarvalabs/go-polo/depolorizer.go:686 +0x618
github.com/sarvalabs/go-polo.(*Depolorizer).depolorizeValue(0x102e614a0?, {0x102ecc230?, 0x102e90ca0})
/Users/manish/code/github/sarvalabs/go-polo/depolorizer.go:897 +0xd2c
github.com/sarvalabs/go-polo.(*Depolorizer).Depolorize(0x14000120b80?, {0x102e614a0?, 0x140001322e8?})
/Users/manish/code/github/sarvalabs/go-polo/depolorizer.go:90 +0x120
github.com/sarvalabs/go-polo.Depolorize({0x102e614a0, 0x140001322e8}, {0x14000120b80?, 0x1?, 0x1?})
...
Lines 760 to 777 in f767a8b
The above code for Depolorizer.depolorizePointer() shows that the function does not convert the underlying reflected value into its true type before setting it (line 774). As a result, a uint64 (PosInt) number is not cast to uint8. This only occurs if the type is declared as a pointer, which results in a call to depolorizePointer, where the problem code lives.
We should change the call on line 774 from p.Elem().Set(value) to p.Elem().Set(value.Convert(p.Elem().Type())). This ensures that value is converted to the element type of the pointer before it is set.
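The proposed change can be tried in isolation with only the reflect package. The following is a minimal sketch, not the go-polo source: setThroughPointer and decodeCost are hypothetical names, and the decoded value is simulated with a plain uint64.

```go
package main

import (
	"fmt"
	"reflect"
)

type Fruit struct {
	Name string
	Cost *uint8
}

// setThroughPointer (hypothetical helper) allocates a new value of the
// pointer's element type and stores decoded into it, converting decoded
// to the element type first. target must be a settable pointer value.
func setThroughPointer(target reflect.Value, decoded reflect.Value) {
	p := reflect.New(target.Type().Elem())
	// Without Convert, Set panics when decoded is a uint64 and the element is a uint8.
	p.Elem().Set(decoded.Convert(p.Elem().Type()))
	target.Set(p)
}

// decodeCost simulates decoding a PosInt (held as a uint64) into Fruit.Cost.
func decodeCost(raw uint64) uint8 {
	fruit := new(Fruit)
	field := reflect.ValueOf(fruit).Elem().FieldByName("Cost")
	setThroughPointer(field, reflect.ValueOf(raw))
	return *fruit.Cost
}

func main() {
	fmt.Println(decodeCost(25))
}
```

Note that Convert follows Go's integer conversion rules and silently truncates values that do not fit, so a real decoder would likely want an overflow check (for example with reflect.Value.OverflowUint) before converting.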
go-polo does not enforce any size limit and uses 64-bit unsigned arithmetic, and hence has a much higher potential message size. This needs to be artificially limited to 64 MB to make the implementation crash tolerant while loading large messages into memory. There has been no instance of such crashes (yet), but better safe than sorry. (This should probably also be enforced by the specification across implementations.)
EncodingOption to determine the allowed message size, defaulting the wire config value to 64MB and disallowing values over 2GB in the encoding option.
A polo.Raw type that can be used by consuming applications to define a field in a struct that must be decoded into its raw data, rather than into any actual concrete type. This raw data can then be decoded independently.
polo.Raw when used with Depolorize().
WireRaw Wire Type
WireRaw wire type at value 5, replacing the deprecated WireBigInt.
When a Raw object is polorized, it will be encoded with this new wire type.
[5 3 1 44] is a valid POLO wire tagged with the WireRaw wire type. It can be decoded either into a polo.Raw or into a []byte, which will contain the data [3 1 44]. This data is itself POLO encoded and can be interpreted as a WirePosInt with a value of 300.
Document type to map[string]Raw
WireLoad that alternates between WireWord and WireRaw for each key and value in the Document, instead of the current form of being all WireWord objects.
Unexpected panic occurs when an object is decoded with the following code
type Fruit struct {
	Name string
	Cost uint64
}

func main() {
	var fruit *Fruit
	data := []byte{...}
	if err := polo.Depolorize(fruit, data); err != nil {
		log.Println(err)
	}
}
Lines 80 to 100 in f767a8b
The above code shows that the Depolorize method of Depolorizer does not check whether the element of the value variable (the reflected object for a pointer) is settable, and presumes this property purely based on the fact that it is a pointer and addressable. This causes the following lines, where we set the decoded value to the object with value.Elem().Set(), to panic.
An object declared with var fruit *Fruit instead of fruit := new(Fruit), as in the problem code above, results in a non-settable pointer value.
After checking that value is a pointer, we need to call value.Elem().CanSet() to determine whether it is a settable object; otherwise, we must return an error to communicate the same.
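The check described above can be sketched with the reflect package alone. checkTarget is a hypothetical helper name and the error messages are placeholders, not the ones go-polo returns.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

type Fruit struct {
	Name string
	Cost uint64
}

// checkTarget (hypothetical) reports whether object can receive a decoded value.
func checkTarget(object any) error {
	value := reflect.ValueOf(object)
	if value.Kind() != reflect.Ptr {
		return errors.New("object is not a pointer")
	}
	// For a nil pointer, Elem() returns the zero Value, for which CanSet is false.
	if !value.Elem().CanSet() {
		return errors.New("object is not settable")
	}
	return nil
}

func main() {
	var nilFruit *Fruit
	fmt.Println(checkTarget(nilFruit))   // reports an error instead of panicking
	fmt.Println(checkTarget(new(Fruit))) // <nil>
}
```

Calling CanSet on the zero Value is safe, which is what makes this guard usable before value.Elem().Set().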
WireBigInt wire type does not express this polarity.
WireBigInt wire type and instead uses the WirePosInt and WireNegInt wire types, which are currently used for integers up to 64 bits. The new behaviour allows those wire types to express arbitrarily sized integer data.
MapPolorizer and a MapDepolorizer that allow inserting/fetching key-values by random access instead of sorted (ordered) access.
When using DocumentEncode to doc-encode an object, only the top-level struct gets encoded that way. Any structs that are more deeply nested are encoded regularly.
Document Encoding is a special feature of POLO allowing an alternate encoded form for string addressable collections. This is useful for preserving field names for classes/structs, because they are encoded into the wire as well. The real capabilities of Document Encoding become available after the data has been decoded into a Document object.
Document Type
polo.Document is an intermediary structure used when decoding a document encoded wire (wire type 13).
type Document map[string][]byte
Each value in the Document map represents some POLO encoded data, which in turn may be atomic or compound in nature (it can also be another document).
polo:"<name>" tagging convention.
type Fruit struct {
	Name  string   `polo:"name"`
	Cost  uint64   `polo:"cost"`
	Alias []string `polo:"aliases"`
}
Polorize function as it would not be able to determine it independently. So we need a separate function, DocumentEncode, for encoding objects into document encoded wire.
Depolorize function to recognize that the target object is a Document (reflect.Map Kind) or a struct when encountering the WireDoc wire type.
// DocumentEncode encodes a given object into its POLO bytes with Document Encoding.
// Returns an error if the given object is not a supported type (structs, maps with string keys, or a pointer to either).
func DocumentEncode(object any) ([]byte, error) {...}
// Size returns the number of elements in the Document
func (doc Document) Size() int {...}
// Bytes collapses the Document into its POLO wire form
func (doc Document) Bytes() []byte {...}
// Is returns whether the wire for a given key is of a specific WireType.
// Allows pseudo type inspection for document encoded components
func (doc Document) Is(key string, kind WireType) bool {...}
// Get retrieves the raw POLO bytes for a given key. Returns nil if no data for key.
func (doc Document) Get(key string) []byte {...}
// Set inserts some raw POLO bytes for a given key. Does not validate the data.
// Can lead to malformed data while decoding if the data is not valid POLO bytes.
func (doc Document) Set(key string, data []byte) {...}
// GetObject decodes the data for a given key into the given object.
// Returns an error if the object is not a pointer, if there is no data for the key, or if it cannot be decoded into the given object type.
func (doc Document) GetObject(key string, object any) error {...}
// SetObject encodes the given object and sets it for the given key.
// Returns an error if the given object cannot be serialized with Polorize.
func (doc Document) SetObject(key string, object any) error {...}
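To make the raw-bytes part of this API concrete, here is a minimal, self-contained sketch of the map-backed methods. It is not the go-polo implementation. The WireType constants are only the values quoted elsewhere in this thread, and reading the wire type from the low 4 bits of the first byte is an assumption made for illustration.

```go
package main

import "fmt"

// WireType values are taken from bytes quoted in this thread; the rest of the enum is omitted.
type WireType byte

const (
	WirePosInt WireType = 3
	WireDoc    WireType = 13
)

// Document maps string keys to raw POLO encoded bytes.
type Document map[string][]byte

// Size returns the number of elements in the Document.
func (doc Document) Size() int { return len(doc) }

// Get returns the raw bytes for key, or nil if there is no data for it.
func (doc Document) Get(key string) []byte { return doc[key] }

// Set stores raw bytes for key without validating them.
func (doc Document) Set(key string, data []byte) { doc[key] = data }

// Is reports whether the data for key is tagged with the given wire type.
// Assumption: the wire type sits in the low 4 bits of the first byte.
func (doc Document) Is(key string, kind WireType) bool {
	data := doc[key]
	if len(data) == 0 {
		return false
	}
	return WireType(data[0]&0x0F) == kind
}

func main() {
	doc := make(Document)
	doc.Set("cost", []byte{3, 1, 44}) // the PosInt wire for 300 quoted earlier
	fmt.Println(doc.Size(), doc.Is("cost", WirePosInt), doc.Get("missing") == nil)
}
```

GetObject and SetObject would layer Depolorize and Polorize on top of Get and Set; they are left out here because they depend on the library itself.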
Remove the unintended print statement in the following line in depolorizer.go
Line 498 in fbca3b7
Currently the POLO API specifies 5 ways to encode data.
Polorize() can be used to encode an arbitrary Go object into its POLO bytes.
Depolorize() can be used to decode some POLO bytes into an arbitrary Go object.
Packer type exposes methods to encode pack encoded data sequentially with its Pack and PackWire methods.
Unpacker type exposes methods to decode pack encoded data sequentially with its Unpack and UnpackWire methods.
DocumentEncode() can be used to encode string keyed maps and structs into doc-encoded POLO bytes, with the Document type being able to encode and decode objects from it using its GetObject and SetObject methods.
The proposal is to unify the encoding/decoding API with the following.
Polorizer and Depolorizer types to replace Packer and Unpacker, allowing the encoding/decoding of atomics, pack encoded data and documents. This allows for a more performant model of sequential encoding/decoding without using Go's reflect library, which is notorious for its performance problems. We will still retain the Polorize() and Depolorize() functions to perform polorization using the reflect package.
func main() {
	type Fruit struct {
		Name  string
		Cost  int
		Alias []string
	}
	doc := make(Document)
	doc.SetObject("Name", "orange")
	doc.SetObject("Cost", 300)
	doc.SetObject("Alias", []string{"tangerine", "mandarin"})
	docwire := doc.Bytes()
	fmt.Println(docwire)
	fruit := new(Fruit)
	if err := Depolorize(fruit, docwire); err != nil {
		fmt.Println(err)
	}
	fmt.Println(fruit)
}
[13 175 1 6 86 198 3 134 4 198 4 134 5 65 108 105 97 115 6 14 63 6 150 1 116 97 110 103 101 114 105 110 101 109 97 110 100 97 114 105 110 67 111 115 116 6 3 1 44 78 97 109 101 6 6 111 114 97 110 103 101]
decode error: struct field [polo.Fruit.Cost <int>]: incompatible wire type. expected: posint. got: word
&{ 0 []}
The issue with the Document.Bytes() method persists. This seems to be because the fix with #9 did not correctly collapse the bytes. Calling DocumentEncode on a document constructed map has the effect of creating extra word tags for the already encoded data in the Document.
Replace the Bytes() method with the code below. This time the regular Polorize is sufficient, but the first tag of the message is replaced to have the WireDoc tag instead of the WirePack tag.
// Bytes returns the POLO wire representation of a Document
func (doc Document) Bytes() []byte {
	data, _ := Polorize(doc)
	data[0] = byte(WireDoc)
	return data
}
newLoadDepolorizer as it does not allow us to propagate the wire config correctly.
Polorizer and Depolorizer to have option flags to change the encoding/decoding behaviour of the buffers.
Polorize and Depolorize functions
Document Upon Collapse
Calling Document.Bytes() to collapse a Document object into its POLO bytes form should return binary data that starts with a byte of value 13 [0b1101], indicating the WireDoc wire type. But it currently returns 14 [0b1110], which (incorrectly) indicates that it is pack encoded.
This seems to be caused by an improper implementation of the Bytes method of Document. It should be using DocumentEncode to polorize the Document into its bytes, but it instead uses the standard Polorize function, leading it to be pack encoded instead of doc encoded. See below for details.
// github.com/sarvalabs/go-polo/document.go
// Bytes returns the POLO wire representation of a Document
func (doc Document) Bytes() []byte {
	data, _ := Polorize(doc) // << BUG >>
	return data
}
The fix is straightforward, requiring a refactor of the Bytes method to use the DocumentEncode function instead of Polorize.
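A regression check for this bug only needs to look at the leading byte of the collapsed wire. The sketch below assumes, as the report describes, that the first byte of a top-level wire carries the wire type (13 for doc, 14 for pack). isDocEncoded is a hypothetical helper, and the sample bytes are wires quoted elsewhere in this thread.

```go
package main

import "fmt"

const (
	wireDoc  byte = 13 // 0b1101
	wirePack byte = 14 // 0b1110
)

// isDocEncoded (hypothetical) reports whether wire starts with the WireDoc tag.
func isDocEncoded(wire []byte) bool {
	return len(wire) > 0 && wire[0] == wireDoc
}

func main() {
	// Pack encoded wire quoted in the pointer-decoding report.
	packWire := []byte{14, 47, 6, 96, 79, 114, 97, 110, 103, 101}
	// Leading bytes of the doc encoded wire quoted in the Document.Bytes() report.
	docPrefix := []byte{13, 175, 1, 6, 86}

	fmt.Println(isDocEncoded(packWire), isDocEncoded(docPrefix))
}
```

A unit test for Document.Bytes() could apply the same check to its output so that a pack-tagged result fails immediately.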
WirePack and can similarly be 'unpacked' element by element.
readbuffer and loadreader to do this cleanly.
type Packer struct {
	// unexported readbuffer
}
// NewPacker is a constructor for Packer
func NewPacker() *Packer {...}
// Pack encodes the given object into Packer
func (pack *Packer) Pack(object any) error {...}
// PackWire writes the encoded wire into Packer.
// wire must not be empty and begin with a valid WireType
func (pack *Packer) PackWire(wire []byte) error {...}
// Bytes returns the encoded data as valid WirePack tagged POLO bytes
func (pack Packer) Bytes() []byte {...}
type Unpacker struct {
// unexported loadreader
}
// NewUnpacker is a constructor for Unpacker
// Returns an error if the data is not a compound wire
func NewUnpacker(wire []byte) (*Unpacker, error) {...}
// Unpack decodes the next element into the given object.
// Returns error if not a pointer or if the element cannot be decoded into the object
func (unpack *Unpacker) Unpack(object any) error {...}
// UnpackWire reads an element from Unpacker and returns it.
func (unpack *Unpacker) UnpackWire() ([]byte, error) {...}
// Done returns whether all the elements have been unpacked
func (unpack Unpacker) Done() bool {...}
Polorize with nil.
This panic only occurs when passing an abstract nil, such as in the following function
func Do(msg interface{}) error {
	bytes, err := polo.Polorize(msg) // panic occurs here
	if err != nil {
		return err
	}
	// more logic
	return nil
}
The panic message and partial stack trace are as follows
panic: reflect: call of reflect.Value.Type on zero Value [recovered]
panic: reflect: call of reflect.Value.Type on zero Value
panic({0x102538760, 0x140001322b8})
/Users/manish/sdk/go/src/runtime/panic.go:838 +0x204
reflect.Value.Type({0x0?, 0x0?, 0x4?})
/Users/manish/sdk/go/src/reflect/value.go:2453 +0x130
github.com/sarvalabs/go-polo.polorize({0x0?, 0x0?, 0x102538520?}, 0x102538520?)
/Users/manish/code/github/sarvalabs/go-polo/polorize.go:69 +0x264
github.com/sarvalabs/go-polo.Polorize({0x0?, 0x0?})
/Users/manish/code/github/sarvalabs/go-polo/polo.go:10 +0xec
This kind of panic is expected to occur only for nil values of abstract types, such as those of interface declarations. For concrete nil values, this issue does not seem to occur.
The specific culprit is the unsupported type case: when an unsupported type is passed, the error message attempts to include the unsupported type and its kind, requiring a call to reflect.Value.Type(). This call panics for zero values, and abstract nil values cannot be checked with IsNil or IsZero without also causing another panic.
We can attempt to check for nil values at the Polorize function level before reflecting on the given object, but this would rule out nil values for valid and concrete types such as slices and maps.
The best option is to recover from the panic when it occurs within the unsupported type case of the reflection tree within the polorize function.
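The recover approach can be sketched outside the library. describeUnsupported below is a hypothetical stand-in for the unsupported type branch of polorize, and its error strings are placeholders.

```go
package main

import (
	"fmt"
	"reflect"
)

// describeUnsupported (hypothetical) builds an "unsupported type" error that
// includes the type and kind of object, recovering if reflection panics.
func describeUnsupported(object any) (err error) {
	defer func() {
		if r := recover(); r != nil {
			// Reached for an abstract nil, where Type() panics on the zero Value.
			err = fmt.Errorf("unsupported type: abstract nil value (%v)", r)
		}
	}()

	v := reflect.ValueOf(object)
	return fmt.Errorf("unsupported type: %v [%v]", v.Type(), v.Kind())
}

func main() {
	fmt.Println(describeUnsupported(nil))
	fmt.Println(describeUnsupported(make(chan int)))
}
```

As an alternative to recovering, reflect.Value.IsValid returns false for the zero Value without panicking, so it may be possible to guard the Type() call with it instead. Whether that fits the structure of polorize is not something this sketch establishes.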