🎷 No ceremony, just code. Blazing fast, typesafe binary serialization.

Home Page: https://bebop.sh/

License: Apache License 2.0


Bebop

No ceremony, just code.
Blazing fast, typesafe binary serialization.


Introduction

Bebop is a high-performance data interchange format designed for fast serialization and deserialization.

        
// Example Bebop Schema
struct Person {
  string name;
  uint32 age;
}

// Generated TypeScript Code
new Person({
    name: "Spike Spiegel",
    age: 27
}).encode();
Write concise and expressive schemas with Bebop's intuitive syntax, then use the generated classes to encode and decode data.
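For illustration, here is a hedged, hand-rolled sketch of what a struct like Person looks like on the wire (a little-endian uint32 byte-length prefix plus UTF-8 bytes for the string, then a little-endian uint32 for the age). The real encoders are generated by bebopc; this snippet only mirrors the layout idea:

```typescript
// Minimal sketch of a Bebop-style struct layout (not the generated code):
// strings are a uint32 length prefix + UTF-8 bytes, integers are little-endian.
interface Person {
  name: string;
  age: number;
}

function encodePerson(p: Person): Uint8Array {
  const nameBytes = new TextEncoder().encode(p.name);
  const out = new Uint8Array(4 + nameBytes.length + 4);
  const view = new DataView(out.buffer);
  view.setUint32(0, nameBytes.length, true); // length prefix, little-endian
  out.set(nameBytes, 4);
  view.setUint32(4 + nameBytes.length, p.age, true);
  return out;
}

function decodePerson(buf: Uint8Array): Person {
  const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
  const len = view.getUint32(0, true);
  const name = new TextDecoder().decode(buf.subarray(4, 4 + len));
  const age = view.getUint32(4 + len, true);
  return { name, age };
}

const bytes = encodePerson({ name: "Spike Spiegel", age: 27 });
const roundTripped = decodePerson(bytes);
```

Because the length is carried in the payload itself, no schema registry or ceremony is needed at decode time.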

It combines the simplicity of JSON with the efficiency of binary formats, delivering exceptional performance. In benchmarks, Bebop outperforms Protocol Buffers by approximately 10 times in both C# and TypeScript. Compared to JSON, Bebop is roughly 10 times faster in C# and about 5 times faster in TypeScript.

Benchmark Graphs

Bebop provides a modern, developer-friendly experience while ensuring top-notch performance. It is the ideal choice for any application that requires efficient data serialization, especially in performance-critical scenarios.

To explore the schema language and see examples of the generated code, check out the playground.

Key Features

  • 🧙‍♂️ Supports TypeScript, C#, Rust, C++, and more.
  • 🎁 Snappy DX - integrate bebopc into your project with ease. Language support available in VSCode.
  • 🏃 Lightweight - Bebop has zero dependencies and a tiny runtime footprint. Generated code is tightly optimized.
  • 🌗 RPC - build efficient APIs with Tempo.
  • ☁️ Runs everywhere - browsers, serverless platforms, and on bare metal.
  • 📚 Extendable - write extensions for the compiler in any language.

👉 For more information, check out the documentation. 👈

See You Space Cowboy...

bebop's People

Contributors

andrewmd5, bengreenier, bugq, dependabot[bot], frobthebuilder, gerhobbelt, homeworkprod, jakecooper, kushsolitary, l1qu1d1ert, liquidiert, lynn, mattiekat, owobred, pajlada, patriksvensson, programmeral


bebop's Issues

Rust generated code does not support enum 'bitfields'

Describe the bug

Rust runtime cannot serialize or deserialize 'bitfields' (flags, bitflags).

To Reproduce

schema:

enum Flags {
    None = 0;
    A = 1;
    B = 2;
    C = 4;
    D = 8;
}
  1. You cannot serialize a combination of these 'flags' (Flags::A | Flags::B in other languages); this is simply not possible with Rust enums, and you cannot cast a u32 into a Flags (non-primitive).
  2. Deserializing calls the enum's try_from, which expects distinct values:
impl ::core::convert::TryFrom<u32> for Flags {
    type Error = ::bebop::DeserializeError;

    fn try_from(value: u32) -> ::bebop::DeResult<Self> {
        match value {
            0 => Ok(Flags::None),
            1 => Ok(Flags::A),
            2 => Ok(Flags::B),
            4 => Ok(Flags::C),
            8 => Ok(Flags::D),
            d => Err(::bebop::DeserializeError::InvalidEnumDiscriminator(d)),
        }
    }
}

Expected behavior

Be able to serialize + deserialize 'bitfields' in some way.

In my opinion, distinct enum values should either be enforced or expected for 'regular enums'; 'bitfields' can then be denoted with an attribute (similar to C#'s [Flags]) that the code generators use to allow bit combinations.
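The semantics being requested can be sketched in TypeScript (for illustration only; the actual fix would live in the Rust code generator): a flags-style enum accepts any combination of its defined bits, while a regular enum accepts only exact discriminators, matching the generated try_from above.

```typescript
// Hypothetical decode behavior for a [flags]-style enum vs. a regular enum.
const Flags = { None: 0, A: 1, B: 2, C: 4, D: 8 } as const;

const DEFINED: number[] = Object.values(Flags);
// All defined bits OR'ed together: 0b1111.
const ALL_BITS = DEFINED.reduce((acc, v) => acc | v, 0);

function decodeFlags(value: number): number {
  // Flags behavior: valid iff no bit outside the defined set is present.
  if ((value & ~ALL_BITS) !== 0) {
    throw new Error(`invalid flags discriminator: ${value}`);
  }
  return value;
}

function decodeStrictEnum(value: number): number {
  // Regular-enum behavior: only exact defined values are accepted,
  // mirroring the generated Rust try_from shown above.
  if (!DEFINED.includes(value)) {
    throw new Error(`invalid enum discriminator: ${value}`);
  }
  return value;
}
```

So Flags.A | Flags.B (= 3) would round-trip under the flags rules, but be rejected by the strict decoder.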

Bebop info:

  • 2.3.0
  • Rust

Desktop (please complete the following information):

  • OS: Windows
  • Version 10 19043.1237

RPC API generation API spec

Service definition syntax

message HelloRequest {
    1 -> bool hello;
}
message HelloResponse {
    1 -> string hi;
}
service MyService {
    sayHello(HelloRequest) => HelloResponse
}

@lynn noted that we could also use the normal arrow syntax here instead, though I do prefer the double-wide because of the different semantics in this context. Either way seems alright to me though.

Generated code

type RpcImpl = (methodName: string, requestData: BebopView, callback: (responseData: BebopView) => void) => void
interface ServiceClass {
    new(rpcImpl: RpcImpl): Service
}

The rpcImpl parameter is simply a function that accepts the method name, encoded request data, and a callback which is to be called upon receiving the response with the encoded response data. This data MUST be of the type specified in the service definition. The purpose of this function is to handle the actual sending of the request down the wire, and catching the response to send back to the Bebop runtime.
The constructed Service contains the methods defined by the schema, using interfaces for easy call ergonomics.

import {MyService} from "generated"
const service = new MyService(rpcImpl)

console.log(await service.sayHello({ hello: true })) //{ hi: "hello" }
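To make the rpcImpl contract concrete, here is a hedged sketch of one wired to an in-memory "server". The BebopView payloads from the proposal are replaced with Uint8Array, and encoding is faked with JSON purely to keep the example self-contained; real Bebop RPC would use the generated binary encoders and a network transport.

```typescript
// Hypothetical rpcImpl: dispatches requests to local handlers instead of a socket.
type RpcImpl = (
  methodName: string,
  requestData: Uint8Array,
  callback: (responseData: Uint8Array) => void
) => void;

// Server-side handlers keyed by method name (JSON stands in for Bebop encoding).
const handlers: Record<string, (req: Uint8Array) => Uint8Array> = {
  sayHello: (req) => {
    const { hello } = JSON.parse(new TextDecoder().decode(req));
    return new TextEncoder().encode(
      JSON.stringify({ hi: hello ? "hello" : "?" })
    );
  },
};

const rpcImpl: RpcImpl = (methodName, requestData, callback) => {
  // A real transport would send this down the wire; here we dispatch locally.
  callback(handlers[methodName](requestData));
};

// Roughly what a generated client method would do under the hood:
function sayHello(req: { hello: boolean }): Promise<{ hi: string }> {
  return new Promise((resolve) => {
    rpcImpl("sayHello", new TextEncoder().encode(JSON.stringify(req)), (res) =>
      resolve(JSON.parse(new TextDecoder().decode(res)))
    );
  });
}
```

The design keeps the transport entirely in user hands: the runtime only cares that the callback eventually fires with encoded response bytes.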

Error: Cannot find module '...\node_modules\bebop\dest\index.js'. Please verify that the package.json has a valid "main" entry

Describe the bug
The file dest/index.js is missing from the npm distribution due to the project's .gitignore file. Thus, the import fails in Node.js.

To Fix:
Remove .gitignore, run npm run build, and publish a new version.

To Reproduce
Steps to reproduce the behavior:

  1. Follow the steps from Getting Started with TypeScript.

Expected behavior
The command node ./index.js (from the guide above) should run successfully.

Bebop info:

  • Version: 2.0.0
  • Runtime: Node.js v14.3.0

Desktop:

  • OS: Windows 10 Home
  • Version: 2004 (19041.630)

MSBuild Targets Not Working on macOS

Describe the bug

The bebop tools don't seem to resolve properly on macOS, so the compiler tools fail to compile the .bop files.

To Reproduce
Steps to reproduce the behavior:

  1. Create a client library
  2. Install bebop-tools from NuGet
  3. Modify .csproj with "Getting Started" MSBuild target
  4. Compile the application

Expected behavior

The .bop file is compiled into the corresponding *.cs file.

Screenshots / Snippets

Modifying the target directly and fixing the path separators allows me to compile.

From Contracts.csproj

    <ItemGroup>
        <Bebop Include="**/*.bop" OutputDir="./Models/" OutputFile="Records.g.cs" Namespace="Cowboy.Contracts" />
    </ItemGroup>

Modified MSBuild Target

	<Target Name="CompileBops" BeforeTargets="CoreCompile" DependsOnTargets="PrepareForBuild">

		<Exec
			Command="$([System.IO.Path]::GetFullPath('$(MSBuildThisFileDirectory)../tools/macos/bebopc')) --log-format MSBuild --cs &quot;$([System.IO.Path]::GetFullPath('%(Bebop.OutputDir)'))%(Bebop.OutputFile)&quot; --namespace %(Bebop.Namespace) --files $(_BebopSchemas)"
			EchoOff='true'
			StandardErrorImportance='high'
			StandardOutputImportance='low'
			ConsoleToMSBuild='true'
			ContinueOnError='false'
			StdOutEncoding='utf-8'>
			<Output TaskParameter="ConsoleOutput" PropertyName="_BebopCompiler" />
			<Output TaskParameter="ExitCode" PropertyName="MSBuildLastExitCode" />
		</Exec>

	</Target>

Here is the running sample I got working. To be clear, I had to download the NuGet package and copy the tools to the root folder. Not sure if I needed to do that but it seems to work.

It might make sense to migrate the bebop-tools over to a dotnet global tool.

Here is a working sample https://github.com/khalidabuhakmeh/CowboyBebopSample

Bebop info:

  • Version: 2.0.2
  • Runtime: .NET 5.0

Desktop (please complete the following information):

  • OS: macOS
  • Version: Big Sur


Fixed-size arrays

Is your feature request related to a problem? Please describe.

Yes; I was hoping to remove the overhead of prefixing a length onto an array of bytes that I know to be of fixed length ahead of time. Fields that effectively denote a fixed-size array of arbitrary types are very common. Examples of such fields include 256-bit hashes, 16-byte UUIDs, 128-bit bitstrings, etc.

Describe the solution you'd like

A way to denote a field to be a fixed-sized array of some type. The schema syntax could be as simple as [N]T where N is the fixed number of elements stored in the array, and T is the type of the elements of the array.

Describe alternatives you've considered

Using variable-length arrays, which as mentioned incurs an unnecessary size overhead for the type of data concerned.
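The overhead in question is easy to quantify. A hedged sketch (illustrative only, not the Bebop wire format implementation): today a 16-byte UUID stored as byte[] carries a 4-byte uint32 length prefix, while a hypothetical fixed-size [16]byte field could omit the prefix because both sides already know the length from the schema.

```typescript
// Variable-length encoding: 4-byte little-endian length prefix + payload.
function encodeVariable(data: Uint8Array): Uint8Array {
  const out = new Uint8Array(4 + data.length);
  new DataView(out.buffer).setUint32(0, data.length, true); // length prefix
  out.set(data, 4);
  return out;
}

// Hypothetical fixed-size encoding: the schema fixes the length, so no prefix.
function encodeFixed(data: Uint8Array, n: number): Uint8Array {
  if (data.length !== n) throw new Error(`expected exactly ${n} bytes`);
  return data.slice();
}

const uuid = new Uint8Array(16).fill(0xab);
```

For a record holding many such fields (hashes, UUIDs), the saved 4 bytes per field adds up quickly.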

Dart schema generator mistakenly adds @ in front of required modifier

When using bebopc --dir schemas\ --dart lib\schemas.dart on Windows, the generated .dart class has a @ in front of the required modifier.

Steps to reproduce the behavior:

  1. Create .bop file e.g.
struct Example{
    string id;
    string name;
    date timestamp;
}
  2. Run bebopc --dir schemas\ --dart lib\schemas.dart
  3. Go to schemas.dart and look at the Example class
class Example {
  String id;
  String name;
  DateTime timestamp;
  Example({
    @required this.id,
    @required this.name,
    @required this.timestamp,
  });

//more code
 
}

There should not be a @ in front of the required modifiers.

  • Version: 2.2.2

  • Runtime: Dart 2.12.2

  • OS: Windows

Produce warning when enum lacks a default member.

Is your feature request related to a problem? Please describe.
Because code generated by bebopc is designed to be fast, decoding an enum is essentially just an unchecked cast of an integral value to the runtime representation of the enum.

For instance, in this enum:
enum Color { Red = 1; Green = 2; Blue = 3; }

Casting 1 to Color would result in Color.Red. However, if for some reason a value such as 4 is read from the buffer, the generated code is subject to language-specific implementation details: the language may choose not to throw, instead casting 4 to Color.Red or defaulting to 0 (even if it is not defined), which leads to undefined behavior.

This may sound fine at first, until you realize that exploits like OMIGOD were caused by the combination of a simple conditional-statement coding mistake and an uninitialized auth struct, which meant any request without an Authorization header had its privileges default to uid=0, gid=0, which is root.

So if someone defined the following schema:

enum UserGroup { Root = 0; Limited = 1; Test = 2; }

readonly struct User {
    UserGroup group;
    uint32 id;
}

Calling new User() would be enough to create a new and valid instance that could be encoded and subsequently decoded by a sensitive system without producing a single runtime error. The buffer that holds the encoded enum value could also be modified (either purposefully or through corruption) to hold a value not defined in the enum, which may be cast to the enum's default value (in this case 0).

In both examples, whether by human error or improper validation of input data, a program may believe the user is root, again without raising any runtime errors.
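The failure mode and the safer alternative can be sketched in TypeScript (a hypothetical checked decode, not bebopc's actual generated code): an unknown discriminator maps to an explicit Default member instead of silently becoming a privileged value like Root = 0.

```typescript
// Enum with an explicit, harmless Default member at the integral default (0).
enum UserGroup { Default = 0, Root = 1, Limited = 2, Test = 3 }

// Checked decode: unknown discriminators fall back to Default rather than
// being cast unchecked (where corruption could silently yield Root).
function decodeUserGroup(raw: number): UserGroup {
  // Numeric TS enums have a reverse mapping, so `raw in UserGroup`
  // is true exactly for defined discriminators.
  return raw in UserGroup ? (raw as UserGroup) : UserGroup.Default;
}
```

With this shape, a corrupted buffer yields UserGroup.Default, and the dangerous value (Root) can only appear when it was genuinely encoded.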

Describe the solution you'd like
When generating code if an enum does not define a Default member a warning should be produced. Because of #156 the Default member should be the default value of the integral chosen. So for an enum with a backing type of uint32 that is 0 but for one using int32 it is -1.

It would look like this:
enum UserGroup { Default = 0; Root =1; Limited = 2; Test = 3; }

This would let people know they're defining an enum that could create undefined behavior. The warning would be silenced if a member is assigned the default value of the chosen integral type with a name such as Invalid or Unknown, to respect common patterns.

Describe alternatives you've considered

  • Making Default a reserved enum member that is automatically placed into generated code. This is an extreme option, however, and would make mapping existing enums difficult.
  • Creating an attribute that can be placed on a struct or message that would enable stricter checks when encoding / decoding an enum in the generated code on a per-type basis.

Additional context
The VS Code plugin should be able to show these warnings as well.

Improve ability to mock data with ToJson and FromJson generated methods

Is your feature request related to a problem? Please describe.

Trying to use Bebop as your primary serializer when working over HTTP / REST is very painful right now. There aren't any convenient means to easily create test data for a defined type.

With JSON you have the ability to define both values and structure, but there is no real type information. In JavaScript, a JSON object you manually create to mirror a type in a Bebop schema still can't be used to encode without first manually changing runtime types to the correct ones (which requires knowledge of those types!).

Describe the solution you'd like
Aggregate kinds such as unions, structs, and messages should generate with methods that both convert JSON to their underlying type and produce a JSON encoded version.

With these two methods a user can use any data mocking framework they want to test their implementations.

Additionally, bebopc should add two new flags, --to-json and --from-json.

These flags would allow for the compiler itself to generate and validate test data.

bebopc --files point.bop --root-type Point --from-json <<< '{"x": 2048, "y": 1080}' > encoded.bin

or

bebopc --files point.bop --root-type Point --to-json < encoded.bin > decoded.json

Describe alternatives you've considered
I wrote a standalone mocking tool which I actually got somewhat working, but it wasn't portable; it would have been easier if the generated code already had convenient methods for marshalling JSON.

Other thoughts

Possibly --json becomes a generation target that produces a valid JSON Schema, which can be used to check JSON that is crafted by humans.
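The core problem the request describes can be shown in a short sketch (hypothetical helper names; not generated by bebopc today): plain JSON.parse leaves runtime types wrong (dates come back as strings, byte arrays as plain number arrays), so a schema-aware fromJson has to fix them up before the object can be encoded.

```typescript
// A type as it might appear in a Bebop schema (date + byte[] fields).
interface Example {
  id: string;
  timestamp: Date;
  payload: Uint8Array;
}

// Hypothetical generated helper: revives runtime types that JSON cannot carry.
function exampleFromJson(json: string): Example {
  const raw = JSON.parse(json);
  return {
    id: raw.id,
    timestamp: new Date(raw.timestamp),    // ISO string -> Date
    payload: Uint8Array.from(raw.payload), // number[] -> Uint8Array
  };
}

const mocked = exampleFromJson(
  '{"id":"a1","timestamp":"2021-04-22T18:04:00Z","payload":[1,2,3]}'
);
```

With such a helper, any JSON-based mocking framework could feed test data straight into the binary encoders.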

Fix the C# Laboratory

dotnet test doesn't seem to run and I believe the tests are also outdated. Also, there should be more tests for unions, etc.

C# [bebop encode-into❗] % dotnet test
  Determining projects to restore...
  Restored /Users/lynn/code/bebop/Runtime/C#/Bebop.csproj (in 167 ms).
  Restored /Users/lynn/code/bebop/Laboratory/C#/Test/Test.csproj (in 178 ms).
  Restored /Users/lynn/code/bebop/Laboratory/C#/Benchmarks/Benchmarks.csproj (in 274 ms).
  You are using a preview version of .NET. See: https://aka.ms/dotnet-core-preview
  You are using a preview version of .NET. See: https://aka.ms/dotnet-core-preview
  You are using a preview version of .NET. See: https://aka.ms/dotnet-core-preview
  You are using a preview version of .NET. See: https://aka.ms/dotnet-core-preview
  You are using a preview version of .NET. See: https://aka.ms/dotnet-core-preview
  Bebop -> /Users/lynn/code/bebop/Runtime/C#/bin/Debug/net472/Bebop.dll
  Bebop -> /Users/lynn/code/bebop/Runtime/C#/bin/Debug/netcoreapp3.1/Bebop.dll
  You are using a preview version of .NET. See: https://aka.ms/dotnet-core-preview
  Bebop -> /Users/lynn/code/bebop/Runtime/C#/bin/Debug/net5.0/Bebop.dll
  Bebop -> /Users/lynn/code/bebop/Runtime/C#/bin/Debug/net48/Bebop.dll
  Successfully created package '/Users/lynn/code/bebop/Runtime/C#/bin/Debug/bebop.0.0.1-20210422-1804.nupkg'.

/Users/lynn/code/bebop/Laboratory/C#/Test/UnitTest1.cs(5,13): error CS0234: The type or namespace name 'Runtime' does not exist in the namespace 'Bebop' (are you missing an assembly reference?) [/Users/lynn/code/bebop/Laboratory/C#/Test/Test.csproj]
/Users/lynn/code/bebop/Laboratory/C#/GeneratedTestCode/Output.g.cs(22,18): error CS0234: The type or namespace name 'Attributes' does not exist in the namespace 'Bebop' (are you missing an assembly reference?) [/Users/lynn/code/bebop/Laboratory/C#/Test/Test.csproj]
...

Support serde in the rust implementation

Is your feature request related to a problem? Please describe.
I would really like to switch from proto to Bebop, but one thing it is missing is support for serializing to and from JSON for debugging.

Describe the solution you'd like
It would be great if Bebop added support for encoding to and decoding from JSON as well.

Rust build when source path is read-only

Is your feature request related to a problem? Please describe.
In some build scenarios, the "src" path in Rust is read-only (for instance on NixOS), so generated files should be placed in the folder pointed to by the "OUT_DIR" environment variable. This scenario is well handled by other schema encoding crates like prost_build.

Describe the solution you'd like
In order for this to happen, generated .rs files should be importable individually with something like:
include!(concat!(env!("OUT_DIR"), "/bebop.rs")) inside a mod declaration (prost provides a dedicated macro for this, which under the hood does exactly that). This is not currently possible because of the inner doc comments (the ones starting with //!) that bebopc emits at the beginning of files. Therefore, I'd like to request a flag to disable inner doc comments.

auto generated constructors

Is your feature request related to a problem? Please describe.
Not really a problem; rather an improvement for code readability.

Describe the solution you'd like
Currently bebopc does not generate constructors with properties for schemas, leading to code like this for initialization:

byte[] toSendOverWire = new NetworkMessage {
    IncomingOpCode = SessionMsg.OpCode,
    IncomingRecord = new SessionMsg { SessionId = sessionId, ClientId = clientId }.EncodeAsImmutable()
}.Encode();
// if one wants to send a message over tcp streams for example

I think a more developer friendly approach would be:

byte[] toSendOverWire = new NetworkMessage(
    SessionMsg.OpCode,
    new SessionMsg(sessionId, clientId).EncodeAsImmutable()
).Encode();

Though this may not be as self-explanatory as the object initializer, I think most developers know how their schemas are structured :)

Hope you see it the same way, and if so I might be willing to create a PR for this :D
I also have some other ideas for constructor generation, like checking the property type and, if it is also a Bebop schema, auto-encoding it.

Either way, have a nice Christmas! 🎄

Clarification on byte vs uint8

Intuitively byte == uint8; however, in Writing Bops, uint8 is not listed as a type, so it's not allowed. So far so good.

The msgpack comparison example uses the uint8 type: i.e. uint8 iNT0; // "int0": 0,.

Is uint8 a valid builtin type? Is the example outdated? Do the docs just need a clarification 'byte' or 'uint8'?

Use C# Array Pool buffer when calling BebopWriter.Create() to Encode

Is your feature request related to a problem? Please describe.
I was benchmarking Bebop recently and noticed it allocated considerably more memory than the other binary serialization protocols I was comparing it to (Protobuf and MessagePack). The results of the benchmarks are at: https://github.com/ProgrammerAl/SerializationBenchmarks#benchmark-results

Looks like this is happening because new instances of BebopWriter are created with an empty byte array, which is then grown/re-allocated as values are written to it. I propose adding a new overload of BebopWriter.Create() that accepts an array so it doesn't start empty. By default this array would come from ArrayPool<byte>.Shared.

Describe the solution you'd like
There are 2 alternatives I tried locally.

The first one, temporarily named EncodeImmutablyWithArrayPool(), rents the buffer from the pool using the final encoded size, i.e. ArrayPool<byte>.Shared.Rent(record.ByteCount). The benefit here is we know another array won't be allocated, and it allocates much less memory. In my benchmarks this was consistently 20-30 ns slower than the current code, presumably because of the call to GetByteCount().

[global::System.Runtime.CompilerServices.MethodImpl(global::Bebop.Runtime.BebopConstants.HotPath)]
public static global::System.Collections.Immutable.ImmutableArray<byte> EncodeImmutablyWithArrayPool(global::Bebop.Codegen.Library record)
{
    var arrayPool = ArrayPool<byte>.Shared.Rent(record.ByteCount);
    try
    {
        var writer = global::Bebop.Runtime.BebopWriter.Create(arrayPool);
        __EncodeInto(record, ref writer);
        return writer.ToImmutableArray();
    }
    finally
    {
        ArrayPool<byte>.Shared.Return(arrayPool, clearArray: false);
    }
}

The second example, temporarily named EncodeImmutablyWithArrayPoolFixed(), rents an array from the pool using a constant size, i.e. ArrayPool<byte>.Shared.Rent(1000). In my benchmarks this was consistently 20-30 ns faster than the current code. We can play around with what constant to use, but if we rent a buffer that is too small, we'll be back to our problem of allocating another array. Either way, this will still allocate much less memory than the current code and will run faster.

[global::System.Runtime.CompilerServices.MethodImpl(global::Bebop.Runtime.BebopConstants.HotPath)]
public static global::System.Collections.Immutable.ImmutableArray<byte> EncodeImmutablyWithArrayPoolFixed(global::Bebop.Codegen.Library record)
{
    var arrayPool = ArrayPool<byte>.Shared.Rent(1000);
    try
    {
        var writer = global::Bebop.Runtime.BebopWriter.Create(arrayPool);
        __EncodeInto(record, ref writer);
        return writer.ToImmutableArray();
    }
    finally
    {
        ArrayPool<byte>.Shared.Return(arrayPool, clearArray: false);
    }
}

Additional context
Below are the benchmarks for the above mentioned methods:

Method Runtime Mean Error StdDev Min Max Gen 0 Gen 1 Gen 2 Allocated
EncodeOriginal .NET 4.7.2 468.3 ns 1.39 ns 1.30 ns 466.3 ns 469.9 ns 0.0811 - - 514 B
EncodeWithArrayPool .NET 4.7.2 518.0 ns 3.60 ns 3.19 ns 513.2 ns 524.3 ns 0.0172 - - 112 B
EncodeWithArrayPoolFixed .NET 4.7.2 427.3 ns 1.27 ns 1.19 ns 425.1 ns 428.9 ns 0.0176 - - 112 B
EncodeOriginal .NET Core 5.0 206.6 ns 0.50 ns 0.47 ns 205.8 ns 207.3 ns 0.0610 - - 512 B
EncodeWithArrayPool .NET Core 5.0 236.2 ns 0.51 ns 0.47 ns 235.2 ns 236.9 ns 0.0134 - - 112 B
EncodeWithArrayPoolFixed .NET Core 5.0 183.3 ns 0.42 ns 0.40 ns 182.4 ns 183.7 ns 0.0134 - - 112 B

Here's the benchmark code:

    [MinColumn]
    [MaxColumn]
    [MemoryDiagnoser]
    [SimpleJob(RuntimeMoniker.Net472)]
    [SimpleJob(RuntimeMoniker.NetCoreApp50)]
    public class ObjectReadWrite
    {
        private readonly Library _library;

        public ObjectReadWrite()
        {
            var testGuid = Guid.Parse("81c6987b-48b7-495f-ad01-ec20cc5f5be1");
            var song = new Song
            {
                Title = "Donna Lee",
                Year = 1974,
                Performers = new Musician[]
                {
                    new Musician {Name = "Charlie Parker", Plays = Instrument.Sax},
                    new Musician {Name = "Miles Davis", Plays = Instrument.Trumpet}
                }
            };

            _library = new Library { Songs = new Dictionary<Guid, Song> { { testGuid, song } } };
        }

        [Benchmark]
        public ImmutableArray<byte> EncodeOriginal()
            => Library.EncodeImmutably(_library);

        [Benchmark]
        public ImmutableArray<byte> EncodeWithArrayPool()
            => Library.EncodeImmutablyWithArrayPool(_library);

        [Benchmark]
        public ImmutableArray<byte> EncodeWithArrayPoolFixed()
            => Library.EncodeImmutablyWithArrayPoolFixed(_library);
    }

If this is a feature you want added, I can volunteer to take it on.

Another alternative I did not try is to rent a buffer from the array pool of size MaxByteCount for that entity. Currently that property re-calculates the value each time it's called, but it's essentially a constant. We could change it to a constant emitted by the compiler and use it when renting a buffer from the pool. This would take more effort and could be done at a later date.
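The pooling idea above is language-agnostic. A hedged sketch in TypeScript (illustrative of the technique, not the C# runtime's implementation): rent a scratch buffer, encode into it, copy out only the bytes written, and return the scratch to the pool, so the only fresh allocation per call is the final result.

```typescript
// A tiny buffer pool: the analogue of ArrayPool<byte>.Shared.Rent/Return.
const pool: Uint8Array[] = [];

function rent(min: number): Uint8Array {
  const buf = pool.pop();
  return buf && buf.length >= min ? buf : new Uint8Array(Math.max(min, 1024));
}

function giveBack(buf: Uint8Array): void {
  pool.push(buf);
}

// Encode into a rented scratch buffer; `write` fills it and reports the
// number of bytes written. Only the trimmed copy is a fresh allocation.
function encodeWithPool(write: (buf: Uint8Array) => number): Uint8Array {
  const scratch = rent(1024);
  try {
    const written = write(scratch);
    return scratch.slice(0, written);
  } finally {
    giveBack(scratch); // scratch is reused by the next call
  }
}
```

As with the fixed-size Rent(1000) variant in the issue, a too-small scratch buffer would still force a reallocation, so the rent size is a tuning knob.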

RPC Proposal

Goals

  1. Minimize boilerplate
    • Calling across RPC should be almost the same as calling local functions
    • If something will always be the same for every implementation, we should provide it; if it will usually be the same, we should consider including it as a separate package
  2. Be wicked fast
    • minimize round trips
    • keep packet overhead small
    • use bebop for fast serialization 🚀
  3. KISS

Basic Features

Inspirations

  • JSON-RPC - Dead simple stateless RPC
  • gRPC - Feature rich, widely adopted, and moderately performant
  • Cap'n Proto - Cool, but they even have to describe levels of features because of how complicated it is
  • GraphQL - A way to make HTTP calls slower

Paradigm Decisions

(Bold items we plan to implement, non-bold items we considered).

  • Stateful OR stateless
  • Abstract transport protocol OR specific transport protocol - If we don't assume a given transport protocol we need to decide what features it should bring to the table (e.g. lossy/guaranteed, multichannel/singlechannel, encryption, compression, heartbeats, ...)
    • Reliable socket like transport
    • Encrypted
    • Not Ordered but we don't re-order for the user
    • Not compressed (we might want to support compression of messages)
    • Heartbeats/we know when disconnects happen
  • Static API OR Object oriented - Cap'n proto has a way of returning an "object" which can then be used for future method calls and they use it as a method of scoping as well (e.g. all auth requiring functions can only be reached after authenticating which returns an object). (Object in quotes because it could just be an ID value).

Possible Features:

(Bold items we plan to implement, non-bold items we considered).

  1. Function calls - basic takes parameters and has return
  2. Streamed return - takes parameters and then allows for an async stream of response objects
    • Second priority
    • Might not be useful at all since main RPC channel can coordinate separate sockets being connected for subscriptions
  3. Streamed args - one or more arguments are streamed and when done a single response is sent
  4. Call cancellation - e.g. being able to stop waiting for a response if it is taking too long and to notify the remote party
    • Required with stream responses
    • Maybe a nice to have for function calls?
  5. Server side call pipelining - Cap'n Proto's claim to fame, which allows the caller to describe a series of RPC calls without needing each response to return to them before the next request can be made
  6. Request batching - multiple RPCs in one operation
  7. Signature checking - ensuring that the caller and callee do not disagree on the protocol specs (signatures generated by types only, not names)
  8. Composable signatures - allows checking if a signature is compatible using XORs of sub-components in messages and unions.
  9. Version negotiation - attempting to run even if versions are different based on rules regarding compatibility
  10. Deadlines - allow the caller to specify a max amount time after which it will no longer care about a response
  11. P2P - allow both parties to make calls on the same transport channel

Schema

RPC introduces a new keyword service which allows defining a set of functions. Services have function definitions which express what data is expected and returned. They also define the opcode to function name mapping (names are used in the code, opcodes in the serialized form).

Restrictions:

  • The 0 opcode is reserved for string serviceName()
  • The 0xfff0-0xffff opcodes are reserved for future use
  • Opcode is a uint16
  • A channel endpoint can only support one service, but multiple services can be defined in Bebop. One channel can therefore host two services A←→B, but each side can only call the one the opposing side is responding to.
  • Functions may only return one value
  • Function names must be unique

Definitions:

  • Functions have an opcode, name, arguments, and a return value. Both the arguments and the return type automatically define structs. Generated code should strive to elide the argument/return wrapper structs as appropriate for the language.
  • Argument structs are automatically defined (see protocol)
  • Return structs are automatically defined (see protocol)
  • void is a special return type which indicates no bytes and declares an empty struct.
  • Functions which accept no arguments declare an empty struct.
// User types (here for example of generating signatures)
struct User {
  guid id;
  date joined;
  string first;
  string last;
}

message DoTheThing {
  1 -> uint32[] items;
  2 -> string thing;
}

union ThingsDone {
  1 -> struct { byte thing; }
  2 -> message { 4 -> uint64 other; }
}

// RPC definitions
service HelloService {
  /*
    Authenticate with the server. Stateful connection so no need to pass a token
    around.
  */
  1 -> void authenticate(string username, string password);
  // 2 perhaps was an old function that is no longer supported.
  /* Retrieve information about the current user. Requires being authenticated. */
  3 -> User getUserDetails();
  4 -> ThingsDone doTheThing(DoTheThing myThing, uint32 limit, string msg);
}

Protocol

The language-agnostic components which can be written in bebop.

Static Structures

These do not change based on the user schema and should be included in bebopc as a text string which can be generated on demand for the needed languages.

Headers

/* Static RPC request header used for all request datagrams. */
readonly struct RpcRequestHeader {
  /*
    Identification for the caller to identify responses to this request.
    
    The caller should ensure ids are always unique at the time of calling. Two active
    calls with the same id is undefined behavior. Re-using an id that is not currently
    in-flight is acceptable.

    These are unique per connection.
  */
  uint16 id;

  /*
    How long in seconds this request is allowed to be processed for. A value of 0
    indicates no limit.
    
    This allows for potentially long queries to be cancelled if the requester will no
    longer be interested and more importantly allows unreliable transports to establish
    an agreed upon point at which the requester is going to assume the packet was lost
    even if it just had yet to be sent.
    
    By using a max-time to compute rather than an expiration time, we reduce the risk
    of different system times causing confusion. Though there will be some overlap
    where a response may be sent and ignored as the requester already considers it
    to be expired. 
  */
  uint16 timeout;

  /*
    Function signature includes information about the args to ensure the caller and
    callee are referencing precisely the same thing. There is a non-zero risk of
    accidental signature collisions, but 32-bits is probably sufficient for peace of
    mind.
    
    I did some math, about a 26% chance of collision using 16-bits assuming 200 unique
    RPC calls which is pretty high, or <0.0005% chance with 32-bits.
  */
  uint32 signature;
}
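The collision figures quoted in the comment can be reproduced with the standard birthday-bound approximation, p ≈ 1 − exp(−n(n−1)/2d) for n signatures drawn from a space of d values. A quick check (illustrative helper, not part of the runtime):

```rust
// Approximate probability that n randomly assigned values collide when
// drawn from a space of 2^bits possibilities (birthday bound).
fn collision_probability(n: f64, bits: u32) -> f64 {
    let d = 2f64.powi(bits as i32);
    1.0 - (-n * (n - 1.0) / (2.0 * d)).exp()
}

fn main() {
    // 200 unique RPC calls, 16-bit signatures: roughly a 26% chance.
    let p16 = collision_probability(200.0, 16);
    // Same calls, 32-bit signatures: well under 0.0005%.
    let p32 = collision_probability(200.0, 32);
    println!("16-bit: {:.1}%, 32-bit: {:.6}%", p16 * 100.0, p32 * 100.0);
    assert!(p16 > 0.25 && p16 < 0.27);
    assert!(p32 < 0.000005);
}
```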
/* Static RPC response header used for all response datagrams. */
readonly struct RpcResponseHeader {
  /* The caller-assigned identifier */
  uint16 id;
}

Null Service

This should be hardcoded and not be generated from static bebop, see Implementation.

Datagram

/*
  All data sent over the transport MUST be represented by this union.
  
  Note that data is sent as binary arrays to avoid depending on the generated structure
  definitions that we cannot know in this context. Ultimately the service will be
  responsible for determining how to interpret the data.
*/
union RpcDatagram {
  1 -> struct RpcRequestDatagram {
    RpcRequestHeader header;
    /* The function that is to be called. */
    uint16 opcode;
    /* Callee can decode this given the opcode in the header. */
    byte[] request;
  }
  2 -> struct RpcResponseOk {
    RpcResponseHeader header;
    /* Caller can decode this given the id in the header. */
    byte[] data;
  }
  3 -> struct RpcResponseErr {
    RpcResponseHeader header;
    /*
      User is responsible for defining what code values mean. These codes denote
      errors that happen only once user code is being executed and are specific
      to each domain.
    */
    uint32 code;
    /* An empty string is acceptable */
    string info;
  }
  /* Default response if no handler was registered. */
  0xfc -> struct CallNotSupported {
    RpcResponseHeader header;
  }
  /* Function id was unknown. */
  0xfd -> struct RpcResponseUnknownCall {
    RpcResponseHeader header;
  }
  /* The remote function signature did not agree with the expected signature. */
  0xfe -> struct RpcResponseInvalidSignature {
    RpcResponseHeader header;
    /* The remote function signature */
    uint32 signature;
  }
  /*
    A message received by the other end was unintelligible. This indicates a
    fundamental flaw with our encoding and possible bebop version mismatch.

    This should never occur between proper implementations of the same version.
  */
  0xff -> struct DecodeError {
    /* Information provided on a best-effort basis. */
    RpcResponseHeader header;
    string info;
  }
}
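A receiving end dispatches on the union branch: request branches go to the local service, response branches resolve a pending call by id. A minimal sketch using an illustrative Rust mirror of the union (the real type is generated; names and fields are simplified here):

```rust
// Simplified, hand-written mirror of RpcDatagram for illustration only.
#[allow(dead_code)]
enum RpcDatagram {
    RpcRequestDatagram { id: u16, opcode: u16, request: Vec<u8> },
    RpcResponseOk { id: u16, data: Vec<u8> },
    RpcResponseErr { id: u16, code: u32, info: String },
    RpcResponseUnknownCall { id: u16 },
}

// Requests are routed to the local service; all response branches are
// routed to whatever is tracking in-flight calls.
fn route(d: &RpcDatagram) -> &'static str {
    match d {
        RpcDatagram::RpcRequestDatagram { .. } => "local_service",
        RpcDatagram::RpcResponseOk { .. }
        | RpcDatagram::RpcResponseErr { .. }
        | RpcDatagram::RpcResponseUnknownCall { .. } => "pending_calls",
    }
}

fn main() {
    let req = RpcDatagram::RpcRequestDatagram { id: 1, opcode: 4, request: vec![] };
    assert_eq!(route(&req), "local_service");
    let ok = RpcDatagram::RpcResponseOk { id: 1, data: vec![] };
    assert_eq!(route(&ok), "pending_calls");
}
```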

Dynamic Structures

These are not actually written in bebop syntax anywhere; however, the AST structure would be generated and used to produce the language-appropriate structures. For readability they are presented here as bebop.

It is preferable to avoid using a union here because it would either create another layer of size and opcode on top of what is already present, or create a dependency between the static definition and the definitions for each and every service, leading to significantly more code generation.

Return structs don't strictly add a ton of value, but they make the code more symmetric and a little easier for the generator to emit. We may opt to remove them later. The same goes for argument structs with zero or one fields.

// md5("(string,string)(void)") = b51ddc223579c70014a1e23c98329b08
const uint32 _HelloServiceAuthenticateSignature = 0xb51ddc22;
// md5("(void)((guid,date,string,string))") = cc627ee5abfcca0211acdf9716a70854
const uint32 _HelloServiceGetUserDetailsSignature = 0xcc627ee5;
// md5("(message{1:uint32[],2:string},uint32,string)(union{1:(byte),2:message{4:uint64}},uint64)") = 0836646b276d1768e0924c99dcdaca78
const uint32 _HelloServiceDoTheThingSignature = 0x0836646b;

struct _HelloServiceAuthenticateArgs {
  string username;
  string password;
}

struct _HelloServiceAuthenticateReturn {}

struct _HelloServiceGetUserDetailsArgs {}

struct _HelloServiceGetUserDetailsReturn {
  User value;
}

struct _HelloServiceDoTheThingArgs {
  DoTheThing myThing;
  uint32 limit;
  string msg;
}

struct _HelloServiceDoTheThingReturn {
  ThingsDone value;
}

Standard Components

These can be hardcoded since they don't change, or have special code written to make them semi-dynamic; they are the same for all services. The signature must be identical across services to allow cross-service name checks. This is the only function which should be guaranteed to work no matter what service is on the other end.

// md5("(void)(string)") = 1bf832690ab97e3599a46d2b08739140
const uint32 _HelloServiceNameSignature = 0x1bf83269;

// optionally could leave this struct out since it is a special case anyway
struct _HelloServiceNameArgs {}

struct _HelloServiceNameReturn {
  string serviceName;
}

Signatures

Signatures exist to ensure the binary data sent between the peers is interpretable. They should therefore not include, if at all possible, anything which is changeable without altering the binary representation.

We already ensure the sizes line up with what is expected; this is a good check for ensuring we don't end up reading invalid memory, but it is not enough to catch many possible errors. Using signatures means we can throw a signature error rather than a generic decode error. They are designed to catch human mistakes and prevent possible data corruption in a database. Signatures are not needed to ensure the protocol itself remains safe.

There are cases where we would have been able to correctly read the data even if the signature does not match such as if a new field is added to a union or a message since bebop is required to be forward compatible for both. There is a solution to this using composable hashes, but it is a feature we will need to revisit later if we decide it is needed.

The example strings above are probably not what we will end up generating, but they give an idea of what needs to be captured and one way it could be done. We may also want to include these raw signature strings as constants so they can be included in error messages.
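The design takes the first 32 bits of an MD5 digest of the canonical signature string. MD5 requires an external crate, so the sketch below substitutes an inline FNV-1a (32-bit) hash purely to show the shape: both peers hash the same canonical string and compare the truncated result.

```rust
// Stand-in for "first 32 bits of md5(signature string)"; FNV-1a is used
// here only because it fits in a few lines of std-only code.
fn signature_hash(sig: &str) -> u32 {
    let mut h: u32 = 0x811c_9dc5; // FNV-1a offset basis
    for b in sig.bytes() {
        h ^= b as u32;
        h = h.wrapping_mul(0x0100_0193); // FNV-1a prime
    }
    h
}

fn main() {
    // Equal canonical strings yield equal constants on both peers; a
    // mismatch triggers RpcResponseInvalidSignature rather than a decode error.
    let a = signature_hash("(string,string)(void)");
    assert_eq!(a, signature_hash("(string,string)(void)"));
    assert_ne!(a, signature_hash("(void)(string)"));
}
```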

Implementation

The language dependent components which connect with the generated code and make it function.

The following pseudocode is written in Rust as it is the most expressive of the languages we are using. The real implementation will be somewhat different as things come up. This is a template which hopes to set forth the logical structures and to pin down naming choices.

The bebop structs mentioned in the protocol section are assumed to be generated and will not be re-listed here.

Static Runtime

Building Blocks

/// Transport protocol has a few main responsibilities:
/// 1. interpreting the raw stream as datagrams
/// 2. automatically reconnecting and dealing with network issues
/// 3. deciding how it wants to handle recv futures
trait TransportProtocol {
  fn set_handler(&mut self, recv: fn(datagram: RpcDatagram) -> Future<Output=()>);
  async fn send(&self, datagram: &RpcDatagram) -> Result<(), TransportError>;

  async fn send_decode_error_response(&self, call_id: u16, info: Option<&str>) -> TransportResult {
    // ...
  }
}

/// The local end of the pipe handles messages.
/// Implementations are generated from bebop service definitions.
trait ServiceHandlers {
  /// Use opcode to determine which function to call, whether the signature matches,
  /// how to read the buffer, and then convert the returned values and send them as a
  /// response
  async fn _recv_call(&self, opcode: u16, sig: u32, call_id: u16, buf: &[u8]);
}

/// Wrappers around the process of calling remote functions.
/// Implementations are generated from bebop service definitions.
trait ServiceRequests {
  const NAME: &'static str;
}

Router

/// This is the main structure which represents information about both ends of the
/// connection and maintains the needed state to make and receive calls. This
/// is the only struct of which an instance should need to be maintained by the user.
struct Router<P: TransportProtocol, L: ServiceHandlers, R: ServiceRequests> {
  /// Underlying transport
  transport: P,
  /// Local service handles requests from the remote.
  local_service: L,
  /// Remote service accepts requests from us, so this also provides the callable RPC
  /// functions.
  remote_service: R,
}

/// Allows passthrough of function calls to the remote
impl Deref for Router {
    fn deref(&self) -> &R {}
}

impl Router {
  fn new(...) -> Self { ... }
  
  /// Receives a datagram and routes it
  async fn _recv(&self, datagram: RpcDatagram) {
    self.local_service._recv_call(...).await
  }

  /// Send a request
  async fn _send_request(&self, call_id: u16, buf: &[u8]) -> TransportResult {}

  /// Send a response to a call
  async fn _send_response(&self, call_id: u16, data: &[u8]) -> TransportResult {}

  async fn _send_error_response(&self, call_id: u16, code: u32, msg: Option<&str>) -> TransportResult {}
  async fn _send_unknown_call_response(&self, call_id: u16) -> TransportResult {}
  async fn _send_invalid_sig_response(&self, call_id: u16, expected_sig: u32) -> TransportResult {}
  async fn _send_call_not_supported_response(&self, call_id: u16) -> TransportResult {}
  async fn _send_decode_error_response(&self, call_id: u16, info: Option<&str>) -> TransportResult {
    self.transport.send_decode_error_response(...).await
  }
}
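The header comments require call ids to be unique among in-flight calls, with re-use allowed once a call completes. A minimal sketch of how the router might allocate them (names are illustrative, not part of the spec):

```rust
use std::collections::HashSet;

// Per-connection call-id allocation: unique while in flight, re-usable after.
struct CallIds {
    next: u16,
    in_flight: HashSet<u16>,
}

impl CallIds {
    fn new() -> Self {
        CallIds { next: 1, in_flight: HashSet::new() }
    }

    /// Allocate the next id not currently in flight, skipping 0 so it can
    /// stay reserved as a sentinel if desired.
    fn alloc(&mut self) -> u16 {
        loop {
            let id = self.next;
            self.next = self.next.wrapping_add(1);
            if id != 0 && self.in_flight.insert(id) {
                return id;
            }
        }
    }

    /// Called when the response arrives or the call times out.
    fn release(&mut self, id: u16) {
        self.in_flight.remove(&id);
    }
}

fn main() {
    let mut ids = CallIds::new();
    let a = ids.alloc();
    let b = ids.alloc();
    assert_ne!(a, b);
    ids.release(a);
}
```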

Error Handling

enum TransportError {
  // Things that could go wrong with the underlying transport, need it to be
  // somewhat generic. Things like the internet connection dying would fall
  // under this.
}

type TransportResult = Result<(), TransportError>;

/// Errors that the local may return when responding to a request.
enum LocalRpcError {
  CustomError(u32, String),
  CustomErrorStatic(u32, &'static str),
  NotSupported,
}

/// Response type that is returned locally and will be sent to the remote.
type LocalRpcResponse<T> = Result<T, LocalRpcError>;

/// Errors that can be received when making a request of the remote.
enum RemoteRpcError {
  TransportError(TransportError),
  CustomError(u32, Option<String>),
  NotSupported,
  UnknownCall,
  /// When the received datagram has a union branch we don't know about.
  UnknownResponse,
  InvalidSignature(u32),
  CallNotSupported,
  DecodeError(Option<String>)
}

/// A response on the channel from the remote.
type RemoteRpcResponse<T> = Result<T, RemoteRpcError>;
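A sketch of how the router might map a handler's `LocalRpcError` onto the wire-level response branch (branch names follow the RpcDatagram union in the Protocol section; the mapping itself is an assumption for illustration):

```rust
#[allow(dead_code)]
enum LocalRpcError {
    CustomError(u32, String),
    CustomErrorStatic(u32, &'static str),
    NotSupported,
}

// Which RpcDatagram branch a local error would be sent back as.
fn response_branch(err: &LocalRpcError) -> &'static str {
    match err {
        // User-defined codes and info travel in RpcResponseErr.
        LocalRpcError::CustomError(..) | LocalRpcError::CustomErrorStatic(..) => "RpcResponseErr",
        // No handler registered for this opcode.
        LocalRpcError::NotSupported => "CallNotSupported",
    }
}

fn main() {
    assert_eq!(response_branch(&LocalRpcError::NotSupported), "CallNotSupported");
    let e = LocalRpcError::CustomError(7, "boom".into());
    assert_eq!(response_branch(&e), "RpcResponseErr");
}
```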

Implementations for Static Parts

These may be worth hardcoding rather than trying to implement by generating from static bebop.

/// A service used when one end of the channel does not offer any callable endpoints.
/// You can also use a NullService in place of any other remote service: it masks the
/// service, making it impossible to call, but without causing any errors.
struct NullService;

impl ServiceHandlers for NullService {
  async fn service_name(&self) -> LocalRpcResponse<&str> { Ok("NullService") }
}

impl ServiceRequests for NullService {
  async fn service_name(&self) -> RemoteRpcResponse<String> { Ok("NullService".into()) }
}

Generated Code

This generated code is separate from what bebopc currently produces and cannot be made simply by leveraging the AST. Its implementation will reference types that were generated more classically and are described in the Protocol section.

Service Definitions

/// The local handlers for the service
trait HelloServiceHandlers {
  async fn service_name(&self) -> LocalRpcResponse<&str> { Ok("HelloService") }
  async fn authenticate(&self, username: &str, password: &str) -> LocalRpcResponse<()> { Err(NotSupported) }
  async fn get_user_details(&self) -> LocalRpcResponse<User> { Err(NotSupported) }
  async fn do_the_thing(&self, arg1: DoTheThing, arg2: u32, arg3: &str) -> LocalRpcResponse<(ThingsDone, u64)> { Err(NotSupported) }
}

impl ServiceHandlers for HelloServiceHandlers {
  const NAME: &'static str = "HelloService";

  async fn _recv_call(&self, opcode: u16, sig: u32, call_id: u16, buf: &[u8]) {
    /* generated routing and stuff */
  }
}

/// Wrapper around the remote functions we can call.
struct HelloServiceRequests;
impl HelloServiceRequests {
  async fn service_name(&self) -> RemoteRpcResponse<String> {}
  async fn authenticate(&self, username: &str, password: &str) -> RemoteRpcResponse<()> {}
  async fn get_user_details(&self) -> RemoteRpcResponse<User> {}
  async fn do_the_thing(&self, myThing: DoTheThing, limit: u32, msg: &str) -> RemoteRpcResponse<ThingsDone> {}
}

impl ServiceRequests for HelloServiceRequests {
  const NAME: &'static str = "HelloService";
}

User Code

  1. Define the local service implementation
  2. Define/import transport implementation
  3. Create tokio runtime
  4. Create a Router instance + transport + local service
  5. Begin making calls
/// This is the user's service and it may contain state. They are then able to implement
/// all of the handlers however they want.
struct HelloService;
impl HelloServiceHandlers for HelloService {
  // ..
}

struct WebsocketTransport {
  // magic for now
}

impl TransportProtocol for WebsocketTransport {
  // ...
}

#[tokio::main]
async fn main() {
  // make a router for the "client" which can call HelloService but does not accept any
  // calls from the remote endpoint.
  let router = Router::new(WebsocketTransport::new(), NullService, HelloService);
  
  // deref is implemented, so we can simply call its functions 
  assert_eq!(router.service_name().await.unwrap(), "HelloService");
  
  router.authenticate("someperson", "somepassword").await.unwrap();
  println!("{:?}", router.get_user_details().await.unwrap());
  let (done, time) = router.do_the_thing(DoTheThing {}, 1234, "blah").await.unwrap();
  // ...
}

Watch mode for bebopc

Is your feature request related to a problem? Please describe.
Right now there's no way to have the bebop compiler watch your input directories for changes and automatically recompile your schemas. Most compilers of this sort have such a feature, and it is integral to many workflows.

Describe the solution you'd like
Watch mode for bebopc, in which it watches the input directories specified by your bebop.json file and automatically recompiles the schemas whenever they change. Basically just an equivalent to tsc's implementation: bebopc --watch.

Describe alternatives you've considered
The VSCode plugin will support automatic compilation itself Soon(TM), but not everyone is going to want to use that. A generic watch mode is the very most minimal and flexible implementation of this feature.
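The core of a watch mode is deciding which inputs changed between looks at the filesystem. A minimal std-only sketch comparing two mtime snapshots (a real `bebopc --watch` would likely use OS file-change notifications rather than polling; all names here are illustrative):

```rust
use std::collections::HashMap;
use std::time::SystemTime;

// Given the previous and current mtime snapshots of the input directories,
// report which schema files are new or modified and need recompiling.
fn changed_files(
    old: &HashMap<String, SystemTime>,
    new: &HashMap<String, SystemTime>,
) -> Vec<String> {
    new.iter()
        .filter(|(path, mtime)| old.get(path.as_str()) != Some(mtime))
        .map(|(path, _)| path.clone())
        .collect()
}

fn main() {
    let t0 = SystemTime::UNIX_EPOCH;
    let t1 = t0 + std::time::Duration::from_secs(1);
    let old = HashMap::from([("a.bop".to_string(), t0)]);
    let new = HashMap::from([("a.bop".to_string(), t1), ("b.bop".to_string(), t0)]);
    let mut changed = changed_files(&old, &new);
    changed.sort();
    // a.bop was modified, b.bop is new.
    assert_eq!(changed, vec!["a.bop".to_string(), "b.bop".to_string()]);
}
```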

Clarification on readonly messages

The bebop vscode extension reports The 'readonly' modifier cannot be applied to 'MyMessage' as it is not a struct when given readonly message MyMessage.

Assuming this is intentional, what about messages disables them from being generated in a way that they are immutable to the consumer?

Python Support

Is your feature request related to a problem? Please describe.
A lot of people use Python, and this would be really handy for things like online games.

Describe the solution you'd like
It would be nice if Bebop supported Python.

Additional context
N/A

Owned records for Rust

Is your feature request related to a problem? Please describe.

  1. It is very redundant to define nearly duplicate message structures in our code which own the data.
  2. For RPC it would be nice if we are able to have datagrams that did not rely on the lifetime of ephemeral data

Describe the solution you'd like
Generate nearly duplicate definitions that use owned types instead of borrowed types and eliminate the lifetimes. Implement From<Borrowed> for Owned and Record for Into<T> where T: Record.

Describe alternatives you've considered
We are living it right now. It is somewhat annoying.

Additional context
We will need to carefully manage what types are used based on ownership expectations. This will probably be the hardest part.
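The borrowed/owned pairing the issue proposes can be sketched with a toy record (the real types would be generated by bebopc; `Person*` names here are hypothetical):

```rust
// Borrowed form: zero-copy view into a decode buffer, carries a lifetime.
struct PersonBorrowed<'a> {
    name: &'a str,
    age: u32,
}

// Owned form: no lifetime parameter, can outlive ephemeral data.
struct PersonOwned {
    name: String,
    age: u32,
}

// Owned from borrowed: clones the data so the record owns it.
impl From<PersonBorrowed<'_>> for PersonOwned {
    fn from(b: PersonBorrowed<'_>) -> Self {
        PersonOwned { name: b.name.to_owned(), age: b.age }
    }
}

// Borrowed view of owned: cheap, no allocation.
impl<'a> From<&'a PersonOwned> for PersonBorrowed<'a> {
    fn from(o: &'a PersonOwned) -> Self {
        PersonBorrowed { name: &o.name, age: o.age }
    }
}

fn main() {
    let owned: PersonOwned = PersonBorrowed { name: "Spike", age: 27 }.into();
    // `owned` can be stored in a datagram queue or moved across threads.
    let view: PersonBorrowed = (&owned).into();
    assert_eq!(view.name, "Spike");
    assert_eq!(view.age, 27);
}
```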

Can multiple schema files compose a "schema project" via a config file (like an empty C# assembly), so types can be checked safely?


Go implementation

In case anyone is interested, I am currently working on adding Go support. Though fair warning: I am doing this for fun over my staycation and I currently don't have any plans to use it in any sort of production code.

I am uploading my changes as I go here: cwize1/Go.

It is still a work in progress. But I have made enough progress on it that I feel confident that I'll be able to have it mostly completed by the end of the year.

Iterable serialization for Rust

Is your feature request related to a problem? Please describe.
Sometimes we want to serialize into a Write or AsyncWrite type, and doing so without allocating a buffer that holds all the data in serialized form first requires a way to serialize fields one at a time in a depth-first manner.

Describe the solution you'd like
A new function on Record and SubRecord which returns an Iterator<Item=usize> (saying how many bytes were written on each go) and takes a mutable reference to a [u8; 1024] or similar buffer which can be used to serialize some data into on each go.

This could be accomplished with unsafe code (or possibly some clever safe code I can't think of at the moment). Alternatively, we could pass the buffer mutably on the .next(&mut buf) call, but then it no longer conforms to std::iter::Iterator.

We could also use a wrapping struct that is a true Iterator type and only goes 1 byte at a time, with an inner non-iterator type that gets the &mut buf on each call, which only happens when the outer type has exhausted what was written last time.

Describe alternatives you've considered
Write all data into a buffer to then pass to Write types as they can accept them. This requires an extra alloc and is non-ideal.

Additional context
This is useful for RPC and networking in general and would allow us to implement Read for Record types.
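The `&mut buf` per-call variant described above can be sketched as follows. The byte slice stands in for a record's serialized stream; a real implementation would emit fields depth-first instead (names are illustrative):

```rust
// Chunked serialization: the caller hands the same scratch buffer to each
// call and flushes however many bytes were produced, avoiding one large
// allocation up front.
struct ChunkedWriter<'a> {
    remaining: &'a [u8], // stand-in for the record's serialized stream
}

impl<'a> ChunkedWriter<'a> {
    /// Fill `buf` with the next chunk; returns bytes written, or None when
    /// done. Taking `&mut buf` per call is the non-`Iterator` variant the
    /// issue describes (a true `Iterator` cannot borrow a caller buffer
    /// anew for each item).
    fn next_chunk(&mut self, buf: &mut [u8]) -> Option<usize> {
        if self.remaining.is_empty() {
            return None;
        }
        let n = self.remaining.len().min(buf.len());
        buf[..n].copy_from_slice(&self.remaining[..n]);
        self.remaining = &self.remaining[n..];
        Some(n)
    }
}

fn main() {
    let data = [1u8, 2, 3, 4, 5];
    let mut w = ChunkedWriter { remaining: &data };
    let mut buf = [0u8; 2];
    let mut out = Vec::new();
    while let Some(n) = w.next_chunk(&mut buf) {
        out.extend_from_slice(&buf[..n]); // e.g. hand to a `Write` sink
    }
    assert_eq!(out, data);
}
```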

Potential issue with unaligned memory access

Hi,

I was looking at the recent C++ implementation, and saw that there was a specific little endian hot-path added, and because I'd recently run into an issue on ARM platforms (specifically on Android) when adding this to another project, I took a quick look, and it looks like bebop might also have the issue.

The problem has to do with CPUs that don't support unaligned memory access (x86 are fine with this, but it can cause segfaults on certain ARMs), which can happen when you cast and dereference from a smaller type to a larger type, ex:
https://github.com/RainwayApp/bebop/blob/eb71010763bb25af1581637fa7bb59b2b3d1f1fd/Runtime/C%2B%2B/src/bebop.hpp#L211

The way I ended up solving it was to memcpy into a temporary variable of the resulting type, and then have GCC use its knowledge of the target CPU to determine if it was safe to just convert this to a cast + dereference, or if it needed to do the appropriate bit-twiddling.

This was a good explanation of what was going on, https://blog.quarkslab.com/unaligned-accesses-in-cc-what-why-and-solutions-to-do-it-properly.html

And you can see it in action on godbolt https://godbolt.org/z/hjWoW4 (if you change the compiler targets you can see when the memcpy gets optimized out).

Unfortunately I don't have any device that actually might exhibit this issue to test it out on, so I'm not 100% sure it will cause issues, but because I ran into this myself recently, I just wanted to give a heads up.
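The memcpy-into-a-temporary idiom the reporter describes is, in Rust terms, exactly what `from_le_bytes` does: copy into an aligned temporary and let the compiler lower it to a plain load on targets that permit unaligned access. Casting a misaligned byte pointer to `*const u32` and dereferencing would be undefined behavior and can fault on some ARM cores. (Rust sketch of the idiom; the bebop C++ runtime in question would use `std::memcpy` the same way.)

```rust
// Safe unaligned little-endian read: the 4 bytes are copied into an
// aligned temporary array, then reinterpreted.
fn read_u32_le(buf: &[u8], offset: usize) -> u32 {
    let bytes: [u8; 4] = buf[offset..offset + 4].try_into().unwrap();
    u32::from_le_bytes(bytes)
}

fn main() {
    let buf = [0xffu8, 0x78, 0x56, 0x34, 0x12];
    // Deliberately misaligned offset: fine, no pointer cast involved.
    assert_eq!(read_u32_le(&buf, 1), 0x1234_5678);
}
```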

Browser Support

Is your feature request related to a problem? Please describe.
Hi, is this supported in browsers or is it simply for NodeJs.

Thanks

bebop.json conflicts with commandline arguments

When arguments are fed into the bebopc CLI, they should take priority over the existence of bebop.json. When the compiler chooses to use bebop.json, it is an opaque operation, which leads to errors on the CLI.

The bug is in the flag-parsing logic below:

https://github.com/RainwayApp/bebop/blob/34e6e71b63d61a8e32455ce071141b7d89b63a13/Compiler/CommandLineFlags.cs#L433-L450

bebop.json is loaded, parsed, and used regardless of the existence of other flags. The correct behavior is that bebop.json is only used when there are no other flags provided or when --config is specified.

[question] How to perform type conversions while populating

Hi! Thanks for all your work on bebop, it's a really exciting new framework.

I'm working on integrating bebop into the rust serialization benchmark. The code is structured like this:

  • Data starts as native types (think user-defined structs).
  • Then some frameworks like prost (protobuf) may convert the native types to their generated types (this is usually called the "populate" step).
  • The data (native or populated) is then serialized.
  • The process runs backwards for deserializing

The populate and serialize (also called encode) steps are separated out to time them separately. That way, users can pick which timing is appropriate for their use case.

I made a trait for bebop that looks like this:

pub trait Serialize<'a> {
    type Populated: 'a + Record<'a>;

    fn populate_bb(&'a self) -> Self::Populated;
    fn depopulate_bb(target: Self::Populated) -> Self;
}

This is intended to convert a native type to a bebop type so that it can be serialized. It works great, except for the following example:

lib.rs:

struct MyStruct {
    my_int: i32,
    my_float: f32,
}

struct MyData {
    my_meta: i32,
    my_data: Vec<MyStruct>,
}

schema.bop:

struct MyStruct {
    int32 my_int;
    float32 my_float;
}

struct MyData {
    int32 my_meta;
    MyStruct[] my_data;
}

In this case, there's an array of user-defined structs that needs to be serialized. If there's a native counterpart to this, then I have to convert a &'a crate::MyData that holds a Vec<crate::MyStruct> to a bebop_generated::MyData<'a> that holds a bebop::SliceWrapper<'a, bebop_generated::MyStruct>.

I used this pattern successfully to serialize strings because I can borrow a &str from any String. I can't use this pattern to serialize collections of native types because I can't borrow a &bebop_generated::Type from any crate::Type.

One option is to define a second intermediate type that uses bebop structs but not bebop containers:

struct Prepopulated {
    my_int: i32,
    my_data: Vec<bebop_generated::MyStruct>,
}

However this really increases the amount of work needed to get things up and running and will be very complicated later on with highly-structured data.

Is there a better way to approach this problem? Or a location in the serialization flow where I could define some custom behavior?

schema support namespace feature for big project


How to support dynamic message types for top-level messages/structs?

Bebop looks great. I'm thinking of switching to it for a game I'm building.

Is your feature request related to a problem? Please describe.
What's the best way to handle multiple message types (in TypeScript)?

Say we have a number of possible messages the server might send:

message WorldUpdateMessageContainer {
  1 -> WorldMeta meta;
  2 -> Entity[] newEntities;
  3 -> EntityUpdate[] updatedEntities;
  4 -> AbstractMessage[] messages;
  5 -> uint8[] deleteEntities;
}

message AbstractMessage {
  1 -> MessageType type;
}

message UpdateMapChunkBlobMessage {
  1 -> MessageType type;
  2 -> MapChunk chunk;
}

message PlaceBlockMessage {
  1 -> MessageType type;
  2 -> UpdateBlock[] blocks;
  3 -> uint16 player;
  4 -> int16 error;
  5 -> uint16 saveCount;
}

message ChatMessage {
  1 -> MessageType type;
  2 -> string body;
  3 -> uint16 player;
  4 -> date sentAt;
  5 -> string username;
}

In this case, the "top-level object" would be a WorldUpdateMessageContainer. What's the best practice for detecting which message it sent (in AbstractMessage)?

Describe the solution you'd like
Is this something opcodes/mirroring is for?

Describe alternatives you've considered
For the current schema format, I override the function to decode the AbstractMessage to look at the MessageType value and decode the rest in the appropriate Message. I could do that here too, but is there a better way? It feels hacky to edit the generated code.

It looks like this:

const maxMessageValue = Math.max(
  ...Object.keys(CompiledSchema.MessageType)
    .filter((key) => !key.endsWith("Message"))
    .map((n) => parseInt(n, 10))
);
CompiledSchema.encoderByMessageType = new Array(maxMessageValue);
CompiledSchema.decoderByMessageType = new Array(maxMessageValue);

for (var key: MessageType in CompiledSchema.MessageType) {
  if (!key.endsWith("Message")) {
    continue;
  }

  const index = parseInt(CompiledSchema.MessageType[key], 10);
  CompiledSchema.encoderByMessageType[index] = CompiledSchema[
    "encode" + key
  ].bind(CompiledSchema);
  CompiledSchema.decoderByMessageType[index] = CompiledSchema[
    "decode" + key
  ].bind(CompiledSchema);
}

CompiledSchema.decodeAbstractMessage = function (bb, _result = null) {
  var result = _result ? _result : {};

  if (!bb || !(bb instanceof CompiledSchema.ByteBuffer)) {
    bb = new CompiledSchema.ByteBuffer(bb);
  }

  while (true) {
    switch (bb.readVarUint()) {
      case 0:
        return result;

      case 1:
        result["type"] = CompiledSchema["MessageType"][bb.readVarUint()];
        return CompiledSchema.decoderByMessageType[
          CompiledSchema["MessageType"][result["type"]]
        ](bb, result);
        break;

      default:
        throw "Error!";
    }
  }
}.bind(CompiledSchema);

Another way to do this would be to wrap all the objects inside other object like this:

message AbstractMessage {
  1 ->  UpdateMapChunkBlobMessage updateMapChunkBlobMessage;
  2 -> PlaceBlockMessage placeBlockMessage;
  3 -> ChatMessage chatMessage;
  // ...
}

But that would get expensive to encode/decode since it's many extra object allocations.

Support alternative enum sizes

In Rust you can write

#[repr(u16)]
enum MyEnum {
  A = 1,
  B = 2,
  ...
}

The idea is to support something like

[repr(uint16)]
union U {
  ...
}

This should probably only accept uint8, uint16, uint32, and maybe uint64.

repr would be supported for both message and union types and would help in cases where more than 256 variants exist (well, 255, since 0 is special). By default there would be an implicit repr(uint8) if no override is defined.

It is also possible this could be extended to enums to allow them to be smaller than a uint32 but was not discussed.
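The Rust attribute the proposal borrows from can be demonstrated directly: the `repr` width determines the discriminant size, which is what lets a tag exceed 255 variants. A quick check (names are illustrative):

```rust
use std::mem::size_of;

// With repr(u16) the discriminant is two bytes, so values beyond a u8's
// range are representable, mirroring the proposed [repr(uint16)] on unions.
#[repr(u16)]
#[allow(dead_code)]
enum Opcode {
    A = 1,
    B = 300, // would not fit in a u8 discriminant
}

fn main() {
    assert_eq!(size_of::<Opcode>(), 2);
    assert_eq!(Opcode::B as u16, 300);
}
```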

Design question: Semicolons and import statements

I was surprised when writing some bebop schemas that:

import "./example.bop";

struct Test {
    string Field;
}

... did not compile -- Expected Enum or Struct or Message or Union or Service, but found ';' of kind Semicolon.

And

import "./example.bop"

struct Test {
    string Field
}

Also did not compile -- Expected Semicolon, but found '}' of kind CloseBrace

The rules are, in one way, consistent -- top level definitions do not use semicolons and field definitions do use semicolons -- but why care? Why enforce the use of semicolons at all? It's been demonstrated by compilers in the past (JavaScript/TypeScript, Go) that the need for semicolons can be removed entirely by injecting one wherever it is required but not provided.

An additional point of inconsistency: based on the theoretical rule I stated above, this should compile, as it is a top level statement:

const bool boolconst = true

But we get: Expected Semicolon, but found '' of kind EndOfFile

Were bebop to remove the need for semicolons but still support their presence, it would be a backwards compatible change, and formatters could automatically remove all of them (or place them where they are allowed, if that's your preferred style).

Improve runtime type guarding capabilities of generated Typescript code

Is your feature request related to a problem? Please describe.
When generating Typescript code the Bebop compiler composes the fields of a record into a Typescript interface and creates a constant which defines encode and decode methods that accept that interface.

So the following schema:

union SuperHero {
   1 -> readonly struct GreenLantern {
      float32 powerLevel;
   }
   2 -> readonly struct Batman {
     float32 powerLevel; 
   }
}

produces the following code:

export declare namespace Bebop.Example {
    interface IGreenLantern {
        readonly powerLevel: number;
    }
    const GreenLantern: {
        discriminator: 1;
        encode(message: IGreenLantern): Uint8Array;
        encodeInto(message: IGreenLantern, view: BebopView): number;
        decode(buffer: Uint8Array): IGreenLantern;
        readFrom(view: BebopView): IGreenLantern;
    };
    interface IBatman {
        readonly powerLevel: number;
    }
    const Batman: {
        discriminator: 2;
        encode(message: IBatman): Uint8Array;
        encodeInto(message: IBatman, view: BebopView): number;
        decode(buffer: Uint8Array): IBatman;
        readFrom(view: BebopView): IBatman;
    };
    type ISuperHero = {
        discriminator: 1;
        value: IGreenLantern;
    } | {
        discriminator: 2;
        value: IBatman;
    };
    const SuperHero: {
        encode(message: ISuperHero): Uint8Array;
        encodeInto(message: ISuperHero, view: BebopView): number;
        decode(buffer: Uint8Array): ISuperHero;
        readFrom(view: BebopView): ISuperHero;
    };
}

The promise of Bebop is to be more type-safe than JSON - and at face value this generated code keeps that promise. Now let's define a function that uses this generated code and step through it:

function getSuperHero(): Bebop.Example.ISuperHero {
    if (getRandomNumber() > 5) {
        return {
            discriminator: Bebop.Example.Batman.discriminator,
            value: {
                powerLevel: 200
            } as Bebop.Example.IBatman
        }
    }
    return {
        discriminator: Bebop.Example.GreenLantern.discriminator,
        value: {
            powerLevel: 200
        } as Bebop.Example.IGreenLantern
    }
}

One of the first issues that presents itself is how a developer must go about constructing and returning a superhero:

// creating a superhero is very verbose
return  {
    // while 'discriminator' must be either '1' or '2'
    // there is nothing that prevents a developer from setting the wrong discriminator for a value
    // this can result in subtle and hard to diagnose runtime exceptions
    // especially if a developer fails to use the constant (which they can and will do if given a choice)
    discriminator: Bebop.Example.Batman.discriminator,
   // the type check is done based on the discriminator
   // so even with the `as` keyword TSC infers the type as whatever matches the `discriminator`
    value: {
        powerLevel: 200
    } as Bebop.Example.IBatman
}

Further, because objects are based on interfaces there is no way to implement type guards or differentiate between types:

const superHero = getSuperHero();
// not possible 
if (superHero.value instanceof Bebop.Example.IBatman)
// not possible
if (superHero.value instanceof Bebop.Example.IGreenLantern)
// not possible
function isBatman(superHero: Bebop.Example.ISuperHero) {
    return superHero.value instanceof Bebop.Example.Batman;
}

This means the following is the only way to check the type of a value within a union:

// this isn't ideal 
if (superHero.discriminator === 1 || superHero.discriminator === Bebop.Example.Batman.discriminator) {
    console.log("I'm Batman")
}
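Today the closest a developer can get to a type guard is a hand-rolled predicate keyed on the discriminator constant. A minimal sketch, with simplified interface shapes and a stand-in constant assumed for illustration:

```typescript
interface IBatman { powerLevel: number; }
interface ISuperHero { discriminator: number; value: unknown; }

// stands in for Bebop.Example.Batman.discriminator
const BATMAN_DISCRIMINATOR = 1;

// a user-written predicate; TSC narrows `value` after the check
function isBatman(hero: ISuperHero): hero is ISuperHero & { value: IBatman } {
  return hero.discriminator === BATMAN_DISCRIMINATOR;
}
```

With this, `hero.value.powerLevel` type-checks inside an `if (isBatman(hero))` branch, but the predicate body itself is unchecked by the compiler, which is exactly the fragility this issue describes.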

And there is no way to check what the type of an object is at all:

const batman: Bebop.Example.IBatman = {
    powerLevel: 200
};
// not valid
if (batman instanceof Bebop.Example.IBatman)

Describe the solution you'd like
The Bebop compiler should produce TypeScript code that leverages classes. This would allow not only runtime type checking but also make unions far more ergonomic to use, like so:

class Superhero {
  discriminator: 1 | 2;
  value: BaseGreenLantern | BaseBatman;

  constructor(value: BaseGreenLantern | BaseBatman) {
    this.value = value;
    if (value instanceof BaseGreenLantern) {
      this.discriminator = 1;
    } else {
      this.discriminator = 2;
    }
  }
}
That is just one example.
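Another option that avoids classes entirely would be for the compiler to emit a discriminated union with literal discriminator types, so TSC can narrow `value` on its own. A hypothetical sketch (type and field names assumed, not actual bebopc output):

```typescript
interface IBatman { powerLevel: number; }
interface IGreenLantern { powerLevel: number; }

// literal discriminators make this a discriminated union
type SuperHero =
  | { discriminator: 1; value: IBatman }
  | { discriminator: 2; value: IGreenLantern };

function describe(hero: SuperHero): string {
  // TSC narrows hero.value based on the literal discriminator
  if (hero.discriminator === 1) {
    return `Batman, power ${hero.value.powerLevel}`;
  }
  return `Green Lantern, power ${hero.value.powerLevel}`;
}
```

This also makes it a compile-time error to pair the wrong discriminator with a value, which the current interface-based output cannot catch.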

Allow enums to explicitly specify other integral scalars

Is your feature request related to a problem? Please describe.

Currently an enum defined in a schema always takes up 4 bytes on the wire, even when its values are small. This means extra padding is applied for known constants that may never need a 4-byte representation.

Describe the solution you'd like
It should be possible to define an enum with a different backing integral scalar. For example:

enum Color : ubyte { Red = 1; Green = 2; Blue = 3; }

When defined this way, the enum would take up only a single byte on the wire, as the generated code would know that only 1 byte is needed.
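A rough TypeScript sketch of the wire-size difference (illustrative only, not generated code):

```typescript
enum Color { Red = 1, Green = 2, Blue = 3 }

// current behavior: every enum value occupies 4 bytes, little-endian
function encodeColorU32(c: Color): Uint8Array {
  const buf = new Uint8Array(4);
  new DataView(buf.buffer).setUint32(0, c, true);
  return buf;
}

// proposed ubyte backing: the same value occupies a single byte
function encodeColorU8(c: Color): Uint8Array {
  return Uint8Array.of(c);
}
```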

Additional context

  • Since enum values do not use varint encoding on the wire, negative values are just as efficient as positive ones. This change should allow both uint and int variants to be valid backing integrals for an enum.
  • Because the generated code may be subject to language-specific limitations, uint64 and int64 should not be supported.

Deno support

Is your feature request related to a problem? Please describe.
Deno support would be a nice feature, as the platform is growing steadily and with it you don't have to deal with the burden of TS compilation.

Describe the solution you'd like
From what I could see in the sources, it should be relatively easy to port Bebop to Deno, as you shouldn't even need to deal with native bindings. I think the only overhead would be replacing some Node APIs such as child_process, path, and maybe fs.

Just my two cents, thanks for your work.

Support for "protobuf" format

Is there something inherently different about the existing Protocol Buffer format such that you couldn't have adopted that as your standard format? Many companies have existing tooling to author or generate protobuf format already. Introducing a new format makes it harder to support.

Is the subset of Protobuf that Bebop supports the reason for the large performance benefits? Would adopting the functionality not supported in Bebop (nested defs, repeated properties) be a reason why a new format had to be invented?

Create a light version of the project icon

Is your feature request related to a problem? Please describe.

Based on discussion in #135 - When viewing the readme in GitHub dark mode the contrast on the logo is a bit rough. This issue tracks creating a lighter version of the logo for use in cases like that.

Describe the solution you'd like

A light version of https://github.com/RainwayApp/bebop/blob/master/assets/128%402x.png

Describe alternatives you've considered

Leaving as is. Since this is a cosmetic-only change, it's pretty low priority and I don't see an alternative other than not doing it.

Rust support

Is your feature request related to a problem? Please describe.
It would be nice to have a Rust library for this, as Rust libraries and applications often aim to maximize performance, and this would be a great boost to that.

Describe the solution you'd like
A Rust library implementing Bebop, using serde as the (de)serialization framework.

Generated C# files, unsupported by Unity

Describe the bug
Feature 'not pattern' is not available in C# 8.0. Please use language version 9.0 or greater;
'NotNullAttribute' is inaccessible due to its protection level;

To Reproduce
Steps to reproduce the behavior:

  1. Generate C# files from .bop
  2. Go to 'Unity'
  3. Use generated files
  4. See compile errors

Expected behavior
Normal working code, without errors

Bebop info:

  • Version: 2.2.2
  • Runtime: .NET

Desktop (please complete the following information):

  • OS: Windows
  • Version: 10

Additional context
I am trying to use Bebop in Unity along with a Node.js server. On the server side everything works, but unfortunately the generated C# files use syntax not supported by Unity (negated 'not' patterns) and inaccessible attributes (NotNull, DisallowNull, MaybeNull, AllowNull).
