dbolin / apex.serialization

High performance contract-less binary serializer for .NET

License: MIT License

Languages: C# 99.96%, Batchfile 0.04%
Topics: binary, c-sharp, custom-serialization, fast, graph, netcore, serialization, serializer

apex.serialization's People

Contributors: alecrodden, dbolin, dependabot-preview[bot], dependabot[bot]

apex.serialization's Issues

Is this serializer meant to be deterministic?

This is longer than I wanted. (I thought it would be useful to be a little verbose, as I'm still digging into this.) The main question I have is: how suspicious is it that the serialized byte count differs for the same input?

Library Version Used & .NET Runtime Version:

.NET 6 with Apex.Serialization 4.0.3

So, I'm looking into another issue we are having.

The problem with this one is that it is truly random. We have in-depth integration tests, and when I run them over and over on my machine we randomly get exceptions from the serializer saying it can't cast object A to B (sometimes it takes 5 attempts, sometimes 20).

(In case you are curious, the exception looks like the following, but it's all inside the generated expressions and specific to our types, so it's not really helpful.)

System.InvalidCastException: Unable to cast object of type 'Engine.DataStructures.Values.SimpleValue' to type 'Engine.DataStructures.Values.Value'.
    at Apex.Serialization.Read_System.Collections.Immutable.ImmutableSortedDictionary`2+Node[[System.String, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[Engine.DataStructures.Values.Value, Engine.DataStructures, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null]](Closure , BufferedStream& , Binary`2 )
    at Apex.Serialization.Binary`2.ReadSealedInternal[T](Boolean useSerializedVersionId)
    at Apex.Serialization.Read_System.Collections.Immutable.ImmutableSortedDictionary`2+Node[[System.String, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[Engine.DataStructures.Values.Value, Engine.DataStructures, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null]](Closure , BufferedStream& , Binary`2 )
    at Apex.Serialization.Binary`2.ReadSealedInternal[T](Boolean useSerializedVersionId)
    at Apex.Serialization.Read_Engine.DataStructures.Values.Collection(Closure , BufferedStream& , Binary`2 )
    at Apex.Serialization.Binary`2.ReadInternal()
    at Apex.Serialization.Read_Engine.Stateless.State.ExecutionStateRoot(Closure , BufferedStream& , Binary`2 )
    at Apex.Serialization.Binary`2.ReadSealedInternal[T](Boolean useSerializedVersionId)
    at Apex.Serialization.Binary`2.ReadObjectEntry[T]()
    at Apex.Serialization.Binary`2.Read[T](Stream 

One of the first things I noticed is that the number of bytes the serializer produces differs from one run to another with the same input. (I still need to check that the input is 100% identical; it is at least built from the same objects, all from Test Builders.)
So my question is: is this serializer deterministic? What would make it not be?

I'm still digging into my actual problem, but I thought it was weird that the size of the output is sometimes different, and when the size differs it usually explodes during deserialization.

Because this is random, it is difficult for me to create a small reproducible case. I'm still trying, though.

What I've determined thus far

The exception originates from reading the data in the stream while trying to pull out an already loaded reference, but the reference has the wrong type. (The refIndex used to read the LoadedObjectRefs is off by one.)

So far I think the problem is on the writing side and not the reading side, simply because the number of bytes written is different.
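
One way to narrow this down is to serialize the same object twice in the same process and find the first byte where the two outputs diverge. The following is only a sketch: `DeterminismCheck` and `FirstDifference` are hypothetical names, and the serializer is passed in as a delegate so the helper makes no assumptions about the Apex API beyond "write this object to this stream".

```csharp
using System;
using System.IO;

public static class DeterminismCheck
{
    // Serializes the same value twice and returns the offset of the first
    // differing byte, or -1 if both outputs are byte-for-byte identical.
    public static long FirstDifference(object value, Action<object, Stream> serialize)
    {
        byte[] first = ToBytes(value, serialize);
        byte[] second = ToBytes(value, serialize);

        int common = Math.Min(first.Length, second.Length);
        for (int i = 0; i < common; i++)
        {
            if (first[i] != second[i]) return i;
        }

        // Identical prefix but different lengths: the difference starts
        // where the shorter output ends.
        return first.Length == second.Length ? -1 : common;
    }

    private static byte[] ToBytes(object value, Action<object, Stream> serialize)
    {
        using (var buffer = new MemoryStream())
        {
            serialize(value, buffer);
            return buffer.ToArray();
        }
    }
}
```

With a serializer instance at hand, a call would look roughly like `DeterminismCheck.FirstDifference(state, (v, s) => apex.Write(v, s))`. A non-negative result gives an offset to inspect in a hex dump, which may help tie the divergence to a specific field or reference index.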

Use of mutation testing in Apex.Serialization - Help needed

Hello there!

My name is Ana. I noted that you use the mutation testing tool strykernet in the project.
I am a postdoctoral researcher at the University of Seville (Spain), and my colleagues and I are studying how mutation testing tools are used in practice. With this aim in mind, we have analysed over 3,500 public GitHub repositories using mutation testing tools, including yours! This work has recently been published in a journal paper available at https://link.springer.com/content/pdf/10.1007/s10664-022-10177-8.pdf.

To complete this study, we are asking for your help to understand better how mutation testing is used in practice, please! We would be extremely grateful if you could contribute to this study by answering a brief survey of 21 simple questions (no more than 6 minutes). This is the link to the questionnaire https://forms.gle/FvXNrimWAsJYC1zB9.

Drop me an e-mail if you have any questions or comments ([email protected]). Thank you very much in advance!!

Incorrect serialization with Nullable<> + Generics + ValueType (maybe?)

Hey Dominic.
I'm still looking into this, but I'm opening this issue (before I have a full fix PR) in case you know something off the top of your head that would help me diagnose it.

Basically, our application has crashed a few times now due to memory corruption. I've traced it down to Apex.Serialization while deserializing some state. It looks to me like it serialized that state incorrectly. (It's pretty hard to debug compiled expressions; more on that later.)

I think I have it narrowed down to a small reproducible sample. What is funny is that while serializing in debug, it actually catches that it's about to write some bad data and explodes.

Example exception while in DEBUG mode of the library
System.InvalidOperationException: Operation is not valid due to the current state of the object.
   at Apex.Serialization.Internal.BufferedStream.CheckReserved(Int32 size) in C:\Github\Apex.Serialization\Apex.Serialization\Internal\BufferedStream.cs:line 152
   at Apex.Serialization.Write_Apex.Serialization.Tests.Option(Closure , Option , BufferedStream& , Binary`2 )
   at Apex.Serialization.Binary`2.WriteSealedInternal[T](T value, Boolean useSerializedVersionId) in C:\Github\Apex.Serialization\Apex.Serialization\Binary.Internal.cs:line 628

Current thoughts:

  1. Maybe we could add an Apex.Serialization.Settings property that turns these sanity checks on even in release mode? (I personally would eat the perf loss to know that the serialization was accurate and won't crash my app again.)
  2. In my commit, the Settings.IsTypeSerializable() change has some shadiness to it; I'm not sure it is actually needed. The version of the library we are using predates the whitelisting, so I'm probably doing something wrong.
  3. My current target area for the problem/fix is around DynamicCode.HandleNullableWrite. It writes a byte 0 when hasValueMethod is actually null, which is what sets off the InvalidOperationException.
  4. With my work continuing to use this type of library, I'm wondering about the feasibility of creating a SourceGenerator version of this library. The main point is debuggability for any future problems we may have, with a side bonus of performance. (I may take a pass at this later this week; it will help me get into this codebase more as well.)

I'll provide updates to this issue as I continue to work on this.

Library Version Used & .NET Runtime Version:
Reproduced in 1.3.3 (version we were on at the time) and latest in master. (4.0.2)
.NET 3.1 & .NET 6

Steps to Reproduce:

In my branch
I added a test showing the problem.
My branch/commit: Zoxive@c254a0a

Test:
https://github.com/Zoxive/Apex.Serialization/blob/c254a0ae020e217d31da8fad45a29dbc889d3407/Tests/Apex.Serialization.Tests/NullableGenericValueTypeTests.cs

I commented out some [Conditional("DEV")] attributes so that you can see the problem in RELEASE mode as well.

Expected Behavior:
Don't create serialized state that causes memory corruption and crashes the CLR when deserialized

Actual Behavior:
Work without crashing : )

More than one object in a stream ...

.NET 4.7.1 and v1.3.4
.NET Core 3.0 and v2.0.1

Steps to Reproduce:
ApexTest

Expected Behavior:
Serialize and Deserialize two objects

Actual Behavior:
Runtime Error: System.InvalidOperationException: in CheckSize

I've tried to send a stream of objects over a NetworkStream to another system,
but it crashes after a few objects.

Is this a bug or by design?
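
Whether multiple objects per stream are supported is not answered here. A common, library-independent workaround is to frame each object with a length prefix, so the serializer only ever sees one complete, self-contained payload. The sketch below assumes nothing about Apex internals: `Framing`, `WriteFrame` and `TryReadFrame` are hypothetical helpers, and the serializer and deserializer are passed in as delegates.

```csharp
using System;
using System.IO;

public static class Framing
{
    // Writes one message: a 4-byte little-endian length followed by the payload.
    public static void WriteFrame(Stream output, Action<Stream> serialize)
    {
        using (var buffer = new MemoryStream())
        {
            serialize(buffer);
            byte[] header = BitConverter.GetBytes((int)buffer.Length);
            if (!BitConverter.IsLittleEndian) Array.Reverse(header);
            output.Write(header, 0, header.Length);
            buffer.Position = 0;
            buffer.CopyTo(output);
        }
    }

    // Reads one message and hands the payload to the deserializer as its own
    // MemoryStream. Returns false when the stream ends cleanly between frames.
    public static bool TryReadFrame<T>(Stream input, Func<Stream, T> deserialize, out T value)
    {
        byte[] header = new byte[4];
        int got = Fill(input, header);
        if (got == 0) { value = default(T); return false; }
        if (got < header.Length) throw new EndOfStreamException("Truncated frame header.");

        if (!BitConverter.IsLittleEndian) Array.Reverse(header);
        int length = BitConverter.ToInt32(header, 0);
        if (length < 0) throw new InvalidDataException("Negative frame length.");

        byte[] payload = new byte[length];
        if (Fill(input, payload) < length) throw new EndOfStreamException("Truncated frame payload.");

        using (var buffer = new MemoryStream(payload))
        {
            value = deserialize(buffer);
        }
        return true;
    }

    // Stream.Read may return fewer bytes than requested (typical for
    // NetworkStream), so keep reading until the array is full or the stream ends.
    private static int Fill(Stream input, byte[] target)
    {
        int total = 0;
        while (total < target.Length)
        {
            int read = input.Read(target, total, target.Length - total);
            if (read == 0) break;
            total += read;
        }
        return total;
    }
}
```

On the sending side this would be used as `Framing.WriteFrame(networkStream, s => apex.Write(obj, s))`, and on the receiving side as `Framing.TryReadFrame(networkStream, s => apex.Read<MyType>(s), out var obj)`, where `MyType` stands in for the actual type. The cost is one extra in-memory copy per object.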

NETStandard 2.0

Hello,
would it be possible for version 2.x to still support .NETStandard 2.0?

Stackoverflow when deserializing large object graph

Library Version Used & .NET Runtime Version:
Reproduced in 1.3.3 (version we were on at the time) and 2.0.1
.NET 3.0 and 3.1 tested as well

Steps to Reproduce:
I'm still attempting to create a reproducible project. The object we are serializing is quite a large graph, so I haven't been able to reproduce it in a simpler form yet.

Expected Behavior:
Deserialize

Actual Behavior:
Don't explode

Notes
I'm hopeful that the dynamic code generation can be rewritten to be more stack efficient, as this appears to be the problem. (It feels similar to dotnet/aspnetcore#2737, fixed by dotnet/extensions#570.)

If I manually set my stack size really high, it deserializes fine: set COMPlus_DefaultStackSize=10000000
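
As an alternative to changing the default stack size for the whole process, the deserialization call alone can be run on a dedicated thread created with a larger maximum stack size. This is a minimal sketch using the standard `Thread(ThreadStart, int maxStackSize)` constructor; `BigStack.Run` is a hypothetical helper name, and the right size for a given object graph is something to find by experiment.

```csharp
using System;
using System.Runtime.ExceptionServices;
using System.Threading;

public static class BigStack
{
    // Runs `work` on a dedicated thread with the requested maximum stack size
    // and returns its result. Ordinary exceptions are captured and rethrown on
    // the calling thread; a real StackOverflowException still kills the process.
    public static T Run<T>(Func<T> work, int maxStackSizeBytes)
    {
        T result = default(T);
        ExceptionDispatchInfo failure = null;

        var thread = new Thread(() =>
        {
            try { result = work(); }
            catch (Exception ex) { failure = ExceptionDispatchInfo.Capture(ex); }
        }, maxStackSizeBytes);

        thread.Start();
        thread.Join();

        if (failure != null) failure.Throw();
        return result;
    }
}
```

Usage would be along the lines of `var state = BigStack.Run(() => apex.Read<MyState>(stream), 64 * 1024 * 1024);`, with `MyState` standing in for the real root type. This only works around deep recursion; it does not make the generated code any more stack efficient.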

Does not work when underlying stream does not return full blocks at once

Hi,
I have run into some issues when combining Apex.Serialization and LZ4 compression.

Library Version Used & .NET Runtime Version:
.NET Framework 4.7.1
Apex.Serialization v1.3.3

Steps to Reproduce:

  1. Get lz4net and K4os.Compression.Streams Nuget packages (and Apex.Serialization of course)
  2. Run the example code: Example.txt

Expected Behavior:
It should work

Actual Behavior:
It does not work and throws an error:

Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at Apex.Serialization.Internal.BufferedStream.Flush()
   at Apex.Serialization.Read_APG.SimResults(Closure , BufferedStream& , Binary`1 )
   at Apex.Serialization.Binary`1.ReadSealedInternal[T]()
   at Apex.Serialization.Binary`1.ReadObjectEntry[T]()
   at Apex.Serialization.Binary`1.Read[T](Stream inputStream)

I thought it was related to the LZ4 compression I was using and raised an issue with them, but they explained that Apex.Serialization does not handle some cases correctly:

MiloszKrajewski/K4os.Compression.LZ4#36 :

You can raise an issue of Apex.Serialization saying it does not work when underlying stream does not return full blocks at once (like network stream).

https://docs.microsoft.com/en-us/dotnet/api/system.io.stream.read?view=netcore-3.1

Note: "Returns: The total number of bytes read into the buffer. This can be less than the number of bytes allocated in the buffer if that many bytes are not currently available, or zero (0) if the end of the stream has been reached."

There is a lot of libraries which does not handle this case correctly.

In the meantime you can use K4os.Compression.Streams 1.2.2-beta which changed default behaviour (and blocks until full block is read).

Thanks,
Titas
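
Until partial reads are handled inside the library, one possible workaround on the caller's side is to wrap the source stream so that every `Read` call keeps reading until the requested count is reached or the stream ends, as the `Stream.Read` contract quoted above permits. `FullReadStream` is a hypothetical wrapper, not part of Apex.Serialization or K4os, and it assumes the reader only needs forward reads (no `Length` or `Seek`).

```csharp
using System;
using System.IO;

// Read-only wrapper whose Read only returns fewer bytes than requested
// when the underlying stream has reached its end.
public sealed class FullReadStream : Stream
{
    private readonly Stream _inner;

    public FullReadStream(Stream inner)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int total = 0;
        while (total < count)
        {
            int read = _inner.Read(buffer, offset + total, count - total);
            if (read == 0) break; // end of stream
            total += read;
        }
        return total;
    }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();

    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }

    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();

    protected override void Dispose(bool disposing)
    {
        if (disposing) _inner.Dispose();
        base.Dispose(disposing);
    }
}
```

The decompression stream would then be passed as `new FullReadStream(lz4Stream)`. Note that on an interactive network connection this blocks until the requested number of bytes has arrived, which may stall if the reader asks for more bytes than the peer is going to send.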

Deserialization error in MVC project

Hi,
we have a project in MVC environment.
We use Apex.Serialization to serialize and then store byte[] in our DB.
Everything works fine until we stop and restart the IIS Application Pool.
After that (but not every time), deserializing an object saved before the restart results in an
"Index out of range. Non-negative value and less than the collection size required" exception.
Newly serialized objects work fine (until the next restart).
Subsequent restarts of the Application Pool lead to random results (the object can be deserialized again, or every deserialization throws the exception).
The same code, run in a Windows project, works fine every time.
Thanks
Andrea

merge two files

Is there a way I can merge the two files without opening them?

Question: Is Apex.Serialization thread safe?

Documentation doesn't mention thread safety but it says to reuse instances: "Always reuse serializer instances when possible, as the instance caches a lot of data to improve performance when repeatedly serializing or deserializing objects."

I have a small test case where two simultaneous tasks run serialization while sharing the same instance; it either writes zero bytes or writes invalid bytes to the stream.
ApexBug.zip
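
If a shared instance turns out not to be safe for concurrent use, one way to keep the documented benefit of reusing instances is to hold one instance per thread. The sketch below uses the standard `ThreadLocal<T>`; `PerThread<T>` is a hypothetical wrapper, and the factory delegate is whatever call creates the serializer in your version (for example `Binary.Create()` as used elsewhere on this page).

```csharp
using System;
using System.Threading;

// Lazily creates one instance of T per thread and disposes all of them together.
public sealed class PerThread<T> : IDisposable where T : class, IDisposable
{
    private readonly ThreadLocal<T> _instances;

    public PerThread(Func<T> factory)
    {
        // trackAllValues lets Dispose reach every instance that was created.
        _instances = new ThreadLocal<T>(factory, trackAllValues: true);
    }

    // The instance owned by the calling thread; created on first access.
    public T Current => _instances.Value;

    public void Dispose()
    {
        foreach (T instance in _instances.Values)
        {
            instance.Dispose();
        }
        _instances.Dispose();
    }
}
```

A caller would create `var serializers = new PerThread<IBinary>(() => Binary.Create());` once and use `serializers.Current.Write(obj, stream)` from any thread. Each thread pays the warm-up cost of its own caches, which is the trade-off compared with a single instance guarded by a lock.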

OverflowException when deserializing byte[]

Hi,
I am trying to use the Apex serialiser to serialise some objects with data stored in them. Some of that data is stored as byte[], and I noticed that the byte[] data is missing from the deserialised objects. I tried to serialise just a byte[] and it throws an OverflowException. Is there any way to work around this?

Many Thanks,
Titas

Library Version Used & .NET Runtime Version:
Apex.Serialization 1.3.4
.NET Framework 4.7.1
Platform Target: x64

Steps to Reproduce:

  1. Create byte[]
  2. Serialise it and save it to file
  3. Try to load it from the file and deserialise it

Expected Behavior:
It should work

Actual Behavior:
It throws OverflowException and fails to reload

Exception thrown: 'System.OverflowException' in Apex.Serialization.dll
An unhandled exception of type 'System.OverflowException' occurred in Apex.Serialization.dll
Array dimensions exceeded supported range.

Code to reproduce:

using Apex.Serialization;
using System;
using System.IO;

namespace ConsoleApp2
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create byte array
            Random rnd = new Random(22);
            byte[] byteData = new byte[10];
            rnd.NextBytes(byteData);

            // Save to file - this seems to work fine
            Save(byteData, @"D:\test1.bin");

            // Load from file - this throws OverflowException
            byte[] b1 = Load<byte[]>(@"D:\test1.bin");

            Console.WriteLine("Success");
            Console.ReadLine();
        }

        public static void Save(object inv, string filepath)
        {
            using (FileStream writeFile = File.Create(filepath))
            {
                using (IBinary apex = Binary.Create())
                {
                    apex.Write(inv, writeFile);
                }
            }
        }
        public static T Load<T>(string filepath)
        {
            T results;
            using (FileStream readFile = File.OpenRead(filepath))
            {
                using (IBinary apex = Binary.Create())
                {
                    results = apex.Read<T>(readFile);
                }
            }
            return results;
        }
    }
}

The type initializer for 'PerTypeValues`1' threw an exception.

Hi,
While deserializing a byte array, the following exception was thrown: "The type initializer for 'PerTypeValues`1' threw an exception."
InnerException: Could not load file or assembly 'System.Runtime.CompilerServices.Unsafe, Version=4.0.4.1, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a' or one of its dependencies. The located assembly's manifest definition does not match the assembly reference. (Exception from HRESULT: 0x80131040)

I have referenced System.Runtime.CompilerServices.Unsafe nuget, but it did not help.

Add whitelist option

Add an option to allow whitelisting types to serialize/deserialize. This may have to be a global option.

Enable Compression on serialize using Brotli Or ZstdSharp

Greetings and Regards.

I suggest that you add a compressor like Brotli, Zstd, or Snappy to your beautiful library.
These libraries offer high compression and decompression speed and are very small in size.

Thanks

https://github.com/oleg-st/ZstdSharp
https://learn.microsoft.com/en-us/dotnet/api/system.io.compression.brotlistream?view=net-8.0

   C Size  ratio%    C MB/s    D MB/s  Name
 32823983    32.8      3.40     67.92  lzma 9
 32872154    32.8      0.31    315.27  brotli 11d27
 32925079    32.9      1.70     70.67  lzturbo 49
 33936389    33.9      2.57   1701.35  lzturbo 39
 34105370    34.1      3.32    952.59  zstd 22
 36751363    36.7     48.30   1701.59  lzturbo 32
 36920708    36.7      2.98   2355.32  lzturbo 29
 46546059    46.5    163.77   1489.57  lzturbo 31
 46805879    46.8     44.66    940.64  zstd 9
 48152545    48.1     52.94    349.62  brotli 4
 49497505    49.4      2.48   2299.20  lizard 49
 49773790    49.7     38.08   1952.73  lzturbo 22
 49860700    49.8     16.94    295.99  zlib 9
 49962678    49.9     35.70    294.24  zlib 6
 50278958    50.2    282.43   1372.91  lzturbo 30
 52509931    52.5    290.96    347.16  brotli 1
 52549655    52.5    239.35   2153.41  lzturbo 21
 52928477    52.9     69.17    276.75  zlib 1
 52983490    52.9    393.67    984.00  zstd 1
 54251482    54.2      2.60   4367.15  lzturbo 19
 54410769    54.4     46.37   3305.22  lz4 9
 55923645    55.9    188.40   4200.23  lzturbo 12
 57606731    57.6    386.90   3948.64  lzturbo 11
 59085723    59.0    698.39   2196.24  lzturbo 20
 61455711    61.4    800.71   4003.54  lzturbo 10
 61938605    61.9    730.46   3330.40  lz4 1
100098564   100.0   8647.84   8408.10  memcpy
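
Until something like this is built in, compression can be layered on from the outside by wrapping the target stream. The sketch below uses the `BrotliStream` class from `System.IO.Compression` linked above (available on .NET Core 2.1 and later); `CompressedPayload` is a hypothetical helper, and the serializer is passed in as a delegate so nothing is assumed about the Apex API. On the read side the data is decompressed fully into memory first, so the deserializer never sees a partial read from the decompression stream.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class CompressedPayload
{
    // Serializes through a Brotli compressor into `output`.
    public static void Write(Stream output, Action<Stream> serialize,
                             CompressionLevel level = CompressionLevel.Fastest)
    {
        // leaveOpen: true so the caller keeps ownership of `output`.
        // Disposing the BrotliStream flushes the final compressed block.
        using (var brotli = new BrotliStream(output, level, leaveOpen: true))
        {
            serialize(brotli);
        }
    }

    // Decompresses everything remaining in `input`, then deserializes from memory.
    public static T Read<T>(Stream input, Func<Stream, T> deserialize)
    {
        using (var brotli = new BrotliStream(input, CompressionMode.Decompress, leaveOpen: true))
        using (var buffer = new MemoryStream())
        {
            brotli.CopyTo(buffer);
            buffer.Position = 0;
            return deserialize(buffer);
        }
    }
}
```

Usage would look like `CompressedPayload.Write(file, s => apex.Write(obj, s))` and `CompressedPayload.Read(file, s => apex.Read<MyType>(s))`, with `MyType` standing in for the real type. Whether the size reduction is worth the extra CPU time depends on the data; the figures in the table above come from a generic benchmark, not from Apex output.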
