dbolin / Apex.Serialization
High performance contract-less binary serializer for .NET
License: MIT License
This is way longer than I wanted. (I thought it would be useful to be a little verbose, as I'm still digging into this.) The main question I have is: how suspicious is it that the serialized byte count is different with the same input?
.NET 6 w/ 4.0.3
So, I'm looking into another issue we are having.
The problem with this one is that it is truly random. We have in-depth integration tests, and on my machine, after running the tests over and over, we randomly get exceptions from the serializer that it can't cast object A to B (sometimes it takes 5 attempts, sometimes 20).
(In case you are curious, the exception looks like the following, but it's all inside the generated expressions and specific to our types, so it's not really helpful.)
System.InvalidCastException: Unable to cast object of type 'Engine.DataStructures.Values.SimpleValue' to type 'Engine.DataStructures.Values.Value'.
at Apex.Serialization.Read_System.Collections.Immutable.ImmutableSortedDictionary`2+Node[[System.String, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[Engine.DataStructures.Values.Value, Engine.DataStructures, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null]](Closure , BufferedStream& , Binary`2 )
at Apex.Serialization.Binary`2.ReadSealedInternal[T](Boolean useSerializedVersionId)
at Apex.Serialization.Read_System.Collections.Immutable.ImmutableSortedDictionary`2+Node[[System.String, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[Engine.DataStructures.Values.Value, Engine.DataStructures, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null]](Closure , BufferedStream& , Binary`2 )
at Apex.Serialization.Binary`2.ReadSealedInternal[T](Boolean useSerializedVersionId)
at Apex.Serialization.Read_Engine.DataStructures.Values.Collection(Closure , BufferedStream& , Binary`2 )
at Apex.Serialization.Binary`2.ReadInternal()
at Apex.Serialization.Read_Engine.Stateless.State.ExecutionStateRoot(Closure , BufferedStream& , Binary`2 )
at Apex.Serialization.Binary`2.ReadSealedInternal[T](Boolean useSerializedVersionId)
at Apex.Serialization.Binary`2.ReadObjectEntry[T]()
at Apex.Serialization.Binary`2.Read[T](Stream
One of the first things I noticed is that the number of bytes the serializer produces sometimes differs for the same input. (I need to check that this is 100% the same input; it's at least the same built-up objects, all from test builders.)
So my question is: is this serializer deterministic? What would make it not be?
I'm still digging into my actual problem, but I thought it was weird that the size of the bytes produced is sometimes different, and when the size differs it usually explodes while deserializing.
Since this is random, it's difficult for me to create a small reproducible case; I'm still trying, though.
The exception originates from reading data in the stream while trying to pull out an already-loaded reference, but it's the wrong type. (The refIndex used to read the LoadedObjectRefs is off by one.) I think the problem is on the writing side rather than the reading side, based on the differing number of bytes written.
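To make the off-by-one failure mode concrete, here is a hypothetical sketch (not Apex's actual code) of how back-reference tracking works in this style of serializer: the writer assigns each new object an index, and the reader must register objects in exactly the same order, so a single skipped or extra registration shifts every later refIndex onto the wrong object.

```csharp
using System.Collections.Generic;

// Hypothetical model of a loaded-object reference table. The names
// LoadedObjectRefs/refIndex mirror the exception described above, but the
// implementation here is illustrative only.
sealed class LoadedObjectRefs
{
    private readonly List<object> _objects = new List<object>();

    // Called as each object is deserialized; must match the writer's order.
    public int Register(object obj)
    {
        _objects.Add(obj);
        return _objects.Count - 1;
    }

    // Called when the stream contains a back-reference instead of a new object.
    public object Resolve(int refIndex) => _objects[refIndex];
}
```

If the reader skips one Register call that the writer performed (or vice versa), every subsequent refIndex resolves one slot off, which would surface exactly as an InvalidCastException between two unrelated types in the graph.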
This would allow better exceptions in error cases where deserialization of a mismatching type is attempted (instead of possibly crashing the program or deserializing incorrect data).
Hello there!
My name is Ana. I noticed that you use the mutation testing tool Stryker.NET in the project.
I am a postdoctoral researcher at the University of Seville (Spain), and my colleagues and I are studying how mutation testing tools are used in practice. With this aim in mind, we have analysed over 3,500 public GitHub repositories using mutation testing tools, including yours! This work has recently been published in a journal paper available at https://link.springer.com/content/pdf/10.1007/s10664-022-10177-8.pdf.
To complete this study, we are asking for your help to better understand how mutation testing is used in practice. We would be extremely grateful if you could contribute by answering a brief survey of 21 simple questions (no more than 6 minutes). This is the link to the questionnaire: https://forms.gle/FvXNrimWAsJYC1zB9.
Drop me an e-mail if you have any questions or comments ([email protected]). Thank you very much in advance!!
Hey Dominic.
I'm still looking into this, but I'm opening this issue (before I have a full fix PR) in case you know something off the top of your head that would help me diagnose it.
Basically, our application has crashed a few times now due to memory corruption. I've traced it down to Apex.Serialization while deserializing some state; it looks to me like it incorrectly serialized that state. (It's pretty hard to debug compiled expressions... more on that later.)
I think I've worked it down to a small reproducible sample. What is funny is that while serializing in debug mode, the library actually catches that it's about to write bad data and explodes.
Example exception while in DEBUG mode of the library
System.InvalidOperationException: Operation is not valid due to the current state of the object.
   at Apex.Serialization.Internal.BufferedStream.CheckReserved(Int32 size) in C:\Github\Apex.Serialization\Apex.Serialization\Internal\BufferedStream.cs:line 152
   at Apex.Serialization.Write_Apex.Serialization.Tests.Option(Closure , Option , BufferedStream& , Binary`2 )
   at Apex.Serialization.Binary`2.WriteSealedInternal[T](T value, Boolean useSerializedVersionId) in C:\Github\Apex.Serialization\Apex.Serialization\Binary.Internal.cs:line 628
Current thoughts:
- Could Apex.Serialization.Settings get a property that turns these sanity checks on even in release mode? (I would personally eat the perf loss to better know that the serialization was accurate and won't crash my app again, lol.)
- Settings.IsTypeSerializable() has some shadiness to it; I'm not sure it's actually needed. The version of the library we are using predates the whitelisting, so I'm probably doing something wrong.
- DynamicCode.HandleNullableWrite is writing a byte 0 when hasValueMethod is actually null, which is what sets off the InvalidOperationException.
I'll provide updates to this issue as I continue to work on this.
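For context on the HandleNullableWrite point above, here is a sketch of the conventional Nullable&lt;T&gt; wire pattern (this is not Apex's generated code, just the usual shape): one marker byte for HasValue, followed by the payload only when a value exists. The bug described amounts to emitting the marker byte even though the HasValue accessor (hasValueMethod) was never resolved.

```csharp
using System.IO;

// Conventional nullable encoding: 1 marker byte, then an optional payload.
static void WriteNullableInt32(BinaryWriter writer, int? value)
{
    writer.Write((byte)(value.HasValue ? 1 : 0)); // marker byte
    if (value.HasValue)
        writer.Write(value.Value);                // 4-byte payload
}

static int? ReadNullableInt32(BinaryReader reader)
{
    return reader.ReadByte() == 1 ? reader.ReadInt32() : (int?)null;
}
```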
Library Version Used & .NET Runtime Version:
Reproduced in 1.3.3 (the version we were on at the time) and the latest in master (4.0.2).
.NET Core 3.1 & .NET 6
Steps to Reproduce:
In my branch I added a test showing the problem.
My branch/commit: Zoxive@c254a0a
I commented out some [Conditional("DEV")] statements, so if you run it in RELEASE mode you see the problem as well.
Expected Behavior:
Works without crashing :)
Actual Behavior:
Creates serialized state which, when deserialized, causes memory corruption and crashes the CLR.
.NET 4.7.1 and v1.3.4
.NET Core 3.0 and v2.0.1
Steps to Reproduce:
ApexTest
Expected Behavior:
Serialize and Deserialize two objects
Actual Behavior:
Runtime Error: System.InvalidOperationException in CheckSize
I've tried to send a stream of objects over a NetworkStream to another system, but it crashes after a few objects.
Is this a bug or by design?
Hello,
would it be possible for version 2.x to still support .NET Standard 2.0?
Don't emit a null byte marker for non-nullable reference types
Based on metadata described in https://github.com/dotnet/roslyn/blob/master/docs/features/nullable-metadata.md
Arrays are problematic for this, because while they offer the most potential benefit, it's not uncommon to have T[] where some elements are actually null (reserved space, sparse array, etc).
Library Version Used & .NET Runtime Version:
Reproduced in 1.3.3 (version we were on at the time) and 2.0.1
.NET 3.0 and 3.1 tested as well
Steps to Reproduce:
I'm still attempting to create a reproducible project; the object we are serializing is quite a large graph, so I haven't been able to reproduce it in a simpler form yet.
Expected Behavior:
Deserializes without exploding.
Actual Behavior:
The process crashes during deserialization (stack overflow; see notes).
Notes
I'm hopeful that the dynamic code generation can be rewritten to be more stack efficient; this appears to be the problem. (Feels similar to dotnet/aspnetcore#2737, fixed by dotnet/extensions#570.)
If I manually set my stack size really high, it deserializes fine: set COMPlus_DefaultStackSize=10000000
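As a less invasive alternative to changing the process-wide COMPlus_DefaultStackSize, the read can be run on a dedicated thread with a larger stack. This is a workaround sketch, not part of the library; the work delegate stands in for the actual deserialization call.

```csharp
using System;
using System.Threading;

// Run a delegate on a thread with a custom stack size (default 16 MB here,
// vs. the usual ~1 MB), so deeply recursive generated code doesn't overflow.
static T RunWithLargeStack<T>(Func<T> work, int stackSizeBytes = 16 * 1024 * 1024)
{
    T result = default;
    Exception error = null;
    var thread = new Thread(() =>
    {
        try { result = work(); }
        catch (Exception e) { error = e; }
    }, stackSizeBytes);
    thread.Start();
    thread.Join();
    if (error != null) throw error;
    return result;
}
```

The Thread constructor overload taking maxStackSize is standard .NET; the trade-off is one extra thread per call, which is usually acceptable for an occasional large-graph deserialization.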
Hi,
I have run into some issues when combining Apex.Serialization and LZ4 compression.
Library Version Used & .NET Runtime Version:
.NET Framework 4.7.1
Apex.Serialization v1.3.3
Steps to Reproduce:
Expected Behavior:
It should work
Actual Behavior:
It does not work and throws an error:
Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
at Apex.Serialization.Internal.BufferedStream.Flush()
at Apex.Serialization.Read_APG.SimResults(Closure , BufferedStream& , Binary`1 )
at Apex.Serialization.Binary`1.ReadSealedInternal[T]()
at Apex.Serialization.Binary`1.ReadObjectEntry[T]()
at Apex.Serialization.Binary`1.Read[T](Stream inputStream)
I thought it was related to the LZ4 compression I was using and raised an issue with them, but they explained that Apex.Serialization does not handle some cases correctly:
MiloszKrajewski/K4os.Compression.LZ4#36 :
You can raise an issue of Apex.Serialization saying it does not work when underlying stream does not return full blocks at once (like network stream).
https://docs.microsoft.com/en-us/dotnet/api/system.io.stream.read?view=netcore-3.1
Note: "Returns: The total number of bytes read into the buffer. This can be less than the number of bytes allocated in the buffer if that many bytes are not currently available, or zero (0) if the end of the stream has been reached."
There are a lot of libraries which do not handle this case correctly.
In the meantime you can use K4os.Compression.LZ4.Streams 1.2.2-beta, which changed the default behaviour (and blocks until a full block is read).
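The partial-read contract quoted above can also be handled on the caller side. A sketch of the two usual fixes, assuming any Stream-based deserializer: loop until the requested byte count is actually filled, or buffer the whole payload into a MemoryStream (which never returns partial reads) before deserializing.

```csharp
using System.IO;

// Loop until exactly `count` bytes have been read; Stream.Read is allowed
// to return fewer bytes than requested (network/decompression streams do).
static void ReadExactly(Stream stream, byte[] buffer, int offset, int count)
{
    while (count > 0)
    {
        int read = stream.Read(buffer, offset, count);
        if (read == 0)
            throw new EndOfStreamException();
        offset += read;
        count -= read;
    }
}

// Alternative: drain the source fully first, then deserialize from memory.
static MemoryStream BufferFully(Stream stream)
{
    var buffered = new MemoryStream();
    stream.CopyTo(buffered); // CopyTo loops internally until Read returns 0
    buffered.Position = 0;
    return buffered;
}
```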
Thanks,
Titas
Hi,
we have a project in an MVC environment.
We use Apex.Serialization to serialize objects and then store the byte[] in our DB.
Everything works fine until we stop and restart the IIS application pool.
After that (but not every time...), deserializing an object saved beforehand results in an
"Index out of range. Non-negative value and less than the collection size required" exception.
Newly serialized objects work fine (until the next restart).
Subsequent restarts of the application pool lead to random results (the object can be deserialized again, or every deserialization throws the exception).
The same code, run as a Windows project, works fine every time.
Thanks
Andrea
Is there a way I can merge the two files without opening them?
Documentation doesn't mention thread safety but it says to reuse instances: "Always reuse serializer instances when possible, as the instance caches a lot of data to improve performance when repeatedly serializing or deserializing objects."
I have a small test case where two simultaneous tasks run serialization sharing the same instance; it either writes zero bytes or invalid bytes to the stream.
ApexBug.zip
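Assuming instances are indeed not thread-safe (the docs quoted above don't promise it either way), a common workaround is to keep the "reuse instances" advice per thread rather than globally. A sketch of a generic per-thread cache; Binary.Create()/IBinary are the Apex.Serialization API shown elsewhere in this thread, while the cache class itself is illustrative:

```csharp
using System;
using System.Threading;

// One cached instance per thread: reuse (for the expression caches) without
// sharing across concurrently running tasks.
sealed class PerThreadCache<T>
{
    private readonly ThreadLocal<T> _instance;

    public PerThreadCache(Func<T> factory)
    {
        _instance = new ThreadLocal<T>(factory);
    }

    public T Instance => _instance.Value;
}

// Intended usage (assumption: IBinary is not safe to share across threads):
// static readonly PerThreadCache<IBinary> Serializers =
//     new PerThreadCache<IBinary>(() => Binary.Create());
```

Note that Tasks can migrate between pool threads across awaits, so this fits synchronous serialization blocks; for async code a pool of instances rented per operation would be safer.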
Hi,
I am trying to use the Apex serializer to serialize some objects with data stored in them. Some of that data is stored as byte[], and I noticed that the byte[] data is missing from the deserialized objects. I tried to serialize just a byte[], and it seems to throw an OverflowException. Is there any way to work around this?
Many Thanks,
Titas
Library Version Used & .NET Runtime Version:
Apex.Serialization 1.3.4
.NET Framework 4.7.1
Platform Target: x64
Steps to Reproduce:
Expected Behavior:
It should work
Actual Behavior:
It throws OverflowException and fails to reload
Exception thrown: 'System.OverflowException' in Apex.Serialization.dll
An unhandled exception of type 'System.OverflowException' occurred in Apex.Serialization.dll
Array dimensions exceeded supported range.
Code to reproduce:
using Apex.Serialization;
using System;
using System.IO;

namespace ConsoleApp2
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create byte array
            Random rnd = new Random(22);
            byte[] byteData = new byte[10];
            rnd.NextBytes(byteData);

            // Save to file - this seems to work fine
            Save(byteData, @"D:\test1.bin");

            // Load from file - this throws OverflowException
            byte[] b1 = Load<byte[]>(@"D:\test1.bin");

            Console.WriteLine("Success");
            Console.ReadLine();
        }

        public static void Save(object inv, string filepath)
        {
            using (FileStream writeFile = File.Create(filepath))
            {
                using (IBinary apex = Binary.Create())
                {
                    apex.Write(inv, writeFile);
                }
            }
        }

        public static T Load<T>(string filepath)
        {
            T results;
            using (FileStream readFile = File.OpenRead(filepath))
            {
                using (IBinary apex = Binary.Create())
                {
                    results = apex.Read<T>(readFile);
                }
            }
            return results;
        }
    }
}
Hi,
While deserializing a byte array, it throws "The type initializer for 'PerTypeValues`1' threw an exception."
InnerException: Could not load file or assembly 'System.Runtime.CompilerServices.Unsafe, Version=4.0.4.1, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a' or one of its dependencies. The located assembly's manifest definition does not match the assembly reference. (Exception from HRESULT: 0x80131040)
I have referenced the System.Runtime.CompilerServices.Unsafe NuGet package, but it did not help.
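On .NET Framework, this kind of manifest-mismatch error is usually resolved with an assembly binding redirect rather than just referencing the package. A sketch of what that could look like in app.config/web.config; the publicKeyToken matches the exception above, but the version numbers are illustrative and should match the assembly actually deployed:

```xml
<!-- Redirect all older references of the assembly to the deployed version. -->
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="System.Runtime.CompilerServices.Unsafe"
                          publicKeyToken="b03f5f7f11d50a3a" culture="neutral" />
        <bindingRedirect oldVersion="0.0.0.0-4.0.4.1" newVersion="4.0.4.1" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>
```

Enabling automatic binding redirects in the project file (AutoGenerateBindingRedirects) often produces this entry for you.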
Hi.
Apex does not work with .NET 8!
Add an option to allow whitelisting types to serialize/deserialize. This may have to be a global option.
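To illustrate the requested feature, here is a hypothetical sketch of what a type whitelist could look like. This is NOT the library's actual API; all names below are invented for illustration:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical whitelist: only explicitly allowed types may be constructed
// from data read off the stream.
sealed class TypeWhitelist
{
    private readonly HashSet<Type> _allowed = new HashSet<Type>();

    public TypeWhitelist Allow<T>()
    {
        _allowed.Add(typeof(T));
        return this;
    }

    // Would be called before instantiating a type named in the stream.
    public void EnsureAllowed(Type type)
    {
        if (!_allowed.Contains(type))
            throw new InvalidOperationException(
                $"Type '{type.FullName}' is not whitelisted for deserialization.");
    }
}
```

Rejecting unknown types up front is the standard mitigation for deserialization of untrusted data, which is presumably why a global option is suggested.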
I also found this one:
https://github.com/rikimaru0345/Ceras
Maybe you could include a comparison? I'd like to change ours, and I'm still searching for which one to use.
Just looking for a clarification, to make sure the nuget package is representative of the code here.
Library Version Used & .NET Runtime Version:
.NET 5
Steps to Reproduce:
Nuget package: https://www.nuget.org/packages/Apex.Serialization/3.0.0
Expected Behavior:
Same version on the repo and nuget package.
Actual Behavior:
Not same version on the repo and nuget package.
Greetings and Regards.
I suggest that you add a compressor like Brotli, Zstd, or Snappy to your beautiful library.
These libraries have the highest compression and decompression speeds and are very small in size.
Thanks
https://github.com/oleg-st/ZstdSharp
https://learn.microsoft.com/en-us/dotnet/api/system.io.compression.brotlistream?view=net-8.0
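Compression can also be layered outside the serializer rather than built in, using the BrotliStream that ships in .NET (System.IO.Compression, .NET Core 2.1+). A sketch operating on raw bytes; the serializer call itself is elided, since any Stream-writing serializer fits this shape (and note the partial-read caveat discussed earlier in this thread when deserializing directly from a decompression stream):

```csharp
using System.IO;
using System.IO.Compression;

// Compress an already-serialized payload with Brotli.
static byte[] Compress(byte[] payload)
{
    using var output = new MemoryStream();
    using (var brotli = new BrotliStream(output, CompressionLevel.Optimal, leaveOpen: true))
    {
        brotli.Write(payload, 0, payload.Length);
    }
    return output.ToArray();
}

// Decompress back to the original bytes before deserializing.
static byte[] Decompress(byte[] compressed)
{
    using var input = new MemoryStream(compressed);
    using var brotli = new BrotliStream(input, CompressionMode.Decompress);
    using var output = new MemoryStream();
    brotli.CopyTo(output);
    return output.ToArray();
}
```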
C Size | ratio% | C MB/s | D MB/s | Name |
---|---|---|---|---|
32823983 | 32.8 | 3.40 | 67.92 | lzma 9 |
32872154 | 32.8 | 0.31 | 315.27 | brotli 11d27 |
32925079 | 32.9 | 1.70 | 70.67 | lzturbo 49 |
33936389 | 33.9 | 2.57 | 1701.35 | lzturbo 39 |
34105370 | 34.1 | 3.32 | 952.59 | zstd 22 |
36751363 | 36.7 | 48.30 | 1701.59 | lzturbo 32 |
36920708 | 36.9 | 2.98 | 2355.32 | lzturbo 29 |
46546059 | 46.5 | 163.77 | 1489.57 | lzturbo 31 |
46805879 | 46.8 | 44.66 | 940.64 | zstd 9 |
48152545 | 48.1 | 52.94 | 349.62 | brotli 4 |
49497505 | 49.4 | 2.48 | 2299.20 | lizard 49 |
49773790 | 49.7 | 38.08 | 1952.73 | lzturbo 22 |
49860700 | 49.8 | 16.94 | 295.99 | zlib 9 |
49962678 | 49.9 | 35.70 | 294.24 | zlib 6 |
50278958 | 50.2 | 282.43 | 1372.91 | lzturbo 30 |
52509931 | 52.5 | 290.96 | 347.16 | brotli 1 |
52549655 | 52.5 | 239.35 | 2153.41 | lzturbo 21 |
52928477 | 52.9 | 69.17 | 276.75 | zlib 1 |
52983490 | 52.9 | 393.67 | 984.00 | zstd 1 |
54251482 | 54.2 | 2.60 | 4367.15 | lzturbo 19 |
54410769 | 54.4 | 46.37 | 3305.22 | lz4 9 |
55923645 | 55.9 | 188.40 | 4200.23 | lzturbo 12 |
57606731 | 57.6 | 386.90 | 3948.64 | lzturbo 11 |
59085723 | 59.0 | 698.39 | 2196.24 | lzturbo 20 |
61455711 | 61.4 | 800.71 | 4003.54 | lzturbo 10 |
61938605 | 61.9 | 730.46 | 3330.40 | lz4 1 |
100098564 | 100.0 | 8647.84 | 8408.10 | memcpy |