jtmueller / collections.pooled Goto Github PK

View Code? Open in Web Editor NEW

529.0 25.0 48.0 6.68 MB

Fast, low-allocation ports of List, Dictionary, HashSet, Stack, and Queue using ArrayPool and Span.

License: MIT License

C# 100.00%

list dictionary stack arraypool span hashset queue

collections.pooled's People

Contributors

Stargazers

Watchers

Forkers

erisonliang equiprogrammer huex kopffarben cmpm72 taigacon eaba keepitsimple libs-space eakova jgefen deviceman ashafei maxchiu proektsoftbg ipavel83 mckanpolat kawhiwei pinkuburu neoayi mingyaaaa qiufengshe wstrong007 omegated wunmao shibox jianyuyanyu leinlin pga1992 zqlovejyc ares128 zzhnysr charygao mike-cd-csharp hoavk frostebite fredatgithub yunusbyrak truonggiangvu papercut110 zilo555 freddiebabord hengle monarchsolutions innotech-ameen-alqattow thanhdang143 jshergal changweizhu

collections.pooled's Issues

PooledList: the indexer-get operation is 9-10 times slower than that of List, but Span is fine

PooledList: the indexer-get operation is 9-10 times slower than that of List, despite being the exact same code, as far as I can tell! I haven't found any way of closing that gap other than removing the bounds-check, which isn't a good idea. I suspect some sort of List-specific optimization in the framework that PooledList can't benefit from. However, indexing into PooledList.Span is 84% faster than indexing into List, so there's that.

https://github.com/jtmueller/Collections.Pooled/blob/master/docs/benchmarks/netcoreapp2.2/List_Indexer_Types-report-github.md

I will look into this.

Add support for Tuple<TKey, TValue> to dictionary constructor/AddRange

Is your feature request related to a problem? Please describe.
We already support ValueTuple, but there are cases where support for Tuple would be convenient, particularly in F# to make an equivalent to the dict function.

Describe the solution you'd like
Constructors, AddRange, and extension methods for PooledDictionary should support IEnumerable<Tuple<TKey, TValue>>. No need to support spans of same.

Add support for PooledQueue

Is your feature request related to a problem? Please describe.
PooledQueue is the only major non-Sorted collection left to support.

Describe the solution you'd like
Implement PooledQueue, as with the others.

Add suppport for custom ArrayPools

Is your feature request related to a problem? Please describe.
Currently the pooled collections are all hard-coded to use ArrayPool<T>.Shared, but ArrayPool<T> is an abstract class intended to allow for custom implementations. For example, on a resource-constrained device someone might need an ArrayPool of a fixed size, or that pre-allocates arrays or something. At the moment there is no way to get PooledCollections to make use of such a custom ArrayPool.

Describe the solution you'd like
We need a way to optionally specify a custom ArrayPool, and default to ArrayPool<T>.Shared if none is provided. However, passing in an instance of a custom ArrayPool wouldn't work for PooledDictionary or PooledSet, because those types rely on arrays of a private type inaccessible from outside.

Perhaps there's a way to use a generic type parameter to specify the type of a custom ArrayPool, and then create an instance of this type inside the collection code?

Resurrect rented buffers or arrays in finalyzer

Has it been considered to provide an option of returning the rented buffers/arrays from the collection finalyzer. May be useful when disposing of a pooled collection instance is too complicated or impossible to enforce. For instance, a collection is being returned from some library as its base interface (IList as an example).

To not waste anybody's time I am not providing possible ways to implementing such feature until know for sure it is not a duplicate to some other, known request or issue.

Thank you.

Add a Span-based constructor overload to PooledStack

Is your feature request related to a problem? Please describe.

If foo is PooledList<int>...
new PooledStack<int>(foo.Span)
...does not work without an explicit cast.

Describe the solution you'd like
It would be nice to not have to cast. There may be other pooled collections to which this applies.

Add equivalent of CollectionsMarshal.GetValueRef for PooledDictionary.

Is your feature request related to a problem? Please describe.

.NET 6 exposes a low-level method to get a ref to the entry in a Dictionary<,>: dotnet/runtime#49388

It would be nice to have an equivalent to this in the pooled collections.

Describe the solution you'd like

Something like PooledCollectionsMarshal.GetValueRefOrNullRef?

Support for .NET Standard 2.1

.NET Standard 2.1 is in Preview (as part of .NET Core 3.0) as of this writing. The current version...

Is implemented by .NET Core 3.0 but not .NET Core 2.1 or 2.2.
Adds Span, Memory, ArrayPool, and support for the C# 8 Index and Range syntax
Fully supports everything Collections.Pooled needs.

The downside is this:

Currently we target .NET Standard 2.0 (with NuGet packages for Span and ArrayPool) and .NET Core 2.1 (with a higher-performance built-in Span)
It would be nice to target only .NET Standard 2.0 and 2.1, without targeting any specific frameworks. However, this would mean that .NET Core 2.1/2.2 clients (which don't support .NET Standard 2.1) would get the Standard 2.0 version of the library with the slower implementation of Span, even though there's a faster one built in. Maybe type redirection would cause it to use fast-Span regardless, but I'm not sure.
However, keeping the .NET Core 2.1 target while adding .NET Standard 2.1 is also a problem. A .NET Core 3.0 client (at least as of the current preview) will default to loading the .NET Core 2.1 version of the library when presented with a choice between Core 2.1 and Standard 2.1. I guess that specific frameworks trump standards, even when the supported standard is newer. This means that .NET Core 3.0 clients would not get the version of the library that supports Indexes and Ranges.
To fix that last issue, I would have to target four frameworks: Standard 2.0 and 2.1, Core 2.1 and 3.0. This list would likely keep growing over time. I don't want to do this, because it complicates conditional compilation, and I'd rather target a small number of standards than many different versions of .NET Core.

My current plan is this, when the final release of .NET Core 3 comes out:

Increment the major version number, as this is a breaking change for .NET Core 2.1/2.2 users.
Target only .NET Standard 2.0 and 2.1.

By incrementing the major version, .NET Core 2.x users will not auto-upgrade to the new release, and can keep using Collections.Pooled 1.x. Users of .NET Core 3 will get the Standard 2.1 version, and all other users will get Standard 2.0 (until another framework adds support for Standard 2.1 of course).

This code is currently fully implemented in the core-3 branch if you want to experiment with it.

@dzmitry-lahoda and @Wixred I'm tagging you because you've shown an interest. Do you have any objection to this plan, or any alternate suggestions?

Performance of dictionary

I looked into dozen of dict benchmarks. I do not see when pooled is faster. In some cases it is even slower for some reason. Could you please document?

Aggressive optimization and inlining foor PooledList.Span

Is your feature request related to a problem? Please describe.
Next method is kind of slow in Unity runtime (not sure about Core):

public Span<T> Span => _items.AsSpan(0, _size);

Describe the solution you'd like
Mark this method as aggressive optimization and inlining.

Describe alternatives you've considered
Provide unsafe access to writable array as ReadOnlyProperty.

Additional context
We get data from Span by ref.

I know Core 3 optimized Span, but there are issues to measure that because of BDN-JIT instability. Same time Unity is involved into .NET Ecosystem and hopefully will optimize Span in some time (because people will use .NET Standard with Spans and Unity stated it is 2.0 and will be 2.1)

Nested pooled collections crash the Visual Studio heap snapshot

Describe the bug
When nesting pooled collections inside each other, the debugger crashes when taking a heap snapshot.

To Reproduce
Create a nested collection, for example PooledDictionary<int, PooledSet<int>> and attempt to take a heap snapshot in the Visual Studio debugger. This occurs in both the prerelease and 1.0 Nuget packages.

Expected behavior
The debugger should not crash.

Is it also possible that nesting pooled collections is just not recommended usage? I'm not really sure how it works internally.

Memory analysis on on dict benchmarks, like it wad did for list.

There is not enough external observation for developers to start using dictionary as replacement.

PooledDictionary get fails with ValueTuple key

Describe the bug
Lookups fail in a PooledDictionary<(int, int), string> but succeed in a Dictionary<(int, int), string>.

To Reproduce

class Program
{
    private static void PooledTest()
    {
        var dict = new PooledDictionary<(int, int), string>
        {
            { (1, 0), "1,0" },
            { (2, 0), "2,0" },
            { (0, 1), "0,1" },
            { (0, 2), "0,2" }
        };

        bool found1 = dict.TryGetValue((1, 0), out var _);
        bool found2 = dict.TryGetValue((2, 0), out var _);
        bool found3 = dict.TryGetValue((0, 1), out var _);
        bool found4 = dict.TryGetValue((0, 2), out var _);

        Console.WriteLine("PooledDictionary: {0}, {1}, {2}, {3}", found1, found2, found3, found4);
    }

    private static void DictTest()
    {
        var dict = new Dictionary<(int, int), string>
        {
            { (1, 0), "1,0" },
            { (2, 0), "2,0" },
            { (0, 1), "0,1" },
            { (0, 2), "0,2" }
        };

        bool found1 = dict.TryGetValue((1, 0), out var _);
        bool found2 = dict.TryGetValue((2, 0), out var _);
        bool found3 = dict.TryGetValue((0, 1), out var _);
        bool found4 = dict.TryGetValue((0, 2), out var _);

        Console.WriteLine("Dictionary: {0}, {1}, {2}, {3}", found1, found2, found3, found4);
    }

    static void Main(string[] args)
    {
        DictTest();
        PooledTest();
    }
}

Expected behavior

Dictionary: True, True, True, True
PooledDictionary: False, False, False, True

We should get the same results from both dictionary types, without having to specify custom comparers.

Retrieve Memory<T> from PooledList<T>

I am new to Span and Memory, and am trying to migrate my library to them. I discovered PooledList when looking into how to AddRange a Span to a List. My initial migration increased the warm of my program by 30% (parsing a large data file and loading it into different lists and dictionaries in memory). Thanks!
However I did find one point that I needed to work around:
It currently isn't possible to get a Memory from PooledList.
Being able to access a Memory from the PooledList will allow me to pass around heap storable references of sub-ranges which makes the PooledList even more flexible. I'd be able to use references to it with Linq IEnumerable and also be able to use it in local functions. Ie. it would free the references for the restrictions that are required by Span.

I understand that PooledList is built upon ArrayPool which means that returning a Memory will cause a memory leak when expanding the PooledList.
Is it possible to implement this functionality - or would it come as an unacceptable tradeoff to the performance/complexity of the library?

Thanks again

I see non generic IDictionary implemented. Would you accept pull do drop it?

Add ability to force clear of pooled data

Is your feature request related to a problem? Please describe.
Pooled collections have different behavior between .NET Core 2.1+ and .NET Standard 2.0. Because .NET Standard has no mechanism to easily determine if a value type contains a reference type, the .NET Standard version always clears values before returning arrays to the array pool.

The .NET Core 2.1+ version has an optimization that only clears values before returning arrays when those values are reference types or contain reference types. This helps performance, but some users might be concerned about leaking sensitive data to other code running in the same process, and might want even value-type data to be cleared before returning arrays to the pool.

Describe the solution you'd like
Pooled collections should allow the user to specify an "always clear" behavior - at least in the .NET Core 2.1+ version.

Upgrad and support .net 6

Is there is any plan to support .NET6?

Binary Serialization

Describe the bug
This is a reminder to test a likely issue that's not been verified yet: it's likely that when pooled collections are serialized and then deserialized, disposing them causes an exception because the deserialized arrays do not come from the ArrayPool, which throws an error if you return to it an array it didn't create.

Expected behavior
Either ArrayPool returns should be in a try/catch block, or deserialization should copy the deserialized array into an ArrayPool array instead of using it directly.

Is the repos alive?

Pooled Lookup

Would it be possible to have a PooledLookup implementation that implements ILookup?

This would drastically help us in scenarios we have a large (>1m objects), intermediate lookup we create that gets GC’d right away.

Support for ObservableCollection

Would be amazing if you could add support for ObservableCollection as it's the only collection type that fully supports binding. If you add support I would recommend or suggest you also add a AddRange method as the original collection hasn't but for performance reasons it's a must-have. There are many modified versions that added such as a method but none which is optimized with ArrayPool etc.

Pooled Array

It would be great if there was PooledArray implementation along w/ the it's supporting LINQ methods (ToPooledArray etc)

Why not implents INotifyCollectionChanged or provide a BindingVersion

INotifyCollectionChanged is useful ,I wanted it~

Disposing PooledDictionary doesn't dispose PooledList stored as values

In the following code stored lists are not returned to the pool:

var lookup = new PooledDictionary<int, PooledList<string>>(clearMode: ClearMode.Always);

// add values to collection...

lookup.Dispose();

new PooledList<PlayerState>(Players.Count, preallocate : true)

Is your feature request related to a problem? Please describe.
Zero overhead structs clone.

Describe the solution you'd like

            PooledList<PlayerState> players = null;
            if (Players != null)
            {
                players = new PooledList<PlayerState>(Players.Count, preallocate : true); // i guess preallocated just sets _size = capacity;
                for (var i = 0; i < players.Count; i++)
                {
                    PlayerState.Clone(in Players.Span[i], ref players.Span[i]);
                }
            }

Describe alternatives you've considered

            PooledList<PlayerState> players = null;
            if (Players != null)
            {
                players = new PooledList<PlayerState>(Players.Count);
                for (var i = 0; i < players.Count; i++)
                {
                    // Clone will fail until
                    players[i] = default; // overhead to create 200+ byte struct on stack and pass it into indexer (not sure if optimized away by compilers?)
                    PlayerState.Clone(in Players.Span[i], ref players.Span[i]);
                }
            }

Additional context
Usual List could not do that because it would leak abstraction (that it is actually array list).
PooledList already leaks Span as property and ArrayPool as ctor arg(also ctor arg is no of so problem). So adding that method will not increase leakage, but will allow more performance uses for structs.

Consider PooledSortedList

Put xml docs into nuget packages

so we clear read on what does copy-allocate and what is not.

GC of boxing if key is struct

Is your feature request related to a problem? Please describe.

In Unity runtime if key is struct, than key == null produced garbage of boxing. Probably same happens within .NET Core.

Describe the solution you'd like

Could check !typeof(TKey).IsValueType before comparing with null, or may be is null could do that already. How fast is that check?

Describe alternatives you've considered
Copy and paste whole code.

linq benefits from pooled?

Hi,

I've update my project to pooled collections, but it seems time costs has not been imporoved. I've used a lot of linq technologies in my project. So I wonder does linq benefits from Pooled?

Thank you.

PooledList.Dispose semantics

As I see current semantics is to return array into pool, set capacity-size to 0, and allow use later.

        /// <summary>
        /// Returns the internal buffers to the ArrayPool.
        /// </summary>
        public void Dispose()
        {
            ReturnArray();
            _size = 0;
            _version++;
        }

This contradicts with corefx classes Dispose when you cannot use object after Dispose.

Real semantic seems to be Dispose = Reallocate(capacity: 0). ReturnPool => Reallocate(capacity: 0). I suggest to rename Dispose to something else.

Or document:

        /// <summary>
        /// Returns the internal buffers to the ArrayPool and sets capacity to zero. Instance still can be used.
        /// </summary>

PooledDictionary.Remove with Count 0 = Exception

Describe the bug
PooledDictionary.Remove with Count 0 = Exception

To Reproduce

var dict = new PooledDictionary<string, string>();
dict.Remove("Something");

Expected behavior
return without doing anything.

Make PooledStack Indexable and Insertable

Is your feature request related to a problem? Please describe.
Although not need for my primary use cases, there are some situations where I'd find it beneficial for a Stack implementation to be indexable and insertable. Unfortunately, most implementations do not support this, therefore I have to use a List and it's Add and RemoveAt and forgo the optimizations done for Pop, Peak, and Push.

Describe the solution you'd like
Add an Indexer and implement Insert(int index, T item) for PooledStack

PooledSet is generating GC when item gets removed

in my test HashSet is much faster and 0 gc

using System.Collections.Generic;
using Collections.Pooled;
using Gma.DataStructures;
using UnityEngine;

public class Test169 : bs
{
    HashSet<int> HashSet = new HashSet<int>(){1,2,3,4,5,6,7,8,9};
    PooledSet<int> PooledSet = new PooledSet<int>(){1,2,3,4,5,6,7,8,9};
    
    public void Update()
    {
        
        using (bs.Profile("pooled"))
            for (int i = 0; i < 1000; i++)
            {
                PooledSet.Remove(2);
            }
        
        using(bs.Profile("list"))
            for (int i = 0; i < 1000; i++)
            {
                HashSet.Remove(2);
            }
        // Debug.Log(dd);
    }
}

How about call GC.SuppressFinalize(this) to prevent add to finalization queue.

   /// <summary>
        /// Returns the internal buffers to the ArrayPool.
        /// </summary>
        public void Dispose()
        {
            ReturnArray();
            _size = 0;
            _version++;
        }

https://github.com/jtmueller/Collections.Pooled/blob/86cb27aa7ef64f52f169d0e2f7d512493ddc1b51/Collections.Pooled/PooledList.cs?2#L1568

jtmueller / collections.pooled Goto Github PK

collections.pooled's People

Contributors

Stargazers

Watchers

Forkers

collections.pooled's Issues

Recommend Projects

Recommend Topics

Recommend Org