Giter Site home page Giter Site logo

jtmueller / collections.pooled Goto Github PK

View Code? Open in Web Editor NEW
529.0 25.0 48.0 6.68 MB

Fast, low-allocation ports of List, Dictionary, HashSet, Stack, and Queue using ArrayPool and Span.

License: MIT License

C# 100.00%
list dictionary stack arraypool span hashset queue

collections.pooled's People

Contributors

ares128 avatar dzmitry-lahoda avatar ipavel83 avatar jtmueller avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

collections.pooled's Issues

PooledList: the indexer-get operation is 9-10 times slower than that of List, but Span is fine

PooledList: the indexer-get operation is 9-10 times slower than that of List, despite being the exact same code, as far as I can tell! I haven't found any way of closing that gap other than removing the bounds-check, which isn't a good idea. I suspect some sort of List-specific optimization in the framework that PooledList can't benefit from. However, indexing into PooledList.Span is 84% faster than indexing into List, so there's that.

https://github.com/jtmueller/Collections.Pooled/blob/master/docs/benchmarks/netcoreapp2.2/List_Indexer_Types-report-github.md

I will look into this.

Add support for Tuple<TKey, TValue> to dictionary constructor/AddRange

Is your feature request related to a problem? Please describe.
We already support ValueTuple, but there are cases where support for Tuple would be convenient, particularly in F# to make an equivalent to the dict function.

Describe the solution you'd like
Constructors, AddRange, and extension methods for PooledDictionary should support IEnumerable<Tuple<TKey, TValue>>. No need to support spans of same.

Add support for PooledQueue

Is your feature request related to a problem? Please describe.
PooledQueue is the only major non-Sorted collection left to support.

Describe the solution you'd like
Implement PooledQueue, as with the others.

Add suppport for custom ArrayPools

Is your feature request related to a problem? Please describe.
Currently the pooled collections are all hard-coded to use ArrayPool<T>.Shared, but ArrayPool<T> is an abstract class intended to allow for custom implementations. For example, on a resource-constrained device someone might need an ArrayPool of a fixed size, or that pre-allocates arrays or something. At the moment there is no way to get PooledCollections to make use of such a custom ArrayPool.

Describe the solution you'd like
We need a way to optionally specify a custom ArrayPool, and default to ArrayPool<T>.Shared if none is provided. However, passing in an instance of a custom ArrayPool wouldn't work for PooledDictionary or PooledSet, because those types rely on arrays of a private type inaccessible from outside.

Perhaps there's a way to use a generic type parameter to specify the type of a custom ArrayPool, and then create an instance of this type inside the collection code?

Resurrect rented buffers or arrays in finalyzer

Has it been considered to provide an option of returning the rented buffers/arrays from the collection finalyzer. May be useful when disposing of a pooled collection instance is too complicated or impossible to enforce. For instance, a collection is being returned from some library as its base interface (IList as an example).

To not waste anybody's time I am not providing possible ways to implementing such feature until know for sure it is not a duplicate to some other, known request or issue.

Thank you.

Add a Span-based constructor overload to PooledStack

Is your feature request related to a problem? Please describe.

If foo is PooledList<int>...
new PooledStack<int>(foo.Span)
...does not work without an explicit cast.

Describe the solution you'd like
It would be nice to not have to cast. There may be other pooled collections to which this applies.

Support for .NET Standard 2.1

.NET Standard 2.1 is in Preview (as part of .NET Core 3.0) as of this writing. The current version...

  • Is implemented by .NET Core 3.0 but not .NET Core 2.1 or 2.2.
  • Adds Span, Memory, ArrayPool, and support for the C# 8 Index and Range syntax
  • Fully supports everything Collections.Pooled needs.

The downside is this:

  • Currently we target .NET Standard 2.0 (with NuGet packages for Span and ArrayPool) and .NET Core 2.1 (with a higher-performance built-in Span)
  • It would be nice to target only .NET Standard 2.0 and 2.1, without targeting any specific frameworks. However, this would mean that .NET Core 2.1/2.2 clients (which don't support .NET Standard 2.1) would get the Standard 2.0 version of the library with the slower implementation of Span, even though there's a faster one built in. Maybe type redirection would cause it to use fast-Span regardless, but I'm not sure.
  • However, keeping the .NET Core 2.1 target while adding .NET Standard 2.1 is also a problem. A .NET Core 3.0 client (at least as of the current preview) will default to loading the .NET Core 2.1 version of the library when presented with a choice between Core 2.1 and Standard 2.1. I guess that specific frameworks trump standards, even when the supported standard is newer. This means that .NET Core 3.0 clients would not get the version of the library that supports Indexes and Ranges.
  • To fix that last issue, I would have to target four frameworks: Standard 2.0 and 2.1, Core 2.1 and 3.0. This list would likely keep growing over time. I don't want to do this, because it complicates conditional compilation, and I'd rather target a small number of standards than many different versions of .NET Core.

My current plan is this, when the final release of .NET Core 3 comes out:

  • Increment the major version number, as this is a breaking change for .NET Core 2.1/2.2 users.
  • Target only .NET Standard 2.0 and 2.1.

By incrementing the major version, .NET Core 2.x users will not auto-upgrade to the new release, and can keep using Collections.Pooled 1.x. Users of .NET Core 3 will get the Standard 2.1 version, and all other users will get Standard 2.0 (until another framework adds support for Standard 2.1 of course).

This code is currently fully implemented in the core-3 branch if you want to experiment with it.

@dzmitry-lahoda and @Wixred I'm tagging you because you've shown an interest. Do you have any objection to this plan, or any alternate suggestions?

Performance of dictionary

I looked into dozen of dict benchmarks. I do not see when pooled is faster. In some cases it is even slower for some reason. Could you please document?

Aggressive optimization and inlining foor PooledList.Span

Is your feature request related to a problem? Please describe.
Next method is kind of slow in Unity runtime (not sure about Core):

public Span<T> Span => _items.AsSpan(0, _size);

Describe the solution you'd like
Mark this method as aggressive optimization and inlining.

Describe alternatives you've considered
Provide unsafe access to writable array as ReadOnlyProperty.

Additional context
We get data from Span by ref.

I know Core 3 optimized Span, but there are issues to measure that because of BDN-JIT instability. Same time Unity is involved into .NET Ecosystem and hopefully will optimize Span in some time (because people will use .NET Standard with Spans and Unity stated it is 2.0 and will be 2.1)

Nested pooled collections crash the Visual Studio heap snapshot

Describe the bug
When nesting pooled collections inside each other, the debugger crashes when taking a heap snapshot.

To Reproduce
Create a nested collection, for example PooledDictionary<int, PooledSet<int>> and attempt to take a heap snapshot in the Visual Studio debugger. This occurs in both the prerelease and 1.0 Nuget packages.

Expected behavior
The debugger should not crash.

Is it also possible that nesting pooled collections is just not recommended usage? I'm not really sure how it works internally.

PooledDictionary get fails with ValueTuple key

Describe the bug
Lookups fail in a PooledDictionary<(int, int), string> but succeed in a Dictionary<(int, int), string>.

To Reproduce

class Program
{
    private static void PooledTest()
    {
        var dict = new PooledDictionary<(int, int), string>
        {
            { (1, 0), "1,0" },
            { (2, 0), "2,0" },
            { (0, 1), "0,1" },
            { (0, 2), "0,2" }
        };

        bool found1 = dict.TryGetValue((1, 0), out var _);
        bool found2 = dict.TryGetValue((2, 0), out var _);
        bool found3 = dict.TryGetValue((0, 1), out var _);
        bool found4 = dict.TryGetValue((0, 2), out var _);

        Console.WriteLine("PooledDictionary: {0}, {1}, {2}, {3}", found1, found2, found3, found4);
    }

    private static void DictTest()
    {
        var dict = new Dictionary<(int, int), string>
        {
            { (1, 0), "1,0" },
            { (2, 0), "2,0" },
            { (0, 1), "0,1" },
            { (0, 2), "0,2" }
        };

        bool found1 = dict.TryGetValue((1, 0), out var _);
        bool found2 = dict.TryGetValue((2, 0), out var _);
        bool found3 = dict.TryGetValue((0, 1), out var _);
        bool found4 = dict.TryGetValue((0, 2), out var _);

        Console.WriteLine("Dictionary: {0}, {1}, {2}, {3}", found1, found2, found3, found4);
    }

    static void Main(string[] args)
    {
        DictTest();
        PooledTest();
    }
}

Expected behavior

Dictionary: True, True, True, True
PooledDictionary: False, False, False, True

We should get the same results from both dictionary types, without having to specify custom comparers.

Retrieve Memory<T> from PooledList<T>

I am new to Span and Memory, and am trying to migrate my library to them. I discovered PooledList when looking into how to AddRange a Span to a List. My initial migration increased the warm of my program by 30% (parsing a large data file and loading it into different lists and dictionaries in memory). Thanks!
However I did find one point that I needed to work around:
It currently isn't possible to get a Memory from PooledList.
Being able to access a Memory from the PooledList will allow me to pass around heap storable references of sub-ranges which makes the PooledList even more flexible. I'd be able to use references to it with Linq IEnumerable and also be able to use it in local functions. Ie. it would free the references for the restrictions that are required by Span.

I understand that PooledList is built upon ArrayPool which means that returning a Memory will cause a memory leak when expanding the PooledList.
Is it possible to implement this functionality - or would it come as an unacceptable tradeoff to the performance/complexity of the library?

Thanks again

Add ability to force clear of pooled data

Is your feature request related to a problem? Please describe.
Pooled collections have different behavior between .NET Core 2.1+ and .NET Standard 2.0. Because .NET Standard has no mechanism to easily determine if a value type contains a reference type, the .NET Standard version always clears values before returning arrays to the array pool.

The .NET Core 2.1+ version has an optimization that only clears values before returning arrays when those values are reference types or contain reference types. This helps performance, but some users might be concerned about leaking sensitive data to other code running in the same process, and might want even value-type data to be cleared before returning arrays to the pool.

Describe the solution you'd like
Pooled collections should allow the user to specify an "always clear" behavior - at least in the .NET Core 2.1+ version.

Binary Serialization

Describe the bug
This is a reminder to test a likely issue that's not been verified yet: it's likely that when pooled collections are serialized and then deserialized, disposing them causes an exception because the deserialized arrays do not come from the ArrayPool, which throws an error if you return to it an array it didn't create.

Expected behavior
Either ArrayPool returns should be in a try/catch block, or deserialization should copy the deserialized array into an ArrayPool array instead of using it directly.

Pooled Lookup

Would it be possible to have a PooledLookup implementation that implements ILookup?

This would drastically help us in scenarios we have a large (>1m objects), intermediate lookup we create that gets GCā€™d right away.

Support for ObservableCollection

Would be amazing if you could add support for ObservableCollection as it's the only collection type that fully supports binding. If you add support I would recommend or suggest you also add a AddRange method as the original collection hasn't but for performance reasons it's a must-have. There are many modified versions that added such as a method but none which is optimized with ArrayPool etc.

Pooled Array

It would be great if there was PooledArray implementation along w/ the it's supporting LINQ methods (ToPooledArray etc)

new PooledList<PlayerState>(Players.Count, preallocate : true)

Is your feature request related to a problem? Please describe.
Zero overhead structs clone.

Describe the solution you'd like

            PooledList<PlayerState> players = null;
            if (Players != null)
            {
                players = new PooledList<PlayerState>(Players.Count, preallocate : true); // i guess preallocated just sets _size = capacity;
                for (var i = 0; i < players.Count; i++)
                {
                    PlayerState.Clone(in Players.Span[i], ref players.Span[i]);
                }
            }

Describe alternatives you've considered

            PooledList<PlayerState> players = null;
            if (Players != null)
            {
                players = new PooledList<PlayerState>(Players.Count);
                for (var i = 0; i < players.Count; i++)
                {
                    // Clone will fail until
                    players[i] = default; // overhead to create 200+ byte struct on stack and pass it into indexer (not sure if optimized away by compilers?)
                    PlayerState.Clone(in Players.Span[i], ref players.Span[i]);
                }
            }

Additional context
Usual List could not do that because it would leak abstraction (that it is actually array list).
PooledList already leaks Span as property and ArrayPool as ctor arg(also ctor arg is no of so problem). So adding that method will not increase leakage, but will allow more performance uses for structs.

GC of boxing if key is struct

Is your feature request related to a problem? Please describe.

In Unity runtime if key is struct, than key == null produced garbage of boxing. Probably same happens within .NET Core.

Describe the solution you'd like

Could check !typeof(TKey).IsValueType before comparing with null, or may be is null could do that already. How fast is that check?

Describe alternatives you've considered
Copy and paste whole code.

linq benefits from pooled?

Hi,

I've update my project to pooled collections, but it seems time costs has not been imporoved. I've used a lot of linq technologies in my project. So I wonder does linq benefits from Pooled?

Thank you.

PooledList.Dispose semantics

As I see current semantics is to return array into pool, set capacity-size to 0, and allow use later.

        /// <summary>
        /// Returns the internal buffers to the ArrayPool.
        /// </summary>
        public void Dispose()
        {
            ReturnArray();
            _size = 0;
            _version++;
        }

This contradicts with corefx classes Dispose when you cannot use object after Dispose.

Real semantic seems to be Dispose = Reallocate(capacity: 0). ReturnPool => Reallocate(capacity: 0). I suggest to rename Dispose to something else.

Or document:

        /// <summary>
        /// Returns the internal buffers to the ArrayPool and sets capacity to zero. Instance still can be used.
        /// </summary>

PooledDictionary.Remove with Count 0 = Exception

Describe the bug
PooledDictionary.Remove with Count 0 = Exception

To Reproduce

var dict = new PooledDictionary<string, string>();
dict.Remove("Something");

Expected behavior
return without doing anything.

Make PooledStack Indexable and Insertable

Is your feature request related to a problem? Please describe.
Although not need for my primary use cases, there are some situations where I'd find it beneficial for a Stack implementation to be indexable and insertable. Unfortunately, most implementations do not support this, therefore I have to use a List and it's Add and RemoveAt and forgo the optimizations done for Pop, Peak, and Push.

Describe the solution you'd like
Add an Indexer and implement Insert(int index, T item) for PooledStack

PooledSet is generating GC when item gets removed

in my test HashSet is much faster and 0 gc

using System.Collections.Generic;
using Collections.Pooled;
using Gma.DataStructures;
using UnityEngine;

public class Test169 : bs
{
    HashSet<int> HashSet = new HashSet<int>(){1,2,3,4,5,6,7,8,9};
    PooledSet<int> PooledSet = new PooledSet<int>(){1,2,3,4,5,6,7,8,9};
    
    public void Update()
    {
        
        using (bs.Profile("pooled"))
            for (int i = 0; i < 1000; i++)
            {
                PooledSet.Remove(2);
            }
        
        using(bs.Profile("list"))
            for (int i = 0; i < 1000; i++)
            {
                HashSet.Remove(2);
            }
        // Debug.Log(dd);
    }
}

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    šŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. šŸ“ŠšŸ“ˆšŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ā¤ļø Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.