
Comments (3)

rhshadrach commented on May 27, 2024

Thanks for the suggestion. We would need to check whether the groups do indeed form a contiguous sequence. In addition, it would mean the type of index in the result is dependent on the values being grouped, which can make it hard to predict for users. For these reasons, I do not think we should be swapping out the index.

from pandas.

Alexia-I commented on May 27, 2024

@rhshadrach Thanks for your quick response and for considering the suggestion. I believe the proposed optimization would be particularly useful when users group rows by an integer-dtype column whose values form a contiguous sequence. This is a common scenario, especially with large datasets, where the choice of index type can significantly impact memory usage: any index type other than RangeIndex consumes much more memory, so using RangeIndex in such cases could offer substantial memory savings and improve the overall efficiency of pandas for large data processing tasks.
Besides, most users interact with the DataFrame contents rather than the index type, and I think most operations do not distinguish between Int64Index and RangeIndex, so the change would not affect users much. I also anticipate that implementing it would be relatively straightforward. So I think this would be an easy and useful improvement.
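
For anyone who wants to see the size of the gap being described, here is a minimal sketch that compares the two index types directly. The element count `n` and the variable names are only illustrative assumptions, not taken from the issue, and the exact byte counts will depend on the pandas version.

```python
import numpy as np
import pandas as pd

n = 1_000_000  # illustrative size, chosen only for this sketch

# An integer index backed by a materialized int64 array
# (roughly 8 bytes per element).
int_index = pd.Index(np.arange(n, dtype="int64"))

# A RangeIndex only stores start, stop and step.
range_index = pd.RangeIndex(n)

int_bytes = int_index.memory_usage()
range_bytes = range_index.memory_usage()

print(f"integer index: {int_bytes} bytes")
print(f"RangeIndex:    {range_bytes} bytes")
```

The difference grows linearly with the number of groups, so it only matters when the grouped result itself is large.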


phofl avatar phofl commented on May 27, 2024

I agree with @rhshadrach. We don't want value-dependent behaviour if we can avoid it, and the improvement isn't worth the hassle in this case. You can cast the Index yourself if necessary.
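
One way the manual cast might look is sketched below. The helper `to_range_index_if_contiguous` is a hypothetical name made up for this example, not a pandas function, and the sample DataFrame is invented. It also performs the contiguity check mentioned earlier in the thread, and returns the index unchanged when the check fails.

```python
import numpy as np
import pandas as pd


def to_range_index_if_contiguous(index: pd.Index) -> pd.Index:
    """Hypothetical helper: return an equivalent RangeIndex when the
    integer index increases by exactly 1 at each step, else return it as-is."""
    if isinstance(index, pd.RangeIndex) or len(index) == 0:
        return index
    if not pd.api.types.is_integer_dtype(index.dtype) or index.hasnans:
        return index
    values = index.to_numpy(dtype="int64")
    # Contiguous means every consecutive difference equals 1.
    if not (np.diff(values) == 1).all():
        return index
    return pd.RangeIndex(int(values[0]), int(values[-1]) + 1, name=index.name)


# Invented sample data: the group keys 0, 1, 2 happen to be contiguous.
df = pd.DataFrame({"key": [0, 1, 1, 2, 2, 2],
                   "value": [10, 20, 30, 40, 50, 60]})
result = df.groupby("key").sum()
result.index = to_range_index_if_contiguous(result.index)

# Keys with a gap keep their original integer index.
gapped = pd.DataFrame({"key": [0, 2, 5], "value": [1, 2, 3]}).groupby("key").sum()
gapped.index = to_range_index_if_contiguous(gapped.index)
```

Doing this explicitly keeps the choice with the caller, so the index type of a plain `groupby` result stays the same regardless of the values being grouped.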

