
Comments (2)

Chuck321123 commented on August 20, 2024

@MarcoGorelli I'll add this alternative method, which is faster than the current solution, in case you choose to continue on this issue in the future:

import pandas as pd
import polars as pl

num_rows = 1000000
utc_time = pd.date_range(start='2023-01-01', periods=num_rows, freq='s')

df = pd.DataFrame({
    'UTC_Time': utc_time
})

df['UTC_Time'] = df['UTC_Time'].sort_values()

print(df.head())

df = pl.from_pandas(df)  # convert the pandas frame to a polars DataFrame

df = df.with_columns(pl.col("UTC_Time").dt.truncate("2m").alias("Method1"))

df = df.with_columns(pl.from_epoch((pl.col("UTC_Time")
                                    .dt.epoch(time_unit="ns")
                                    // (2 * 60 * 1_000_000_000))
                                   * (2 * 60 * 1_000_000_000),
                                   time_unit="ns").alias("Method2"))

%timeit df.with_columns(pl.col("UTC_Time").dt.truncate("2m"))

%timeit df.with_columns(pl.from_epoch((pl.col("UTC_Time").dt.epoch(time_unit="ns") // (2 * 60 * 1_000_000_000)) * (2 * 60 * 1_000_000_000), time_unit="ns"))

Console output:

4.62 ms ± 989 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
2.63 ms ± 178 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

This was run on the latest version, 1.0.0 alpha 1.
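To make the arithmetic behind Method2 concrete, here is a minimal standard-library sketch of the same idea: floor-divide the epoch value in nanoseconds by the bucket width, then multiply back. It does not use polars; the helper names (`to_epoch_ns`, `truncate_by_floor_division`, `truncate_by_fields`) are made up for illustration, and it assumes the timestamps are UTC, as the column name suggests.

```python
from datetime import datetime, timedelta, timezone

NS_PER_SECOND = 1_000_000_000
BUCKET_NS = 2 * 60 * NS_PER_SECOND  # two minutes in nanoseconds, as in Method2
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ns(ts: datetime) -> int:
    """Whole nanoseconds since the Unix epoch, using integer arithmetic only."""
    delta = ts - EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * NS_PER_SECOND + delta.microseconds * 1_000


def truncate_by_floor_division(ts: datetime, bucket_ns: int = BUCKET_NS) -> datetime:
    """Floor a timestamp to the start of its bucket: (ns // width) * width."""
    floored_ns = (to_epoch_ns(ts) // bucket_ns) * bucket_ns
    return EPOCH + timedelta(microseconds=floored_ns // 1_000)


def truncate_by_fields(ts: datetime) -> datetime:
    """Reference result: zero the seconds and round the minute down to an even value."""
    return ts.replace(minute=ts.minute - ts.minute % 2, second=0, microsecond=0)


# Hypothetical sample data: timestamps spread over ten minutes from 2023-01-01.
start = datetime(2023, 1, 1, tzinfo=timezone.utc)
samples = [start + timedelta(seconds=s) for s in range(0, 600, 37)]

# Both approaches agree on every sample.
for ts in samples:
    assert truncate_by_floor_division(ts) == truncate_by_fields(ts)

print(truncate_by_floor_division(datetime(2023, 1, 1, 0, 3, 59, tzinfo=timezone.utc)))
# prints 2023-01-01 00:02:00+00:00
```

The two results match here because a two-minute bucket divides an hour evenly and the buckets are anchored at the UTC epoch. For time-zone-aware data or calendar-based intervals, plain floor division on the epoch value would not necessarily agree with a calendar-aware truncation.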


MarcoGorelli commented on August 20, 2024

Thanks. I'm seeing a much smaller difference, though:

3.16 ms ± 77.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
2.61 ms ± 186 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
