Giter Site home page Giter Site logo

Wrong order after rdst about rdst HOT 6 OPEN

vigna avatar vigna commented on September 1, 2024
Wrong order after rdst

from rdst.

Comments (6)

vigna avatar vigna commented on September 1, 2024 1

The second thing LOL. Thanks for the fix!

However, in the end we realized unstable is fine. Any suggestion for optimizing the sort for a few billion pairs (usize, usize)?

from rdst.

nessex avatar nessex commented on September 1, 2024

Thanks! I've grabbed the file and managed to replicate it. Still trying to work out the cause though. It takes a while, the issue appears right at the end of the dataset, and when taking a subset of the data it disappears 😅

It looks like it's possibly something to do with one of the layer skips in LsbSort, so unfortunately for data that big there isn't really a great replacement for that algorithm. You could limit the sort to just Regions sort and Ska sort for now, but it's a lot slower for that data.

from rdst.

vigna avatar vigna commented on September 1, 2024

Ok thanks!

BTW, any possibility of getting a stable version? 😏

from rdst.

nessex avatar nessex commented on September 1, 2024

BTW, any possibility of getting a stable version? 😏

Depends on if you mean a stable library or stable sort order 😅 . For the former, I've released 0.20.13 that includes a fix for the ordering issue here. It was an issue with the combination of two optimizations (level skip + alternating input vs. output arrays), where counting wasn't alternating input arrays. Counts were correct (same data), but the flag / check for if the data was "already sorted" wasn't.

If you mean stable sort order, give Algorithm::MtLsb and Algorithm::Lsb a try! They have stable output. Something like this should do:

struct StableTuner;
impl Tuner for StableTuner {
    fn pick_algorithm(&self, p: &TuningParams, _counts: &[usize]) -> Algorithm {
        if p.input_len >= 500_000 {
            Algorithm::MtLsb
        } else {
            Algorithm::Lsb
        }
    }
}

from rdst.

nessex avatar nessex commented on September 1, 2024

The most important thing would be to make the RadixKey impl. as fast as possible. If you can flip the order of the pair of usize, and if you're on exclusively little-endian machines, something like this is probably faster:

impl RadixKey for Pair {
    const LEVELS: usize = 16;

    #[inline(always)]
    fn get_level(&self, level: usize) -> u8 {
        #[cfg(target_endian = "little")]
        unsafe { (self.0.as_ptr() as *const u8).add(level).read_unaligned() }

        #[cfg(target_endian="big")]
        compile_error!("big endian is not supported");
    }
}

This function is used in some very hot loops, so a single conditional or divide can have a major impact.

Beyond that, the standard tuner is still in the same ballpark as other options I tried when running it against that billion pairs dataset you provided. Occasionally a different tuning was a little faster / slower, but nothing consistently so.

Depending on the machine you're running on, the included low_mem tuner could be faster though. Most of my runs were on a m1 macbook, which seems to have great memory performance (both bandwidth and latency). On a mid-range Ryzen it's a different story and memory can be a bit of a bottleneck, so the low_mem tuner wins out more often. If you test the standard tuner, low_mem tuner and also that MtLsb option in the comment above, then you'll have a pretty good snapshot of what options exist. More complex tunings probably won't have a great impact beyond those three options.

from rdst.

vigna avatar vigna commented on September 1, 2024

Thanks! As is it it's two times faster than par_sort_unstable, so we're already very happy. We'll keep experimenting. Thanks for a great crate!

from rdst.

Related Issues (10)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.