Giter Site home page Giter Site logo

parellel-radix-sort's Introduction

Parellel Radix Sort

by Max Newcomer

Team solution to project: https://github.com/remzi-arpacidusseau/ostep-projects/tree/master/concurrency-sort

Preamble

When we first started this project, we actually thought that the sort was supposed to be within the same file. So we actually started with this pretty cool (but not super fast) thread latching radix sort. The problem with that sort was that there were many iterations of syncronizing the threads.

We went from that approach to our current implementation when reading the tests (and the project description lol) again. With the requirement of another output file we thought up our new implementation.

Current Implementation

  1. Spawn 16 threads with id's 0 through F.
  2. Each thread reads (in-parellel) the input file and grabs all records with the Most Sig Bits of the key matching the thread id (since each key is a IID variable, we can expect with large inputs to have even distribution of keys, this is something we would need to consider changing if we were expecting a different distribution of keys)
  3. Threads keep track of how many they added into their personal memory
  4. Thread 0 finds its starting point in the output mmap
    • Then thread 1, thread 2, ... , thread F
    • This is the only part where order matters
  5. All threads radix sort within their own subarray within the mmap
    • making sure to use bit operations instead of integer operations

The main problem with this implementation is that the size of the initial gathering array is impossible to know. So, we default it to filesize / 100 / 16 records. With the ability to expand into "extra" arrays. This adds an annoying extra complexity, but it's pretty dynamic which is nice.

Pretty proud of this sort and think it's actually pretty cool.

Future Ideas

If we had a little more time and/or we were getting paid to actually deliver this we would like to make the number of threads a little more dynamic, allowing for numerous processor core cpus to take advantage of each core. This would look something like spawning n threads and then bit masking the first log_2(n) bits of each record to delegate work to each thread.

Step 4. could actually be further optimized (maybe only slightly). By having each thread have a public count each thread n could add up all n-1 threads before and use that as an offset. This would require constant checking by each thread that they have up to date information which honestly might slow down the program. Only way to know would be to implement. Given that the current implementation uses condition signalling it might be hard to further optimize this part.

Extras

Did some performance tuning in valgrind as well.

parellel-radix-sort's People

Contributors

maxwnewcomer avatar

Watchers

 avatar  avatar

parellel-radix-sort's Issues

convert while(1) to while(pthread_mutex_trylock() != 0);

I think instead of the while(1) loop we do something like:

while(1) {
if(pthread_mutex_lock(s_mem->lock) == -1) {
pthread_cond_wait(s_mem->checkable, s_mem->lock);
}
if(s_mem->t_turn % global->THREADS == tid) {
s_mem->c_t_arr = 0;
s_mem->c_t_idx = 0;
break;
}

while(!pthread_mutex_trylock(s_mem->lock)) {
    if(s_mem->t_turn % global->THREADS == tid) { 
         s_mem->c_t_arr = 0; 
         s_mem->c_t_idx = 0; 
         break; 
     } else {
         pthread_cond_wait(s_mem->checkable, s_mem->lock); 
     }
}

Will try tomorrow morning, just writing down so I don't forget

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.