JosiahParry commented on June 12, 2024

Here's my user take on parallelism:

One of the main reasons to use Rust is its "fearless concurrency." As a novice Rust user, I find this to be one of its biggest draws! Rayon and tokio are game changing. I personally use both rayon and tokio in tandem with extendr, but it requires a lot of hoop jumping.

There are three ways that I would use parallelism with extendr.

  1. Using Rust native types as a parallel iterator, collecting into a native Rust type, and converting into an R type
    • This is supported today.
  2. Using a Rust native type as a parallel iterator, collecting into an Robj / R type
    • This is not supported today.
  3. Using R native types as a parallel iterator and collecting into an Robj / R type
    • This is not supported today.
    • Example use case: I have a List of structs (Geoms in {rsgeo}) which I iterate over in parallel, calling a method on each one, for example x.unsigned_area(), and collecting the results into Doubles

Today I introduce parallelism by taking a List of structs, cloning them into a Rust-owned vector, iterating over it with Rayon, and collecting into a vector of non-extendr types. Then, in another iteration, I convert them into an extendr type. That adds 2 extra iterations over my vector just to use parallelism. In some cases this has made parallel iteration slower and less efficient than the single-threaded version. Example here.
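To make the three passes concrete, here is a minimal sketch of that workaround. It is not extendr code: Geom and RDouble are hypothetical stand-ins for the struct stored in the List and for the R-side result type, and std::thread::scope is swapped in for Rayon so the example has no dependencies.

```rust
use std::thread;

// Hypothetical stand-in for a struct stored in an R list (e.g. a Geom).
#[derive(Clone)]
struct Geom {
    width: f64,
    height: f64,
}

impl Geom {
    fn unsigned_area(&self) -> f64 {
        (self.width * self.height).abs()
    }
}

// Hypothetical stand-in for an R-side value (e.g. one element of Doubles).
#[derive(Debug, PartialEq)]
struct RDouble(f64);

fn areas_via_workaround(list: &[Geom], n_threads: usize) -> Vec<RDouble> {
    // Extra pass 1: clone everything into a Rust-owned vector.
    let owned: Vec<Geom> = list.to_vec();

    // Parallel pass: compute on worker threads, collecting plain Rust values.
    // Scoped threads over chunks play the role Rayon would play here.
    let n = n_threads.max(1);
    let chunk_len = ((owned.len() + n - 1) / n).max(1);
    let areas: Vec<f64> = thread::scope(|s| {
        let handles: Vec<_> = owned
            .chunks(chunk_len)
            .map(|chunk| {
                s.spawn(move || chunk.iter().map(Geom::unsigned_area).collect::<Vec<f64>>())
            })
            .collect();
        // Joining in spawn order keeps the results in input order.
        handles
            .into_iter()
            .flat_map(|h| h.join().unwrap())
            .collect()
    });

    // Extra pass 2: convert to the R-side type on the calling thread.
    areas.into_iter().map(RDouble).collect()
}
```

The clone at the start and the conversion at the end are both sequential, which is where the overhead comes from when the per-element work is cheap.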

I would like to be able to do at least 2. I envision this as having some Rust type that I can turn into a parallel iterator. A real example is here. Ideally I could take a Vec<Vec<u8>> and process it in parallel using rayon. In each iteration I create an Robj, which I would like to collect::<Vec<Robj>>(). After processing, I would turn this into a list with List::from_values(res_vec). However, this is not possible because even if the Robj has not seen R yet, it cannot be sent between threads. I'd like something like SendRobj or UnsafeRobj that I can create in parallel and collect on the main thread, so I can at least create Robjs in parallel.
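Roughly what I have in mind, as a sketch only. None of this is extendr API: FakeRobj is a hypothetical stand-in for Robj (I'm assuming the reason Robj can't cross threads is that it wraps a raw pointer, which makes a type !Send by default), and SendRobj is the hypothetical wrapper. Scoped std threads stand in for rayon again.

```rust
use std::thread;

// Hypothetical stand-in for Robj: the raw pointer field makes it !Send.
struct FakeRobj {
    ptr: *mut Vec<u8>,
}

impl FakeRobj {
    fn from_bytes(bytes: &[u8]) -> Self {
        FakeRobj { ptr: Box::into_raw(Box::new(bytes.to_vec())) }
    }

    fn len(&self) -> usize {
        // SAFETY: ptr came from Box::into_raw and is freed only in Drop.
        unsafe { (*self.ptr).len() }
    }
}

impl Drop for FakeRobj {
    fn drop(&mut self) {
        // SAFETY: ptr came from Box::into_raw and is dropped exactly once.
        unsafe { drop(Box::from_raw(self.ptr)) }
    }
}

// Hypothetical wrapper that opts in to crossing a thread boundary.
struct SendRobj(FakeRobj);

// SAFETY (assumption of this sketch): the value is uniquely owned and only
// touched by whichever thread currently owns it.
unsafe impl Send for SendRobj {}

fn build_in_parallel(inputs: &[Vec<u8>]) -> Vec<FakeRobj> {
    thread::scope(|s| {
        // One thread per input keeps the sketch short; a pool would be used in practice.
        let handles: Vec<_> = inputs
            .iter()
            .map(|bytes| s.spawn(move || SendRobj(FakeRobj::from_bytes(bytes))))
            .collect();
        // Unwrap on the calling thread, preserving input order.
        handles.into_iter().map(|h| h.join().unwrap().0).collect()
    })
}
```

Whether an unsafe impl Send like this would be sound for a real Robj depends on whether constructing one touches R's allocator, since R's API is not thread-safe. The sketch does not address that.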

I understand the difficulty of #3 and don't foresee it being feasible right away, nor would I need it. But I find 2 frustrating because the data (at least from my perspective) has not even seen R yet; it just has an Robj type.
