
Comments (3)

cboettig commented on August 19, 2024


HenrikBengtsson commented on August 19, 2024

Hello.

Yes, one test does use future.apply to check parallel reads, but doesn’t set the number of cores.

It could be that testthat somehow circumvents parallelly::availableCores()'s detection of R CMD check; I created HenrikBengtsson/parallelly#109 to investigate whether that is the case.
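For illustration, a minimal sketch (not taken from the arkdb test suite; the plan type, worker count, and test body are assumptions) of how such a test could pin the number of workers explicitly instead of inheriting the machine's core count:

```r
library(future.apply)

# Explicitly cap the number of parallel workers rather than relying on what
# the machine reports; two is the usual ceiling under R CMD check.
future::plan(future::multisession, workers = 2)

# Hypothetical stand-in for the parallel-read test body.
results <- future_lapply(1:4, function(i) i^2)

# Restore sequential processing afterwards.
future::plan(future::sequential)
```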

Some tests may be using duckdb now, which is threaded by default.

Ah, that could be it. From the htop screenshot, the child processes look like forked processes (e.g. mclapply, "multicore"), but it could be that threads appear like that too. I'm quite sure it's not background R workers (e.g. PSOCK, "multisession"), because they'd have another set of command-line options.
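For reference, a minimal sketch (assuming the DBI and duckdb packages; the thread count is an arbitrary example) of capping DuckDB's thread pool, which otherwise sizes itself to the detected cores:

```r
library(DBI)

# Open an in-memory DuckDB connection.
con <- dbConnect(duckdb::duckdb())

# Limit DuckDB's internal thread pool for this connection.
dbExecute(con, "PRAGMA threads=2;")

# ... run queries ...

dbDisconnect(con, shutdown = TRUE)
```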

Do you have a writeup on your rant?

Definitely; https://www.jottr.org/2022/12/05/avoid-detectcores/
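Roughly, the post's recommendation boils down to the following contrast (a hedged sketch, not a quote from the post):

```r
# Reports the hardware core count regardless of context (cgroups, scheduler
# allocations, R CMD check), and can even return NA on some platforms.
parallel::detectCores()

# Also honours settings such as mc.cores, HPC scheduler environment
# variables, and the core limit imposed during R CMD check, falling back
# to 1 when nothing can be determined.
parallelly::availableCores()
```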

Are you familiar with 'nice' or containers / Linux jails to control CPU use? Usually I find RAM is the issue on shared machines…

Yes; technically, I can protect others against this, and eventually our system will get cgroups2 to protect others against it too. But that would only protect others from overuse; I would still be overusing, resulting in lots of context switching, and more so the more CPU cores there are. And, in the bigger picture, this won't help anyone else out there. My blog post gives most of my arguments.


cboettig commented on August 19, 2024

Thanks, I agree with all your points that are specific to detectCores() -- as you can see, we're not setting that. (There are also cases where a user might want to set this higher than the number of physical cores, e.g. when I/O is the limiting factor.) I'm less clear how your take applies to other tools, specifically lower-level libraries that are threaded by default (duckdb, arrow, GDAL, OpenBLAS, lots of ML libraries, etc.). Most of these are not affected by the number of connections. I agree that connections in R can be a limited resource, but I don't quite understand the case that CPU use itself is limiting... Anyway, we are probably on the same page; I'm just a bit unclear how you feel about default-threaded C-level libraries like duckdb. To me, these seem like reasonable defaults that pose no real risk.
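For completeness, a hedged sketch of how a user who does want to rein in those default-threaded libraries can do so from R (the packages and thread counts are illustrative, and the DuckDB PRAGMA shown earlier covers that case):

```r
# Arrow: cap the compute thread pool.
arrow::set_cpu_count(2)

# data.table: cap its OpenMP threads.
data.table::setDTthreads(2)

# OpenBLAS and other BLAS backends: cap BLAS threads.
RhpcBLASctl::blas_set_num_threads(2)

# Many OpenMP-based libraries respect this variable if it is set before
# they spin up their thread pools.
Sys.setenv(OMP_NUM_THREADS = "2")
```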

anyway, thanks for opening an issue and for all you do for the R community!

