Comments (15)

bshoshany avatar bshoshany commented on May 10, 2024 1

You can use std::this_thread::get_id.

from thread-pool.

mcopik avatar mcopik commented on May 10, 2024 1

@bshoshany This is a wider issue than simply running in a specific thread. It's about accessing specific data from a thread.

For example, each task might require allocated data structures that shouldn't be shared across threads. On an N-thread pool, it's sufficient to allocate N copies and index them by thread ID. In our example, we had to use a large multi-grid iterative solver, and it would be very inefficient to reallocate it for every task. In the OpenMP tasking and threading model, we could simply use the omp_get_thread_num function to ensure that each thread is using one copy at a time. An alternative is to use static variables in the pool worker function, which is messy and quickly becomes complex.

Another example is aggregating results and statistics across tasks. A common pattern is to allocate a copy of partial results independently for each thread, allowing for a lock-free update. Then, final results can be computed after all tasks are finished. Otherwise, one has to use atomics and/or locks to update the single copy of results after each task has been processed.
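The per-thread partial-results pattern described above can be sketched as follows. This is a minimal illustration that uses plain std::thread rather than the thread pool, and the function name sum_with_partials is hypothetical; the point is only that a consecutive integer index lets each worker update its own slot without locks or atomics.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <thread>
#include <vector>

// Sums the integers in [0, n) using num_threads workers.
// Each worker writes only to its own slot in `partial`, so no locks or
// atomics are needed; the slots are combined after all workers have joined.
// Adjacent slots may share a cache line (false sharing); padding the slots
// would avoid that but is omitted to keep the sketch short.
std::uint64_t sum_with_partials(std::uint64_t n, std::size_t num_threads)
{
    assert(num_threads > 0);
    std::vector<std::uint64_t> partial(num_threads, 0);
    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (std::size_t id = 0; id < num_threads; ++id)
    {
        // `id` plays the role of the consecutive integer thread index.
        workers.emplace_back([&partial, id, n, num_threads] {
            for (std::uint64_t i = id; i < n; i += num_threads)
                partial[id] += i;
        });
    }
    for (std::thread& w : workers)
        w.join();
    return std::accumulate(partial.begin(), partial.end(), std::uint64_t{0});
}
```

With a thread pool, the same idea only works if each task can learn the index of the thread that runs it, which is what this issue asks for.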


bshoshany avatar bshoshany commented on May 10, 2024 1

@JonasLieber, in case it wasn't clear, I do appreciate your comments and suggestions!

The hashes are not mathematically guaranteed to be unique (that's true for most hash functions) but since the hashes are 64-bit integers, the probability of two out of a few dozen hashes being exactly the same is, for all intents and purposes, zero. However, if you want uniqueness to be mathematically guaranteed (I admit I would prefer that too) then I think the best solution is to do what @nguyenpham suggested.

Importantly, if I were to implement a function that provides an integer thread ID as you ask, keep in mind that the thread pool cannot add anything to std::this_thread, which is a namespace of the standard library rather than an object. So this will most likely be implemented with an std::unordered_map<std::thread::id, uint32_t> stored inside the thread_pool object itself, and your task will call something like pool.get_integer_id(std::this_thread::get_id()) to get an integer from this map. This still has some overhead (including the overhead of having to pass the pool object to each task).

I can't think of any use case where std::this_thread::get_id will not by itself provide the necessary functionality (after all that's exactly what it's meant to do!), but if you have such a use case in mind, you are more than welcome to provide a concrete code example.


nguyenpham avatar nguyenpham commented on May 10, 2024

Thanks. It works


mcopik avatar mcopik commented on May 10, 2024

@bshoshany This adds an unnecessary layer of indirection. For example, in my case, I need to store state between invocations on the same thread. I could simply use an array of objects, but that works only if thread IDs are consecutive integers in the range 0..numThreads-1.

One could support both f(args) and f(id, args) user functions with SFINAE.
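A minimal sketch of that SFINAE dispatch, assuming C++17 for std::is_invocable_v. The name invoke_task is hypothetical and is not part of the library; a worker that knows its own index would call it instead of calling the task directly.

```cpp
#include <cstdint>
#include <type_traits>
#include <utility>

// Selected when the task accepts the thread index as its first argument.
template <typename F, typename... Args>
auto invoke_task(std::uint32_t id, F&& f, Args&&... args)
    -> std::enable_if_t<std::is_invocable_v<F, std::uint32_t, Args...>>
{
    std::forward<F>(f)(id, std::forward<Args>(args)...);
}

// Selected when the task can only be called without the thread index.
template <typename F, typename... Args>
auto invoke_task(std::uint32_t /*id*/, F&& f, Args&&... args)
    -> std::enable_if_t<!std::is_invocable_v<F, std::uint32_t, Args...> &&
                        std::is_invocable_v<F, Args...>>
{
    std::forward<F>(f)(std::forward<Args>(args)...);
}
```

One caveat of this design: a task whose first parameter happens to be an integer is treated as wanting the index whenever the call with the index prepended is well-formed, so the two forms can be confused if the caller passes too few arguments.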


bshoshany avatar bshoshany commented on May 10, 2024

@mcopik when a thread in the thread pool finishes executing a task, it picks up the next available task from the queue. So any thread can pick up any task at any time. There's no way to guarantee that the same thread will execute different invocations of the same task when using a thread pool. If you need to run a specific task in a specific thread with a specific ID, then I don't think a thread pool is the correct framework to use. Instead, I suggest creating the threads yourself and feeding tasks into each thread manually.


JonasLieber avatar JonasLieber commented on May 10, 2024

I agree with @mcopik's last post. I have a similar use case. While it's possible to use std::this_thread::get_id, the std::thread::id class is a bit annoying since it cannot easily be converted to an integer; see here. According to R. Martinho Fernandes' answer, "The portable solution is to pass your own generated IDs into the thread."

Functionality like OpenMP's omp_get_thread_num would make it much easier to switch from OpenMP to thread-pool. It would be great if the function pointer submitted to parallelize_loop had a way to read (and only read) the iteration index t (lines 148-158).


bshoshany avatar bshoshany commented on May 10, 2024

@JonasLieber, I'm sorry, but I just don't see why the thread ID needs to be an integer. If what you want is to guarantee that no two threads access the same data by comparing their thread IDs, then you can just compare the objects of type std::thread::id that you get from std::this_thread::get_id, as they have operator== overloaded. And if you must get an integer for other reasons, then you can use std::hash as suggested in the Stack Overflow link you provided, which is fully portable and standards-compliant. I really don't see why something that isn't broken needs to be fixed.
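For reference, the std::hash approach mentioned above can be sketched like this. The helper name current_thread_hash is hypothetical; std::hash<std::thread::id> itself is provided by the standard <thread> header.

```cpp
#include <cstddef>
#include <functional>
#include <thread>

// Returns an integer derived from the calling thread's ID.
// Equal IDs always hash to the same value; distinct IDs are very unlikely,
// but not guaranteed, to hash to different values.
std::size_t current_thread_hash()
{
    return std::hash<std::thread::id>{}(std::this_thread::get_id());
}
```

The result is an arbitrary std::size_t, not a value in 0..numThreads-1, so it can serve as a key in a map but not directly as an index into an array of per-thread objects.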


nguyenpham avatar nguyenpham commented on May 10, 2024

I have used a map from each thread to a record. In the record we can store whatever we want, for example an automatically assigned ID or order number for that thread. It's that simple:

    std::unordered_map<std::thread::id, ThreadRecord> threadMap;
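A slightly fuller sketch of this idea, under the assumption that ThreadRecord only needs to hold an order number. The ThreadRegistry class and its member names are hypothetical; a mutex is needed because several pool threads may register themselves at the same time.

```cpp
#include <cstdint>
#include <mutex>
#include <thread>
#include <unordered_map>

// Hypothetical per-thread record; only the order number is stored here.
struct ThreadRecord
{
    std::uint32_t order = 0;
};

class ThreadRegistry
{
public:
    // Returns the record of the calling thread, creating it on first use.
    // Order numbers are assigned consecutively, starting from 0, in the
    // order in which threads first call this function.
    ThreadRecord& current()
    {
        const std::lock_guard<std::mutex> lock(mutex_);
        const auto [it, inserted] = threadMap_.try_emplace(std::this_thread::get_id());
        if (inserted)
            it->second.order = next_order_++;
        // References to unordered_map elements stay valid across rehashing.
        return it->second;
    }

private:
    std::mutex mutex_;
    std::unordered_map<std::thread::id, ThreadRecord> threadMap_;
    std::uint32_t next_order_ = 0;
};
```

Each call pays for a lock and a hash lookup, which is the overhead discussed elsewhere in this thread; a task can reduce it by calling current() once and keeping the returned reference.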


JonasLieber avatar JonasLieber commented on May 10, 2024

Thank you, @nguyenpham. As far as I can see, the drawbacks of this approach are

  • each user of the library has to implement the get_id function when it could be provided by the library
  • the (admittedly small) overhead of a lookup in the map
  • while the thread ids are unique, the hashes may not be (see Michael Goldshteyn's comment under 888's answer on the Stack Overflow post I linked before). I don't follow this, to be honest.

So I think it would be a nice feature for users, particularly those used to similar features in OpenMP. But it's not strictly necessary.


JonasLieber avatar JonasLieber commented on May 10, 2024

@bshoshany : Thanks for your response and your work on the thread-pool. I'm not suggesting that anything is broken. I simply think that it would be a useful feature. Marcin and nguyenpham might agree.


mcopik avatar mcopik commented on May 10, 2024

@bshoshany I also didn't try to imply that the library is incomplete or broken. I think that it's a nice feature for users that need to access data that shouldn't be reallocated between tasks, and I think it's both simpler and more efficient than a hash table (which we use currently).

Perhaps my understanding of your code was incorrect, but I thought that you already assign consecutive integers to threads in the create_threads function. The i counter can be passed to the worker function, and this function invokes user-defined tasks. Thus, each thread has a unique integer that you use when storing thread instances in an array, and it doesn't change as long as the worker function doesn't terminate. If this is correct, then there's no need to pass additional objects and perform lookups: just propagate the id to the user task invocation there, and the packaged task can accept the id as well.


bshoshany avatar bshoshany commented on May 10, 2024

@mcopik I didn't think you tried to imply that, no worries!

The worker just calls task() without any arguments. This is necessary because std::function<void()> is a function wrapper that takes no arguments. Therefore, the task being pushed must have no arguments as well. To do what you suggest, I would have to change the queue to std::queue<std::function<void(uint32_t)>> or similar, so I can pass the integer ID to the packaged task, and in addition, the user would have to add an additional argument that accepts the integer ID to every task they push into the pool. I think that would complicate things unnecessarily, especially given that most users will not need the integer ID passed to the tasks.
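To make the trade-off concrete, here is a minimal sketch of a pool whose queue holds std::function<void(std::uint32_t)> and whose workers pass their own index to every task. It is a deliberately stripped-down illustration, not the library's thread_pool class; the class name indexed_pool and all member names are hypothetical.

```cpp
#include <condition_variable>
#include <cstdint>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class indexed_pool
{
public:
    explicit indexed_pool(std::uint32_t thread_count)
    {
        // Worker i keeps the index i for its whole lifetime.
        for (std::uint32_t i = 0; i < thread_count; ++i)
            threads_.emplace_back([this, i] { worker(i); });
    }

    // Lets the workers drain the queue, then joins them.
    ~indexed_pool()
    {
        {
            const std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (std::thread& t : threads_)
            t.join();
    }

    indexed_pool(const indexed_pool&) = delete;
    indexed_pool& operator=(const indexed_pool&) = delete;

    // Every task must accept the index of the worker that runs it.
    void push_task(std::function<void(std::uint32_t)> task)
    {
        {
            const std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }

private:
    void worker(std::uint32_t index)
    {
        for (;;)
        {
            std::function<void(std::uint32_t)> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty())
                    return; // stopping and nothing left to run
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task(index); // the index is passed to the user's task here
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void(std::uint32_t)>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> threads_;
};
```

As the comment above points out, the cost of this design is that every task pushed into such a pool has to take the extra parameter, whether it needs the index or not.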

What I can do, if you want, is write a separate helper class enumerate_pool that stores an std::unordered_map<std::thread::id, uint32_t> mapping thread IDs to consecutive integers (the same ones that create_threads() uses). You would construct an object using enumerate_pool(pool), where pool is the thread_pool object, and then call that object with std::this_thread::get_id() from within a task to get the integer ID. This way the functionality you want will be added for the users who need it, but will not complicate the main thread_pool class.


roussec avatar roussec commented on May 10, 2024

I am having problems understanding this. I want to reuse certain information per thread. I can use std::thread::id to identify the threads. However, there are more thread IDs than threads in the pool, e.g. 37 different thread IDs while thread_count is only 20. What am I missing? Thank you


bshoshany avatar bshoshany commented on May 10, 2024

@roussec that shouldn't happen. There should be exactly as many unique thread IDs as the pool's thread count. In fact, this is tested directly by the test program BS_thread_pool_test.cpp. The function count_unique_threads() detects how many unique thread IDs there are in the pool and uses that to count the number of threads and verify independently that the correct number of threads was created. Since the test always passes, this proves that the number of unique IDs always matches the thread count.
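The idea behind such a check can be sketched with plain std::thread as follows. This is not the code from BS_thread_pool_test.cpp; the function name count_unique_ids is hypothetical, and the sketch only shows how distinct IDs can be counted with a std::set.

```cpp
#include <cstddef>
#include <mutex>
#include <set>
#include <thread>
#include <vector>

// Launches num_threads threads, records the ID each one reports, and
// returns how many distinct IDs were seen.
std::size_t count_unique_ids(std::size_t num_threads)
{
    std::set<std::thread::id> ids;
    std::mutex mutex;
    std::vector<std::thread> threads;
    threads.reserve(num_threads);
    for (std::size_t i = 0; i < num_threads; ++i)
    {
        threads.emplace_back([&ids, &mutex] {
            // The set is shared, so insertions are serialized.
            const std::lock_guard<std::mutex> lock(mutex);
            ids.insert(std::this_thread::get_id());
        });
    }
    // All threads are joined only after every one has been created, so no
    // ID can be reused while the IDs are being collected.
    for (std::thread& t : threads)
        t.join();
    return ids.size();
}
```

Note that the standard allows the ID of a thread that has finished and been joined to be reused, so IDs recorded over a longer period than the lifetime of one set of threads should be compared with care.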
