Comments (15)
You can use std::this_thread::get_id().
from thread-pool.
@bshoshany This is a wider issue than simply running in a specific thread. It's about accessing specific data from a thread.
For example, each task might require allocated data structures that shouldn't be shared across threads. On an N-thread pool, it's sufficient to allocate N copies and index them by thread ID. In our case, we had to use a large multigrid iterative solver, and it would be very inefficient to reallocate it for every task. In the OpenMP tasking and threading model, we could simply use the omp_get_thread_num function to ensure that each thread uses one copy at a time. An alternative is to use static variables in the pool worker function, which is messy and quickly becomes complex.
Another example is aggregating results and statistics across tasks. A common pattern is to allocate a copy of partial results independently for each thread, allowing for a lock-free update. Then, final results can be computed after all tasks are finished. Otherwise, one has to use atomics and/or locks to update the single copy of results after each task has been processed.
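The per-thread partial results pattern described above can be sketched as follows. This is a minimal illustration that uses plain std::thread rather than the thread-pool library, because the point under discussion is that the library does not hand tasks an integer index. The function name sum_with_per_thread_slots and the strided partitioning are assumptions made for the example.

```cpp
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Sum `values` using `num_threads` workers. Each worker receives its own
// index (0..num_threads-1) and writes only to its own slot in `partial`,
// so no locks or atomics are needed while the workers run.
long long sum_with_per_thread_slots(const std::vector<int>& values, std::size_t num_threads)
{
    std::vector<long long> partial(num_threads, 0);
    std::vector<std::thread> workers;
    for (std::size_t id = 0; id < num_threads; ++id)
    {
        workers.emplace_back([&values, &partial, id, num_threads] {
            // Strided partition: worker `id` handles elements id, id + n, id + 2n, ...
            for (std::size_t i = id; i < values.size(); i += num_threads)
                partial[id] += values[i];
        });
    }
    for (std::thread& w : workers)
        w.join();
    // The final reduction runs on the calling thread after all workers have finished.
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

Adjacent slots in partial may share a cache line, so a production version would probably pad or align each slot to avoid false sharing.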
from thread-pool.
@JonasLieber, in case it wasn't clear, I do appreciate your comments and suggestions!
The hashes are not mathematically guaranteed to be unique (that's true for most hash functions) but since the hashes are 64-bit integers, the probability of two out of a few dozen hashes being exactly the same is, for all intents and purposes, zero. However, if you want uniqueness to be mathematically guaranteed (I admit I would prefer that too) then I think the best solution is to do what @nguyenpham suggested.
Importantly, if I were to implement a function that provides an integer thread ID as you ask, keep in mind that the thread pool object doesn't have control of the std::this_thread namespace, so this would most likely be implemented with an std::unordered_map<std::thread::id, uint32_t> stored inside the thread_pool object itself, and your task would call something like pool.get_integer_id(std::this_thread::get_id()) to get an integer from this map. So this still has some overhead (including the overhead of having to pass the pool object to each task).
I can't think of any use case where std::this_thread::get_id will not by itself provide the necessary functionality (after all, that's exactly what it's meant to do!), but if you have such a use case in mind, you are more than welcome to provide a concrete code example.
from thread-pool.
Thanks. It works
from thread-pool.
@bshoshany This adds an unnecessary layer of indirection. For example, in my case, I need to store state between invocations on the same thread. I could simply use an array of objects, but that works only if thread IDs are consecutive integers in the range 0..numThreads-1.
One could support both f(args) and f(id, args) user functions with SFINAE.
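A rough sketch of that SFINAE idea, assuming C++17: two overloads of a hypothetical helper invoke_task (not part of the library) are enabled or disabled depending on whether the callable accepts a leading integer ID.

```cpp
#include <cstdint>
#include <functional>
#include <type_traits>
#include <utility>

// Selected when f(id, args...) is a valid call: the ID is passed through.
template <typename F, typename... Args>
std::enable_if_t<std::is_invocable_v<F, std::uint32_t, Args...>,
                 std::invoke_result_t<F, std::uint32_t, Args...>>
invoke_task(std::uint32_t id, F&& f, Args&&... args)
{
    return std::invoke(std::forward<F>(f), id, std::forward<Args>(args)...);
}

// Selected otherwise: the ID is dropped and f(args...) is called.
template <typename F, typename... Args>
std::enable_if_t<!std::is_invocable_v<F, std::uint32_t, Args...>,
                 std::invoke_result_t<F, Args...>>
invoke_task(std::uint32_t /*id*/, F&& f, Args&&... args)
{
    return std::invoke(std::forward<F>(f), std::forward<Args>(args)...);
}
```

With this dispatch, a callable that accepts both forms resolves to the f(id, args) form, which may or may not be what the user intends.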
from thread-pool.
@mcopik when a thread in the thread pool finishes executing a task, it picks up the next available task from the queue. So any thread can pick up any task at any time. There's no way to guarantee that the same thread will execute different invocations of the same task when using a thread pool. If you need to run a specific task in a specific thread with a specific ID, then I don't think a thread pool is the correct framework to use. Instead, I suggest creating the threads yourself and feeding tasks into each thread manually.
from thread-pool.
I agree with @mcopik's last post. I have a similar use case. While it's possible to use std::this_thread::get_id, the std::thread::id class is a bit annoying since it cannot be easily converted to an integer, see here. According to R. Martinho Fernandes' answer, "The portable solution is to pass your own generated IDs into the thread."
Functionality like OpenMP's omp_get_thread_num would make it much easier to switch from OpenMP to thread-pool. It would be great if the function pointer submitted to parallelize_loop had a way to read (and only read) the iteration index t (lines 148-158).
from thread-pool.
@JonasLieber, I'm sorry, but I just don't see why the thread ID needs to be an integer. If what you want is to guarantee that no two threads access the same data by comparing their thread IDs, then you can just compare the objects of type std::thread::id that you get from std::this_thread::get_id, as they have operator== overloaded. And if you must get an integer for other reasons, then you can use std::hash as suggested in the StackOverflow link you provided, which is fully portable and standards-compliant. I really don't see why something that isn't broken needs to be fixed.
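For reference, the two standard facilities mentioned here look roughly like this. The helper names is_owner and current_thread_hash are made up for the illustration.

```cpp
#include <cstddef>
#include <functional>
#include <thread>

// True if the calling thread is the thread identified by `owner`.
bool is_owner(std::thread::id owner)
{
    return std::this_thread::get_id() == owner;
}

// Hash of the calling thread's ID. Equal IDs always hash equally, but
// distinct IDs are not guaranteed to produce distinct hashes.
std::size_t current_thread_hash()
{
    return std::hash<std::thread::id>{}(std::this_thread::get_id());
}
```

The hash is an arbitrary std::size_t, so it cannot be used directly as an index into an array of N per-thread objects.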
from thread-pool.
I have used a map from each thread to a record. In the record we can store whatever we want, say, an automatically assigned number that serves as an ID or order number for that thread. It's that simple:
std::unordered_map<std::thread::id, ThreadRecord> threadMap;
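One way such a map could be filled in is sketched below. The fields of ThreadRecord are not shown in the comment, so the single index field is an assumption, as are the ThreadRegistry wrapper and the choice to assign numbers lazily on first lookup under a mutex.

```cpp
#include <cstdint>
#include <mutex>
#include <thread>
#include <unordered_map>

// Hypothetical per-thread record; only an auto-assigned number is stored here.
struct ThreadRecord
{
    std::uint32_t index; // consecutive number assigned on first lookup
};

class ThreadRegistry
{
public:
    // Returns the record of the calling thread, creating it on first use.
    ThreadRecord& current()
    {
        const std::lock_guard<std::mutex> lock(mutex_);
        const auto [it, inserted] =
            threadMap_.try_emplace(std::this_thread::get_id(), ThreadRecord{next_index_});
        if (inserted)
            ++next_index_;
        // References to unordered_map elements stay valid when other keys are inserted.
        return it->second;
    }

private:
    std::mutex mutex_;
    std::unordered_map<std::thread::id, ThreadRecord> threadMap_;
    std::uint32_t next_index_ = 0;
};
```

The numbers follow the order in which threads first call current(), not the order in which the threads were created.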
from thread-pool.
Thank you, @nguyenpham. As far as I can see, the drawbacks of this approach are:
- each user of the library has to implement the get_id function when it could be provided by the library
- the (admittedly small) overhead of a lookup in the map
- while the thread IDs are unique, the hashes may not be (see Michael Goldshteyn's comment under 888's answer on the Stack Overflow post I linked before). I don't follow this, to be honest.
So I think it would be a nice feature for the user, particularly those used to similar features in OpenMP. But it's not strictly necessary.
from thread-pool.
@bshoshany: Thanks for your response and your work on the thread-pool. I'm not suggesting that anything is broken. I simply think that it would be a useful feature. Marcin and nguyenpham might agree.
from thread-pool.
@bshoshany I also didn't try to imply that the library is incomplete or broken. I think that it's a nice feature for users that need to access data that shouldn't be reallocated between tasks, and I think it's both simpler and more efficient than a hash table (which we use currently).
Perhaps my understanding of your code was incorrect, but I thought that you already assign consecutive integers to threads in the create_threads function. The i counter can be passed to the worker function, which invokes the user-defined tasks. Thus, each thread has a unique integer, the one you use when storing thread instances in an array, and it doesn't change as long as the worker function doesn't terminate. If this is correct, then there's no need to pass additional objects or perform lookups: just propagate the ID to the user task invocation there, and the packaged task can accept the ID as well.
from thread-pool.
@mcopik I didn't think you tried to imply that, no worries!
The worker just calls task() without any arguments. This is necessary because std::function<void()> is a function wrapper that takes no arguments. Therefore, the task being pushed must have no arguments as well. To do what you suggest, I would have to change the queue to std::queue<std::function<void(uint32_t)>> or similar, so I can pass the integer ID to the packaged task, and in addition, the user would have to add an argument that accepts the integer ID to every task they push into the pool. I think that would complicate things unnecessarily, especially given that most users will not need the integer ID passed to the tasks.
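To make the trade-off concrete, here is a minimal sketch of a pool whose queue holds std::function<void(std::uint32_t)> and whose worker passes its own index to every task. The class indexed_pool is a toy written for this illustration and does not reflect the library's actual implementation.

```cpp
#include <condition_variable>
#include <cstdint>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class indexed_pool
{
public:
    explicit indexed_pool(std::uint32_t thread_count)
    {
        // Worker i keeps the index i for its whole lifetime.
        for (std::uint32_t i = 0; i < thread_count; ++i)
            threads_.emplace_back([this, i] { worker(i); });
    }

    // Drains the remaining tasks, then joins all workers.
    ~indexed_pool()
    {
        {
            const std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (std::thread& t : threads_)
            t.join();
    }

    // Every task must accept the index of the worker that runs it.
    void push(std::function<void(std::uint32_t)> task)
    {
        {
            const std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }

private:
    void worker(std::uint32_t index)
    {
        for (;;)
        {
            std::function<void(std::uint32_t)> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty())
                    return; // stopping and nothing left to run
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task(index); // run outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void(std::uint32_t)>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> threads_;
};
```

As the comment notes, the cost of this design is that every pushed task has to take the index parameter, whether it uses it or not.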
What I can do, if you want, is write a separate helper class enumerate_pool that stores an std::unordered_map<std::thread::id, uint32_t> from thread IDs to consecutive integers (the same ones that create_threads() uses). You would construct an object using enumerate_pool(pool), where pool is the thread_pool object, and then you could call enumerate_pool(std::this_thread::get_id()) from within a task to get the integer ID. This way the functionality you want will be added for the users who need it, but will not complicate the main thread_pool class.
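A helper along those lines might look like the sketch below. Since the thread does not say how the helper would obtain the pool's thread IDs, this version takes them as a plain vector; the class name thread_enumerator and that constructor are assumptions, not the proposed enumerate_pool interface.

```cpp
#include <cstdint>
#include <stdexcept>
#include <thread>
#include <unordered_map>
#include <vector>

class thread_enumerator
{
public:
    // Maps ids[0] to 0, ids[1] to 1, and so on.
    explicit thread_enumerator(const std::vector<std::thread::id>& ids)
    {
        for (std::uint32_t i = 0; i < ids.size(); ++i)
            index_.emplace(ids[i], i);
    }

    // Integer ID of the calling thread. Throws std::out_of_range if the
    // calling thread was not in the list given to the constructor.
    std::uint32_t operator()() const
    {
        return index_.at(std::this_thread::get_id());
    }

private:
    // Never modified after construction, so concurrent lookups need no lock.
    std::unordered_map<std::thread::id, std::uint32_t> index_;
};
```

Each lookup still costs one hash-table access, which is the overhead raised earlier in the thread.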
from thread-pool.
I am having problems understanding this. I want to reuse certain information per thread. I can use std::thread::id to identify the threads. However, there are more thread IDs than threads in the pool. For example, I see 37 different thread IDs, but thread_count is only 20. What am I missing? Thank you
from thread-pool.
@roussec that shouldn't happen. There should be exactly as many unique thread IDs as the pool's thread count. In fact, this is tested directly by the test program BS_thread_pool_test.cpp. The function count_unique_threads() detects how many unique thread IDs there are in the pool and uses that to count the number of threads and verify independently that the correct number of threads was created. Since the test always passes, this proves that the number of unique IDs always matches the thread count.
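The general idea of counting unique thread IDs can be sketched with plain std::thread as below. This is not the library's count_unique_threads(); the function count_unique_ids is a stand-in that only illustrates collecting IDs into a set.

```cpp
#include <cstddef>
#include <mutex>
#include <set>
#include <thread>
#include <vector>

// Start `num_threads` threads, let each record its own ID, and return
// the number of distinct IDs that were seen.
std::size_t count_unique_ids(std::size_t num_threads)
{
    std::set<std::thread::id> seen;
    std::mutex mutex;
    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < num_threads; ++i)
    {
        threads.emplace_back([&seen, &mutex] {
            const std::lock_guard<std::mutex> lock(mutex);
            seen.insert(std::this_thread::get_id());
        });
    }
    // No thread is joined until all have been created, so no ID can be reused.
    for (std::thread& t : threads)
        t.join();
    return seen.size();
}
```

An ID may be reused once its thread has been joined, so a program that creates and destroys threads (or pools) repeatedly can observe more distinct IDs over time than there are threads alive at any one moment.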
from thread-pool.