I divide data by the number of threads. Each thread will update its divided data. All

Is there anyway to get and pass thread IDs or pointers to tasks? about thread-pool HOT 15 CLOSED

nguyenpham commented on May 10, 2024

Is there any way to get and pass thread IDs or pointers to tasks?

from thread-pool.

Comments (15)

bshoshany commented on May 10, 2024 1

You can use std::this_thread::get_id.
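As a minimal sketch of that suggestion, the snippet below has each thread record its own `std::thread::id`. It uses plain `std::thread` objects rather than the pool, and `collect_thread_ids` is a hypothetical helper name introduced only for illustration.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: launch n threads and let each one store the value
// returned by std::this_thread::get_id() in its own slot.
std::vector<std::thread::id> collect_thread_ids(std::size_t n)
{
    std::vector<std::thread::id> ids(n);
    std::vector<std::thread> threads;
    threads.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        threads.emplace_back([&ids, i] { ids[i] = std::this_thread::get_id(); });
    // All threads stay joinable until this loop, so their IDs are distinct.
    for (std::thread& t : threads)
        t.join();
    return ids;
}
```

The returned IDs can be compared with `==` and `!=`, but they are opaque objects rather than indices in the range 0..n-1.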

mcopik commented on May 10, 2024 1

@bshoshany This is a wider issue than simply running in a specific thread. It's about accessing specific data from a thread.

For example, each task might require using allocated data structures that shouldn't be shared across threads. On an N-thread pool, it's sufficient to allocate N copies and index them by thread ID. In our example, we had to use a large multi-grid iterative solver, and it would be very inefficient to reallocate it for every task. In the OpenMP tasking and threading model, we could simply use the omp_get_thread_num function to ensure that each thread is using one copy at a time. An alternative is using static variables in the pool worker function, which is messy and quickly becomes complex.

Another example is aggregating results and statistics across tasks. A common pattern is to allocate a copy of partial results independently for each thread, allowing for a lock-free update. Then, final results can be computed after all tasks are finished. Otherwise, one has to use atomics and/or locks to update the single copy of results after each task has been processed.
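The aggregation pattern described above could look roughly like the following sketch. It assumes each thread knows a dense index `t` in 0..num_threads-1 (which is exactly what this issue asks for) and uses plain `std::thread` objects instead of the pool; `parallel_sum` is a hypothetical name, and `num_threads` is assumed to be greater than zero.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical example: each thread accumulates into its own slot, so no
// locks or atomics are needed while the threads are running.
long long parallel_sum(const std::vector<int>& data, std::size_t num_threads)
{
    std::vector<long long> partial(num_threads, 0); // one slot per thread
    std::vector<std::thread> threads;
    threads.reserve(num_threads);
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;
    for (std::size_t t = 0; t < num_threads; ++t)
    {
        threads.emplace_back([&data, &partial, chunk, t] {
            const std::size_t begin = std::min(t * chunk, data.size());
            const std::size_t end = std::min(begin + chunk, data.size());
            for (std::size_t i = begin; i < end; ++i)
                partial[t] += data[i]; // only thread t writes partial[t]
        });
    }
    for (std::thread& th : threads)
        th.join();
    // Combine the per-thread results once every thread has finished.
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

Neighbouring slots of `partial` may share a cache line, so a performance-sensitive version would probably pad each slot; the sketch leaves that out for brevity.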

bshoshany commented on May 10, 2024 1

@JonasLieber, in case it wasn't clear, I do appreciate your comments and suggestions!

The hashes are not mathematically guaranteed to be unique (that's true for most hash functions) but since the hashes are 64-bit integers, the probability of two out of a few dozen hashes being exactly the same is, for all intents and purposes, zero. However, if you want uniqueness to be mathematically guaranteed (I admit I would prefer that too) then I think the best solution is to do what @nguyenpham suggested.

Importantly, if I were to implement a function that provides an integer thread ID as you ask, keep in mind that the thread pool object doesn't have control of the std::this_thread object, so this will most likely be implemented with an std::unordered_map<std::thread::id, uint32_t> stored inside the thread_pool object itself, and your task will call something like pool.get_integer_id(std::this_thread) to get an integer from this map. So this still has some overhead (including also the overhead of having to pass the pool object to each task).

I can't think of any use case where std::this_thread::get_id will not by itself provide the necessary functionality (after all that's exactly what it's meant to do!), but if you have such a use case in mind, you are more than welcome to provide a concrete code example.

nguyenpham commented on May 10, 2024

Thanks. It works

mcopik commented on May 10, 2024

@bshoshany This adds an unnecessary layer of indirection. For example, in my case, I need to store state between invocations on the same thread. I could simply use an array of objects, but that works only if thread IDs run from 0 to numThreads-1.

One could support both f(args) and f(id, args) user functions with SFINAE.
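A rough sketch of that idea, assuming C++17: two overloads of a hypothetical `invoke_task` helper, selected by SFINAE through `std::enable_if_t` and `std::is_invocable_v`. Neither the name nor the signature comes from the library.

```cpp
#include <cstdint>
#include <functional>
#include <type_traits>
#include <utility>

// Selected when the task can be called as f(id, args...).
template <typename F, typename... Args>
auto invoke_task(std::uint32_t id, F&& f, Args&&... args)
    -> std::enable_if_t<std::is_invocable_v<F, std::uint32_t, Args...>>
{
    std::invoke(std::forward<F>(f), id, std::forward<Args>(args)...);
}

// Selected when the task can only be called as f(args...); the id is dropped.
template <typename F, typename... Args>
auto invoke_task(std::uint32_t, F&& f, Args&&... args)
    -> std::enable_if_t<!std::is_invocable_v<F, std::uint32_t, Args...> &&
                        std::is_invocable_v<F, Args...>>
{
    std::invoke(std::forward<F>(f), std::forward<Args>(args)...);
}
```

One caveat of this dispatch: a task whose first parameter happens to accept an integer is treated as the f(id, args) form whenever the argument count matches, so the two forms cannot always be told apart.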

bshoshany commented on May 10, 2024

@mcopik when a thread in the thread pool finishes executing a task, it picks up the next available task from the queue. So any thread can pick up any task at any time. There's no way to guarantee that the same thread will execute different invocations of the same task when using a thread pool. If you need to run a specific task in a specific thread with a specific ID, then I don't think a thread pool is the correct framework to use. Instead, I suggest creating the threads yourself and feeding tasks into each thread manually.

JonasLieber commented on May 10, 2024

I agree with @mcopik's last post. I have a similar use case. While it's possible to use std::this_thread::get_id, the std::thread::id class is a bit annoying since it cannot be converted to an integer easily, see here. According to R. Martinho Fernandes' answer, "The portable solution is to pass your own generated IDs into the thread."

Functionality like get_thread_num would make it much easier to switch from OpenMP to thread-pool. It would be great if the function pointer submitted to parallelize_loop had a way to read (and only read) the iteration index t (lines 148-158).

bshoshany commented on May 10, 2024

@JonasLieber, I'm sorry, but I just don't see why the thread ID needs to be an integer. If what you want is to guarantee that no two threads access the same data by comparing their thread IDs, then you can just compare the objects of type std::thread::id that you get from std::this_thread::get_id, as they have operator== overloaded. And if you must get an integer for other reasons, then you can use std::hash as suggested in the StackOverflow link you provided, which is fully portable and standards-compliant. I really don't see why something that isn't broken needs to be fixed.
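For reference, the `std::hash` route mentioned above can be sketched as follows; `current_thread_hash` is a hypothetical wrapper name.

```cpp
#include <cstddef>
#include <functional>
#include <thread>

// Hypothetical wrapper: hash the calling thread's ID into an integer.
// The value is stable for a given thread, but it is not a dense index
// in 0..N-1, so it cannot be used directly as an array subscript.
std::size_t current_thread_hash()
{
    return std::hash<std::thread::id>{}(std::this_thread::get_id());
}
```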

nguyenpham commented on May 10, 2024

I have used a map from each thread's ID to a record. In the record, we can store whatever we want, say, an automatically assigned number that serves as an ID or order number for that thread. It's that simple:

    std::unordered_map<std::thread::id, ThreadRecord> threadMap;
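One way this map could be fleshed out is sketched below. The contents of `ThreadRecord`, the `ThreadRegistry` wrapper, and the mutex are all assumptions added for illustration; the original comment only shows the map declaration.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <thread>
#include <unordered_map>

// Assumed record layout: just an automatically assigned index.
struct ThreadRecord
{
    std::uint32_t index = 0;
};

// Hypothetical wrapper around the map from the comment above.
class ThreadRegistry
{
public:
    // Returns the calling thread's record, creating it on first use.
    // References to unordered_map elements stay valid after rehashing.
    ThreadRecord& current()
    {
        const std::lock_guard<std::mutex> lock(mutex_);
        const auto [it, inserted] = threadMap_.try_emplace(std::this_thread::get_id());
        if (inserted)
            it->second.index = next_index_++; // indices are handed out in order of first call
        return it->second;
    }

    std::size_t size()
    {
        const std::lock_guard<std::mutex> lock(mutex_);
        return threadMap_.size();
    }

private:
    std::mutex mutex_;
    std::unordered_map<std::thread::id, ThreadRecord> threadMap_;
    std::uint32_t next_index_ = 0;
};
```

The lock is taken on every lookup, which is the overhead discussed elsewhere in this thread. Also note that the standard allows the ID of a finished thread to be reused, so the registry is only meaningful while the same set of threads is alive.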

JonasLieber commented on May 10, 2024

Thank you, @nguyenpham. As far as I can see, the drawbacks of this approach are

each user of the library has to implement the get_id function when it could be provided by the library
the (admittedly small) overhead of a lookup in the map
while the thread IDs are unique, the hashes may not be (see Michael Goldshteyn's comment under 888's answer on the Stack Overflow post I linked before). I don't follow this, to be honest.

So I think it would be a nice feature for users, particularly those used to similar features in OpenMP. But it's not strictly necessary.

JonasLieber commented on May 10, 2024

@bshoshany : Thanks for your response and your work on the thread-pool. I'm not suggesting that anything is broken. I simply think that it would be a useful feature. Marcin and nguyenpham might agree.

mcopik commented on May 10, 2024

@bshoshany I also didn't try to imply that the library is incomplete or broken. I think that it's a nice feature for users that need to access data that shouldn't be reallocated between tasks, and I think it's both simpler and more efficient than a hash table (which we use currently).

Perhaps my understanding of your code was incorrect, but I thought that you already assign consecutive integers to threads in the create_threads function. The i counter can be passed to worker function, and this function invokes user-defined tasks. Thus, each thread has a unique integer that you use when storing thread instances in an array, and it doesn't change as long as the worker function doesn't terminate. If this is correct, then there's no need to pass additional objects and perform lookups - just propagate the id to user task invocation there, and the packed task can accept the id as well.

bshoshany commented on May 10, 2024

@mcopik I didn't think you tried to imply that, no worries!

The worker just calls task() without any arguments. This is necessary because std::function<void()> is a function wrapper that takes no arguments. Therefore, the task being pushed must have no arguments as well. To do what you suggest, I would have to change the queue to std::queue<std::function<void(uint32_t)>> or similar, so I can pass the integer ID to the packaged task, and in addition, the user would have to add an additional argument that accepts the integer ID to every task they push into the pool. I think that would complicate things unnecessarily, especially given that most users will not need the integer ID passed to the tasks.
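To make the trade-off concrete, here is a toy pool whose queue holds `std::function<void(std::uint32_t)>`, so each worker passes its own index to every task it runs. This is only a sketch under that assumption: `indexed_pool` and its members are hypothetical names and do not reflect the library's actual implementation.

```cpp
#include <condition_variable>
#include <cstdint>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical toy pool, not the library's thread_pool class.
class indexed_pool
{
public:
    explicit indexed_pool(std::uint32_t thread_count)
    {
        for (std::uint32_t i = 0; i < thread_count; ++i)
            threads_.emplace_back([this, i] { worker(i); });
    }

    // Runs any tasks still queued, then joins all workers.
    ~indexed_pool()
    {
        {
            const std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (std::thread& t : threads_)
            t.join();
    }

    // Every task must accept the worker index, even if it ignores it.
    void push(std::function<void(std::uint32_t)> task)
    {
        {
            const std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }

private:
    void worker(std::uint32_t index)
    {
        for (;;)
        {
            std::function<void(std::uint32_t)> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty())
                    return; // stopping and nothing left to run
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task(index); // the worker's index is handed to the task
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void(std::uint32_t)>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> threads_;
};
```

A task such as `[&hits](std::uint32_t idx) { ++hits[idx]; }` can then update a per-worker slot without locking, at the cost described above: every task pushed into the pool has to declare the extra parameter.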

What I can do, if you want, is write a separate helper class enumerate_pool that stores an std::unordered_map<std::thread::id, uint32_t> from thread IDs to consecutive integers (the same ones that create_threads() uses). You will construct an object using enumerate_pool(pool) where pool is the thread_pool object, and then you could call enumerate_pool(std::this_thread) from within a task to get the integer ID. This way the functionality you want will be added for the users who need it, but will not complicate the main thread_pool class.
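A helper along those lines might look like the sketch below. How the helper would obtain the pool's thread IDs is not specified in this thread, so the sketch simply assumes they are supplied as a vector, and it uses the hypothetical name `thread_enumerator` with an `index_of` member instead of the call syntax proposed above.

```cpp
#include <cstdint>
#include <optional>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical helper: maps known thread IDs to consecutive integers.
class thread_enumerator
{
public:
    // Assumption: the caller provides the IDs of the pool's threads in order.
    explicit thread_enumerator(const std::vector<std::thread::id>& ids)
    {
        for (std::uint32_t i = 0; i < ids.size(); ++i)
            index_of_.emplace(ids[i], i);
    }

    // Returns the index for the given ID, or std::nullopt if it is unknown.
    // The map is never modified after construction, so concurrent lookups
    // from several tasks need no lock.
    std::optional<std::uint32_t> index_of(std::thread::id id) const
    {
        const auto it = index_of_.find(id);
        if (it == index_of_.end())
            return std::nullopt;
        return it->second;
    }

private:
    std::unordered_map<std::thread::id, std::uint32_t> index_of_;
};
```

Inside a task, the lookup would be `enumerator.index_of(std::this_thread::get_id())`, which still costs one hash-table lookup per call.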

roussec commented on May 10, 2024

I am having problems understanding this. I want to reuse certain information per thread. I can use std::thread::id to identify the threads. However, there are more thread IDs than threads in the pool, e.g. 37 different thread IDs while thread_count is only 20. What am I missing? Thank you.

bshoshany commented on May 10, 2024

@roussec that shouldn't happen. There should be exactly as many unique thread IDs as the pool's thread count. In fact, this is tested directly by the test program BS_thread_pool_test.cpp. The function count_unique_threads() detects how many unique thread IDs there are in the pool and uses that to count the number of threads and verify independently that the correct number of threads was created. Since the test always passes, this proves that the number of unique IDs always matches the thread count.
