Comments (2)
Hello.
I am uploading multiple tasks to a pool using toloka_client.create_tasks(tasks) and would like to have an estimate of how long the upload will take. Is there any way to monitor the progress of task creation?
Unfortunately, there is no out-of-the-box way to monitor progress at the moment =( Am I right to assume that your upload has taken significantly more time than you expected? If so, one insight is that upload time currently depends more on the number of tasks than on the data size.
I realise that I can wrap tqdm around a loop that calls toloka_client.create_task(task), but wouldn't that switch the upload from asynchronous to synchronous? Is there a better way to do this?
If I understood correctly, you are asking whether code like this will run slower than a simple toloka_client.create_task(task) call:
from tqdm import tqdm
from itertools import islice

def split_into_batches(sequence, size):
    """Splits a given sequence into a list of lists of consecutive elements"""
    iterator = iter(sequence)
    # iter(callable, sentinel) keeps calling the lambda until it returns an empty list
    return list(iter(lambda: list(islice(iterator, size)), []))

...
for tasks_batch in tqdm(split_into_batches(tasks, 3)):
    toloka_client.create_tasks(tasks_batch)
If so, the answer is no, it won't be slower. async_mode does not refer to Python coroutines but to an asynchronous style of communication with the API. Let me explain:
- If async_mode is True (the default), the method uploads your tasks in one HTTP request, gets back a TasksCreateOperation instance, and polls its status until the uploaded tasks have actually been created.
- If async_mode is False, there is a single HTTP request to the API that hangs until both the upload and the processing of the tasks are finished. This approach is not recommended, because that single request may hit a request timeout.
So there is no parallelism involved, and you should be able to use tqdm + split_into_batches without a significant performance drop.
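For a progress bar that counts tasks rather than batches, the batching idea above can be sketched roughly as follows. This is only an illustration, not part of toloka-kit: iter_batches, upload_in_batches, upload_batch, and on_progress are hypothetical names, and the default batch_size of 1000 is an arbitrary placeholder. The upload function is passed in as a plain callable, so toloka_client.create_tasks could be supplied in its place.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_batches(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Lazily yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be a positive integer")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def upload_in_batches(
    tasks: List[T],
    upload_batch: Callable[[List[T]], object],  # hypothetical hook, e.g. toloka_client.create_tasks
    batch_size: int = 1000,  # arbitrary placeholder, not a recommended value
    on_progress: Callable[[int, int], None] = lambda done, total: None,
) -> int:
    """Upload `tasks` one batch at a time and report (done, total) after each batch."""
    total = len(tasks)
    done = 0
    for batch in iter_batches(tasks, batch_size):
        # The next batch is only sent after this call returns.
        upload_batch(batch)
        done += len(batch)
        on_progress(done, total)
    return done


# Possible usage with tqdm (assumes tqdm is installed and toloka_client exists):
#
#   from tqdm import tqdm
#   with tqdm(total=len(tasks)) as bar:
#       upload_in_batches(
#           tasks,
#           toloka_client.create_tasks,
#           on_progress=lambda done, total: bar.update(done - bar.n),
#       )
```

Since upload time reportedly depends mostly on the number of tasks, the batch size here mainly controls how often the bar refreshes; very small batches simply mean more HTTP requests.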
from toloka-kit.
Thank you! The tqdm + batchification solve my issue, indeed!