
async-pool's Issues

Package has dependencies on GitHub

assert is pulled from GitHub, which is a performance issue on some of my systems. Additionally, you need git on every machine that wants to use your package.

It would be great if the assert module could be pulled from npm instead of GitHub. There are many assert libraries on npm. Maybe you could choose one that is already published there?

Chrome fails (Uncaught (in promise) TypeError: Failed to fetch)

Chrome (not Firefox or Safari; those are just fine) was failing when I sent a large number (more than 1000 or so) of API calls via an array of fetches. I thought I could solve this by reducing concurrency using async-pool, but Chrome still fails with code like the following:

  const ps = jdone.cprefixes.map(p => fetch(`${baseUrl}${jdone.num}&prefix=${p.prefix}`));
  const timeout = i => new Promise(resolve => setTimeout(() => resolve(i), i));
  const reqs = await asyncPool(2, ps, timeout).then(j => j.map(k => k.json()));
  console.log('reqs', reqs);

In Firefox and Safari, the above code executes without any issues no matter the size of the array, but in Chrome (87.0.xxx at the moment, though I've seen it on many versions) I get a series of failures (Uncaught (in promise) TypeError: Failed to fetch) when the number of fetch calls in the array is large (more than several hundred).

Any clue how I could make Chrome work with this?
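Part of the problem may be that in the snippet above, .map starts every fetch immediately, so the pool only races requests that are already in flight. Below is a hedged sketch of a limiter that starts each request inside the iterator function instead; limitedMap and the usage names are hypothetical stand-ins, not part of tiny-async-pool:

```javascript
// A minimal concurrency limiter using only standard Promise APIs:
// work starts inside iteratorFn, so at most `concurrency` run at once.
async function limitedMap(concurrency, items, iteratorFn) {
  const results = [];
  const executing = new Set();
  for (const item of items) {
    // defer the call so it only starts when this iteration is reached
    const p = Promise.resolve().then(() => iteratorFn(item));
    results.push(p);
    const clean = () => executing.delete(p);
    executing.add(p);
    p.then(clean, clean);
    // once the pool is full, wait for any in-flight promise to settle
    if (executing.size >= concurrency) {
      await Promise.race(executing);
    }
  }
  return Promise.all(results);
}

// hypothetical usage with the snippet above:
// const reqs = await limitedMap(2, jdone.cprefixes, p =>
//   fetch(`${baseUrl}${jdone.num}&prefix=${p.prefix}`).then(r => r.json()));
```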

Use named imports

First of all: thank you very much for the concise and easy-to-use implementation. In my opinion it's much better than comparable solutions.

I added TypeScript definitions to DefinitelyTyped to be able to use your async pool better and more easily in my TypeScript projects. It was a little tricky because you are not using a named export but are exporting the function directly. This leads to rather awkward imports in TypeScript modules.

import asyncPool = require('tiny-async-pool');

If you used named exports, the import would work the normal, standard way:

import {asyncPool} from 'tiny-async-pool';

Feature: guarantee order of results

The way we used to use this library was something along these lines:

// fetch a list of rows from the database
const rawRows = await sqlQuery;

// construct the complete objects from those rows
const rows = await asyncPool(10, rawRows, async (row) => {
  const completeRow = await doSomething(row);
  return completeRow;
});

The problem with v2 is that, since each result is now yielded as soon as it completes, the rows array in the previous example is not guaranteed to be in the same order as rawRows; it all depends on how fast the callbacks run.

The following is just a simple example to see it in action:

console.log('start pool ---------------');
const results = [];
for await (const result of asyncPool(10, [1, 2, 3, 4, 5], async (val) => {
  await new Promise((res) => setTimeout(res, 1000 / val));
  console.log(val);
  return val;
})) {
  results.push(result);
}
console.log('end pool --------- ');

which results in

start pool ---------------
5
4
3
2
1
end pool --------- 

I see that v1 called Promise.all whereas v2 uses Promise.race, even when the pool is larger than the input array.

Would it be possible for this library to preserve the order of the items of the array?

Add documentation about ordering no longer being preserved

Hi, I was looking into upgrading from 1.x to 2.x of tiny-async-pool, and I'm glad that I perused the source code, because it is evident that the order of the results no longer matches the order of the input data!

In the previous implementation, the order is clearly preserved because the ret array is populated in the order of the input iterable.

This is rather a trap that someone could easily fall into, since the incorrect result order might not crop up in unit tests or under most circumstances!

Can we update the documentation, and/or add an example of how to preserve order?

Here's some possible code that wraps this in a way that preserves order and keeps the 1.x semantics:

import asyncPool from "tiny-async-pool";

// Wrap tiny-async-pool with semantics of 1.x, also ensuring the order of the results
export const awaitPromisesWithConcurrency = async <IN, OUT>(
  concurrency: number,
  data: readonly IN[],
  iteratorFn: (value: IN) => Promise<OUT>
): Promise<OUT[]> => {
  // Tag the data & result with its index, to ensure the results are in the correct order
  const dataWithIndexes: [IN, number][] = data.map((value, index) => [value, index]);
  const iteratorFnWithIndex = ([value, index]: [IN, number]): Promise<[OUT, number]> =>
    iteratorFn(value).then(result => [result, index] as [OUT, number]);
  const asyncIterable = asyncPool(concurrency, dataWithIndexes, iteratorFnWithIndex);
  // Insert data in the results array in the correct index
  const results: OUT[] = [];
  for await (const [result, index] of asyncIterable) {
    results[index] = result;
  }
  return results;
};

Pool size and promise-rejections

If the poolLimit is bigger than the input array, an "UnhandledPromiseRejectionWarning" is emitted for every rejection. Take this code for example:

const asyncPool = require('tiny-async-pool');
const handler = async () => {
  const poolLimit = 2;
  const timeout = i => new Promise((resolve, reject) => setTimeout(() => reject(i), i));
  await asyncPool(poolLimit, [100, 500, 300, 200], timeout).catch(error => {console.log(error);});
};
handler();

Every promise is rejected, so the pool stops and rejects after the very first rejection, and the output is:

100

This is working as expected, but once I increase the poolLimit to 5 (or more, as long as it's bigger than the number of elements in the input array), the output looks like this:

100
(node:6106) UnhandledPromiseRejectionWarning: 100
(node:6106) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:6106) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
(node:6106) UnhandledPromiseRejectionWarning: 200
(node:6106) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 2)
(node:6106) UnhandledPromiseRejectionWarning: 300
(node:6106) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 3)
(node:6106) UnhandledPromiseRejectionWarning: 500
(node:6106) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 4)

README.md usage issue

When I checked the README for how to use this package, it said to import it as follows:

import asyncPool from "tiny-async-pool";

However, when I actually imported and used it, only the following worked:

import * as asyncPool from "tiny-async-pool";

Pass extra args to the iterator function

In ES6 you can use a rest parameter (...theArgs) on the function:

async function asyncPool(poolLimit, array, iteratorFn, ...theArgs) {
...
const p = Promise.resolve().then(() => iteratorFn(item, array, ...theArgs));

At work I use it like this:

async function ticker(item, array, userId) {
  // assuming base.findOne returns a promise
  return base.findOne({ userId: userId, field: item });
}

Readability issue

https://github.com/rxaviers/async-pool/blob/master/lib/es7.js#L20

I was scratching my head on this line

    const e = p.then(() => executing.splice(executing.indexOf(e), 1));

Then I realized that this is equivalent to this

    const e = p.then((x) => executing.splice(executing.indexOf(x), 1));

I know JS has a lot of weird rules, but for readability's sake, can we avoid these?

It would be a big help for people like me :)

Promise.allSettled() instead of Promise.all() option

Hi, this package looks really good.

Would it be possible to use Promise.allSettled() instead of Promise.all()? Sometimes I get silly 500 errors on HTTPS requests that can easily be handled by throwing the failing URL and pushing it back onto the original array. I think it's a nice trick that works really well for errors that can be retried.

Here is an example of what I mean, but with batches; I would really like to do the same with the pool:

async function asyncBatches(data, concurrency) {
  while (data.length) {
    // Batch of concurrent requests
    const batch = data.splice(0, concurrency).map(requestAPI);
    const results = await Promise.allSettled(batch);

    // Deal with errors thrown by the requestAPI function
    for (const { status, reason } of results) {
      if (status === 'rejected') {
        // The API throws the url because I know it can be safely retried,
        // so reason is the url, which gets pushed back onto data for the while loop to pick up
        data.push(reason);
      }
    }
  }
}
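One way to approximate this with the pool itself is to wrap the iterator function so rejections become values instead of failing the whole pool. This is a sketch, not part of the library; settled is a hypothetical helper producing Promise.allSettled()-shaped entries:

```javascript
// Wrap an iterator function so each rejection becomes a settled-result
// object, mirroring the shape of Promise.allSettled() entries.
const settled = (iteratorFn) => (value) =>
  Promise.resolve()
    .then(() => iteratorFn(value))
    .then(
      (v) => ({ status: "fulfilled", value: v }),
      (reason) => ({ status: "rejected", reason })
    );

// hypothetical usage: asyncPool(concurrency, urls, settled(requestAPI)),
// then re-queue any { status: "rejected", reason } results for retry.
```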

Many thanks,
Alvaro

Typescript bindings

Currently the package is missing TypeScript bindings, so the result of running asyncPool is untyped. Could you please add typings?

I'm using this as a workaround:

declare module "tiny-async-pool" {
	export default function<P, V>(
		poolSize: number,
		parameters: P[],
		worker: (p: P) => Promise<V>,
	): Promise<V[]>
}

You can simply put export default ... (without the wrapping declare module) in an index.d.ts at the root of your package; that should do it.

Benchmarks, overhead information

I'm hesitant to use this lib without a little more understanding of what the overhead is (if any) of such a library, such as speed/memory impacts (if any). Could you fill me in on that if you know?

Does not "yield" reliably

... or in other words, the body of the for await loop is not executed for every iteration if more than one promise is fulfilled within the same tick (I guess it's a (Node.js) tick...).

The reason is that all those promises are removed from executing before yield comes into play again.

I don't know if this is understandable ;-), but you can reproduce it with this timeout function from your examples:

const timeout = ms => Promise.resolve(ms)

Any idea for a fix?

Unhandled rejections in some corner case

Because the executing array can be left un-raced, even after the #9 fix, cases of unhandled rejections can occur, e.g.:

> asyncPool = require('tiny-async-pool')
> await asyncPool(2, [0,1,2], (i,a) => i<a.length-1 ? Promise.resolve(i) : Promise.reject(i)).catch(()=>{/*ignore*/})
undefined
> Uncaught 2 <-- shouldn't be logged as we catch exceptions

Allow for 1.x style await, without having to use for await

Treating asyncPool as an iterator is one way to use it, but for those who use it purely to run the 'jobs' it contains, without needing to post-process the result of each callback, for await creates unnecessarily complicated code.

See your own example:

const timeout = ms => new Promise(resolve => setTimeout(() => resolve(ms), ms));

for await (const ms of asyncPool(2, [1000, 5000, 3000, 2000], timeout)) {
  console.log(ms);
}

The console.log is nice... but what if I don't need to do console.log? Or what if I prefer to keep my console.log in the timeout callback, instead of fragmenting the logic into two pieces?

I think providing an asyncPoolAll helper, or something like it, would keep the code more succinct when it needs to be used that way.
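A sketch of such a helper, assuming the v2 async-iterator API (the name asyncPoolAll is just a suggestion, not something the library exports):

```javascript
// Drain the pool's async iterator into an array, restoring the
// 1.x-style single-await usage.
async function asyncPoolAll(concurrency, iterable, iteratorFn) {
  const results = [];
  for await (const result of asyncPool(concurrency, iterable, iteratorFn)) {
    results.push(result);
  }
  return results;
}

// usage: await asyncPoolAll(2, [1000, 5000, 3000, 2000], timeout);
```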

Allow iterable to be an async-iterable / async-generator

I currently see no way of asynchronously and incrementally retrieving the items to process concurrently. For example, if we want to concurrently process a lot of items from a database table, one would have to load at least the ids of ALL elements into memory first to be able to process them concurrently via asyncPool. This is infeasible for lots of data.

If the iterable could also be an async-iterable (e.g. an async-generator), this scenario could be implemented easily and efficiently.
Also, this could be implemented in a non-breaking way because checking whether the parameter is an AsyncIterable/AsyncIterator can be done beforehand.
See https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Iteration_protocols#the_async_iterator_and_async_iterable_protocols and https://stackoverflow.com/questions/70337056/verify-iterator-versus-asynciterator-type for this.

Even for small amounts of data, this would make the code nicer, as not all data has to be fetched and aggregated beforehand.

Would you accept such an extension into the library?
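The detection step could be as small as this sketch, which relies only on the standard Symbol.asyncIterator protocol:

```javascript
// Anything exposing Symbol.asyncIterator is treated as an async iterable;
// plain arrays and sync iterables fall through to the existing code path.
function isAsyncIterable(obj) {
  return obj != null && typeof obj[Symbol.asyncIterator] === "function";
}
```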

Example code:

import asyncPool from 'tiny-async-pool';

async function main() {
  for await (const result of asyncPool(20, getItems(), (id) => processItem(id))) {
    console.log(result);
  }
}

async function processItem(id: string) {
  // heavy computation
  await sleep(1000);
  return "yay";
}

async function *getItems() {
  for (let i = 0; i < 100; i++) {
    await sleep(100);
    yield i;
  }
}

function sleep(durationInMs: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, durationInMs));
}

main();

Your code is amazing!

I read your source code; you are using closures and recursion to implement something like a for loop.
This is an amazing idea 😄😄😄
