
phantom-pool's Introduction

phantom-pool

Resource pool based on generic-pool for PhantomJS.

Creating new phantom instances with phantom.create() can be slow. If you are frequently creating new instances and destroying them, as a result of HTTP requests for example, this module can help by keeping a pool of phantom instances alive and making it easy to re-use them across requests.

Here's an artificial benchmark to illustrate:

Starting benchmark without pool

noPool-0: 786.829ms
noPool-1: 790.822ms
noPool-2: 795.150ms
noPool-3: 788.928ms
noPool-4: 793.788ms
noPool-5: 798.075ms
noPool-6: 813.130ms
noPool-7: 803.801ms
noPool-8: 782.936ms
noPool-9: 805.630ms

Starting benchmark with pool

pool-0: 48.160ms
pool-1: 98.966ms
pool-2: 89.573ms
pool-3: 99.057ms
pool-4: 101.970ms
pool-5: 102.967ms
pool-6: 102.938ms
pool-7: 99.359ms
pool-8: 101.972ms
pool-9: 103.309ms

Done

In this benchmark, using the pool makes each request more than 8x faster on average.
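The >8x figure can be checked directly against the timings listed above. A small sketch that averages both runs (the numbers are copied from the benchmark output; the `mean` helper exists only for this illustration):

```javascript
// Timings in milliseconds, copied from the benchmark output above.
const noPool = [786.829, 790.822, 795.150, 788.928, 793.788, 798.075, 813.130, 803.801, 782.936, 805.630]
const withPool = [48.160, 98.966, 89.573, 99.057, 101.970, 102.967, 102.938, 99.359, 101.972, 103.309]

// Illustrative helper: arithmetic mean of an array of numbers.
const mean = (xs) => xs.reduce((sum, x) => sum + x, 0) / xs.length

const speedup = mean(noPool) / mean(withPool)
console.log(`no pool: ${mean(noPool).toFixed(1)}ms, pool: ${mean(withPool).toFixed(1)}ms, speedup: ${speedup.toFixed(2)}x`)
```

This works out to roughly 796 ms per request without the pool versus roughly 95 ms with it, a ratio of about 8.4.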

Install

npm install --save phantom-pool

Requires Node v6+

Usage

See ./test directory for usage examples.

const createPhantomPool = require('phantom-pool')

// Returns a generic-pool instance
const pool = createPhantomPool({
  max: 10, // default
  min: 2, // default
  // how long a resource can stay idle in pool before being removed
  idleTimeoutMillis: 30000, // default.
  // maximum number of times an individual resource can be reused before being destroyed; set to 0 to disable
  maxUses: 50, // default
  // function to validate an instance prior to use; see https://github.com/coopernurse/node-pool#createpool
  validator: () => Promise.resolve(true), // defaults to always resolving true
  // validate resource before borrowing; required for `maxUses` and `validator`
  testOnBorrow: true, // default
  // For all opts, see opts at https://github.com/coopernurse/node-pool#createpool
  phantomArgs: [['--ignore-ssl-errors=true', '--disk-cache=true'], {
    logLevel: 'debug',
  }], // arguments passed to phantomjs-node directly, default is `[]`. For all opts, see https://github.com/amir20/phantomjs-node#phantom-object-api
})

// Automatically acquires a phantom instance and releases it back to the
// pool when the function resolves or throws
pool.use(async (instance) => {
  const page = await instance.createPage()
  const status = await page.open('http://google.com', { operation: 'GET' })
  if (status !== 'success') {
    throw new Error('cannot open google.com')
  }
  const content = await page.property('content')
  return content
}).then((content) => {
  console.log(content)
})

// Destroying the pool:
pool.drain().then(() => pool.clear())

// For more API doc, see https://github.com/coopernurse/node-pool#generic-pool

Security

When using phantom-pool, you should be aware that the phantom instance you are getting might not be in a completely clean state. It could have browser history, cookies or other persistent data from a previous use.

If that is an issue for you, make sure you clean up any sensitive data on the phantom instance before returning it to the pool.
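One possible way to do that cleanup is sketched below. It is not part of phantom-pool: `useClean` is a hypothetical helper. It assumes phantomjs-node forwards `execute('phantom', 'invokeMethod', ['clearCookies'])` to PhantomJS's `phantom.clearCookies()`, in the same way its own `exit()` forwards `'exit'`. Verify this against the phantomjs-node version you use.

```javascript
// Sketch: run `fn` with a pooled instance, then clear cookies before the
// instance goes back to the pool. `useClean` is a hypothetical helper.
function useClean(pool, fn) {
  return pool.use(async (instance) => {
    try {
      return await fn(instance)
    } finally {
      // Assumption: this reaches PhantomJS's phantom.clearCookies().
      await instance.execute('phantom', 'invokeMethod', ['clearCookies'])
    }
  })
}
```

It would be called like `pool.use`, for example `useClean(pool, async (instance) => { ... })`. Pages created inside the callback should still be closed by the caller. This sketch only covers cookies, not other state such as history or caches.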

phantom-pool's Issues

How to clear data from Phantom Instances

If that is an issue for you, make sure you clean up any sensitive data on the phantom instance before returning it to the pool.

As the README mentions, a previously used instance might still have data left in it. How do I clear it? Could you point me to an example?

Thank you for your time. :)

Validate a phantom instance before reusing it

Hi,

I was about to create a fork to add the ability to validate a resource each time it is borrowed when I saw @mikedaly's pull request. Once it is merged, there is no need for another one. Cool. (Is there a roadmap for merging it? I would be glad to use it as soon as possible 👯‍♂️)

In any case, the reason I want the 'validate' functionality is that I think the pool should check that a phantom instance is still functional before handing it out.

I found this by accident. To reproduce:

  • launch a very long phantom evaluate script
  • kill the phantom process (via the task manager for instance)
  • the phantom instance will stay in the pool and be provided again and again although it is not usable anymore (Error reading from stdin…)

Do you know a cheap way to check that a phantom instance is still valid? The only option I see in the phantomjs-node documentation is creating a web page via #createPage, but that seems like a lot.
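A lighter check than creating a page might be to look at the child process itself. The sketch below assumes the instance exposes the spawned process as `instance.process`, which is the property phantomjs-node's `kill()` uses. It also assumes phantom-pool passes the instance to the `validator` option. `isAlive` is a hypothetical helper.

```javascript
// Sketch of a cheap liveness check, under the assumptions stated above.
function isAlive(instance) {
  const child = instance && instance.process
  if (!child) return false
  // Node sets exitCode or signalCode on a ChildProcess once it has ended;
  // both stay null while the process is still running.
  return child.exitCode === null && child.signalCode === null
}

// Hypothetical value for the `validator` option.
const validator = (instance) => Promise.resolve(isAlive(instance))
```

This only detects a process that has exited or been killed. A process that is still running but hung would pass the check.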

How to cleanup phantom instance before reuse

Hi,

In the readme you say it is important to cleanup phantom instances:

When using phantom-pool, you should be aware that the phantom instance you are getting might not be in a completely clean state. It could have browser history, cookies or other persistent data from a previous use.

I checked examples in this repository and phantomjs-node and I didn't find any documentation or example about how to cleanup a phantom instance before returning to pool.

The excerpt below from phantomjs-node shows that exit and kill both end the phantom process in order to clean up.

    /**
     * Cleans up and end the phantom process
     */
    exit(): Promise<Response> {
        clearInterval(this.heartBeatId);
        if (this.commands.size > 0) {
            this.logger.warn('exit() was called before waiting for commands to finish. ' +
                'Make sure you are not calling exit() too soon.');
        }
        return this.execute('phantom', 'invokeMethod', ['exit']);
    }

    /**
     * Clean up and force kill this process
     */
    kill(errmsg: string = 'Phantom process was killed'): void {
        this._rejectAllCommands(errmsg);
        this.process.kill('SIGKILL');
    }

I don't think this is what you meant. How am I supposed to clean it up?

Examples are all broken with Node 8.1.1

Trying to run the examples with node v8.1.1 results in an error:

import phantom from 'phantom'
^^^^^^
SyntaxError: Unexpected token import

Also, the front page example is wrong (missing and misplaced commas break the object structure), and using:

const phantom_pool = require('phantom-pool');

instead of import results in an error when trying to create the pool:

const pool = phantom_pool();
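Other reports on this page call `phantomPool.default({...})`. That suggests the published build may be a transpiled ES module whose factory sits under `.default` when loaded with `require`. This is an inference, not something the README confirms. A defensive sketch that handles both shapes (`resolveFactory` is a hypothetical helper):

```javascript
// Sketch: pick the factory whether the package exports it directly or as a
// transpiled ES-module default export.
function resolveFactory(mod) {
  return typeof mod === 'function' ? mod : mod.default
}

// Usage (commented out so the snippet runs without the package installed):
// const createPhantomPool = resolveFactory(require('phantom-pool'))
// const pool = createPhantomPool()
```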

PhantomJS Instance is not destroyed when it is more than maximum instances

I am running the phantom pool to maintain the instances of the phantomJS with

const pool = phantomPool.default({
  max: 100, // default
  min: 20, // default
  // how long a resource can stay idle in pool before being removed
  idleTimeoutMillis: 30000, // default.
  // maximum number of times an individual resource can be reused before being destroyed; set to 0 to disable
  maxUses: 50, // default
  // function to validate an instance prior to use; see https://github.com/coopernurse/node-pool#createpool
  validator: () => Promise.resolve(true), // defaults to always resolving true
  // validate resource before borrowing; required for `maxUses` and `validator`
  testOnBorrow: true, // default
  // For all opts, see opts at https://github.com/coopernurse/node-pool#createpool
  phantomArgs: [['--ignore-ssl-errors=true', '--disk-cache=true'], {
    logLevel: 'info',
  }], // arguments passed to phantomjs-node directly, default is `[]`. For all opts, see https://github.com/amir20/phantomjs-node#phantom-object-api
})

After the application has been running for some time, the number of phantom instances goes beyond 100 (which is the configured maximum).

PhantomJS crashing and socket getting closed

Hi All,

I have been using node pool for a while. When I test single requests, everything works fine, but when I send requests to PhantomJS in rapid succession, I get a lot of crash errors in the console. The most common one I can catch is

Error: Error reading from stdin: Error: This socket has been ended by the other party
1|www      |     at Phantom._rejectAllCommands (/home/node/node_modules/phantom-pool/node_modules/phantom/lib/phantom.js:373:41)
1|www      |     at Phantom.kill (/home/node/node_modules/phantom-pool/node_modules/phantom/lib/phantom.js:351:14)
1|www      |     at Socket.Phantom.process.stdin.on.e (/home/node/node_modules/phantom-pool/node_modules/phantom/lib/phantom.js:186:18)
1|www      |     at emitOne (events.js:96:13)
1|www      |     at Socket.emit (events.js:188:7)
1|www      |     at Socket.writeAfterFIN [as write] (net.js:294:8)
1|www      |     at Phantom.executeCommand (/home/node/node_modules/phantom-pool/node_modules/phantom/lib/phantom.js:268:28)
1|www      |     at Phantom.execute (/home/node/node_modules/phantom-pool/node_modules/phantom/lib/phantom.js:284:21)
1|www      |     at Phantom.createPage (/home/node/node_modules/phantom-pool/node_modules/phantom/lib/phantom.js:211:21)
1|www      |     at Promise (/home/node/controllers/modules/phantomjs/index.js:41:20)
1|www      |     at pool.use (/home/node/controllers/modules/phantomjs/index.js:39:16)
1|www      |     at process._tickDomainCallback (internal/process/next_tick.js:129:7)

Other ones that I get are :

(node:4177) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 42): Error: Phantom process stopped with exit code null
and
error: PhantomJS has crashed. Please read the bug reporting guide at1

and

| You have triggered an unhandledRejection, you may have forgotten to catch a Promise rejection:
1|www      | Error: Phantom process stopped with exit code null
1|www      |     at Phantom._rejectAllCommands (/home/node/node_modules/phantom-pool/node_modules/phantom/lib/phantom.js:373:41)

Here is my pool code:

const pool = phantomPool({
  max: 20, // default
  min: 2, // default
  
  idleTimeoutMillis: 15000, // default.

  maxUses: 20, // default

  validator: () => Promise.resolve(true), // defaults to always resolving true

  testOnBorrow: true, // default

  phantomArgs: [[], {
    logLevel: 'warn',
  }], // arguments passed to phantomjs-node directly, default is `[]`. For all opts, see
      // https://github.com/amir20/phantomjs-node#phantom-object-api
});

And my pool.use code:

function phantomChecks(url) {
  return new Promise((resolve1, reject1) => {
    try {
       pool.use((instance) => {
         return new Promise((resolve, reject) => {
         });
       });
      } catch(error){
         reject1(error);
      }
   });
}

I don't know how to catch and handle those errors; two promises and a try/catch statement are not doing it. Also, is it okay to return a promise inside the pool.use method?
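In JavaScript, a `try`/`catch` around a call that returns a promise only catches synchronous throws. Rejections have to be handled with `.catch()` or `await`. A minimal sketch of the function above without the extra promise wrappers, assuming `pool.use` returns a promise that settles with the callback's result, as the README usage suggests. `page.close()` is assumed to exist in phantomjs-node, and `pool` is passed as a parameter only to keep the snippet self-contained.

```javascript
// Sketch: let the promise returned by pool.use carry the result and any error.
function phantomChecks(pool, url) {
  return pool.use(async (instance) => {
    const page = await instance.createPage()
    try {
      const status = await page.open(url)
      if (status !== 'success') throw new Error(`cannot open ${url}`)
      return await page.property('content')
    } finally {
      // Assumption: phantomjs-node pages expose close().
      await page.close()
    }
  })
}

// Callers then handle failures in one place:
// phantomChecks(pool, 'http://example.com').then(console.log).catch(console.error)
```

This handles errors raised while a command is pending. It may not cover a PhantomJS process that crashes while no command is in flight.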

Thanks

CPU Usage to 100%

After running a few hundred URLs, PhantomJS CPU usage shoots up to 100% and instances are not killed. Is anybody else seeing this? I have to manually restart the pool. Is there a better way to fix this?

My current configuration is below:

self.pool = phantomPool.default({
  max: 5, // default
  min: 1, // default
  // how long a resource can stay idle in pool before being removed
  idleTimeoutMillis: 30000, // default.
  // maximum number of times an individual resource can be reused before being destroyed; set to 0 to disable
  maxUses: 10, // default
  // function to validate an instance prior to use; see https://github.com/coopernurse/node-pool#createpool
  validator: () => Promise.resolve(true), // defaults to always resolving true
  // validate resource before borrowing; required for `maxUses` and `validator`
  testOnBorrow: true, // default
  // For all opts, see opts at https://github.com/coopernurse/node-pool#createpool
  phantomArgs: [phantomArgs, {
    logLevel: 'error',
  }], // arguments passed to phantomjs-node directly, default is `[]`. For all opts, see https://github.com/amir20/phantomjs-node#phantom-object-api
});

Add factory afterCreate handler

Could you please add a way to execute code after a phantom instance has been created? For example, a login may be needed before opening several pages. The login should be done only once per created instance.
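Until such a hook exists, a similar effect might be achieved outside the library. The sketch below wraps `pool.use` and runs an initialisation function the first time each instance is seen. `withInit` and `init` are hypothetical names, not phantom-pool API.

```javascript
// Sketch: run `init` once per pooled instance, remembered in a WeakMap so
// destroyed instances can be garbage collected.
function withInit(pool, init) {
  const ready = new WeakMap() // instance -> promise of the finished init
  return (fn) => pool.use(async (instance) => {
    if (!ready.has(instance)) {
      const pending = Promise.resolve(init(instance))
      pending.catch(() => ready.delete(instance)) // allow a retry after failure
      ready.set(instance, pending)
    }
    await ready.get(instance)
    return fn(instance)
  })
}
```

Usage would look like `const use = withInit(pool, login)` followed by `use(async (instance) => { ... })`, where `login` is whatever one-time setup the instance needs.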

Linux dependencies

Hi ! :)

I would like to create a small Docker image, based on Alpine, to use phantom-pool in a Docker container.

With the Node 8.11-alpine image, it does not seem to work out of the box. Workers do not start up:

Mon Jun 18 2018 17:31:11 GMT+0000 (UTC) [warning] worker 1 is being written to, restart delayed
Mon Jun 18 2018 17:31:11 GMT+0000 (UTC) [warning] worker 2 is being written to, restart delayed
Mon Jun 18 2018 17:31:11 GMT+0000 (UTC) [warning] worker 3 is being written to, restart delayed
Mon Jun 18 2018 17:31:11 GMT+0000 (UTC) [warning] worker 4 is being written to, restart delayed
Mon Jun 18 2018 17:31:11 GMT+0000 (UTC) [warning] worker 1 is being written to, restart delayed
Mon Jun 18 2018 17:31:11 GMT+0000 (UTC) [warning] worker 2 is being written to, restart delayed
Mon Jun 18 2018 17:31:11 GMT+0000 (UTC) [warning] worker 3 is being written to, restart delayed
Mon Jun 18 2018 17:31:11 GMT+0000 (UTC) [warning] worker 4 is being written to, restart delayed

But with the Node 8.11 image, everything runs fine.

What are the Linux dependencies of Phantom Pool ?

Images are not included in screenshot

Hello,

I've tried setting the following

  phantomArgs: [
    [
      '--ignore-ssl-errors=true',
      '--disk-cache=true',
      '--load-images=true',
      '--local-to-remote-url-access=true',
    ],
    {},
  ],

as well as calling my page.render here:

page.on('onLoadFinished', async () => {
    await page.render(screenshotPath);
});

I've also tried doing a setTimeout for 30 seconds, but for whatever reason images just are not being included in the screenshot. Any advice? Thank you!

Phantom pool fails to run on Alpine Linux

I am running Phantom-pool in a docker container which is running Alpine Linux.

Here is the error:
/node_modules/phantomjs-prebuilt/lib/phantom/bin/phantomjs: error while loading shared libraries: libfontconfig.so.1: cannot open shared object file: No such file or directory

The libfontconfig.so library does not exist on Alpine Linux.

I did some digging and found that phantomjs-prebuilt v2.1.4 has this problem, which could be resolved by updating to 2.1.8 or greater.

https://github.com/amir20/phantomjs-node/issues/514

Can the dependency on prebuilt be updated? Also, if and when the update happens how can I get the respective npm module updated?

Thanks
