Chrome Connection Pools (chromedp, closed)

chromedp commented on May 14, 2024
Chrome Connection Pools

from chromedp.

Comments (10)

derekperkins commented on May 14, 2024

That's correct, with the addition of the continued utilization of instances. I'm under the assumption that launching a new instance has significant overhead, while simply reusing an existing instance should be at least an order of magnitude faster. If I have a work queue with 1M render tasks in it, I want them to render as fast as possible, and I think cdb should be able to manage that connection pool. Here's my pseudocode for the program flow.

  1. Launch Chrome Docker container with 2 CPU and 4 GB RAM, knowing that it can handle 10 concurrent instances without degrading performance.
  2. Run cdb.CreateInstancePool(ctx context.Context, maxInstances int = 10, maxTabsPerInstance int = 1) (*Pool, error) and all the instances are created once, but nothing is executed
  3. Call (p *Pool) Exec(ctx context.Context, tasks cdp.Tasks) (*Response, error)
  4. Like a channel, Pool waits to accept incoming Exec requests until an instance becomes available, never exceeding maxInstances
  5. When you're done, call pool.Close() and all the pool instances will be shut down.

Here's the relevant pool code in database/sql, though it's not super easily readable.
https://sourcegraph.com/github.com/golang/go@838eaa738f2bc07c3706b96f9e702cb80877dfe1/-/blob/src/database/sql/sql.go#L780:1-781:1

kenshaw commented on May 14, 2024

In its current form, chromedp is just about managing one Chrome instance, but there's nothing preventing you from running multiple Chrome instances at the same time (since you can connect this to a remote Chrome instance, and two Chrome processes run in isolation from each other). In fact, I have a half-finished example of spinning up two Chrome instances and then having them communicate back and forth using WebRTC.

Anyway, I may not be understanding what you're referring to here, but would definitely welcome a pull request for this, if you think it is something that would make sense after I read through the example. I think what you're getting at here is having a simple / easy worker pool for launching Chrome instances?

derekperkins commented on May 14, 2024

What I'm not sure about is how something like this would handle caching when instances are reused. In the http://sitespeed.io case, a warm cache could throw off your results. If you're crawling an entire site, it could speed things up considerably and save a lot of bandwidth.

clanstyles commented on May 14, 2024

I'd be interested in this too!

JalfResi commented on May 14, 2024

This is a cracking idea! I have this exact need. @derekperkins do you have some example code or a pull request/gist?

As an additional idea: when an instance is returned to the pool, maybe there could be some way of clearing the browser's session so that it's a clean slate again, ready for the next worker to grab an instance from the pool...
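One way to picture that idea is a pool that wipes state on release. The sketch below is only an illustration: `Session`, `CleanPool`, `Acquire`, and `Release` are hypothetical names, and the "session" is a plain cookie map rather than real browser state. How to actually clear a Chrome session is not covered here.

```go
package main

import "fmt"

// Session is a hypothetical stand-in for per-browser state (cookies, cache).
type Session struct {
	Cookies map[string]string
}

// Reset discards all state so the next borrower starts from a clean slate.
func (s *Session) Reset() { s.Cookies = map[string]string{} }

// CleanPool hands out sessions and resets each one when it is returned.
type CleanPool struct {
	idle chan *Session
}

// NewCleanPool creates n empty sessions; n is assumed to be at least 1.
func NewCleanPool(n int) *CleanPool {
	p := &CleanPool{idle: make(chan *Session, n)}
	for i := 0; i < n; i++ {
		p.idle <- &Session{Cookies: map[string]string{}}
	}
	return p
}

// Acquire blocks until a session is available.
func (p *CleanPool) Acquire() *Session { return <-p.idle }

// Release clears the session before putting it back in the pool.
func (p *CleanPool) Release(s *Session) {
	s.Reset()
	p.idle <- s
}

func main() {
	pool := NewCleanPool(1)

	s := pool.Acquire()
	s.Cookies["sid"] = "abc" // state left behind by the first worker
	pool.Release(s)

	next := pool.Acquire() // same session object, now reset
	fmt.Println("cookies after reuse:", len(next.Cookies)) // prints: cookies after reuse: 0
	pool.Release(next)
}
```

Resetting on release rather than on acquire keeps the cleanup cost off the path of the next worker, at the price of doing it even when the pool is about to be closed.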

kenshaw commented on May 14, 2024

I maintain that this is likely best suited to a higher-level package. For example, this repo: https://github.com/ory/dockertest supports "pools" of Docker images.

kenshaw commented on May 14, 2024

I just committed a Pool for testing purposes -- please see the unit tests for how it's used with headless_shell. I will be adding more unit tests, etc. in the future. I have also been working on using a "Docker pool" similar to the dockertest package I linked above, but the issue is that the Docker client API is a bit of a pain, and it takes quite a bit of configuration to set up the networking and everything correctly.

JalfResi commented on May 14, 2024

I've been experimenting with the pool last night and today. Looks great so far, nice job!

derekperkins commented on May 14, 2024

Looks awesome!

sashayakovtseva commented on May 14, 2024

Hello @kenshaw,

Can you point me to where exactly the Pool code is located? I have not been able to find it so far.

Thanks.

UPD: Found commit 42c6cca with the pool. The code has been removed in master :(
