Comments (10)
That's correct, with the addition of the continued utilization of instances. I'm under the assumption that launching a new instance has significant overhead, while simply reusing an existing instance should be at least an order of magnitude faster. If I have a work queue with 1M render tasks in it, I want them to render as fast as possible, and I think `cdb` should be able to manage that connection pool. Here's my pseudocode for the program flow.
- Launch a Chrome Docker container with 2 CPUs and 4 GB RAM, knowing that it can handle 10 concurrent instances without degrading performance.
- Run `cdb.CreateInstancePool(ctx context.Context, maxInstances int = 10, maxTabsPerInstance int = 1) (*Pool, error)` and all the instances are created once, but nothing is executed.
- Call `(p *Pool) Exec(ctx context.Context, tasks cdp.Tasks) (*Response, error)`.
- Like a channel, `Pool` waits to accept incoming `Exec` requests until an instance becomes available, never exceeding `maxInstances`.
- When you're done, call `pool.Close()` and all the pool instances will be shut down.
Here's the relevant pool code in `database/sql`, though it's not super easily readable.
https://sourcegraph.com/github.com/golang/go@838eaa738f2bc07c3706b96f9e702cb80877dfe1/-/blob/src/database/sql/sql.go#L780:1-781:1
from chromedp.
In its current form, chromedp is just about managing one Chrome instance, but there's nothing preventing you from running multiple Chrome instances at the same time (you can connect to a remote Chrome instance, and separate Chrome processes run in isolation from each other). In fact, I have a half-finished example of spinning up two Chrome instances and then having them communicate back and forth using WebRTC.
Anyway, I may not be understanding what you're referring to here, but would definitely welcome a pull request for this, if you think it is something that would make sense after I read through the example. I think what you're getting at here is having a simple / easy worker pool for launching Chrome instances?
One ramification I'm not sure about is how something like this handles caching when you are reusing instances. In the http://sitespeed.io case, that could throw off your results. If you're crawling an entire site, it could speed things up considerably and save a lot of bandwidth.
I'd be interested in this too!
This is a cracking idea! I have this exact need. @derekperkins do you have some example code or a pull request/gist?
As an additional idea: when the instance is returned to the pool, maybe there could be some way of clearing the browser's session so that it's a clean slate again, ready for the next worker to grab an instance from the pool...
I maintain that this is likely a best fit for a higher level package. For example, this repo: https://github.com/ory/dockertest supports "pools" of docker images.
I just committed a `Pool` for testing purposes. Please see the unit tests for how it's used with headless_shell. I will be adding more unit tests, etc. in the future. I have also been working on using a "Docker pool" similar to the `dockertest` package I linked above, but the issue is that the Docker client API is a bit of a pain, and it takes quite a bit of configuration to set up the networking and everything correctly.
I've been experimenting with the pool since last night and it looks great so far. Nice job!
Looks awesome!
Hello @kenshaw,
Can you point me to where exactly the `Pool` code is located? I have not been able to find it so far.
Thanks.
UPD: Found commit 42c6cca with the pool. The code has been removed from master :(