raff / godet
Remote client for Chrome DevTools
License: MIT License
Hello, I am trying to follow your example to open a link in a newly activated tab. However, the link always opens in an already existing tab. What am I doing wrong?
err = remote.SetControlNavigation(true)
if err != nil {
panic(err)
}
// create new tab
tab, err := remote.NewTab("https://www.google.com")
log.Printf("%+v\n", tab)
// navigate in existing tab
err = remote.ActivateTab(tab)
if err != nil {
panic(err)
}
_, err = remote.Navigate("https://www.google.com") // navigates different tab
if err != nil {
panic(err)
}
err = remote.CloseTab(tab) // closes the tab that I opened, ok
if err != nil {
panic(err)
}
Sometimes calling remote.Close() while the websocket is busy communicating results in a panic:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x122dc67]
goroutine 37 [running]:
github.com/gorilla/websocket.(*Conn).WriteMessage(0x0, 0x1, 0xc420222000, 0x8d, 0xb9, 0x0, 0x0)
/Users/iafanasyev/.go/src/github.com/gorilla/websocket/conn.go:735 +0x37
github.com/raff/godet.(*RemoteDebugger).sendMessages(0xc420164120)
/Users/iafanasyev/.go/src/github.com/raff/godet/godet.go:394 +0x1b5
created by github.com/raff/godet.Connect
/Users/iafanasyev/.go/src/github.com/raff/godet/godet.go:235 +0x267
In my case remote.Close() is called from a goroutine.
When running your example from the README verbatim (apart from wrapping it in a func main()) on my Mac, I get this:
$ go run main.go
REQUEST: GET /json/list HTTP/1.1
Host: localhost:9222
ERROR: dial tcp [::1]:9222: getsockopt: connection refused REQUEST: "GET /json/list HTTP/1.1\r\nHost: localhost:9222\r\n\r\n"
panic: runtime error: invalid memory address or nil pointer dereference
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1226ba6]
goroutine 1 [running]:
github.com/raff/godet.(*RemoteDebugger).Close(0x0, 0x0, 0x13f5240)
/Users/xxxxxxxxxxxx/.go/src/github.com/raff/godet/godet.go:321 +0x26
panic(0x126da60, 0x1422700)
/usr/local/Cellar/go/1.9.1/libexec/src/runtime/panic.go:491 +0x283
github.com/raff/godet.(*RemoteDebugger).Version(0x0, 0x12ccf58, 0x0, 0x0)
/Users/xxxxxxxxxxxx/.go/src/github.com/raff/godet/godet.go:505 +0x26
main.main()
/Users/xxxxxxxxxxxx/Documents/Work/Personal/Go/headless_chrome/main.go:17 +0x8a
exit status 2
The error seems to be valid (I guess Chrome needs to be run in a special mode in order to listen on that port), but the program shouldn't panic.
From the trace it seems that a nil remote variable is passed down to the Close() func, and then the Lock() method is called on it in this line: https://github.com/raff/godet/blob/master/godet.go#L321
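A minimal sketch of one way to avoid this kind of panic, assuming a nil check at the top of the method is acceptable. The `debugger` type, `errNilDebugger`, and `connect` are hypothetical stand-ins written for illustration; they are not godet's actual code.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// debugger is a hypothetical stand-in for a client type such as
// godet.RemoteDebugger; it is not the library's actual struct.
type debugger struct {
	mu     sync.Mutex
	closed bool
}

// errNilDebugger is returned when a method is called on a nil *debugger.
var errNilDebugger = errors.New("debugger is nil (connect failed?)")

// Close checks for a nil receiver before touching the mutex, so a failed
// connect does not turn into a nil pointer dereference.
func (d *debugger) Close() error {
	if d == nil {
		return errNilDebugger
	}
	d.mu.Lock()
	defer d.mu.Unlock()
	d.closed = true
	return nil
}

// connect simulates a failed connection: it returns a nil client and an error.
func connect() (*debugger, error) {
	return nil, errors.New("dial tcp [::1]:9222: connection refused")
}

func main() {
	remote, err := connect()
	if err != nil {
		fmt.Println("cannot connect:", err)
		// Calling Close on the nil client returns an error instead of panicking.
		fmt.Println("close:", remote.Close())
		return
	}
	defer remote.Close()
}
```

On the caller side, returning right after a failed connect (before any deferred Close or other method call) avoids the problem regardless of what the library does.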
Does this library provide a way to generate a HAR file, like browsermob-proxy?
Is it possible to open multiple profiles in multiple tabs? If so, how?
Thank you so much.
Hi @raff, I just wanted to make you aware of my cdp-go project which has the same goal as yours: https://godoc.org/github.com/neelance/cdp-go
Key differences are that I leverage the net/rpc package to do some heavy lifting for the RPC and that I automatically generate the API from the browser_protocol.json found in the Chromium source code. The project is WIP, but I wanted to tell you about it so it maybe saves you some time.
Hello, I'm very new to Go, so please excuse me if I made a silly mistake.
My machine is on Windows, using Go v1.16.6. Whenever I call remote.SetCookie, such as this:
remote.SetCookie(godet.Cookie{Name: "Foo", Value: "Bar"})
I get this error, every time:
panic: interface conversion: interface {} is nil, not bool
goroutine 1 [running]:
github.com/raff/godet.(*RemoteDebugger).SetCookie(0xc0001921e0, 0x8c9be3, 0x3, 0x8c9bb9, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
C:/Users/Jandro/go/pkg/mod/github.com/raff/[email protected]/godet.go:1055 +0x598
main.main()
E:/Users/Jandro/Files/Programming/go/godet-set-cookie/src/main.go:9 +0x98
exit status 2
Seems like this line is the culprit. As I'm a beginner, I don't really understand the syntax to know what's happening, so I can't make any suggestions.
Here's an example repo that recreates the error (For me, at least).
RemoteDebugger.EvaluateWrap(jsString) does not work if len(jsString) > 2000.
After I call remote.Close() (or when activating another tab), the library always spits these two error messages in the log:
2018/02/14 21:55:10 read message: read tcp 127.0.0.1:57401->127.0.0.1:32930: use of closed network connection
2018/02/14 21:55:10 permanent network error
These errors are harmless (websocket connection is being closed anyway, that's why reading from a websocket fails in the first place), but they are distracting and can't be muted.
I solved this locally by introducing an isClosing flag, raised in the Close() function, that can be checked in the websocket message reading loop, but I'm not sure if this is the best way to handle the error. I will provide a PR soon for reference.
On line 270 in godet.go, instead of setting the value to nil as below
remote.responses[reqID] = nil
we should remove it from the map
delete(remote.responses, reqID)
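A small sketch of the difference between the two statements. The map type and the helper names are assumptions made for the example, not taken from the library.

```go
package main

import "fmt"

// clearByNil keeps the key in the map and only drops the value.
func clearByNil(m map[int]chan []byte, id int) {
	m[id] = nil
}

// clearByDelete removes the key, so the map does not keep growing.
func clearByDelete(m map[int]chan []byte, id int) {
	delete(m, id)
}

func main() {
	responses := map[int]chan []byte{
		1: make(chan []byte),
		2: make(chan []byte),
	}

	clearByNil(responses, 1)
	clearByDelete(responses, 2)

	_, has1 := responses[1]
	_, has2 := responses[2]
	fmt.Println("entries:", len(responses), "key 1 present:", has1, "key 2 present:", has2)
}
```

With one entry per request ID, assigning nil leaves an entry behind for every request ever sent, while delete releases it.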
Add a headers parameter to Connect to allow modifying the headers of the request. This is needed specifically to send a Host header to Chrome instances so that they don't reject requests from clients residing outside the machine where Chrome is running.
There are places such as https://github.com/raff/godet/blob/master/godet.go#L531 where the headers are hardcoded to nil and where the header map could easily be integrated.
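For illustration, here is a sketch of applying caller-supplied headers using the standard net/http package rather than the HTTP client godet uses. The `newRequest` helper and the `chrome.internal` hostname are hypothetical. One detail worth noting: in net/http the Host header sent on the wire comes from `req.Host`, not from `req.Header`.

```go
package main

import (
	"fmt"
	"net/http"
)

// newRequest builds a GET request and applies caller-supplied headers.
// The Host header is special in net/http: it is taken from req.Host,
// so it is handled separately from the other headers.
func newRequest(url string, headers map[string]string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	for k, v := range headers {
		if http.CanonicalHeaderKey(k) == "Host" {
			req.Host = v
			continue
		}
		req.Header.Set(k, v)
	}
	return req, nil
}

func main() {
	// chrome.internal is a made-up hostname for a remote Chrome instance.
	req, err := newRequest("http://chrome.internal:9222/json/list",
		map[string]string{"Host": "localhost"})
	if err != nil {
		fmt.Println("bad request:", err)
		return
	}
	fmt.Println("connects to:", req.URL.Host, "sends Host:", req.Host)
}
```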
I get this when I try to connect to localhost:
cannot connect to Chrome instance: Get http://localhost:9222/json/list: dial tcp [::1]:9222: connect: connection refused
What's the reason?
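"Connection refused" generally means nothing is accepting connections on that port. A quick probe like the sketch below can confirm that before calling Connect; the `reachable` helper and `probeTimeout` constant are hypothetical names, and the hint in the message assumes Chrome is expected to be started with the --remote-debugging-port=9222 flag, as in the Dockerfile elsewhere on this page.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// probeTimeout bounds how long the probe waits for a TCP connection.
const probeTimeout = time.Second

// reachable reports whether something accepts TCP connections on addr.
func reachable(addr string, timeout time.Duration) bool {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

func main() {
	if !reachable("localhost:9222", probeTimeout) {
		fmt.Println("nothing is listening on localhost:9222; was Chrome started with --remote-debugging-port=9222?")
		return
	}
	fmt.Println("debugging port is open")
}
```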
Since a lot of APIs are not yet implemented in godet, I think exposing something like the sendRequest method would be useful.
This way, users can use godet to access APIs that are not yet implemented in godet. For example, I needed to capture the screenshot in a specific size, but I had no way to call Emulation.setVisibleSize with godet. If we had SendRequest, I could simply call it this way:
godet.SendRequest("Emulation.setVisibleSize", godet.Param{})
I'm interested in experimenting with godet and I would really appreciate it if the number of dependencies were limited. In this case it looks like the net/http client from the std lib would work well, and that would result in only 1 external dependency: Gorilla's websocket pkg (which might potentially be replaced by https://godoc.org/golang.org/x/net/websocket but that's a different issue altogether).
What do you think?
I'm trying to set the geolocation using the following code:
func setGeolocationOverride(remote *godet.RemoteDebugger, latitude, longitude, accuracy float64) error {
_, err := remote.SendRequest("Emulation.setGeolocationOverride", godet.Params{
"latitude": latitude,
"longitude": longitude,
"accuracy": accuracy,
})
return err
}
Did I miss something? When I check it via navigator.geolocation, it still shows my real geolocation.
remote.CallbackEvent("Network.responseReceived", func(params godet.Params) {
log.Printf("%s\t%d", params["response"].(map[string]interface{})["url"], int(params["response"].(map[string]interface{})["status"].(float64)))
})
remote.NetworkEvents(true)
//_, _ = remote.Navigate(link) //valid
tab, err := remote.NewTab(link)//invalid
I suggest you change this line
_ = <-done
to this:
start := time.Now()
var pageFound bool = false
DONE:
for time.Since(start) < time.Second*2 {
time.Sleep(time.Millisecond * 200)
select {
case pageFound = <-done:
break DONE
default:
}
}
if ! pageFound {
return errors.New("Page Not found")
}
If the domain does not exist, the original line blocks forever.
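The same timeout can also be expressed without a polling loop, using select with time.After. This is a sketch under the assumption that done is a chan bool as in the snippet above; `waitDone`, `errPageNotFound`, and `demoTimeout` are names made up for the example.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// demoTimeout is kept short so the example finishes quickly.
const demoTimeout = 100 * time.Millisecond

var errPageNotFound = errors.New("page not found")

// waitDone waits for a value on done and gives up after timeout.
func waitDone(done <-chan bool, timeout time.Duration) error {
	select {
	case found := <-done:
		if !found {
			return errPageNotFound
		}
		return nil
	case <-time.After(timeout):
		return errPageNotFound
	}
}

func main() {
	// Nothing ever sends on this channel, which simulates a domain that never loads.
	done := make(chan bool)
	fmt.Println("result:", waitDone(done, demoTimeout))
}
```

Compared with sleeping in 200 ms steps, this returns as soon as a value arrives and needs no label or flag variable.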
Closing godet.RemoteDebugger leaks a goroutine, raff/godet.(*RemoteDebugger).sendMessages. It seems to wait for a message which obviously is never going to come. I think this channel should be closed when calling remote.Close().
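A sketch of the suggestion, assuming the send loop ranges over a request channel. The `sender` type and its fields are hypothetical and only model the pattern; they do not mirror godet's internals.

```go
package main

import "fmt"

// sender is a hypothetical stand-in for the part of a client that owns
// the outgoing message queue.
type sender struct {
	requests chan string   // messages waiting to be written
	stopped  chan struct{} // closed when the send loop has exited
}

func newSender() *sender {
	s := &sender{requests: make(chan string), stopped: make(chan struct{})}
	go s.sendMessages()
	return s
}

// sendMessages drains the queue until it is closed, then exits.
func (s *sender) sendMessages() {
	defer close(s.stopped)
	for msg := range s.requests {
		_ = msg // a real client would write msg to the websocket here
	}
}

// Close closes the queue, which ends the range loop, and waits for the
// goroutine to finish so it cannot leak.
func (s *sender) Close() {
	close(s.requests)
	<-s.stopped
}

func main() {
	s := newSender()
	s.requests <- "Page.enable"
	s.Close()
	fmt.Println("send loop exited")
}
```

One caveat with this design: sending on a closed channel panics, so callers must not enqueue requests after Close.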
Hi! I'm using godet for web scraping and would like to know whether I can leave the Chrome instance that godet uses open and get the HTML of pages from it without having to create a new tab and navigate through it, because that takes a lot of machine work. Since I only need the HTML (which a plain GET request can fetch), it might be possible. Is it?
Hi, is it possible to set a proxy for each tab?
It would be nice if EvaluateError returned the description, lineNumber and columnNumber when the error is of type "SyntaxError".
Hi guys, I found something strange. I run google-chrome-stable in Docker (the Dockerfile is pasted at the end) and then run the following code. It runs fine, but after a while the Chrome container crashes with exit code 135. Does that mean Chrome was killed by SIGBUS?
docker ps -a | grep d358
d358dbec2cf8 yangluo/chrome "google-chrome-sta..." 2 weeks ago Exited (135) About a minute ago practical_allen
Actually, what I need is to create a few tabs (targets), navigate them to URLs simultaneously, and finally capture screenshots. If I capture the screenshots one by one, it would be very slow.
package main
import (
"github.com/raff/godet"
"fmt"
)
var (
urlList = []string {
"https://github.com",
"https://github.com/beetbox/beets",
"https://github.com/CreateJS/SoundJS",
"https://github.com/Soundnode/soundnode-app",
"https://github.com/gillesdemey/Cumulus",
"https://github.com/AudioKit/AudioKit",
"https://github.com/mopidy/mopidy",
"https://github.com/cashmusic/platform",
"https://github.com/musescore/MuseScore",
}
)
func main() {
remote, err := godet.Connect("localhost:9222", true)
if err != nil {
fmt.Println("cannot connect to Chrome instance:", err)
return
}
defer remote.Close()
remote.RuntimeEvents(true)
remote.NetworkEvents(true)
remote.PageEvents(true)
for _, url := range urlList {
tab, _ := remote.NewTab(url)
fmt.Println("new tab:", tab, url)
}
fmt.Println("End.")
}
The Dockerfile is as follows:
# Base docker image
FROM debian:sid
LABEL name="chrome-headless" \
maintainer="Shenweimn <[email protected]>" \
description="Google Chrome Headless in a container"
# Install deps + add Chrome Stable + purge all the things
RUN apt-get update && apt-get -y install \
apt-transport-https \
ca-certificates \
curl \
gnupg \
--no-install-recommends \
&& curl -sSL https://dl.google.com/linux/linux_signing_key.pub | apt-key add - \
&& echo "deb [arch=amd64] https://dl.google.com/linux/chrome/deb/ stable main" > /etc/apt/sources.list.d/google-chrome.list \
&& apt-get update \
&& apt-get -y install google-chrome-stable --no-install-recommends \
&& apt-get -y install fonts-droid ttf-wqy-zenhei ttf-wqy-microhei fonts-arphic-ukai fonts-arphic-uming
#&& apt-get purge --auto-remove -y curl gnupg \
#&& rm -rf /var/lib/apt/lists/*
# Add Chrome as a user
RUN groupadd -r chrome && useradd -r -g chrome -G audio,video chrome \
&& mkdir -p /home/chrome && chown -R chrome:chrome /home/chrome
# Run Chrome non-privileged
USER chrome
# Expose port 9222
EXPOSE 9222
# Autorun chrome headless with no GPU
ENTRYPOINT [ "google-chrome-stable" ]
CMD [ "--headless", "--disable-gpu","--hide-scrollbars", "--remote-debugging-address=0.0.0.0", "--remote-debugging-port=9222" ]
Run the container this way:
docker run -d --cap-add=SYS_ADMIN -p 9222:9222 yangluo/multi-chrome
go run main.go
# github.com/gobs/httpclient
..\..\..\github.com\gobs\httpclient\httpclient.go:419:14: undefined: reuseport.Dialer
Windows 10 64bit
Chrome: 72.0.3626.81
Go version: 1.11
Hello.
remote.CallbackEvent(eventname,*);
Where does eventname come from?
What does it contain?
Forgive my bad English.
Thanks!
rem, err := godet.Connect("localhost:9222", false)
// err == nil
Then activating a tab fails:
err := remote.ActivateTab(tab)
// err = websocket: bad handshake
Recovery is not possible unless I restart the process.
Apart from that, here is another use case that is not resilient:
closing a tab in Chrome and then trying to use it (I keep a slice of tabs locally).
2017/05/11 23:24:20 read message: read tcp 127.0.0.1:49192->127.0.0.1:9222: use of closed network connection // after closing it in chrome
2017/05/11 23:24:21 <nil> // after program call to: err := remote.ActivateTab(tab) log.Println(err)
Ok, connection was closed, but hey - no error;)
So I wrote https://github.com/icco/archive.city/blob/master/main.go, heavily basing it off of your two great examples. All of my screenshots end up blank though. All I want to do is load a web page, take a screenshot of it (eventually I'd love a fullscreen screenshot, but that's for later), and exit.
I want to use remote.EnableRequestPaused(true, godet.FetchRequestPattern{UrlPattern: "", RequestStage: godet.RequestStageResponse}, godet.FetchRequestPattern{UrlPattern: "", RequestStage: godet.RequestStageRequest}) to pause the response.
After that I use
data, _ := remote.FetchResponseBody(requestId) to get the response, try to modify the body, and then call
remote.FulfillRequest(requestId, 200, "", nil, []byte(base64.StdEncoding.EncodeToString(data)))
It looks like the request cannot be sent, even when I use the right headers.
Hi,
are there any pointers on how I would go about taking a screenshot with this or something like this?
I'm not aware if this is something headless chrome can do, or the operating system.
thanks!
Is it possible to have godet send Selenium commands? For instance, if I have these commands, how can I get the Chrome browser to execute them using godet? Thanks!
{
"id": "858ff576-3ebd-4a31-baee-62fa56598c8e",
"version": "2.0",
"name": "JS",
"url": "http://localhost:3000",
"tests": [{
"id": "10039cd8-4364-4e97-aa42-d97f85b3c8b8",
"name": "JSTest",
"commands": [{
"id": "011beeea-932b-4ff3-92f4-86748ec0dff5",
"comment": "",
"command": "open",
"target": "/",
"targets": [],
"value": ""
}, {
"id": "5354988e-8d7b-4c1f-99fa-2c999b5484b1",
"comment": "",
"command": "setWindowSize",
"target": "1114x790",
"targets": [],
"value": ""
}, {
"id": "c831ff0b-a38f-495a-b222-4ccbd460195a",
"comment": "",
"command": "mouseOver",
"target": "css=.buttons:nth-child(8)",
"targets": [
["css=.buttons:nth-child(8)", "css:finder"],
["xpath=//button[5]", "xpath:position"]
],
"value": ""
SetInputFiles sends a DOM.setInputFiles request, but the protocol method has been renamed from setInputFiles to setFileInputFiles:
https://chromedevtools.github.io/devtools-protocol/tot/DOM/#method-setFileInputFiles
As the title says, Chrome loads all resources of the page only when I call the "SavePDF" function; otherwise it can't load jpg, html, etc. Only js and css can be loaded. My code looks like this:
remote.SetCacheDisabled(false)
remote.NetworkEvents(true)
remote.PageEvents(true)
remote.DOMEvents(true)
_, _ = remote.Navigate("http://zat00.aytest.top")
_ = remote.SavePDF("page.pdf", 0644)
However, I don't want to call SavePDF, because I don't want to create a file. So I want to know which event makes Chrome load all resources, and how I can trigger it.
This code does not close the connection:
// disconnect when done
defer remote.Close()
I tried to crawl a large number of pages and, as time went on, Chrome started to eat a lot of memory. This didn't happen when I changed the defer code to this:
// disconnect when done
defer func(){ remote.Close() }()
Just saying.
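One possible explanation, assuming remote is reassigned after the defer statement (the report does not show the surrounding code): Go evaluates the receiver of a deferred method call when the defer statement runs, whereas a closure reads the variable when the function returns. The `conn` type below is a hypothetical stand-in that records which instance gets closed.

```go
package main

import "fmt"

// conn is a hypothetical connection that records which instance was closed.
type conn struct {
	id     int
	closed *[]int
}

func (c *conn) Close() { *c.closed = append(*c.closed, c.id) }

// deferMethod defers the method call directly: the receiver is evaluated
// when the defer statement runs, so the first connection is the one closed.
func deferMethod(closed *[]int) {
	c := &conn{id: 1, closed: closed}
	defer c.Close()
	c = &conn{id: 2, closed: closed} // reassigned after the defer
}

// deferClosure wraps the call in a closure: c is read when the function
// returns, so the most recently assigned connection is the one closed.
func deferClosure(closed *[]int) {
	c := &conn{id: 1, closed: closed}
	defer func() { c.Close() }()
	c = &conn{id: 2, closed: closed}
}

func main() {
	var direct, wrapped []int
	deferMethod(&direct)
	deferClosure(&wrapped)
	fmt.Println("direct defer closed:", direct, "closure defer closed:", wrapped)
}
```

If remote is never reassigned, the two forms behave the same and the memory growth would need a different explanation.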
I found the StartProfiler and StopProfiler functions for creating a JS profile, but recent versions of Chrome now show this in the Dev Tools:
The JavaScript CPU profiler will be removed shortly.
I think this happened after version 59. Seems like Google wants people to move to using Timeline profiles, I think. Or maybe some new profile that's a combination of Timeline and CPU profiles.
https://developers.google.com/web/tools/chrome-devtools/evaluate-performance/reference
Can godet create profiles compatible with the Performance panel?
panic: interface conversion: interface {} is nil, not string
goroutine 42 [running]:
github.com/raff/godet.(*RemoteDebugger).GetResponseBody(0xc04204c1e0, 0xc042493490, 0xd, 0x9, 0xc0434f4508, 0x0, 0x0, 0xc042c6de18)
F:/Code/go_project/src/github.com/raff/godet/godet.go:897 +0x332
github.com/soulbird/elastic/chrome.(*Chrome).networkResponseReceived(0xc042090a00, 0xc0429f2810)
F:/Code/go_project/src/github.com/soulbird/elastic/chrome/chrome.go:131 +0x566
github.com/soulbird/elastic/chrome.(*Chrome).(github.com/soulbird/elastic/chrome.networkResponseReceived)-fm(0xc0429f2810)
F:/Code/go_project/src/github.com/soulbird/elastic/chrome/chrome.go:211 +0x3b
github.com/raff/godet.(*RemoteDebugger).processEvents(0xc04204c1e0)
F:/Code/go_project/src/github.com/raff/godet/godet.go:531 +0x35b
created by github.com/raff/godet.Connect
F:/Code/go_project/src/github.com/raff/godet/godet.go:252 +0x229
Firstly, thank you very much for this wonderful library.
There is a Chrome extension that opens in a separate window. How can I access and control this window?
Inspect Element is not available for this window.
The extension is the Line Chrome extension.
The purpose is to protect the groups I created from intruders.
The Line API does not provide the control feature I need.
The solution would be to build a controller through the browser, or anything else that works.
Thank you so much.
Hi. I tried to get the status code of the requested url with code like this:
...
remote.CallbackEvent("Network.responseReceived", func(params godet.Params) {
resp := params.Map("response")
url = resp["url"].(string)
statusCode = int(resp["status"].(float64))
fmt.Println(statusCode, url)
})
tab, err := remote.NewTab("https://www.somewhere.com")
...
but when I run it, I get this:
200 https://www.somewhere.com/Content/css?v=nzhr89jyV
200 https:/www.somewhere.com/bundles/angularjs?v=7xwsovO
200 https://www.somewhere.com/bundles/js?v=C3adeImoIV
200 https://www.somewhere.com/bundles/app?v=vp6dm1p3JfVY
200 https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js
...
The url https://www.somewhere.com is not among them. How can I get the page's own status code?
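In the DevTools protocol, each Network.responseReceived event carries a type field, and the page itself is reported with type "Document". A sketch of a filter based on that, assuming the event for the main document is delivered to the callback at all; `documentStatus` is a hypothetical helper that works on a plain map so it can run without a browser.

```go
package main

import "fmt"

// documentStatus returns the URL and status code of a
// Network.responseReceived event when the response is for the page
// document itself, and ok=false for scripts, stylesheets, images, etc.
func documentStatus(params map[string]interface{}) (url string, status int, ok bool) {
	if t, _ := params["type"].(string); t != "Document" {
		return "", 0, false
	}
	resp, _ := params["response"].(map[string]interface{})
	url, _ = resp["url"].(string)
	code, isNum := resp["status"].(float64) // JSON numbers decode as float64
	if !isNum {
		return "", 0, false
	}
	return url, int(code), true
}

func main() {
	// Hand-written sample events shaped like the protocol's payload.
	events := []map[string]interface{}{
		{"type": "Stylesheet", "response": map[string]interface{}{"url": "https://www.somewhere.com/Content/css", "status": 200.0}},
		{"type": "Document", "response": map[string]interface{}{"url": "https://www.somewhere.com/", "status": 200.0}},
	}
	for _, ev := range events {
		if url, status, ok := documentStatus(ev); ok {
			fmt.Println(status, url)
		}
	}
}
```

Inside a callback, the params value received from the library could be passed to such a helper after converting it to a plain map.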
I want to block all downloads, but I cannot find a way to set this.
Selenium can set download_restrictions: 3 to achieve this.
Remote.SetDownloadBehavior(godet.DenyDownload,"/A:")
Tab, err = Remote.NewTab("https://dl.google.com/tag/s/appguid%3D%7B8A69D345-D564-463C-AFF1-A69D9E530F96%7D%26iid%3D%7BF49017DD-2E64-6292-C8FB-E1121560FAAA%7D%26lang%3Dzh-CN%26browser%3D3%26usagestats%3D0%26appname%3DGoogle%2520Chrome%26needsadmin%3Dprefers%26ap%3Dx64-stable-statsdef_1%26brand%3DYTUH%26installdataindex%3Dempty/update2/installers/ChromeSetup.exe")
This does not work.
There must be a way to do this with godet. How do I set it? Thank you.
When using NewTab() immediately after calling Connect(), it is possible that the readMessages() goroutine started by Connect() hasn't started before NewTab() changes remote.ws. NewTab() then also starts a new readMessages() goroutine, which will try to read on the same remote.ws as the old goroutine. gorilla/websocket does not support concurrent readers on one connection, which means this will cause a panic:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0xa8 pc=0x9dad46]
goroutine 277 [running]:
github.com/gorilla/websocket.(*Conn).NextReader(0x0, 0x3, 0xc423f99ce8, 0xc423f99ce8, 0x9e4252, 0xc423e7a98c)
/home/erik/src/github.com/gorilla/websocket/conn.go:931 +0x26
github.com/gorilla/websocket.(*Conn).ReadMessage(0x0, 0x0, 0xc4200ca100, 0xc42001219a, 0xc4200661b8, 0xc4200121c0, 0x3)
/home/erik/src/github.com/gorilla/websocket/conn.go:1021 +0x2f
github.com/raff/godet.(*RemoteDebugger).readMessages(0xc423e7a960)
/home/erik/src/github.com/raff/godet/godet.go:426 +0x7c
created by github.com/raff/godet.(*RemoteDebugger).connectWs
/home/erik/src/github.com/raff/godet/godet.go:308 +0x42f
One solution to fix this would be to add the ws as an argument to readMessages().
But I was wondering how open you are to changing the API. The current way tabs are implemented is too limited to be useful. Right now, to support multiple tabs, I'm forced to create multiple RemoteDebugger instances and connect them to the different tabs. It would be much nicer if godet supported this in a cleaner way. I propose that most of the methods from RemoteDebugger are moved to the Tab struct. This, combined with each tab getting its own websocket, would mean tab support gets a lot better. It would allow multiple tabs to be used independently without having to call ActivateTab() all the time and without all tabs sharing the same event handlers.
In theory this could also be done in a backwards compatible way if we keep track of the active tab in RemoteDebugger and forward all the methods to this active tab. But of course a cleaner solution would be to break backwards compatibility and move all methods to the Tab struct.
<html>
<script>alert(1);alert(2);alert(3);</script>
</html>
remote.HandleJavaScriptDialog(true, "") //handle one
Currently the code depends on the standard "log" package and there's no way to disable logging (which is quite verbose).
It would be nice to be able to disable logging entirely, or to expose a Logger interface with one Println method that an external consumer could implement (via e.g. a SetLogger(logger Logger) method). In the second case, providing a nil logger could also mute the logging.
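The proposal above can be sketched as follows. The `client` type, `nopLogger`, and `capture` are hypothetical names for the example; only the Logger interface and SetLogger come from the text, and the standard *log.Logger already satisfies such an interface.

```go
package main

import (
	"fmt"
	"log"
	"os"
)

// Logger is the one-method interface proposed above.
type Logger interface {
	Println(v ...interface{})
}

// nopLogger discards everything; it backs the "nil mutes logging" behaviour.
type nopLogger struct{}

func (nopLogger) Println(v ...interface{}) {}

// client is a hypothetical type standing in for the library's debugger.
type client struct {
	logger Logger
}

// newClient defaults to the standard logger writing to stderr.
func newClient() *client {
	return &client{logger: log.New(os.Stderr, "", log.LstdFlags)}
}

// SetLogger replaces the logger; passing nil disables logging.
func (c *client) SetLogger(l Logger) {
	if l == nil {
		l = nopLogger{}
	}
	c.logger = l
}

// capture is a sample consumer-provided logger that records lines.
type capture struct{ lines []string }

func (c *capture) Println(v ...interface{}) {
	c.lines = append(c.lines, fmt.Sprint(v...))
}

func main() {
	c := newClient()
	rec := &capture{}

	c.SetLogger(rec)
	c.logger.Println("read message: use of closed network connection")

	c.SetLogger(nil)
	c.logger.Println("this line is discarded")

	fmt.Println("captured lines:", len(rec.lines))
}
```

Swapping nil for a no-op implementation inside SetLogger keeps every call site free of nil checks.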
Go version : 1.18.1
Platform: Windows 11
Problem: a JavaScript alert blocks the loading process
Code:
package main
import (
"log"
"time"
"github.com/raff/godet"
)
func main() {
remote, err := godet.Connect("localhost:9222", true)
if err != nil {
log.Println("cannot connect to Chrome instance:", err)
return // remote is not usable after a failed connect
}
defer remote.Close()
remote.CallbackEvent("Page.javascriptDialogOpening", func(params godet.Params) {
remote.HandleJavaScriptDialog(true, "")
})
remote.PageEvents(true)
tab, _ := remote.NewTab("DomainWithAlertBox") //blocked by an alertbox
time.Sleep(8 * time.Second) //to allow time for the alert to appear
remote.EvaluateWrap("return 1")
defer remote.CloseTab(tab)
}
You may want to reopen the previous issue about removing this dependency.
go get github.com/raff/godet
github.com\gobs\httpclient\httpclient.go:419:14: undefined: reuseport.Dialer
I do it like this:
remote.FullfillRequest(params["requestId"].(string),
200,
"OK",
responseHeaders,
[]byte("============"),
)
but it does not work!