Comments (15)
Because the package has to enable it for the base actions to work out of the box. As such, there's no need for an additional call.
from chromedp.
@edoardottt I'd need to see the rest of the script to really understand what you're doing. I quickly wrote this just now, which properly returns the javascript value:
package main

import (
	"context"
	"flag"
	"fmt"
	"os"

	"github.com/chromedp/cdproto/network"
	"github.com/chromedp/chromedp"
)

func main() {
	urlstr := flag.String("url", "https://google.com/", "url")
	flag.Parse()
	if err := run(context.Background(), *urlstr); err != nil {
		fmt.Fprintf(os.Stderr, "error: %v\n", err)
		os.Exit(1)
	}
}

func run(ctx context.Context, urlstr string) error {
	ctx, cancel := chromedp.NewContext(ctx)
	defer cancel()
	const script = `"a string"`
	headers := network.Headers{
		"x-header": "a header",
	}
	var res string
	err := chromedp.Run(ctx,
		network.Enable(),
		network.SetExtraHTTPHeaders(headers),
		chromedp.Navigate(urlstr),
		chromedp.EvaluateAsDevTools(script, &res),
	)
	fmt.Fprintf(os.Stdout, "err: %v\ngot: %q\n", err, res)
	return err
}
Running it:
$ go run main.go
err: <nil>
got: "a string"
@edoardottt I don't usually inject many headers when scraping with chromedp, as I'm probably not doing anything that would "need" a full-blown Chrome instance when manipulating the headers directly. However, if I had to guess, it's maybe because you're sending a non-string type as the value? Please note that network.Headers is a map[string]interface{}, which corresponds to the generic JSON object type.
From the PDL file in the Chromium source tree, you can see that the Network.Headers type is defined like this:
# Network domain allows tracking network activities of the page. It exposes information about http,
# file, data and other requests and responses, their headers, bodies, timing, etc.
domain Network
  depends on Debugger
  depends on Runtime
  depends on Security

  # Request / response headers as keys / values of JSON object.
  type Headers extends object
I am inferring here that Chrome is rejecting the set-header request because of badly formatted data. I would try changing the values you're sending to strings. A non-string value is likely causing a silent error that you're not catching in your script.
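To illustrate the "send strings" suggestion, here is a minimal sketch that builds a header map whose values are always Go strings. The helper name parseHeaders and the "Name: value" input format are assumptions made for this example; they are not part of chromedp or of the project being discussed.

```go
package main

import (
	"fmt"
	"strings"
)

// parseHeaders is a hypothetical helper (not part of chromedp). It turns
// "Name: value" strings into a map[string]interface{} whose values are
// always Go strings, so the map can be converted to network.Headers.
func parseHeaders(raw []string) (map[string]interface{}, error) {
	headers := make(map[string]interface{}, len(raw))
	for _, h := range raw {
		// Split on the first colon only, so values may contain colons.
		name, value, ok := strings.Cut(h, ":")
		name = strings.TrimSpace(name)
		if !ok || name == "" {
			return nil, fmt.Errorf("invalid header %q, want \"Name: value\"", h)
		}
		headers[name] = strings.TrimSpace(value)
	}
	return headers, nil
}

func main() {
	headers, err := parseHeaders([]string{"test:test", "X-Header: a header"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Print each header with its dynamic type to confirm it is a string.
	for name, value := range headers {
		fmt.Printf("%s => %q (%T)\n", name, value, value)
	}
}
```

The resulting map could then be passed along as network.Headers(headers), in the same way as the header map in the earlier example.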
Apologies if this is the case, but chromedp/cdp more or less respects the defined protocol. I'll do some testing on my end to see if this is the likely cause.
I'm looking through your pphack repo that you linked here, and I don't see the specific payload you're trying to inject. Could you share an actual complex script and the actual header values you're injecting?
Hi @kenshaw, thank you so much for your reply.
I've added a new branch (https://github.com/edoardottt/pphack/tree/add-headers) in order to show how I use the headers in chromedp.
To test, I comment out these two lines (https://github.com/edoardottt/pphack/blob/add-headers/pkg/scan/chrome.go#L50-L51, network.Enable and network.SetExtraHTTPHeaders). Then I execute:
echo https://edoardottt.github.io/pp-test/ | go run cmd/pphack/main.go -H "test:test" -v
and the output is https://edoardottt.github.io/pp-test/?constructor.prototype. ...
confirming that the err is nil and the JS evaluation is performed correctly.
If instead I leave those two lines in (not commented out) and run the same command, I get no standard output and this error:
[ERR] encountered an undefined value
Using a proxy I can see Test: test in the HTTP headers, so I guess the headers are set correctly.
I'll look at this further. BTW -- if you haven't already, you should try turning on the debug logging to see the messages going back and forth, as it might be helpful:
ectx, ecancel := chromedp.NewExecAllocator(context.Background(), copts...)
pctx, pcancel := chromedp.NewContext(ectx, chromedp.WithDebugf(log.Printf))
(in your project's scan/chrome.go)
Are you expecting different results based on the User-Agent?
Regarding the debug logging, I've looked at it, but I don't see anything weird, to be honest. If someone understands it better, I can provide those logs too. As far as I can tell, the request is sent with the proper headers.
Are you expecting different results based on the User-Agent?
No, I just want to add some extra headers
So -- I believe the network.Enable() call is being called, and yours is resetting the UA. From what I can tell from the output, chromedp is working as intended, as it appears from the cdp protocol messages that everything is sent/received correctly.
Specifically, the error you are getting is because the JS value window.xxxx is not present. That value can't be unmarshaled to a string, as undefined doesn't have a corresponding value in Go that it could be unmarshaled to. The error here should be more of an "invalid destination type" or some such. Note that you could capture the actual raw value and then check after the fact whether it is a string or something else.
So -- I believe the network.Enable() call is being called, and yours is resetting the UA.
So I have to use something like SetUserAgentOverride for this, but that's not the point here...
window.xxxx is not present
How? Why should setting an extra HTTP header like Test: test change the JS evaluation of a static website? I still don't understand.
I've checked and this behavior is present in other similar tools, e.g. https://github.com/kosmosec/proto-find/
I have no idea why it's not present. You can play around with this code:
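func Scan(ctx context.Context, headers map[string]interface{}, js, targetURL string) (string, error) {
	// With a *runtime.RemoteObject destination, res receives the raw
	// protocol result and no conversion to a Go type is attempted.
	var res *runtime.RemoteObject
	err := chromedp.Run(ctx, chromedp.Tasks{
		network.SetExtraHTTPHeaders(network.Headers(headers)),
		chromedp.Navigate(targetURL),
		chromedp.EvaluateAsDevTools(js, &res),
	})
	var s string
	// res stays nil if Run failed before the evaluation completed.
	if res != nil && res.Type == runtime.TypeString { // this is also just "string"
		s = string(res.Value) // raw JSON encoding of the value
	}
	log.Printf("s: %q -- %v", s, err)
	return s, err
}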
Unfortunately, I'm not able to dig further into your code. Please update here if you find the issue.
Okay, thanks for your help though.
Why did you remove network.Enable? Is it not necessary?
@kenshaw Using the debug I've got something:
- chromedp-logs-no-header.txt (log file without using network.SetExtraHTTPHeaders)
- chromedp-logs-with-header.txt (log file using network.SetExtraHTTPHeaders)
In the second one there's the error:
2024/02/05 09:42:53 <- {"method":"Runtime.exceptionThrown","params":{"timestamp":1.707122573427766e+12,"exceptionDetails":{"exceptionId":1,"text":"Uncaught","lineNumber":244,"columnNumber":2,"scriptId":"4","url":"https://rawcdn.githack.com/alrusdi/jquery-plugin-query-object/9e5871fbb531c5e246aac2aaf056b237bc7cc0a6/jquery.query-object.js","stackTrace":{"callFrames":[{"functionName":"","scriptId":"4","url":"https://rawcdn.githack.com/alrusdi/jquery-plugin-query-object/9e5871fbb531c5e246aac2aaf056b237bc7cc0a6/jquery.query-object.js","lineNumber":244,"columnNumber":2}]},"exception":{"type":"object","subtype":"error","className":"ReferenceError","description":"ReferenceError: jQuery is not defined\n at https://rawcdn.githack.com/alrusdi/jquery-plugin-query-object/9e5871fbb531c5e246aac2aaf056b237bc7cc0a6/jquery.query-object.js:245:3","objectId":"7735756507232330443.2.1","preview":{"type":"object","subtype":"error","description":"ReferenceError: jQuery is not defined\n at https://rawcdn.githack.com/alrusdi/jquery-plugin-query-object/9e5871fbb531c5e246aac2aaf056b237bc7cc0a6/jquery.query-object.js:245:3","overflow":false,"properties":[{"name":"stack","type":"string","value":"ReferenceError: jQuery is not defined\n at https\u2026c2aaf056b237bc7cc0a6/jquery.query-object.js:245:3"},{"name":"message","type":"string","value":"jQuery is not defined"}]}},"executionContextId":2}},"sessionId":"5EB446D1FB128D2499FB60BC9B58875C"}
It seems that using the header changes something and jQuery is not loading properly. TBH it's hard to believe it's a problem with the website, as it's static and always returns the same content.
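For anyone who wants to scan such debug logs for page exceptions rather than reading them by eye, here is a small sketch using only the Go standard library. It assumes each log line has the shape shown above (a timestamp, an arrow, then one JSON message). The helper name exceptionFromLogLine and the shortened sample line are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// cdpMessage models only the fields of a protocol event that this sketch
// reads. All other fields in the JSON are ignored by the decoder.
type cdpMessage struct {
	Method string `json:"method"`
	Params struct {
		ExceptionDetails struct {
			URL       string `json:"url"`
			Exception struct {
				Description string `json:"description"`
			} `json:"exception"`
		} `json:"exceptionDetails"`
	} `json:"params"`
}

// exceptionFromLogLine is a hypothetical helper. It returns the script URL
// and the exception description if the line holds a
// Runtime.exceptionThrown event, and ok == false otherwise.
func exceptionFromLogLine(line string) (url, desc string, ok bool) {
	// The JSON payload starts at the first opening brace on the line.
	i := strings.Index(line, "{")
	if i < 0 {
		return "", "", false
	}
	var msg cdpMessage
	if err := json.Unmarshal([]byte(line[i:]), &msg); err != nil {
		return "", "", false
	}
	if msg.Method != "Runtime.exceptionThrown" {
		return "", "", false
	}
	d := msg.Params.ExceptionDetails
	return d.URL, d.Exception.Description, true
}

func main() {
	// Shortened, made-up line in the same shape as the log entry above.
	line := `2024/02/05 09:42:53 <- {"method":"Runtime.exceptionThrown","params":{"exceptionDetails":{"url":"https://example.com/plugin.js","exception":{"description":"ReferenceError: jQuery is not defined"}}}}`
	if url, desc, ok := exceptionFromLogLine(line); ok {
		fmt.Printf("exception in %s: %s\n", url, desc)
	}
}
```

Running this over both log files and comparing the reported exceptions would show which scripts fail only when the extra header is set.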
Ok, glad you were able to figure it out!
I've got that clue, but I'm not able to solve the issue, @kenshaw.
As I wrote, I guess it's related to chromedp, but I don't know how to fix that behavior.
Hence, the issue should not be closed.