johntitus / node-horseman
Run PhantomJS from Node
License: MIT License
In order to run custom functions during a chain, I would like to propose a "do" function. The function can be added to actions.js and could be as simple as:
exports.do = function(fn){
return fn();
}
In the chain the user can use do like this:
horseman
.open('http://example.com')
.screenshot('test.png')
.do(function(){
updateProgress();
console.log('took a screenshot')
})
.type('input[name="Email"]', '[email protected]')
.click('input[name="signIn"]')
.do(function(){
console.log("signed in")
})
.finally(function(){
console.log('finished');
horseman.close();
});
The function might need some polishing, but you get the idea. Extra functionality could be to call a next() function within the do function, to continue the chain.
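A minimal sketch of how such a pass-through step could behave, using a small stand-in chain built on native promises rather than Horseman's actual internals. The Chain class and its method names are hypothetical and exist only for illustration. If the callback returns a promise, the chain waits for it, which plays the role of the suggested next().

```javascript
// Hypothetical stand-in for a chainable action queue; not Horseman's real code.
class Chain {
  constructor() {
    this.queue = Promise.resolve();
    this.steps = [];
  }

  // Queue a named action (stands in for open(), click(), and so on).
  action(name) {
    this.queue = this.queue.then(() => {
      this.steps.push(name);
    });
    return this;
  }

  // Run an arbitrary function between actions. If fn returns a promise,
  // the chain waits for it before continuing.
  do(fn) {
    this.queue = this.queue.then(() => fn());
    return this;
  }

  // Resolve with the recorded steps once every queued step has finished.
  done() {
    return this.queue.then(() => this.steps);
  }
}

const chain = new Chain();
const finished = chain
  .action('open')
  .do(() => {
    chain.steps.push('custom code');
  })
  .action('click')
  .done();

finished.then((steps) => console.log(steps.join(' -> ')));
// prints "open -> custom code -> click"
```

Queuing the callback on the same promise as the actions is what keeps the custom code in order relative to the surrounding steps.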
Hi johntitus,
First of all, thanks for the wonderful work!
I tried to create more than one horseman instance, each on a different PhantomJS port. However, the second horseman instance waits to instantiate until the first horseman instance is closed.
Is it possible to create more than one horseman instance from the same process using different PhantomJS ports?
I have attached a sample code block below.
/**
* Module Imports
*/
var create = require("./headless.js");
/**
* Global Variables
*/
var urlList = [
"http://www.google.com/",
"http://www.bing.com/",
"http://www.yahoo.com/",
"https://github.com/",
"https://wordpress.com/"
];
var port = 12401;
/**
* Looping Through Url List And Creating New Horseman Instance For Each Url
*/
for (var url in urlList) {
(function (port) {
console.log(port,"Looping Once Again");
create.createInstance(urlList[url],port)
.then(function (horseman) {
console.log(port,horseman.url());
console.log(port,"closing node horseman");
horseman.close();
});
})(port);
port++;
}
/**
* Global Imports
*/
var Promise = require('bluebird');
var Horseman = require('node-horseman');
/**
* Creates A Horseman Instance For Each Url In A Given Port
*/
exports.createInstance = function (link,port) {
return new Promise(function (resolve,reject) {
console.log(port,"creating new horseman instance");
console.log("Port For New Horseman Instance",port);
horseman = new Horseman({
port : port
});
console.log(port,"created Instance");
horseman
.open(link);
resolve(horseman);
});
};
12401 'Looping Once Again'
12401 'creating new horseman instance'
Port For New Horseman Instance 12401
12401 'created Instance'
12402 'Looping Once Again'
12402 'creating new horseman instance'
Port For New Horseman Instance 12402
http://www.google.com/
12401 'closing node horseman'
12402 'created Instance'
12403 'Looping Once Again'
12403 'creating new horseman instance'
Port For New Horseman Instance 12403
http://www.bing.com/
12402 'closing node horseman'
12403 'created Instance'
12404 'Looping Once Again'
12404 'creating new horseman instance'
Port For New Horseman Instance 12404
https://www.yahoo.com/
12403 'closing node horseman'
12404 'created Instance'
You can clearly see from the sample output that the next instance is created only after the previous horseman instance is closed.
Is there a way to create parallel horseman instances from the same process?
I keep getting the following error when using Internet Explorer
Assertion failed: !current_buffer, file src\node_http_parser.cc, line 387
I have an Express website that uses AJAX requests to pull in data using node-horseman. Whenever Internet Explorer is used I get the above error. Google Chrome, however, does not have any issues.
I am using node v0.10.36
While troubleshooting, I have even removed everything but the var x = new horseman initialization and I still get the error. As soon as I remove that, the issue goes away.
Please help. Thanks.
It seems that I can't install node-horseman on iojs greater than version 2.5.0 or node 4.0.0.
Is node-horseman still being supported? It looks like the trouble may be with 'deasync'? I just figured I would start a thread for others trying to upgrade.
npm ERR! Linux 3.13.0-57-generic
npm ERR! argv "/home/doppleruser/.nvm/versions/io.js/v3.0.0/bin/iojs" "/home/doppleruser/.nvm/versions/io.js/v3.0.0/bin/npm" "install"
npm ERR! node v3.0.0
npm ERR! npm v2.13.3
npm ERR! code ELIFECYCLE
npm ERR! [email protected] install: node ./build.js
npm ERR! Exit status 1
npm ERR!
npm ERR! Failed at the [email protected] install script 'node ./build.js'.
npm ERR! This is most likely a problem with the deasync package,
npm ERR! not with npm itself.
npm ERR! Tell the author that this fails on your system:
npm ERR! node ./build.js
npm ERR! You can get their info via:
npm ERR! npm owner ls deasync
npm ERR! There is likely additional logging output above.
npm ERR! Please include the following file with any support request:
npm ERR! /home/package-trials/npm-debug.log
Thanks for this awesome pack. I have been searching for a way to use phantomjs from node, and horseman is exactly what I needed!
I have been going through the api and examples.
Here is one problem I am facing: When I run the quickstart example, google search,
I have attached the screenshot....
var Horseman = require('node-horseman');
var horseman = new Horseman();
var numLinks = horseman
.open('http://www.google.com')
.type('input[name="q"]', 'github')
.click("button:contains('Google Search')")
.waitForNextPage()
.count("li.g");
// I have added this code to take screenshot.
horseman.screenshot('g3.jpeg');
console.log("Number of links: " + numLinks);
horseman.close();
// Output - Number of links: null
For sites that already include jQuery, the injected jQuery can cause problems, clobbering the site's jQuery and removing plugins, etc., that it has added to the jQuery.fn namespace. It would be nice to be able to turn off this feature with something like var horseman = new Horseman({injectjQuery: false}). If you'd like me to send a pull request for this, let me know.
Version 2.1.0 made the API chainable. Re-write the tests to use the chainable api, so they can function as examples for users.
Hi - I find horseman really useful, and I used to use the .crop feature - is this no longer available in v2? I can't see it in the docs.
I am trying to start a horseman instance on heroku.
Example:
var phantomjs = require("phantomjs");
var Horseman = require("node-horseman");
var config = {phantomPath: phantomjs.path};
new Horseman(config);
Debug Output: Sat, 28 Mar 2015 18:06:49 GMT horseman .setup() creating phantom instance on
npm Error: npm ERR! code ELIFECYCLE
Version 1.x had support for tabs, but 2.x does not. Should be straightforward - just store an array of phantom pages. That's how 1.x worked.
Hi! After require('node-horseman') I lose control of the process in cmd (Windows 8).
I have to write e.g. process.on('SIGINT', function() { process.exit(); }) myself to be able to close the process from the terminal with Ctrl+C, etc.
It would be nice if you could spit out the result of an evaluate-type function, or a string, without breaking the chain. Seems like it shouldn't be too hard, but my brain isn't coming up with the answer this morning.
Currently, you need to do something like this:
horseman
.open('http://www.google.com')
.count('a')
.then(function(count){
console.log(count);
horseman
.open('http://www.yahoo.com')
.count('a')
.then(function(count){
console.log(count);
});
});
Desired:
horseman
.open('http://www.google.com')
.count('a')
.log() // spits out the value returned from count
.open('http://www.yahoo.com')
.count('a')
.log() // spits out the value returned from count
.log('here i am') // spits out the string provided.
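Until something like .log() exists, one way to approximate it is a pass-through handler given to .then(), assuming the chain hands the previous action's result to .then() as in the "Currently" example above. The tap helper below is hypothetical and is demonstrated on a plain promise, not on Horseman itself.

```javascript
// Hypothetical helper: logs the incoming value (or a fixed message)
// and hands the value on unchanged so the chain can continue.
function tap(message) {
  return function (value) {
    console.log(message !== undefined ? message : value);
    return value;
  };
}

// Demonstrated on a plain promise standing in for a step like .count('a').
const result = Promise.resolve(42)
  .then(tap())            // logs 42
  .then(tap('here i am')) // logs the string, still passes 42 along
  .then((count) => count + 1);
```

Because the handler returns its input, later steps still see the original value regardless of what was printed.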
Phantomjs supports: --remote-debugger-port=9000
Is there any support for this via nodehorseman?
The weak option was only used when the module depended on phantom. It's not needed now that it uses node-phantom-simple.
Here's the code I used:
yield horseman.on('alert', co.wrap(function*(msg) {
console.log(':\nALERT: ' + msg);
return true;
}));
I also set an event for urlChanged and that works perfectly fine with the same format above. (using the co.wrap)
Any idea what might be the cause?
The rest of the code works fine if a confirm/alert is not on the site. Also, not sure if this is important but I used an android device user agent... so this was for mobile.
Thanks
So I have tried the following code on node 0.10.x, 0.12.x and iojs 2.3.x with phantomjs 1.9.8 and 2.0.0 on OSX Yosemite. All I'm getting is an error that getCookies is not a function on the phantom object.
TypeError: this.phantom.getCookies is not a function
at Horseman.exports.cookies (/Users/Josh/Desktop/Projects/horse-test/node_modules/node-horseman/lib/actions.js:233:18)
at Object.<anonymous> (/Users/Josh/Desktop...
var Horseman = require('node-horseman');
var horseman = new Horseman({
cookiesFile: 'cookies.txt'
});
var result = horseman
.open('http://google.com')
.cookies();
console.log(result);
I know it's getting the cookies because it's saving to cookies.txt that I set in the options object.
Any idea what's going on?
Hello! We've been using horseman to access a certain feature in the youtube cms. It was working well before but throws these errors now.
Here's the general logic:
horseman .open https://accounts.google.com/ServiceLogin?continue=https%3A%2F%2Fwww.youtube.com%2Fsignin%3Faction_handle_signin%3Dtrue%26next%3D%252F%26app%3Ddesktop%26feature%3Dsign_in_button%26hl%3Den&passive=true&hl=en&service=youtube&uilel=3#identifier +2s
horseman injected jQuery +13ms
horseman .click() +2ms
horseman .type() +10ms input[name="Email"] [email protected] undefined
horseman .click() +3ms
horseman .wait() +19ms 2000
horseman .type() +2s input[name="Passwd"] password123 undefined
horseman .click() +2ms
horseman .waitForNextPage() +24ms
horseman Timeout during waitForNextPage() +5s
horseman .waitForNextPage() +0ms
horseman Timeout during waitForNextPage() +5s
horseman .close(). +1ms
horseman .open https://www.youtube.com/my_channels?o=<page1> +6ms
horseman .wait() +0ms 2000
horseman .open https://www.youtube.com/my_channels?o=<page2> +1ms
Unhandled rejection HeadlessError: Phantom Process died
at poll_func (/Users/ninz/proj/frnky/cms_interface/node_modules/node-horseman/node_modules/node-phantom-simple/node-phantom-simple.js:563:10)
at /Users/ninz/proj/frnky/cms_interface/node_modules/node-horseman/node_modules/node-phantom-simple/node-phantom-simple.js:61:7
at ClientRequest.<anonymous> (/Users/ninz/proj/frnky/cms_interface/node_modules/node-horseman/node_modules/node-phantom-simple/node-phantom-simple.js:484:11)
at emitOne (events.js:77:13)
at ClientRequest.emit (events.js:169:7)
at Socket.socketErrorListener (_http_client.js:259:9)
at emitOne (events.js:77:13)
at Socket.emit (events.js:169:7)
at emitErrorNT (net.js:1250:8)
at doNTCallback2 (node.js:429:9)
at process._tickCallback (node.js:343:17)
horseman .open https://www.youtube.com/my_channels?o=<page3> +1ms
Unhandled rejection HeadlessError: Phantom Process died
at poll_func (/Users/ninz/proj/frnky/cms_interface/node_modules/node-horseman/node_modules/node-phantom-simple/node-phantom-simple.js:563:10)
at /Users/ninz/proj/frnky/cms_interface/node_modules/node-horseman/node_modules/node-phantom-simple/node-phantom-simple.js:61:7
at ClientRequest.<anonymous> (/Users/ninz/proj/frnky/cms_interface/node_modules/node-horseman/node_modules/node-phantom-simple/node-phantom-simple.js:484:11)
at emitOne (events.js:77:13)
at ClientRequest.emit (events.js:169:7)
at Socket.socketErrorListener (_http_client.js:259:9)
at emitOne (events.js:77:13)
at Socket.emit (events.js:169:7)
at emitErrorNT (net.js:1250:8)
at doNTCallback2 (node.js:429:9)
at process._tickCallback (node.js:343:17)
horseman .open https://www.youtube.com/my_channels?o=<pag4> +0ms
I'm writing code that uses the async library that does web spidering. I've tried opening just 1 site at a time and the code runs without a hitch. I ran into this while trying to use node-horseman:
(node) warning: possible EventEmitter memory leak detected. 11 SIGINT listeners added. Use emitter.setMaxListeners() to increase limit.
Trace
at process.addListener (events.js:179:15)
at process.on.process.addListener (node.js:668:26)
at /app/node_modules/node-horseman/node_modules/node-phantom-simple/node-phantom-simple.js:87:21
at Array.forEach (native)
at spawnPhantom (/app/node_modules/node-horseman/node_modules/node-phantom-simple/node-phantom-simple.js:86:31)
at Object.exports.create (/app/node_modules/node-horseman/node_modules/node-phantom-simple/node-phantom-simple.js:201:5)
at new Horseman (/app/node_modules/node-horseman/lib/index.js:135:11)
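By default Node's EventEmitter warns once more than 10 listeners are attached for a single event, and the trace suggests that each spawned PhantomJS process registers its own SIGINT listener. The sketch below reproduces the counting on a plain EventEmitter standing in for process. spawnBrowser is a hypothetical name, and whether raising the limit is appropriate depends on how many instances are meant to run at once.

```javascript
const EventEmitter = require('events');

// Stand-in for `process`; each simulated browser adds one SIGINT listener.
const fakeProcess = new EventEmitter();

// Hypothetical helper mimicking a library that registers a listener per instance.
function spawnBrowser(emitter) {
  const onSigint = () => {};
  emitter.on('SIGINT', onSigint);
  // Return a cleanup function so the listener can be removed on close.
  return () => emitter.removeListener('SIGINT', onSigint);
}

// Raise the limit before creating more than 10 instances to avoid the warning.
fakeProcess.setMaxListeners(20);

const closers = [];
for (let i = 0; i < 11; i++) {
  closers.push(spawnBrowser(fakeProcess));
}
const openCount = fakeProcess.listenerCount('SIGINT');

// Removing listeners as instances close keeps the count from growing.
closers.forEach((close) => close());
const closedCount = fakeProcess.listenerCount('SIGINT');

console.log(openCount, closedCount); // prints "11 0"
```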
In the PATH, in the directory?
You'd think this would be in the install instruction 😉
$ phantomjs 1.js
Error: Cannot find module 'http'
phantomjs://bootstrap.js:289
phantomjs://bootstrap.js:254 in require
C:/Users/n37r06u3/Desktop/node/node_modules/node-horseman/node_modules/node-phantom-simple/node-phantom-simple.js:3
C:/Users/n37r06u3/Desktop/node/node_modules/node-horseman/node_modules/node-phantom-simple/node-phantom-simple.js:474
Error: Cannot find module 'path'
phantomjs://bootstrap.js:289
phantomjs://bootstrap.js:254 in require
C:/Users/n37r06u3/Desktop/node/node_modules/node-horseman/lib/index.js:4
C:/Users/n37r06u3/Desktop/node/node_modules/node-horseman/lib/index.js:290
TypeError: '[object Object]' is not a constructor (evaluating 'new Horseman()')
1.js:2
var pattern = '0 */'+ 5 +' * * * *';
function job(){
console.log('Start to run every ' + 5 + ' mins');
var price = horseman
.open('http://viettelstore.vn/dien-thoai/apple-iphone-6-plus-64gb-ban-quoc-te--pid975.html')
.evaluate(function () {
return $('#_price_new436').text()
})
console.log(price);
horseman.close();
}
new CronJob(pattern, job, null, true, 'America/Los_Angeles');
I use node-cron to schedule the crawling job as above. Horseman works very well without node-cron. When horseman runs inside node-cron, the first call is OK, but every call after that fails with the error below:
Request() error evaluating open() call: Error: connect ECONNREFUSED
Request() error evaluating evaluate() call: Error: connect ECONNREFUSED
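One plausible explanation, not confirmed in this thread, is that the job calls horseman.close() on a shared instance at the end of the first run, so later runs talk to a PhantomJS process that is no longer there. The sketch below illustrates the alternative of creating a fresh instance per run. FakeBrowser is a hypothetical stand-in written for this example and is not Horseman's API.

```javascript
// Stand-in for a browser wrapper (not Horseman): once closed, calls fail.
class FakeBrowser {
  constructor() {
    this.closed = false;
  }
  open(url) {
    if (this.closed) {
      throw new Error('connect ECONNREFUSED');
    }
    return 'opened ' + url;
  }
  close() {
    this.closed = true;
  }
}

// Each scheduled run creates its own instance and closes it when done.
function job(url) {
  const browser = new FakeBrowser();
  try {
    return browser.open(url);
  } finally {
    browser.close();
  }
}

// Two consecutive runs both succeed because nothing is shared between them.
const firstRun = job('http://example.com');
const secondRun = job('http://example.com');
console.log(firstRun, secondRun);
```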
Hello,
First of all, thank you for the great work!
I was trying to run node-horseman with mocha on travis but it failed with:
> node test/horseman.js
undefined
npm ERR! Test failed. See above for more details.
The script is almost exactly the same as the example:
var Horseman = require('node-horseman');
var horseman = new Horseman();
var numLinks = horseman
.open('http://www.google.com')
.type('input[name="q"]', 'github')
.click("button:contains('Search')")
.waitForNextPage()
.title();
console.log("Title: " + numLinks);
horseman.close();
Please let me know if this is a known issue or if I have missed anything.
Thanks,
Ben
Add a generic sendEvent method, so users can send keyboard/mouse events.
In a script, I was cropping some areas, then wanting to take full screenshots later on.
However, the clipRect set by .crop affects all subsequent .screenshot calls - is this by design?
Also, not sure why a viewport of 1200 x 800 is set after a crop? For me, altering this code so that the clipRect is removed after a crop works:
https://github.com/johntitus/node-horseman/blob/master/lib/actions.js#L267
this.page.set('clipRect', rect, function(){
self.pause.unpause('clipRect');
self.screenshot( path );
// self.page.set("viewportSize",{
// width: 1200,
// height: 800
// });
self.page.set('clipRect', {});
return this;
});
When I try to take a screenshot in the callback of "loadFinished", the screenshot is nowhere to be found. No error occurs (success callback, "done" is printed).
When I use "screenshot" as a separate step in the chain, it does work, but the screenshot might not be complete because the page is not done loading.
var horseman = new Horseman();
horseman
.userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0) Gecko/20100101 Firefox/27.0")
.viewport(1280, 1024)
.on("loadFinished", function() {
horseman.screenshot(path.resolve(__dirname, '123.png')).then(function(){
console.log("done");
}, function(e){
console.error(e);
});
})
.on("error", function(e) {
console.log("error", e);
})
.on("timeout", function(e) {
console.log("timeout", e);
})
.open("http://google.com")
.finally(function() {
horseman.close();
});
Am I doing something wrong here?
My environment:
Windows 8
Nodejs 4.1.1
PhantomJS 2.0.0
It doesn't appear like you have an option to save persistent cookies to a file. Does your api possibly persist them in memory though for each unique object?
For example, if I log in to a website and don't log out, can a subsequent connection that uses the same object use the same website session (stored in a cookie) even after horseman.close(); has been called? Or is this a limitation with the current api?
I assume this is because there is a dialog asking to confirm if you are OK with re-sending the request.
After writing some Mocha tests utilising Horseman, I've come across an issue with JavaScript confirmation dialogues.
I'm following the PhantomJS documentation on dealing with confirmation events and returning true from the on confirm callback. This page was referenced by the Horseman docs.
As explained on the PhantomJS page:
true === pressing the "OK" button, false === pressing the "Cancel" button
No matter what I return in my Horseman code, it says cancel was pressed.
I've created some test files to illustrate this issue.
My test HTML:
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Confirm check</title>
<script>
function confirmation() {
var answer = confirm('OK or Cancel?');
if (answer === true) {
console.log('You pressed OK');
} else {
console.log('You pressed Cancel');
}
}
</script>
</head>
<body>
<a id="confirm" href="#" onclick="confirmation()" >Confirm?</a>
</body>
</html>
My test Horseman script:
var Horseman = require('node-horseman');
var browser = new Horseman();
var url = 'file:///Users/user/Desktop/confirm.html';
browser
.viewport(1920, 1080)
.on('loadFinished', function(status) {
console.log('%s Status %s', url, status);
})
.on('consoleMessage', function( msg, lineNumber, sourceId ) {
console.log('Console: %s, %s, %s', msg, lineNumber, sourceId);
})
.on('confirm', function( msg ) {
console.log('Confirm message: %s', msg);
return false;
})
.open(url)
.click('a#confirm');
browser.close();
The terminal output with DEBUG=horseman enabled:
horseman .setup() creating phantom instance on +0ms
horseman .setup() phantom instance created ok. +448ms
horseman .setup() creating phantom page. +1ms
horseman .setup() phantom page created ok. +21ms
horseman .setup() setting viewport. +0ms
horseman .setup() viewport set ok. +4ms
horseman .viewport() set +60ms
horseman .on loadFinished set. +0ms
horseman .on consoleMessage set. +0ms
horseman .on confirm set. +0ms
horseman .opening: file:///Users/user/Desktop/confirm.html +0ms
file:///Users/user/Desktop/confirm.html Status success
horseman .open: file:///Users/user/Desktop/confirm.html - status: success +22ms
Confirm message: OK or Cancel?
Console: You pressed Cancel, undefined, undefined
horseman .click() a#confirm +81ms
horseman .close(). +0ms
I may be doing things wrong but I've not seen any indication that I should be doing things differently to respond to confirmation events.
Any help with this would be appreciated.
Cheers.
After executing a file that uses horseman to load a page, the file never completes executing. This is the code I'm using:
var Horseman = require("node-horseman");
var horseman = new Horseman();
horseman
.userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0) Gecko/20100101 Firefox/27.0")
.open('http://www.google.com')
.close();
I then run node horsemantest.js and it just hangs.
I am using node 0.12.5 and phantomjs 2.0.0, and the latest node-horseman.
.waitForSelector(selector)
Wait until the element selector is present e.g. .wait('#pay-button')
In the above section, located within the waitForSelector section of Readme.md, you show an example that uses a different method than what is expected.
Is this a typo? Shouldn't it read:
Wait until the element selector is present e.g. .waitForSelector('#pay-button')
While we're on the topic, what is the logic behind using different methods and not just using .wait and detecting for number or selector? Is it for speed?
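Detecting the argument type, as suggested here, could look roughly like the sketch below. It is a hypothetical dispatcher for illustration only, not how Horseman is implemented, and it only classifies the argument rather than performing the wait.

```javascript
// Hypothetical dispatcher: decide what kind of wait was requested
// from the type of the argument.
function classifyWait(arg) {
  if (typeof arg === 'number') {
    return { kind: 'timeout', ms: arg };
  }
  if (typeof arg === 'string') {
    return { kind: 'selector', selector: arg };
  }
  if (typeof arg === 'function') {
    return { kind: 'predicate', fn: arg };
  }
  throw new TypeError('wait() expects a number, a selector string, or a function');
}

console.log(classifyWait(1000).kind);          // prints "timeout"
console.log(classifyWait('#pay-button').kind); // prints "selector"
```

A type check like this costs next to nothing at runtime, so the trade-off between one overloaded method and several explicit ones is more about readability than speed.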
I've been trying to use waitForNextPage, but it doesn't work as expected.
In your links example, if you change this line to
return horseman.waitForNextPage();
the screenshot will give you an unfinished page. The same occurs with the pizza example.
So far I've been able to hack it out with waitForSelector, but that's highly prone to error.
Horseman ends my long search for an up-to-date Phantom driver, but this bug renders it almost unusable.
I'm on Phantom 2.0.0.
I am doing something like the code below, where I have wrapped a horseman operation in a function that I want to return a value. The problem is, the chain gets started asynchronously so the function returns undefined immediately. Later on when horseman finishes, it console logs the correct value.
How can I cause the chain to block or return the value I want from the .then() / .evaluate() section?
var Horseman = require('node-horseman');
var horseman = new Horseman();
console.log("outside val: " + request("a"));
function request(query) {
horseman
.userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0) Gecko/20100101 Firefox/27.0")
.open('http://example.com/'+query)
.evaluate(function() {
return $('input[name="fieldName"]').val();
})
.then(function(value) {
console.log("Running then");
horseman.close();
if (value !== undefined) {
console.log("inside val: " + value);
return value;
}
});
}
Output:
outside val: undefined
Running then
inside val: correctValue
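Because the chain is asynchronous, the function cannot hand back the value directly. A common pattern is to return the promise from the function and consume the value with .then() at the call site. This is a sketch using a plain promise in place of the Horseman chain. fetchFieldValue is a hypothetical stand-in for the open/evaluate steps.

```javascript
// Hypothetical stand-in for horseman.open(...).evaluate(...):
// resolves asynchronously with the value read from the page.
function fetchFieldValue(query) {
  return new Promise((resolve) => {
    setTimeout(() => resolve('value for ' + query), 10);
  });
}

// Return the promise instead of trying to return the value itself.
function request(query) {
  return fetchFieldValue(query).then((value) => {
    console.log('inside val: ' + value);
    return value; // becomes the resolved value of the returned promise
  });
}

// The caller receives the value asynchronously.
const pending = request('a').then((value) => {
  console.log('outside val: ' + value);
  return value;
});
```

With this shape, "inside val" is logged before "outside val", and both show the same value.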
I want to test a website that pushes a CSV into my browser download folder. I need to be able to open the csv file and review the contents inside my phantom. Do you have anything to assist with this process?
line 233 in lib/actions.js:
this.phantom.getCookies(function( result ){
I can't find this method in the phantom API. Maybe the phantom object is extended, but I get this error:
undefined is not a function at Horseman.exports.cookies(/Users/kristian/WebstormProjects/HorseManScraper/node_modules/node-horseman/lib/actions.js:233:18)
Was there a reason for removing goto()? I was using that API in a previous nightmare script. I see you have open(), which I suppose is better? I think it would be nice to have both, for legacy scripts.
The .evaluate() method, and anything that depends on it (such as .click(), .type(), and a host of others) can be slow if there is more than one called in quick succession.
A discussion of this can be seen in this Nightmare issue: segment-boneyard/nightmare#123
Both Nightmare and Horseman use the same dependency, https://github.com/sgentle/phantomjs-node, which is the source of this problem.
Providing a path is nice for storing the image locally, but we should also be able to get something like a buffer for other manipulations.
I would be happy to be able to upload it, serve it over HTTP, etc. The possibilities are endless.
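PhantomJS itself can return a rendered page as a base64 string through page.renderBase64(), so one possible shape for this feature is to decode that string into a Buffer on the Node side. The sketch below shows only the decoding step. The sample string is a stand-in (the 8-byte PNG file signature), not real screenshot output, and toImageBuffer is a hypothetical name.

```javascript
// Hypothetical helper: turn a base64-encoded image into a Buffer that can be
// uploaded, written to disk, or sent as an HTTP response body.
function toImageBuffer(base64String) {
  return Buffer.from(base64String, 'base64');
}

// Stand-in data: the 8-byte PNG file signature, base64-encoded.
const sample = 'iVBORw0KGgo=';
const buffer = toImageBuffer(sample);

console.log(buffer.length);                        // prints 8
console.log(buffer.slice(1, 4).toString('ascii')); // prints "PNG"
```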
The horseman.close() method doesn't always fully close out horseman.
@johntitus I'm not sure if you have seen, but segmentio/nightmare has now moved over from PhantomJS to Electron. Is this something that you are considering?
The nightmare README.md claims that it's faster than PhantomJS:
Under the covers it uses Electron, which is similar to PhantomJS but faster and more modern.
Support headers and footers when rendering PDFs. Something like the following:
page.paperSize = {
format: 'A4',
margin: "1cm",
/* default header/footer for pages that don't have custom overwrites (see below) */
header: {
height: "1cm",
contents: phantom.callback(function(pageNum, numPages) {
if (pageNum == 1) {
return "";
}
return "<h1>Header " + pageNum + " / " + numPages + "</h1>";
})
},
footer: {
height: "1cm",
contents: phantom.callback(function(pageNum, numPages) {
if (pageNum == numPages) {
return "";
}
return "<h1>Footer " + pageNum + " / " + numPages + "</h1>";
})
}
};
Is there any way to run horseman asynchronously with a callback? In the example below, I have to wait for horseman to finish before running 'myFunction'. Is there any way to run 'myFunction' without waiting for horseman?
var Horseman = require('node-horseman');
var horseman = new Horseman();
var numLinks = horseman
.userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0) Gecko/20100101 Firefox/27.0")
.open('http://www.google.com')
.type('input[name="q"]', 'github')
.click("button:contains('Google Search')")
.waitForNextPage()
.count("li.g");
console.log("Number of links: " + numLinks);
horseman.close();
// call myFunction()
Let users run with a PhantomJS that's not in their PATH.
thomas@workstation:shopify-endeavor$ npm i node-horseman --save
npm info it worked if it ends with ok
npm info using [email protected]
npm info using [email protected]
npm info package.json [email protected] license should be a valid SPDX license expression
npm info package.json [email protected] No license field.
npm info package.json [email protected] No license field.
npm info package.json [email protected] No license field.
npm info attempt registry request try #1 at 8:55:43 PM
npm http request GET http://registry.npmjs.org/node-horseman
npm http 304 http://registry.npmjs.org/node-horseman
npm info install [email protected] into /Users/thomas/Desktop/shopify-endeavor
npm info installOne [email protected]
npm info preinstall [email protected]
npm info attempt registry request try #1 at 8:55:43 PM
npm http request GET http://registry.npmjs.org/deasync
npm info attempt registry request try #1 at 8:55:43 PM
npm http request GET http://registry.npmjs.org/clone
npm info attempt registry request try #1 at 8:55:43 PM
npm http request GET http://registry.npmjs.org/defaults
npm info attempt registry request try #1 at 8:55:43 PM
npm http request GET http://registry.npmjs.org/node-phantom-simple
npm http 304 http://registry.npmjs.org/clone
npm http 304 http://registry.npmjs.org/defaults
npm http 304 http://registry.npmjs.org/node-phantom-simple
npm http 304 http://registry.npmjs.org/deasync
npm info install [email protected] into /Users/thomas/Desktop/shopify-endeavor/node_modules/node-horseman
npm info install [email protected] into /Users/thomas/Desktop/shopify-endeavor/node_modules/node-horseman
npm info install [email protected] into /Users/thomas/Desktop/shopify-endeavor/node_modules/node-horseman
npm info install [email protected] into /Users/thomas/Desktop/shopify-endeavor/node_modules/node-horseman
npm info installOne [email protected]
npm info installOne [email protected]
npm info installOne [email protected]
npm info installOne [email protected]
npm info preinstall [email protected]
npm info build /Users/thomas/Desktop/shopify-endeavor/node_modules/node-horseman/node_modules/defaults
npm info linkStuff [email protected]
npm info preinstall [email protected]
npm info install [email protected]
npm info postinstall [email protected]
npm info build /Users/thomas/Desktop/shopify-endeavor/node_modules/node-horseman/node_modules/clone
npm info linkStuff [email protected]
npm info install [email protected]
npm info postinstall [email protected]
npm info preinstall [email protected]
npm info build /Users/thomas/Desktop/shopify-endeavor/node_modules/node-horseman/node_modules/node-phantom-simple
npm info linkStuff [email protected]
npm info install [email protected]
npm info postinstall [email protected]
npm info preinstall [email protected]
npm info attempt registry request try #1 at 8:55:44 PM
npm http request GET http://registry.npmjs.org/bindings
npm info attempt registry request try #1 at 8:55:44 PM
npm http request GET http://registry.npmjs.org/nan
npm http 304 http://registry.npmjs.org/nan
npm http 304 http://registry.npmjs.org/bindings
npm info install [email protected] into /Users/thomas/Desktop/shopify-endeavor/node_modules/node-horseman/node_modules/deasync
npm info install [email protected] into /Users/thomas/Desktop/shopify-endeavor/node_modules/node-horseman/node_modules/deasync
npm info installOne [email protected]
npm info installOne [email protected]
npm info preinstall [email protected]
npm info build /Users/thomas/Desktop/shopify-endeavor/node_modules/node-horseman/node_modules/deasync/node_modules/bindings
npm info linkStuff [email protected]
npm info install [email protected]
npm info postinstall [email protected]
npm info preinstall [email protected]
npm info build /Users/thomas/Desktop/shopify-endeavor/node_modules/node-horseman/node_modules/deasync/node_modules/nan
npm info linkStuff [email protected]
npm info install [email protected]
npm info postinstall [email protected]
npm info build /Users/thomas/Desktop/shopify-endeavor/node_modules/node-horseman/node_modules/deasync
npm info linkStuff [email protected]
npm info install [email protected]
> [email protected] install /Users/thomas/Desktop/shopify-endeavor/node_modules/node-horseman/node_modules/deasync
> node ./build.js
Build failed
npm info [email protected] Failed to exec install script
npm ERR! Darwin 14.4.0
npm ERR! argv "/Users/thomas/.nvm/versions/io.js/v3.2.0/bin/iojs" "/Users/thomas/.nvm/versions/io.js/v3.2.0/bin/npm" "i" "node-horseman" "--save"
npm ERR! node v3.2.0
npm ERR! npm v2.13.3
npm ERR! code ELIFECYCLE
npm ERR! [email protected] install: `node ./build.js`
npm ERR! Exit status 1
npm ERR!
npm ERR! Failed at the [email protected] install script 'node ./build.js'.
npm ERR! This is most likely a problem with the deasync package,
npm ERR! not with npm itself.
npm ERR! Tell the author that this fails on your system:
npm ERR! node ./build.js
npm ERR! You can get their info via:
npm ERR! npm owner ls deasync
npm ERR! There is likely additional logging output above.
npm info preuninstall [email protected]
npm info uninstall [email protected]
npm info postuninstall [email protected]
npm info preuninstall [email protected]
npm info uninstall [email protected]
npm info postuninstall [email protected]
npm ERR! Please include the following file with any support request:
npm ERR! /Users/thomas/Desktop/shopify-endeavor/npm-debug.log
Failed to exec install script
Hi - great lib :)
Is it possible to run arbitrary code in a chain?
In the node context, not the browser context (which is .evaluate).
for example:
horseman
.open("www.gogle.com")
.wait(1000)
.then(function(){
//do stuff here
});
1. Running the example code throws an error. What is the likely cause?
Error: spawn ENOENT
npm install node-horseman
2. Installing the module reports errors:
npm WARN package.json [email protected] No description
npm WARN package.json [email protected] No repository field.
npm WARN package.json [email protected] No README data
\
[email protected] install /Users/qinmudi/ericqin/test/horseman/node_modules/node-horseman/node_modules/deasync
node ./build.js
darwin-x64-node-0.10
exists; testing
Binary is fine; exiting
[email protected] node_modules/node-horseman
├── [email protected]
├── [email protected]
├── [email protected]
├── [email protected] ([email protected])
└── [email protected] ([email protected], [email protected])
Hello, I can't run a test using horseman. The apparent cause is a dependency error among the npm packages, as follows:
/home/fguedes/projetos/tripper/node_modules/horseman/node_modules/jsdom/lib/jsdom.js:3
`jsdom 4.x onward only works on io.js, not Node.js™: https://github.com/tmpvar
^
SyntaxError: Unexpected token ILLEGAL
at Module._compile (module.js:439:25)
at Object.Module._extensions..js (module.js:474:10)
at Module.load (module.js:356:32)
at Function.Module._load (module.js:312:12)
at Module.require (module.js:364:17)
at require (module.js:380:17)
at Object.<anonymous> (/home/fguedes/projetos/tripper/node_modules/horseman/lib/horseman.js:3:13)
at Module._compile (module.js:456:26)
at Object.Module._extensions..js (module.js:474:10)
at Module.load (module.js:356:32)
Sorry, but I can't find a solution for this bug.
Any suggestions?
Within an Express app, when I create an instance of Horseman globally, it works correctly without any errors. But when I try the same within a function, it throws the following error:
node: ../src/node_http_parser.cc:387: static v8::Handle<v8::Value> node::Parser::Execute(const v8::Arguments&): Assertion `!current_buffer' failed.
Aborted (core dumped)
When I do the same thing within Sails, my code hangs at the line where I instantiate Horseman.
The same thing works fine when I run it on its own, without Sails or Express.
The following code block creates a simple Express app that requires node_horseman_test_2.js, where I instantiate Horseman.
/*
|--------------------------------------------------------------------------
| Global Imports
|--------------------------------------------------------------------------
*/
var express = require('express');
var Promise = require('bluebird');
var path = require('path');
var bodyParser = require('body-parser');
var nodeHorseman = require('./node_horseman_test_2.js');
/*
|--------------------------------------------------------------------------
| Create our Express application
|--------------------------------------------------------------------------
*/
var app = express();
app.use(bodyParser.urlencoded({
extended: true
}));
app.all('/*', function(req, res, next) {
// CORS headers
res.header("Access-Control-Allow-Origin", "*"); // restrict it to the required domain
res.header('Access-Control-Allow-Methods', 'GET,PUT,POST,DELETE,OPTIONS');
// Set custom headers for CORS
res.header('Access-Control-Allow-Headers', 'Content-type,Accept,X-Access-Token,X-Key');
if (req.method == 'OPTIONS') {
res.status(200).end();
} else {
next();
}
});
// Use environment defined port or 8888
var port = process.env.PORT || 8888;
// Create our Express router
var router = express.Router();
// Register all our routes with /api
app.use('/api/v1/', router);
// Initial dummy route for testing
router.get('/', function(req, res) {
res.json({ message: 'Test message' });
});
//Add productMatching router here
var test = router.route('/test');
//Get request here calls scraper to find matches
test.get(function(req, res) {
nodeHorseman.instantiate(req.query).then(function(results) {
res.send(results);
}).catch(function(err) {
if(err) res.send(err);
//LOG ERROR HERE
else res.send('Match failed. Check connection.');
});
});
// If no route is matched by now, it must be a 404
app.use(function(req, res, next) {
var err = new Error('Not Found');
err.status = 404;
//LOG ERROR HERE
next(err);
});
// Start the server
app.listen(port);
console.log('Send match searches to ' + port);
var Horseman = require('node-horseman');
var userAgent = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.122 Safari/537.36';
var Promise = require("bluebird");
exports.instantiate = function() {
return new Promise(function(resolve, reject) {
var horse = new Horseman();
horse
.userAgent(userAgent)
.open("http://www.google.com");
console.log(horse.url());
horse.close();
});
};
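One thing worth noting in the module above: the promise returned by instantiate never calls resolve or reject, so the route's .then would not fire even if Horseman started cleanly. Below is a minimal sketch of a settle-and-always-close pattern. FakeBrowser is a hypothetical stand-in, not Horseman, so this says nothing about the crash reported here.

```javascript
// FakeBrowser is a hypothetical stand-in for a browser wrapper; it is NOT Horseman.
function FakeBrowser() {
  this.closed = false;
  this.currentUrl = null;
}
FakeBrowser.prototype.open = function (url) {
  if (typeof url !== 'string' || url.length === 0) {
    throw new Error('invalid url');
  }
  this.currentUrl = url;
  return this;
};
FakeBrowser.prototype.url = function () { return this.currentUrl; };
FakeBrowser.prototype.close = function () { this.closed = true; };

// Resolve with the result, reject on failure, and close the browser either way.
function instantiate(browser, url) {
  return new Promise(function (resolve, reject) {
    try {
      browser.open(url);
      resolve(browser.url());
    } catch (err) {
      reject(err);
    } finally {
      browser.close();
    }
  });
}
```

With this shape the caller's .then or .catch always runs, and the browser is closed on both paths.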
Could you mark, next to each API item (for example title()), which project introduced it? I believe you added new items such as text(), which is fantastic. I'd like to be able to see which project introduces which API command.
Perhaps like this?
.text(selector)
Gets the text inside of an element. (added Horseman)
First off, thanks for the awesome package! I have gotten a lot of use out of it, and it has proven very helpful.
I am curious if you plan to upgrade Horseman to use version ^2.0 of node-phantom-simple anytime soon? I ask because I am using FreeBSD, which is not supported in versions < 2.0 of node-phantom-simple. As a result of that, I added support for FreeBSD to node-phantom-simple and submitted a pull request which was accepted some time ago, but which has only just recently made it into a release. In the interim, I have been having to manually replace Horseman's node-phantom-simple package with the custom package, which I am still having to do since Horseman's dependency is for node-phantom-simple version ^1.2.0. It would be nice if I no longer had to do this, but obviously it requires that Horseman support version ^2.0.
Opening for comments. I've been playing with promises and would like to move the entire API to be promise based. Using Horseman would now look something like:
horseman
.headers( headers )
.then(function(){
return horseman.open( 'http://httpbin.org/headers' )
})
.then( function(){
return horseman.evaluate(function(){
return document.body.children[0].innerHTML;
});
})
.then( function( data ){
var response = JSON.parse( data );
response.should.have.property( 'headers' );
response.headers.should.have.property( 'X-Horseman-Header' );
response.headers[ 'X-Horseman-Header' ].should.equal( 'test header' );
})
.catch( function( e ){
console.log(e)
});
This would remove the dependency on deasync, and make it easy to run multiple horsemen at the same time, without blocking the entire process.
Downside: the API gets more convoluted and is not as easy to read.
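To illustrate the concurrency point, here is a minimal sketch of running several promise-returning tasks at once with Promise.all. fakeVisit is a hypothetical stand-in that simulates page work with a timer. It is not part of Horseman, and the proposed API may end up looking different.

```javascript
// fakeVisit is a hypothetical stand-in for a promise-based browser session.
// It simulates page work with a timer and resolves with a small result object.
function fakeVisit(url, delayMs) {
  return new Promise(function (resolve) {
    setTimeout(function () {
      resolve({ url: url, length: url.length });
    }, delayMs);
  });
}

// Start every visit immediately; none of them blocks the event loop.
// Promise.all keeps results in input order regardless of finish order.
function visitAll(urls) {
  return Promise.all(urls.map(function (url, i) {
    return fakeVisit(url, 10 * (urls.length - i));
  }));
}

visitAll(['http://www.google.com/', 'https://github.com/'])
  .then(function (results) {
    results.forEach(function (r) { console.log(r.url, r.length); });
  })
  .catch(function (e) { console.log(e); });
```

Because each task only returns a promise, the process stays free to handle other work while the tasks are pending.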