chuyskywalker / rolling-curl
This project forked from lionsad/rolling-curl
Rolling-Curl: A non-blocking, non-dos multi-curl library for PHP
Home Page: https://github.com/chuyskywalker/rolling-curl
Is it possible to get the data of each URL after execution? I couldn't find any documentation on how to get the page content once execution has finished. It seems impossible to save the data from inside the callback.
Let's say I have an array:
$footprints = array('test', 'test2');
How would I pass it to
$rollingCurl->setCallback(function (\RollingCurl\Request $request, \RollingCurl\RollingCurl $rollingCurl) {
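One way to approach both questions above is a PHP closure with a `use` clause: a value such as `$footprints` can be captured by value, and a results array captured by reference is still filled after execution ends. The sketch below is self-contained, so `FakeRequest` and `runAll()` are hypothetical stand-ins for `\RollingCurl\Request` and the library's execution loop; only the closure pattern is the point.

```php
<?php
// Hypothetical stand-in for \RollingCurl\Request, so the sketch runs without the library.
final class FakeRequest
{
    private $url;
    private $responseText;

    public function __construct(string $url, string $responseText)
    {
        $this->url = $url;
        $this->responseText = $responseText;
    }

    public function getUrl(): string { return $this->url; }
    public function getResponseText(): string { return $this->responseText; }
}

// Hypothetical stand-in for the execution loop: calls the callback once per finished request.
function runAll(array $requests, callable $callback): void
{
    foreach ($requests as $request) {
        $callback($request);
    }
}

$footprints = array('test', 'test2');
$results = array();

// $footprints is captured by value; $results by reference, so writes are visible outside.
$callback = function (FakeRequest $request) use ($footprints, &$results) {
    $body = $request->getResponseText();
    $found = array();
    foreach ($footprints as $footprint) {
        if (strpos($body, $footprint) !== false) {
            $found[] = $footprint;
        }
    }
    $results[$request->getUrl()] = array('body' => $body, 'footprints' => $found);
};

runAll(array(
    new FakeRequest('http://example.com/a', 'a test page'),
    new FakeRequest('http://example.com/b', 'test2 only'),
), $callback);

print_r($results);
```

With the real library, the same `use ($footprints, &$results)` clause would be attached to the closure passed to `setCallback()`, and `$results` would be read after `execute()` returns.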
Hi Team,
I seem to have run into a problem with the POST request method.
My POST data looks like this:
array(3) {
  ["transfer_type"]=>
  string(7) "one-way"
  ["transfer_details"]=>
  array(1) {
    ["first_transfer"]=>
    array(5) {
      ["pickup_type"]=>
      string(1) "1"
      ["pick_up_city_code"]=>
      string(3) "LON"
      ["pickup"]=>
      string(7) "airport"
      ["dropoff"]=>
      string(5) "hotel"
      ["transfer_date"]=>
      string(10) "2018-10-08"
    }
  }
  ["pax"]=>
  array(2) {
    ["adult"]=>
    int(1)
    ["child_age"]=>
    array(1) {
      [0]=>
      int(5)
    }
  }
}
When I send this POST data to my other API, the request received there looks like this:
array(3) {
  ["transfer_type"]=>
  string(7) "one-way"
  ["transfer_details"]=>
  string(5) "Array"
  ["pax"]=>
  string(5) "Array"
}
What do you think is the cause of this problem?
Thank you
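A likely cause, stated as an assumption because the issue does not show the sending code: when a PHP array is handed to cURL as `CURLOPT_POSTFIELDS`, nested arrays are not serialized and end up cast to the string "Array". Encoding the data with `http_build_query()` before sending keeps the nesting. The sketch below only demonstrates the encoding round trip; `parse_str()` stands in for what a receiving PHP API would do with the request body.

```php
<?php
$postData = array(
    'transfer_type' => 'one-way',
    'transfer_details' => array(
        'first_transfer' => array(
            'pickup_type' => '1',
            'pick_up_city_code' => 'LON',
            'pickup' => 'airport',
            'dropoff' => 'hotel',
            'transfer_date' => '2018-10-08',
        ),
    ),
    'pax' => array('adult' => 1, 'child_age' => array(5)),
);

// Flatten the nested array into bracketed keys such as pax[child_age][0]=5.
$encoded = http_build_query($postData);

// Simulate the receiving side: PHP rebuilds the nested structure from the body.
parse_str($encoded, $received);

echo urldecode($encoded), "\n";
var_dump($received['pax']['child_age'][0]); // a string, since form values arrive as strings
```

The encoded string would then be passed as the post data wherever the array was passed before. Note that integers come back as strings on the receiving side.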
I am not getting the page source (HTML) that is generated by JavaScript execution. I found a solution like PhantomJS. Is there any solution?
Thanks
Hello,
I have used this script since version 1.0 for recursion (especially adding new URLs on the fly from the "callback" function), but this version does not seem to support that (I'm probably wrong and just don't know how to use it as it is), so here is my suggestion (what I have done).
In "rolling-curl/src/RollingCurl/RollingCurl.php", just move the "callback" (lines 275 to 279, "remove the curl handle that just completed") before line 262 ("// start a new request (it's important to do this before removing the old one)").
With this, you can add new URLs on the fly in the "callback" function (you have to pass the "RollingCurl" object to it).
Maybe this could be changed in the source?
Thank you for this very useful script.
My apologies for emailing you about your Repository, however it appears the Issues button is turned off and I wanted to ask for your input on a PR I’d like to make before I start, since you seem to be the head of the project.
Currently, RollingCurl::clearCompleted() states it will help prevent out of memory errors, but it only clears the completedRequests array. In my opinion, it would be extremely beneficial to get the behavior of prunePendingRequestQueue() during that process as well.
I realize that a developer could make their own code run both, but if that is the desired approach I think the documentation and phpdocs should be updated slightly to reflect the necessity for that. When you’re running RollingCurl in a continuous script and handling thousands of requests, it can cause issues because of the pendingRequests array and without checking the code directly you wouldn’t know the prune method exists.
I was thinking that perhaps requests could be removed from the pendingRequests array as they are being processed or the prunePendingRequestQueue() could also be called inside clearCompleted().
I just wanted to know if you’d prefer the documentation update approach or the programmatic approach. Please let me know and I’ll gladly create a PR.
Thanks again for your great library and again, I apologize for contacting you via email about your repository.
Hi, I'm trying to use your rolling-curl class to download images and save them to a local drive, but it is not working. Doing the same with file_get_contents() works, but I have to download images in bulk, so I need something faster like this. Can you help me out?
Here is a demo link to an image:
https://maps.googleapis.com/maps/api/streetview?size=400x400&location=taj%20Mahal&key=AIzaSyDTn9FYuxm3h3jKbEjwViHb7TKaCsXhUxI
Lines 251-252 are duplicates:
$request->setResponseErrno(curl_errno($transfer['handle']));
$request->setResponseError(curl_error($transfer['handle']));
Do I need to rotate IPs while using rolling-curl to avoid being blocked when fetching the HTML source of a site?
Currently I can slow down the requests by reducing SimultaneousLimit, but I am wondering if there is a way to add a random sleep before each request, to make the traffic look more natural?
Thanks!
Hello, I am trying to set up a proxy checker with rolling-curl, but when I set CURLOPT_PROXY for every request, all the requests fail if the number of threads is > ~10, whereas if I set the IP of the proxies as CURLOPT_URL I can run 500 threads fine and the results are consistent. Is this a limitation of curl itself? I seem to get this behavior with any wrapper of curl_multi, so maybe curl can't handle many different CURLOPT_PROXY settings simultaneously?
Respected Sir
In the setCallback function, $html = $request->getResponseText(); returns null. For google.com and msn.com it works, but for some sites such as yahoo.com getResponseText() returns null.
First off, I love rolling-curl. Great work! I noticed that if you mix IPv4 and IPv6, all the requests get set to one or the other. This is not desirable in my situation as I'm using a mixture of IPv4 and IPv6 interfaces for scraping. Let me know if I can help in any way.
The ideal output of the code below would be both your IPv4 address and your IPv6 address.
// $rc was not defined in the original report; instantiated here so the snippet runs.
$rc = new \RollingCurl\RollingCurl();
$sites = [
    'http://icanhazip.com?1' => [CURLOPT_IPRESOLVE => CURL_IPRESOLVE_V4],
    'http://icanhazip.com?2' => [CURLOPT_IPRESOLVE => CURL_IPRESOLVE_V6],
];
// Queue one request per URL, each with its own IP resolution option.
foreach ($sites as $url => $options) {
    $request = new \RollingCurl\Request($url);
    $rc->add($request->addOptions($options));
}
// Print the elapsed time and the IP reported for each completed request.
$rc->setCallback(function(\RollingCurl\Request $request, \RollingCurl\RollingCurl $rc) {
    $diff = microtime(true) - $_SERVER["REQUEST_TIME_FLOAT"];
    echo "..{$diff}\t".$request->getUrl().": ".$request->getResponseText()."\n";
});
$rc->execute();
$diff = microtime(true) - $_SERVER["REQUEST_TIME_FLOAT"];
echo "..{$diff}\tdone\n";
Hello, I want to get the curl_errno() message. How can I do this?
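The lines quoted in the duplicate-lines report above show the library storing `curl_errno()` and `curl_error()` on the request through `setResponseErrno()` and `setResponseError()`, so matching getters on the request (assumed here, not verified) would be the place to look inside the callback. Independently of the library, the sketch below shows how the numeric code and its message relate in plain PHP cURL. The made-up `nosuchscheme://` URL is an assumption chosen so that the transfer fails without any network access.

```php
<?php
// Plain cURL, no library: an unsupported URL scheme fails without touching the network.
$handle = curl_init();
curl_setopt($handle, CURLOPT_URL, 'nosuchscheme://example.invalid/');
curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);

$body = curl_exec($handle);     // false on failure
$errno = curl_errno($handle);   // numeric code, 0 means no error
$error = curl_error($handle);   // human-readable message for this handle

if ($body === false) {
    echo "cURL error {$errno}: {$error}\n";
    // curl_strerror() maps a code to a generic description, useful when only the number was kept.
    echo 'Generic description: ', curl_strerror($errno), "\n";
}
```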
Hi,
I just found that prunePendingRequestQueue does not work as it should: it always returns an empty list because of a bug in getNextPendingRequests. Please see this code:
private function getNextPendingRequests($limit = 1)
{
    $requests = array();
    while ($limit--) {
        if (!isset($this->pendingRequests[$this->pendingRequestsPosition])) {
            break;
        }
        $requests[] = $this->pendingRequests[$this->pendingRequestsPosition];
        $this->pendingRequestsPosition++;
    }
    return $requests;
}
If $limit is 0, the while loop will never run. I propose this patch to fix this bug:
- while ($limit--) {
+ $countPending = $limit <= 0 ? $this->countPending() : $limit;
+ while ($countPending--) {
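To check the proposed patch in isolation, here is a minimal sketch with the patched loop placed in a small stand-in class. `PendingQueue` and its `add()` method are hypothetical, the method is made public only for the demonstration, and `countPending()` is assumed to return the number of queued items not yet handed out.

```php
<?php
// Hypothetical stand-in for the pending-request bookkeeping inside RollingCurl.
final class PendingQueue
{
    private $pendingRequests = array();
    private $pendingRequestsPosition = 0;

    public function add($request)
    {
        $this->pendingRequests[] = $request;
    }

    // Assumed definition: items queued but not yet handed out.
    public function countPending()
    {
        return count($this->pendingRequests) - $this->pendingRequestsPosition;
    }

    // Patched version: a limit of 0 (or less) means "everything still pending".
    public function getNextPendingRequests($limit = 1)
    {
        $requests = array();
        $countPending = $limit <= 0 ? $this->countPending() : $limit;
        while ($countPending--) {
            if (!isset($this->pendingRequests[$this->pendingRequestsPosition])) {
                break;
            }
            $requests[] = $this->pendingRequests[$this->pendingRequestsPosition];
            $this->pendingRequestsPosition++;
        }
        return $requests;
    }
}

$queue = new PendingQueue();
foreach (array('a', 'b', 'c') as $url) {
    $queue->add($url);
}

$first = $queue->getNextPendingRequests(1); // one item
$rest = $queue->getNextPendingRequests(0);  // all remaining items
$none = $queue->getNextPendingRequests(0);  // nothing left to hand out
```

With the original `while ($limit--)` loop, the second call would return an empty array, which matches the behavior described in the report.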
E.g. when RollingCurl downloads a page, I parse it and find out I need to download a different page (because of pagination, a redirect or something like that).