thesw4rm / amazon-books-scraper
Scraper I've written in C++/C for the purposes of learning the language. Trying to make from scratch as well as with libraries so I can learn by doing.
When a successful HTTP request is sent to http://mirror.vcu.edu/, the response is printed even though there is no explicit command to print it. The log colour also does not appear to be reset before this happens, which hints at an asynchronous print.
The output follows; everything after the first LOG: is green.
Hi
Extracting request header data...
Extracted host: Host = mirror.vcu.edu Extracted path: Path = / Extracted ssl: ssl = false
LOG: Socket file descriptor is 3
LOG: Connected socket at descriptor 3 to IP 128.172.15.65 and port 80
LOG: Request header is GET / HTTP/1.1
Host: mirror.vcu.edu
LOG: Sent HTTP request from socket at descriptor 3 to IP 128.172.15.65 and port 80.
LOG: Starting receive operation
LOG: Received HTTP response from socket at descriptor 3 to IP 128.172.15.65dual. Anyone using this system expressly consents to such
monitoring.<p>
LOG OFF IMMEDIATELY if you do not agree to the conditions stated
in this warning.
<hr>
<b>CRYPTOGRAPHIC SOFTWARE</b><p>
Due to U.S. Exports Regulations, all cryptographic software on this site is subject to the following legal notice:<p>
This site includes publicly available encryption source code which, together with object code resulting from the compiling of publicly available source code,
may be exported from the United States under License Exception "TSU" pursuant to 15 C.F.R. Section 740.13(e).<p>
This legal notice applies to cryptographic software only. Please see the <a href="http://www.bis.doc.gov">Bureau of Industry and Security</a> for more informa
tion about current U.S. regulations.<p>
This server is located in Richmond, Virginia, USA. Use in violation of any applicable laws is prohibited.
<hr>
</body></html>
from it are
for official University business use as authorized by the
<a href="https://policy.vcu.edu/sites/default/files/Computer%20and%20Network%20Resources%20Use.pdf">
Virginia Commonwealth University Computer and Network Resources Use Policy.</a><p>
Monitoring and recording of users' activities may occur
when there is reasonable suspicion of unauthorized activity and
may be used in administrative, civil, and criminal action against an
indivi and port 80.
LOG: Closed socket at descriptor 3
Need to change
    while (bytesReceived < (RESPONSE_MAX_LEN * sizeof(char)) && bytesReceived > bytesReceivedPrevious) {
        bytesReceivedPrevious = bytesReceived;
        bytesReceived = recv(sockFD, buffer, RESPONSE_BUFFER_SIZE, 0);
        response = realloc(response, sizeof(response) + RESPONSE_BUFFER_SIZE);
        strcat(response, buffer); //Append to the end, safe because recv takes care of limiting buffer size
    }
    response = realloc(response, sizeof(response) + sizeof(char));
    response[strlen(response)] = '\0';
to
    while (bytesReceived < (RESPONSE_MAX_LEN * sizeof(char)) && bytesReceived > bytesReceivedPrevious) {
        bytesReceivedPrevious = bytesReceived;
        bytesReceived = recv(sockFD, buffer, RESPONSE_BUFFER_SIZE, 0);
        response = realloc(response, sizeof(*response) + RESPONSE_BUFFER_SIZE);
        strcat(response, buffer); //Append to the end, safe because recv takes care of limiting buffer size
    }
    response = realloc(response, sizeof(*response) + sizeof(char));
    response[strlen(response)] = '\0';
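A possible further step, offered only as a sketch and not as the fix that was applied in this repository: if `response` is a `char *`, then `sizeof(response)` is the size of a pointer and `sizeof(*response)` is 1, so neither form grows the allocation with the amount of data received. In addition, `recv` does not NUL-terminate `buffer`, so `strcat` can read past the received bytes. That kind of overrun could plausibly explain response text showing up inside an unrelated log line. The version below tracks the running total explicitly and copies exactly the number of bytes `recv` returned. The function name `receive_response`, the `outLen` parameter, and the two constant values are assumptions for illustration; only `sockFD`, `buffer`, `response`, `RESPONSE_BUFFER_SIZE` and `RESPONSE_MAX_LEN` come from the code above.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Assumed values; the real project defines its own. */
#define RESPONSE_BUFFER_SIZE 1024
#define RESPONSE_MAX_LEN (1024 * 1024)

/*
 * Hypothetical helper: reads from sockFD until the peer closes the
 * connection, an error occurs, or RESPONSE_MAX_LEN bytes were read.
 * Returns a malloc'd, NUL-terminated string (caller frees), or NULL
 * if memory allocation fails. If outLen is non-NULL it receives the
 * number of bytes stored, not counting the terminator.
 */
char *receive_response(int sockFD, size_t *outLen)
{
    char buffer[RESPONSE_BUFFER_SIZE];
    char *response = NULL;
    size_t total = 0;

    while (total < RESPONSE_MAX_LEN) {
        size_t room = RESPONSE_MAX_LEN - total;
        size_t want = room < sizeof buffer ? room : sizeof buffer;

        ssize_t n = recv(sockFD, buffer, want, 0);
        if (n <= 0) {
            break; /* 0 means the peer closed; -1 means an error */
        }

        /* Grow to hold everything so far plus one byte for '\0'. */
        char *grown = realloc(response, total + (size_t)n + 1);
        if (grown == NULL) {
            free(response);
            return NULL;
        }
        response = grown;

        /* Copy exactly n bytes; buffer is not NUL-terminated. */
        memcpy(response + total, buffer, (size_t)n);
        total += (size_t)n;
        response[total] = '\0';
    }

    if (response == NULL) {
        /* Nothing was received: hand back an empty string. */
        response = calloc(1, 1);
        if (response == NULL) {
            return NULL;
        }
    }
    if (outLen != NULL) {
        *outLen = total;
    }
    return response;
}
```

The loop only ends on its own when the server closes the connection or the cap is reached. With HTTP/1.1 the connection may stay open by default, so a caller would likely need to send `Connection: close` in the request or parse `Content-Length` to know when to stop reading.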