Comments (10)
@lexciobotariu
I am running 30 queries per instance with 1 core (-c 1). This should solve your problem if speed is not important.
This is the script I'm using. It reads "keywords.txt", writes keywords01.txt to keywordsNN.txt with 30 queries each, and finally creates run.sh, which runs one scraper instance per file, one after another.
Create build.sh, copy the code below into it, make the file executable (chmod +x build.sh) and run it (./build.sh or bash build.sh), then run the generated run.sh file.
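#!/bin/bash
# Split keywords.txt into chunks of 30 queries and generate run.sh,
# which runs one scraper instance per chunk, one after another.

# Write lines start..end of keywords.txt to <prefix>.txt and append
# the matching scraper command to run.sh.
create_keywords_file() {
    local file_prefix="$1"
    local start_line=$2
    local end_line=$3
    sed -n "${start_line},${end_line}p" keywords.txt > "${file_prefix}.txt"
    # Start the scraper in the background, remember its PID, then wait for it.
    echo "./google-maps-scraper -input \"${file_prefix}.txt\" -results \"${file_prefix}.csv\" -exit-on-inactivity 3m -c 1 -depth 14 &" >> run.sh
    echo "${file_prefix}=\$!" >> run.sh
    echo "wait \$${file_prefix}" >> run.sh
}

main() {
    if [ ! -f keywords.txt ]; then
        echo "Error: keywords.txt not found!"
        exit 1
    fi
    # Start run.sh from scratch so re-running build.sh does not append duplicate commands.
    echo '#!/bin/bash' > run.sh
    # grep -c '' also counts a last line that has no trailing newline (wc -l would miss it).
    total_lines=$(grep -c '' keywords.txt)
    lines_per_file=30
    num_files=$((total_lines / lines_per_file))
    remainder=$((total_lines % lines_per_file))
    # Full chunks of lines_per_file queries each.
    for ((i=1; i<=num_files; i++)); do
        start_line=$(( (i - 1) * lines_per_file + 1 ))
        end_line=$(( i * lines_per_file ))
        file_prefix="keywords$(printf "%02d" $i)"
        create_keywords_file "$file_prefix" "$start_line" "$end_line"
        echo "sleep 30" >> run.sh
    done
    # Leftover queries that do not fill a whole chunk.
    if [ "$remainder" -gt 0 ]; then
        start_line=$((num_files * lines_per_file + 1))
        end_line=$((total_lines))
        file_prefix="keywords$(printf "%02d" $((num_files + 1)))"
        create_keywords_file "$file_prefix" "$start_line" "$end_line"
        echo "sleep 30" >> run.sh
    fi
    chmod +x run.sh
    echo "run.sh file created successfully!"
}

main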
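To make the splitting arithmetic of the script above easier to follow, here is a minimal sketch that only prints which line range each chunk file would cover, without touching any files. The helper name chunk_ranges is hypothetical and not part of build.sh; the 65-line example input is made up for illustration.

```shell
#!/bin/bash
# Print the 1-based line range each chunk file would cover.
# Usage: chunk_ranges TOTAL_LINES LINES_PER_FILE
# (hypothetical helper for illustration, not part of build.sh)
chunk_ranges() {
    local total=$1 per_file=$2
    local start=1 end index=1
    while [ "$start" -le "$total" ]; do
        end=$(( start + per_file - 1 ))
        # The last chunk may hold fewer than per_file lines.
        if [ "$end" -gt "$total" ]; then
            end=$total
        fi
        printf 'keywords%02d.txt: lines %d-%d\n' "$index" "$start" "$end"
        start=$(( end + 1 ))
        index=$(( index + 1 ))
    done
}

# Example: 65 queries split into chunks of 30.
chunk_ranges 65 30
```

With 65 queries this gives two full files of 30 lines and a third file with the remaining 5, which matches the full-chunk loop plus remainder branch in the script.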
@gosom
I am using the latest Go (1.22.1) on both platforms.
I didn't use Docker. I am off for 3 days; after that I will test and comment here again.
from google-maps-scraper.
The latest release (v1.2.1) has performance enhancements, and memory usage looks stable.
@admbyz can you try this one?
PS. I have only tested on Fedora Linux, but it looks good.
from google-maps-scraper.
Sure, I will try without splitting my keywords and let you know.
Thanks for the bump.
PS:
@gosom I tried with -c 1 for a while and the problem seems to be gone: instances now exit correctly, and I didn't see any instance use more than 250 MB. So I ended that session and am now running with -c 8 and 973 keywords. I'll update this post when it's done.
The problem is gone. Tested on Xubuntu 22.04 with the latest updates, with the scraper compiled with Go 1.22.2.
973 keywords with -c 8 -depth 14: the scraper finished its job successfully. No memory leaks, and it collected 38 MB worth of data.
Good job @gosom, thank you for the effort!
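For anyone who wants to check per-instance memory figures like the 250 MB mentioned above, the following is a minimal sketch of one way to read a process's resident memory on Linux. It assumes a Linux /proc filesystem, where /proc/<pid>/status reports VmRSS in kB; the helper name rss_mb is hypothetical, and this is not necessarily how the numbers in this thread were measured.

```shell
#!/bin/bash
# Print the resident memory (RSS) of a process in MB.
# Usage: rss_mb PID
# Returns non-zero if the process does not exist or has no VmRSS entry.
# (hypothetical helper; Linux-only, reads /proc/<pid>/status)
rss_mb() {
    local pid=$1 rss_kb
    # VmRSS is reported in kB, e.g. "VmRSS:     3512 kB".
    rss_kb=$(awk '/^VmRSS:/ {print $2}' "/proc/$pid/status" 2>/dev/null)
    [ -n "$rss_kb" ] || return 1
    echo $(( rss_kb / 1024 ))
}

# Example: report the memory of the current shell.
echo "this shell uses about $(rss_mb $$) MB"
```

To watch a scraper instance, the same function could be called with that instance's PID in a loop with a sleep between readings.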
from google-maps-scraper.
Hi @admbyz, I have the same problem with 10 cores and 32 GB RAM. I was doing 10 queries at a time and running it manually again and again with new queries.
Would you mind sharing the bash file?
Thanks.
from google-maps-scraper.
@admbyz which Go version do you use?
from google-maps-scraper.
Do you have the same issue when you run it in a Docker container?
from google-maps-scraper.
@gosom any updates? My 32 GB of RAM isn't enough.
I'm using Go 1.22.1 on Ubuntu 22.
This should be enough to go way beyond 32 GB:
Friseur München, deutschland
Restaurant München, deutschland
Dolmetscher München, deutschland
Tischler München, deutschland
Maler München, deutschland
Sanitär Installateur München, deutschland
Heizungsbauer München, deutschland
Schlosser München, deutschland
Elektriker München, deutschland
Fliesenleger München, deutschland
Zimmermann München, deutschland
Glaser München, deutschland
Dachdecker München, deutschland
Maurer München, deutschland
Metallbauer München, deutschland
Steinmetz München, deutschland
Schreiner München, deutschland
Installateur für Heizung, Lüftung und Sanitär München, deutschland
Bodenleger München, deutschland
Stuckateur München, deutschland
Kaminbauer München, deutschland
Ofenbauer München, deutschland
Parkettleger München, deutschland
Raumausstatter München, deutschland
Bautischler München, deutschland
Restaurator München, deutschland
Bootsbauer München, deutschland
Uhrmacher München, deutschland
Goldschmied München, deutschland
Silberschmied München, deutschland
Graveur München, deutschland
Uhrmacher München, deutschland
Modellbauer München, deutschland
Drechsler München, deutschland
Holzbildhauer München, deutschland
Kunstschmied München, deutschland
Sattler München, deutschland
Tapezierer München, deutschland
Polsterer München, deutschland
Schuhmacher München, deutschland
Immobilienmakler München, deutschland
Reisebüro München, deutschland
Blumenladen München, deutschland
Buchhandlung München, deutschland
Autowerkstatt München, deutschland
Elektronikgeschäft München, deutschland
Schuhgeschäft München, deutschland
Optiker München, deutschland
Fahrradladen München, deutschland
Goldschmied München, deutschland
Juwelier München, deutschland
Tattoostudio München, deutschland
Fotostudio München, deutschland
Hochzeitsphotograph München, deutschland
Rechtsanwaltskanzlei München, deutschland
Steuerberater München, deutschland
Architekturbüro München, deutschland
Innenarchitekt München, deutschland
Restaurant München, deutschland
Café München, deutschland
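The list above contains a few repeated queries (Uhrmacher, Goldschmied and Restaurant each appear twice). Whether the scraper skips repeated queries on its own is not stated in this thread, so as a precaution the input file can be deduplicated first. This is a minimal sketch; the helper name dedupe_keywords is hypothetical, and the temporary file stands in for a real keywords.txt.

```shell
#!/bin/bash
# Print the lines of a file with later duplicates removed,
# keeping the order in which lines first appear.
# Usage: dedupe_keywords FILE   (hypothetical helper name)
dedupe_keywords() {
    # seen[$0]++ is 0 (false) the first time a line occurs, so only first occurrences print.
    awk '!seen[$0]++' "$1"
}

# Example with a small temporary file.
tmp=$(mktemp)
printf '%s\n' \
    "Uhrmacher München, deutschland" \
    "Goldschmied München, deutschland" \
    "Uhrmacher München, deutschland" > "$tmp"
dedupe_keywords "$tmp"
rm -f "$tmp"
```

Unlike sort -u, this keeps the original order of the queries, so the chunk files produced afterwards stay in the order the list was written.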
from google-maps-scraper.
#7
I have attached a memory graph in this issue; can you help check it? It seems like Playwright didn't close correctly.
from google-maps-scraper.
@gosom
I tried Docker on a Win10 PC with -c 1 -depth 14, but memory usage was much higher because of WSL, and I hit the error much faster because Docker + WSL started at around 8 GB of RAM usage.
from google-maps-scraper.
@gosom Works fine for me too! Thanks Mate!
from google-maps-scraper.