Comments (14)
Hi, I want to work on this. Assign it to me.
Done @milinddethe15, let me know if you have some doubts or you need guidance.
Hi @edoardottt, there are already some duplicate links in README.md:
```
[ ERR ] DUPLICATE FOUND!
- [C99.nl](https://api.c99.nl/)
- [HackerTarget](https://hackertarget.com/ip-tools/)
- [IntelligenceX](https://intelx.io/)
- [PhoneBook](https://phonebook.cz/)
- [Rapid7 - DB](https://www.rapid7.com/db/)
- [RocketReach](https://rocketreach.co/)
- [SynapsInt](https://synapsint.com/)
- [Vulmon](https://vulmon.com/)
- [wannabe1337.xyz](https://wannabe1337.xyz/)
```
We need to fix this before running the script in the workflow.
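Something like this minimal workflow is what I have in mind (just a sketch; the file path `.github/workflows/check-dups.yml` and the trigger events are placeholders):

```yaml
# Hypothetical workflow sketch: run the duplicate check on pushes and PRs.
name: check-duplicates
on: [push, pull_request]
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Check README for duplicate links
        run: bash scripts/check-dups.sh
```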
Sorry @edoardottt, bugmenot is not a duplicate. I created a duplicate of the bugmenot entry to test the script and forgot to discard it.
Thank you @edoardottt! I learned a lot about bash scripting and GitHub Actions in this issue.
Hi @edoardottt, what do you mean by 'devel branches'?
> Hi @edoardottt, what do you mean by 'devel branches'?
Sorry, I'm working on multiple repos. There's only the main branch here. Sorry for the mistake.
Thanks @milinddethe15
1. How did you run the script?
If I run the script locally this is what I get:
```
$> ./scripts/check-dups.sh
[ OK! ] NO DUPLICATES FOUND.
350 links in README.
```
2. Clearly those are duplicate entries, but they are actually okay, in the sense that those sites provide multiple services, so it's fine for a single entry to appear under e.g. both DNS and Domains.
As an example:
```
cat README.md | grep Vulmon
- [Vulmon](https://vulmon.com/) - Vulnerability and exploit search engine
- [Vulmon](https://vulmon.com/) - Vulnerability and exploit search engine
```
There are two entries, but in different categories (one under Vulnerabilities, the other under Exploits).
3. The best solution would be to check for duplicates within each category; in that case, a duplicated entry is a real error.
Updated script:
```bash
#!/bin/bash

readme="README.md"
pwd=$(pwd)

# Allow running from inside the scripts/ directory as well
if [[ "${pwd: -7}" == "scripts" ]]; then
    readme="../README.md"
fi

# Function to extract links from a section and check for duplicates
check_section() {
    section=$1
    # Print everything between "### $section" and the next "### " heading
    # (the -v variable is set, but the heading is actually spliced into
    # the pattern by the shell)
    section_content=$(awk -v section="$section" '/^### / {p=0} {if(p)print} /^### '"$section"'/ {p=1}' "$readme")
    # Extract the URL part of every markdown link and keep only repeated ones
    duplicate_links=$(echo "$section_content" | grep -oP '\[.*?\]\(\K[^)]+' | sort | uniq -d)
    if [[ -n $duplicate_links ]]; then
        echo "[ ERR ] DUPLICATE LINKS FOUND IN SECTION: $section"
        echo "$duplicate_links"
    else
        echo "[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: $section"
    fi
}

# Get all unique section headings from the README file and handle spaces and slashes
sections=$(grep '^### ' "$readme" | sed 's/^### //' | sed 's/[\/&]/\\&/g')

# Call the function for each section
for section in $sections; do
    check_section "$section"
done
```
```
$ ./scripts/check-dups.sh
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: General
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Search
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Engines
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Servers
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Vulnerabilities
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Exploits
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Attack
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Surface
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Code
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Mail
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Addresses
[ ERR ] DUPLICATE LINKS FOUND IN SECTION: Domains
https://spyonweb.com/
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: URLs
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: DNS
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Certificates
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: WiFi
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Networks
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Device
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Information
[ ERR ] DUPLICATE LINKS FOUND IN SECTION: Credentials
https://bugmenot.com/
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Leaks
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Hidden
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Services
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Social
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Networks
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Phone
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Numbers
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Images
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Threat
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Intelligence
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Web
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: History
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Surveillance
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: cameras
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Unclassified
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Not
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: working
awk: warning: escape sequence `\/' treated as plain `/'
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: \/
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Paused
```
There are duplicate links in some categories. I will fix them.
Should I finalise this updated script?
Amazing! Yes, you can create a new issue for deleting duplicates and open a PR removing them.
- For spyonweb we can delete the first entry, while (I may be wrong on this) bugmenot is not a duplicate... am I wrong? In the Credentials section there is only one entry for bugmenot.
We should also fix this part:
```
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Not
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: working
awk: warning: escape sequence `\/' treated as plain `/'
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: \/
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Paused
```
This should be treated as a single category, `Not Working / Paused`.
Also this:
```
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: General
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Search
[ OK! ] NO DUPLICATE LINKS FOUND IN SECTION: Engines
```
should be treated as a single category, `General Search Engines`.
IMO the script should always finish, but in case duplicates are found it should exit with code 1.
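A rough, untested sketch of what I mean (assuming every heading starts with `### `): reading the headings line by line keeps multi-word categories intact, comparing headings as plain strings inside awk avoids the regex escaping entirely, and the exit code is deferred until every section has been checked.

```bash
#!/bin/bash
# Untested sketch, not a final implementation.
readme="README.md"
found=0

# Read one heading per iteration, so "General Search Engines"
# stays one section name instead of being split into three words.
while IFS= read -r section; do
    # Compare headings as plain strings: no regex is built from the
    # heading, so slashes need no escaping at all.
    section_content=$(awk -v s="### $section" \
        '/^### / {p = ($0 == s); next} p {print}' "$readme")
    dups=$(echo "$section_content" | grep -oP '\[.*?\]\(\K[^)]+' | sort | uniq -d)
    if [[ -n $dups ]]; then
        echo "[ ERR ] DUPLICATE LINKS FOUND IN SECTION: $section"
        echo "$dups"
        found=1
    fi
done < <(grep '^### ' "$readme" | sed 's/^### //')

# The whole scan always finishes; failure is reported via the exit code.
exit "$found"
```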
Super, there is only one error to correct :)
Hi @edoardottt,
In the previous script, I was not able to solve the issue you mentioned in your reply, where multi-word category names should be treated as a single category. (Please give me your input on this error.)
So, I have updated the script so that if a duplicate link is found it prints the link and exits with code 1.
readme="README.md"
pwd=$(pwd)
if [[ "${pwd: -7}" == "scripts" ]];
then
readme="../README.md"
fi
# Function to extract links from a section and check for duplicates
check_section() {
section=$1
section_escaped=$(sed 's/[&/\]/\\&/g' <<< "$section")
section_content=$(awk -v section="$section" '/^### / {p=0} {if(p)print} /^### '"$section"'/ {p=1}' "$readme")
duplicate_links=$(echo "$section_content" | grep -oP '\[.*?\]\(\K[^)]+' | sort | uniq -d)
if [[ -n $duplicate_links ]]; then
echo "[ ERR ] DUPLICATE LINKS FOUND"
echo "$duplicate_links"
exit 1
fi
}
# Get all unique section headings from the README file and handle spaces and slashes
sections=$(grep '^### ' "$readme" | sed 's/^### //' | sed 's/[\/&]/\\&/g')
# Call the function for each section
for section in $sections; do
check_section "$section"
done
Running this script:
```
$ ./scripts/check-dups.sh
awk: warning: escape sequence `\/' treated as plain `/'
```
gives this warning, and I am not able to resolve it.
Please give me your input on which script to use and how to fix this error.
Thanks.
Sorry @milinddethe15, open a pull request with the code you are trying to push and we can discuss it better there. It's difficult to do this in issue comments.
Completed, thank you so much @milinddethe15 for your contribution!