smicallef / spiderfoot

SpiderFoot automates OSINT for threat intelligence and mapping your attack surface.

Home Page: http://www.spiderfoot.net

License: MIT License

Python 98.07% CSS 0.34% JavaScript 1.18% Dockerfile 0.10% Shell 0.06% RobotFramework 0.26%
footprinting osint threatintel python infosec intelligence-gathering osint-reconnaissance pentesting threat-intelligence security-tools

spiderfoot's Introduction


SpiderFoot is an open source intelligence (OSINT) automation tool. It integrates with just about every data source available and utilises a range of methods for data analysis, making that data easy to navigate.

SpiderFoot has an embedded web-server for providing a clean and intuitive web-based interface but can also be used completely via the command-line. It's written in Python 3 and MIT-licensed.

FEATURES

  • Web based UI or CLI
  • Over 200 modules (see below)
  • Python 3.7+
  • YAML-configurable correlation engine with 37 pre-defined rules
  • CSV/JSON/GEXF export
  • API key export/import
  • SQLite back-end for custom querying (see the query sketch after this list)
  • Highly configurable
  • Fully documented
  • Visualisations
  • TOR integration for dark web searching
  • Dockerfile for Docker-based deployments
  • Can call other tools like DNSTwist, WhatWeb, Nmap and CMSeeK
  • Actively developed since 2012!
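
Because everything a scan finds lands in a local SQLite database (spiderfoot.db by default), results can be queried directly. A minimal sketch, assuming the tbl_scan_results table and column names from the bundled schema (verify against your version):

 import sqlite3

 # Query scan results straight from SpiderFoot's SQLite store.
 # Table/column names follow the sfdb schema but may vary by version.
 conn = sqlite3.connect("spiderfoot.db")
 for row in conn.execute(
         "SELECT type, data FROM tbl_scan_results WHERE type = ? LIMIT 10",
         ("EMAILADDR",)):
     print(row)
 conn.close()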

WANT MORE?

Need more from SpiderFoot? Check out SpiderFoot HX for:

  • 100% Cloud-based and managed for you
  • Attack Surface Monitoring with change notifications by email, REST and Slack
  • Multiple targets per scan
  • Multi-user collaboration
  • Authentication and 2FA
  • Investigations
  • Customer support
  • Third party tools pre-installed & configured
  • Drive it with a fully RESTful API
  • TOR integration built-in
  • Screenshotting
  • Bring your own Python SpiderFoot modules
  • Feed scan data to Splunk, ElasticSearch and REST endpoints

See the full set of differences between SpiderFoot HX and the open source version here.

USES

SpiderFoot can be used offensively (e.g. in a red team exercise or penetration test) for reconnaissance of your target or defensively to gather information about what you or your organisation might have exposed over the Internet.

You can target the following entities in a SpiderFoot scan:

  • IP address
  • Domain/sub-domain name
  • Hostname
  • Network subnet (CIDR)
  • ASN
  • E-mail address
  • Phone number
  • Username
  • Person's name
  • Bitcoin address

SpiderFoot's 200+ modules feed each other in a publisher/subscriber model to maximise data extraction.
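
Concretely, each module subscribes to the event types it needs and publishes new events for other modules to consume. A minimal sketch of that pattern (the class layout follows the SpiderFootPlugin convention used by the bundled modules, but this particular module and its resolver call are hypothetical):

 from spiderfoot import SpiderFootEvent, SpiderFootPlugin

 class sfp_example(SpiderFootPlugin):
     """Hypothetical module: turns INTERNET_NAME events into IP_ADDRESS events."""

     def watchedEvents(self):
         # Event types this module subscribes to.
         return ["INTERNET_NAME"]

     def producedEvents(self):
         # Event types this module publishes for downstream modules.
         return ["IP_ADDRESS"]

     def handleEvent(self, event):
         # Resolve the hostname; using resolveHost as the helper here is an assumption.
         addrs = self.sf.resolveHost(event.data)
         if addrs:
             evt = SpiderFootEvent("IP_ADDRESS", addrs[0], self.__name__, event)
             self.notifyListeners(evt)  # publish to subscribing modules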

INSTALLING & RUNNING

To install and run SpiderFoot, you need at least Python 3.7 and a number of Python libraries which you can install with pip. We recommend you install a packaged release since master will often have bleeding edge features and modules that aren't fully tested.

Stable build (packaged release):

 wget https://github.com/smicallef/spiderfoot/archive/v4.0.tar.gz
 tar zxvf v4.0.tar.gz
 cd spiderfoot-4.0
 pip3 install -r requirements.txt
 python3 ./sf.py -l 127.0.0.1:5001
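
The same install can also run scans entirely from the command line. A hedged example (flag names as in recent releases; run python3 ./sf.py --help to confirm for your version):

 python3 ./sf.py -s example.com -m sfp_dnsresolve -o csv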

Development build (cloning git master branch):

 git clone https://github.com/smicallef/spiderfoot.git
 cd spiderfoot
 pip3 install -r requirements.txt
 python3 ./sf.py -l 127.0.0.1:5001
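
There is also a Dockerfile in the repository root. A typical Docker-based run looks something like this (the image tag is arbitrary; check the repository's Docker documentation for the supported invocation):

 docker build -t spiderfoot .
 docker run -p 5001:5001 spiderfoot

Then browse to http://127.0.0.1:5001 as with a local install.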

Check out the documentation and our asciinema videos for more tutorials.

COMMUNITY

Whether you're a contributor, user or just curious about SpiderFoot and OSINT in general, we'd love to have you join our community! SpiderFoot now has a Discord server for seeking help from the community, requesting features or just general OSINT chit-chat.

WRITING CORRELATION RULES

We have a comprehensive write-up and reference of the correlation rule-set introduced in SpiderFoot 4.0 here.

Also take a look at the template.yaml file for a walk-through. The existing 37 rules are also quite readable and serve as good starting points for additional rules.
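
To give a flavour of the format: a rule is a YAML document with an id, metadata, one or more collections that gather and filter events, and a headline for matches. The sketch below is illustrative only; the field names follow the documented rule structure, but treat template.yaml and the shipped rules as authoritative:

 id: example_open_bucket
 version: 1
 meta:
   name: Open cloud storage bucket found
   description: A scan discovered an openly listable cloud storage bucket.
   risk: HIGH
 collections:
   - collect:
       - method: exact
         field: type
         value: CLOUD_STORAGE_BUCKET_OPEN
 headline: "Open cloud storage bucket: {data}"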

MODULES / INTEGRATIONS

SpiderFoot has over 200 modules, most of which don't require API keys, and many of those that do require API keys have a free tier.

Name Description Type
AbstractAPI Look up domain, phone and IP address information from AbstractAPI. Tiered API
abuse.ch Check if a host/domain, IP address or netblock is malicious according to Abuse.ch. Free API
AbuseIPDB Check if an IP address is malicious according to AbuseIPDB.com blacklist. Tiered API
Abusix Mail Intelligence Check if a netblock or IP address is in the Abusix Mail Intelligence blacklist. Tiered API
Account Finder Look for possible associated accounts on over 500 social and other websites such as Instagram, Reddit, etc. Internal
AdBlock Check Check if linked pages would be blocked by AdBlock Plus. Tiered API
AdGuard DNS Check if a host would be blocked by AdGuard DNS. Free API
Ahmia Search Tor 'Ahmia' search engine for mentions of the target. Free API
AlienVault IP Reputation Check if an IP or netblock is malicious according to the AlienVault IP Reputation database. Free API
AlienVault OTX Obtain information from AlienVault Open Threat Exchange (OTX) Tiered API
Amazon S3 Bucket Finder Search for potential Amazon S3 buckets associated with the target and attempt to list their contents. Free API
Apple iTunes Search Apple iTunes for mobile apps. Free API
Archive.org Identifies historic versions of interesting files/pages from the Wayback Machine. Free API
ARIN Queries ARIN registry for contact information. Free API
Azure Blob Finder Search for potential Azure blobs associated with the target and attempt to list their contents. Free API
Base64 Decoder Identify Base64-encoded strings in URLs, often revealing interesting hidden information. Internal
BGPView Obtain network information from BGPView API. Free API
Binary String Extractor Attempt to identify strings in binary content. Internal
BinaryEdge Obtain information from BinaryEdge.io Internet scanning systems, including breaches, vulnerabilities, torrents and passive DNS. Tiered API
Bing (Shared IPs) Search Bing for hosts sharing the same IP. Tiered API
Bing Obtain information from bing to identify sub-domains and links. Tiered API
Bitcoin Finder Identify bitcoin addresses in scraped webpages. Internal
Bitcoin Who's Who Check for Bitcoin addresses against the Bitcoin Who's Who database of suspect/malicious addresses. Tiered API
BitcoinAbuse Check Bitcoin addresses against the bitcoinabuse.com database of suspect/malicious addresses. Free API
Blockchain Queries blockchain.info to find the balance of identified bitcoin wallet addresses. Free API
blocklist.de Check if a netblock or IP is malicious according to blocklist.de. Free API
BotScout Searches BotScout.com's database of spam-bot IP addresses and e-mail addresses. Tiered API
botvrij.eu Check if a domain is malicious according to botvrij.eu. Free API
BuiltWith Query BuiltWith.com's Domain API for information about your target's web technology stack, e-mail addresses and more. Tiered API
C99 Queries the C99 API which offers various data (geo location, proxy detection, phone lookup, etc). Commercial API
CallerName Lookup US phone number location and reputation information. Free API
Censys Obtain host information from Censys.io. Tiered API
Certificate Transparency Gather hostnames from historical certificates in crt.sh. Free API
CertSpotter Gather information about SSL certificates from SSLMate CertSpotter API. Tiered API
CINS Army List Check if a netblock or IP address is malicious according to Collective Intelligence Network Security (CINS) Army list. Free API
CIRCL.LU Obtain information from CIRCL.LU's Passive DNS and Passive SSL databases. Free API
CleanBrowsing.org Check if a host would be blocked by CleanBrowsing.org DNS content filters. Free API
CleanTalk Spam List Check if a netblock or IP address is on CleanTalk.org's spam IP list. Free API
Clearbit Check for names, addresses, domains and more based on lookups of e-mail addresses on clearbit.com. Tiered API
CloudFlare DNS Check if a host would be blocked by CloudFlare DNS. Free API
CoinBlocker Lists Check if a domain appears on CoinBlocker lists. Free API
CommonCrawl Searches for URLs found through CommonCrawl.org. Free API
Comodo Secure DNS Check if a host would be blocked by Comodo Secure DNS. Tiered API
Company Name Extractor Identify company names in any obtained data. Internal
Cookie Extractor Extract Cookies from HTTP headers. Internal
Country Name Extractor Identify country names in any obtained data. Internal
Credit Card Number Extractor Identify Credit Card Numbers in any data Internal
Crobat API Search Crobat API for subdomains. Free API
Cross-Referencer Identify whether other domains are associated ('Affiliates') of the target by looking for links back to the target site(s). Internal
CRXcavator Search CRXcavator for Chrome extensions. Free API
Custom Threat Feed Check if a host/domain, netblock, ASN or IP is malicious according to your custom feed. Internal
CyberCrime-Tracker.net Check if a host/domain or IP address is malicious according to CyberCrime-Tracker.net. Free API
Debounce Check whether an email is disposable Free API
Dehashed Gather breach data from Dehashed API. Commercial API
Digital Ocean Space Finder Search for potential Digital Ocean Spaces associated with the target and attempt to list their contents. Free API
DNS Brute-forcer Attempts to identify hostnames through brute-forcing common names and iterations. Internal
DNS Common SRV Attempts to identify hostnames through brute-forcing common DNS SRV records. Internal
DNS for Family Check if a host would be blocked by DNS for Family. Free API
DNS Look-aside Attempt to reverse-resolve the IP addresses next to your target to see if they are related. Internal
DNS Raw Records Retrieves raw DNS records such as MX, TXT and others. Internal
DNS Resolver Resolves hosts and IP addresses identified, also extracted from raw content. Internal
DNS Zone Transfer Attempts to perform a full DNS zone transfer. Internal
DNSDB Query FarSight's DNSDB for historical and passive DNS data. Tiered API
DNSDumpster Passive subdomain enumeration using HackerTarget's DNSDumpster Free API
DNSGrep Obtain Passive DNS information from Rapid7 Sonar Project using DNSGrep API. Free API
DroneBL Query the DroneBL database for open relays, open proxies, vulnerable servers, etc. Free API
DuckDuckGo Query DuckDuckGo's API for descriptive information about your target. Free API
E-Mail Address Extractor Identify e-mail addresses in any obtained data. Internal
EmailCrawlr Search EmailCrawlr for email addresses and phone numbers associated with a domain. Tiered API
EmailFormat Look up e-mail addresses on email-format.com. Free API
EmailRep Search EmailRep.io for email address reputation. Tiered API
Emerging Threats Check if a netblock or IP address is malicious according to EmergingThreats.net. Free API
Error String Extractor Identify common error messages in content like SQL errors, etc. Internal
Ethereum Address Extractor Identify ethereum addresses in scraped webpages. Internal
Etherscan Queries etherscan.io to find the balance of identified ethereum wallet addresses. Free API
File Metadata Extractor Extracts meta data from documents and images. Internal
Flickr Search Flickr for domains, URLs and emails related to the specified domain. Free API
Focsec Look up IP address information from Focsec. Tiered API
FortiGuard Antispam Check if an IP address is malicious according to FortiGuard Antispam. Free API
Fraudguard Obtain threat information from Fraudguard.io Tiered API
F-Secure Riddler.io Obtain network information from F-Secure Riddler.io API. Commercial API
FullContact Gather domain and e-mail information from FullContact.com API. Tiered API
FullHunt Identify domain attack surface using FullHunt API. Tiered API
Github Identify associated public code repositories on Github. Free API
GLEIF Look up company information from Global Legal Entity Identifier Foundation (GLEIF). Tiered API
Google Maps Identifies potential physical addresses and latitude/longitude coordinates. Tiered API
Google Object Storage Finder Search for potential Google Object Storage buckets associated with the target and attempt to list their contents. Free API
Google SafeBrowsing Check if the URL is included on any of the Safe Browsing lists. Free API
Google Obtain information from the Google Custom Search API to identify sub-domains and links. Tiered API
Gravatar Retrieve user information from Gravatar API. Free API
Grayhat Warfare Find bucket names matching the keyword extracted from a domain from Grayhat API. Tiered API
Greensnow Check if a netblock or IP address is malicious according to greensnow.co. Free API
grep.app Search grep.app API for links and emails related to the specified domain. Free API
GreyNoise Community Obtain IP enrichment data from GreyNoise Community API Tiered API
GreyNoise Obtain IP enrichment data from GreyNoise Tiered API
HackerOne (Unofficial) Check external vulnerability scanning/reporting service h1.nobbd.de to see if the target is listed. Free API
HackerTarget Search HackerTarget.com for hosts sharing the same IP. Free API
Hash Extractor Identify MD5 and SHA hashes in web content, files and more. Internal
HaveIBeenPwned Check HaveIBeenPwned.com for hacked e-mail addresses identified in breaches. Commercial API
Hosting Provider Identifier Find out if any IP addresses identified fall within known 3rd party hosting ranges, e.g. Amazon, Azure, etc. Internal
Host.io Obtain information about domain names from host.io. Tiered API
Human Name Extractor Attempt to identify human names in fetched content. Internal
Hunter.io Check for e-mail addresses and names on hunter.io. Tiered API
Hybrid Analysis Search Hybrid Analysis for domains and URLs related to the target. Free API
IBAN Number Extractor Identify International Bank Account Numbers (IBANs) in any data. Internal
Iknowwhatyoudownload.com Check iknowwhatyoudownload.com for IP addresses that have been using torrents. Tiered API
IntelligenceX Obtain information from IntelligenceX about identified IP addresses, domains, e-mail addresses and phone numbers. Tiered API
Interesting File Finder Identifies potential files of interest, e.g. office documents, zip files. Internal
Internet Storm Center Check if an IP address is malicious according to SANS ISC. Free API
ipapi.co Queries ipapi.co to identify geolocation of IP Addresses using ipapi.co API Tiered API
ipapi.com Queries ipapi.com to identify geolocation of IP Addresses using ipapi.com API Tiered API
IPInfo.io Identifies the physical location of IP addresses identified using ipinfo.io. Tiered API
IPQualityScore Determine if target is malicious using IPQualityScore API Tiered API
ipregistry Query the ipregistry.co database for reputation and geo-location. Tiered API
ipstack Identifies the physical location of IP addresses identified using ipstack.com. Tiered API
JsonWHOIS.com Search JsonWHOIS.com for WHOIS records associated with a domain. Tiered API
Junk File Finder Looks for old/temporary and other similar files. Internal
Keybase Obtain additional information about domain names and identified usernames. Free API
Koodous Search Koodous for mobile apps. Tiered API
LeakIX Search LeakIX for host data leaks, open ports, software and geoip. Free API
Leak-Lookup Searches Leak-Lookup.com's database of breaches. Free API
Maltiverse Obtain information about any malicious activities involving IP addresses Free API
MalwarePatrol Searches malwarepatrol.net's database of malicious URLs/IPs. Tiered API
MetaDefender Search MetaDefender API for IP address and domain IP reputation. Tiered API
Mnemonic PassiveDNS Obtain Passive DNS information from PassiveDNS.mnemonic.no. Free API
multiproxy.org Open Proxies Check if an IP address is an open proxy according to multiproxy.org open proxy list. Free API
MySpace Gather username and location from MySpace.com profiles. Free API
NameAPI Check whether an email is disposable Tiered API
NetworksDB Search NetworksDB.io API for IP address and domain information. Tiered API
NeutrinoAPI Search NeutrinoAPI for phone location information, IP address information, and host reputation. Tiered API
numverify Lookup phone number location and carrier information from numverify.com. Tiered API
Onion.link Search Tor 'Onion City' search engine for mentions of the target domain using Google Custom Search. Free API
Onionsearchengine.com Search Tor onionsearchengine.com for mentions of the target domain. Free API
Onyphe Check Onyphe data (threat list, geo-location, pastries, vulnerabilities) about a given IP. Tiered API
Open Bug Bounty Check external vulnerability scanning/reporting service openbugbounty.org to see if the target is listed. Free API
OpenCorporates Look up company information from OpenCorporates. Tiered API
OpenDNS Check if a host would be blocked by OpenDNS. Free API
OpenNIC DNS Resolves host names in the OpenNIC alternative DNS system. Free API
OpenPhish Check if a host/domain is malicious according to OpenPhish.com. Free API
OpenStreetMap Retrieves latitude/longitude coordinates for physical addresses from OpenStreetMap API. Free API
Page Information Obtain information about web pages (do they take passwords, do they contain forms, etc.) Internal
PasteBin PasteBin search (via Google Search API) to identify related content. Tiered API
PGP Key Servers Look up domains and e-mail addresses in PGP public key servers. Internal
PhishStats Check if a netblock or IP address is malicious according to PhishStats. Free API
PhishTank Check if a host/domain is malicious according to PhishTank. Free API
Phone Number Extractor Identify phone numbers in scraped webpages. Internal
Port Scanner - TCP Scans for commonly open TCP ports on Internet-facing systems. Internal
Project Honey Pot Query the Project Honey Pot database for IP addresses. Free API
ProjectDiscovery Chaos Search for hosts/subdomains using chaos.projectdiscovery.io Commercial API
Psbdmp Check psbdmp.cc (PasteBin Dump) for potentially hacked e-mails and domains. Free API
Pulsedive Obtain information from Pulsedive's API. Tiered API
PunkSpider Check the QOMPLX punkspider.io service to see if the target is listed as vulnerable. Free API
Quad9 Check if a host would be blocked by Quad9 DNS. Free API
ReverseWhois Reverse Whois lookups using reversewhois.io. Free API
RIPE Queries the RIPE registry (includes ARIN data) to identify netblocks and other info. Free API
RiskIQ Obtain information from RiskIQ's (formerly PassiveTotal) Passive DNS and Passive SSL databases. Tiered API
Robtex Search Robtex.com for hosts sharing the same IP. Free API
searchcode Search searchcode for code repositories mentioning the target domain. Free API
SecurityTrails Obtain Passive DNS and other information from SecurityTrails Tiered API
Seon Queries seon.io to gather intelligence about IP Addresses, email addresses, and phone numbers Commercial API
SHODAN Obtain information from SHODAN about identified IP addresses. Tiered API
Similar Domain Finder Search various sources to identify similar looking domain names, for instance squatted domains. Internal
Skymem Look up e-mail addresses on Skymem. Free API
SlideShare Gather name and location from SlideShare profiles. Free API
Snov Gather available email IDs from identified domains Tiered API
Social Links Queries SocialLinks.io to gather intelligence from social media platforms and dark web. Commercial API
Social Media Profile Finder Tries to discover the social media profiles for human names identified. Tiered API
Social Network Identifier Identify presence on social media networks such as LinkedIn, Twitter and others. Internal
SORBS Query the SORBS database for open relays, open proxies, vulnerable servers, etc. Free API
SpamCop Check if a netblock or IP address is in the SpamCop database. Free API
Spamhaus Zen Check if a netblock or IP address is in the Spamhaus Zen database. Free API
spur.us Obtain information about any malicious activities involving IP addresses found Commercial API
SpyOnWeb Search SpyOnWeb for hosts sharing the same IP address, Google Analytics code, or Google Adsense code. Tiered API
SSL Certificate Analyzer Gather information about SSL certificates used by the target's HTTPS sites. Internal
StackOverflow Search StackOverflow for any mentions of a target domain. Returns potentially related information. Tiered API
Steven Black Hosts Check if a domain is malicious (malware or adware) according to Steven Black Hosts list. Free API
Strange Header Identifier Obtain non-standard HTTP headers returned by web servers. Internal
Subdomain Takeover Checker Check if affiliated subdomains are vulnerable to takeover. Internal
Sublist3r PassiveDNS Passive subdomain enumeration using Sublist3r's API Free API
SURBL Check if a netblock, IP address or domain is in the SURBL blacklist. Free API
Talos Intelligence Check if a netblock or IP address is malicious according to TalosIntelligence. Free API
TextMagic Obtain phone number type from TextMagic API Tiered API
Threat Jammer Check if an IP address is malicious according to ThreatJammer.com Tiered API
ThreatCrowd Obtain information from ThreatCrowd about identified IP addresses, domains and e-mail addresses. Free API
ThreatFox Check if an IP address is malicious according to ThreatFox. Free API
ThreatMiner Obtain information from ThreatMiner's database for passive DNS and threat intelligence. Free API
TLD Searcher Search all Internet TLDs for domains with the same name as the target (this can be very slow.) Internal
Tool - CMSeeK Identify what Content Management System (CMS) might be used. Tool
Tool - DNSTwist Identify bit-squatting, typo and other similar domains to the target using a local DNSTwist installation. Tool
Tool - nbtscan Scans for open NETBIOS nameservers on your target's network. Tool
Tool - Nmap Identify what Operating System might be used. Tool
Tool - Nuclei Fast and customisable vulnerability scanner. Tool
Tool - onesixtyone Fast scanner to find publicly exposed SNMP services. Tool
Tool - Retire.js Scanner detecting the use of JavaScript libraries with known vulnerabilities Tool
Tool - snallygaster Finds file leaks and other security problems on HTTP servers. Tool
Tool - testssl.sh Identify various TLS/SSL weaknesses, including Heartbleed, CRIME and ROBOT. Tool
Tool - TruffleHog Searches through git repositories for high entropy strings and secrets, digging deep into commit history. Tool
Tool - WAFW00F Identify what web application firewall (WAF) is in use on the specified website. Tool
Tool - Wappalyzer Wappalyzer identifies technologies on websites. Tool
Tool - WhatWeb Identify what software is in use on the specified website. Tool
TOR Exit Nodes Check if an IP address or netblock appears on the Tor Metrics exit node list. Free API
TORCH Search Tor 'TORCH' search engine for mentions of the target domain. Free API
Trashpanda Queries Trashpanda to gather intelligence about mentions of target in pastesites Tiered API
Trumail Check whether an email is disposable Free API
Twilio Obtain information from Twilio about phone numbers. Ensure you have the Caller Name add-on installed in Twilio. Tiered API
Twitter Gather name and location from Twitter profiles. Free API
UCEPROTECT Check if a netblock or IP address is in the UCEPROTECT database. Free API
URLScan.io Search URLScan.io cache for domain information. Free API
Venmo Gather user information from Venmo API. Free API
ViewDNS.info Identify co-hosted websites and perform reverse Whois lookups using ViewDNS.info. Tiered API
VirusTotal Obtain information from VirusTotal about identified IP addresses. Tiered API
VoIP Blacklist (VoIPBL) Check if an IP address or netblock is malicious according to VoIP Blacklist (VoIPBL). Free API
VXVault.net Check if a domain or IP address is malicious according to VXVault.net. Free API
Web Analytics Extractor Identify web analytics IDs in scraped webpages and DNS TXT records. Internal
Web Framework Identifier Identify the usage of popular web frameworks like jQuery, YUI and others. Internal
Web Server Identifier Obtain web server banners to identify versions of web servers being used. Internal
Web Spider Spidering of web-pages to extract content for searching. Internal
WhatCMS Check web technology using WhatCMS.org API. Tiered API
Whoisology Reverse Whois lookups using Whoisology.com. Commercial API
Whois Perform a WHOIS look-up on domain names and owned netblocks. Internal
Whoxy Reverse Whois lookups using Whoxy.com. Commercial API
WiGLE Query WiGLE to identify nearby WiFi access points. Free API
Wikileaks Search Wikileaks for mentions of domain names and e-mail addresses. Free API
Wikipedia Edits Identify edits to Wikipedia articles made from a given IP address or username. Free API
XForce Exchange Obtain IP reputation and passive DNS information from IBM X-Force Exchange. Tiered API
Yandex DNS Check if a host would be blocked by Yandex DNS. Free API
Zetalytics Query the Zetalytics database for hosts on your target domain(s). Tiered API
ZoneFile.io Search ZoneFiles.io Domain query API for domain information. Tiered API
Zone-H Defacement Check Check if a hostname/domain appears on the zone-h.org 'special defacements' RSS feed. Free API

DOCUMENTATION

Read more at the project website, including more complete documentation, blog posts with tutorials/guides, plus information about SpiderFoot HX.

Latest updates announced on Twitter.

spiderfoot's People

Contributors

amrawadk, anthirian, bcoles, bradchiappetta, cclauss, ccsplit, code-review-doctor, datasiph0n, faleksic, fallingcubes, gboddin, gormogon, ikp4success, krishnasism, latsku, leotrubach, mandreko-ts, mariovilas, mhchong, miaoski, mscherer, p-l-, ra80533, rb64, rootup, smicallef, swedishmike, thecrott, thetechromancer, woodrad


spiderfoot's Issues

Versions not mentioned

Supported versions of CherryPy, Python, etc. are not mentioned in the available documentation.

Scan fails after initial git clone due to lack of cache/ directory

Unhandled exception (IOError) encountered during scan. Please report this as a bug: ['Traceback (most recent call last):\n', ' File "/home/rydell/src/git/spiderfoot/sfscan.py", line 92, in startScan\n self.sf.cachePut("internet_tlds", self.config['_internettlds'])\n', ' File "/home/rydell/src/git/spiderfoot/sflib.py", line 164, in cachePut\n fp = file(cacheFile, "w")\n', "IOError: [Errno 2] No such file or directory: u'/home/rydell/src/git/spiderfoot/cache/95c5dab788d19e124540cb1e96e6277f0871c648f4b3f2526fa1f765'\n"]

After manually creating the cache directory inside the spiderfoot directory from git, everything seems to work fine. Just thought I should report it.
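
A hedged sketch of the kind of guard that avoids this (illustrative; the real cachePut lives in sflib.py):

    import os

    def cachePut(cache_dir, label, data):
        # Create the cache directory on demand instead of assuming it exists.
        if not os.path.isdir(cache_dir):
            os.makedirs(cache_dir)
        with open(os.path.join(cache_dir, label), "w") as fp:
            fp.write(data)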

Empty Strings are not supported

Unhandled exception (AddrFormatError) encountered during scan. Please report this as a bug: ['Traceback (most recent call last):\n', ' File "sfscan.pyc", line 228, in startScan\n', ' File "sflib.pyc", line 1120, in notifyListeners\n', ' File "modules\sfp_bingsearch.pyc", line 83, in handleEvent\n', ' File "sflib.pyc", line 1120, in notifyListeners\n', ' File "modules\sfp_dns.pyc", line 162, in handleEvent\n', ' File "modules\sfp_dns.pyc", line 330, in processHost\n', ' File "sflib.pyc", line 1120, in notifyListeners\n', ' File "modules\sfp_bingsearch.pyc", line 83, in handleEvent\n', ' File "sflib.pyc", line 1120, in notifyListeners\n', ' File "modules\sfp_dns.pyc", line 162, in handleEvent\n', ' File "modules\sfp_dns.pyc", line 334, in processHost\n', ' File "modules\sfp_dns.pyc", line 417, in processDomain\n', ' File "modules\sfp_dns.pyc", line 330, in processHost\n', ' File "sflib.pyc", line 1120, in notifyListeners\n', ' File "modules\sfp_spider.pyc", line 259, in handleEvent\n', ' File "modules\sfp_spider.pyc", line 283, in spiderFrom\n', ' File "modules\sfp_spider.pyc", line 100, in processUrl\n', ' File "modules\sfp_spider.pyc", line 198, in contentNotify\n', ' File "sflib.pyc", line 1120, in notifyListeners\n', ' File "modules\sfp_dns.pyc", line 162, in handleEvent\n', ' File "modules\sfp_dns.pyc", line 330, in processHost\n', ' File "sflib.pyc", line 1120, in notifyListeners\n', ' File "modules\sfp_googlesearch.pyc", line 106, in handleEvent\n', ' File "sflib.pyc", line 1120, in notifyListeners\n', ' File "modules\sfp_spider.pyc", line 259, in handleEvent\n', ' File "modules\sfp_spider.pyc", line 309, in spiderFrom\n', ' File "modules\sfp_spider.pyc", line 100, in processUrl\n', ' File "modules\sfp_spider.pyc", line 198, in contentNotify\n', ' File "sflib.pyc", line 1120, in notifyListeners\n', ' File "modules\sfp_dns.pyc", line 162, in handleEvent\n', ' File "modules\sfp_dns.pyc", line 330, in processHost\n', ' File "sflib.pyc", line 1120, in notifyListeners\n', ' File "modules\sfp_bingsearch.pyc", line 104, in handleEvent\n', ' File "sflib.pyc", line 1120, in notifyListeners\n', ' File "modules\sfp_spider.pyc", line 259, in handleEvent\n', ' File "modules\sfp_spider.pyc", line 283, in spiderFrom\n', ' File "modules\sfp_spider.pyc", line 123, in processUrl\n', ' File "modules\sfp_spider.pyc", line 179, in linkNotify\n', ' File "sflib.pyc", line 1243, in matches\n', ' File "netaddr\strategy\ipv4.pyc", line 105, in valid_str\n', 'AddrFormatError: Empty strings are not supported!\n']

Unhandled exception (NameError) encountered during scan

Please report this as a bug: (I'm not sure if I should report this here or if there's a better place/method.)

['Traceback (most recent call last):\n', ' File "sfscan.pyc", line 228, in startScan\n', ' File "sflib.pyc", line 1120, in notifyListeners\n', ' File "modules\sfp_bingsearch.pyc", line 104, in handleEvent\n', ' File "sflib.pyc", line 1120, in notifyListeners\n', ' File "modules\sfp_spider.pyc", line 259, in handleEvent\n', ' File "modules\sfp_spider.pyc", line 309, in spiderFrom\n', ' File "modules\sfp_spider.pyc", line 100, in processUrl\n', ' File "modules\sfp_spider.pyc", line 198, in contentNotify\n', ' File "sflib.pyc", line 1120, in notifyListeners\n', ' File "modules\sfp_pageinfo.pyc", line 93, in handleEvent\n', ' File "sflib.pyc", line 146, in info\n', ' File "sflib.pyc", line 110, in _dblog\n', ' File "sfdb.pyc", line 276, in scanLogEvent\n', ' File "sflib.pyc", line 126, in fatal\n', "NameError: global name 'exit' is not defined\n"]
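
For what it's worth, this looks like a bare exit() call in sflib.py's fatal(); a hedged sketch of the usual fix:

    import sys

    def fatal(msg):
        print("FATAL: " + msg)
        # The builtin 'exit' is only injected by the site module (absent in some
        # frozen/py2exe environments); sys.exit always exists.
        sys.exit(-1)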

Feature request: simultaneous scans

SpiderFoot should be able to execute several scans simultaneous or at least have ability to queue scans. In my opinion this is a must have feature. Missing from 2.0.4 at least.

Error when running spiderfoot on Kali

I encounter the error below when I try to run SpiderFoot:

prompt# python ./sf.py
Traceback (most recent call last):
File "./sf.py", line 113, in
sfModules[modName]['object'] = getattr(mod, modName)()
AttributeError: 'module' object has no attribute 'sfp_tcpportscan'

Does anyone have an idea how to solve this?

Exported CSV files are impossible to parse reliably

Currently the code to generate CSV files is this:

    # Get result data in CSV format
    def scaneventresultexport(self, id, type):
        dbh = SpiderFootDb(self.config)
        data = dbh.scanResultEvent(id, type)
        blob = "\"Updated\",\"Type\",\"Module\",\"Source\",\"Data\"\n"
        for row in data:
            lastseen = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(row[0]))
            escapedData = cgi.escape(row[1].replace("\n", "#LB#").replace("\r", "#LB#"))
            escapedSrc = cgi.escape(row[2].replace("\n", "#LB#").replace("\r", "#LB#"))
            blob = blob + "\"" + lastseen + "\",\"" + row[4] + "\",\""
            blob = blob + row[3] + "\",\"" + escapedSrc + "\",\"" + escapedData + "\"\n"
        cherrypy.response.headers['Content-Disposition'] = "attachment; filename=SpiderFoot.csv"
        cherrypy.response.headers['Content-Type'] = "application/csv"
        cherrypy.response.headers['Pragma'] = "no-cache"
        return blob
    scaneventresultexport.exposed = True

Unfortunately, this code allows double quotes to be included in fields without escaping them. That means there's no reliable way of telling when each field ends, since splitting each row using the comma character wouldn't work either (fields may also contain commas).

One possible solution is to escape double quote characters. But a better solution, IMO, is to use a proper parser like the "csv" module from the standard library, instead of manually concatenating strings. This not only fixes parsing problems in a standardized way, it also lets you control the specific CSV dialect. I'm working on a patch right now, I'll send you a pull request when it's ready, for your consideration.
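
For reference, a hedged sketch of the csv-module approach described above (Python 2, matching the code in question; this is illustrative, not the submitted patch):

    import csv
    import time
    from StringIO import StringIO

    def scaneventresultexport(self, id, type):
        dbh = SpiderFootDb(self.config)
        data = dbh.scanResultEvent(id, type)
        buf = StringIO()
        # QUOTE_ALL plus the csv module's own escaping handles embedded
        # quotes, commas and newlines, which manual concatenation did not.
        writer = csv.writer(buf, dialect=csv.excel, quoting=csv.QUOTE_ALL)
        writer.writerow(["Updated", "Type", "Module", "Source", "Data"])
        for row in data:
            lastseen = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(row[0]))
            writer.writerow([lastseen, row[4], row[3], row[2], row[1]])
        cherrypy.response.headers['Content-Disposition'] = "attachment; filename=SpiderFoot.csv"
        cherrypy.response.headers['Content-Type'] = "application/csv"
        cherrypy.response.headers['Pragma'] = "no-cache"
        return buf.getvalue()
    scaneventresultexport.exposed = True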

Error executing sf.py on Linux Mint

Traceback (most recent call last):
File "sf.py", line 64, in
mod = import('modules.' + modName, globals(), locals(), [modName])
File "/usr/local/bin/spiderfoot/modules/sfp_sslcert.py", line 16, in
import M2Crypto
File "/usr/local/lib/python2.7/dist-packages/M2Crypto/init.py", line 22, in
import __m2crypto
ImportError: /usr/local/lib/python2.7/dist-packages/M2Crypto/__m2crypto.so: undefined symbol: SSLv2_method

Unhandled exception

Here's what I got:

Unhandled exception (AttributeError) encountered during scan. Please report this as a bug: ['Traceback (most recent call last):\n', ' File "/spiderfoot/spiderfoot/sfscan.py", line 119, in startScan\n dns.resolver.restore_system_resolver()\n', "AttributeError: 'module' object has no attribute 'restore_system_resolver'\n"]

I did point this directly at an internal IP address, however, if that makes a difference. Thank you.

memory error in sfp_spider.py

stack trace:
Unhandled exception (MemoryError) encountered during scan. Please report this as a bug: ['Traceback (most recent call last):\n', ' File "/home/gianz/spiderfoot/sfscan.py", line 195, in startScan\n module.start()\n', ' File "/home/gianz/spiderfoot/modules/sfp_bingsearch.py", line 81, in start\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1050, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 120, in handleEvent\n self.processHost(match, parentEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 271, in processHost\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1050, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 250, in handleEvent\n return self.spiderFrom(spiderTarget)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 274, in spiderFrom\n links = self.processUrl(startingPoint) # fetch first page\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 105, in processUrl\n self.contentNotify(url, fetched, self.urlEvents[url])\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 189, in contentNotify\n self.notifyListeners(event)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1050, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 120, in handleEvent\n self.processHost(match, parentEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 271, in processHost\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1050, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 250, in handleEvent\n return self.spiderFrom(spiderTarget)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 300, in spiderFrom\n freshLinks = self.processUrl(link)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 128, in processUrl\n self.urlEvents[link] = self.linkNotify(link, self.urlEvents[url])\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 181, in linkNotify\n self.notifyListeners(event)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1050, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 120, in handleEvent\n self.processHost(match, parentEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 271, in processHost\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1050, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 250, in handleEvent\n return self.spiderFrom(spiderTarget)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 274, in spiderFrom\n links = self.processUrl(startingPoint) # fetch first page\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 105, in processUrl\n self.contentNotify(url, fetched, self.urlEvents[url])\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 189, in contentNotify\n self.notifyListeners(event)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1050, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 120, in handleEvent\n self.processHost(match, parentEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 271, in processHost\n self.notifyListeners(evt)\n', ' File 
"/home/gianz/spiderfoot/sflib.py", line 1050, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 250, in handleEvent\n return self.spiderFrom(spiderTarget)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 300, in spiderFrom\n freshLinks = self.processUrl(link)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 105, in processUrl\n self.contentNotify(url, fetched, self.urlEvents[url])\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 189, in contentNotify\n self.notifyListeners(event)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1050, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 120, in handleEvent\n self.processHost(match, parentEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 271, in processHost\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1050, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 250, in handleEvent\n return self.spiderFrom(spiderTarget)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 300, in spiderFrom\n freshLinks = self.processUrl(link)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 105, in processUrl\n self.contentNotify(url, fetched, self.urlEvents[url])\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 189, in contentNotify\n self.notifyListeners(event)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1050, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 120, in handleEvent\n self.processHost(match, parentEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 271, in processHost\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1050, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 250, in handleEvent\n return self.spiderFrom(spiderTarget)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 300, in spiderFrom\n freshLinks = self.processUrl(link)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 105, in processUrl\n self.contentNotify(url, fetched, self.urlEvents[url])\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 189, in contentNotify\n self.notifyListeners(event)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1050, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 120, in handleEvent\n self.processHost(match, parentEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 271, in processHost\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1050, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 250, in handleEvent\n return self.spiderFrom(spiderTarget)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 274, in spiderFrom\n links = self.processUrl(startingPoint) # fetch first page\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 105, in processUrl\n self.contentNotify(url, fetched, self.urlEvents[url])\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 189, in contentNotify\n self.notifyListeners(event)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1050, in notifyListeners\n 
listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 120, in handleEvent\n self.processHost(match, parentEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 271, in processHost\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1050, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 250, in handleEvent\n return self.spiderFrom(spiderTarget)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 300, in spiderFrom\n freshLinks = self.processUrl(link)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 128, in processUrl\n self.urlEvents[link] = self.linkNotify(link, self.urlEvents[url])\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 181, in linkNotify\n self.notifyListeners(event)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1050, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 120, in handleEvent\n self.processHost(match, parentEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 271, in processHost\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1050, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 250, in handleEvent\n return self.spiderFrom(spiderTarget)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 300, in spiderFrom\n freshLinks = self.processUrl(link)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 105, in processUrl\n self.contentNotify(url, fetched, self.urlEvents[url])\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 189, in contentNotify\n self.notifyListeners(event)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1050, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 120, in handleEvent\n self.processHost(match, parentEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 271, in processHost\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1050, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 250, in handleEvent\n return self.spiderFrom(spiderTarget)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 300, in spiderFrom\n freshLinks = self.processUrl(link)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 117, in processUrl\n links = sf.parseLinks(url, fetched['content'], self.baseDomain)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 578, in parseLinks\n self.error("Error applying regex2 to: " + data)\n', 'MemoryError\n']

Name extraction stops on diacritics

Extraction of human names stops when the first character with diacritics is encountered. Example: for the name "Martin Smišnál", only "Martin Smi" is extracted.
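
A hedged guess at the cause, with a sketch: an ASCII-only character class such as [a-z] stops matching at the first accented character, while a Unicode-aware pattern keeps the whole name (illustrative only, not the module's actual regex):

    # -*- coding: utf-8 -*-
    import re

    text = u"Contact Martin Smišnál for details."

    # ASCII-only classes stop at the accented character.
    print(re.findall(r"[A-Z][a-z]+ [A-Z][a-z]+", text))        # ['Martin Smi']
    # Unicode-aware word characters capture the full name.
    print(re.findall(r"[A-Z]\w+ [A-Z]\w+", text, re.UNICODE))  # ['Martin Smišnál']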

Unhandled exception: Bad Zip File

stacktrace:
Unhandled exception (BadZipfile) encountered during scan. Please report this as a bug: ['Traceback (most recent call last):\n', ' File "/home/gianz/spiderfoot_5002/sfscan.py", line 195, in startScan\n module.start()\n', ' File "/home/gianz/spiderfoot_5002/modules/sfp_bingsearch.py", line 81, in start\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot_5002/sflib.py", line 1050, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot_5002/modules/sfp_dns.py", line 120, in handleEvent\n self.processHost(match, parentEvent)\n', ' File "/home/gianz/spiderfoot_5002/modules/sfp_dns.py", line 271, in processHost\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot_5002/sflib.py", line 1050, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot_5002/modules/sfp_spider.py", line 250, in handleEvent\n return self.spiderFrom(spiderTarget)\n', ' File "/home/gianz/spiderfoot_5002/modules/sfp_spider.py", line 300, in spiderFrom\n freshLinks = self.processUrl(link)\n', ' File "/home/gianz/spiderfoot_5002/modules/sfp_spider.py", line 128, in processUrl\n self.urlEvents[link] = self.linkNotify(link, self.urlEvents[url])\n', ' File "/home/gianz/spiderfoot_5002/modules/sfp_spider.py", line 181, in linkNotify\n self.notifyListeners(event)\n', ' File "/home/gianz/spiderfoot_5002/sflib.py", line 1050, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot_5002/modules/sfp_filemeta.py", line 114, in handleEvent\n doc = openxmllib.openXmlDocument(data=ret['content'], mime_type=mtype)\n', ' File "/home/gianz/spiderfoot_5002/ext/openxmllib/__init__.py", line 58, in openXmlDocument\n return class_(file_, mime_type=mime_type)\n', ' File "/home/gianz/spiderfoot_5002/ext/openxmllib/document.py", line 65, in __init__\n openxmldoc = zipfile.ZipFile(file_, 'r', zipfile.ZIP_DEFLATED)\n', ' File "/usr/lib/python2.7/zipfile.py", line 770, in __init__\n self._RealGetContents()\n', ' File "/usr/lib/python2.7/zipfile.py", line 811, in _RealGetContents\n raise BadZipfile, "File is not a zip file"\n', 'BadZipfile: File is not a zip file\n']

Database incompatibility

The default database shipped is created on a 64-bit system, and seems to cause issues on 32-bit.

Ship the next release without a database, and create the database silently at first run.
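
A minimal sketch of that approach (illustrative; the table name is borrowed from SpiderFoot's schema, and the real code would create the full schema):

    import os
    import sqlite3

    DB_PATH = "spiderfoot.db"  # hypothetical location

    def get_db(path=DB_PATH):
        first_run = not os.path.exists(path)
        conn = sqlite3.connect(path)  # silently creates the file if missing
        if first_run:
            # Creating the schema at first run avoids shipping a prebuilt
            # binary database, sidestepping 32/64-bit portability issues.
            conn.execute("""CREATE TABLE IF NOT EXISTS tbl_scan_instance (
                guid TEXT PRIMARY KEY, name TEXT, seed_target TEXT,
                created INTEGER, started INTEGER, ended INTEGER, status TEXT)""")
            conn.commit()
        return conn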

Python 3 Support

So... Python 3 is starting to be used more and more and, more tellingly, being shipped as default on many Linux distros.

I have started a fork at https://github.com/jspc/spiderfoot/tree/python3 which works under Python 3, merely as a WIP. This repo has no tests written for it as far as I can see (which is bad anyway), so a simple smoke test is all I've been able to do.

Alongside the Python 3 changes, this fork also includes:

  • Fixed the formatting and copy on the readme
  • Fixed the use of DOS line breaks
  • Usage of a .gitignore file.

The main question behind this fork: should this project not be using Python 3? Why would it not?

Error - Unhandled exception encountered during scan

I was playing with spiderfoot and suddenly...

2013-08-14 01:45:50 SpiderFoot ERROR

Unhandled exception encountered during scan. Please report this as a bug: ['Traceback (most recent call last):\n', ' File "sfscan.pyc", line 142, in startScan\n', ' File "modules\sfp_similar.pyc", line 221, in start\n', ' File "modules\sfp_similar.pyc", line 148, in scrapeDomaintools\n', ' File "modules\sfp_similar.pyc", line 204, in storeResult\n', ' File "sflib.pyc", line 534, in notifyListeners\n', ' File "modules\sfp_xref.pyc", line 129, in handleEvent\n', ' File "sflib.pyc", line 534, in notifyListeners\n', ' File "modules\sfp_dns.pyc", line 102, in handleEvent\n', ' File "sflib.pyc", line 534, in notifyListeners\n', ' File "modules\sfp_dns.pyc", line 131, in handleEvent\n', ' File "modules\sfp_dns.pyc", line 176, in processHost\n', ' File "sflib.pyc", line 534, in notifyListeners\n', ' File "modules\sfp_spider.pyc", line 240, in handleEvent\n', ' File "modules\sfp_spider.pyc", line 289, in spiderFrom\n', ' File "modules\sfp_spider.pyc", line 103, in processUrl\n', ' File "modules\sfp_spider.pyc", line 187, in contentNotify\n', ' File "sflib.pyc", line 534, in notifyListeners\n', ' File "modules\sfp_dns.pyc", line 102, in handleEvent\n', ' File "sflib.pyc", line 534, in notifyListeners\n', ' File "modules\sfp_dns.pyc", line 131, in handleEvent\n', ' File "modules\sfp_dns.pyc", line 176, in processHost\n', ' File "sflib.pyc", line 534, in notifyListeners\n', ' File "modules\sfp_ripe.pyc", line 82, in handleEvent\n', ' File "sflib.pyc", line 534, in notifyListeners\n', ' File "modules\sfp_dns.pyc", line 102, in handleEvent\n', ' File "sflib.pyc", line 534, in notifyListeners\n', ' File "modules\sfp_spider.pyc", line 240, in handleEvent\n', ' File "modules\sfp_spider.pyc", line 263, in spiderFrom\n', ' File "modules\sfp_spider.pyc", line 103, in processUrl\n', ' File "modules\sfp_spider.pyc", line 187, in contentNotify\n', ' File "sflib.pyc", line 534, in notifyListeners\n', ' File "modules\sfp_dns.pyc", line 102, in handleEvent\n', ' File "sflib.pyc", line 534, in notifyListeners\n', ' File "modules\sfp_spider.pyc", line 240, in handleEvent\n', ' File "modules\sfp_spider.pyc", line 289, in spiderFrom\n', ' File "modules\sfp_spider.pyc", line 90, in processUrl\n', ' File "sflib.pyc", line 466, in fetchUrl\n', "UnicodeDecodeError: 'ascii' codec can't decode byte 0xf3 in position 53: ordinal not in range(128)\n"]

Help ~ Email search advice?

I'm not very experienced at this, but I've read a great deal of positive reviews of SpiderFoot. I am seeking to identify an email address's footprint; could someone please tell me how I could achieve this? I am not sure of the correct format to enter the email address, or which scan criteria to select. Many thanks in advance!

Problems with XHTML sites

SpiderFoot has several problems with XHTML sites. You can use http://www.nerv.fi as an example if you want. This might also be a JavaScript-related issue.

Some examples:

  • http://www.nerv.fi/layout/news/{$HOST}info/help in Web technology
  • http://www.nerv.fi/news/2013-04-30/2013-01-29/cve-2013-0238-nervin-irc-verkon-palvelimet-paivitetaan in HTTP Headers. I don't even know where SpiderFoot got that URL, since the service doesn't have URL syntax like that. The same thing appears in Linked URL - Internal, so it might be down to how the parsing works.

M2Crypto regression

The command pip install M2Crypto is no longer enough to get M2Crypto installed and SpiderFoot running, due to a regression. There are forks available, but I think this problem should be fixed upstream. This issue is just a heads-up for the SpiderFoot project so that you can update the instructions accordingly.

The error is: ImportError: /home/test/utils/builds/python/2.7.5/lib/python2.7/site-packages/M2Crypto/__m2crypto.so: undefined symbol: SSLv2_method. I tested that the provided patch works.

Tested with:

no idea, but a traceback for you :)

Unhandled exception (TypeError) encountered during scan. Please report this as a bug: ['Traceback (most recent call last):\n', ' File "/home/tom/spiderfoot/sfscan.py", line 195, in startScan\n module.start()\n', ' File "/home/tom/spiderfoot/modules/sfp_bingsearch.py", line 64, in start\n useragent=self.opts['_useragent'], timeout=self.opts['_fetchtimeout']))\n', ' File "/home/tom/spiderfoot/sflib.py", line 825, in bingIterate\n if firstPage['code'] == 400 or "/challengepic?" in firstPage['content']:\n', "TypeError: argument of type 'NoneType' is not iterable\n"]
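
A hedged guess at the fix (illustrative, not the project's patch): fetchUrl-style helpers can return None on a failed fetch, so the result needs a guard before any "in" check:

    def page_contains(page, needle):
        # fetchUrl-style results can be None on failure; check before 'in'.
        return page is not None and page.get('content') is not None and needle in page['content']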

Example domain

When initializing a new scan, you should use example.org or example.com as the example, because scantarget.com might be used by a real organization. Not a big issue obviously, but Nmap, for example, had a working domain in its examples and the owner did not like it. You can guess why ;)

crash when adding a new domain

Hi, very often when I add a new domain to scan I get:
500 Internal Server Error

The server encountered an unexpected condition which prevented it from fulfilling the request.

Traceback (most recent call last):
File "cherrypy_cprequest.pyc", line 656, in respond
File "cherrypy\lib\encoding.pyc", line 188, in call
File "cherrypy_cpdispatch.pyc", line 34, in call
File "sfwebui.pyc", line 243, in startscan
File "mako\template.pyc", line 443, in render
File "mako\runtime.pyc", line 783, in _render
File "mako\runtime.pyc", line 815, in _render_context
File "mako\runtime.pyc", line 841, in _exec_template
File "dyn_newscan_tmpl", line 64, in render_body
IndexError: string index out of range

Export data to CSV does not work!

Hello,

I have been trying to export results to a CSV file, but with no success... :(

You receive the message: "Waiting to connect to localhost...", but it never finishes... and finally it does not dump data to the CSV file...

Has anyone experienced this issue? Can someone provide some info about this, please?

Thanks!

Update to installation instructions

On a clean Debian system, these packages are needed to install SpiderFoot with your guide: python2.7-dev libxml2-dev libxslt1-dev swig git python python-pip

Installation instructions missing swig

The installation instructions are missing swig. For example, on Debian it can be installed with: sudo apt-get install swig. SpiderFoot needs swig for M2Crypto.

UnicodeEncodeError on spiderfoot 2.2

Unhandled exception (UnicodeEncodeError) encountered during scan. Please report this as a bug: ['Traceback (most recent call last):\n', ' File "/home/gianz/spiderfoot/sfscan.py", line 228, in startScan\n psMod.notifyListeners(firstEvent)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1120, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_bingsearch.py", line 83, in handleEvent\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1120, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 162, in handleEvent\n self.processHost(match, parentEvent, affiliate=False)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 327, in processHost\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1120, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_bingsearch.py", line 83, in handleEvent\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1120, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 162, in handleEvent\n self.processHost(match, parentEvent, affiliate=False)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 331, in processHost\n self.processDomain(dom, evt)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 414, in processDomain\n self.processHost(name, domevt, affiliate=False)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 327, in processHost\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1120, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_bingsearch.py", line 104, in handleEvent\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1120, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 259, in handleEvent\n return self.spiderFrom(spiderTarget)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 309, in spiderFrom\n freshLinks = self.processUrl(link)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 100, in processUrl\n self.contentNotify(url, fetched, self.urlEvents[url])\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 198, in contentNotify\n self.notifyListeners(event)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1120, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 162, in handleEvent\n self.processHost(match, parentEvent, affiliate=False)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 327, in processHost\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1120, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 206, in handleEvent\n self.processHost(addr, parentEvent, affiliate=False)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 327, in processHost\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1120, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 241, in handleEvent\n self.processHost(sip, parentEvent, affiliate=False)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 327, in processHost\n self.notifyListeners(evt)\n', ' File 
"/home/gianz/spiderfoot/sflib.py", line 1120, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 241, in handleEvent\n self.processHost(sip, parentEvent, affiliate=False)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 327, in processHost\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1120, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 241, in handleEvent\n self.processHost(sip, parentEvent, affiliate=False)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 327, in processHost\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1120, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 243, in handleEvent\n self.processHost(sip, parentEvent, affiliate=False)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 327, in processHost\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1120, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 206, in handleEvent\n self.processHost(addr, parentEvent, affiliate=False)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 327, in processHost\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1120, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_googlesearch.py", line 106, in handleEvent\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1120, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 259, in handleEvent\n return self.spiderFrom(spiderTarget)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 283, in spiderFrom\n links = self.processUrl(startingPoint) # fetch first page\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 100, in processUrl\n self.contentNotify(url, fetched, self.urlEvents[url])\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 198, in contentNotify\n self.notifyListeners(event)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1120, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 162, in handleEvent\n self.processHost(match, parentEvent, affiliate=False)\n', ' File "/home/gianz/spiderfoot/modules/sfp_dns.py", line 327, in processHost\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1120, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_bingsearch.py", line 104, in handleEvent\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1120, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 259, in handleEvent\n return self.spiderFrom(spiderTarget)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 309, in spiderFrom\n freshLinks = self.processUrl(link)\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 100, in processUrl\n self.contentNotify(url, fetched, self.urlEvents[url])\n', ' File "/home/gianz/spiderfoot/modules/sfp_spider.py", line 198, in contentNotify\n self.notifyListeners(event)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1120, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' 
File "/home/gianz/spiderfoot/modules/sfp_names.py", line 137, in handleEvent\n self.notifyListeners(evt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1120, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_socialprofiles.py", line 92, in handleEvent\n searchStr = sites[site][0].format(eventData).replace(" ", "%20")\n', "UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 3: ordinal not in range(128)\n"]

Scan failed: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)

Unhandled exception (UnicodeEncodeError) encountered during scan. Please report this as a bug: ['Traceback (most recent call last):\n', ' File "sfscan.pyc", line 152, in startScan\n', ' File "modules\sfp_dns.pyc", line 337, in start\n', ' File "modules\sfp_dns.pyc", line 255, in processHost\n', ' File "sflib.pyc", line 812, in notifyListeners\n', ' File "modules\sfp_bingsearch.pyc", line 97, in handleEvent\n', ' File "sflib.pyc", line 812, in notifyListeners\n', ' File "modules\sfp_malcheck.pyc", line 395, in handleEvent\n', ' File "modules\sfp_malcheck.pyc", line 334, in lookupItem\n', ' File "modules\sfp_malcheck.pyc", line 247, in resourceQuery\n', "UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)\n"]
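Both tracebacks come down to Python 2 byte/unicode mixing: non-ASCII event data (u'\xe9' above) hits the implicit ascii codec inside a plain string operation. A minimal sketch of the defensive pattern, with illustrative names rather than the actual sfp_socialprofiles code: encode to UTF-8 first, then URL-quote, instead of a bare format/replace.

    # -*- coding: utf-8 -*-
    # Minimal sketch (Python 2, matching the era of this report).
    import urllib

    def safe_search_str(template, event_data):
        # Encode unicode to UTF-8 bytes so no implicit ascii codec runs,
        # then URL-quote rather than hand-replacing spaces with %20.
        if isinstance(event_data, unicode):
            event_data = event_data.encode('utf-8')
        return template.format(urllib.quote(event_data))

    print(safe_search_str("https://example.com/search?q={0}", u"caf\xe9"))
    # -> https://example.com/search?q=caf%C3%A9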

Exception when running SpiderFoot on Kali Linux (Debian-based)

I get the following error:

Unhandled exception (BaseException) encountered during scan. Please report this as a bug: ['Traceback (most recent call last):\n', ' File "/home/Download/spiderfoot-2.1.0/sfscan.py", line 91, in startScan\n self.config['_internettlds'] = self.sf.optValueToData(self.config['_internettlds'])\n', ' File "/home/Download/spiderfoot-2.1.0/sflib.py", line 72, in optValueToData\n self.error("Unable to open option URL, " + val + ".")\n', ' File "/home/Download/spiderfoot-2.1.0/sflib.py", line 96, in error\n raise BaseException("Internal Error Encountered: " + error)\n', 'BaseException: Internal Error Encountered: Unable to open option URL, http://mxr.mozilla.org/mozilla-central/source/netwerk/dns/effective_tld_names.dat?raw=1.\n']
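The BaseException here is simply the fetch of the Mozilla TLD list failing and taking the whole scan down with it. A gentler pattern is to fall back to a locally cached copy when the URL is unreachable; a minimal sketch with hypothetical paths, using Python 2's urllib2 to match the era of the report:

    import os
    import urllib2

    TLD_URL = ("http://mxr.mozilla.org/mozilla-central/source/"
               "netwerk/dns/effective_tld_names.dat?raw=1")
    CACHE = os.path.join(os.path.dirname(os.path.abspath(__file__)),
                         "effective_tld_names.dat")   # hypothetical location

    def load_tlds():
        try:
            data = urllib2.urlopen(TLD_URL, timeout=30).read()
            with open(CACHE, "wb") as f:   # refresh the cache on success
                f.write(data)
        except Exception:
            with open(CACHE, "rb") as f:   # fall back to the last good copy
                data = f.read()
        return data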

Unhandled exception in SpiderFoot 2.2

Unhandled exception (UnboundLocalError) encountered during scan. Please report this as a bug: ['Traceback (most recent call last):\n', ' File "/home/gianz/spiderfoot/sfscan.py", line 228, in startScan\n psMod.notifyListeners(firstEvent)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1120, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_ir.py", line 305, in handleEvent\n self.notifyListeners(asevt)\n', ' File "/home/gianz/spiderfoot/sflib.py", line 1120, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/gianz/spiderfoot/modules/sfp_ir.py", line 281, in handleEvent\n self.nbreported[asn] = True\n', "UnboundLocalError: local variable 'asn' referenced before assignment\n"]
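The UnboundLocalError means sfp_ir.py only assigns asn on some code paths before using it at line 281. A self-contained sketch of the bug pattern and its fix, with hypothetical data and names:

    def report_netblocks(netblocks, nbreported):
        asn = None                       # fix: bind the name on every path
        for block in netblocks:
            if "asn" in block:
                asn = block["asn"]
        if asn is not None:              # guard instead of assuming assignment
            nbreported[asn] = True
        return nbreported

    # A netblock without an "asn" key no longer crashes:
    print(report_netblocks([{"prefix": "10.0.0.0/8"}], {}))   # -> {}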

HTTPError: (404, 'Missing parameters: modulelist,scantarget,scanname')

When running a scan I get this traceback:

404 Not Found

Missing parameters: modulelist,scantarget,scanname

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/cherrypy/_cprequest.py", line 656, in respond
    response.body = self.handler()
  File "/usr/local/lib/python2.7/site-packages/cherrypy/lib/encoding.py", line 188, in __call__
    self.body = self.oldhandler(*args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/cherrypy/_cpdispatch.py", line 40, in __call__
    raise sys.exc_info()[1]
HTTPError: (404, 'Missing parameters: modulelist,scantarget,scanname')

Scan failed: 'ascii' codec can't encode character u'\xae' in position 72: ordinal not in range(128)

Unhandled exception encountered during scan. Please report this as a bug: ['Traceback (most recent call last):\n', ' File "/root/spiderfoot/sfscan.py", line 134, in startScan\n module.start()\n', ' File "/root/spiderfoot/modules/sfp_ripe.py", line 125, in start\n self.notifyListeners(evt)\n', ' File "/root/spiderfoot/sflib.py", line 521, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/root/spiderfoot/modules/sfp_subdomain.py", line 71, in handleEvent\n self.notifyListeners(evt)\n', ' File "/root/spiderfoot/sflib.py", line 521, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/root/spiderfoot/modules/sfp_ripe.py", line 77, in handleEvent\n self.notifyListeners(evt)\n', ' File "/root/spiderfoot/sflib.py", line 521, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/root/spiderfoot/modules/sfp_subdomain.py", line 71, in handleEvent\n self.notifyListeners(evt)\n', ' File "/root/spiderfoot/sflib.py", line 521, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/root/spiderfoot/modules/sfp_dns.py", line 101, in handleEvent\n self.processHost(addr, event)\n', ' File "/root/spiderfoot/modules/sfp_dns.py", line 130, in processHost\n self.notifyListeners(evt)\n', ' File "/root/spiderfoot/sflib.py", line 521, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/root/spiderfoot/modules/sfp_spider.py", line 239, in handleEvent\n return self.spiderFrom(spiderTarget)\n', ' File "/root/spiderfoot/modules/sfp_spider.py", line 288, in spiderFrom\n freshLinks = self.processUrl(link)\n', ' File "/root/spiderfoot/modules/sfp_spider.py", line 102, in processUrl\n self.contentNotify(url, fetched, self.urlEvents[url])\n', ' File "/root/spiderfoot/modules/sfp_spider.py", line 186, in contentNotify\n self.notifyListeners(event)\n', ' File "/root/spiderfoot/sflib.py", line 521, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/root/spiderfoot/modules/sfp_subdomain.py", line 71, in handleEvent\n self.notifyListeners(evt)\n', ' File "/root/spiderfoot/sflib.py", line 521, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/root/spiderfoot/modules/sfp_ripe.py", line 77, in handleEvent\n self.notifyListeners(evt)\n', ' File "/root/spiderfoot/sflib.py", line 521, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/root/spiderfoot/modules/sfp_subdomain.py", line 71, in handleEvent\n self.notifyListeners(evt)\n', ' File "/root/spiderfoot/sflib.py", line 521, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/root/spiderfoot/modules/sfp_dns.py", line 99, in handleEvent\n self.processHost(host, event)\n', ' File "/root/spiderfoot/modules/sfp_dns.py", line 130, in processHost\n self.notifyListeners(evt)\n', ' File "/root/spiderfoot/sflib.py", line 521, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/root/spiderfoot/modules/sfp_spider.py", line 239, in handleEvent\n return self.spiderFrom(spiderTarget)\n', ' File "/root/spiderfoot/modules/sfp_spider.py", line 299, in spiderFrom\n nextLinks = self.cleanLinks(links)\n', ' File "/root/spiderfoot/modules/sfp_spider.py", line 160, in cleanLinks\n if filter(checkExts, self.opts['filterfiles']):\n', ' File "/root/spiderfoot/modules/sfp_spider.py", line 159, in \n checkExts = lambda ext: '.' + str.lower(ext) in str.lower(str(link))\n', "UnicodeEncodeError: 'ascii' codec can't encode character u'\xae' in position 72: ordinal not in range(128)\n"]
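This one originates in sfp_spider's cleanLinks(): str(link) forces an ASCII encode of a unicode URL containing u'\xae'. Calling .lower() on the value itself, instead of round-tripping through str, sidesteps the codec entirely; a minimal sketch with illustrative names (Python 2):

    # -*- coding: utf-8 -*-
    def has_filtered_ext(link, filterfiles):
        link = link.lower()   # works for both str and unicode; no str() coercion
        return any('.' + ext.lower() in link for ext in filterfiles)

    print(has_filtered_ext(u"http://example.com/caf\xe9\xae.PDF", ["pdf"]))  # True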

UnicodeEncodeError can crash scan

A smallish scan I was running crashed with the following error message:

2014-01-06 10:30:09 SpiderFoot  ERROR   
Unhandled exception (UnicodeEncodeError) encountered during scan. Please report this as a bug: ['Traceback (most recent call last):\n', '  File "sfscan.pyc", line 152, in startScan\n', '  File "modules\\sfp_googlesearch.pyc", line 167, in start\n', '  File "sflib.pyc", line 794, in notifyListeners\n', '  File "modules\\sfp_spider.pyc", line 249, in handleEvent\n', '  File "modules\\sfp_spider.pyc", line 299, in spiderFrom\n', '  File "modules\\sfp_spider.pyc", line 127, in processUrl\n', '  File "modules\\sfp_spider.pyc", line 180, in linkNotify\n', '  File "sflib.pyc", line 794, in notifyListeners\n', '  File "modules\\sfp_crossref.pyc", line 103, in handleEvent\n', '  File "sflib.pyc", line 678, in fetchUrl\n', "UnicodeEncodeError: 'ascii' codec can't encode character u'\\ufffd' in position 33: ordinal not in range(128)\n"]

I'm guessing the problem isn't the UnicodeEncodeError itself, as that could be the target's fault; the real issue is that it brought down the whole scan.
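The deeper fix, as the report suggests, is containment: one module's exception should not abort the scan. A minimal sketch of the idea, as a hypothetical simplification of sflib's notifyListeners() rather than the actual code, trapping per-listener exceptions and logging them:

    import logging

    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("spiderfoot")

    class EventBus(object):
        def __init__(self):
            self.listeners = []

        def notifyListeners(self, sfEvent):
            for listener in self.listeners:
                try:
                    listener.handleEvent(sfEvent)
                except Exception:
                    # Log and move on instead of letting the scan die.
                    log.exception("Module %s failed on event %r; continuing",
                                  listener.__class__.__name__, sfEvent)

    class BrokenModule(object):
        def handleEvent(self, sfEvent):
            raise UnicodeEncodeError('ascii', u'\xe9', 0, 1, 'demo')

    bus = EventBus()
    bus.listeners.append(BrokenModule())
    bus.notifyListeners("ROOT")   # the failure is logged, the loop survives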

Fails to start

I got the package via Git, but I get these errors when I try to start it.

root@Kali:~/Scripts/spiderfoot# ls
dyn evtypes.sql ext LICENSE LICENSE.tp modules README setup.py sfdb.py sflib.py sf.py sfscan.py sfwebui.py spiderfoot.schema static THANKYOU VERSION
root@Kali:~/Scripts/spiderfoot# ./sf.py
./sf.py: line 11: $'\r': command not found
./sf.py: line 14: $'\r': command not found
./sf.py: line 16: syntax error near unexpected token `('
./sf.py: line 16: `cmd_subfolder = os.path.realpath(os.path.abspath(os.path.join(os.path.split(inspect.getfile(inspect.currentframe()))[0],"ext")))'
root@Kali:~/Scripts/spiderfoot#
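The $'\r' errors mean sf.py has Windows CRLF line endings, so the shebang failed and bash, not Python, ended up parsing the file. Stripping the carriage returns (the equivalent of dos2unix sf.py) fixes direct execution; a minimal sketch:

    # Rewrite sf.py with Unix LF line endings.
    with open("sf.py", "rb") as f:
        data = f.read()
    with open("sf.py", "wb") as f:
        f.write(data.replace(b"\r\n", b"\n"))

Alternatively, invoking it as python ./sf.py works even with CRLF endings, since the Python tokenizer accepts them while the shell's shebang handling does not.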

fetchUrl() automatically follows re-direct

Python's urllib2.urlopen() automatically follows redirects, but we need more control over this in cases where the redirecting site supplies a cookie that we don't catch.
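For reference, a minimal sketch (Python 2) of one way to get that control: a redirect handler that declines to follow, so urlopen() raises HTTPError and the 3xx response's Set-Cookie and Location headers can be inspected before any manual follow.

    import urllib2

    class NoRedirectHandler(urllib2.HTTPRedirectHandler):
        def redirect_request(self, req, fp, code, msg, headers, newurl):
            return None   # decline: urllib2 then raises HTTPError for the 3xx

    opener = urllib2.build_opener(NoRedirectHandler())
    try:
        res = opener.open("http://example.com/redirecting-page")
    except urllib2.HTTPError as e:
        if e.code in (301, 302, 303, 307):
            print(e.headers.get("Set-Cookie"))   # capture the cookie first
            print(e.headers.get("Location"))     # then decide whether to follow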

cherrypy AttributeError

When I run sf.py, this error occurs (Kali Linux, Python 2.7.3):

Starting web server at http://127.0.0.1:5001...
Traceback (most recent call last):
  File "./sf.py", line 91, in <module>
    cherrypy.engine.autoreload.unsubscribe()
AttributeError: 'module' object has no attribute 'engine'
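cherrypy.engine is the CherryPy 3.x API, so this error most likely means an older CherryPy 2.x-era package (or a shadowing local cherrypy module) is being imported; upgrading CherryPy is the real fix. As a stopgap, the startup call can be guarded; a minimal sketch:

    import cherrypy

    # Tolerate CherryPy builds without the 3.x engine/autoreload plugin
    # instead of crashing before the web server starts.
    if hasattr(cherrypy, "engine") and hasattr(cherrypy.engine, "autoreload"):
        cherrypy.engine.autoreload.unsubscribe()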

License unclear

SpiderFoot 2.0 is GPLv2 licensed, but there is no LICENSE file in the package.

Error to report; not sure what caused it. I was using a Tor proxy. HTH

Unhandled exception (error) encountered during scan. Please report this as a bug: ['Traceback (most recent call last):\n', ' File "/home/tom/spiderfoot/sfscan.py", line 195, in startScan\n module.start()\n', ' File "/home/tom/spiderfoot/modules/sfp_bingsearch.py", line 81, in start\n self.notifyListeners(evt)\n', ' File "/home/tom/spiderfoot/sflib.py", line 1033, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/tom/spiderfoot/modules/sfp_dns.py", line 120, in handleEvent\n self.notifyListeners(evt)\n', ' File "/home/tom/spiderfoot/sflib.py", line 1033, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/tom/spiderfoot/modules/sfp_dns.py", line 194, in handleEvent\n self.processHost(host, parentEvent)\n', ' File "/home/tom/spiderfoot/modules/sfp_dns.py", line 264, in processHost\n self.notifyListeners(evt)\n', ' File "/home/tom/spiderfoot/sflib.py", line 1033, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/tom/spiderfoot/modules/sfp_sharedip.py", line 121, in handleEvent\n self.notifyListeners(evt)\n', ' File "/home/tom/spiderfoot/sflib.py", line 1033, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/home/tom/spiderfoot/modules/sfp_malcheck.py", line 478, in handleEvent\n url = self.lookupItem(cid, typeId, eventData)\n', ' File "/home/tom/spiderfoot/modules/sfp_malcheck.py", line 419, in lookupItem\n return self.resourceList(cid, target, itemType)\n', ' File "/home/tom/spiderfoot/modules/sfp_malcheck.py", line 405, in resourceList\n re.match(rxTgt, line, re.IGNORECASE):\n', ' File "/usr/lib/python2.7/re.py", line 137, in match\n return _compile(pattern, flags).match(string)\n', ' File "/usr/lib/python2.7/re.py", line 242, in _compile\n raise error, v # invalid expression\n', 'error: multiple repeat\n']
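The final "multiple repeat" error is re choking on regex metacharacters (such as ++) in text pulled from a malware-check resource and embedded in a pattern unescaped. Escaping data-derived text with re.escape() avoids it; a minimal sketch with illustrative names, not the actual sfp_malcheck code:

    import re

    def matches_target(line, target):
        # Escape the data-derived target so metacharacters are literal.
        rx = r"^\s*" + re.escape(target) + r"\s*$"
        return re.match(rx, line, re.IGNORECASE) is not None

    print(matches_target("c++.example.com", "C++.example.com"))   # True, no crash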

Usability feedback

After starting a new scan against my own domain, scanning starts normally. The URL after a successful start is http://127.0.0.1:5001/startscan. My first reaction on that page was to reload it to get the status, but reloading ends up as an error:

404 Not Found

Missing parameters: modulelist,scantarget,scanname

Traceback (most recent call last):
  File "/home/fgeek/utils/builds/python/2.7.5/lib/python2.7/site-packages/cherrypy/_cprequest.py", line 656, in respond
    response.body = self.handler()
  File "/home/fgeek/utils/builds/python/2.7.5/lib/python2.7/site-packages/cherrypy/lib/encoding.py", line 188, in __call__
    self.body = self.oldhandler(*args, **kwargs)
  File "/home/fgeek/utils/builds/python/2.7.5/lib/python2.7/site-packages/cherrypy/_cpdispatch.py", line 40, in __call__
    raise sys.exc_info()[1]
HTTPError: (404, 'Missing parameters: modulelist,scantarget,scanname')

In my opinion, if the parameters are missing it should show either the Scans page or the specific scaninfo?id= page. It is also bad practice to show a traceback in the web UI when debugging is not enabled.
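A minimal sketch of that suggestion as a hypothetical CherryPy handler (not the actual sfwebui.py code): when the POST parameters are absent, redirect to the scan list instead of raising a 404 with a traceback.

    import cherrypy

    class SpiderFootWebUiSketch(object):
        @cherrypy.expose
        def startscan(self, scanname=None, scantarget=None, modulelist=None):
            if not (scanname and scantarget and modulelist):
                # Reloading the page after a POST lands here; show the
                # scan list rather than an HTTP 404 with a traceback.
                raise cherrypy.HTTPRedirect("/")   # hypothetical scan-list route
            return "scan started"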

Also, because SpiderFoot uses POST requests, navigating the web interface is harder: the user can't use the browser's back button, for example to go from a specific scan back to the list of all scans.

Error handling malformed JSON from RIPE & GeoIP

File "/root/spiderfoot/modules/sfp_dns.py", line 130, in processHost\n self.notifyListeners(evt)\n', ' File "/root/spiderfoot/sflib.py", line 521, in notifyListeners\n listener.handleEvent(sfEvent)\n', ' File "/root/spiderfoot/modules/sfp_geoip.py", line 64, in handleEvent\n hostip = json.loads(res['content'])\n', ' File "/usr/lib/python2.6/json/init.py", line 307, in loads\n return _default_decoder.decode(s)\n', ' File "/usr/lib/python2.6/json/decoder.py", line 319, in decode\n obj, end = self.raw_decode(s, idx=_w(s, 0).end())\n', 'TypeError: expected string or buffer\n']
