wally3k / wally3k.github.io

Repo for Firebog hosting

Home Page: https://firebog.net

License: MIT License

wally3k.github.io's Introduction

WaLLy3K's GitHub pages repo

This content is for my domain https://firebog.net as well as my Big Blocklist Collection.

Why use this over other sources?

I've been running a DNS sinkhole on my very active household network since roughly 2013, so I've been able to get a good feel for which lists cause issues and which don't. This experience lets me categorise the lists and, more importantly, provide easily accessible recommendations for you to implement on your network.

There's also the fact that there are very few sources of original blocklist content out there. A considerable percentage of the lists I've seen are essentially the "I made this" meme, which leads to the following issues:

  • Cessation: Consolidated lists deprive the original list maintainer of visits.
    • If their visit count falls, it's reasonable to expect one would stop updating the list because their efforts are no longer appreciated
    • Lack of maintenance can lead to even less original blocklist content
  • Centralisation: You're letting one entity essentially dictate what can and can't be blocked
    • This puts a lot of workload on a single person to maintain changes to their consolidated list, potentially bringing into question how long a consolidated list will be maintained for, which could become an issue due to the "set and forget" nature of DNS sink-holing
    • The consolidated list maintainer may not always be up-to-date with the original list source
    • Additions and removals may not be passed upstream to the maintainer of the original list, where they would benefit more people overall

My goals are to:

  • Credit other maintainers' high-quality content by way of "direct hits"
  • Make their content as easy to access for others as possible
  • Not require payment or nag for donations
  • Have a transparent changelog thanks to GitHub's commit history

These goals ensure every interested individual or group can have better control over their Internet experience.

On a related note, I have zero interest in maintaining content inside various blocklists. The one exception is the handful of domains in my own blocklist, which have cropped up here and there over the years because it was quicker to add them to that list than to submit them to anyone else.

Found a false positive, but don't know which list contains it?

Run pihole -q blockeddomain.com, and it will return the URL of the block list.

Know which list contains the false positive?

Click on the big blue "Toggle List Maintainer Sources" button to show the source of a particular blocklist. Via the source page, you should be able to find the contact details for the list maintainer.

Some lists are sourced from "adblock" style lists, which are flat-out NOT designed to work with DNS sinkholes, and there WILL be mistakes in how these are parsed due to how domain names are extracted and exceptions handled. Before reaching out to one of the fine folks at EasyList, PLEASE confirm your issue still exists when using an Adblock plugin such as uBlock or ABP. If the issue isn't present when using an Adblock plugin, raise an issue here first.

For every other list, get in touch with them to remove the false positive - if you're not able to find the maintainer's contact details, please feel free to reach out to me.

Lists which I host at v.firebog.net:

These lists are automatically updated, and are a domains-only (Pi-hole friendly) format of what the original list maintainer provides. I do not make any additions or subtractions to these lists (except for my personal blocklist).
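
To illustrate what "domains-only" means here, the following is a minimal Python sketch of converting hosts-format lines into bare domains. It is not the actual parser behind v.firebog.net; the function name `hosts_to_domains` and the set of sink addresses are assumptions made for the example.

```python
def hosts_to_domains(lines):
    """Return the unique domains found in hosts-format lines, in first-seen order."""
    # Addresses that hosts-format lists commonly use as the sink target (assumed set).
    sink_addresses = {"0.0.0.0", "127.0.0.1", "::", "::1"}
    seen = set()
    domains = []
    for raw in lines:
        # Drop trailing comments and surrounding whitespace.
        entry = raw.split("#", 1)[0].strip()
        if not entry:
            continue
        fields = entry.split()
        # "0.0.0.0 example.com" keeps the hostnames; a bare "example.com" is kept as-is.
        names = fields[1:] if fields[0] in sink_addresses else fields[:1]
        for name in names:
            name = name.lower().rstrip(".")
            if name and name not in seen:
                seen.add(name)
                domains.append(name)
    return domains
```

Entries are neither added nor removed beyond de-duplication, which matches the "no additions or subtractions" approach described above.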

My automated parser/mirror has the following methods in place to minimise risk of being auto IP banned:

  • It should not retrieve a remote file if it has already done so within 24 hours
  • It will get the HTTP status, ETag and Last-Modified headers of a remote file using cURL
  • cURL uses a custom user-agent which specifically identifies v.firebog.net
  • If the HTTP status is 403, the script should not attempt retrieval until after 5 weeks (3.024e+6 seconds)
  • It will compare server ETag header with the previously stored ETag, and only retrieve the file contents if necessary
  • In the event a server is not configured with the ETag header, the Last-Modified header will be used in place
  • If the Last-Modified header does not exist, it will retrieve a new copy, make a comparison with the existing version and update if necessary
  • A cron job will fire off my script roughly every two days (Sun/Tue/Thu/Sat at midnight AEST), or twice a week for Prigent-Adult.txt due to its size
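
The retrieval rules above can be sketched as two small decision functions. This is a hedged Python illustration, not the actual script (which uses cURL); `CacheEntry`, `may_contact`, `decide` and the returned action strings are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional

DAY = 24 * 60 * 60
FORBIDDEN_BACKOFF = 5 * 7 * DAY  # back-off after a 403, taken as 5 weeks


@dataclass
class CacheEntry:
    fetched_at: float                    # Unix time of the last retrieval attempt
    status: int                          # HTTP status seen on that attempt
    etag: Optional[str] = None           # ETag stored from that attempt, if any
    last_modified: Optional[str] = None  # Last-Modified stored from that attempt, if any


def may_contact(cache, now):
    """Decide whether the remote server may be contacted at all."""
    if cache is None:
        return True
    age = now - cache.fetched_at
    if cache.status == 403:
        return age >= FORBIDDEN_BACKOFF
    return age >= DAY


def decide(cache, etag, last_modified):
    """Given freshly fetched headers, decide what to do with the file body."""
    if cache is None:
        return "download"
    if etag is not None:
        return "skip" if etag == cache.etag else "download"
    if last_modified is not None:
        return "skip" if last_modified == cache.last_modified else "download"
    # Neither header is available: fetch a copy and compare it with the stored one.
    return "download-and-compare"
```

Keeping the time-based check separate from the header comparison means a server that recently answered 403 is not contacted at all, not even for headers.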

On the subject of "non-domain entries":

As mentioned before, I do not make additions or subtractions to content — therefore it isn't my place to correct the issues from upstream lists. Please do not create an issue for non-domain entries that come up when running pihole -g as they are already being filtered out.
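
For readers curious what filtering out a "non-domain entry" can look like, here is a rough Python sketch. It is an illustration only and does not reproduce Pi-hole's own validation; the function `is_domain` and its exact rules (label length, allowed characters, rejecting single-label names) are assumptions.

```python
import re

# One DNS label: letters, digits, hyphen or underscore, 1-63 characters,
# not starting or ending with a hyphen (an assumed, simplified rule set).
_LABEL = re.compile(r"^(?!-)[a-z0-9_-]{1,63}(?<!-)$")


def is_domain(entry):
    """Return True if the entry looks like a plain domain name."""
    name = entry.strip().lower().rstrip(".")
    if not name or len(name) > 253:
        return False
    labels = name.split(".")
    if len(labels) < 2:
        return False  # single-label names such as "localhost" are rejected
    if labels[-1].isdigit():
        return False  # a numeric final label suggests an IP address, not a domain
    return all(_LABEL.match(label) for label in labels)
```

With rules like these, URLs, IP addresses and stray markup in an upstream list are skipped rather than loaded.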

Unable to find the list maintainer; they're unresponsive or have another issue?

Open a ticket here, and I'll be happy to see what I can do.

I attempt to reply to GitHub tickets and mentions as soon as I notice them, so by all means please @ ping me anywhere that I frequent that's convenient for you if @'ing on GitHub doesn't get a reply in a day or two. Also, if a ticket is closed, you are welcome to comment on it still if you have any question, comment or concern. 😃

wally3k.github.io's Issues

osint.bambenekconsulting.com blocked in https://v.firebog.net/hosts/Airelle-hrsk.txt

Airelle-trc blocks aws.amazon.com

Amazon's AWS service is used for general computing. While it could be for tracking, it is also the core of many legitimate services and applications. Blocking this breaks things for me, and I suspect I'm not the only one. It looks like there are several other tickets against this list for false positives. If you do not believe these to be false positives, then I would recommend updating the website to have a cross in front of this list:

Lists bulleted with a cross block multiple useful sites (e.g: Pi-hole updates, Amazon, Netflix)

The description is especially relevant since the service being blocked is actually Amazon

airelle-trc blocks cs9.wac.phicdn.net (ocsp.digicert.com)

The list "airelle-trc" blocks cs9.wac.phicdn.net (ocsp.digicert.com seems to be an alias), which breaks sites like GitHub. I was wondering why the Query Log showed ocsp.digicert.com as a blocked query even though I couldn't find the domain on any of the block lists I use. The reason seems to be that cs9.wac.phicdn.net is blocked by airelle-trc.

I don't know if this is a real issue, since the Airelle lists are no longer recommended (ticked). Just to let you know.

Airelle-trc blocks "medium.com"

medium.com looks like a false-positive to me. Not really a website that distributes malware, is a host for ads, etc.

Airelle-trc looks like it hosts multiple false positives; see #27 and #29.

AdguardDNS - googleadapis.l.google.com

Hello!
The domain googleadapis.l.google.com is in the AdguardDNS exceptions, because it blocks fonts.googleapis.com. The exceptions are at the end of the official list. Click here for more info. But in the AdguardDNS firebog list googleadapis is still there.

Airelle-trc blocks "www.howtogeek.com"

Hi, I hope this is the right place to report false positives.
I can't find a reason why howtogeek should be blocked.

I would also like to ask if there is an optimal way to figure out which list is responsible for a block, because my lists in /etc/pihole use different names from those on the https://wally3k.github.io/ page.

github.io in EasyPrivacy list

Hi,

I daily download several blocklists available at your https://wally3k.github.io/ page.

Using the https://v.firebog.net/hosts/Easyprivacy.txt list is problematic because it includes plain github.io, which, at least when processed by DNSCrypt-proxy (blocklist), blocks access to all github.io sites (it is treated as *.github.io). This means that https://wally3k.github.io/ itself is blocked!

Other lists mention specific github.io pages, but not plain github.io ...

Thanks for your work, @WaLLy3K , much appreciated.
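
The difference described in this issue, exact matching versus treating an entry as covering all of its subdomains, can be illustrated with a short Python sketch. It is not how DNSCrypt-proxy or Pi-hole is implemented; `is_blocked` and its `match_subdomains` flag are hypothetical.

```python
def is_blocked(name, blocklist, match_subdomains):
    """Check a queried name against a set of blocked domains."""
    name = name.lower().rstrip(".")
    if name in blocklist:
        return True
    if not match_subdomains:
        return False
    # Walk up the parent domains: a.b.c -> b.c -> c
    parts = name.split(".")
    return any(".".join(parts[i:]) in blocklist for i in range(1, len(parts)))
```

With `{"github.io"}` as the blocklist, `wally3k.github.io` is blocked only when `match_subdomains` is true, which is the behaviour reported above.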

Steven Black hostlist

Hi,

To add blocklists effectively for Pi-hole, it would be nice to mention that the Steven Black hostlist (the first of the default lists in a fresh Pi-hole installation) contains 15 additional lists!
11 of them are listed here separately, and users often add them to their blocklists as well. Pi-hole's gravity script sorts out duplicate entries, but it would be easier and more effective to make it clear that these 11 files are already included in a default Pi-hole configuration.

regards, Frank

http://someonewhocares.org/hosts/zero/hosts
http://winhelp2002.mvps.org/hosts.txt
https://adaway.org/hosts.txt
https://www.malwaredomainlist.com/hostslist/hosts.txt
https://raw.githubusercontent.com/StevenBlack/hosts/master/data/add.2o7Net/hosts
https://raw.githubusercontent.com/StevenBlack/hosts/master/data/add.Risk/hosts
https://raw.githubusercontent.com/StevenBlack/hosts/master/data/add.Spam/hosts
https://raw.githubusercontent.com/StevenBlack/hosts/master/data/KADhosts/hosts
https://raw.githubusercontent.com/StevenBlack/hosts/master/data/SpotifyAds/hosts
https://raw.githubusercontent.com/StevenBlack/hosts/master/data/tyzbit/hosts
https://raw.githubusercontent.com/StevenBlack/hosts/master/data/UncheckyAds/hosts

Piwik is now Matomo-org

Hi

Piwik has changed its name to Matomo and their github has changed too.

You are referencing

  • https://raw.githubusercontent.com/piwik/referrer-spam-blacklist/master/spammers.txt
    and the new url is
  • https://raw.githubusercontent.com/matomo-org/referrer-spam-blacklist/master/spammers.txt

Maybe worth updating...

IDs for lists

Hi,

Thanks for this list.
Could you please give an ID to each <ul class="bdUrlList">-block? Something like <ul id="listSuspicious" class="bdUrlList"> would be perfect. This would allow me to automate the retrieval and processing of the lists.
Obviously I would be happy to share my script when I finish it.

Thanks,
Rob

P.s.:
Yes, I could have forked, changed and sent a pull request. But I think it's less work this way.
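
As a rough idea of the automation this request would enable, the following Python sketch collects link targets from a single list selected by its id. The page's real markup is not shown in this issue, so the assumption that each URL sits in an `<a href>` inside the `<ul>` is hypothetical, as are the names `ListUrlParser` and `urls_in_list`.

```python
from html.parser import HTMLParser


class ListUrlParser(HTMLParser):
    """Collect href values found inside the <ul> with a given id."""

    def __init__(self, list_id):
        super().__init__()
        self.list_id = list_id
        self.depth = 0  # greater than 0 while inside the target <ul>
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.depth:
            if tag == "ul":
                self.depth += 1  # nested list inside the target
            elif tag == "a" and attrs.get("href"):
                self.urls.append(attrs["href"])
        elif tag == "ul" and attrs.get("id") == self.list_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag == "ul" and self.depth:
            self.depth -= 1


def urls_in_list(html, list_id):
    parser = ListUrlParser(list_id)
    parser.feed(html)
    return parser.urls
```

Calling `urls_in_list(page_html, "listSuspicious")` would then return only the URLs of that one category, without relying on the order of the `bdUrlList` blocks.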

Airelle - localhost

In Airelle lists there is a localhost entry. That entry should not be there.

webtrack.dhlglobalmail.com blocked

The DHL package tracking page is blocked in list.19.v.firebog.net.domains. I've whitelisted it and I didn't notice anything objectionable on the page.

[Feature Request] Automatic updating of blocklist using txt?

Hi Wally3k,

Thanks for providing this blocklist on your website! I was wondering if it would be possible to create a txt version of the no-false-positives list (or just of the list without the ones that cross block multiple sites) so we could add all of these lists to pihole through adlists.list with automatic updating?

Thanks

Bad lists?

Adding ticked lists and get errors when downloading the following lists

All lists above (I'm sure there are a few others) error out, such as:

[i] Target: bitbucket.org (Mandiant_APT1_Report_Appendix_D.txt)
[✗] Status: Connection Refused
[✗] List download failed: no cached list available

updated links (HOSTS)

@WaLLy3K
here is the updated link, for the HOSTS repository
used in https://github.com/WaLLy3K/wally3k.github.io/blob/master/classification.ini.

https://raw.githubusercontent.com/eladkarako/hosts/master/_raw__hosts.txt

also available...
HOSTS (HOSTS format)
https://raw.githubusercontent.com/eladkarako/hosts/master/build/hosts.txt
https://raw.githubusercontent.com/eladkarako/hosts/master/build/hosts0.txt
https://raw.githubusercontent.com/eladkarako/hosts/master/build/hosts_with_localhost.txt
https://raw.githubusercontent.com/eladkarako/hosts/master/build/hosts0_with_localhost.txt

HOSTS (HOSTS in AdBlock format for the browser)
https://raw.githubusercontent.com/eladkarako/hosts/master/build/hosts_adblock.txt

AdBlock lists
https://raw.githubusercontent.com/eladkarako/hosts/master/build/hosts_adblock_anti_annoyances_block.txt
https://raw.githubusercontent.com/eladkarako/hosts/master/build/hosts_adblock_anti_annoyances_block_inline_script.txt
https://raw.githubusercontent.com/eladkarako/hosts/master/build/hosts_adblock_anti_annoyances_hide.txt
https://raw.githubusercontent.com/eladkarako/hosts/master/build/hosts_adblock_anti_annoyances_style_inject.txt

Question icanhazip.com blocked

Not sure if I'm posting this in the correct GitHub repo (instead of with the list hosters/creators) [please do not get angry if I'm wrong to post it here].

These blocklists block the domain icanhazip.com:

https://v.firebog.net/hosts/Airelle-hrsk.txt [included in firebog.net]
https://tspprs.com/dl/malware
https://raw.githubusercontent.com/CHEF-KOCH/CKs-FilterList/master/HOSTS/CK's-Malware-HOSTS-FilterList.txt

Why do these lists block icanhazip.com?

Is it because some malware gets and determines the IP using this domain?
Isn't it a false positive, since it can be used for good, such as determining the IP in a script? The domain itself is harmless.

Add Smart-TV Block and Adlists?

Pihole cannot update lists from v.firebog.net (Status: Forbidden)

Hello,

I'm using your great adlist collection for my pihole. I have no trouble fetching those hosts with the exception of your domain, v.firebog.net. If I'm updating my lists using the pihole web interface, it looks like this (excerpt):

::: Getting s3.amazonaws.com list... done
:::   Status: Not modified
:::   No changes detected, transport skipped!
::: Getting v.firebog.net list... done
:::   Status: Forbidden
:::   Download failed and no cached list available (list will not be considered)
  [TRUNCATED as I have multiple lists of this host, they all fail]
::: Getting v.firebog.net list... done
:::   Status: Forbidden
:::   Download failed and no cached list available (list will not be considered)
::: Getting www.dshield.org list... done
:::   Status: Success (OK)
:::   List updated, transport successful!

My adlists.list is the same as your nocross adlist collection

I have no problem to wget one of the urls, for which pihole returns Forbidden. This happened on the same machine.

Is this maybe some kind of misconfiguration of your hoster that it doesn't allow pihole? Or am I the only one with that problem?

facebook blocklist remove from list generator

Can you remove https://raw.githubusercontent.com/anudeepND/blacklist/master/facebook.txt from the list generator, similar to how you handled the 1 million porn list? I have a script that automatically pulls in the no-cross plaintext list and adds any missing to my adlist. I've gotten a lot of false positives since my cohabitants use facebook. It would also work to mark it as a cross.

I'm referring to this process: https://v.firebog.net/hosts/lists.php

Bad Lists

CHEF-KOCH lists

Why not add the original lists from https://github.com/CHEF-KOCH/NSABlocklist/tree/master/Trackers for the CHEF-KOCH Anti-Fingerprinting lists? They have standard short URLs.

Below are the URLs from your list, each followed by the original URL. I have also added the NSABlocklist from his site.

CHEF-KOCH Anti-Fingerprinting lists

https://github.com/CHEF-KOCH/NSABlocklist/tree/master/Trackers

https://gist.githubusercontent.com/CHEF-KOCH/080efada22b9659ef61241029122873b/raw/7f9bd984d3c46b3dba2de7606da579bc0ac6780c/Canvas%2520Font%2520Fingerprinting%2520pages%2520%255B2017%2520Edition%255D
https://github.com/CHEF-KOCH/NSABlocklist/raw/master/Trackers/Canvas%20font%20fingerprinting.txt

https://gist.githubusercontent.com/CHEF-KOCH/5a7b1593d1880f906b12a3c87cee4500/raw/3ba028508feb2ef67a3d7ab75f428fd284223e8b/WebRTC%2520tracking%2520list%2520%255B2017%2520Edition%255D.txt
https://github.com/CHEF-KOCH/NSABlocklist/blob/master/Trackers/webrtc%20tracking.txt

https://gist.githubusercontent.com/CHEF-KOCH/63fd2e506cb34a2378ad2620ab06d2e0/raw/fb9f16e3ac998d3f773ebdfee4aa3bfd10a5d763/Audio%2520fingerprint%2520pages%2520%255B2017%2520Edition.exe
https://github.com/CHEF-KOCH/NSABlocklist/raw/master/Trackers/audio%20fingerporint%20pages.txt

https://gist.githubusercontent.com/CHEF-KOCH/2dea75d43b2184f228ae94b168d275b1/raw/35d7a4447a198449bbb3280e1c3d7a57517350de/Canvas%2520fingerprinting%2520pages%2520%255B2017%2520Edition%255D.exe
https://github.com/CHEF-KOCH/NSABlocklist/raw/master/Trackers/canvas%20fingerprinting%20pages.txt

https://github.com/CHEF-KOCH/NSABlocklist/raw/master/Trackers/trackers.txt

NSABlocklist/HOSTS

https://github.com/CHEF-KOCH/NSABlocklist

https://raw.githubusercontent.com/CHEF-KOCH/NSABlocklist/master/HOSTS
