a4k-openproject / script.module.openscrapers
OpenScrapers Project
License: GNU General Public License v3.0
Should anilist.py have this changed:
from resources.lib.modules import
to this:
from openscrapers.modules import
Being in the UK, I have to use a lot of proxy sites to get torrents to scrape. I used https://limetorrents.unblockit.me/ with OpenScrapers and others that used to work, but they stopped working some time ago. Is this because of the Cloudflare v2 change?
The settings for toggling torrents on or off still refer to script.module.civitasscrapers.
happens when running scrape-test.py
minimal setup to reproduce:
import sys
import os
sys.path.append(os.path.join(os.path.curdir, 'lib'))
from lib import openscrapers
openscrapers.sources(None, True)
Please remove these, as the following websites are dead and don't exist anymore:
<setting id="provider.openkatalog" type="bool" label="OPENKATALOG" default="false" />
<setting id="provider.paczamy" type="bool" label="PACZAMY" default="false" />
<setting id="provider.trt" type="bool" label="TRT" default="false" />
In v0.0.0.7 the updated cfscrape seems to have broken rlsbb; there are way fewer premium links, with rlsbb not working. I put the cfscrape from v0.0.0.5 into v0.0.0.7, and then rlsbb links are scraped and work.
Hi, I am having real trouble getting this to work. I have followed each step four times now and I'm still getting the same issue. If you have a Telegram group, can you send me an invite, please, so maybe someone can help me get it working?
Awesome job! Thanks to all involved, and thumbs up to the new additions in the credits ;)
I did a fresh install of the latest Exodus Redux. Disabled all other providers except Furk. Set up my login credentials and API key. Tried a search and got the notification that there was no stream found. My search on Furk.net itself returns plenty results. Did a test with the default providers and got back plenty of results. Looks like the Furk scraper is broken. Can you please look into this?
Multiple friends using addons with Open Scrapers had issues, disabling ALLUCXYZ fixed the issue.
c06e458
What is this debrid.tor_enabled()
supposed to check? It breaks torrent scraping on my add-on, and on Exodus Redux as well.
Removing this check from torrent scrapers makes them work again.
Are we supposed to implement an extra setting or something?
On a side note, this commit adds some lines with tabs, unlike the rest of the file's spaces indentation.
Is xwatchseries broken?
Hi,
Just wanted to pass along that vidics is down. Could someone take a look at it?
Thanks.
Hi there.
I've written an Easynews scraper for Openscrapers, but there is a problem with the control.py file that should be used to access the settings of openscrapers.
The 'addon' variable (accessing xbmcaddon.Addon) needs to explicitly state the OpenScrapers id as its 'id' arg...
addon = xbmcaddon.Addon(id='script.module.openscrapers')
at the moment it is like this...
addon = xbmcaddon.Addon()
This has implications for other variables in the code, such as "setting", which is assigned "addon.getSetting". If 'addon' is not pinned to OpenScrapers, this 'setting' call will read the settings of whichever addon is accessing the scraper. So, for example, if Venom calls the new Easynews scraper, the 'setting' calls in the Easynews scraper will check Venom's settings instead of OpenScrapers' settings.
I can fix this with a pull request, I just don't know whether it will affect the scrapers test code incorporated into Openscrapers.
Thanks for the update. Was this update meant to fix the Cloudflare/cfscrape error? For me, I'm still not getting any rlsbb links for some reason.
Currently Easynews & Furk results are treated as 'direct' sources by addons; this is a problem when using the 'use debrid only' filter (and possibly other sorting/filtering options).
The solution would be to treat these links as 'premium', alongside torrent/debrid, rather than as 'free' links.
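To illustrate the suggestion, here is a minimal sketch of how an addon's "debrid only" filter might behave if Easynews/Furk results carried a 'premium' flag. The dict keys here are illustrative assumptions, not OpenScrapers' actual source schema:

```python
def debrid_only_filter(sources):
    # keep debrid-resolvable and premium-account links, drop free direct ones
    return [s for s in sources if s.get('debrid') or s.get('premium')]

sources = [
    {'provider': 'furk', 'premium': True},            # premium account host
    {'provider': 'freehost', 'direct': True},         # free direct link
    {'provider': 'magnet', 'debrid': 'Real-Debrid'},  # debrid-cached torrent
]
kept = debrid_only_filter(sources)
print([s['provider'] for s in kept])  # ['furk', 'magnet']
```

With a 'premium' flag in place, Furk/Easynews links would survive the filter alongside debrid links instead of being discarded as free sources.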
As the title mentions, I would like to be able to add some scrapers to OpenScrapers. They are from an addon I have; I have tested them to make sure they work, and I have checked for duplicates and removed any I found.
2ddl is giving this error: http://2ddl.vg/ returned an error. Could not collect tokens.
And rapidmoviez requires "from openscrapers.modules import dom_parser2", but dom_parser2.py is not in modules. I was able to add my own dom_parser2, but I thought I would let you know.
Just curious why vidics and xwatchseries are not being used, because I'm getting a lot of links with them for episodes?
Sorry, I forgot to mention that there is a copy of onlineseries in en that still calls for dom_parser2; the one in en_debrid_only is proper, and I guess the one in en should be removed.
And just a thank you for creating and maintaining openscrapers ;)
PubfilmOnline gives this error
UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html.parser"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.
The code that caused this warning is on line 56 of the file C:\Users\Grim\Documents\GitHub\script.module.openscrapers\lib\openscrapers\sources_openscrapers\en\pubfilmonline.py. To get rid of this warning, pass the additional argument 'features="html.parser"' to the BeautifulSoup constructor.
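The warning itself names the fix: pass an explicit parser to the BeautifulSoup constructor instead of letting it auto-select one. A minimal example (the HTML snippet is made up for illustration):

```python
from bs4 import BeautifulSoup

html = '<div class="item"><a href="/stream/1">Watch</a></div>'
# Passing features="html.parser" silences the warning and pins the
# stdlib parser, so parsing behaves the same on every system.
soup = BeautifulSoup(html, features="html.parser")
print(soup.a['href'])  # /stream/1
```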
Hi.
Can you check the german providers list please?
I use Exodus Redux, and it loads no links when I set OpenScrapers to use the German providers.
BTW, when I use LambdaScrapers everything works.
I have noticed that in LambdaScrapers the list of German providers is completely different.
Hello, I have been developing some scrapers using BeautifulSoup for a while. I am interested in developing scrapers that can be integrated with OpenScrapers. Is there documentation with clear instructions on how to develop scrapers to your pattern?
Hey @nazegnl, just merged your PR and ran the scrape test; all CSV outputs are using ';' instead of ',', so I have to go into all the files and change ';' to ',' lol. Can you please look at it again?
The following scrapers all reference the missing module anilist:
Animeloads
Proxer
Foxx
Pureanime
This module will either need to be found or the providers removed.
I'm thinking for the next update we add the hash to our torrent sources dict. It would make things a little easier for devs doing torrent cached/uncached checking, and/or removal.
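A rough sketch of what carrying the info-hash in a torrent source dict could look like (the key names are illustrative, not the actual OpenScrapers schema):

```python
import re

def extract_infohash(magnet):
    # pull the 40-char hex btih info-hash out of a magnet link, if present
    m = re.search(r'btih:([a-fA-F0-9]{40})', magnet)
    return m.group(1).lower() if m else None

magnet = 'magnet:?xt=urn:btih:' + 'A' * 40 + '&dn=Example.Movie.1080p'
source = {'source': 'torrent', 'url': magnet, 'hash': extract_infohash(magnet)}
print(source['hash'])
```

With the hash already in the dict, a debrid cached/uncached check can batch the hashes directly instead of re-parsing every magnet URL.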
Please excuse my poor English.
Hello,
is there a special reason why in the 'addon.xml' the line
<extension point="xbmc.python.pluginsource" library="lib/default.py">
is not
<extension point="xbmc.python.script" library="lib/default.py">
like other script modules?
I changed this for myself and also changed the 'default.py' a little bit. The advantage is that after changing the settings, they are always saved when you click the "OK" button.
As an example, my "default.py":
myvideolink may be broken; is anyone else getting errors from it? Thnx
Seems to be displaying episode title rather than the name of the show. For example, Gold Rush: Parker's Trail show displayed as Hell's Crack: https://imgur.com/a/KN5rqjj
Using the "develop" branch, Kodi on a Fire TV Stick:
the module "cfscrape.py" from branch "develop" does NOT return a result.
Using the "develop" branch, Kodi on Windows:
the module "cfscrape.py" from branch "develop" does return a result.
Using "master":
the same module "cfscrape.py" from branch "master" does return a result in Kodi, on both the Fire TV Stick and Windows.
Sorry for the short text; my English is bad.
Thank you
url = https://movietown.org
import openscrapers
from openscrapers.modules import cfscrape
scraper = cfscrape.create_scraper()
sHtmlContent = scraper.get(url).content
print(sHtmlContent)
Here is a strange one. For example, if I open the scraper settings and choose "disable all torrent providers", it flashes and toggles them all off. Then I click "OK" and it exits, but when I go back in, the torrents are all toggled back on. However, if I disable all and then hit "Cancel" to close the settings, when I go back in my changes have been saved... it's as if OK and Cancel are acting in reverse? I'm using Kodi 18.1.
The German (de) scrapers are almost all broken and should be revised.
solarmovie.py is missing from EN.
Really, many German scrapers seem to be broken.
I used Venom with only the foreign scrapers enabled and set to German indexers within Venom.
Right now I have only found sources at iload, ddl(.me?) and streamto.
I know the searched series is available at least on serienstream (s.to), freikino, hdfilme and kinox.to.
Could someone look into this?
Or if anyone has an "easy" guide for making scrapers, I could try it myself.
(I haven't done this before, nor used Python much.)
Hi!
I think it would be wonderful if you could take a look at the primewire scraper and update it.
Right now it scrapes https://www.primewire.ac/, which is DOWN (522 error), and https://primewire.ink/, which seems to be a clone... if I'm not mistaken?
Instead, please update it to scrape https://www.primewire.li/ or https://www.primewire.ag/, which seem to be updated EVERY day.
Thanks in advance!
Skip scrapers that require login, like GoStream, Furk, etc.
ultrahdindir.py (I requested a fix on Reddit) is still not giving results back. I replaced the old ultrahdindir.py with the new one from the developer branch; maybe you can also test this? Thx.
I don't understand: version 1.93 is available on the OpenScrapers repo but not here?
Thanks for this addon :-)
As title says, please add headers on the requests of cfscrape to hide kodi headers.
In the case of client.request(url), headers are created inside the request function of the client module, so for example you don't need to set User-Agent. But on plain requests and cfscrape requests you need to set User-Agent, and possibly the scraper's base URL as Referer, to hide that the requests come from Kodi!
for example:
scraper = cfscrape.create_scraper()
headers = {'User-Agent': client.agent(), 'Referer': self.base_url}
html = scraper.get(url, headers=headers).text
v1.106
I don't use many free scrapers; I do use most debrid scrapers. Of the scrapers I use, I have had these issues. P.S. Thank you for all your efforts; I'm just trying to contribute:
[2020-03-17 05:37:04] [COLOR red][ OPENSCRAPERS DEBUG ][/COLOR]: Error: Loading module: "projectfreetv": cannot import name cfScraper
[2020-03-17 05:37:06] [COLOR red][ OPENSCRAPERS DEBUG ][/COLOR]: Request-Error (500): http://www.sceneddl.me/?s=Riviera+S02E10
[2020-03-17 05:37:06] [COLOR red][ OPENSCRAPERS DEBUG ][/COLOR]: Request-Error: (unknown url type: Riviera) => Riviera
[2020-03-17 05:37:06] [COLOR red][ OPENSCRAPERS DEBUG ][/COLOR]: Request-Error: (unknown url type: Riviera) => Riviera
[2020-03-17 05:37:06] [COLOR red][ OPENSCRAPERS DEBUG ][/COLOR]: Request-Error: (unknown url type: Riviera) => Riviera
[2020-03-17 05:37:07] [COLOR red][ OPENSCRAPERS DEBUG ][/COLOR]: MYVIDEOLINK - Exception:
Traceback (most recent call last):
File "/Users/xxxxxxx/Library/Application Support/Kodi/addons/script.module.openscrapers/lib/openscrapers/sources_openscrapers/en_DebridOnly/myvideolink.py", line 107, in sources
posts = zip(client.parseDOM(r1, 'a', ret='href'), client.parseDOM(r1, 'a'), re.findall('((?:\d+.\d+|\d+,\d+|\d+)\s*(?:GB|GiB|MB|MiB))', r2[0]))
[2020-03-17 05:38:03] [COLOR red][ OPENSCRAPERS DEBUG ][/COLOR]: RAPIDMOVIEZ - Exception:
Traceback (most recent call last):
Cloudflare_reCaptcha_Provider: Cloudflare reCaptcha detected, unfortunately you haven't loaded an anti reCaptcha provider correctly via the 'recaptcha' parameter.
[2020-03-17 05:38:03] [COLOR red][ OPENSCRAPERS DEBUG ][/COLOR]: RAPIDMOVIEZ - Exception:
Traceback (most recent call last):
MissingSchema: Invalid URL 'None': No schema supplied. Perhaps you meant http://None?
[2020-03-17 05:39:10] [COLOR red][ OPENSCRAPERS DEBUG ][/COLOR]: RAPIDMOVIEZ - Exception:
Traceback (most recent call last):
Cloudflare_Loop_Protection: !!Loop Protection!! We have tried to solve 3 time(s) in a row.
[2020-03-17 05:39:10] [COLOR red][ OPENSCRAPERS DEBUG ][/COLOR]: RAPIDMOVIEZ - Exception:
Traceback (most recent call last):
Cloudflare_Loop_Protection: !!Loop Protection!! We have tried to solve 3 time(s) in a row.
[2020-03-17 05:39:10] [COLOR red][ OPENSCRAPERS DEBUG ][/COLOR]: RAPIDMOVIEZ - Exception:
Traceback (most recent call last):
Cloudflare_Loop_Protection: !!Loop Protection!! We have tried to solve 3 time(s) in a row.
[2020-03-17 05:39:10] [COLOR red][ OPENSCRAPERS DEBUG ][/COLOR]: RAPIDMOVIEZ - Exception:
Traceback (most recent call last):
Cloudflare_Loop_Protection: !!Loop Protection!! We have tried to solve 3 time(s) in a row.
[2020-03-17 05:39:14] [COLOR red][ OPENSCRAPERS DEBUG ][/COLOR]: RAPIDMOVIEZ - Exception:
Traceback (most recent call last):
Cloudflare_Loop_Protection: !!Loop Protection!! We have tried to solve 3 time(s) in a row.
Seems Rapidmoviez rarely passes CF...
@nazegnl
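The "unknown url type: Riviera" and MissingSchema errors above suggest a scraper is handing an unresolved title or None to the HTTP layer. A hypothetical guard (not the actual scraper code) that skips the request instead:

```python
def safe_url(url):
    # skip the request entirely when no usable URL was resolved,
    # instead of passing None or a bare title to the HTTP layer
    if not url or not str(url).startswith(('http://', 'https://')):
        return None
    return url

print(safe_url(None))                    # None (matches the 'http://None?' case)
print(safe_url('Riviera'))               # None (bare title, not a URL)
print(safe_url('http://example.com/x'))  # http://example.com/x
```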
Series9 is giving me this error on the latest dev branch:
Traceback (most recent call last):
File "C:\Users*\Documents\GitHub\script.module.openscrapers\lib\openscrapers\sources_openscrapers\en\series9.py", line 111, in sources
url = self.searchMovie(data['title'], data['year'])
File "C:\Users*\Documents\GitHub\script.module.openscrapers\lib\openscrapers\sources_openscrapers\en\series9.py", line 93, in searchMovie
url = [i[0] for i in results if cleantitle.get(i[1]) == cleantitle.get(title)][0]
IndexError: list index out of range
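The IndexError means no search result title matched, so indexing `[0]` on an empty list blows up. A hypothetical fix sketch (simplified stand-ins for the real cleantitle/results objects): return None when nothing matches, so the caller can fall through gracefully.

```python
def pick_first_match(results, title, clean=lambda s: s.lower().strip()):
    # results: list of (url, title) pairs; return first match or None
    matches = [u for (u, t) in results if clean(t) == clean(title)]
    return matches[0] if matches else None

results = [('/movie/1', 'Other Film'), ('/movie/2', 'My Film')]
print(pick_first_match(results, 'my film'))  # /movie/2
print(pick_first_match(results, 'missing'))  # None instead of IndexError
```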
Please add the following function, get_titles_for_search(), to source_utils.py:
def get_titles_for_search(title, localtitle, aliases):
    try:
        titles = []
        if "country':" in str(aliases):
            aliases = aliases_to_array(aliases)
        if localtitle != '':
            titles.append(localtitle)
        if title != '' and title != localtitle:
            titles.append(title)
        for i in aliases:
            if i.lower() != title.lower() and i.lower() != localtitle.lower() and i != '':
                titles.append(i)
        titles = [str(i) for i in titles if all(ord(c) < 128 for c in i)]
        return titles
    except:
        return []
This function simplifies the writing of scrapers.
It builds a list from the passed-in values with all duplicates removed.
As an example, here are some code lines from a scraper:
old:
def movie(self, imdb, title, localtitle, aliases, year):
    try:
        url = self.__search([localtitle] + source_utils.aliases_to_array(aliases))
        if not url and title != localtitle:
            url = self.__search([title] + source_utils.aliases_to_array(aliases))
        return url
    except:
        return
new, with "get_titles_for_search()":
def movie(self, imdb, title, localtitle, aliases, year):
    try:
        return self.__search(source_utils.get_titles_for_search(title, localtitle, aliases))
    except:
        return
many thanks
https://github.com/nusch/nusch-repo/tree/master/script.module.openscrapers
version numbers are 9.x.x.x
Seems kinda shady. I don't have a Reddit account, but maybe someone who does can post a warning on addons4kodi (people should have auto-updates off anyway).
Like RLSbb, SceneRLS, Zoogle (just to name a few).
If I roll back to the previous version, they work fine.
If you need a log, guide me.
Maybe you can replicate this issue as well?
Cloudflare v2 challenge has taken another from us. Will be removed in next update.
HD-Streams.org, and probably the rest, does not give 1080p results, and I also think the lower resolutions come from different pages.
In general I think they are largely outdated, and we would also benefit from modules for:
- streamkiste.tv
- kinoz.to
@host505 try latest dev or test-reformat branch. skytorrent is working for me just fine on both
Hello. Using OpenScrapers on Kodi 18.6 under Windows 10.
When trying to scrape the French yggtorrent website I get this error:
DEPRECATION: The OpenSSL being used by this python install (OpenSSL 1.0.2j 26 Sep 2016) does not meet the minimum supported version (>= OpenSSL 1.1.1) in order to support TLS 1.3 required by Cloudflare, You may encounter an unexpected reCaptcha or cloudflare 1020 blocks
And I can't bypass the Cloudflare protection.
Any idea ?
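The deprecation message above points at the Python build, not the scraper: the OpenSSL linked into that Python (1.0.2j) predates TLS 1.3. A quick way to check what your Kodi Python links against (HAS_TLSv1_3 is absent on older builds, hence the getattr):

```python
import ssl

# Cloudflare's TLS 1.3 requirement needs OpenSSL >= 1.1.1,
# per the deprecation message above.
print(ssl.OPENSSL_VERSION)                # e.g. the linked OpenSSL version string
print(getattr(ssl, 'HAS_TLSv1_3', False)) # False on builds too old for TLS 1.3
```

If this prints an OpenSSL older than 1.1.1, the fix is updating the Python/OpenSSL that Kodi uses, not the scraper code.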
I am having an issue with digbt.py causing the dialog window to not show other links. I tested it alongside a working scraper, and also with digbt as the only one available in the folder, and it prevents the dialog box from coming up. If I comment out the self variables, it works correctly (with no links from digbt). I noticed it is CF-based, and those are always hard to fix at times. I just wanted to see if you can confirm as well.
Thanks
v196 seems to be having trouble with user_agents.py. It may just be me, but I had to revert to the last cfscrape etc., as scrapers can't load user_agents.py.