pornhub's Introduction

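Usage

git clone https://github.com/lesssound/pornhub
cd pornhub && pip install -r requirements.txt
# Edit settings.toml: set proxy_url and the list pages you want to crawl
python crawler.py webm
# When the program finishes, two pages of webm thumbnails are downloaded into the webm folder; each file is named after the URL suffix of its detail page
python crawler.py mp4
# The downloaded MP4 files are in the mp4 folder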


pornhub's Issues

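Question about using a proxy

I use shadowsocks to get around the firewall. In PAC mode I cannot connect to pornhub; it is only reachable in global mode. Is there a way to make this tool use ss automatically, or to route its traffic through the ss proxy?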
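
One possible approach, as a minimal sketch: requests accepts a proxies mapping, so the traffic can be pointed at the local shadowsocks client explicitly instead of depending on PAC rules. The address socks5h://127.0.0.1:1080 is an assumption (a common local default, check your own client), and build_proxies is a hypothetical helper, not part of crawler.py. SOCKS support in requests needs the extra installed with pip install requests[socks].

```python
def build_proxies(proxy_url):
    """Return a proxies mapping that sends both HTTP and HTTPS through proxy_url."""
    return {"http": proxy_url, "https": proxy_url}


# Assumed address of the local shadowsocks SOCKS5 listener.
# The "socks5h" scheme also resolves host names through the proxy.
PROXY_URL = "socks5h://127.0.0.1:1080"
proxies = build_proxies(PROXY_URL)

# Usage with requests (not executed here):
#   import requests
#   requests.get(list_page_url, proxies=proxies, timeout=30)
```

If the tool reads proxy_url from settings.toml, putting the same URL there should have the same effect, but that depends on how crawler.py passes the value on.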

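Only one MP4 file is downloaded per run

Hi!
download.txt already contains many suffixes, but every time I run python crawler.py run mp4 only one MP4 video is downloaded. After that one file, the command window just keeps waiting. Screenshots are attached.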

It doesn't work

Creating the txt file fails. The path in the error contains the directory twice, something like C:\pornhub-master\c:\. What causes this?

Attribute Error

Traceback (most recent call last):
File "src/gevent/greenlet.py", line 716, in gevent._greenlet.Greenlet.run
File "crawler.py", line 66, in download
urllib.request.urlretrieve(url, '%s' % (filepath))
AttributeError: 'module' object has no attribute 'request'
2018-08-22T03:30:18Z <Greenlet "Greenlet-103" at 0x7f1d5127edb8: download('https://cv.phncdn.com/videos/201602/09/68280191/1, 'ph5accc35094e31', 'webm')> failed with AttributeError

23:30:18,791947 crawler-ln:
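
A likely cause, judging from the traceback alone: the wording "'module' object has no attribute 'request'" is how Python 2 phrases this error, and Python 2 has no urllib.request. Even on Python 3, a bare import urllib does not guarantee that the request submodule is loaded. A minimal sketch of a guard follows; require_python3 is a hypothetical helper, not something in crawler.py.

```python
import sys


def require_python3(version_info=sys.version_info):
    """Raise a readable error when the script is started with Python 2."""
    if version_info[0] < 3:
        raise RuntimeError("crawler.py needs Python 3; try running it with python3")
    return True


require_python3()

# Import the submodule explicitly; "import urllib" alone is not enough.
import urllib.request  # noqa: E402

# urlretrieve is now reachable as an attribute of urllib.request.
print(callable(urllib.request.urlretrieve))
```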

gevent is missing from requirements.txt

Traceback (most recent call last):
File "crawler.py", line 9, in <module>
import gevent
ModuleNotFoundError: No module named 'gevent'

Cannot load module

Does anyone know why the module cannot be loaded?
line 10, in <module>
import requests
ModuleNotFoundError: No module named 'requests'
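
ModuleNotFoundError usually means the package was never installed for the interpreter that runs the script (for example, pip belongs to a different Python than python). As a rough sketch, the check below lists which third-party modules are missing before the crawler starts. The module names are taken from the import lines and install logs quoted on this page; missing_modules is a hypothetical helper.

```python
import importlib.util
import sys

# Third-party modules that crawler.py appears to import, per the reports here.
REQUIRED = ["gevent", "requests", "lxml", "fire", "loguru"]


def missing_modules(names):
    """Return the names that cannot be found by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    # Installing through "python -m pip" targets this exact interpreter.
    print("Missing: %s" % ", ".join(missing))
    print("Try: %s -m pip install %s" % (sys.executable, " ".join(missing)))
else:
    print("All required modules are importable.")
```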

Something seems to be wrong

Environment
[root@lxc-centos7 Pornhub]# cat /etc/redhat-release
CentOS Linux release 7.7.1908 (Core)

Installation steps:
yum -y install git python3 python3-pip
git clone https://github.com/killmymates/Pornhub
cd Pornhub && pip3 install -r requirements.txt
python3 crawler.py webm

# Error output:
[root@lxc-centos7 Pornhub]# python3 crawler.py webm
2020-07-06 06:17:27.713 | INFO | main:list_page:34 - crawling : https://cn.pornhub.com/playlist/100503721
/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py:986: InsecureRequestWarning: Unverified HTTPS request is being made to host 'cn.pornhub.com'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
InsecureRequestWarning,
2020-07-06 06:17:29.720 | INFO | main:run:153 - finish !

[root@lxc-centos7 Pornhub]# cat logs/crawler.log
07-06 06:17:27 INFO crawling : https://cn.pornhub.com/playlist/100503721
07-06 06:17:29 INFO finish !

[root@lxc-centos7 Pornhub]# tree
.
|-- README.md
|-- _config.yml
|-- crawler.py
|-- download.txt
|-- logs
|   `-- crawler.log
|-- mp4
|-- png
|   `-- zhifubao.png
|-- requirements.txt
|-- tampermonkey.js
`-- webm

4 directories, 8 files

[root@lxc-centos7 Pornhub]# ll webm/
total 0

The webm directory appears to be empty.

Could not install packages due to an EnvironmentError: [Errno 13]

manlindeMacBook-Pro:pornhub manlin$ cd Pornhub && pip install -r requirements.txt
Collecting requests (from -r requirements.txt (line 1))
Using cached https://files.pythonhosted.org/packages/7d/e3/20f3d364d6c8e5d2353c72a67778eb189176f08e873c9900e10c0287b84b/requests-2.21.0-py2.py3-none-any.whl
Collecting lxml (from -r requirements.txt (line 2))
Using cached https://files.pythonhosted.org/packages/80/c7/909a16707823c169770e024e1495bc691b067bb45c032bb81039ad455d02/lxml-4.3.0-cp27-cp27m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl
Collecting fire (from -r requirements.txt (line 3))
Using cached https://files.pythonhosted.org/packages/5a/b7/205702f348aab198baecd1d8344a90748cb68f53bdcd1cc30cbc08e47d3e/fire-0.1.3.tar.gz
Collecting loguru (from -r requirements.txt (line 4))
Using cached https://files.pythonhosted.org/packages/8f/ec/37bd2bf520fd227d0b1bb206298ed903ede9a1d6e83dfc76df2a6a7a044c/loguru-0.0.1.tar.gz
Collecting urllib3<1.25,>=1.21.1 (from requests->-r requirements.txt (line 1))
Using cached https://files.pythonhosted.org/packages/62/00/ee1d7de624db8ba7090d1226aebefab96a2c71cd5cfa7629d6ad3f61b79e/urllib3-1.24.1-py2.py3-none-any.whl
Collecting certifi>=2017.4.17 (from requests->-r requirements.txt (line 1))
Using cached https://files.pythonhosted.org/packages/9f/e0/accfc1b56b57e9750eba272e24c4dddeac86852c2bebd1236674d7887e8a/certifi-2018.11.29-py2.py3-none-any.whl
Collecting chardet<3.1.0,>=3.0.2 (from requests->-r requirements.txt (line 1))
Using cached https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl
Collecting idna<2.9,>=2.5 (from requests->-r requirements.txt (line 1))
Using cached https://files.pythonhosted.org/packages/14/2c/cd551d81dbe15200be1cf41cd03869a46fe7226e7450af7a6545bfc474c9/idna-2.8-py2.py3-none-any.whl
Requirement already satisfied: six in /System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python (from fire->-r requirements.txt (line 3)) (1.4.1)
Installing collected packages: urllib3, certifi, chardet, idna, requests, lxml, fire, loguru
Could not install packages due to an EnvironmentError: [Errno 13] Permission denied: '/Library/Python/2.7/site-packages/urllib3-1.24.1.dist-info'
Consider using the --user option or check the permissions.
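
The log shows pip writing into the system Python 2.7 directory (/Library/Python/2.7/site-packages), which a normal user cannot modify. A sketch of one way around it, under the assumption that a per-user install is acceptable: build the pip command for the interpreter you actually run, and add --user only outside a virtual environment, where pip rejects that flag. pip_install_command and in_virtualenv are hypothetical helpers.

```python
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv (prefix differs from base)."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def pip_install_command(requirements="requirements.txt", user_install=None):
    """Build a pip command bound to the current interpreter."""
    if user_install is None:
        user_install = not in_virtualenv()
    cmd = [sys.executable, "-m", "pip", "install"]
    if user_install:
        cmd.append("--user")  # install into the user's own site-packages
    cmd += ["-r", requirements]
    return cmd


print(" ".join(pip_install_command()))
# To actually run it: subprocess.check_call(pip_install_command())
```

Since the wheels in the log are tagged cp27, running the command with python3 instead of the system python may also be needed.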

It reports an error.

In cmd:
D:\Cwork\porn\pornhub-master>python crawler.py webm
Traceback (most recent call last):
File "crawler.py", line 12, in <module>
import fire
ModuleNotFoundError: No module named 'fire'

In PyCharm:
D:\Btool\Python37\python.exe D:/Cwork/porn/pornhub-master/crawler.py
Traceback (most recent call last):
File "D:/Cwork/porn/pornhub-master/crawler.py", line 14, in <module>
logger.add("logs/%s.log" % __file__.rstrip('.py'), format="{time:MM-DD HH:mm:ss} {level} {message}")
File "D:\Btool\Python37\lib\site-packages\loguru\_logger.py", line 599, in add
sink = FileSink(path, **kwargs)
File "D:\Btool\Python37\lib\site-packages\loguru\_file_sink.py", line 57, in __init__
self.initialize_file(rename_existing=False)
os.makedirs(new_dir, exist_ok=True)
File "D:\Btool\Python37\lib\os.py", line 211, in makedirs
makedirs(head, exist_ok=exist_ok)
File "D:\Btool\Python37\lib\os.py", line 211, in makedirs
makedirs(head, exist_ok=exist_ok)
File "D:\Btool\Python37\lib\os.py", line 211, in makedirs
makedirs(head, exist_ok=exist_ok)
File "D:\Btool\Python37\lib\os.py", line 221, in makedirs
mkdir(name, mode)
OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: 'D:\Cwork\porn\pornhub-master\logs\D:'
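
The failing path suggests what happens: when the script is started with an absolute path (as PyCharm does), the script path is absolute too, so "logs/%s.log" ends up containing a second drive letter. In addition, rstrip('.py') strips trailing characters from the set {'.', 'p', 'y'} rather than removing the suffix. A minimal sketch of a safer way to build the log name, where log_path_for is a hypothetical helper:

```python
from pathlib import PurePath


def log_path_for(script_path, log_dir="logs"):
    """Build "<log_dir>/<script name without extension>.log" from any script path."""
    stem = PurePath(script_path).stem  # drops directories and the final suffix
    return "%s/%s.log" % (log_dir, stem)


# Absolute and relative script paths now give the same log file.
print(log_path_for("D:/Cwork/porn/pornhub-master/crawler.py"))
print(log_path_for("crawler.py"))

# Why rstrip is the wrong tool: it removes characters, not a suffix.
print("happy.py".rstrip(".py"))
```

In the script itself the call would presumably become logger.add(log_path_for(__file__), ...).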

SOCKSHTTPSConnectionPool(host='cn.pornhub.com', port=443): Max retries exceeded with url: /playlist/100503721 (Caused by NewConnectionError('<urllib3.contrib.socks.SOCKSHTTPSConnection object at 0x00000141C8A83908>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it.'))
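
WinError 10061 (connection refused) on a SOCKS connection pool typically means nothing is listening at the configured proxy address, for example because the proxy client is not running or uses a different port. A small sketch for checking that before starting the crawler; port_is_open is a hypothetical helper and 127.0.0.1:1080 is only an assumed proxy address.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True when a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Assumed local proxy address; replace with the host and port from proxy_url.
print(port_is_open("127.0.0.1", 1080))
```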

After running, it only prints the following and then exits?

os: <module 'os' from 'F:\Python\lib\os.py'>
urllib: <module 'urllib' from 'F:\Python\lib\urllib\__init__.py'>
json: <module 'json' from 'F:\Python\lib\json\__init__.py'>
re: <module 're' from 'F:\Python\lib\re.py'>
gevent: <module 'gevent' from 'F:\Python\lib\site-packages\gevent\__init__.py'>
requests: <module 'requests' from 'F:\Python\lib\site-packages\requests\__init__.py'>
etree: <module 'lxml.etree' from 'F:\Python\lib\site-packages\lxml\etree.cp36-win_amd64.pyd'>
fire: <module 'fire' from 'F:\Python\lib\site-packages\fire\__init__.py'>
headers: {"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36"}
list_page: <function list_page at 0x0000020235A12E18>
detail_page: <function detail_page at 0x0000020236DCD378>
download: <function download at 0x0000020236E48620>
run: <function run at 0x0000020236E486A8>


Error Logger

I get this error:
dpatrongomez@dpatrongomez-mi:~/pornhub-downloader$ python crawler.py mp4
Loguru
Traceback (most recent call last):
File "crawler.py", line 12, in <module>
from loguru import logger
ImportError: cannot import name logger

Got an error

(venv) PS D:\workspace\pyws\pb\pornhub> python crawler.py mp4
2020-07-09 23:52:34.696 | INFO | main:run:140 - url: https://www.pornhub.com/view_video.php?viewkey=ph564051ed6853e
D:\workspace\pyws\pb\pornhub\venv\lib\site-packages\urllib3\connectionpool.py:986: InsecureRequestWarning: Unverified HTTPS request is being made to host 'www.pornhub.com'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
InsecureRequestWarning,
2020-07-09 23:52:40.282 | INFO | main:detail_page:56 - JAV UNCENSORED
Traceback (most recent call last):
File "crawler.py", line 157, in <module>
fire.Fire(run)
File "D:\workspace\pyws\pb\pornhub\venv\lib\site-packages\fire\core.py", line 138, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "D:\workspace\pyws\pb\pornhub\venv\lib\site-packages\fire\core.py", line 468, in _Fire
target=component.__name__)
File "D:\workspace\pyws\pb\pornhub\venv\lib\site-packages\fire\core.py", line 672, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "crawler.py", line 141, in run
detail_page(url)
File "crawler.py", line 61, in detail_page
file.write(js + '\n')
UnicodeEncodeError: 'gbk' codec can't encode character '\uac15' in position 8183: illegal multibyte sequence
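
On a Chinese-locale Windows system, open() defaults to the GBK encoding when none is given, and GBK cannot represent characters such as '\uac15' (a Hangul syllable). A minimal sketch of the usual fix, assuming the file at line 61 of crawler.py is opened without an explicit encoding; append_line is a hypothetical helper.

```python
import os
import tempfile


def append_line(path, text):
    """Append one line as UTF-8, independent of the system's default encoding."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(text + "\n")


# Demonstration with a temporary file and the character from the traceback.
demo_path = os.path.join(tempfile.mkdtemp(), "page_dump.txt")
append_line(demo_path, "title: \uac15")

with open(demo_path, encoding="utf-8") as fh:
    print(fh.read().strip())
```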

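It seems video links such as qualit_720p can no longer be obtained?

Hello. While using your code, the first error I hit was that no video link could be found. I inspected the variables one by one and found that, when downloading mp4, the statement videoUrl = exeJs(js) in the detail_page function of crawler.py returns None. Looking at that function, the value retrieved for the flashvars_xxxxxxxx variable does not contain a video link. When I log in to the site and run console.log(flashvars_xxxxx) in the browser, I can get the value. I tried to fix it myself but did not succeed. Could you check whether something has gone wrong?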

Invalid address

I read your program crawler.py. On line 55, I also obtained the variable "videoUrl" by checking the page source. But when I open this video URL in my browser, I get "403 Forbidden" and no video appears. Is an argument or an encrypted address passed in the GET or POST request? How do you get the video, and where is the relevant code? This is not strictly an issue, so feel free to close it if you want. Thanks.

Missing files

import requests
from lxml import etree
import fire
After opening the project in PyCharm, all of these lines are flagged as errors.

Cannot download MP4

<urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:777)>
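
CERTIFICATE_VERIFY_FAILED from urlopen often means the interpreter cannot find a usable CA bundle (or that a proxy in between presents its own certificate). As a hedged sketch, one option is to build the SSL context from the certifi bundle, which the install logs on this page show is pulled in by requests, and fall back to the system store otherwise. make_verified_context is a hypothetical helper.

```python
import ssl


def make_verified_context():
    """Create an SSL context that verifies certificates, preferring certifi's CA bundle."""
    try:
        import certifi  # installed as a dependency of requests
        return ssl.create_default_context(cafile=certifi.where())
    except ImportError:
        return ssl.create_default_context()  # fall back to the system CA store


context = make_verified_context()
print(context.verify_mode == ssl.CERT_REQUIRED)

# Usage (not executed here):
#   import urllib.request
#   urllib.request.urlopen(video_url, context=context)
```

Note that urlretrieve has no context parameter, so a download written with it would need to switch to urlopen for this to apply.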

Error

How do I choose the resolution? I get the message: 'module' object has no attribute 'request'
