karmenzind / fp-server

Stars: 160 · Watchers: 5 · Forks: 36 · Size: 786 KB

Free proxy server that continuously crawls and serves proxies, built on Tornado and Scrapy. Use it to set up your own proxy pool locally.

License: MIT License

Languages: Python 99.94%, Dockerfile 0.06%
Topics: proxy, python, scrapy, spider, proxypool, tornado

fp-server's Introduction

Hello. Is there anybody in there 👋

OS:ArchLinux IDE:Vim WM:i3wm Gist.GitHub:karmenzind

fp-server's People

Contributors

dependabot[bot], karmenzind

fp-server's Issues

Dead website!!!

Testing shows that IP海 has shut down (every page returns 404/500).
Please update the site list.

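The author's program is very stable

`1
Spider name: xicidaili
Status: running
Last run time: 2018-08-17 23:42:02
Uptime: 0 years 2 days 0 hours 48 minutes 58 seconds

2
Spider name: coolproxy
Status: running
Last run time: 2018-08-17 23:42:02
Uptime: 0 years 2 days 0 hours 48 minutes 58 seconds

3
Spider name: checker
Status: running
Last run time: 2018-08-17 23:38:00
Uptime: 0 years 2 days 0 hours 53 minutes 0 seconds

4
Spider name: data5u
Status: stopped
Last run time: 2018-08-17 23:32:00
Uptime: 0 seconds

5
Spider name: yundaili
Status: running
Last run time: 2018-08-17 23:42:02
Uptime: 0 years 2 days 0 hours 48 minutes 58 seconds

6
Spider name: ip66
Status: running
Last run time: 2018-08-17 14:50:27
Uptime: 0 years 2 days 9 hours 40 minutes 33 seconds

7
Spider name: 3464
Status: running
Last run time: 2018-08-17 23:42:02
Uptime: 0 years 2 days 0 hours 48 minutes 58 seconds

8
Spider name: coderbusy
Status: running
Last run time: 2018-08-17 23:42:02
Uptime: 0 years 2 days 0 hours 48 minutes 58 seconds

9
Spider name: kuaidaili
Status: running
Last run time: 2018-08-17 13:00:11
Uptime: 0 years 2 days 11 hours 30 minutes 49 seconds

10
Spider name: mix
Status: stopped
Last run time: 2018-08-17 23:32:00
Uptime: 0 seconds

Total proxies: 265681
of which http: 130667
of which https: 135014
of which transparent: 26667
of which anonymous: 239014

[Page execution time: 7.3968110084534 s]`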

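I hope the author can write an article on the overall design of this project, or on how Tornado and Scrapy interact

I searched online for ways to embed Scrapy in a web application, but none of them were quite satisfying. Most of what I found covers Django with Scrapy, or controlling Scrapy through scrapyd.

I have read some of this project's source code, but I really don't understand the asynchronous parts, especially the get and post methods in the Tornado handlers. I couldn't find what the author actually does there, or where those attributes and methods get set.

There really is a lot I don't understand; my knowledge is still too shallow.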

Shows: internal server error

No proxies are ever crawled
/api/status/
{"code": 0, "msg": "success", "data": {"spiders": [{"status": "stopped", "name": "coderbusy", "last_start_time": "1533775630"}, {"status": "stopped", "name": "kuaidaili", "last_start_time": "1533775630"}, {"status": "stopped", "name": "mix", "last_start_time": "1533775630"}, {"status": "stopped", "name": "data5u", "last_start_time": "1533775630"}, {"status": "stopped", "name": "xicidaili", "last_start_time": "1533775079"}, {"status": "stopped", "name": "checker", "last_start_time": "1533776171"}, {"status": "stopped", "name": "coolproxy", "last_start_time": "1533775079"}, {"status": "stopped", "name": "3464", "last_start_time": "1533775630"}, {"status": "stopped", "name": "yundaili", "last_start_time": "1533775630"}, {"status": "stopped", "name": "ip66", "last_start_time": "1533775630"}], "proxies": {"total": 0, "detail": {"http": 0, "https": 0, "transparent": 0, "anonymous": 0}}}}
/api/spider/run_all/
{"code": 500, "msg": "\u670d\u52a1\u5668\u5185\u90e8\u9519\u8bef", "data": {}}
Shows: internal server error
Environment: Debian 9 x64 (stretch), Python 3.6.5
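
When reporting a problem like this, it can help to condense the /api/status/ payload into the few facts that matter. The following is only a sketch: `summarize_status` is a hypothetical helper, not part of fp-server, and the payload is an abbreviated copy of the response quoted above with two spiders kept. It assumes `last_start_time` is a Unix timestamp in seconds.

```python
import json
from datetime import datetime, timezone

# Abbreviated copy of the /api/status/ response from the report (two spiders kept).
RAW = """{"code": 0, "msg": "success", "data": {"spiders": [
  {"status": "stopped", "name": "coderbusy", "last_start_time": "1533775630"},
  {"status": "stopped", "name": "checker", "last_start_time": "1533776171"}],
  "proxies": {"total": 0, "detail": {"http": 0, "https": 0, "transparent": 0, "anonymous": 0}}}}"""


def summarize_status(payload: dict) -> dict:
    """Hypothetical helper: reduce a status payload to the key facts."""
    spiders = payload["data"]["spiders"]
    return {
        # Names of all spiders that are not running.
        "stopped": sorted(s["name"] for s in spiders if s["status"] == "stopped"),
        # Assumption: last_start_time is epoch seconds; rendered here in UTC.
        "last_start": {
            s["name"]: datetime.fromtimestamp(
                int(s["last_start_time"]), tz=timezone.utc
            ).isoformat()
            for s in spiders
        },
        "total_proxies": payload["data"]["proxies"]["total"],
    }


summary = summarize_status(json.loads(RAW))
print(summary)
```

If every spider is stopped and the total is 0, as here, the crawlers never produced anything, which points at the spider start-up path rather than at the proxy storage.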

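Is there a problem in utils/tools.py?

The function recuresive_update:
When old_value is a string and value is a list, it raises an error. When value is a tuple, the old_value string is converted to a list, which loses its original meaning.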
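
One way to avoid both problems is to merge only when the two values have compatible types and to replace the old value otherwise. The sketch below is not the project's implementation. `recursive_update` and its merge rules are assumptions chosen for illustration, and the sample keys are made up.

```python
from collections.abc import Mapping


def recursive_update(old: dict, new: Mapping) -> dict:
    """Merge `new` into `old` in place and return `old`.

    Assumed rules (not taken from the project):
    - mapping into dict: merge key by key, recursively
    - list into list: extend the old list
    - anything else, including type mismatches: the new value replaces the old
    """
    for key, value in new.items():
        old_value = old.get(key)
        if isinstance(old_value, dict) and isinstance(value, Mapping):
            recursive_update(old_value, value)
        elif isinstance(old_value, list) and isinstance(value, list):
            old_value.extend(value)
        else:
            # A string is never coerced into a list; it is simply replaced.
            old[key] = value
    return old


# Hypothetical configuration, used only to exercise the three rules.
config = {"redis": {"host": "127.0.0.1", "port": 6379}, "name": "fp-server", "spiders": ["mix"]}
recursive_update(config, {"redis": {"port": 6380}, "name": ["a", "b"], "spiders": ["ip66"]})
print(config)
```

With these rules a string paired with a list or tuple no longer raises and is never split into characters. Whether replacing is the right behaviour for a mismatch is a design choice that depends on what the caller expects.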

Can't get any IPs

main.py seems to run fine,
and the API is reachable, but the returned content always has a count of 0. Is that because main.py cannot store data in Redis?

Error on CentOS + Python 3.7.1

     File "/usr/local/lib/python3.7/site-packages/scrapy/middleware.py", line 34, in from_settings
        mwcls = load_object(clspath)
      File "/usr/local/lib/python3.7/site-packages/scrapy/utils/misc.py", line 44, in load_object
        mod = import_module(module)
      File "/usr/local/lib/python3.7/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
      File "<frozen importlib._bootstrap>", line 983, in _find_and_load
      File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
      File "<frozen importlib._bootstrap_external>", line 728, in exec_module
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "/usr/local/lib/python3.7/site-packages/scrapy/extensions/telnet.py", line 12, in <module>
        from twisted.conch import manhole, telnet
      File "/usr/local/lib/python3.7/site-packages/twisted/conch/manhole.py", line 154
        def write(self, data, async=False):
                                  ^
    SyntaxError: invalid syntax
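
The caret points at `async`, which became a reserved keyword in Python 3.7, so a Twisted release that still uses it as a parameter name cannot be imported on that interpreter. The check below is a small sketch that reproduces the failure without Twisted installed. `SNIPPET` wraps the offending line in a made-up class, and `compiles` is a hypothetical helper.

```python
import keyword
import sys

# The line from the traceback, wrapped in a class so it compiles on its own.
SNIPPET = "class Demo:\n    def write(self, data, async=False):\n        pass\n"


def compiles(source: str) -> bool:
    """Return True if `source` is valid syntax for the running interpreter."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False


print("Python", sys.version_info[:2])
print("async is a keyword:", keyword.iskeyword("async"))
print("original line compiles:", compiles(SNIPPET))
print("renamed parameter compiles:", compiles(SNIPPET.replace("async", "async_")))
```

On Python 3.7 and later the original line fails to compile and the renamed one succeeds. Upgrading Twisted to a release that supports Python 3.7 is the likely remedy, though the thread itself does not confirm a fix.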

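Thanks!

Your operating system and Python version?
CentOS 7
Python version: python34 --> python36

I successfully deployed it on a server with nginx + the code.
The one pitfall I hit was that python36-devel has to be installed; installing the Python redis package requires it.
Thanks to the author.

Also, could a latency value be added to the returned fields?
And there is another proxy site called 31代理, at http://31f.cn/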
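
For the latency field requested above, one rough approximation is the time needed to open a TCP connection to the proxy. This is a sketch under that assumption, not how fp-server checks proxies. `connect_latency_ms` is a hypothetical helper, and the demo connects to a throwaway local listener so that it runs without network access.

```python
import socket
import time
from typing import Optional


def connect_latency_ms(host: str, port: int, timeout: float = 3.0) -> Optional[float]:
    """Return the TCP connect time in milliseconds, or None if unreachable."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return round((time.perf_counter() - start) * 1000, 2)
    except OSError:  # covers refused connections and timeouts
        return None


# Demo against a local listener instead of a real proxy.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
local_port = listener.getsockname()[1]

latency = connect_latency_ms("127.0.0.1", local_port)
print({"ip": "127.0.0.1", "port": str(local_port), "latency_ms": latency})
listener.close()
```

Connect time ignores how long the proxy takes to fetch a page, so a full request through the proxy would give a more realistic figure at the cost of a slower check.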

[Suggestion] Could the API be simplified? It feels overly verbose, and many of the fields are never used.

/api/proxy/

{"code": 0, "msg": "success", "data": {"count": 1, "detail": [{"ip": "91.196.39.196", "scheme": "https", "port": "32585", "need_auth": "0", "url": "https://91.196.39.196:32585", "anonymity": "anonymous"}]}}

Could it be changed to this, or could a simple version such as /api/proxy/simple/ be added?

{"code": 0, "msg": "success", "data": {"count": 1, "detail": [{"scheme": "https", "ip": "91.196.39.196:32585"}]}}
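
Until such an endpoint exists, a client can derive the compact form from the full response. This is a minimal sketch: `simplify` is a hypothetical client-side helper, not an fp-server function, and `FULL` is the /api/proxy/ response quoted above.

```python
import json

# The full /api/proxy/ response from the suggestion.
FULL = (
    '{"code": 0, "msg": "success", "data": {"count": 1, "detail": ['
    '{"ip": "91.196.39.196", "scheme": "https", "port": "32585", '
    '"need_auth": "0", "url": "https://91.196.39.196:32585", '
    '"anonymity": "anonymous"}]}}'
)


def simplify(payload: dict) -> dict:
    """Hypothetical helper: keep only scheme and ip:port for each proxy."""
    detail = [
        {"scheme": proxy["scheme"], "ip": f'{proxy["ip"]}:{proxy["port"]}'}
        for proxy in payload["data"]["detail"]
    ]
    return {
        "code": payload["code"],
        "msg": payload["msg"],
        "data": {"count": len(detail), "detail": detail},
    }


simple = simplify(json.loads(FULL))
print(json.dumps(simple))
```

Doing this on the client keeps the existing API unchanged; a server-side /api/proxy/simple/ would save bandwidth but adds a second response format to maintain.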

Or, even more bluntly, just
