
lianjia_scrapy's Introduction

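Lianjia Second-Hand Housing Scraper

Introduction

A scraper for Lianjia second-hand housing listings, with an ECharts visualization demo.

Python scraper

ECharts visualization

FastAPI web app

Installation

pip install -r requirements.txt

Usage

Fetch data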

python main.py

Web

python app.py

lianjia_scrapy's People

Contributors

chenwr727


lianjia_scrapy's Issues

IndexError: list index out of range

  • After the program runs for a while, it fails with this error. What is causing it? I changed the city in config.py:

    # sqlite config
    SQLITE_FILE_PATH = "./db/house_jh.db"

    # log config
    LOGS_NUM = 0

    # scrapy
    SLEEP_TIME = 0.6
    CITY = "jh"
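For orientation, the URLs in the log output suggest that the CITY value becomes the subdomain of the detail-page address. The sketch below only illustrates that relationship. `build_listing_url` is a hypothetical helper, not a function from the repository, and the URL pattern is inferred from the log lines in this report.

```python
# Values copied from the reporter's config.py.
SLEEP_TIME = 0.6
CITY = "jh"


def build_listing_url(city: str, house_id: str) -> str:
    """Hypothetical helper: build a detail-page URL in the shape seen in the logs."""
    return f"https://{city}.lianjia.com/ershoufang/{house_id}.html"


if __name__ == "__main__":
    # House ID taken from the first URL in the log output.
    print(build_listing_url(CITY, "103118510884"))
```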

2022-02-24 18:46:05,556 | INFO | main.py | getHouseByUrl | 81 | Successfully added house
2022-02-24 18:46:07,563 | INFO | main.py | getHouseByUrl | 72 | Successfully get house by url https://jh.lianjia.com/ershoufang/103118510884.html
2022-02-24 18:46:07,936 | INFO | main.py | getHouseByUrl | 81 | Successfully added house
2022-02-24 18:46:09,625 | INFO | main.py | getHouseByUrl | 72 | Successfully get house by url https://jh.lianjia.com/ershoufang/103118648540.html
2022-02-24 18:46:09,857 | INFO | main.py | getHouseByUrl | 81 | Successfully added house
2022-02-24 18:46:11,485 | INFO | main.py | getHouseByUrl | 72 | Successfully get house by url https://jh.lianjia.com/ershoufang/103116199523.html
2022-02-24 18:46:11,860 | INFO | main.py | getHouseByUrl | 81 | Successfully added house
2022-02-24 18:46:13,562 | INFO | main.py | getHouseByUrl | 72 | Successfully get house by url https://jh.lianjia.com/ershoufang/103119175931.html
2022-02-24 18:46:13,872 | INFO | main.py | getHouseByUrl | 81 | Successfully added house
2022-02-24 18:46:15,475 | INFO | main.py | getHouseByUrl | 72 | Successfully get house by url https://jh.lianjia.com/ershoufang/103119208519.html
2022-02-24 18:46:15,700 | INFO | main.py | getHouseByUrl | 81 | Successfully added house
2022-02-24 18:46:17,380 | INFO | main.py | getHouseByUrl | 72 | Successfully get house by url https://jh.lianjia.com/ershoufang/103119218540.html
2022-02-24 18:46:17,590 | INFO | main.py | getHouseByUrl | 81 | Successfully added house
2022-02-24 18:46:19,313 | INFO | main.py | getHouseByUrl | 72 | Successfully get house by url https://jh.lianjia.com/ershoufang/103119142135.html
2022-02-24 18:46:19,499 | INFO | main.py | getHouseByUrl | 81 | Successfully added house
2022-02-24 18:46:21,248 | INFO | main.py | getHouseByUrl | 72 | Successfully get house by url https://jh.lianjia.com/ershoufang/103119195516.html
2022-02-24 18:46:21,687 | INFO | main.py | getHouseByUrl | 81 | Successfully added house
2022-02-24 18:46:23,317 | INFO | main.py | getHouseByUrl | 72 | Successfully get house by url https://jh.lianjia.com/ershoufang/103119208227.html
2022-02-24 18:46:23,721 | INFO | main.py | getHouseByUrl | 81 | Successfully added house
Traceback (most recent call last):
  File "e:/PythonProject/lianjia_scrapy-master/main.py", line 133, in <module>
    main()
  File "e:/PythonProject/lianjia_scrapy-master/main.py", line 129, in main
    house.getData(max_page=max_page)
  File "e:/PythonProject/lianjia_scrapy-master/main.py", line 25, in getData
    self.getUrlsByPage(page)
  File "e:/PythonProject/lianjia_scrapy-master/main.py", line 47, in getUrlsByPage
    self.getHouseByUrl(url)
  File "e:/PythonProject/lianjia_scrapy-master/main.py", line 71, in getHouseByUrl
    des_content = self.getContentByTags(contents[5]("div", {"class", "row"}))
IndexError: list index out of range
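
The traceback shows `contents[5]` failing, which means the parsed page yielded fewer than six sections for at least one listing. Why that page differs is not visible from this report; a listing with a different layout or a non-listing response page are only guesses. A minimal sketch of a defensive lookup follows, so that one unusual page is skipped rather than stopping the whole run. `get_section` and `parse_description` are hypothetical names, not functions from the repository.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def get_section(contents: Sequence[T], index: int) -> Optional[T]:
    """Return contents[index], or None when the page has too few sections."""
    if 0 <= index < len(contents):
        return contents[index]
    return None


def parse_description(contents: Sequence[T], url: str) -> Optional[T]:
    """Hypothetical stand-in for the failing step in getHouseByUrl."""
    section = get_section(contents, 5)
    if section is None:
        # Report the page and skip it instead of raising IndexError.
        print(f"Skipping {url}: expected at least 6 sections, found {len(contents)}")
        return None
    return section
```

With a guard like this, the caller can continue with the next URL when `parse_description` returns None, and the skipped URLs in the output show which pages to inspect by hand.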
