
findata's Introduction

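Financial Data Crawler Project

Project Overview

This project crawls the Eastmoney (东方财富) and NetEase Finance (网易财经) websites to collect daily price data for individual stocks and stores it in a MySQL database. Python then reads the data for basic financial analysis, and the results are presented in Jupyter Notebook. After the small changes described in the usage instructions below, the code can run directly on a macOS or Linux server.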

Usage

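1. Change the file save paths. The directory structure must match this repository.

get_data_sh_ORG.py, lines 40, 56, 70, 75: change to the path where your own files are stored
get_data_sz_ORG.py, lines 42, 60, 74, 80: change to the path of your own data_sz folder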

2. Change the start and end dates for the stock data, e.g. &start=20060101&end=20210204&

get_data_sh_ORG.py, line 45
get_data_sz_ORG.py, line 48
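
The date parameters above use the YYYYMMDD format. As a minimal sketch of how such a query string could be built rather than edited by hand, the snippet below formats two dates and encodes them. The base URL, the `code` parameter name, and the `build_query` helper are hypothetical placeholders, not taken from the scripts; only `start` and `end` come from the example above.

```python
from datetime import date
from urllib.parse import urlencode

# Hypothetical placeholder; the real download URL is defined in the scripts.
BASE_URL = "https://example.com/download"


def build_query(code, start, end):
    """Return a URL whose start/end parameters are formatted as YYYYMMDD."""
    if start > end:
        raise ValueError("start date must not be after end date")
    params = {
        "code": code,  # hypothetical parameter name
        "start": start.strftime("%Y%m%d"),
        "end": end.strftime("%Y%m%d"),
    }
    return f"{BASE_URL}?{urlencode(params)}"


url = build_query("600000", date(2006, 1, 1), date(2021, 2, 4))
print(url)
```

Building the dates from `date` objects makes it harder to introduce a malformed value such as a missing leading zero.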

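3. In a terminal, run python3 get_data_sh_ORG.py and python3 get_data_sz_ORG.py

These two scripts store the individual-stock data in the MySQL database. They only need to be run once, at initial setup, and do not need to be run again afterwards.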
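
The one-time load described above could look roughly like the following sketch. It is not the project's code: it uses Python's built-in `sqlite3` as a stand-in for MySQL so that it runs without a server, and the table name, column names, and sample rows are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for one downloaded CSV file.
SAMPLE_CSV = """date,code,open,close
2021-02-03,600000,10.10,10.25
2021-02-04,600000,10.25,10.18
"""


def load_daily_bars(conn, csv_text):
    """Insert the CSV rows into a daily_bar table and return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_bar ("
        "trade_date TEXT, code TEXT, open REAL, close REAL, "
        "PRIMARY KEY (trade_date, code))"
    )
    rows = [
        (r["date"], r["code"], float(r["open"]), float(r["close"]))
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    # INSERT OR REPLACE keeps the load idempotent if it is run again.
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO daily_bar VALUES (?, ?, ?, ?)", rows
        )
    return len(rows)


conn = sqlite3.connect(":memory:")
inserted = load_daily_bars(conn, SAMPLE_CSV)
count = conn.execute("SELECT COUNT(*) FROM daily_bar").fetchone()[0]
print(inserted, count)
```

A composite primary key on date and code is one way to keep an accidental second run from duplicating rows.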

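4. In a terminal, run crontab -e
Paste in the contents of the crontab file and save. update.py, update_new_stock_sh.py, and update_new_stock_sz.py will then run automatically on a schedule.
The entries update the database at 21:30 every day, update the stock pool at 21:40 every day, and back up the database at 22:30 every day.
Use crontab -l to view the installed entries.

macOS:
sudo /usr/sbin/cron start
sudo /usr/sbin/cron restart
sudo /usr/sbin/cron stop

Ubuntu:
sudo /etc/init.d/cron start
sudo /etc/init.d/cron stop
sudo /etc/init.d/cron restart

CentOS:
/sbin/service crond start // start the service
/sbin/service crond stop // stop the service
/sbin/service crond restart // restart the service
/sbin/service crond reload // reload the configuration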
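
To make the schedule concrete, here is a small sketch that formats daily crontab entries for the three update scripts at the times stated above. The `cron_line` helper and the `/path/to/findata` directory are hypothetical, and the assignment of scripts to times is an assumption based on their names; the repository's own crontab file remains the authoritative version. The 22:30 backup entry is left out because the backup command is not named here.

```python
def cron_line(hour, minute, command):
    """Format a crontab entry that runs `command` every day at hour:minute."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute must be 0-59")
    # Crontab field order: minute hour day-of-month month day-of-week command
    return f"{minute} {hour} * * * {command}"


PROJECT_DIR = "/path/to/findata"  # hypothetical install location

jobs = [
    (21, 30, "update.py"),               # daily database update
    (21, 40, "update_new_stock_sh.py"),  # daily Shanghai stock pool update
    (21, 40, "update_new_stock_sz.py"),  # daily Shenzhen stock pool update
]
lines = [
    cron_line(hour, minute, f"python3 {PROJECT_DIR}/{script}")
    for hour, minute, script in jobs
]
print("\n".join(lines))
```

Note that the minute field comes before the hour field, so 21:30 is written as `30 21`.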

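Project Files

Files:
get_data_sh_ORG.py: fetches Shanghai A-share stock data and stores it in the database (run once only)
get_data_sz_ORG.py: fetches Shenzhen A-share stock data and stores it in the database (run once only)
log.txt: the author's personal development log, recording ideas as they come up
sha_list.csv: crawled list of Shanghai A-share stock codes, for quick lookup; also stored in the database
sza_list.csv: crawled list of Shenzhen A-share stock codes, for quick lookup; also stored in the database
update.py: daily database update
update_new_stock_sh.py: daily Shanghai A-share stock pool update
update_new_stock_sz.py: daily Shenzhen A-share stock pool update
crontab: Linux crontab entries

Folders:
data_sh: Shanghai A-share stock data downloaded from NetEase Finance; can be deleted once loaded into the database; initially empty
data_sz: Shenzhen A-share stock data downloaded from NetEase Finance; can be deleted once loaded into the database; initially empty
save: miscellaneous test code
backup: location for database backup files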
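
A daily stock pool update presumably has to work out which codes are newly listed. The sketch below shows one simple way to do that with a set difference; it is an illustration under that assumption, not the logic of update_new_stock_sh.py or update_new_stock_sz.py, and the `find_new_codes` helper and sample codes are hypothetical.

```python
def find_new_codes(known_codes, latest_codes):
    """Return, in sorted order, the codes in latest_codes missing from known_codes."""
    known = set(known_codes)
    return sorted(code for code in set(latest_codes) if code not in known)


# Hypothetical data: codes already stored vs. codes from today's crawl.
known = ["600000", "600004"]
latest = ["600004", "600006", "600000"]
new_codes = find_new_codes(known, latest)
print(new_codes)
```

Codes are kept as strings so that leading zeros, as in Shenzhen codes, are preserved.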

Planned Updates and Ideas

1. Multithreaded crawling

2. Tuning database storage and cache parameters in my.cnf
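
For the multithreading idea, one possible approach is a thread pool from the standard library, sketched below. The `fetch_stock` function is a hypothetical stand-in that performs no network access; in practice it would be replaced by the real per-stock download, and the worker count would need to respect the target sites' rate limits.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_stock(code):
    """Hypothetical stand-in for a download; returns the code and a fake size."""
    return code, len(code)


def fetch_all(codes, max_workers=4):
    """Run fetch_stock over all codes concurrently and collect the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input codes.
        return dict(pool.map(fetch_stock, codes))


results = fetch_all(["600000", "600004", "000001"])
print(results)
```

Threads suit this workload because crawling is dominated by waiting on network I/O rather than by CPU work.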
