liebkne / eventmonitor

This project forked from liuhuanyong/eventmonitor

EventMonitor

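Event monitoring based on an online news corpus, collected through the Baidu search engine by event keyword, for event storyline construction and analysis. Given an event keyword, the project collects news about the event and then mines and analyzes it.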

Project roadmap

[image: project roadmap]

Project components

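1) Collecting a diachronic topic corpus from a topic keyword

How to run: open a command-line window in the EventMonitor directory and run "scrapy crawl eventspider -a keyword=<topic keyword>", or simply run python crawl.py. After a few seconds, the crawled news files are stored in the news folder, giving you the topic set and the historical topic texts for the event.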
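The crawl command above can also be launched from a Python script. The following is a minimal sketch, assuming Scrapy is installed and the script is run from the parent of the EventMonitor directory; the helper names `build_crawl_command` and `run_crawl` are hypothetical and not part of this project.

```python
import subprocess
from typing import List


def build_crawl_command(keyword: str) -> List[str]:
    """Return the argv list for the crawl command quoted in the README."""
    keyword = keyword.strip()
    if not keyword:
        raise ValueError("keyword must not be empty")
    # A list (rather than a shell string) avoids quoting problems with
    # keywords that contain spaces or non-ASCII characters.
    return ["scrapy", "crawl", "eventspider", "-a", f"keyword={keyword}"]


def run_crawl(keyword: str, project_dir: str = "EventMonitor") -> int:
    """Run the spider inside project_dir and return the process exit code.

    project_dir is an assumption: adjust it to wherever the Scrapy
    project actually lives on your machine.
    """
    completed = subprocess.run(build_crawl_command(keyword), cwd=project_dir)
    return completed.returncode
```

Calling `run_crawl("some keyword")` would then behave like typing the command by hand, with the news files ending up in the news folder as described above.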

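2) Sentiment analysis of hot events

For the historical corpus obtained in 1), sentiment can be analyzed with a document-level sentiment analysis algorithm based on dependency semantics and a sentiment lexicon.
For this part, see my document-level sentiment analysis project DocSentimentAnalysis: https://github.com/liuhuanyong/DocSentimentAnalysis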
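To give a feel for lexicon-based document scoring, here is a deliberately simplified sketch. It is not the DocSentimentAnalysis algorithm: it replaces dependency analysis with a one-token negation check, the word lists are a made-up toy lexicon, and it assumes whitespace-separated tokens (Chinese text would first need a word segmenter).

```python
import re

# Hypothetical toy lexicon, for illustration only.
POSITIVE = {"good", "growth", "recover", "win"}
NEGATIVE = {"bad", "crisis", "loss", "fail"}
NEGATORS = {"not", "no", "never"}


def sentence_score(tokens):
    """Sum word polarities, flipping a word directly preceded by a negator."""
    score = 0
    for i, token in enumerate(tokens):
        polarity = (token in POSITIVE) - (token in NEGATIVE)
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return score


def document_score(text):
    """Average the sentence scores; 0.0 for text with no sentences."""
    parts = re.split(r"[.!?。!?]+", text.lower())
    sentences = [part for part in parts if part.strip()]
    if not sentences:
        return 0.0
    return sum(sentence_score(s.split()) for s in sentences) / len(sentences)
```

A score above zero suggests an overall positive document and a score below zero a negative one; averaging per sentence keeps long articles from dominating purely through length.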

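3) Search trends for hot events

For the event behind the corpus obtained in 1), search-trend data can be collected from the Baidu Index and the Sina Weibo Index.
For this part, see my Baidu Index collection project BaiduIndexSpyder: https://github.com/liuhuanyong/BaiduIndexSpyder
and my Weibo Index collection project WeiboIndexSpyder: https://github.com/liuhuanyong/WeiboIndexSpyder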

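4) Topic analysis of hot events

For the historical corpus obtained in 1), topics can be analyzed with LDA or K-means models.
For this part, see my topic analysis project TopicCluster: https://github.com/liuhuanyong/TopicCluster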
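As an illustration of the K-means option, the sketch below clusters tokenized documents with plain Lloyd's algorithm using only the standard library. It is not the TopicCluster implementation: the term-frequency vectors, the deterministic "first k documents" initialization, and the function names are all assumptions made to keep the example short and reproducible.

```python
from collections import Counter


def vectorize(docs):
    """Turn tokenized documents into term-frequency vectors over one vocabulary."""
    vocab = sorted({token for doc in docs for token in doc})
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([float(counts[term]) for term in vocab])
    return vocab, vectors


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, max_iterations=20):
    """Lloyd's algorithm; centroids start at the first k vectors."""
    if not 0 < k <= len(vectors):
        raise ValueError("k must be between 1 and the number of documents")
    centroids = [list(v) for v in vectors[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each vector goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

For a real corpus, TF-IDF weighting and random restarts usually give better clusters than raw counts with a fixed initialization.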

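5) Representative-text analysis of hot events

For the historical corpus obtained in 1), a cross-document TextRank algorithm can be used to compute and rank the importance of the texts in the collection.
For this part, see my text importance analysis project ImportantEventExtractor: https://github.com/liuhuanyong/ImportantEventExtractor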
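The idea of ranking whole documents with TextRank can be sketched as follows: treat each document as a graph node, weight edges by text similarity, and run a PageRank-style iteration. This is an illustration under stated assumptions, not the ImportantEventExtractor code; in particular, Jaccard overlap is swapped in here as the similarity measure, and the damping factor and iteration count are conventional defaults.

```python
def similarity(a, b):
    """Jaccard overlap between the token sets of two documents."""
    set_a, set_b = set(a), set(b)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def textrank(docs, damping=0.85, iterations=50):
    """Return one importance score per tokenized document."""
    n = len(docs)
    if n == 0:
        return []
    # Symmetric similarity matrix with a zero diagonal (no self-loops).
    weights = [
        [similarity(docs[i], docs[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            # Each neighbor j passes on its score in proportion to edge weight.
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            updated.append((1 - damping) / n + damping * rank)
        scores = updated
    return scores
```

Sorting document indices by descending score gives a candidate list of representative texts; a document that shares no vocabulary with the rest keeps only the baseline score.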

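6) Graph-based visualization of hot-event news texts

For each historical news text obtained, information extraction methods such as keyword extraction and entity recognition can be used to present the text as a graph.
For this part, see my text content visualization project TextGrapher: https://github.com/liuhuanyong/TextGrapher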
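One simple way to turn a news text into graph data is to pick keywords and link the ones that occur in the same sentence. The sketch below assumes frequency-based keywords and pre-tokenized sentences; it leaves out entity recognition and is not how TextGrapher itself works, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def top_keywords(tokens, stopwords, k=5):
    """Return the k most frequent tokens that are not stopwords."""
    counts = Counter(token for token in tokens if token not in stopwords)
    return [word for word, _ in counts.most_common(k)]


def cooccurrence_edges(sentences, keywords):
    """Count how often each pair of keywords shares a sentence.

    Keys are alphabetically ordered (a, b) tuples, so every pair is
    stored once regardless of the order in which the words appear.
    """
    keyword_set = set(keywords)
    edges = Counter()
    for sentence in sentences:
        present = sorted(keyword_set & set(sentence))
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges
```

The resulting edge counts can be fed to any graph drawing tool, with the count used as edge thickness.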

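Closing remarks

There are many approaches to event monitoring and many problems still to be solved. The methods described above are only one attempt, and the algorithms themselves still have considerable room for improvement.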

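Contact

For questions or collaboration on natural language processing, knowledge graphs, event evolutionary graphs, social computing, or language resource construction, please contact me:
Email: [email protected]
CSDN: https://blog.csdn.net/lhy2014
My natural language processing projects: https://liuhuanyong.github.io/
Liu Huanyong, Institute of Software, ** Academy of Sciences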
