
67in / ccfcompetition


This project forked from digfound/ccfcompetition


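This is a record of my first data algorithm competition. It documents the models and implementation code used in the 2017 CCF Big Data and Computing Intelligence Contest, where I chose the topic-based text sentiment analysis task. The solution combines a sentiment lexicon with custom rules. Result: 136/796.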

Python 100.00%

ccfcompetition's Introduction

2017 CCF Big Data and Computing Intelligence Contest: Competition Record

1. Task: topic-based text sentiment analysis

2. Task background (quoted from the official contest website):

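Take online e-commerce reviews as an example. Conventional topic models mainly target long documents or collections of review sentences, and the topics they learn mostly describe the product brand as a whole. In practice, however, user reviews mostly revolve around specific product features or content topics (such as taste, service, environment, value for money, transportation, delivery, memory, battery life, ingredients, shelf life, and so on), which shows that users often care more about individual product features than about an overall product rating. Review texts also tend to be short.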

3. Task description (quoted from the official contest website):

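The contest provides anonymized e-commerce review data. Participating teams use data mining techniques and machine learning algorithms to analyze users' preferences toward topics, based on the topic features and sentiment information in each sentence, and output the result as <topic, sentiment word> ordered pairs.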

4. Scoring rules (quoted from the official contest website):

This task is evaluated with the F1-score.

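In the final evaluation, submissions are compared with the annotated data entry by entry, using "topic word, sentiment word, sentiment value" as the smallest unit. If all three agree with the answer, the entry counts as a correct sentiment match; otherwise it counts as an error. The score is computed from the following counts:
a) number of correct sentiment matches: tp
b) number of incorrect sentiment matches: fp
c) number of missed sentiment matches: fn1
d) number of extra (over-predicted) sentiment matches: fn2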

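From these values, each participant's precision (P) and recall (R) are computed, and the score follows the Fβ formula with weighting parameter β:
Precision: P = tp / (tp + fp + fn2)
Recall: R = tp / (tp + fp + fn1)
Fβ is defined as: Fβ = (1 + β²) · P · R / (β² · P + R), with β = 1.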
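The scoring rule above can be sketched in a few lines of Python. This is an illustrative sketch only, not the organizers' evaluation script: the function name `contest_f_beta`, the choice to return 0.0 when a denominator is zero, and the example counts are all assumptions made for illustration.

```python
def contest_f_beta(tp, fp, fn1, fn2, beta=1.0):
    """Return (precision, recall, f_beta) using the contest's definitions.

    tp  : correct sentiment matches
    fp  : incorrect sentiment matches
    fn1 : missed matches (counted against recall)
    fn2 : extra matches (counted against precision)
    """
    p_den = tp + fp + fn2
    r_den = tp + fp + fn1
    # Assumption: a zero denominator yields 0.0 instead of raising an error.
    precision = tp / p_den if p_den else 0.0
    recall = tp / r_den if r_den else 0.0
    f_den = beta ** 2 * precision + recall
    score = (1 + beta ** 2) * precision * recall / f_den if f_den else 0.0
    return precision, recall, score


# Made-up counts, purely for illustration.
p, r, f1 = contest_f_beta(tp=60, fp=20, fn1=20, fn2=40)
print(f"P={p:.4f} R={r:.4f} F1={f1:.4f}")
```

With β = 1 the formula reduces to the harmonic mean of P and R, which is why the task description refers to it as the F1-score.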

5. Thanks to my teammates (with their GitHub profiles):

Wang Xinri (王新日): https://github.com/xinrisanshao
Li Xuesong (李雪松): https://github.com/xs-L

