mesosxzan / data_mining_competition

This project forked from glqglq/data_mining_competition

My journey of leveling up through data mining competitions

Python 0.85% Makefile 0.02% C++ 0.82% Shell 0.02% Jupyter Notebook 98.28%

data_mining_competition

My journey of leveling up through data mining competitions.

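0. Introduction

0.1 Common competitions

  • Tianchi
  • Kaggle
  • KDD Cup
  • CCF
  • DataCastle (the ** edition of Kaggle)
  • DataFountain
  • National College Cloud Computing Application Innovation Competition
  • Teddy Cup Data Mining Challenge
  • JD.com JData Algorithm Competition
  • Tencent Social Ads College Algorithm Competition
  • Didi-Udacity Self-Driving Car Challenge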

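1. Challenger_AI: virtual stock trend prediction

  • Rank: 44/90
  • Problem: binary classification; identify erroneous stock data
  • Model: XGBoost
  • Tuning: starting from empirical default values, parameters were adjusted with grid search and cross-validation.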
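
The tuning step above (grid search scored by cross-validation) can be sketched roughly as follows. This is a minimal, library-free illustration rather than the code used in the competition: the helper names `k_fold_indices`, `grid_search_cv`, and `threshold_accuracy` are hypothetical, and a toy threshold classifier stands in for XGBoost so that the sketch runs with the standard library only.

```python
import itertools
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices once and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def grid_search_cv(param_grid, fit_and_score, X, y, k=5, seed=0):
    """Return (best_params, best_score) over the Cartesian product of param_grid.

    fit_and_score(params, X_train, y_train, X_valid, y_valid) must return a
    float where higher is better. It is a hypothetical callback that would
    wrap the real model (XGBoost in the write-up above).
    """
    folds = k_fold_indices(len(X), k, seed)
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = []
        for i, valid_idx in enumerate(folds):
            # Train on every fold except the i-th, validate on the i-th.
            train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
            scores.append(fit_and_score(
                params,
                [X[j] for j in train_idx], [y[j] for j in train_idx],
                [X[j] for j in valid_idx], [y[j] for j in valid_idx],
            ))
        score = mean(scores)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def threshold_accuracy(params, X_train, y_train, X_valid, y_valid):
    """Toy stand-in for a real model: predict 1 when the feature exceeds a threshold."""
    predictions = [1 if x > params["threshold"] else 0 for x in X_valid]
    return mean(int(p == t) for p, t in zip(predictions, y_valid))


# Synthetic data: the label is 1 exactly when the single feature exceeds 0.5.
X = [i / 100 for i in range(100)]
y = [1 if x > 0.5 else 0 for x in X]
best_params, best_score = grid_search_cv(
    {"threshold": [0.2, 0.5, 0.8]}, threshold_accuracy, X, y, k=5
)
```

In a real run the grid would start from the empirical values and be widened or narrowed around them, and `fit_and_score` would train the actual model on the training folds and return its validation metric.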

2. Kaggle: Titanic

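3. IJCAI18 search ad conversion prediction

  • Rank: 67/5204
  • Problem: binary classification; given the first 7.5 days of data (7 normal days plus 0.5 day of a shopping festival), predict the click-to-purchase conversion rate for the remaining 0.5 day
  • Models & ensembling: trained LightGBM and XGBoost on all features. All data was one-hot encoded and converted to libsvm format to try FFM. Parameters were tuned with grid search. Positive and negative samples were resampled, and models were combined by additive blending, voting, stacking, bagging, and variants of stacked ensembling.
  • Features: built four basic feature groups (User, Item, Context, Shop) as well as cross-feature groups, and for each computed count features (at different time granularities), click-through-rate features (with Bayesian smoothing), rank features, flag features, rule features, user IDF features, and so on. Decay functions were computed for each behavior and fed into the features, and GBDT features were added as well.
  • Split: the training set is the data from hours 0-10 on the morning of day 7 and the validation set is the data from hours 11-12 on the morning of day 7; the remaining data is used to generate the statistical features.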
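
The Bayesian smoothing mentioned for the click-through-rate features can be illustrated with a short sketch. It assumes a Beta(alpha, beta) prior on the rate, so the smoothed value is the posterior mean (clicks + alpha) / (impressions + alpha + beta). The write-up does not say how the prior was chosen; the method-of-moments fit below, the function names, and the sample numbers are all assumptions made for illustration.

```python
def fit_beta_prior(rates):
    """Method-of-moments estimate of Beta(alpha, beta) from observed rates.

    This is one common way to pick the prior; it is an assumption here,
    not something stated in the original write-up.
    """
    n = len(rates)
    mean = sum(rates) / n
    variance = sum((r - mean) ** 2 for r in rates) / n
    if variance <= 0 or variance >= mean * (1 - mean):
        raise ValueError("rates do not admit a method-of-moments Beta fit")
    common = mean * (1 - mean) / variance - 1
    return mean * common, (1 - mean) * common


def smoothed_rate(clicks, impressions, alpha, beta):
    """Posterior mean of the rate under a Beta(alpha, beta) prior."""
    return (clicks + alpha) / (impressions + alpha + beta)


# Hypothetical per-item history: item_id -> (clicks, impressions).
history = {"item_a": (1, 1), "item_b": (50, 1000), "item_c": (0, 0)}
alpha, beta = 2.0, 98.0  # assumed prior with mean 0.02

features = {
    item: smoothed_rate(clicks, impressions, alpha, beta)
    for item, (clicks, impressions) in history.items()
}
```

With this prior, an item seen once and clicked once gets a rate near 0.03 instead of a raw 1.0, an unseen item falls back to the prior mean of 0.02, and an item with a long history stays close to its raw rate.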
