kaggle-for-korean's Introduction

kaggle-for-korean in python

A (Python) Kaggle tutorial for Koreans

  • Just getting started with Kaggle? It is a good idea to study the Terminology section below, along with numpy and pandas, before diving in :)

Author Profile

Contents

  1. Terminology: explanations of the terms needed to understand Kaggle kernels and discussions
  2. EDA: data visualization and Feature Engineering (the author notes there is no good Korean equivalent for this term)
  3. Model: an introduction to machine learning and deep learning / AI models

Terminology

: terms used on Kaggle

  • CV = Cross-validation score
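    • The score obtained by splitting the training dataset into training and validation folds and running cross-validation on the model.
    • A score from a single validation split may well be overfit to that split; the CV approach gives a more objective score.
    • It is still a local score, though, not one computed on the test dataset. When CV is much higher than LB (for a metric where higher is better), this is treated as a sign of overfitting.
    • In Discussion threads you will often see posts reporting that CV is one value while LB is another.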
  • LB = Leaderboard score
  • DAE = Denoising autoencoder
  • VAE = Variational autoencoder
  • OverSampling/UnderSampling
  • OOF = Out-Of-Fold
  • leak
  • Stacking
  • Stacking2
  • Target encoding
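
The CV and OOF entries above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data is synthetic, the model is a plain least-squares fit standing in for whatever model a competition would use, RMSE is an arbitrary choice of metric, and the helper names (kfold_indices, cross_validate, and so on) are hypothetical. The training set is split into K folds, each fold is predicted by a model trained on the other folds, and the per-fold scores are averaged into the CV score. The predictions collected this way are the out-of-fold (OOF) predictions.

```python
import numpy as np

def kfold_indices(n_samples, n_splits=5, seed=0):
    """Yield (train_idx, valid_idx) pairs for a shuffled K-fold split."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), n_splits)
    for i in range(n_splits):
        valid_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(n_splits) if j != i])
        yield train_idx, valid_idx

def fit_linear(X, y):
    """Least-squares fit with an intercept column (stand-in for a real model)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Predict with the coefficients returned by fit_linear."""
    return np.column_stack([X, np.ones(len(X))]) @ coef

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def cross_validate(X, y, n_splits=5, seed=0):
    """Return the per-fold RMSE scores and the out-of-fold predictions."""
    oof = np.zeros(len(y))
    scores = []
    for train_idx, valid_idx in kfold_indices(len(y), n_splits, seed):
        coef = fit_linear(X[train_idx], y[train_idx])
        # Each row is predicted by a model that never saw it during fitting.
        oof[valid_idx] = predict_linear(coef, X[valid_idx])
        scores.append(rmse(y[valid_idx], oof[valid_idx]))
    return scores, oof

# Synthetic regression data, for illustration only.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

scores, oof = cross_validate(X, y)
cv_score = float(np.mean(scores))
print("fold RMSEs:", np.round(scores, 3))
print("CV score (mean RMSE):", round(cv_score, 3))
```

The CV score printed here is the local number people quote next to their LB score. The oof array has one prediction per training row and is what a stacking setup would typically feed to a second-level model.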

EDA

: Exploratory Data Analysis

  • Main Point
    • Data Analysis
    • Inlier/Outlier
    • Feature Engineering
      • Feature Selection
      • Dimension Reduction
      • Feature Generation
  • Libraries
    • Visualization
      • matplotlib
      • bokeh
      • seaborn (as sns)
        • heatmap
          • Often used to draw the correlation matrix between variables/features.
        • pointplot, boxplot, lmplot (a scatter plot with a fitted regression line)
      • plotly
        • dots, lines, bars, pie, ...
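
As a rough sketch of the heatmap use case mentioned above, the following computes a correlation matrix between features and passes it to seaborn's heatmap. The three synthetic features, the function names, and the output file name are assumptions made for illustration. The plotting libraries are imported inside the plotting function so that the numeric part runs even where they are not installed.

```python
import numpy as np

def correlation_matrix(X):
    """Pearson correlation between the columns of X (rows are samples)."""
    return np.corrcoef(X, rowvar=False)

def plot_correlation_heatmap(corr, names, path="corr_heatmap.png"):
    """Draw the correlation matrix as an annotated heatmap and save it to path."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display is needed
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax = plt.subplots(figsize=(4, 3))
    sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm",
                vmin=-1, vmax=1, xticklabels=names, yticklabels=names, ax=ax)
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)

# Synthetic features, for illustration only.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = 2.0 * a + rng.normal(scale=0.5, size=500)  # built to correlate with a
c = rng.normal(size=500)                       # generated independently of a
X = np.column_stack([a, b, c])
names = ["a", "b", "c"]

corr = correlation_matrix(X)
print(np.round(corr, 2))
# plot_correlation_heatmap(corr, names)  # uncomment to save the figure
```

In the resulting figure, strongly correlated pairs such as a and b stand out as saturated cells, which makes redundant features easy to spot before feature selection.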

Model

  • Machine Learning
    • scikit-learn
      • Regression
      • Classification
    • LightGBM
    • CatBoost
  • Deep Learning
    • TensorFlow
    • PyTorch
    • Caffe
    • Theano
    • Chainer
    • Keras
      • Keras seems to be the strong performer on Kaggle, because its code is concise.

kaggle-for-korean's People

Contributors

seriousran
