kaggle-riiid

Kaggle "Riiid Answer Correctness Prediction" competition

Competition overview: https://www.kaggle.com/c/riiid-test-answer-prediction/overview/description

This repository contains the code used to generate and submit the solution, which ranked 56th with an 80.2% ROC AUC score. The best-performing model is a blend of a LightGBM, a CatBoost, and a Keras MLP model.
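As an illustration of what blending means here, the following is a minimal sketch of a weighted average of three models' predicted probabilities, scored with a small pairwise ROC AUC. The function names `blend` and `roc_auc`, the equal weights, and the toy predictions are all assumptions made for this example; the repository's actual blending scheme and weights may differ.

```python
from typing import List, Sequence


def blend(predictions: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Return the weighted average of per-model probability predictions."""
    if len(predictions) != len(weights):
        raise ValueError("one weight is required per model")
    total = sum(weights)
    n_rows = len(predictions[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, predictions)) / total
        for i in range(n_rows)
    ]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data only: hypothetical predictions standing in for the three models.
    labels = [1, 0, 1, 1, 0, 0]
    lgbm_preds = [0.9, 0.4, 0.6, 0.3, 0.2, 0.5]
    catboost_preds = [0.8, 0.3, 0.4, 0.7, 0.5, 0.1]
    mlp_preds = [0.7, 0.6, 0.8, 0.6, 0.3, 0.4]

    blended = blend([lgbm_preds, catboost_preds, mlp_preds], weights=[1.0, 1.0, 1.0])
    print("blended ROC AUC:", roc_auc(labels, blended))
```

The pairwise AUC is quadratic in the number of rows, so it is only suitable for small checks like this one; on real validation data a library implementation would be the practical choice.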

The repository also contains an encoder-decoder transformer-based model inspired by the SAINT+ paper. This model scored 79.6%, but was not integrated into the final solution because of the 9-hour running time limit on the submission notebook.

Running the code

  • Create a new Riiid folder, with a data subfolder inside it
  • Download and save the competition data in the newly created data folder
  • Set the RIIID_PATH environment variable to the Riiid folder path
  • Run scripts/build_validation.py
  • Run scripts/train.py
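Before running the scripts, it can help to confirm that the folder layout and the environment variable from the steps above are in place. The following is a minimal sketch of such a check. The helper name `resolve_data_dir` is hypothetical and is not part of the repository; only the `RIIID_PATH` variable and the `data` subfolder come from the steps above.

```python
import os
from pathlib import Path


def resolve_data_dir(env_var: str = "RIIID_PATH") -> Path:
    """Return the data folder under the path stored in env_var, or raise if the setup is incomplete."""
    root = os.environ.get(env_var)
    if not root:
        raise EnvironmentError(f"{env_var} is not set")
    data_dir = Path(root) / "data"
    if not data_dir.is_dir():
        raise FileNotFoundError(f"expected a data folder at {data_dir}")
    return data_dir


if __name__ == "__main__":
    # Prints the resolved folder, or fails with a message naming the missing step.
    print("competition data folder:", resolve_data_dir())
```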

By default, training runs on a small subset of the data (30k users) and fits on a 16GB machine. Training the model on the full dataset requires 256GB of RAM, and was performed on an AWS EC2 instance by running aws/train.py (running this script requires an AWS account, credentials, and the AWS Doppel package).
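One way to build a per-user subset like this is to sample user ids first and then keep every row belonging to the sampled users, so each user's history stays complete. The sketch below assumes this approach. The names `sample_users` and `filter_rows`, the seed, and the row layout (user id in the first position) are hypothetical and may not match how the repository selects its 30k users.

```python
import random
from typing import Iterable, List, Sequence, Set, Tuple


def sample_users(user_ids: Iterable[int], n_users: int, seed: int = 0) -> Set[int]:
    """Pick n_users distinct user ids reproducibly (all of them if fewer exist)."""
    unique = sorted(set(user_ids))
    if n_users >= len(unique):
        return set(unique)
    return set(random.Random(seed).sample(unique, n_users))


def filter_rows(rows: Sequence[Tuple], kept_users: Set[int]) -> List[Tuple]:
    """Keep every row whose first field (assumed to be the user id) is in kept_users."""
    return [row for row in rows if row[0] in kept_users]


if __name__ == "__main__":
    # Toy interactions: (user_id, question_id, answered_correctly).
    rows = [(1, 10, 1), (2, 11, 0), (1, 12, 1), (3, 10, 0), (2, 13, 1)]
    kept = sample_users((row[0] for row in rows), n_users=2, seed=42)
    subset = filter_rows(rows, kept)
    print(len(kept), "users kept,", len(subset), "rows kept")
```

Sampling at the user level rather than the row level matters for this kind of data, since features built from a user's past answers would be distorted if parts of their history were dropped.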

The SAINT+-like transformer model can be trained by running scripts/train_saint.py. On the full dataset, features were generated on a 128GB AWS EC2 instance, and the model was trained in a Kaggle TPUv3 notebook for 2 hours.

Contributors

fabien-vavrand, rfbr
