snooja / rleague-pyspark

Analysing Rocket League TPS competition data using PySpark, Docker, and PyTest

License: MIT License

Dockerfile 50.03% Makefile 12.14% Python 37.83%
docker kaggle-dataset logging pytest

rleague-pyspark

Analysing Rocket League TPS competition data using PySpark, Docker, and PyTest.

Getting Started

At the moment I'm using dev containers in VS Code to develop, but the Makefile can also be used to build images from the Dockerfile and spin up containers.

Project

Status

In Development

Completed:

  • Multi-stage Dockerfile with separate prod and dev targets, built and run using the Makefile, with pipenv output piped into pip install.
  • Common logger set up in src\shared\utils.py using config\logging.json.
  • Migrated from Make commands to .devcontainer for development, including mounting API key directories.
  • Relative imports in src folder working from script directly, main.py, and pytest.
  • Jupyter notebooks now work using /opt/venv/bin/python as the interpreter
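
As a rough illustration of the common logger item above, here is a minimal sketch of loading a JSON logging config with the standard library. It assumes the JSON file follows the `logging.config.dictConfig` schema. The function name `get_logger`, the default path, and the fallback behaviour are hypothetical and not taken from the repo's actual utils.py.

```python
import json
import logging
import logging.config
from pathlib import Path


def get_logger(name: str, config_path: str = "config/logging.json") -> logging.Logger:
    """Return a named logger, configured from a JSON dictConfig file if it exists.

    Hypothetical helper: the real src/shared/utils.py may look different.
    """
    path = Path(config_path)
    if path.is_file():
        # Assumption: the JSON file uses the standard dictConfig schema.
        with path.open(encoding="utf-8") as fh:
            logging.config.dictConfig(json.load(fh))
    else:
        # Fallback so the sketch still runs when the config file is missing.
        logging.basicConfig(level=logging.INFO)
    return logging.getLogger(name)
```

A job module could then call `logger = get_logger(__name__)` once at import time and use `logger.info(...)` as usual, so every job shares the same handlers and format.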

TODO

  • Finish extract and transform steps in notebook then migrate to etl.py
  • Load data ?somewhere? ready for analysis stage

Structure

  • .devcontainer for containerised development
  • config folder for general config and logger configs
  • data folder for raw and processed data
  • notebooks folder for Jupyter notebooks used for initial EDA and for developing src code
  • src folder for the majority of the Python code: the jobs and shared folders, and main.py
  • tests folder for pytests
  • Dockerfile contains multi-stage image builds, using Pipenv only during the build to pipe into requirements.txt for pip install
  • Pipfile and Pipfile.lock for package control
  • Makefile as alternative way to build and run dev and prod containers

Datasets

I'm using the Kaggle TPS rocket league data from here: https://www.kaggle.com/datasets/alvinleenh/tps-rocket-league-data-float16-parquet-format
