Hi all,
I created this space to collaborate on projects for the Bertelsmann Technology Scholarship - AI Track.
Please feel free to collaborate in this space as you see fit. To simplify matters I have created one project, with subdirectories. Each subdirectory contains an individual project or resources.
Please be aware that although only collaborators can contribute to this repository, it is a public repository, so it can be viewed or cloned by anyone. If you believe we should change this, please let me know.
Also, if you are not a contributor, please ping me and I will add you as a contributor - provided you are registered on the Bertelsmann Technology Scholarship - AI Track.
Each of the code projects is structured according to the cookiecutter data science project template. This template organises code, data and documentation in a structured manner, which allows for flexible exploration of data and structured deployment of production-ready code.
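If you want a quick way to see how closely a subdirectory follows the template, here is a minimal sketch. The directory list is an assumption based on the classic cookiecutter data science layout and may not match every project in this repository exactly; the function name `missing_dirs` is hypothetical and only for illustration.

```python
from pathlib import Path

# Directories assumed from the classic cookiecutter data science layout.
# Individual projects in this repository may deviate from this list.
EXPECTED_DIRS = [
    "data/raw",
    "data/interim",
    "data/processed",
    "notebooks",
    "src",
    "models",
    "reports",
]


def missing_dirs(project_root):
    """Return the expected directories that do not exist under project_root."""
    root = Path(project_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    import sys

    # Use the path given on the command line, or the current directory.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for d in missing_dirs(root):
        print(f"missing: {d}")
```

Run it from the repository root with a project subdirectory as the argument. A missing directory is not necessarily a problem, since empty folders are often not committed to Git.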
All projects have READMEs, so you can read about each individual project by navigating to its subdirectory. The cor_art_dis and autompg projects are based on data from the UCI open data repository and focus on the exploratory part of the analysis, eliciting as much information from the data as possible. For projects such as these, future work would be to refactor the code for production readiness or to perform additional analysis (the README for each project typically specifies next steps). I chose Jupyter Notebook for exploration and PyCharm for development. The virtual environment files generated by PyCharm (and used by the project) are also uploaded to GitHub, so the environment can be reproduced.
Please feel free to contribute or change each of these projects as you please. We can then also discuss possible improvements.
I have also uploaded a third, more comprehensive project, battery_island, which focuses on a Data Lake architecture for a hypothetical company, based on an academic paper published in Nature. This project has a TensorFlow component for fun, but the focus will be on the Data Lake architecture.
Feel free to jump in and enjoy!