Name: Abdullah AlGhamdi
Type: User
Company: Innovation Lab
Bio: Machine Learning Engineer and Data Scientist by self-reinvention. Electrical, Communication, and Digital Signal Processing Engineer by academic training.
Twitter: Abdullusive
Location: Riyadh, Saudi Arabia
Blog: abdullahalghamdi.sa
Abdullah AlGhamdi's Projects
Coding exercises for the Natural Language Processing concentration, part of Udacity's AIND program.
GA DSI capstone project
The PyTorch-based audio source separation toolkit for researchers
The hackable text editor
Build an ETL pipeline that extracts data from S3, stages it in Redshift, and transforms it into a set of dimensional tables so the analytics team can continue finding insights into what songs users are listening to.
Develop a data model designed for Online Analytical Processing (OLAP) to support queries analyzing US immigration data. The model complements the US immigration data with US city demographics data (loaded via ETL into a fact table) and adds dimension tables for arrival ports, countries, visa types, etc.
In this project, an ETL pipeline is built for a data lake. The data resides in S3, in a directory of JSON logs on user activity on the app, as well as a directory with JSON metadata on the songs in the app. The data is loaded from S3, processed into analytics tables using Spark, and loaded back into S3. The Spark process is deployed on a cluster using AWS.
Create a NoSQL database and ETL pipeline designed to optimize queries for understanding what songs users are listening to. Model the data in Apache Cassandra to allow for specific queries provided by the analytics team at Sparkify.
Create relational and NoSQL data models to fit the diverse needs of data consumers. Use ETL to build databases in PostgreSQL and Apache Cassandra.
Schedule, automate, and monitor data pipelines using Apache Airflow. Run data quality checks, track data lineage, and work with data pipelines in production.
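As a rough illustration of the kind of data quality check the last project mentions, here is a minimal sketch in plain Python. It is not taken from the repository and does not use Airflow; the function name `check_row_counts`, the table names, and the `min_rows` threshold are all hypothetical.

```python
from typing import Dict, List


def check_row_counts(row_counts: Dict[str, int], min_rows: int = 1) -> List[str]:
    """Return one failure message per table that has fewer than min_rows rows."""
    failures = []
    # Sort by table name so the messages come back in a stable order.
    for table, count in sorted(row_counts.items()):
        if count < min_rows:
            failures.append(
                f"{table}: expected at least {min_rows} rows, found {count}"
            )
    return failures


# Hypothetical counts, as if gathered by running COUNT(*) against each table.
example_counts = {"songplays": 320, "users": 104, "songs": 0}
failures = check_row_counts(example_counts)
```

In a scheduled pipeline, a non-empty `failures` list would typically be turned into a raised exception so that the task is marked as failed and the problem shows up in monitoring.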