Hello, I am Stefen 👋

I am a data engineer 💻, photographer 📸, and designer 🎨!

Hello and welcome to my GitHub! With over eight years of experience as a data engineer, I have gained deep expertise in designing, building, and managing data pipelines and infrastructure. My work supports a range of analytics and business intelligence needs and draws on technologies and tools including SQL, Python, ETL frameworks, and big data technologies such as Hadoop and Spark.

My GitHub account hosts a range of projects and code examples that demonstrate my expertise in the field of data engineering. These projects not only showcase my skills in extracting, transforming, and loading data from various sources but also reflect my abilities in data modeling, visualization, and reporting.

I am committed to continuous learning and growth, and I am open to collaborating with other professionals in the field. I invite you to join me on GitHub to explore my work, exchange ideas, and share knowledge.

🤝 Connect with me:

Feel free to contact me if you have any questions or comments!

🔭 I am currently working on

  • Redesigning my old projects
  • Data engineering projects
  • Data analyses
  • DevOps projects

🌱 I am currently learning

  • 📱 Machine Learning
  • AI (Artificial Intelligence)
  • CloudSec (Cloud Security)
  • Kubernetes

💼 Technical Skills

Stefen's Projects

-google-analytics-360

Welcome to the Google Analytics 360 Dataset Project! This repository is designed for anyone interested in working with realistic Google Analytics data. Whether you're a data scientist, a student, or a marketing analyst

adv_nlp_workshop_odsc_europe22

Extensive tutorials for the Advanced NLP Workshop in Open Data Science Conference Europe 2020. We will leverage deep learning and deep transfer learning to solve popular tasks in NLP including Classification, Information Retrieval, Sentiment Analysis, Search Engines, Clustering, Paraphrase Mining, Summarization, Language Translation, Q&A systems

airflow_etl

A pipeline for updating data between OLTP and OLAP environments.

big-o-algorithm

We'll explain Big O notation and use real-world Python examples to illustrate how it applies to various time complexities.
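
As a rough illustration of the kind of comparison such examples usually make (a minimal sketch, not code taken from the repository; the function names are hypothetical), the snippet below contrasts a linear scan, a binary search, and a pairwise check:

```python
from bisect import bisect_left


def linear_search(items, target):
    """O(n): in the worst case every element is inspected once."""
    for index, value in enumerate(items):
        if value == target:
            return index
    return -1


def binary_search(sorted_items, target):
    """O(log n): the search range is halved at each step (input must be sorted)."""
    index = bisect_left(sorted_items, target)
    if index < len(sorted_items) and sorted_items[index] == target:
        return index
    return -1


def has_duplicate_pair(items):
    """O(n^2): every pair of elements is compared."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


data = list(range(0, 100, 2))  # sorted even numbers 0..98
print(linear_search(data, 42))
print(binary_search(data, 42))
print(has_duplicate_pair(data))
```

Both searches return the same index here; the difference only shows up in how the number of steps grows as the input gets larger.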

devops-bash-script

This repository contains a collection of bash scripts for common DevOps tasks, such as installing software, setting up environments, and managing resources.

docker-stack

A directory of docker-compose files for quickly starting an infrastructure.

docsearch

Our project is a testament to this need, offering a comprehensive solution that combines modern technologies and architectures to create a powerful document search engine. This engine is not just a tool but a sophisticated ecosystem designed to handle complex data processing and retrieval tasks.

etl_onaws_deploy_with_terraform

The objective of this guide is to demonstrate how to automate the deployment of a data pipeline on AWS using Terraform. The pipeline will utilize AWS services such as Lambda, Glue, Crawler, Redshift, and

eventmusic

EventMusic Producer is a Dockerized application designed to read data and write it to a Kafka topic, using Avro schemas for data serialization. It integrates with Kafka and the Schema Registry to manage the flow of music event data.

gmail-to-mongodb-script

This script facilitates the automation of fetching emails from a user's Gmail account and storing them into a MongoDB database. The emails fetched are filtered by specific labels such as Promotions, Social, Updates, and Forums. The script is intended to run continuously, checking for new emails every minute.
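
The polling behaviour described above could be structured roughly as follows. This is a sketch under stated assumptions, not the script itself: `fetch_emails` and `store` are hypothetical callables standing in for the Gmail and MongoDB calls, and each email is assumed to be a dict with an `"id"` key.

```python
import time

# Labels named in the description above.
LABELS = ("Promotions", "Social", "Updates", "Forums")


def poll_once(fetch_emails, store, seen_ids, labels=LABELS):
    """Fetch emails for each label and store the ones not seen before.

    fetch_emails(label) and store(email) are hypothetical stand-ins for
    the real Gmail and MongoDB calls. Returns the number of new emails.
    """
    new_count = 0
    for label in labels:
        for email in fetch_emails(label):
            if email["id"] in seen_ids:
                continue  # already stored during an earlier pass
            store(email)
            seen_ids.add(email["id"])
            new_count += 1
    return new_count


def run_forever(fetch_emails, store, interval_seconds=60):
    """Check for new emails once per interval (every minute by default)."""
    seen_ids = set()
    while True:
        poll_once(fetch_emails, store, seen_ids)
        time.sleep(interval_seconds)
```

Tracking seen IDs in memory is only one way to avoid duplicates; a unique index in the database would also survive restarts.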

how-to-automatically-deploy-a-flask-application-on-an-ec2-instance-with-a-bash-script

The main motivation for this mini-project is to get familiar with using Bash Scripting and the AWS CLI to automate command line tasks. This particular repo contains a configuration script that automatically creates an EC2 instance, accesses it via SSH, installs dependencies and hosts a simple Flask application using the image taken from Docker Hub.

ia_data_pipeline

The goal is to develop an intuitive platform where users can search for Airbnb apartments based on a target city, budget, and duration of stay, all powered by the intelligent language model, GPT-3.

iceberg-dbt-trino-hive-modern-open-source-data-stack

To provide a deeper understanding of how the modern, open-source data stack consisting of Iceberg, dbt, Trino, and Hive operates within a music streaming platform, let’s delve into the detailed workflow and benefits of each component.

ingest-data

Big data application for multi-source data ingestion

kafka-pipeline

In the following post, we will learn how to build a data pipeline using a combination of open-source software (OSS), including Debezium, Apache Kafka, and Kafka Connect.
