hidayat9945 / bq-create-data-transfer

Create data transfers in GCP BigQuery under a specified IAM service account. This helps when a person who created a data transfer with their own credentials leaves the job.

Python 100.00%

bq-create-data-transfer's Introduction

BIGQUERY CREATE DATA TRANSFER

What Is This For?

This program creates data transfer tasks in GCP BigQuery. Normally, when we create a new data transfer task in BigQuery, it is tied to our personal or work credentials (email). Once we leave the company and our work credentials are deactivated, the tasks created with those credentials stop running and return errors.

The goal is to create the tasks automatically with this script, using the credentials of an IAM service account instead. The tasks are then detached from our work credentials, so they keep running after those credentials are deactivated.

How it works

What Is In It?

There are 3 Python files here.

  • logger.py contains a logger object that records the logs generated by the program. If anything goes wrong, you can debug the program from the generated logs.
  • helpers.py contains functions that help create the task in BigQuery Data Transfer.
    • is_file_exist: checks whether config.json has been provided.
    • list_dts: returns the list of tasks already created in BigQuery Data Transfer.
    • create_dts_s3: creates the task in BigQuery Data Transfer.
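
As a rough illustration of what two of these helpers could look like, here is a minimal sketch. It is not the repository's actual code: the function bodies, the build_s3_params helper, and the signature of create_dts_s3 are assumptions, as are the amazon_s3 parameter names (taken from the BigQuery Data Transfer Service documentation as best recalled; verify them before use). The client call requires the google-cloud-bigquery-datatransfer package.

```python
from pathlib import Path


def is_file_exist(path: str = "config.json") -> bool:
    """Return True when the configuration file is present."""
    return Path(path).is_file()


def build_s3_params(entry: dict, access_key_id: str, secret_access_key: str) -> dict:
    """Map one config.json entry to amazon_s3 transfer parameters.

    Hypothetical helper; the parameter keys are assumptions to verify.
    """
    return {
        "destination_table_name_template": entry["destination_table"],
        "data_path": entry["s3_uri"],
        "access_key_id": access_key_id,
        "secret_access_key": secret_access_key,
        "file_format": "PARQUET",
    }


def create_dts_s3(project_id, location, entry, access_key_id, secret_access_key):
    """Create one transfer task (assumed signature, not the author's)."""
    # Imported lazily so the helpers above work without the library installed.
    from google.cloud import bigquery_datatransfer

    # The client reads the service account key file named by
    # GOOGLE_APPLICATION_CREDENTIALS, so the task is created under that
    # service account rather than a personal login.
    client = bigquery_datatransfer.DataTransferServiceClient()
    transfer_config = bigquery_datatransfer.TransferConfig(
        destination_dataset_id=entry["destination_dataset"],
        display_name=entry["display_name"],
        data_source_id="amazon_s3",
        params=build_s3_params(entry, access_key_id, secret_access_key),
    )
    parent = client.common_location_path(project_id, location)
    return client.create_transfer_config(
        parent=parent, transfer_config=transfer_config
    )
```

Keeping the parameter mapping in its own function means it can be checked without any GCP access.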

What Is Required To Setup?

  • Set up a .env file that provides the configuration the program needs to run. The file must define the variables below.
    # GCP CREDENTIALS
    GOOGLE_APPLICATION_CREDENTIALS=iam-service-credential-file
    PROJECT_ID=your-gcp-project-id
    LOCATION=your-gcp-project-location
    
    # AWS CREDENTIALS
    ACCESS_KEY_ID=your-aws-access-key
    SECRET_ACCESS_KEY=your-aws-secret-access-key
    
  • Generate an IAM service account credential file from GCP that has permission to create data transfers in BigQuery.
  • Create a config.json file that contains configuration entries like the ones below.
    [
        {
            "display_name": "display name of the task",
            "destination_dataset": "destination dataset",
            "destination_table": "destination table",
            "s3_uri": "S3 path where the data is stored in Parquet format"
        },
        {
            "display_name": "display name of the task",
            "destination_dataset": "destination dataset",
            "destination_table": "destination table",
            "s3_uri": "S3 path where the data is stored in Parquet format"
        }
    ]
    You can add more entries following the example above.
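
Before calling any GCP API, it can help to check both inputs up front. The sketch below is an assumption about how that could be done, not part of the repository: missing_env and parse_config are hypothetical names, and it assumes the .env values have already been loaded into the process environment (for example with a package such as python-dotenv).

```python
import json
import os

# Variable names from the .env example above.
REQUIRED_ENV = (
    "GOOGLE_APPLICATION_CREDENTIALS",
    "PROJECT_ID",
    "LOCATION",
    "ACCESS_KEY_ID",
    "SECRET_ACCESS_KEY",
)

# Keys every config.json entry is expected to carry.
REQUIRED_KEYS = ("display_name", "destination_dataset", "destination_table", "s3_uri")


def missing_env(environ=os.environ) -> list:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_ENV if not environ.get(name)]


def parse_config(text: str) -> list:
    """Parse config.json content and check that each entry has all four keys."""
    entries = json.loads(text)
    if not isinstance(entries, list):
        raise ValueError("config.json must contain a JSON array")
    for index, entry in enumerate(entries):
        missing = [key for key in REQUIRED_KEYS if key not in entry]
        if missing:
            raise ValueError(f"entry {index} is missing: {', '.join(missing)}")
    return entries
```

Failing early with the name of the missing variable or key is usually easier to debug than an error returned by the transfer API.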

โš ๏ธ DO NOT PUSH YOUR .env AND GCP CREDENTIAL FILES INTO REPOSITORY SINCE IT IS CREDENTIAL!!

How To Run The Program?

I suggest using a Python virtual environment to keep each project's environment separate. The Python documentation explains how to create and use one.

After activating the virtual environment, follow the steps below.

  1. Install the required dependencies, which are listed in requirements.txt, by running the command below.
    pip install -r requirements.txt
    
  2. Run the script with the command below.
    python main.py
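
The README does not describe what main.py does internally. One plausible shape for the entry point is sketched below, under loud assumptions: the run function is hypothetical, and skipping entries whose display name already exists (using the result of list_dts) is a guess at the intended behavior, not something the repository confirms. The task creator is passed in as a callable so the loop can be exercised without GCP access.

```python
def run(entries, existing_names, create):
    """Create a task for each entry whose display name is not already taken.

    entries: parsed config.json entries (list of dicts).
    existing_names: display names of tasks that already exist.
    create: callable that creates one task from one entry.
    Returns the display names that were created.
    """
    seen = set(existing_names)
    created = []
    for entry in entries:
        name = entry["display_name"]
        if name in seen:
            # Assumed behavior: do not create a second task with the same name.
            continue
        create(entry)
        seen.add(name)
        created.append(name)
    return created
```

In a real run, create would wrap a function such as create_dts_s3, and existing_names would come from list_dts.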
    
