DE_project_yuqing

DataEngineering_tutoring_project_home

This repo contains Spark code and Hive Query Language (HQL) scripts for data engineering practice.

Must read:

  1. Hadoop Cluster https://www.databricks.com/glossary/hadoop-cluster#:~:text=Hadoop%20clusters%20are%20composed%20of,running%20on%20a%20separate%20machine.
  2. Hive https://aws.amazon.com/big-data/what-is-hive/
  3. HDFS (Hadoop Distributed File System) https://www.databricks.com/glossary/hadoop-distributed-file-system-hdfs#:~:text=HDFS%20(Hadoop%20Distributed%20File%20System,handle%20and%20store%20big%20data.

How to use a key pair to log in to the edge node:

  1. Generate a key pair:
  • For Linux/Mac: Use the following command in the terminal:

     ssh-keygen -t rsa -b 4096 -C "your_email@example.com"
    

This command creates a key pair in the default location: the private key at ~/.ssh/id_rsa and the public key at ~/.ssh/id_rsa.pub.

  • For Windows: Use a tool like PuTTYgen to generate a key pair. Save the public and private keys (public_key.pem and private_key.ppk) to a safe location.
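
On Linux/Mac, key generation can also be scripted so that it does not prompt for a file path. The sketch below is a minimal example and not part of this repo: the helper names `make_keypair` and `show_fingerprint` are hypothetical, and the empty passphrase (`-N ""`) is an assumption made only to keep the example non-interactive. Set a passphrase for real use.

```shell
#!/bin/sh
# make_keypair KEY_PATH COMMENT   (hypothetical helper, for illustration only)
# Creates a 4096-bit RSA key pair at KEY_PATH (private) and KEY_PATH.pub
# (public), and refuses to overwrite an existing private key.
make_keypair() {
    key_path="$1"
    comment="$2"
    if [ -e "$key_path" ]; then
        echo "refusing to overwrite existing key: $key_path" >&2
        return 1
    fi
    mkdir -p "$(dirname "$key_path")"
    # -q: quiet; -N "": empty passphrase (assumption for this demo only);
    # -f: output file, so ssh-keygen does not ask where to save the key.
    ssh-keygen -q -t rsa -b 4096 -C "$comment" -N "" -f "$key_path"
}

# show_fingerprint PUB_KEY_PATH   (hypothetical helper)
# Prints the key size, fingerprint, comment, and key type of a public key.
show_fingerprint() {
    ssh-keygen -l -f "$1"
}
```

For example, `make_keypair "$HOME/.ssh/id_rsa" "your_email@example.com"` followed by `show_fingerprint "$HOME/.ssh/id_rsa.pub"` creates the pair and prints its fingerprint, which is handy for confirming later that the server holds the right key.
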
  1. On the remote server, go to the ~/.ssh folder, create an authorized_keys file if it does not already exist, and paste your public key into it.
  2. Ensure the proper permissions are set on your private key file:
  • For Linux/Mac:

    chmod 600 ~/.ssh/id_rsa

  • For Windows: No chmod step is needed for private_key.ppk. PuTTY does not check the file's permissions, so store the key in a folder that only your account can read.
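
The remote-server side of the setup (adding the public key and tightening permissions) can be sketched as below for a Linux server. This is an illustrative helper, not code from this repo: `install_pubkey` and its arguments are hypothetical names. It appends the public key to `authorized_keys` only if that exact line is not already present, and sets the directory and file modes that OpenSSH's default strict-mode checks expect.

```shell
#!/bin/sh
# install_pubkey PUB_KEY_FILE [SSH_DIR]   (hypothetical helper)
# Run on the remote server. SSH_DIR defaults to ~/.ssh.
install_pubkey() {
    pub_key_file="$1"
    ssh_dir="${2:-$HOME/.ssh}"
    auth_file="$ssh_dir/authorized_keys"

    # The .ssh directory must be accessible by the owner only.
    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"

    # Create authorized_keys if needed and restrict it to the owner.
    touch "$auth_file"
    chmod 600 "$auth_file"

    # Read the single-line public key and append it only if that exact
    # line is missing (-x: whole line, -F: fixed string, -q: quiet).
    key_line="$(cat "$pub_key_file")"
    if ! grep -qxF -- "$key_line" "$auth_file"; then
        printf '%s\n' "$key_line" >> "$auth_file"
    fi
}
```

If `ssh-copy-id` is available on your local machine and password login is still enabled on the server, `ssh-copy-id -i ~/.ssh/id_rsa.pub user@edge_node_ip` performs the same installation in one step.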

  1. Connect to the edge node using your private key:
  • For Linux/Mac: Use the following command in the terminal, replacing "user" with your username and "edge_node_ip" with the IP address or hostname of the edge node:

    ssh -i ~/.ssh/id_rsa user@edge_node_ip

  • For Windows: Open PuTTY and enter the edge node's IP address or hostname in the "Host Name (or IP address)" field. In the left pane, navigate to Connection > SSH > Auth (in recent PuTTY versions, Connection > SSH > Auth > Credentials), and click the "Browse" button to select your private_key.ppk file. Click "Open" to initiate the connection.

  1. If prompted, accept the edge node's host key by typing "yes" (Linux/Mac) or clicking "Yes" (Windows). You should now be connected to the edge node.
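
To avoid retyping the key path and username on Linux/Mac, the connection details can be stored in an OpenSSH client config file. The sketch below is an optional convenience that the steps above do not require. The helper name `add_ssh_host`, the alias `edge`, and the values passed to it are placeholders, not names from this repo.

```shell
#!/bin/sh
# add_ssh_host ALIAS HOSTNAME USER KEY_FILE [CONFIG_FILE]   (hypothetical helper)
# Appends a Host entry to an OpenSSH client config (default: ~/.ssh/config)
# unless an entry with the same alias line already exists.
add_ssh_host() {
    alias_name="$1"
    host_name="$2"
    user_name="$3"
    key_file="$4"
    config_file="${5:-$HOME/.ssh/config}"

    mkdir -p "$(dirname "$config_file")"
    touch "$config_file"
    chmod 600 "$config_file"

    # Do nothing if a "Host ALIAS" line is already present.
    if grep -qxF -- "Host $alias_name" "$config_file"; then
        echo "Host $alias_name already exists in $config_file" >&2
        return 0
    fi

    # IdentitiesOnly makes ssh offer only the key named in IdentityFile.
    cat >> "$config_file" <<EOF

Host $alias_name
    HostName $host_name
    User $user_name
    IdentityFile $key_file
    IdentitiesOnly yes
EOF
}
```

After running, for example, `add_ssh_host edge edge_node_ip user ~/.ssh/id_rsa`, the command `ssh edge` is equivalent to `ssh -i ~/.ssh/id_rsa user@edge_node_ip`.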

For GCP-specific instructions, follow the link below:

https://www.cyberciti.biz/faq/google-cloud-compute-engin-ssh-into-an-instance-from-linux-unix-appleosx/
