
brain-opera-gpt2-deployment's Introduction

Local development

Make sure the model (the `checkpoint` directory) is in this folder.

# Install required libraries
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Start server
gunicorn -b :8000 server:app
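
Once gunicorn is running, it can help to confirm that something is answering on port 8000 before moving on. The sketch below is one way to do that from a second terminal. `wait_for_server` is a hypothetical helper, not part of this repo, and it assumes `curl` is installed. It treats any HTTP response as success, since the routes defined in `server.py` are not listed here.

```shell
# wait_for_server URL [TRIES] [DELAY]
# Hypothetical helper: polls URL with curl until any HTTP response
# arrives or TRIES attempts have been made.
# Returns 0 once the server answers, 1 otherwise.
wait_for_server() {
    wfs_url=$1
    wfs_tries=${2:-10}
    wfs_delay=${3:-1}
    wfs_i=0
    while [ "$wfs_i" -lt "$wfs_tries" ]; do
        # Without --fail, curl exits 0 for any HTTP response,
        # including 404, so this only checks reachability.
        if curl --silent --output /dev/null --max-time 5 "$wfs_url"; then
            return 0
        fi
        wfs_i=$((wfs_i + 1))
        sleep "$wfs_delay"
    done
    return 1
}

# Example (run while gunicorn is up):
#   wait_for_server http://localhost:8000/ && echo "server is up"
```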

Deploying model on GCP virtual machine

Steps to deploy

  1. Create a Google Cloud Platform account

  2. Install Cloud SDK

Follow the steps for installing Cloud SDK: https://cloud.google.com/sdk/install

  3. After installing Cloud SDK, initialize it
# Run the following command to initialize
# This will ask you to link your Google Cloud Platform account
gcloud init
  4. Create and configure GCP project
# This command creates a project with the ID `brain-opera-deployment`
gcloud projects create brain-opera-deployment

# Set your working project
gcloud config set project brain-opera-deployment

# If the above project ID is taken, choose a different one
# Note: project IDs need to be unique across GCP

# Set compute zone
gcloud config set compute/zone asia-southeast1-b
  5. Set quota for GPU

First, open the Compute Engine tab in the Cloud Console so that Compute Engine is initialized for the project. Wait for the initialization to complete.

The default GPU quota for a GCP account with free credits is 0. A request for an increase in this quota is necessary to use GPUs.

Go to https://console.cloud.google.com/iam-admin/quotas. Make sure that brain-opera-deployment is selected as the project in the header, as shown in the image below. Filter the metric by GPUs (all regions). If the limit is 0, tick the checkbox and click the Edit Quotas button at the top.

[Image: GCP quota]

Fill in the necessary info and request that the limit be raised to 1. You will receive an email about the quota request. It usually takes a few hours for the request to be granted.
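
The quota can also be read from the command line instead of the console. This is a rough sketch, not part of the original steps. It assumes that `gcloud compute project-info describe` lists the project quotas in its YAML output with `limit`, `metric` and `usage` keys, and that the GPU entry is named `GPUS_ALL_REGIONS`. `gpu_quota_limit` is a hypothetical helper that picks the limit out of that output.

```shell
# gpu_quota_limit: hypothetical helper.
# Reads the YAML quota list on stdin and prints the limit of the
# GPUS_ALL_REGIONS entry (assumed metric name), if there is one.
gpu_quota_limit() {
    awk '
        # A line starting with "- " opens a new quota entry:
        # flush the previous entry, then parse this line as a key.
        /^- / {
            if (metric == "GPUS_ALL_REGIONS") print limit
            metric = ""; limit = ""
            sub(/^- /, "  ")
        }
        $1 == "limit:"  { limit = $2 }
        $1 == "metric:" { metric = $2 }
        END { if (metric == "GPUS_ALL_REGIONS") print limit }
    '
}

# Example (needs an initialized Cloud SDK):
#   gcloud compute project-info describe \
#       --project=brain-opera-deployment --format='yaml(quotas)' \
#       | gpu_quota_limit
```

If this prints 0 or 0.0, the quota increase has not been granted yet.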

  6. Set firewall rules
gcloud compute --project=brain-opera-deployment firewall-rules create brain-opera-port8000 \
        --direction=INGRESS \
        --priority=1000 \
        --network=default \
        --action=ALLOW \
        --rules=tcp:8000 \
        --source-ranges=0.0.0.0/0 \
        --target-tags=port8000
  7. Create VM
export IMAGE_FAMILY="tf-1-15-cu100"
export ZONE="asia-southeast1-b"
export INSTANCE_NAME="brain-opera-gpt2"
export INSTANCE_TYPE="n1-standard-2"

gcloud compute instances create $INSTANCE_NAME \
        --zone=$ZONE \
        --image-family=$IMAGE_FAMILY \
        --image-project=deeplearning-platform-release \
        --maintenance-policy=TERMINATE \
        --accelerator="type=nvidia-tesla-p4,count=1" \
        --machine-type=$INSTANCE_TYPE \
        --boot-disk-size=200GB \
        --metadata="install-nvidia-driver=True" \
        --tags=port8000
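
Before copying the model over, it can be useful to confirm that the instance actually came up. A minimal sketch, assuming the Cloud SDK is initialized as in the earlier steps: `instance_status` is a hypothetical wrapper around `gcloud compute instances describe` that prints only the status field.

```shell
# instance_status NAME ZONE
# Hypothetical wrapper: prints the status of a Compute Engine
# instance, for example RUNNING or TERMINATED.
instance_status() {
    gcloud compute instances describe "$1" --zone="$2" --format='get(status)'
}

# Example, using the variables exported above:
#   instance_status "$INSTANCE_NAME" "$ZONE"
```

A status of RUNNING means the VM has booted. The NVIDIA driver installation requested through the metadata flag may still take a few more minutes after that.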
  8. Set up VM
# Copy model from local directory to VM
gcloud compute scp --recurse ./checkpoint brain-opera-gpt2:~/checkpoint
# SSH into machine
gcloud compute --project "brain-opera-deployment" ssh --zone "asia-southeast1-b" "brain-opera-gpt2"

# Clone repo
git clone https://github.com/jonheng/brain-opera-gpt2-deployment.git

# Move model into repo
mv checkpoint/ brain-opera-gpt2-deployment/checkpoint

# Go to cloned repo
cd brain-opera-gpt2-deployment/

# Install python3-venv, enter Y when prompted
sudo apt-get install python3-venv

# Install required libraries
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Start server
gunicorn -b :8000 server:app
  9. Final test

Look up the external IP of your instance

gcloud compute instances list

Test that the connection works
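
One way to test the connection from your local machine is sketched below. `external_ip` and `server_url` are hypothetical helpers, and the sketch assumes `curl` is installed. It requests the root path `/`, which may not be a route that `server.py` defines. Any HTTP response, even a 404, still shows that port 8000 is reachable through the firewall rule.

```shell
# external_ip NAME ZONE
# Hypothetical helper: prints the external (NAT) IP of an instance.
external_ip() {
    gcloud compute instances describe "$1" --zone="$2" \
        --format='get(networkInterfaces[0].accessConfigs[0].natIP)'
}

# server_url IP [PORT]
# Hypothetical helper: builds the URL of the server, port 8000 by default.
server_url() {
    printf 'http://%s:%s/\n' "$1" "${2:-8000}"
}

# Example (with gunicorn still running on the VM):
#   ip=$(external_ip brain-opera-gpt2 asia-southeast1-b)
#   curl --include --max-time 10 "$(server_url "$ip")"
```

If curl times out instead, check that the instance has the port8000 tag and that the firewall rule from the earlier step exists.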

  10. Clean up
# To delete the VM instance
gcloud compute instances delete brain-opera-gpt2

# To delete the entire project
gcloud projects delete brain-opera-deployment
