goodoid / paddle-operator

This project forked from paddleflow/paddle-operator

Elastic Deep Learning Training based on Kubernetes by Leveraging EDL and Volcano

License: Apache License 2.0

Go 92.73% Dockerfile 1.18% Makefile 6.09%

paddle-operator's Introduction

Paddle Operator

Overview

Paddle Operator makes it easy to run PaddlePaddle distributed training jobs on Kubernetes by providing, among other things, the PaddleJob custom resource.

Quick Start

Prerequisites

  • Kubernetes >= 1.8
  • kubectl

Installation

With Kubernetes ready, install Paddle Operator using the configuration in the deploy folder (use deploy/v1 for Kubernetes v1.16 and later, or deploy/v1beta1 for Kubernetes v1.15 and earlier).
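
The version rule above can be sketched as a small shell helper. This is only an illustration: the function name `deploy_dir_for` is hypothetical and not part of the repository, and it assumes you pass in the server version string yourself (for example, as read from the output of `kubectl version`).

```shell
# Hypothetical helper (not part of paddle-operator): map a Kubernetes
# version string such as "v1.19.3" to the manifest directory named above.
deploy_dir_for() {
  ver="${1#v}"                 # strip a leading "v" if present
  major="${ver%%.*}"           # text before the first dot
  rest="${ver#*.}"             # text after the first dot
  minor="${rest%%.*}"          # text before the next dot
  minor="${minor%%[!0-9]*}"    # drop suffixes such as "+" in "16+"
  if [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 16 ]; }; then
    echo "deploy/v1"
  else
    echo "deploy/v1beta1"
  fi
}
```

Calling `deploy_dir_for v1.15.12` selects deploy/v1beta1, while `deploy_dir_for v1.16.0` selects deploy/v1.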

Create the PaddleJob CRD:

$ kubectl apply -f https://raw.githubusercontent.com/PaddleFlow/paddle-operator/main/deploy/v1/crd.yaml

A successful creation produces output like the following:

$ kubectl get crd
NAME                                    CREATED AT
paddlejobs.batch.paddlepaddle.org       2021-02-08T07:43:24Z
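
In scripts, it may be more convenient to block until the CRD is usable than to read the listing by eye. A minimal sketch, assuming a kubectl version that supports `kubectl wait`; the function name `wait_for_crd` and the 60s default timeout are illustrative choices, not something the project prescribes.

```shell
# Hypothetical helper: wait until the named CRD reports the Established
# condition. $1 is the CRD name, $2 an optional timeout (default 60s).
wait_for_crd() {
  kubectl wait --for=condition=established --timeout="${2:-60s}" "crd/$1"
}
```

For the CRD shown above, this would be invoked as `wait_for_crd paddlejobs.batch.paddlepaddle.org`.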

Then deploy the controller:

$ kubectl apply -f https://raw.githubusercontent.com/PaddleFlow/paddle-operator/main/deploy/v1/operator.yaml

Once the controller is ready, its pod appears as follows:

$ kubectl -n paddle-system get pods
NAME                                         READY   STATUS    RESTARTS   AGE
paddle-controller-manager-698dd7b855-n65jr   1/1     Running   0          1m
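
The same readiness check can be scripted. The sketch below assumes `kubectl wait` is available; the helper name `wait_for_pods` and the 120s default are hypothetical. Note that `kubectl wait` fails immediately if no pods exist yet in the namespace, so it may need to be retried right after applying the manifest.

```shell
# Hypothetical helper: wait until every pod in a namespace is Ready.
# $1 is the namespace (default paddle-system), $2 an optional timeout.
wait_for_pods() {
  ns="${1:-paddle-system}"
  timeout="${2:-120s}"
  kubectl -n "$ns" wait --for=condition=Ready pods --all --timeout="$timeout"
}
```

With no arguments, `wait_for_pods` checks the default paddle-system namespace used throughout this guide.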

By default, the paddle controller runs in the paddle-system namespace and only controls jobs in that namespace. To run the controller in a different namespace, or to control jobs in other namespaces, edit charts/paddle-operator/values.yaml and install the Helm chart. Alternatively, edit the kustomization files or edit deploy/v1/operator.yaml directly.

Run a demo PaddleJob

Deploy your first PaddleJob demo with:

$ kubectl -n paddle-system apply -f https://raw.githubusercontent.com/PaddleFlow/paddle-operator/main/deploy/examples/wide_and_deep.yaml

Check the pod status:

$ kubectl -n paddle-system get pods

Check the PaddleJob status:

$ kubectl -n paddle-system get pdj
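
If a job does not reach the expected state, its events and conditions are usually the first place to look. A minimal sketch using the `pdj` short name from the command above; the helper name `inspect_paddlejob` is hypothetical, and the job name must be taken from the `get pdj` listing since it depends on the applied manifest.

```shell
# Hypothetical helper: show details and recent events for one PaddleJob.
# $1 is the job name as listed by "get pdj", $2 an optional namespace.
inspect_paddlejob() {
  ns="${2:-paddle-system}"
  kubectl -n "$ns" describe pdj "$1"
}
```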

Uninstall

To uninstall, delete the CRD and the controller:

$ kubectl delete -f https://raw.githubusercontent.com/PaddleFlow/paddle-operator/main/deploy/v1/crd.yaml -f https://raw.githubusercontent.com/PaddleFlow/paddle-operator/main/deploy/v1/operator.yaml

Advanced usage

More configuration options can be found in the Makefile; clone this repository to explore them. If you have any questions or concerns about usage, please do not hesitate to contact us.

More Information

Please refer to the Chinese documentation (中文文档) for more information about paddle configuration.

paddle-operator's People

Contributors

kuizhiqing, ruminateer, tizhou86