atralupus / backend.ai

This project forked from lablup/backend.ai

Backend.AI is a streamlined, container-based computing cluster orchestrator that hosts diverse programming languages and popular computing/ML frameworks, with pluggable heterogeneous accelerator support including CUDA and ROCM.

Home Page: https://www.backend.ai

License: GNU Lesser General Public License v3.0

Shell 2.25% Python 96.82% Java 0.06% Go 0.06% CSS 0.17% Mako 0.01% Dockerfile 0.07% Vim Script 0.01% Starlark 0.47% Jinja 0.08% Gherkin 0.01%

Backend.AI

Backend.AI is a streamlined, container-based computing cluster orchestrator that hosts diverse programming languages and popular computing/ML frameworks, with pluggable heterogeneous accelerator support including CUDA and ROCM. It allocates and isolates the underlying computing resources for multi-tenant computation sessions on-demand or in batches with customizable job schedulers. All its functions are exposed as REST/GraphQL/WebSocket APIs.

Contents in This Repository

This repository contains all open-source server-side components and the client SDK for Python as a reference implementation of API clients.

Directory Structure

  • src/ai/backend/: Source code
    • manager/: Manager
    • manager/api: Manager API handlers
    • agent/: Agent
    • agent/docker/: Agent's Docker backend
    • agent/k8s/: Agent's Kubernetes backend
    • kernel/: Agent's kernel runner counterpart
    • runner/: Agent's in-kernel prebuilt binaries
    • helpers/: Agent's in-kernel helper package
    • common/: Shared utilities
    • client/: Client SDK
    • cli/: Unified CLI for all components
    • storage/: Storage proxy
    • storage/api: Storage proxy's manager-facing and client-facing APIs
    • web/: Web UI server
    • plugin/: Plugin subsystem
    • test/: Integration test suite
    • testutils/: Shared utilities used by unit tests
    • meta/: Legacy meta package
  • docs/: Unified documentation
  • tests/
    • manager/, agent/, ...: Per-component unit tests
  • configs/
    • manager/, agent/, ...: Per-component sample configurations
  • docker/: Dockerfiles for auxiliary containers
  • fixtures/
    • manager/, ...: Per-component fixtures for development setup and tests
  • plugins/: A directory to place plugins such as accelerators, monitors, etc.
  • scripts/: Scripts to assist development workflows
    • install-dev.sh: The single-node development setup script from the working copy
  • stubs/: Type annotation stub packages written by us
  • tools/: A directory to host Pants-related tooling
  • dist/: A directory to put build artifacts (.whl files) and Pants-exported virtualenvs
  • changes/: News fragments for towncrier
  • pants.toml: The Pants configuration
  • pyproject.toml: Tooling configuration (towncrier, pytest, mypy)
  • BUILD: The root build config file
  • **/BUILD: Per-directory build config files
  • BUILD_ROOT: An indicator to mark the build root directory for Pants
  • requirements.txt: The unified requirements file
  • *.lock, tools/*.lock: The dependency lock files
  • docker-compose.*.yml: Per-version recommended halfstack container configs
  • README.md: This file
  • MIGRATION.md: The migration guide for updating between major releases
  • VERSION: The unified version declaration

Server-side components are licensed under LGPLv3 to promote non-proprietary open innovation in the open-source community, while other shared libraries and client SDKs are distributed under the MIT license.

There is no obligation to open your service/system code if you just run the server-side components as-is (e.g., run them as daemons or import the components into your code without modification). Please contact us (contact-at-lablup-com) for commercial consulting and more licensing details/options about individual use cases.

Getting Started

Installation for Single-node Development

Run scripts/install-dev.sh after cloning this repository.

This script checks the availability of all required dependencies such as Docker and bootstraps a development setup. Note that it requires sudo and a modern Python installation on the host system, which must be Linux (Debian/RHEL-like distributions) or macOS.
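The kind of dependency check described above can be sketched in Python as follows. This is only an illustration and not the logic of scripts/install-dev.sh itself: the function names, the command list, and the minimum Python version (3.8) are assumptions made for the example.

```python
import shutil
import sys
from typing import Callable, Iterable, List, Optional, Tuple


def find_missing_commands(
    commands: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the commands that cannot be located on PATH.

    `which` is injectable so that the check can be tested without
    depending on what is installed on the current machine.
    """
    return [cmd for cmd in commands if which(cmd) is None]


def python_is_recent(minimum: Tuple[int, int] = (3, 8)) -> bool:
    """Check the running interpreter against a (hypothetical) minimum version."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    # The command list below is an assumption for illustration only.
    missing = find_missing_commands(["docker", "git", "sudo"])
    if missing:
        print("Missing required commands:", ", ".join(missing))
    if not python_is_recent():
        print("A more recent Python interpreter is required.")
```

Running a check like this before the installer gives an early, readable error instead of a failure partway through the setup.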

Installation for Multi-node Tests & Production

Please consult our documentation for community-supported materials. Contact the sales team ([email protected]) for professional paid support and deployment options.

Accessing Compute Sessions (aka Kernels)

Backend.AI provides WebSocket tunneling into individual computation sessions (containers), so that users can securely access in-container applications directly from their browsers and the client CLI.

  • Jupyter: data scientists' favorite tool
    • Most container images have intrinsic Jupyter and JupyterLab support.
  • Web-based terminal
    • All container sessions have intrinsic ttyd support.
  • SSH
    • All container sessions have intrinsic SSH/SFTP/SCP support with an auto-generated per-user SSH keypair. PyCharm and other IDEs can use on-demand sessions via SSH remote interpreters.
  • VSCode (coming soon)
    • Most container sessions have intrinsic web-based VSCode support.

Working with Storage

Backend.AI provides an abstraction layer on top of existing network-based storage (e.g., NFS/SMB), called vfolders (virtual folders). Each vfolder works like cloud storage that can be mounted into any computation session and shared between users and user groups with differentiated privileges.
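To make "shared with differentiated privileges" concrete, here is a minimal sketch of a per-user permission model for a virtual folder. The class names, the three permission levels, and the rule that the owner holds the highest level are all assumptions for illustration; they are not taken from the Backend.AI codebase.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict, Optional


class Permission(IntEnum):
    """Hypothetical privilege levels, ordered from least to most powerful."""
    READ_ONLY = 1
    READ_WRITE = 2
    READ_WRITE_DELETE = 3


@dataclass
class VFolder:
    """A toy stand-in for a virtual folder with per-user grants."""
    name: str
    owner: str
    grants: Dict[str, Permission] = field(default_factory=dict)

    def share(self, user: str, permission: Permission) -> None:
        """Grant (or replace) a user's permission on this folder."""
        self.grants[user] = permission

    def permission_of(self, user: str) -> Optional[Permission]:
        """Return the user's permission, or None if the folder is not shared with them."""
        if user == self.owner:
            return Permission.READ_WRITE_DELETE
        return self.grants.get(user)

    def can(self, user: str, required: Permission) -> bool:
        """Check whether the user's permission is at least `required`."""
        granted = self.permission_of(user)
        return granted is not None and granted >= required


folder = VFolder(name="datasets", owner="alice")
folder.share("bob", Permission.READ_ONLY)
print(folder.can("bob", Permission.READ_ONLY))   # True
print(folder.can("bob", Permission.READ_WRITE))  # False
```

Ordering the levels as integers keeps the access check to a single comparison, at the cost of assuming that every higher level implies all lower ones.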

Major Components

Manager

It routes external API requests from front-end services to individual agents. It also monitors and scales the cluster of multiple agents (a few tens to hundreds).
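As a rough illustration of routing a request to an individual agent, the sketch below picks the first agent whose free resources cover a requested amount. It is a simple first-fit policy chosen for brevity, not the scheduler the manager actually uses; `AgentInfo`, the slot names, and `pick_agent` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class AgentInfo:
    """Hypothetical view of an agent as seen by the manager."""
    agent_id: str
    free_slots: Dict[str, float]  # e.g. {"cpu": 4, "mem_gib": 16, "cuda": 1}


def fits(agent: AgentInfo, requested: Dict[str, float]) -> bool:
    """True if the agent has at least the requested amount of every slot."""
    return all(agent.free_slots.get(name, 0) >= amount
               for name, amount in requested.items())


def pick_agent(agents: Iterable[AgentInfo],
               requested: Dict[str, float]) -> Optional[AgentInfo]:
    """First-fit selection: return the first agent that can host the request."""
    for agent in agents:
        if fits(agent, requested):
            return agent
    return None


agents = [
    AgentInfo("agent-1", {"cpu": 2, "mem_gib": 4}),
    AgentInfo("agent-2", {"cpu": 8, "mem_gib": 32, "cuda": 1}),
]
chosen = pick_agent(agents, {"cpu": 4, "cuda": 1})
print(chosen.agent_id if chosen else "no agent available")  # agent-2
```

Treating a missing slot as zero means an agent without a given accelerator is never selected for a request that needs one.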

Agent

It manages individual server instances and launches/destroys Docker containers where REPL daemons (kernels) run. Each agent on a new EC2 instance registers itself with the instance registry via heartbeats.
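Heartbeat-based self-registration can be sketched as below. The example is a generic in-memory registry under stated assumptions: the class name, the 30-second timeout, and the injectable clock are all hypothetical and do not describe how the manager stores or transports heartbeats.

```python
import time
from typing import Callable, Dict, List


class InstanceRegistry:
    """Toy in-memory registry that tracks agents by their latest heartbeat."""

    def __init__(self, timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._timeout = timeout  # seconds without a heartbeat before an agent is considered lost
        self._clock = clock      # injectable so that tests can control time
        self._last_seen: Dict[str, float] = {}

    def heartbeat(self, agent_id: str) -> bool:
        """Record a heartbeat; return True if this registers a new agent."""
        is_new = agent_id not in self._last_seen
        self._last_seen[agent_id] = self._clock()
        return is_new

    def alive_agents(self) -> List[str]:
        """Return the IDs of agents whose last heartbeat is within the timeout."""
        now = self._clock()
        return sorted(agent_id for agent_id, seen in self._last_seen.items()
                      if now - seen <= self._timeout)

    def prune(self) -> List[str]:
        """Drop agents whose heartbeats have timed out and return their IDs."""
        now = self._clock()
        lost = sorted(agent_id for agent_id, seen in self._last_seen.items()
                      if now - seen > self._timeout)
        for agent_id in lost:
            del self._last_seen[agent_id]
        return lost
```

With this design an agent needs no separate registration call: its first heartbeat adds it to the registry, and silence for longer than the timeout removes it on the next prune.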

Storage Proxy

It provides a unified abstraction over multiple different network storage devices with vendor-specific enhancements such as real-time performance metrics and filesystem operation acceleration APIs.

Webserver

It hosts the SPA (single-page application) packaged from our web UI codebase for end-users and basic administration tasks.

Kernels

Jail

A programmable sandbox written in Go, implemented using ptrace-based system call filtering.

Hook

A set of libc overrides for resource control and web-based interactive stdin (paired with agents).

Client SDK Libraries

We offer client SDKs in popular programming languages. These SDKs are freely available under the MIT License to ease integration with both commercial and non-commercial software products and services.

Plugins

Legacy Components

These components still exist but are no longer actively maintained.

Media

The front-end support libraries for handling multimedia outputs (e.g., SVG plots, animated vector graphics).

  • The Python package (lablup) is installed inside kernel containers.
  • To interpret and display media generated by the Python package, you need to load the JavaScript part in the front-end.
  • https://github.com/lablup/backend.ai-media

IDE and Editor Extensions

We now recommend using in-kernel applications such as JupyterLab or Visual Studio Code Server, or native SSH connections to kernels via our client SDK or desktop apps.

License

Refer to the LICENSE file.
