This repository manages the cluster builder for the Concertim cluster portal project.

License: Eclipse Public License 2.0

concertim-cluster-builder's Introduction

Concertim Cluster Builder

The Concertim Cluster Builder is a Python daemon process providing an HTTP API to receive requests to build clusters of pre-defined types.

Such requests are expected to come from the Concertim Visualisation App, and the Concertim OpenStack Service is expected to report those clusters to the Concertim Visualisation App as and when they become available.

Quick start

  1. Clone the repository
    git clone https://github.com/openflighthpc/concertim-cluster-builder.git
  2. Build the docker image
    docker build --network=host --tag concertim-cluster-builder:latest .
  3. Start the docker container
    docker run -d --name concertim-cluster-builder \
        --stop-signal SIGINT \
        --network=host \
        --publish 42378:42378 \
        concertim-cluster-builder

Use the Concertim Visualisation App and the Concertim OpenStack Service as clients of the Cluster Builder.

Building the docker image

Concertim Cluster Builder is intended to be deployed as a Docker container. There is a Dockerfile in this repo for building the image.

  1. Clone the repository
    git clone https://github.com/openflighthpc/concertim-cluster-builder.git
  2. Build the docker image
    docker build --network=host --tag concertim-cluster-builder:latest .

Configuration

Concertim Cluster Builder has two separate elements to its configuration: 1) configuring access to the cloud environment; and 2) configuring the enabled cluster type definitions. These are detailed below.

Cloud environment access

All required configuration to access the cloud environment is sent in the request to build a cluster. More details on the format can be found in the API documentation.

Cluster type definitions

Concertim Cluster Builder needs to be configured with the enabled cluster type definitions. The enabled definitions must be placed in the docker container's /app/instance/cluster-types-enabled/ directory.

The Docker image is built with some example cluster type definitions enabled by default.

If you wish to configure additional cluster type definitions, start the docker container with a host directory, say /usr/share/concertim-cluster-builder/, mounted to /app/instance/. The example cluster type definitions can then be copied to /usr/share/concertim-cluster-builder/ and new definitions added alongside them. To do this, follow the steps below:

Create the directory structure.

mkdir -p /usr/share/concertim-cluster-builder/{cluster-types-available,cluster-types-enabled,templates}

Copy across the example definitions and the template library.

for i in examples/cluster-types/* ; do
  cp -a "$i" /usr/share/concertim-cluster-builder/cluster-types-available/
done
for i in examples/templates/* ; do
  cp -a "$i" /usr/share/concertim-cluster-builder/templates/
done

Optionally, enable the example definitions.

cd /usr/share/concertim-cluster-builder/cluster-types-enabled/
for i in ../cluster-types-available/* ; do
  ln -s "${i}" .
done

Mount the directory /usr/share/concertim-cluster-builder/ to /app/instance when starting the docker container.

docker run -d --name concertim-cluster-builder \
    --stop-signal SIGINT \
    --network=host \
    --publish <Host>:42378:42378 \
    --volume /usr/share/concertim-cluster-builder/:/app/instance \
    concertim-cluster-builder
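The directory preparation steps above (create the structure, copy the examples, optionally enable them) can also be sketched in Python. This is a minimal sketch, not part of the project: the function name prepare_instance_dir and its parameters are hypothetical, and it assumes the same examples/cluster-types and examples/templates layout used by the shell commands. Like the shell version, it creates relative symlinks so that they remain valid once the directory is mounted at /app/instance inside the container.

```python
import shutil
from pathlib import Path


def prepare_instance_dir(examples: Path, instance: Path, enable: bool = True) -> None:
    """Hypothetical helper mirroring the shell steps above.

    examples: path to the repository's examples/ directory.
    instance: host directory that will be mounted to /app/instance.
    enable:   whether to symlink every available definition as enabled.
    """
    available = instance / "cluster-types-available"
    enabled = instance / "cluster-types-enabled"
    templates = instance / "templates"

    # Equivalent of: mkdir -p .../{cluster-types-available,cluster-types-enabled,templates}
    for directory in (available, enabled, templates):
        directory.mkdir(parents=True, exist_ok=True)

    # Equivalent of the two `cp -a` loops.
    copies = (
        (examples / "cluster-types", available),
        (examples / "templates", templates),
    )
    for source_dir, target_dir in copies:
        for source in sorted(source_dir.iterdir()):
            target = target_dir / source.name
            if source.is_dir():
                shutil.copytree(source, target, symlinks=True, dirs_exist_ok=True)
            else:
                shutil.copy2(source, target)

    # Equivalent of the optional `ln -s` loop, using relative link targets.
    if enable:
        for definition in sorted(available.iterdir()):
            link = enabled / definition.name
            if not (link.exists() or link.is_symlink()):
                link.symlink_to(Path("..") / "cluster-types-available" / definition.name)
```

Calling prepare_instance_dir(Path("examples"), Path("/usr/share/concertim-cluster-builder")) from the repository root would reproduce the layout described above; running it twice leaves existing links untouched.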

The format of the cluster type definition files is currently documented only through the commented example definitions, which should be sufficient for writing new ones.

Usage

Once the docker image has been built and the cluster type definitions have been configured (see Building the docker image and Cluster type definitions above), start the container with the following command.

docker run -d --name concertim-cluster-builder \
    --stop-signal SIGINT \
    --network=host \
    --publish <Host>:42378:42378 \
    --volume /usr/share/concertim-cluster-builder/:/app/instance \
    concertim-cluster-builder

Note that Docker ignores --publish when a container runs with --network=host. In that mode the service is exposed directly on the host's network, on whichever port the daemon itself binds to (presumably 42378, given the mapping above).

HTTP API

The HTTP API is documented in the API documentation.

Development

See the development docs for details on setting up a development environment and getting started.

Contributing

Fork the project. Make your feature addition or bug fix. Send a pull request. Bonus points for topic branches.

Read CONTRIBUTING.md for more details.

Copyright and License

Eclipse Public License 2.0, see LICENSE.txt for details.

Copyright (C) 2024-present Alces Flight Ltd.

This program and the accompanying materials are made available under the terms of the Eclipse Public License 2.0 which is available at https://www.eclipse.org/legal/epl-2.0, or alternative license terms made available by Alces Flight Ltd - please direct inquiries about licensing to [email protected].

Concertim Cluster Builder is distributed in the hope that it will be useful, but WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OR CONDITIONS OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. See the Eclipse Public License 2.0 for more details.

concertim-cluster-builder's People

Contributors

benarmston, timalces, vipulnayyar, dependabot[bot], logans56

Watchers

Mark J. Titorenko, Steve Norledge, James Muscat

concertim-cluster-builder's Issues

Address limitations with sahara cluster type variety

There are currently some limitations with the sahara cluster type variety:

For the direct launching of sahara clusters:

  1. requires a network to be specified. We currently only have a guarantee that the user has access to the public1 network, and launching a sahara cluster on the public1 network is not a good idea. Ideally, we'd have some mechanism for ensuring that the user has a non-external network available to them. That could take several forms:
       • require the ops team (or similar) to create a single network and ensure it is available to every concertim user;
       • have the middleware configured to automatically make a certain non-external network available to each new concertim user;
       • have the middleware create a dedicated non-external network for each new concertim user.
  2. requires that the network be specified by ID. Using the name of the network would provide a better experience for both the user and the cluster type author especially where defaults are concerned. It is possible to get the ID for a network from its name, but the exact incantation is currently unknown.
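
One possible way to translate a network name into its ID is sketched below. It is an assumption, not the project's approach: it presumes an openstacksdk connection object is available (the source does not say which client library the cluster builder uses), and the helper name network_id_from_name is hypothetical. It relies on the SDK's find_network lookup, which accepts a name or an ID.

```python
def network_id_from_name(conn, name: str) -> str:
    """Return the ID of the network called `name`.

    `conn` is assumed to be an openstack.connection.Connection (openstacksdk);
    this is an assumption, the source does not confirm the client library.
    """
    # find_network accepts a name or an ID and returns None when nothing
    # matches and ignore_missing is True.
    network = conn.network.find_network(name, ignore_missing=True)
    if network is None:
        raise LookupError(f"no network named {name!r} is visible to this user")
    return network.id
```

With something like this, a cluster type definition could default to a network name and have it resolved just before the sahara cluster is launched. Whether the same lookup can be expressed inside the sahara or HOT templates themselves remains open.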

Indirect launching of sahara clusters via HOT template:

  1. requires that the sahara plugin, plugin version, and image id are all provided by the user. It is possible to calculate these from the cluster template, but this is not currently done. This could be done by adding a new cluster type, say SaharaHeatWrapperHandler that is similar to SaharaHandler in how it determines values from the cluster template, but is also similar to HeatHandler in how it launches the cluster on openstack.
  2. requires that the image and cluster template are provided by id not name. This could (perhaps?) be fixed with a better HOT template, or by using the SaharaHeatWrapperHandler, alluded to above, to perform the translation.

Currently, support for sahara clusters is low priority so this issue mostly exists to document these as known issues and suggest potential solutions.

Specify name property for volumes in resource groups

Currently, if there are multiple volumes in a resource group, they are all given the same name (e.g. node-vol). We should update this to give them unique, descriptive names, as we do with servers in a resource group (e.g. node01-vol, node02-vol, etc.).
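
The intended naming scheme can be illustrated with a short Python sketch. It only demonstrates the pattern; the function volume_names and its parameters are hypothetical, and how the names would actually be generated inside the cluster type templates is not covered here.

```python
from typing import List


def volume_names(count: int, prefix: str = "node", suffix: str = "vol", width: int = 2) -> List[str]:
    """Build unique volume names such as node01-vol, node02-vol, ...

    Numbering starts at 1 and is zero-padded to `width` digits, matching
    the examples given in the issue text.
    """
    if count < 0:
        raise ValueError("count must not be negative")
    return [f"{prefix}{index:0{width}d}-{suffix}" for index in range(1, count + 1)]
```

For example, volume_names(3) yields node01-vol, node02-vol and node03-vol.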

Include cluster name when creating order

As well as the openstack ID that is currently provided, we should include the cluster's name when creating an order so this can be more easily included in invoices.

Clusters vs Rack creation - Order ID mismatch

When using heat, this directly creates a stack (see http://10.151.0.184/project/stacks/). However, when using magnum and sahara, this creates a cluster (see http://10.151.0.184/project/clusters), which in turn contains a stack.

When cluster builder creates an order in the middleware service, it sends the cluster id for magnum (and presumably sahara, though I've not been able to test this), whereas for heat it sends the stack id.

However, the middleware service assumes that the stack id has been used for all order records. This means that for magnum/sahara stacks, creating racks in visualiser fails, as the middleware can't find an order with their stack id.

Ideally we should be consistent and always pass the stack id instead of the cluster id. It may be possible to obtain the stack id when creating such clusters, e.g. using magnum_client.clusters.get(cluster_id).stack_id, but it seems that this is None when the cluster is first created - further investigation is needed.
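
If the stack id does become available shortly after the cluster is created (an unverified assumption that the investigation mentioned above would need to confirm), one option is to poll for it before creating the order. The sketch below is generic: wait_for_stack_id and its parameters are hypothetical names, and the lookup itself is passed in as a callable so that the helper does not depend on any particular client.

```python
import time
from typing import Callable, Optional


def wait_for_stack_id(
    fetch_stack_id: Callable[[], Optional[str]],
    timeout: float = 300.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll `fetch_stack_id` until it returns a non-empty value.

    Raises TimeoutError if no stack id appears within `timeout` seconds.
    `sleep` and `clock` are injectable to make the helper easy to test.
    """
    deadline = clock() + timeout
    while True:
        stack_id = fetch_stack_id()
        if stack_id:
            return stack_id
        if clock() >= deadline:
            raise TimeoutError("stack id was not populated before the timeout")
        sleep(interval)
```

Using the expression quoted above, a call might look like wait_for_stack_id(lambda: magnum_client.clusters.get(cluster_id).stack_id). The timeout and interval defaults are arbitrary placeholders.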

Alternatively, we could update the middleware service to check whether a stack has a cluster id and, if so, use that id to look up the order. However, this would make the process less client agnostic.

More generally it may be worth considering if there is any fundamental difference between a cluster containing a rack and just a rack on its own that needs to be differentiated in visualiser/elsewhere in concertim.
