nimbo-sh / nimbo

Run compute jobs on AWS as if you were running them locally.

Home Page: https://nimbo.sh

License: GNU General Public License v3.0

Python 87.73% Shell 12.27%
aws machine-learning conda cli jupyter spot-instance

nimbo's Introduction

Nimbo: Machine Learning on AWS with a single command

Nimbo is a dead-simple command-line tool that allows you to run code on AWS as if you were running it locally. It abstracts away the complexity of AWS, allowing you to build, iterate, and deliver machine learning models faster than ever.

Example: nimbo run "python -u train.py --lr=3e-4"

The fastest way to prototype on AWS

Nimbo drastically simplifies your AWS workflow by taking care of instance, environment, data, and IAM management - no changes to your codebase needed. Whether you're just getting started with AWS or are a seasoned veteran, Nimbo takes the pain out of doing Machine Learning in the cloud, allowing you to focus on what matters - building great models for your team and clients.

Powerful commands

Nimbo provides many useful commands to supercharge your productivity when working with AWS, such as easily launching notebooks, checking prices, logging onto an instance, or syncing data. Some examples include:

  • nimbo ls-spot-prices
  • nimbo ssh <instance-id>
  • nimbo push datasets
  • nimbo pull logs
  • nimbo rm-all-instances

Key Features

  • Your Infrastructure: Code runs on your EC2 instances and data is stored in your S3 buckets. This means that you can easily use the resulting models and data from anywhere within your AWS organization, and use your existing permissions and credentials.
  • User Experience: Nimbo gives you the command line tools to make working with AWS as easy as working with local resources. No more complicated SDKs and never-ending documentation.
  • Customizable: Want to use a custom AMI? Just change the image ID in the Nimbo config file. Want to use a specific conda package? Just add it to your environment file. Nimbo is built with customization in mind, so you can use any setup you want.
  • Seamless Spot Instances: With Nimbo, using spot instances is as simple as changing a single value in the config file. Enjoy 70-90% savings with AWS spot instances with no changes to your workflow.
  • Managed Images: We provide managed AMIs with the latest drivers, with unified naming across all regions. We will also release AMIs that come preloaded with ImageNet and other large datasets, so that you can simply spin up an instance and start training.
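For instance, moving a job from on-demand to spot pricing is a one-line change. A minimal config sketch (field names follow the example nimbo-config.yml shown further down; the instance type here is illustrative):

```
# nimbo-config.yml (fragment)
instance_type: g4dn.xlarge
spot: yes   # switch from on-demand to a spot instance with a single value
```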

You can find more information at nimbo.sh, or read the docs at docs.nimbo.sh.

Getting started

Please visit the Getting started page in the docs.

Examples

Sample projects can be found at our examples repo, nimbo-examples.

Product roadmap

  • GCP support: Use the same commands to run jobs on AWS or GCP.
  • Deployment: Deploy ML models to AWS/GCP with a single command. Automatically create an API endpoint for providing video/audio/text and getting results from your model back.
  • Add Docker support: Right now we assume you are using a conda environment, but many people use docker to run jobs. This feature would allow you to run a command such as nimbo run "docker-compose up", where the docker image would be fetched from DockerHub (or equivalent repository) through a docker_image parameter on the nimbo-config.yml file.
  • Add AMIs with preloaded large datasets: Downloading and storing large datasets like ImageNet is a time consuming process. We will make available AMIs that come with an extra EBS volume mounted on /datasets, so that you can use large datasets without worrying about storing them or waiting for them to be fetched from your S3 bucket. Get in touch if you have datasets you would like to see preloaded with the instances.

Developing

If you want to make changes to the codebase, you can clone this repo and

  1. pip install -e . to install nimbo locally. As you make code changes, your local nimbo installation will automatically update.
  2. pip install -r requirements/dev.txt to install all development dependencies.

Running Tests

Create two instance keys, one for eu-west-1 and one for us-east-2. The keys should begin with the zone name, e.g. eu-west-1-dave.pem. Do not forget to chmod 400 the created keys. Place these keys in src/nimbo/tests/aws/assets.

Create a nimbo-config.yml file in src/nimbo/tests/assets with only the aws_profile, security_group, and cloud_provider: AWS fields set.

Make sure that the security_group you set in the test nimbo-config.yml allows your IP in all regions; otherwise the tests will fail.
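A minimal test config matching the description above (the profile and security group values are placeholders):

```
# src/nimbo/tests/assets/nimbo-config.yml
cloud_provider: AWS
aws_profile: default
security_group: my-test-security-group
```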

Use pytest to run the tests

pytest -x

nimbo's People

Contributors

hellno, jozuas, seuqaj114

nimbo's Issues

create-bucket fails

Steps to reproduce

$ nimbo create-bucket BUCKET_NAME

Error

The command fails with a stacktrace; the final lines are:

  File "/nix/store/k8azdbzwsyklkm344271wacjjsi4mkp5-python3.8-botocore-1.20.52/lib/python3.8/site-packages/botocore/validate.py", line 293, in serialize_to_request
    raise ParamValidationError(report=report.generate_report())
botocore.exceptions.ParamValidationError: Parameter validation failed:
Unknown parameter in input: "DryRun", must be one of: ACL, Bucket, CreateBucketConfiguration, GrantFullControl, GrantRead, GrantReadACP, GrantWrite, GrantWriteACP, ObjectLockEnabledForBucket
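The error above says S3's CreateBucket call rejects DryRun. A hedged fix sketch (the helper name is hypothetical, not nimbo's actual code): strip parameters the operation does not understand before calling `client.create_bucket(**kwargs)`.

```python
# Hypothetical fix sketch for the ParamValidationError above: S3's
# CreateBucket does not accept DryRun, so drop any kwargs that are not
# in the valid parameter set before making the boto3 call.

# Valid CreateBucket parameters, taken from the error message above.
S3_CREATE_BUCKET_PARAMS = {
    "ACL", "Bucket", "CreateBucketConfiguration", "GrantFullControl",
    "GrantRead", "GrantReadACP", "GrantWrite", "GrantWriteACP",
    "ObjectLockEnabledForBucket",
}

def build_create_bucket_kwargs(**kwargs):
    """Drop kwargs (e.g. DryRun) that the CreateBucket API rejects."""
    return {k: v for k, v in kwargs.items() if k in S3_CREATE_BUCKET_PARAMS}
```

The filtered dict can then be passed straight to `client.create_bucket(**kwargs)`.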

Add Continuous Integration

Tests that do not require spinning up instances should run on every PR and on pushes to master; there is no reason not to.

Replace AWSCLI usage with boto3 implementation

Right now awscli is only used for S3 because of aws s3 sync, but the usefulness of this command is limited.

By replacing awscli usage with our own implementation for interacting with s3 we will be able to:

  • Implement aws s3 sync-like functionality, and extend it to fix #16
  • Remove a dependency
  • Avoid having to migrate to awscli v2 (#24)
  • Resolve the issue with colorama version mismatches
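The core of a sync-like replacement can be sketched without boto3: diff a local listing against a remote one and upload only what changed. Everything here is hypothetical (a real implementation would compare ETags or timestamps, not just sizes, and would do the actual S3 calls):

```python
# Hypothetical sketch of an "aws s3 sync"-style upload plan: given local
# and remote {key: size} listings, upload anything missing remotely or
# whose size differs. A real boto3 version would also compare ETags or
# last-modified timestamps before deciding.

def plan_uploads(local, remote):
    """Return the sorted keys that need uploading."""
    return sorted(k for k, size in local.items() if remote.get(k) != size)
```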

Docker support (as alternative to conda)

Is your feature request related to a problem? Please describe.
Conda env setup takes >45 mins every time I'm starting a new instance. This gets 10x more annoying when relying on spot instances.

Describe the solution you'd like
Instead of creating the same conda environment on each new instance, why not have a docker image that can be ready to start much more quickly?

Additional context
Would love any pointers on how to add this to nimbo. I'd be up for creating a PR depending on the capacity needed.
Is this on the feature roadmap for nimbo anyway? (Is there a feature roadmap? :) )

Thanks, would love to hear your thoughts on this

AWS Credential errors

Really like the concept of nimbo, it is quite lightweight for launching instances. However, I have not been able to give it a try due to AWS key errors. Would love to chat more about using this tool though.

Tag the source

Could you please tag the source? This allows distributions to get the complete source from GitHub if they want.

Thanks

Changing nimbo-config.yml name

Hello,

We have a docker container which runs as a microservice that uses nimbo for training a model on EC2. However, if two users in the same container were to run a job at the same time, one might use the other's nimbo-config.yml file.

Is there a way to change the nimbo-config.yml name so that each user can have their own nimbo configuration file?
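One possible approach, sketched here as an assumption (nimbo does not necessarily support this, and the NIMBO_CONFIG variable name is hypothetical): let each user point at their own config file via an environment variable, falling back to the default name.

```python
import os

# Hypothetical sketch: resolve the config path from an environment
# variable so each user in the same container can have their own file,
# falling back to the default nimbo-config.yml.
def resolve_config_path(env=None):
    env = os.environ if env is None else env
    return env.get("NIMBO_CONFIG", "nimbo-config.yml")
```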

Automatically create instance keys

We should have a command line option to create and download instance keys. Careful: the user might not have the right permissions to create them programmatically.

Allow naming / describing instances

nimbo list-active outputs a nice list of active instances. However, instance ids are pretty cryptic, so it is hard to know what is what. It would be really nice to be able to name instances, or perhaps to attach some sort of "description" field (that could for example contain hyperparameters, datasets used, etc).

nimbo unable to read pem key 'load pubkey "mykey.pem": invalid format'

Reproduced by

  1. Generate a key mykey.pem in the AWS console and download it
  2. Copy mykey.pem to project dir
  3. Put instance_key: mykey.pem in nimbo-config.yml
  4. Run nimbo test-access
  5. Get a bunch of load pubkey "mykey.pem": invalid format errors

Inspecting the key shows it looks fine with correct header/footer and can be used to manually SSH onto EC2 instances. Is there something Nimbo does differently when it comes to keys?

Nimbo instance alerts

Enable users to setup email alerts for when they forget to terminate instances. This could be done by deploying a lambda function that executes at some time of the day and sends an email to the user.
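The selection logic such a lambda would run can be sketched as a pure function; fetching the instances (boto3) and sending the email (e.g. SES) are left out and assumed:

```python
from datetime import datetime, timedelta, timezone

# Sketch of the core check a daily alert lambda could run: flag any
# instance that has been running longer than a threshold. The AWS
# wiring (listing instances, emailing the user) is assumed, not shown.
def instances_to_alert(launch_times, max_age_hours=12, now=None):
    """launch_times: {instance_id: launch datetime (tz-aware)}."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=max_age_hours)
    return [iid for iid, launched in launch_times.items() if launched < cutoff]
```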

Nimbo fails to connect via SSH / Port 22 while terminal session succeeds

Describe the bug
I'm using nimbo run "python script.py" and the script fails at "Something went wrong while connecting to the instance."
In the AWS EC2 cloud console I can see the instance up and running, plus I can connect using a separate terminal session using ssh -i {SAME_SSH_KEY_AS_IN_NIMBO_CONFIG}.pem ubuntu@{IP}.

Expected behavior
Nimbo should succeed or fail depending on the actual state of the SSH / port 22 connection.

Configuration

cloud_provider: AWS

# Data paths
local_datasets_path: data  # relative to project root
local_results_path: results    # relative to project root
s3_datasets_path: s3://xyz/data
s3_results_path: s3://xyz/results

# Device, environment and regions
aws_profile: default
region_name: eu-west-1
instance_type: g4dn.2xlarge
spot: yes

image: ubuntu18-latest-drivers
disk_size: 80  # In GB
conda_env: conda-env.yml  # denotes project root

# Job options
run_in_background: no
persist: no  # whether instance persists when the job finishes or on error

# Permissions and credentials
security_group: default
instance_key: key.pem  # can be an absolute path
role: NimboFullS3AccessRole

ssh_timeout: 450

To Reproduce
Try running a python script using nimbo. Connect to IP address via separate terminal session.

nimbo ssh <instance_id> connects to the instance as well.

Any hints how to fix this or how to disable the initial up check?
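To separate a network-level problem from something in nimbo's own SSH handling, a plain TCP probe of port 22 can help. This is not nimbo's internal check, just a diagnostic sketch:

```python
import socket

# Diagnostic sketch (not nimbo's own connectivity check): a raw TCP
# probe of the SSH port. If this returns True but nimbo still reports
# "Something went wrong while connecting", the problem is likely in the
# SSH layer rather than the network/security group.
def port_open(host, port=22, timeout=3.0):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```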

Allow passing config parameters as environment variables

It would be useful to be able to pass certain parameters as environment variables. For example, suppose I want to commit a config file, but each user of the repo has a different key. With the current setup this wouldn't work, because everyone has different entries for this field:

instance_key: /home/andre/andre-key.pem

A similar issue would arise when running a CI job, for example.

One possible solution is to be able to pass this parameter (and perhaps others) as an environment variable. So something like this:

instance_key: $NIMBO_KEY

Cheers and keep up the great work.
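The stdlib already covers this kind of substitution via os.path.expandvars; a sketch of applying it to config values (how this would hook into nimbo's actual config loading is an assumption):

```python
import os

# Sketch: expand $VARS in string config values before validation, so an
# entry like `instance_key: $NIMBO_KEY` resolves per-user. Integrating
# this into nimbo's real config loader is assumed, not shown.
def expand_config_vars(config):
    return {
        k: os.path.expandvars(v) if isinstance(v, str) else v
        for k, v in config.items()
    }
```

Note that os.path.expandvars leaves unset variables untouched, which makes a missing variable easy to surface during validation.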

Improve printing

Right now the way information is output is:

  • Inconsistent
  • Unrefined

In addition, Nimbo depends on click and awscli, which in turn depend on colorama. Right now we are also using rich; investigate whether it is worth reducing our dependencies.

Group CLI help options

At the moment, auxiliary commands like list-gpu-prices are mixed in with main commands like ssh in the CLI help. Group CLI help options by category.

For example:

Utilities:
  list-gpu-prices
  list-spot-gpu-prices
  allow-current-ip

s3:
  ls
  pull
  push

Improve testing

Right now testing is minimal, and the quality of the tests is not fantastic.

  1. Introduce --dry-run tests for every CLI option.
  2. Write more end-to-end tests.

Would be desirable to aim for 100% test coverage.

Current IP addition improvements

At the moment, when adding the current IP to a security group, a /16 CIDR block is used; this should be a config option, and it should probably default to /32.

Nimbo should also attempt to add the current IP to the current security group automatically, without the user having to run allow-current-ip group_name, to reduce the number of actions the user has to perform.
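The stdlib ipaddress module can derive the rule's CIDR from the caller's IP with a configurable prefix length, as a sketch (the IP below is a documentation address):

```python
import ipaddress

# Sketch: build the CIDR block to authorize from the current IP, with
# the prefix length as a config option (defaulting to /32 as proposed).
# strict=False lets ipaddress zero out the host bits for wider prefixes.
def cidr_for_ip(ip, prefix=32):
    return str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False))
```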

issue creating spot instances

Hi! Thanks for this nice tool.

I'm having an issue with bringing up spot instances.

In nimbo-config.yml if spot is set to no, it works as expected, but setting to yes gives me this error:

Exception: {'Code': 'bad-parameters', 'Message': 'Your Spot request failed due to bad parameters. 
Spot request cannot be fulfilled due to invalid availability zone.', 
'UpdateTime': datetime.datetime(2021, 5, 12, 19, 8, 50, tzinfo=tzutc())}

I have tried on both ubuntu 20.04 and macos and meet the same error. On my linux box I tried both installing nimbo with pip and from source and met this availability zone error, on macos I've only tried installing nimbo with pip.

Is this a known issue? How can it be resolved?

Thanks
