EMR Studio Samples

This repository contains a sample script and sample AWS CloudFormation templates for the Amazon EMR Studio preview. You can use these samples to create EMR Studios in AWS Organizations member accounts. For more information about using EMR Studio, see Use EMR Studio in the Amazon EMR Management Guide.

You can submit feedback and requests for changes by opening an issue in this repo or by making proposed changes and submitting a pull request.

Creating an EMR Studio using the demo script

⚠️ WARNING
Charges accrue for the AWS resources that the demo script provisions, such as the Amazon VPC, subnets, and the AWS Service Catalog portfolio in the AWS CloudFormation stack.
  1. Make sure you have your AWS credentials configured. For more information, see Configuring the AWS CLI. The IAM principal needs at least the Minimum Studio Admin permissions and the permissions in AdditionalPermissionForDemoScript.json in this repo.
  2. Make sure your AWS CLI version is awscli-1.18.184 or later (version 1), or awscli-2.1.4 or later (version 2).
  3. Clusters created inside EMR Studio use the default EMR resources: the EMR_EC2_DefaultRole and EMR_DefaultRole roles, and the S3 bucket for logging EMR steps (e.g. s3://aws-logs-123456789012-us-east-1/elasticmapreduce/). Make sure they exist. If your account has never created an EMR cluster, these default resources will be missing; the easiest way to bootstrap them is to create an EMR cluster using the console.
  4. Clone this repository, or download create_demo_studio_with_dependencies.sh, using one of the following commands:
    • Clone: git clone https://github.com/aws-samples/emr-studio-samples.git
    • Download: curl https://raw.githubusercontent.com/aws-samples/emr-studio-samples/main/create_demo_studio_with_dependencies.sh --output create_demo_studio_with_dependencies.sh
  5. In the terminal, navigate to the directory where you saved create_demo_studio_with_dependencies.sh
  6. Run: bash create_demo_studio_with_dependencies.sh
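The prerequisites in steps 2 and 3 can be checked before running the script. The following is a minimal sketch, not part of the repository: the helper names (`version_ge`, `check_cli_version`, `missing_default_roles`) are hypothetical, and it assumes the version string has the usual three dot-separated numeric parts.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helpers; none of these names come from the repo.

# Returns 0 when version $1 >= version $2 (three numeric parts, dot-separated).
version_ge() {
  local IFS=.
  local -a a=($1) b=($2)   # split "2.1.4" into (2 1 4)
  local i x y
  for ((i = 0; i < 3; i++)); do
    x=${a[i]:-0}
    y=${b[i]:-0}
    if ((10#$x > 10#$y)); then return 0; fi
    if ((10#$x < 10#$y)); then return 1; fi
  done
  return 0
}

# Returns 0 when the given AWS CLI version meets the minimum from step 2:
# 1.18.184 for version 1, 2.1.4 for version 2.
check_cli_version() {
  case "$1" in
    1.*) version_ge "$1" "1.18.184" ;;
    2.*) version_ge "$1" "2.1.4" ;;
    *) return 1 ;;
  esac
}

# Prints the name of each default EMR role that cannot be found.
missing_default_roles() {
  local role
  for role in EMR_EC2_DefaultRole EMR_DefaultRole; do
    aws iam get-role --role-name "$role" >/dev/null 2>&1 || echo "$role"
  done
}

# Example usage (commented out so that sourcing this file calls nothing):
#   cli_version=$(aws --version 2>&1 | sed -E 's#^aws-cli/([0-9.]+).*#\1#')
#   check_cli_version "$cli_version" || echo "AWS CLI $cli_version is too old"
#   missing_default_roles
```

If `missing_default_roles` prints anything, `aws emr create-default-roles` is an alternative way to create the two roles without launching a cluster. It does not create the logging bucket, so that still needs to be checked separately.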

Creating an EMR Studio using your own S3 bucket, VPC and cluster templates

⚠️ WARNING
Make sure your VPC and subnets carry the required tag: key = "for-use-with-amazon-emr-managed-policies", value = "true". Also update the S3 resource in the service role policy to point to your own S3 bucket.

If you prefer to use an existing S3 bucket, VPC, private subnets (with NAT), and Service Catalog products, use min_studio_dependencies.yml to create a minimal resource stack for your Studio. This stack contains only the resources needed to create an EMR Studio: one service role, one user role, three example session policies, and two security groups.
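The tag required in the warning above can be applied with `aws ec2 create-tags`. The sketch below only prints the command so you can review it before running it; `emr_tag_command` is a hypothetical helper name and the resource IDs in the usage comment are placeholders.

```shell
#!/usr/bin/env bash
# Tag key and value required on the VPC and subnets (taken from the warning).
EMR_TAG_KEY="for-use-with-amazon-emr-managed-policies"
EMR_TAG_VALUE="true"

# Prints (does not run) the create-tags command for the given resource IDs.
# Hypothetical helper; pass your own VPC and subnet IDs as arguments.
emr_tag_command() {
  printf 'aws ec2 create-tags --resources %s --tags Key=%s,Value=%s\n' \
    "$*" "$EMR_TAG_KEY" "$EMR_TAG_VALUE"
}

# Example usage with placeholder IDs:
#   emr_tag_command "$your_vpc" "$your_subnet_1" "$your_subnet_2"
```

Once the printed command looks right, paste it into your terminal to run it.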

  1. If you did not clone the repository, download min_studio_dependencies.yml to your local machine using the following command: curl https://raw.githubusercontent.com/aws-samples/emr-studio-samples/main/min_studio_dependencies.yml -o min_studio_dependencies.yml
  2. Create a new CloudFormation stack from min_studio_dependencies.yml using the AWS Management Console or the AWS CLI. (Provide your VPC ID for the stack parameter VPC.)
  3. Remove the egress rule of EngineSecurityGroup. (Unfortunately, CloudFormation does not support creating a security group with no egress rules.)
  4. Note down the CloudFormation stack outputs: EMRStudioServiceRoleArn, EMRStudioUserRoleArn, EngineSecurityGroup, and WorkspaceSecurityGroup.
  5. Run:
aws emr create-studio --region $region \
--name $studio_name \
--auth-mode SSO \
--vpc-id $your_vpc \
--subnet-ids $your_subnet_1 $your_subnet_2 \
--service-role $service_role \
--user-role $user_role \
--workspace-security-group-id $workspace_sg \
--engine-security-group-id $engine_sg \
--default-s3-location s3://$your_s3_bucket
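Instead of copying the stack outputs from step 4 by hand, you can read them with `aws cloudformation describe-stacks`. This is a minimal sketch under stated assumptions: `stack_output` and `load_studio_vars` are hypothetical helper names, the stack name is whatever you chose in step 2, and the variable names match the create-studio command above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not part of the repository.

# Prints the value of one output of a CloudFormation stack.
# $1 = stack name, $2 = output key
stack_output() {
  aws cloudformation describe-stacks \
    --stack-name "$1" \
    --query "Stacks[0].Outputs[?OutputKey=='$2'].OutputValue" \
    --output text
}

# Sets the shell variables that the create-studio command expects.
# $1 = stack name
load_studio_vars() {
  local stack=$1
  service_role=$(stack_output "$stack" EMRStudioServiceRoleArn)
  user_role=$(stack_output "$stack" EMRStudioUserRoleArn)
  engine_sg=$(stack_output "$stack" EngineSecurityGroup)
  workspace_sg=$(stack_output "$stack" WorkspaceSecurityGroup)
}

# Example usage (stack name is a placeholder):
#   load_studio_vars my-studio-dependencies
#   echo "$service_role $user_role $engine_sg $workspace_sg"
```

The remaining variables ($region, $studio_name, $your_vpc, the subnet IDs, and $your_s3_bucket) are not stack outputs and still have to be set by hand.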

Security

See CONTRIBUTING for more information.

Copyright and License

All content in this repository, unless otherwise stated, is Copyright © Amazon Web Services, Inc. or its affiliates. All rights reserved.

The sample code within this repository is made available under the MIT-0 License. See the LICENSE file.

emr-studio-samples's Issues

Full Studio Dependencies CloudFormation Template - Wrong S3 Permissions

The full_studio_dependencies.yml template creates an S3 bucket whose name matches:

emr-studio-setup-emrstudiostoragebucket-*

The EMRStudioServiceRole grants permissions to a bucket whose name matches:

emr-studio-dependencies-emrstudiostoragebucket-*

You cannot create a new Studio after running this CloudFormation template. You get an error indicating the service role does not have permissions to the S3 bucket.
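The mismatch can be illustrated with a plain wildcard comparison. This is only a sketch: it uses bash glob matching as a stand-in for the `*` wildcard in the policy's resource pattern, `bucket_matches` is a hypothetical helper, and the bucket suffix is made up.

```shell
#!/usr/bin/env bash
# Returns 0 when bucket name $1 matches wildcard pattern $2.
bucket_matches() {
  # The unquoted right-hand side makes [[ == ]] treat $2 as a glob pattern.
  [[ $1 == $2 ]]
}

created="emr-studio-setup-emrstudiostoragebucket-example"   # suffix is made up
granted="emr-studio-dependencies-emrstudiostoragebucket-*"

if bucket_matches "$created" "$granted"; then
  echo "service role policy covers the bucket"
else
  echo "service role policy does not cover the bucket"
fi
```

Because the prefixes differ (`emr-studio-setup` versus `emr-studio-dependencies`), the comparison fails, which is consistent with the permissions error described above.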
