aws-samples / amazon-sagemaker-personalized-generative-ai

This project simplifies building personalized generative AI SaaS applications. It fine-tunes pre-trained models for individual users, runs on single GPUs, and keeps inference responsive in real time. The base model is a txt2img Stable Diffusion model from SageMaker JumpStart. The main challenges are traffic spikes, low latency, and cost efficiency. The goal is efficient, user-centric AI solutions.

Home Page: https://aws.amazon.com/blogs/machine-learning/architect-personalized-generative-ai-saas-applications-on-amazon-sagemaker/

License: MIT No Attribution

Languages: Python 52.54%, Jupyter Notebook 47.46%

Topics: generative-ai, aws, cdk, poetry, python, stable-diffusion, text2image

Architect personalized generative AI SaaS applications on Amazon SageMaker

This project enables fine-tuning and serving hyper-personalized generative AI models at scale on AWS. It addresses the needs of SaaS providers and B2C startups that want to scale quickly. The proposed architecture uses Amazon SageMaker to streamline model fine-tuning and deployment, which enables faster development, improved service quality, and cost-effectiveness. For real-time hosting it uses Multi-model Endpoints (MMEs), which provide a scalable, low-latency, and cost-effective way to deploy thousands of deep learning models behind a single endpoint. For more details, refer to the blog post: https://aws.amazon.com/blogs/machine-learning/architect-personalized-generative-ai-saas-applications-on-amazon-sagemaker/

Setup Requirements

  • Node 18+
  • Install CDK with npm: npm install -g aws-cdk
  • Install Poetry: https://python-poetry.org/docs/#installation

Install Poetry on Linux, macOS, or Windows (WSL):

curl -sSL https://install.python-poetry.org | python3 -

Install dependencies with Poetry:

poetry install

Set up the Python environment in your shell:

poetry shell

You can now synthesize the CloudFormation template for this code:

cdk synth

To add dependencies, for example other CDK libraries, run poetry add yourpackage.

Useful Commands

  • cdk ls: lists all stacks in the app
  • cdk synth: emits the synthesized CloudFormation template
  • cdk deploy: deploys this stack to your default AWS account/region
  • cdk diff: compares the deployed stack with the current state
  • cdk docs: opens the CDK documentation

Architecture

The architecture targets generative AI use cases, using personalized text-to-image generation with Stable Diffusion v2-1 as the example. Its key components are:

  • SageMaker Training and Hosting APIs: These APIs provide fully managed training jobs and model deployment capabilities. They enable fast-moving teams to concentrate more on product features and differentiation. SageMaker Training jobs, which follow a "launch-and-forget" paradigm, are suitable for transient concurrent model fine-tuning jobs during user onboarding.

  • GPU-Enabled Hosting: SageMaker supports GPU-enabled hosting options for deploying deep learning models at scale. This includes the integration of NVIDIA Triton Inference Server into the SageMaker ecosystem. SageMaker also offers GPU support for Multi-model Endpoints (MMEs), which allow the deployment of thousands of deep learning models behind a single endpoint, ensuring scalability, low latency, and cost-effectiveness.

  • Infrastructure Level: The architecture relies on best-in-class compute options, such as the G5 instance type, equipped with NVIDIA A10G Tensor Core GPUs (unique to AWS). This instance type offers a favorable price-performance ratio for both model training and hosting, delivering efficient compute power per dollar spent.
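
As a rough illustration of how a Multi-model Endpoint selects one fine-tuned model per request, the sketch below builds the arguments for the SageMaker Runtime invoke_endpoint call, where TargetModel names the model artifact to load. The per-user artifact naming scheme ("<user_id>.tar.gz"), the JSON payload shape, and the function names are assumptions made for this example, not taken from the repository; a Triton-backed endpoint may expect a different request body.

```python
import json


def build_mme_request(endpoint_name: str, user_id: str, prompt: str) -> dict:
    """Build keyword arguments for SageMaker Runtime's invoke_endpoint.

    Assumptions for this sketch: each user's fine-tuned model is stored as
    "<user_id>.tar.gz", and the model accepts a JSON body with a "prompt" key.
    """
    return {
        "EndpointName": endpoint_name,
        # TargetModel tells the multi-model endpoint which artifact to serve.
        "TargetModel": f"{user_id}.tar.gz",
        "ContentType": "application/json",
        "Body": json.dumps({"prompt": prompt}).encode("utf-8"),
    }


def generate_image(runtime_client, endpoint_name: str, user_id: str, prompt: str) -> bytes:
    """Invoke the endpoint with the given client and return the raw response body."""
    response = runtime_client.invoke_endpoint(
        **build_mme_request(endpoint_name, user_id, prompt)
    )
    return response["Body"].read()
```

With boto3 installed and AWS credentials configured, runtime_client would typically be boto3.client("sagemaker-runtime"). Passing the client in as a parameter keeps the sketch easy to exercise with a stub.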

The architecture is particularly well-suited for text-to-image generation use cases. It divides the solution workflow into two major phases:

  • Phase A (User Onboarding): In this phase, users can request the creation of one or more custom, fine-tuned models. They can check the availability status of their models at any time to learn when training has finished.

  • Phase B (On-Demand Inference): After fine-tuning, the model is ready for on-demand real-time image generation by end-users.
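
The availability check in Phase A could be sketched as a lookup of the SageMaker training job status. This is a minimal sketch, assuming one training job per fine-tuned model; the availability labels ("training", "ready", "failed", "unknown") and the function name are hypothetical and not taken from the repository.

```python
# Maps SageMaker "TrainingJobStatus" values to user-facing availability labels.
# The labels on the right-hand side are assumptions made for this sketch.
_STATUS_TO_AVAILABILITY = {
    "InProgress": "training",
    "Stopping": "training",
    "Completed": "ready",
    "Failed": "failed",
    "Stopped": "failed",
}


def model_availability(sagemaker_client, training_job_name: str) -> str:
    """Return an availability label for the model produced by a training job."""
    response = sagemaker_client.describe_training_job(
        TrainingJobName=training_job_name
    )
    return _STATUS_TO_AVAILABILITY.get(response["TrainingJobStatus"], "unknown")
```

Here sagemaker_client would typically be boto3.client("sagemaker"). Once the label is "ready", the model can be served for on-demand inference in Phase B.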

How to Call the API Gateway with Postman

To interact with your API Gateway deployed using AWS CDK, follow these steps:

  1. Open Postman and import the collection from the documentation folder.

  2. Set the request method (e.g., GET, POST) and enter the API Gateway URL endpoint.

  3. If your API requires authentication, configure the necessary headers or tokens.

  4. Add any required request parameters or data.

  5. Click "Send" to make the request and receive the response.

    Note: Ensure that your AWS resources and API Gateway are correctly configured to handle the request.
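
The same request can also be made from a script instead of Postman. The following is a minimal sketch using only the Python standard library; the base URL, resource path, and payload are hypothetical placeholders, and the x-api-key header applies only if the deployed API is configured to require an API key.

```python
import json
import urllib.request


def build_api_request(base_url, path, payload=None, api_key=None):
    """Build a urllib Request for the API: POST if a payload is given, else GET."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    headers = {"Accept": "application/json"}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if api_key is not None:
        # Only needed when the API Gateway method requires an API key.
        headers["x-api-key"] = api_key
    method = "POST" if payload is not None else "GET"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def send(request, timeout=30.0):
    """Send the request and decode a JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, build_api_request("https://example.execute-api.us-east-1.amazonaws.com/prod", "/models", payload={"prompt": "a castle"}) builds a POST request against a hypothetical /models resource; replace the URL and path with the values from the stack outputs and the Postman collection.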

Contributors

  • joseanavarrom
