Comments (4)

wong-a commented on May 28, 2024

Thanks for the feedback @humanzz

Are you using the data science SDK today?

The aws-stepfunctions-tasks module supports constructs to create Task states for SageMaker CreateTransformJob and CreateTrainingJob. Besides other SageMaker APIs (aws/aws-cdk#6572), which features of this SDK would you like to see in CDK as constructs?

from aws-step-functions-data-science-sdk-python.

shivlaks commented on May 28, 2024

The CDK provides L2 constructs for several more of the SageMaker APIs including:

  • CreateTrainingJob
  • CreateTransformJob
  • CreateEndpoint
  • CreateEndpointConfig
  • CreateModel
  • UpdateEndpoint.

L2 constructs are intended to model and simplify assembly of a State Machine definition so that it can be deployed through CloudFormation via cdk deploy in the CDK CLI. You can also author your state machine in the language of your choice (TypeScript, JavaScript, C#, Python, Java, Go). The CDK also lets you create these resources outside of your state machine through the aws-sagemaker module, again via CloudFormation.
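For context, here is a minimal sketch of the kind of State Machine definition both tools end up producing, written as a plain Python dict in Amazon States Language rather than with the CDK or the Data Science SDK. The helper name `sagemaker_training_state` and every argument value (job name, image URI, role ARN, bucket, instance type) are placeholders assumed for illustration; only the Task resource ARN and the CreateTrainingJob parameter names come from the Step Functions and SageMaker APIs.

```python
import json


def sagemaker_training_state(job_name, image_uri, role_arn, output_s3_uri):
    """Build a Task state that starts a SageMaker training job.

    The ".sync" suffix on the resource ARN asks Step Functions to wait
    for the job to finish before moving on. This helper is hypothetical
    and only assembles a dict; it does not call AWS.
    """
    return {
        "Type": "Task",
        "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
        "Parameters": {
            "TrainingJobName": job_name,
            "AlgorithmSpecification": {
                "TrainingImage": image_uri,
                "TrainingInputMode": "File",
            },
            "RoleArn": role_arn,
            "OutputDataConfig": {"S3OutputPath": output_s3_uri},
            # Instance settings below are illustrative assumptions.
            "ResourceConfig": {
                "InstanceCount": 1,
                "InstanceType": "ml.m5.xlarge",
                "VolumeSizeInGB": 10,
            },
            "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
        },
        "End": True,
    }


# A one-state workflow; all values are placeholders.
definition = {
    "Comment": "Single-step training pipeline (illustrative)",
    "StartAt": "Train",
    "States": {
        "Train": sagemaker_training_state(
            job_name="example-training-job",
            image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/example:latest",
            role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
            output_s3_uri="s3://example-bucket/output",
        )
    },
}

# JSON document that could be supplied as a state machine definition.
definition_json = json.dumps(definition, indent=2)
```

With the CDK, a definition of this shape is synthesized into a CloudFormation template; with the Data Science SDK, it is sent to the Step Functions APIs from Python.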

The Data Science SDK offers capabilities in Python by talking to Step Functions APIs instead of deploying to CloudFormation. This includes several utilities geared towards simplifying usage in Jupyter Notebooks, letting you visualize the workflow, poll its execution status, etc. These are unlocked by being able to call Step Functions APIs.
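As a rough illustration of what "poll its execution status" involves at the API level, the sketch below loops on the Step Functions DescribeExecution call until the execution leaves the running state. This is not the Data Science SDK's own implementation; `wait_for_execution`, `poll_seconds`, and `max_polls` are hypothetical names, and the client is assumed to expose a boto3-style `describe_execution(executionArn=...)` method.

```python
import time

# Statuses after which an execution will not change on its own.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "TIMED_OUT", "ABORTED"}


def wait_for_execution(sfn_client, execution_arn, poll_seconds=5.0,
                       max_polls=120, sleep=time.sleep):
    """Poll an execution until it reaches a terminal status.

    sfn_client is assumed to behave like a boto3 Step Functions client.
    The sleep argument is injectable so the loop can be exercised
    without real waiting.
    """
    for _ in range(max_polls):
        response = sfn_client.describe_execution(executionArn=execution_arn)
        status = response["status"]
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(
        f"Execution {execution_arn} still running after {max_polls} polls"
    )
```

In a notebook, the client would typically be `boto3.client("stepfunctions")` and the ARN would come from the response to starting the execution.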

Question: Is there something actionable in this repository that we could be providing to improve the data science sdk experience? @humanzz

If there are features that are missing in the CDK towards the creation of a state machine definition, they should probably be created as issues and tracked in the aws-cdk repository.

If there are other areas to explore, I suggest converting this issue into a discussion.

humanzz commented on May 28, 2024

Thanks for the response @shivlaks.

I believe when I opened this issue, SageMaker/StepFunctions support in CDK was very limited. My team saw value in modeling our ML pipelines in CDK, and this SDK seemed to do something very similar.

As you say, this library probably serves a different, though similar, use case from the one I had in mind.

Since this is in Python and works with notebooks, it can fit easily into earlier experimentation phases, iterating to reach a final pipeline.

In our case, we chose CDK to represent those final pipelines, since they'll be used many times. But to be honest, the cycle between experimenting and reaching a final pipeline is a bit more cumbersome than if using the SDK here (it requires infrastructure changes with CDK and Python changes for SageMaker).

So to answer your question: nothing actionable required but appreciate the added context.

shivlaks commented on May 28, 2024

thank you for sharing your journey so far @humanzz 🙏

"the cycle between experimenting and reaching a final pipeline is a bit more cumbersome than if using the SDK here"

👀 It does still sound like we can smooth out the developer experience and go further in either the CDK or the data science SDK. If you have any thoughts or ideas for this repo or the CDK, I encourage you to open a feature request, which will help start the conversation around how we can bridge gaps. We want developers to have options and flexibility by providing solutions that simplify their use cases.

Aside:
I believe the aws-stepfunctions and aws-stepfunctions-tasks modules were also in their early stages, as they were in experimental stability when this issue was created (I was on the CDK team at that time). These modules are stable now, although the SageMaker APIs offered in stepfunctions-tasks don't quite leverage SageMaker L2s as input types, as they are largely modeled within the sfn tasks module.


The Step Functions team will be looking to improve the developer experience in the data science SDK, and we will also monitor issues for the CDK modules to weigh in and open PRs to contribute where we can.

I'm resolving this issue for now as there is nothing actionable required at this time. Feel free to re-open if you have any unresolved questions.
