This folder contains an Airflow plugin for training ML models on a schedule using the Bedrock platform.
- Generate an API token on Bedrock UI
- Copy `bedrock_plugin.py` to `airflow/plugins/`
    - For self-hosted Airflow, the destination is in your home directory
    - For Google Cloud Composer, the `airflow` directory points to a GCS bucket
- Create a connection:
    - Open the Airflow web server
    - Go to Admin > Connections
    - Click on "Create" at the top
    - Fill in the connection details as below, leaving all other fields empty

      ```
      Conn Id: bedrock
      Conn Type: HTTP
      Host: api.bdrk.ai
      Schema: https
      Extra: {"X-Bedrock-Access-Token": "<your generated token here>"}
      ```
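If you prefer to script this step, the same connection can also be created programmatically. The snippet below is a minimal sketch, assuming Airflow 1.10.x with an initialized metadata database; substitute your real token for the placeholder.

```python
# Minimal sketch: create the "bedrock" HTTP connection from Python instead of
# the web UI. Assumes Airflow 1.10.x and an initialized metadata database.
import json

from airflow import settings
from airflow.models import Connection

conn = Connection(
    conn_id="bedrock",
    conn_type="http",
    host="api.bdrk.ai",
    schema="https",
    extra=json.dumps({"X-Bedrock-Access-Token": "<your generated token here>"}),
)

session = settings.Session()
# Only insert the connection if it does not already exist
if not session.query(Connection).filter(Connection.conn_id == conn.conn_id).first():
    session.add(conn)
    session.commit()
session.close()
```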
We recommend creating a new DAG for each training pipeline so that each model can be trained on its own schedule. To get started, create your first training pipeline on Bedrock UI and take note of its public id (the last portion of the pipeline's URL).
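For orientation, a per-pipeline DAG built around `RunPipelineOperator` might look roughly like the sketch below. The operator's import path and parameter names here are assumptions, not the plugin's documented interface; `examples/bedrock_dag.py` in this folder shows the real one.

```python
# Rough sketch of a per-pipeline training DAG. The RunPipelineOperator import
# path and parameter names are assumptions; see examples/bedrock_dag.py for
# the actual interface shipped with the plugin.
from datetime import datetime

from airflow import DAG
# Airflow exposes plugin operators under airflow.operators.<plugin name>;
# this assumes the plugin registers itself as "bedrock_plugin".
from airflow.operators.bedrock_plugin import RunPipelineOperator

with DAG(
    dag_id="churn_prediction_training",  # hypothetical name; one DAG per pipeline
    start_date=datetime(2019, 5, 21),
    schedule_interval="@weekly",         # this model's own training schedule
    catchup=False,
) as dag:
    RunPipelineOperator(
        task_id="create",
        conn_id="bedrock",  # the connection created above (assumed parameter name)
        pipeline_public_id="churn-prediction-123456",  # last portion of the pipeline's URL
        environment_public_id="<obtained from dropdown list in run pipeline page>",
    )
```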
- Copy `examples/bedrock_dag.py` to `airflow/dags/`
- Create variables for `RunPipelineOperator` to match your new pipeline (see the sketch at the end of this section for setting these from Python)

  ```
  {
      "pipeline_public_id": "churn-prediction-123456",
      "environment_public_id": "<obtained from dropdown list in run pipeline page>"
  }
  ```
- Run `bedrock_dag` from the Airflow UI or from the command line

  ```
  $> airflow test bedrock_dag create 2019-05-21
  ```
- [Optional] Log in to https://bedrock.basis-ai.com to verify that your training pipeline run has completed
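If you keep the two IDs above as Airflow Variables (Admin > Variables) rather than editing them directly into the DAG file, they can also be registered from Python. This is a sketch under that assumption; the key names simply mirror the JSON shown earlier.

```python
# Sketch: store the pipeline and environment IDs as Airflow Variables so a DAG
# can read them with Variable.get() instead of hardcoding them. Assumes the
# JSON shown earlier is meant to map to individual Airflow Variables.
from airflow.models import Variable

Variable.set("pipeline_public_id", "churn-prediction-123456")
Variable.set("environment_public_id", "<obtained from dropdown list in run pipeline page>")

# A DAG file can then look them up, e.g.:
#   pipeline_id = Variable.get("pipeline_public_id")
```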