Data Service for ADAS and ADS Development

Overview

This is an example of a data service typically used in advanced driver assistance systems (ADAS) and automated driving systems (ADS) development. The data service is composed of the following modular AWS services: Amazon Elastic Kubernetes Service (EKS), Amazon Managed Streaming for Apache Kafka (MSK), Amazon Redshift, Amazon FSx, Amazon Elastic File System (EFS), AWS Batch, AWS Step Functions, AWS Secrets Manager, AWS Glue, AWS Fargate, Amazon Elastic Container Registry (ECR), Amazon Virtual Private Cloud (VPC), Amazon EC2, and Amazon S3.

The typical use case addressed by this data service is to serve sensor data for a specified drive scene of interest, either in batch mode as a single rosbag file, or in streaming mode as an ordered series of rosbag files, where each rosbag file contains the drive scene data for a single time step (1 second by default). Each rosbag file is dynamically composed from the drive scene data stored in Amazon S3, using the metadata stored in Amazon Redshift.

Key concepts

The data service runs in Kubernetes Pods in an Amazon EKS cluster configured to use Horizontal Pod Autoscaler and EKS Cluster Autoscaler.

An Amazon Managed Streaming for Apache Kafka (MSK) cluster provides the communication channel between the data service and the client. The data service implements a request-response paradigm over Apache Kafka topics. However, the response data is not sent back over the Kafka topics. Instead, the data service stages the response data in Amazon S3, Amazon FSx, or Amazon EFS, as specified in the data client request, and the location of the response data is sent back to the data client over the Kafka topics. The data client directly reads the response data from its staged location.

Data client request

Concretely, imagine the data client wants to request drive scene data in rosbag file format from the A2D2 autonomous driving dataset for vehicle id a2d2, drive scene id 20190401145936, starting at timestamp 1554121593909500 (microseconds), and stopping at timestamp 1554122334971448 (microseconds). The data client wants the response to include data only from the front-left camera in sensor_msgs/Image ROS data type, and the front-left lidar in sensor_msgs/PointCloud2 ROS data type. The data client wants the response data to be streamed back, chunked into a series of rosbag files, each file spanning 1000000 microseconds of the drive scene. Finally, the data client wants the response rosbag files to be stored on a shared Amazon FSx file system.

The data client can encode such a data request using the JSON object shown below, and send it to the Kafka bootstrap servers b-1.msk-cluster-1:9092,b-2.msk-cluster-1:9092 on the Apache Kafka topic named a2d2:

{
	"servers": "b-1.msk-cluster-1:9092,b-2.msk-cluster-1:9092",
	"requests": [{
		"kafka_topic": "a2d2", 
		"vehicle_id": "a2d2",
		"scene_id": "20190401145936",
		"sensor_id": ["lidar/front_left", "camera/front_left"],
		"start_ts": 1554121593909500, 
		"stop_ts": 1554122334971448,
		"ros_topic": {"lidar/front_left": "/a2d2/lidar/front_left", 
				"camera/front_left": "/a2d2/camera/front_left"},
		"data_type": {"lidar/front_left": "sensor_msgs/PointCloud2",
				"camera/front_left": "sensor_msgs/Image"},
		"step": 1000000,
		"accept": "fsx/multipart/rosbag",
		"preview": false
	}]
}

For a detailed description of each request field shown in the example above, see data request fields below.
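In this repository, a request JSON like the one above is stored in a data client config file and submitted with the included Python data client (used later in this tutorial under Run the data service client). A minimal sketch, assuming the working directory is the repository root on the graphics desktop, and using one of the example configs shipped with the repository:

	# Submit a data request using the repository's Python data client; the config
	# file contains the request JSON (c-config-ex1.json is an example config)
	python ./a2d2/src/data_client.py --config ./a2d2/config/c-config-ex1.json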

Tutorial step-by-step guide

Overview

In this tutorial, we use the A2D2 autonomous driving dataset. The high-level outline of this tutorial is as follows:

  • Prerequisites
  • Configure the data service
  • Prepare the A2D2 data
  • Run the data service
  • Run the data service client

Prerequisites

This tutorial assumes you have an AWS Account, and you have Administrator job function access to the AWS Management Console.

To get started:

  • Select your AWS Region. The AWS Regions supported by this project include us-east-1, us-east-2, us-west-2, eu-west-1, eu-central-1, ap-southeast-1, ap-southeast-2, ap-northeast-1, ap-northeast-2, and ap-south-1. The A2D2 dataset used in this tutorial is stored in eu-central-1.
  • Subscribe to Ubuntu Pro 18.04 LTS and Ubuntu Pro 20.04 LTS.
  • If you do not already have an Amazon EC2 key pair, create a new Amazon EC2 key pair. You will need the key pair name to specify the KeyName parameter when creating the AWS CloudFormation stack below.
  • You will need an Amazon S3 bucket. If you don't have one, create a new Amazon S3 bucket in the selected AWS region. You will use the S3 bucket name to specify the S3Bucket parameter in the stack. The bucket is used to store the A2D2 data.
  • Use the public internet address of your laptop as the base value for the CIDR you specify in the DesktopRemoteAccessCIDR parameter of the CloudFormation stack you will create below (a quick way to find it is shown after this list).
  • For all passwords used in this tutorial, we recommend using strong passwords, following the best practices recommended for the AWS root account user password.
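A quick way to find the public IP address of your laptop for the DesktopRemoteAccessCIDR parameter is sketched below; checkip.amazonaws.com is an AWS-hosted service that echoes your public IP, and the /32 suffix limits access to that single address:

	# Print your public IP as a /32 CIDR block for DesktopRemoteAccessCIDR
	echo "$(curl -s https://checkip.amazonaws.com)/32"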

Configure the data service

Create AWS CloudFormation Stack

The AWS CloudFormation template cfn/mozart.yml in this repository creates AWS Identity and Access Management (IAM) resources, so when you create the CloudFormation Stack using the console, in the Review step, you must check I acknowledge that AWS CloudFormation might create IAM resources.

Create a new AWS CloudFormation stack using the cfn/mozart.yml template. The stack input parameters you must specify are described below:

  • KeyPairName: This is a required parameter whereby you select the Amazon EC2 key pair name used for SSH access to the desktop. You must have access to the selected key pair's private key to connect to your desktop.
  • RedshiftMasterUserPassword: This is a required parameter whereby you specify the Redshift database master user password.
  • DesktopRemoteAccessCIDR: This is a required parameter whereby you specify the public IP CIDR range from where you need remote access to your graphics desktop, e.g. 1.2.3.4/32, or 7.8.0.0/16.
  • DesktopInstanceType: This is a required parameter whereby you select an Amazon EC2 instance type for the ROS desktop. The default value, g3s.xlarge, may not be available in your selected region, in which case we recommend you try g4dn.xlarge, or one of the other instance types.
  • S3Bucket: This is a required parameter whereby you specify the name of the Amazon S3 bucket used to store your data. The S3 bucket must already exist.

For all other stack input parameters, default values are recommended during the first walkthrough. See the complete list of all the template input parameters below.
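This tutorial creates the stack in the AWS CloudFormation console. If you prefer the AWS CLI, the sketch below shows an equivalent command with placeholder parameter values (run from the repository root; if the template exceeds the CLI template-body size limit, upload it to S3 and pass --template-url instead):

	# Sketch: create the CloudFormation stack from the AWS CLI (placeholder values)
	aws cloudformation create-stack \
	    --stack-name mozart \
	    --template-body file://cfn/mozart.yml \
	    --capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM \
	    --parameters \
	        ParameterKey=KeyPairName,ParameterValue=my-key-pair \
	        ParameterKey=RedshiftMasterUserPassword,ParameterValue='<strong-password>' \
	        ParameterKey=DesktopRemoteAccessCIDR,ParameterValue=1.2.3.4/32 \
	        ParameterKey=S3Bucket,ParameterValue=my-s3-bucket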

Key AWS resources

The key resources in the CloudFormation stack are listed below:

  • A ROS desktop EC2 instance (default type g3s.xlarge)
  • An Amazon EKS cluster with 2 managed node group nodes (default type r5n.8xlarge)
  • An Amazon MSK cluster with 3 broker nodes (default type kafka.m5.2xlarge)
  • An Amazon Redshift cluster with 3 nodes (default type ra3.4xlarge)
  • An Amazon FSx for Lustre file-system (default size 7,200 GiB)
  • An Amazon EFS file-system

Connect to the graphics desktop using SSH

  • Once the stack status in the CloudFormation console is CREATE_COMPLETE, find the desktop instance launched in your stack in the Amazon EC2 console, and connect to the instance using SSH as user ubuntu, using your SSH key pair (an example command is shown after this list).
  • When you connect to the desktop using SSH and see the message "Cloud init in progress. Machine will REBOOT after cloud init is complete!!", disconnect and try again after about 20 minutes. The desktop installs the NICE DCV server on first-time startup, and reboots after the install is complete.
  • If you see the message NICE DCV server is enabled!, run the command sudo passwd ubuntu to set a new password for user ubuntu. Now you are ready to connect to the desktop using the NICE DCV client.
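For example, assuming your key pair file is my-key-pair.pem (a placeholder) and the public IP or DNS name of the desktop instance is taken from the EC2 console:

	# Connect to the graphics desktop over SSH (placeholder key path and address)
	ssh -i ~/my-key-pair.pem ubuntu@<desktop-public-ip>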

Connect to the graphics desktop using NICE DCV Client

  • Download and install the NICE DCV client on your laptop.
  • Use the NICE DCV Client to log in to the desktop as user ubuntu.
  • When you first log in to the desktop using the NICE DCV client, you will be asked if you would like to upgrade the OS version. Do not upgrade the OS version.

Now you are ready to proceed with the following steps. For all the commands in this tutorial, we assume the working directory to be ~/amazon-eks-autonomous-driving-data-service on the graphics desktop.

Configure EKS cluster access

In this step, you need AWS credentials for programmatic access for the IAM user or role that you used to create the AWS CloudFormation stack above. You must not use the AWS credentials of a different IAM user or role. The AWS credentials are used once to enable EKS cluster access from the ROS desktop, and are automatically removed at the end of this step.

If you used an IAM role to create the CloudFormation stack above, you must manually configure the credentials associated with the IAM role in the ~/.aws/credentials file with the following fields:

	aws_access_key_id=
	aws_secret_access_key=
	aws_session_token=

If you used an IAM user to create the stack, you do not have to manually configure the credentials in ~/.aws/credentials file.

In the working directory, run the command:

	./scripts/configure-eks-auth.sh

Upon successful execution of this command, you should see AWS Credentials Removed.

Setup EKS cluster environment

To set up the EKS cluster environment, in the working directory, run the command:

	./scripts/setup-dev.sh

This step also builds and pushes the data service container image into Amazon ECR.

Prepare the A2D2 data

Before we can run the A2D2 data service, we need to extract the raw A2D2 data into your S3 bucket, extract the metadata from the raw data, and upload the metadata into the Redshift cluster. We execute these three steps using an AWS Step Functions state machine. To create and execute the AWS Step Functions state machine, execute the following command in the working directory:

	./scripts/a2d2-etl-steps.sh

Note the executionArn of the state machine execution in the output of the previous command. To check the status of the execution, use the following command, replacing executionArn below with your value:

	aws stepfunctions describe-execution --execution-arn executionArn

You can also see the status of the state machine execution in the Step Functions console.

The state machine execution time depends on many variable factors and may take anywhere from 12 to 24 hours, or possibly longer. You can see the status and the CloudWatch logs for each AWS Batch job spawned by the state machine execution in the AWS Batch console. If a Batch job fails, it is automatically retried.
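Because the execution can run for many hours, you may prefer to poll the status from a terminal instead of re-running the command by hand. A minimal sketch, replacing executionArn with your value:

	# Poll the state machine status every 10 minutes until it is no longer RUNNING
	while [ "$(aws stepfunctions describe-execution --execution-arn executionArn \
	    --query 'status' --output text)" = "RUNNING" ]; do
	    date; sleep 600
	done
	aws stepfunctions describe-execution --execution-arn executionArn --query 'status'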

Run the data service

For best performance, first preload the A2D2 data from S3 to FSx (see Preload A2D2 data from S3 to FSx below). For a quick preview of the data service, you may proceed directly with the step below.

The data service is deployed using a Helm chart, and runs as a Kubernetes deployment in the EKS cluster. To start the data service, execute the following command in the working directory:

	helm install --debug a2d2-data-service ./a2d2/charts/a2d2-data-service/

To verify that the a2d2-data-service deployment is running, execute the command:

	kubectl get pods -n a2d2
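If a pod is not in the Running status, or you want to watch the service start up, you can inspect the pod logs; replace <pod-name> with a name taken from the kubectl get pods output above:

	# Inspect the logs of a data service pod
	kubectl logs <pod-name> -n a2d2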

Raw data input source and response data staging

The data service can be configured to read raw data from S3, FSx (see Preload A2D2 data from S3 to FSx), or EFS (see Preload A2D2 data from S3 to EFS). Similarly, the data client can use the accept field in the request to specify that the response data be staged on S3, FSx, or EFS. The raw data input source is fixed when the data service is deployed. However, the response data staging option is specified in the data client request, and is therefore dynamic.

Below is the Helm chart configuration in values.yaml for various raw input data source options, with recommended Kubernetes resource requests for pod memory and cpu:

  • fsx (default): a2d2.requests.memory: "72Gi", a2d2.requests.cpu: "8000m", configMap.data_store.input: "fsx"
  • efs: a2d2.requests.memory: "32Gi", a2d2.requests.cpu: "1000m", configMap.data_store.input: "efs"
  • s3: a2d2.requests.memory: "8Gi", a2d2.requests.cpu: "1000m", configMap.data_store.input: "s3"

For matching response data staging options in the data client request, see the requests.accept field in data request fields. The recommended response data staging option for the fsx raw data source is "accept": "fsx/multipart/rosbag", for the efs raw data source it is "accept": "efs/multipart/rosbag", and for the s3 raw data source it is "accept": "s3/multipart/rosbag".
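As an illustration, to deploy the data service with s3 as the raw input data source, the values.yaml keys listed above could be overridden on the Helm command line; this is a sketch, and you can equivalently edit values.yaml directly before running helm install:

	# Sketch: override the raw input data source and matching resource requests
	helm install --debug a2d2-data-service ./a2d2/charts/a2d2-data-service/ \
	    --set configMap.data_store.input=s3 \
	    --set a2d2.requests.memory=8Gi \
	    --set a2d2.requests.cpu=1000m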

Run the data service client

To visualize the response data requested by the A2D2 data client, we will use the rviz tool on the graphics desktop. Open a terminal on the desktop, and run rviz. In the rviz tool, use File>Open Config to select /home/ubuntu/amazon-eks-autonomous-driving-data-service/a2d2/config/a2d2.rviz as the rviz config. You should see the rviz tool configured with two areas, one for visualizing image data, and the other for visualizing point cloud data.

To run the data client, execute the following command in the working directory:

	python ./a2d2/src/data_client.py \
		--config ./a2d2/config/c-config-ex1.json 1>/tmp/a.out 2>&1 & 

After a brief delay, you should be able to preview the response data in the rviz tool. To preview data from a different drive scene, execute:

	python ./a2d2/src/data_client.py \
		--config ./a2d2/config/c-config-ex2.json 1>/tmp/a.out 2>&1 & 

You can set "preview": false in the data client config file, and run the above command to view the complete response, but before you do that, we recommend that for best performance, preload A2D2 data from S3 to FSx.

It is important that you do not run multiple data client instances on the ROS desktop concurrently. This is because the response data is played back on ROS topics, and there is only one ROS server running on the ROS desktop. Wait for a data client instance to exit, before you start another instance. Aborting the data client manually does not stop the data service pod from producing the response.

Hard reset of the data service

This step is for reference purposes. If at any time you need to do a hard reset of the data service, you can do so by executing:

	helm delete a2d2-data-service

This will delete all data service EKS pods immediately. All in-flight service responses will be aborted. Because the connection between the data client and the data service is asynchronous, the data clients may wait indefinitely, and you may need to clean up the data client processes manually on the ROS desktop using operating system tools, as sketched below. Note, each data client instance spawns multiple Python processes.
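A minimal sketch of cleaning up leftover data client processes on the ROS desktop, assuming no unrelated processes match the data_client.py pattern:

	# List, and then terminate, any leftover data client processes
	pgrep -af data_client.py
	pkill -f data_client.py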

Deleting the AWS CloudFormation stack

When you no longer need the data service, you may delete the AWS CloudFormation stack from the AWS CloudFormation console. Deleting the stack will terminate the desktop instance, and delete the EFS and FSx for Lustre file-systems created in the stack. The Amazon S3 bucket is not deleted.

Reference

Data client request fields

Below, we explain the semantics of the various fields in the data client request JSON object.

  • servers: The servers identify the AWS MSK Kafka cluster endpoint.
  • delay: The delay specifies the delay in seconds that the data client waits before sending the request. Default value is 0.
  • use_time: (Optional) The use_time specifies whether to use the received time, or the header time, when playing back the received messages. Default value is received.
  • requests: The JSON document sent by the client to the data service must include an array of one or more data requests for drive scene data.
  • requests.kafka_topic: The kafka_topic specifies the Kafka topic on which the data request is sent from the client to the data service. The data service listens on this topic.
  • requests.vehicle_id: The vehicle_id is used to identify the relevant drive scene dataset.
  • requests.scene_id: The scene_id identifies the drive scene of interest, which in this example is 20190401145936, a string representing the date and time of the drive scene, but in general it could be any unique value.
  • requests.start_ts: The start_ts (microseconds) specifies the start timestamp for the drive scene data request.
  • requests.stop_ts: The stop_ts (microseconds) specifies the stop timestamp for the drive scene data request.
  • requests.ros_topic: The ros_topic is a dictionary mapping sensor ids in the vehicle to ROS topics.
  • requests.data_type: The data_type is a dictionary mapping sensor ids to ROS data types.
  • requests.step: The step is the discrete time interval (microseconds) used to discretize the timespan between start_ts and stop_ts. If the requests.accept value contains multipart, the data service responds with a rosbag file for each discrete step: see possible values below.
  • requests.accept: The accept specifies the response data staging format acceptable to the client: see possible values below.
  • requests.image: (Optional) The value undistorted undistorts the camera image. Undistorting an image slows down the image frame rate. Default value is the original distorted image.
  • requests.lidar_view: (Optional) The value vehicle transforms lidar points to the vehicle frame of reference view. Default value is camera.
  • requests.preview: If the preview field is set to true, the data service returns requested data over a single time step starting from start_ts, and ignores stop_ts.

Possible requests.accept field values

Below, we describe the possible values for the requests.accept field:

  • fsx/multipart/rosbag: Stage response data on the shared Amazon FSx file-system as discretized rosbag chunks.
  • efs/multipart/rosbag: Stage response data on the shared Amazon EFS file-system as discretized rosbag chunks.
  • s3/multipart/rosbag: Stage response data on Amazon S3 as discretized rosbag chunks.
  • fsx/singlepart/rosbag: Stage response data on the shared Amazon FSx file-system as a single rosbag.
  • efs/singlepart/rosbag: Stage response data on the shared Amazon EFS file-system as a single rosbag.
  • s3/singlepart/rosbag: Stage response data on Amazon S3 as a single rosbag.
  • manifest: Return a manifest of metadata containing S3 paths to the raw data. The manifest is returned directly over the Kafka response topic, and is not staged.

AWS CloudFormation template input parameters

This repository provides an AWS CloudFormation template that is used to create the required stack.

Below, we describe the AWS CloudFormation template input parameters. Desktop below refers to the NICE DCV enabled high-performance graphics desktop that acts as the data service client in this tutorial.

  • DesktopInstanceType: This is a required parameter whereby you select an Amazon EC2 instance type for the desktop running in the AWS cloud. Default value is g3s.xlarge.
  • DesktopEbsVolumeSize: This is a required parameter whereby you specify the size of the root EBS volume (default size is 200 GB) on the desktop. Typically, the default size is sufficient.
  • DesktopEbsVolumeType: This is a required parameter whereby you select the EBS volume type (default is gp3).
  • DesktopHasPublicIpAddress: This is a required parameter whereby you select whether a public IP address is associated with the desktop. Default value is true.
  • DesktopRemoteAccessCIDR: This parameter specifies the public IP CIDR range from where you need remote access to your client desktop, e.g. 1.2.3.4/32, or 7.8.0.0/16.
  • EKSEncryptSecrets: This is a required parameter whereby you select whether encryption of EKS secrets is Enabled. Default value is Enabled.
  • EKSEncryptSecretsKmsKeyArn: This is an optional advanced parameter whereby you specify the AWS KMS key ARN used to encrypt EKS secrets. Leave blank to create a new KMS key.
  • EKSNodeGroupInstanceType: This is a required parameter whereby you select the EKS node group EC2 instance type. Default value is r5n.8xlarge.
  • EKSNodeVolumeSizeGiB: This is a required parameter whereby you specify the EKS node group instance EBS volume size. Default value is 200 GiB.
  • EKSNodeGroupMinSize: This is a required parameter whereby you specify the EKS node group minimum size. Default value is 1 node.
  • EKSNodeGroupMaxSize: This is a required parameter whereby you specify the EKS node group maximum size. Default value is 8 nodes.
  • EKSNodeGroupDesiredSize: This is a required parameter whereby you specify the EKS node group initial desired size. Default value is 2 nodes.
  • FargateComputeType: This is a required parameter whereby you specify the Fargate compute environment type. Allowed values are FARGATE_SPOT and FARGATE. Default value is FARGATE_SPOT.
  • FargateComputeMax: This is a required parameter whereby you specify the maximum size of the Fargate compute environment in vCPUs. Default value is 1024.
  • FSxStorageCapacityGiB: This is a required parameter whereby you specify the FSx storage capacity, which must be in multiples of 2400 GiB. Default value is 7200 GiB.
  • FSxS3ImportPrefix: This is an optional advanced parameter whereby you specify the FSx S3 bucket path prefix for importing data from the S3 bucket. Leave blank to import the complete bucket.
  • KeyPairName: This is a required parameter whereby you select the Amazon EC2 key pair name used for SSH access to the desktop. You must have access to the selected key pair's private key to connect to your desktop.
  • KubectlVersion: This is a required parameter whereby you specify the EKS kubectl version. Default value is 1.21.2/2021-07-05.
  • KubernetesVersion: This is a required parameter whereby you specify the EKS cluster version. Default value is 1.21.
  • MSKBrokerNodeType: This is a required parameter whereby you specify the type of node to be provisioned for the MSK brokers.
  • MSKNumberOfNodes: This is a required parameter whereby you specify the number of MSK broker nodes, which must be >= 2.
  • PrivateSubnet1CIDR: This is a required parameter whereby you specify the private subnet 1 CIDR within the VPC CIDR. Default value is 172.30.64.0/18.
  • PrivateSubnet2CIDR: This is a required parameter whereby you specify the private subnet 2 CIDR within the VPC CIDR. Default value is 172.30.128.0/18.
  • PrivateSubnet3CIDR: This is a required parameter whereby you specify the private subnet 3 CIDR within the VPC CIDR. Default value is 172.30.192.0/18.
  • PublicSubnet1CIDR: This is a required parameter whereby you specify the public subnet 1 CIDR within the VPC CIDR. Default value is 172.30.0.0/24.
  • PublicSubnet2CIDR: This is a required parameter whereby you specify the public subnet 2 CIDR within the VPC CIDR. Default value is 172.30.1.0/24.
  • PublicSubnet3CIDR: This is a required parameter whereby you specify the public subnet 3 CIDR within the VPC CIDR. Default value is 172.30.2.0/24.
  • RedshiftDatabaseName: This is a required parameter whereby you specify the name of the Redshift database. Default value is mozart.
  • RedshiftMasterUsername: This is a required parameter whereby you specify the Redshift master user name. Default value is admin.
  • RedshiftMasterUserPassword: This is a required parameter whereby you specify the Redshift master user password.
  • RedshiftNodeType: This is a required parameter whereby you specify the type of node to be provisioned for the Redshift cluster. Default value is ra3.4xlarge.
  • RedshiftNumberOfNodes: This is a required parameter whereby you specify the number of compute nodes in the Redshift cluster, which must be >= 2.
  • RedshiftPortNumber: This is a required parameter whereby you specify the port number on which the Redshift cluster accepts incoming connections. Default value is 5439.
  • RosVersion: This is a required parameter whereby you specify the version of ROS. The supported versions are melodic on Ubuntu Bionic, and noetic on Ubuntu Focal. Default value is noetic.
  • S3Bucket: This is a required parameter whereby you specify the name of the Amazon S3 bucket used to store your data.
  • UbuntuAMI: This is an optional advanced parameter whereby you specify the Ubuntu AMI (18.04 or 20.04).
  • VpcCIDR: This is a required parameter whereby you specify the Amazon VPC CIDR for the VPC created in the stack. Default value is 172.30.0.0/16. If you change this value, all the subnet parameters above may need to be set as well.

Preload A2D2 data from S3 to FSx

This step can be executed anytime after the "Configure the data service" step has been completed.

Amazon FSx for Lustre automatically lazy loads data from the configured S3 bucket. Therefore, this step is strictly a performance optimization step. However, for maximal performance, it is highly recommended. Execute the following command to start preloading data from your S3 bucket to the FSx file-system:

	kubectl apply -n a2d2 -f a2d2/fsx/stage-data-a2d2.yaml

To check if the step is complete, execute:

	kubectl get pods stage-fsx-a2d2 -n a2d2

If the pod is still Running, the step has not yet completed. This step takes approximately 5 hours to complete.
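Instead of re-running the command above by hand, you can poll the pod phase until the staging job finishes; a minimal sketch using kubectl's jsonpath output (the pod phase changes from Running to Succeeded on completion):

	# Poll the staging pod every 10 minutes until it is no longer Running
	while [ "$(kubectl get pods stage-fsx-a2d2 -n a2d2 \
	    -o jsonpath='{.status.phase}')" = "Running" ]; do
	    date; sleep 600
	done
	kubectl get pods stage-fsx-a2d2 -n a2d2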

Preload A2D2 data from S3 to EFS

This step can be executed anytime after the "Configure the data service" step has been completed.

This step is required only if you plan to configure the data service to use EFS as the raw data input source; otherwise, it may be safely skipped. Execute the following command to start preloading data from your S3 bucket to the EFS file-system:

	kubectl apply -n a2d2 -f a2d2/efs/stage-data-a2d2.yaml

To check if the step is complete, execute:

	kubectl get pods stage-efs-a2d2 -n a2d2

If the pod is still Running, the step has not yet completed. This step takes approximately 6.5 hours to complete.
