{"id":1868,"date":"2022-03-01T17:50:05","date_gmt":"2022-03-01T17:50:05","guid":{"rendered":"https:\/\/salarydistribution.com\/machine-learning\/2022\/03\/01\/anomaly-detection-with-amazon-sagemaker-edge-manager-using-aws-iot-greengrass-v2\/"},"modified":"2022-03-01T17:50:05","modified_gmt":"2022-03-01T17:50:05","slug":"anomaly-detection-with-amazon-sagemaker-edge-manager-using-aws-iot-greengrass-v2","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2022\/03\/01\/anomaly-detection-with-amazon-sagemaker-edge-manager-using-aws-iot-greengrass-v2\/","title":{"rendered":"Anomaly detection with Amazon SageMaker Edge Manager using AWS IoT Greengrass V2"},"content":{"rendered":"<div id=\"\">\n<p>Deploying and managing machine learning (ML) models at the edge requires a different set of tools and skillsets as compared to the cloud. This is primarily due to the hardware, software, and networking restrictions at the edge sites. This makes deploying and managing these models more complex. An increasing number of applications, such as industrial automation, autonomous vehicles, and automated checkouts, require ML models that run on devices at the edge so predictions can be made in real time when new data is available.<\/p>\n<p>Another common challenge you may face when dealing with computing applications at the edge is how to efficiently manage the fleet of devices at scale. This includes installing applications, deploying application updates, deploying new configurations, monitoring device performance, troubleshooting devices, authenticating and authorizing devices, and securing the data transmission. 
These are foundational features for any edge application, but creating the infrastructure needed to achieve a secure and scalable solution requires a lot of effort and time.<\/p>\n<p>On a smaller scale, you can log in to each device manually to run scripts, use automation tools such as <a href=\"https:\/\/www.ansible.com\/\" target=\"_blank\" rel=\"noopener noreferrer\">Ansible<\/a>, or build custom applications that rely on services such as <a href=\"https:\/\/aws.amazon.com\/iot-core\/\" target=\"_blank\" rel=\"noopener noreferrer\">AWS IoT Core<\/a>. Although custom applications can provide the necessary scalability and reliability, building them comes at the cost of additional maintenance and requires specialized skills.<\/p>\n<p><a href=\"https:\/\/aws.amazon.com\/sagemaker\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon SageMaker<\/a>, together with <a href=\"https:\/\/aws.amazon.com\/greengrass\/\" target=\"_blank\" rel=\"noopener noreferrer\">AWS IoT Greengrass<\/a>, can help you overcome these challenges.<\/p>\n<p>SageMaker provides <a href=\"https:\/\/aws.amazon.com\/sagemaker\/neo\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon SageMaker Neo<\/a>, which is the easiest way to optimize ML models for edge devices, enabling you to train ML models one time in the cloud and run them on any device. As devices proliferate, you may have thousands of deployed models running across your fleets. 
<a href=\"https:\/\/aws.amazon.com\/sagemaker\/edge-manager\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon SageMaker Edge Manager<\/a> allows you to optimize, secure, monitor, and maintain ML models on fleets of smart cameras, robots, personal computers, and mobile devices.<\/p>\n<p>This post shows how to train and deploy an anomaly detection ML model to a simulated fleet of wind turbines at the edge using features of SageMaker and <a href=\"https:\/\/docs.aws.amazon.com\/greengrass\/v2\/developerguide\/greengrass-v2-whats-new.html\" target=\"_blank\" rel=\"noopener noreferrer\">AWS IoT Greengrass V2<\/a>. It builds on <a href=\"https:\/\/aws.amazon.com\/blogs\/machine-learning\/monitor-and-manage-anomaly-detection-models-on-a-fleet-of-wind-turbines-with-amazon-sagemaker-edge-manager\/\" target=\"_blank\" rel=\"noopener noreferrer\">Monitor and Manage Anomaly Detection Models on a fleet of Wind Turbines with Amazon SageMaker Edge Manager<\/a> by introducing AWS IoT Greengrass to deploy and manage the inference application and the model on the edge devices.<\/p>\n<p>In the previous post, the author used custom code relying on AWS IoT services, such as <a href=\"https:\/\/aws.amazon.com\/iot-core\/\" target=\"_blank\" rel=\"noopener noreferrer\">AWS IoT Core<\/a> and <a href=\"https:\/\/aws.amazon.com\/iot-device-management\/\" target=\"_blank\" rel=\"noopener noreferrer\">AWS IoT Device Management<\/a>, to provide remote management capabilities for the fleet of devices. 
Although that is a valid approach, developers need to spend a lot of time and effort to implement and maintain such solutions, time they could otherwise spend on solving the business problem of providing efficient, performant, and accurate anomaly detection logic for the wind turbines.<\/p>\n<p>The previous post also used a real 3D-printed mini wind turbine and a <a href=\"https:\/\/www.nvidia.com\/de-de\/autonomous-machines\/embedded-systems\/jetson-nano\/\" target=\"_blank\" rel=\"noopener noreferrer\">Jetson Nano<\/a> to act as the edge device running the application. Here, we use virtual wind turbines that run as Python threads within a SageMaker notebook. Also, instead of a Jetson Nano, we use <a href=\"http:\/\/aws.amazon.com\/ec2\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Elastic Compute Cloud<\/a> (Amazon EC2) instances to act as edge devices, running AWS IoT Greengrass software and the application. We also run a simulator to generate measurements for the wind turbines, which are sent to the edge devices using <a href=\"https:\/\/docs.aws.amazon.com\/iot\/latest\/developerguide\/mqtt.html\" target=\"_blank\" rel=\"noopener noreferrer\">MQTT<\/a>. We also use the simulator for visualizations and for stopping or starting the turbines.<\/p>\n<p>The previous post goes into more detail about the ML aspects of the solution, such as how to build and train the model, which we don\u2019t cover here. We focus primarily on the integration of Edge Manager and AWS IoT Greengrass V2.<\/p>\n<p>Before we go any further, let\u2019s review what AWS IoT Greengrass is and the benefits of using it with Edge Manager.<\/p>\n<h2>What is AWS IoT Greengrass V2?<\/h2>\n<p>AWS IoT Greengrass is an Internet of Things (IoT) open-source edge runtime and cloud service that helps you build, deploy, and manage device software. You can use AWS IoT Greengrass for your IoT applications on millions of devices in homes, factories, vehicles, and businesses. 
AWS IoT Greengrass V2 offers an open-source edge runtime, improved modularity, new local development tools, and improved fleet deployment features. It provides a component framework that manages dependencies and allows you to reduce the size of deployments, because you can choose to deploy only the components required for the application.<\/p>\n<p>Let\u2019s go through some of the concepts of AWS IoT Greengrass to understand how it works:<\/p>\n<ul>\n<li><strong>AWS IoT Greengrass core device<\/strong> \u2013 A device that runs the AWS IoT Greengrass Core software. The device is registered into the AWS IoT Core registry as an AWS IoT thing.<\/li>\n<li><strong>AWS IoT Greengrass component<\/strong> \u2013 A software module that is deployed to and runs on a core device. All software that is developed and deployed with AWS IoT Greengrass is modeled as a component.<\/li>\n<li><strong>Deployment<\/strong> \u2013 The process to send components and apply the desired component configuration to a destination target device, which can be a single core device or a group of core devices.<\/li>\n<li><strong>AWS IoT Greengrass core software<\/strong> \u2013 The set of all AWS IoT Greengrass software that you install on a core device.<\/li>\n<\/ul>\n<p>To enable remote application management on a device (or thousands of them), we first install the core software. This software runs as a background process and listens for deployment configurations sent from the cloud.<\/p>\n<p>To run specific applications on the devices, we model the application as one or more components. 
For example, we can have a component providing a database feature, another component providing a local UX, or we can use public components provided by AWS, such as LogManager to push the component logs to <a href=\"http:\/\/aws.amazon.com\/cloudwatch\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon CloudWatch<\/a>.<\/p>\n<p>We then create a deployment containing the necessary components and their specific configuration and send it to the target devices, either on a device-by-device basis or as a fleet.<\/p>\n<p>To learn more, refer to <a href=\"https:\/\/docs.aws.amazon.com\/greengrass\/v2\/developerguide\/what-is-iot-greengrass.html\" target=\"_blank\" rel=\"noopener noreferrer\">What is AWS IoT Greengrass?<\/a><\/p>\n<h2>Why use AWS IoT Greengrass with Edge Manager?<\/h2>\n<p>The post <a href=\"https:\/\/aws.amazon.com\/blogs\/machine-learning\/monitor-and-manage-anomaly-detection-models-on-a-fleet-of-wind-turbines-with-amazon-sagemaker-edge-manager\/\" target=\"_blank\" rel=\"noopener noreferrer\">Monitor and Manage Anomaly Detection Models on a fleet of Wind Turbines with Amazon SageMaker Edge Manager<\/a> already explains why we use Edge Manager to provide the ML model runtime for the application. Here are the reasons to use AWS IoT Greengrass to deploy applications to edge devices:<\/p>\n<ul>\n<li>With AWS IoT Greengrass, you can automate the tasks needed to deploy the Edge Manager software onto the devices and manage the ML models. AWS IoT Greengrass provides a SageMaker Edge Agent as an AWS IoT Greengrass component, which provides model management and data capture APIs on the edge. Without AWS IoT Greengrass, setting up devices and fleets to use Edge Manager requires you to manually copy the Edge Manager agent from an <a href=\"https:\/\/aws.amazon.com\/s3\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Simple Storage Service<\/a> (Amazon S3) release bucket. 
The agent is used to make predictions with models loaded onto edge devices.<\/li>\n<li>With AWS IoT Greengrass and Edge Manager integration, you use AWS IoT Greengrass components. Components are pre-built software modules that can connect edge devices to AWS services or third-party services via AWS IoT Greengrass.<\/li>\n<li>The solution takes a modular approach in which the inference application, model, and any other business logic can be packaged as a component where the dependencies can also be specified. You can manage the lifecycle, updates, and reinstalls of each of the components independently rather than treat everything as a monolith.<\/li>\n<li>To make it easier to maintain <a href=\"http:\/\/aws.amazon.com\/iam\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Identity and Access Management<\/a> (IAM) roles, Edge Manager allows you to reuse the existing AWS IoT Core role alias. If it doesn\u2019t exist, Edge Manager generates a role alias as part of the Edge Manager packaging job. You no longer need to associate a role alias generated from the Edge Manager packaging job with an AWS IoT Core role. 
This simplifies the deployment process for existing AWS IoT Greengrass customers.<\/li>\n<li>You can manage the models and other components with less code and configurations because AWS IoT Greengrass takes care of provisioning, updating, and stopping the components.<\/li>\n<\/ul>\n<h2>Solution overview<\/h2>\n<p>The following diagram is the architecture implemented for the solution:<\/p>\n<p><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2022\/02\/18\/ML-6786-image001-new.png\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-33330 size-full\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2022\/02\/18\/ML-6786-image001-new.png\" alt=\"\" width=\"927\" height=\"871\"><\/a><\/p>\n<p>We can broadly divide the architecture into the following phases:<\/p>\n<ul>\n<li><strong>Model training<\/strong>\n<ul>\n<li>Prepare the data and train an anomaly detection model using <a href=\"https:\/\/aws.amazon.com\/sagemaker\/pipelines\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon SageMaker Pipelines<\/a>. SageMaker Pipelines helps orchestrate your training pipeline with your own custom code. It also outputs the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Mean_absolute_error\" target=\"_blank\" rel=\"noopener noreferrer\">Mean Absolute Error<\/a> (MAE) and other threshold values used to calculate anomalies.<\/li>\n<\/ul>\n<\/li>\n<li><strong>Compile and package the model<\/strong>\n<ul>\n<li>Compile the model using Neo, so that it can be optimized for the target hardware (in this case, an EC2 instance).<\/li>\n<li>Use the SageMaker Edge packaging job API to package the model as an AWS IoT Greengrass component. The Edge Manager API has a native integration with AWS IoT Greengrass APIs.<\/li>\n<\/ul>\n<\/li>\n<li><strong>Build and package the inference application<\/strong>\n<ul>\n<li>Build and package the inference application as an AWS IoT Greengrass component. 
This application uses the computed threshold, the model, and some custom code to accept the data coming from turbines, perform anomaly detection, and return results.<\/li>\n<\/ul>\n<\/li>\n<li><strong>Set up AWS IoT Greengrass on edge devices<\/strong>\n          <\/li>\n<li><strong>Deploy to edge devices<\/strong>\n<ul>\n<li>Deploy the following on each edge device:\n<ul>\n<li>An ML model packaged as an AWS IoT Greengrass component.<\/li>\n<li>An inference application packaged as an AWS IoT Greengrass component. This also sets up the connection to <a href=\"https:\/\/docs.aws.amazon.com\/iot\/latest\/developerguide\/mqtt.html\" target=\"_blank\" rel=\"noopener noreferrer\">AWS IoT Core MQTT<\/a>.<\/li>\n<li>The AWS-provided Edge Manager Greengrass component.<\/li>\n<li>The AWS-provided AWS IoT Greengrass CLI component (only needed for development and debugging purposes).<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li><strong>Run the end-to-end solution<\/strong>\n<ul>\n<li>Run the simulator, which generates measurements for the wind turbines and sends them to the edge devices using MQTT.<\/li>\n<li>Because the notebook and the EC2 instances running AWS IoT Greengrass are on different networks, we use AWS IoT Core to relay MQTT messages between them. 
In a real scenario, the wind turbine would send the data to the anomaly detection device using local communication, for example, an <a href=\"https:\/\/docs.aws.amazon.com\/greengrass\/v2\/developerguide\/connect-client-devices.html\" target=\"_blank\" rel=\"noopener noreferrer\">AWS IoT Greengrass MQTT broker component<\/a>.<\/li>\n<li>The inference app and model running on the anomaly detection device predict whether the received data is anomalous, and send the result to the monitoring application via MQTT through AWS IoT Core.<\/li>\n<li>The application displays the data and anomaly signal on the simulator dashboard.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>To learn more about how to deploy this solution architecture, refer to the <a href=\"https:\/\/github.com\/aws-samples\/amazon-sagemaker-edge-manager-workshop\" target=\"_blank\" rel=\"noopener noreferrer\">GitHub repository<\/a> related to this post.<\/p>\n<p>In the following sections, we go deeper into the details of how to implement this solution.<\/p>\n<h2>Dataset<\/h2>\n<p>The solution uses raw turbine data collected from real wind turbines. The dataset is provided as part of the solution. 
It has the following features:<\/p>\n<ul>\n<li><strong>nanoId<\/strong> \u2013 ID of the edge device that collected the data<\/li>\n<li><strong>turbineId<\/strong> \u2013 ID of the turbine that produced this data<\/li>\n<li><strong>arduino_timestamp<\/strong> \u2013 Timestamp of the Arduino that was operating this turbine<\/li>\n<li><strong>nanoFreemem<\/strong> \u2013 Amount of free memory in bytes<\/li>\n<li><strong>eventTime<\/strong> \u2013 Timestamp of the row<\/li>\n<li><strong>rps<\/strong> \u2013 Rotation of the rotor in rotations per second<\/li>\n<li><strong>voltage<\/strong> \u2013 Voltage produced by the generator in millivolts<\/li>\n<li><strong>qw, qx, qy, qz<\/strong> \u2013 Quaternion angular acceleration<\/li>\n<li><strong>gx, gy, gz<\/strong> \u2013 Gravity acceleration<\/li>\n<li><strong>ax, ay, az<\/strong> \u2013 Linear acceleration<\/li>\n<li><strong>gearboxtemp<\/strong> \u2013 Internal temperature<\/li>\n<li><strong>ambtemp<\/strong> \u2013 External temperature<\/li>\n<li><strong>humidity<\/strong> \u2013 Air humidity<\/li>\n<li><strong>pressure<\/strong> \u2013 Air pressure<\/li>\n<li><strong>gas<\/strong> \u2013 Air quality<\/li>\n<li><strong>wind_speed_rps<\/strong> \u2013 Wind speed in rotations per second<\/li>\n<\/ul>\n<p>For more information, refer to <a href=\"https:\/\/aws.amazon.com\/blogs\/machine-learning\/monitor-and-manage-anomaly-detection-models-on-a-fleet-of-wind-turbines-with-amazon-sagemaker-edge-manager\/\" target=\"_blank\" rel=\"noopener noreferrer\">Monitor and Manage Anomaly Detection Models on a fleet of Wind Turbines with Amazon SageMaker Edge Manager<\/a>.<\/p>\n<h2>Data preparation and training<\/h2>\n<p>The data preparation and training are performed using SageMaker Pipelines. Pipelines is the first purpose-built, easy-to-use continuous integration and continuous delivery (CI\/CD) service for ML. With Pipelines, you can create, automate, and manage end-to-end ML workflows at scale. 
Because it\u2019s purpose-built for ML, Pipelines helps automate different steps of the ML workflow, including data loading, data transformation, training and tuning, and deployment. For more information, refer to <a href=\"https:\/\/docs.aws.amazon.com\/sagemaker\/latest\/dg\/pipelines.html\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon SageMaker Model Building Pipelines<\/a>.<\/p>\n<h2>Model compilation<\/h2>\n<p>We use Neo for model compilation. It automatically optimizes ML models for inference on cloud instances and edge devices to run faster with no loss in accuracy. ML models are optimized for a target hardware platform, which can be a SageMaker hosting instance or an edge device based on processor type and capabilities, for example if there is a GPU or not. The compiler uses ML to apply the performance optimizations that extract the best available performance for your model on the cloud instance or edge device. For more information, see <a href=\"https:\/\/docs.aws.amazon.com\/sagemaker\/latest\/dg\/neo.html\" target=\"_blank\" rel=\"noopener noreferrer\">Compile and Deploy Models with Neo<\/a>.<\/p>\n<h2>Model packaging<\/h2>\n<p>To use a compiled model with Edge Manager, you first need to package it. In this step, SageMaker creates an archive consisting of the compiled model and the <a href=\"https:\/\/neo-ai-dlr.readthedocs.io\/en\/latest\/\" target=\"_blank\" rel=\"noopener noreferrer\">Neo DLR runtime<\/a> required to run it. It also signs the model for integrity verification. 
When you deploy the model via AWS IoT Greengrass, the <a href=\"https:\/\/boto3.amazonaws.com\/v1\/documentation\/api\/latest\/reference\/services\/sagemaker.html#SageMaker.Client.create_edge_packaging_job\" target=\"_blank\" rel=\"noopener noreferrer\">create_edge_packaging_job<\/a> API automatically creates an AWS IoT Greengrass component containing the model package, which is ready to be deployed to the devices.<\/p>\n<p>The following code snippet shows how to invoke this API:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-bash\">import json\nimport time\n\nmodel_version = '1.0.0'  # semantic version of the model; must be incremented for every new model\n\nmodel_name = 'WindTurbineAnomalyDetection'\nedge_packaging_job_name='wind-turbine-anomaly-%d' % int(time.time()*1000)\ncomponent_name = 'aws.samples.windturbine.model'\ncomponent_version = model_version  \n\nresp = sm_client.create_edge_packaging_job(\n    EdgePackagingJobName=edge_packaging_job_name,\n    CompilationJobName=compilation_job_name,\n    ModelName=model_name,\n    ModelVersion=model_version,\n    RoleArn=role,\n    OutputConfig={\n        'S3OutputLocation': 's3:\/\/%s\/%s\/model\/' % (bucket_name, prefix),\n        \"PresetDeploymentType\": \"GreengrassV2Component\",\n        \"PresetDeploymentConfig\": json.dumps(\n            {\"ComponentName\": component_name, \"ComponentVersion\": component_version}\n        ),\n    }\n)<\/code><\/pre>\n<\/p><\/div>\n<p>To allow the API to create an AWS IoT Greengrass component, you must provide the following additional parameters under <code>OutputConfig<\/code>:<\/p>\n<ul>\n<li>The <code>PresetDeploymentType<\/code> as <code>GreengrassV2Component<\/code><\/li>\n<li><code>PresetDeploymentConfig<\/code> to provide the <code>ComponentName<\/code> and <code>ComponentVersion<\/code> that AWS IoT Greengrass uses to publish the component<\/li>\n<li>The <code>ComponentVersion<\/code> and <code>ModelVersion<\/code> must be in <code>major.minor.patch<\/code> 
format<\/li>\n<\/ul>\n<p>The model is then published as an AWS IoT Greengrass component.<\/p>\n<h3>Create the inference application as an AWS IoT Greengrass component<\/h3>\n<p>Now we create an inference application component that we can deploy to the device. This application component loads the ML model, receives data from wind turbines, performs anomaly detection, and sends the result back to the simulator. This application can be a native application that receives the data locally on the edge devices from the turbines or any other client application over a gRPC interface.<\/p>\n<p>To create a custom AWS IoT Greengrass <a href=\"https:\/\/docs.aws.amazon.com\/greengrass\/v2\/developerguide\/getting-started.html#upload-first-component\" target=\"_blank\" rel=\"noopener noreferrer\">component<\/a>, perform the following steps:<\/p>\n<ol>\n<li>Provide the code for the application as single files or as an archive. The code needs to be uploaded to an S3 bucket in the same Region where we registered the AWS IoT Greengrass devices.<\/li>\n<li>Create a recipe file, which specifies the component\u2019s configuration parameters, component dependencies, lifecycle, and platform compatibility.<\/li>\n<\/ol>\n<p>The component lifecycle defines the commands that install, run, and shut down the component. For more information, see <a href=\"https:\/\/docs.aws.amazon.com\/greengrass\/v2\/developerguide\/component-recipe-reference.html\" target=\"_blank\" rel=\"noopener noreferrer\">AWS IoT Greengrass component recipe reference<\/a>. We can define the recipe in either JSON or YAML format. 
Because the inference application requires the model and Edge Manager agent to be available on the device, we need to declare dependencies on the ML model packaged as an AWS IoT Greengrass component and on the Edge Manager Greengrass component.<\/p>\n<ol start=\"3\">\n<li>When the recipe file is ready, create the inference component by invoking the <a href=\"https:\/\/boto3.amazonaws.com\/v1\/documentation\/api\/latest\/reference\/services\/greengrassv2.html#GreengrassV2.Client.create_component_version\" target=\"_blank\" rel=\"noopener noreferrer\">create_component_version<\/a> API. See the following code:\n<div class=\"hide-language\">\n<pre><code class=\"lang-bash\">import boto3\n\nggv2_client = boto3.client('greengrassv2')\n# Load the recipe and replace the _BUCKET_ placeholder with the bucket name\nwith open('recipes\/aws.samples.windturbine.detector-recipe.json') as f:\n    recipe = f.read()\nrecipe = recipe.replace('_BUCKET_', bucket_name)\nggv2_client.create_component_version(inlineRecipe=recipe)<\/code><\/pre>\n<\/p><\/div>\n<\/li>\n<\/ol>\n<h2>Inference application<\/h2>\n<p>The inference application connects to AWS IoT Core to receive messages from the simulated wind turbine and send the prediction results to the simulator dashboard.<\/p>\n<p>It publishes to the following topics:<\/p>\n<ul>\n<li><code>wind-turbine\/{turbine_id}\/dashboard\/update<\/code> \u2013 Updates the simulator dashboard<\/li>\n<li><code>wind-turbine\/{turbine_id}\/label\/update<\/code> \u2013 Updates the model loaded status on the simulator<\/li>\n<li><code>wind-turbine\/{turbine_id}\/anomalies<\/code> \u2013 Publishes anomaly results to the simulator dashboard<\/li>\n<\/ul>\n<p>It subscribes to the following topic:<\/p>\n<ul>\n<li><code>wind-turbine\/{turbine_id}\/raw-data<\/code> \u2013 Receives raw data from the turbine<\/li>\n<\/ul>\n<h3>Set up AWS IoT Greengrass core devices<\/h3>\n<p>Next, we need to set up the devices that run the anomaly detection application by installing the AWS IoT Greengrass core software. 
For this post, we use five EC2 instances that act as the anomaly detection devices. We use AWS CloudFormation to launch the instances. To install the AWS IoT Greengrass core software, we provide a script in the instance <code>UserData<\/code> as shown in the following code:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-bash\">      UserData:\n        Fn::Base64: !Sub \"#!\/bin\/bash\n          \n          wget -O- https:\/\/apt.corretto.aws\/corretto.key | apt-key add - \n          add-apt-repository 'deb https:\/\/apt.corretto.aws stable main'\n           \n          apt-get update; apt-get install -y java-11-amazon-corretto-jdk\n         \n          apt install unzip -y\n          apt install python3-pip -y\n          apt-get install python3.8-venv -y\n\n          ec2_region=$(curl http:\/\/169.254.169.254\/latest\/meta-data\/placement\/region)\n\n          curl -s https:\/\/d2s8p88vqu9w66.cloudfront.net\/releases\/greengrass-nucleus-latest.zip &gt; greengrass-nucleus-latest.zip  &amp;&amp; unzip greengrass-nucleus-latest.zip -d GreengrassCore\n          java -Droot=\"\/greengrass\/v2\" -Dlog.store=FILE -jar .\/GreengrassCore\/lib\/Greengrass.jar --aws-region $ec2_region  --thing-name edge-device-0 --thing-group-name ${ThingGroupName}  --tes-role-name SageMaker-WindturbinesStackTESRole --tes-role-alias-name SageMaker-WindturbinesStackTESRoleAlias  --component-default-user ggc_user:ggc_group --provision true --setup-system-service true --deploy-dev-tools true\n\n                  \"<\/code><\/pre>\n<\/p><\/div>\n<p>Each EC2 instance is associated with a single virtual wind turbine. 
In a real scenario, multiple wind turbines could also communicate with a single device to reduce solution costs.<\/p>\n<p>To learn more about how to set up AWS IoT Greengrass software on a core device, refer to <a href=\"https:\/\/docs.aws.amazon.com\/greengrass\/v2\/developerguide\/install-greengrass-core-v2.html\" target=\"_blank\" rel=\"noopener noreferrer\">Install the AWS IoT Greengrass Core software<\/a>. The complete CloudFormation template is available in the <a href=\"https:\/\/github.com\/aws-samples\/amazon-sagemaker-edge-manager-workshop\/blob\/main\/setup\/greengrass_edge_devices_ec2.yml\" target=\"_blank\" rel=\"noopener noreferrer\">GitHub repository<\/a>.<\/p>\n<h2>Create an AWS IoT Greengrass deployment<\/h2>\n<p>When the devices are up and running, we can deploy the application. We create a deployment with a configuration containing the following components:<\/p>\n<ul>\n<li>ML model<\/li>\n<li>Inference application<\/li>\n<li>Edge Manager<\/li>\n<li>AWS IoT Greengrass CLI (only needed for debugging purposes)<\/li>\n<\/ul>\n<p>For each component, we must specify the component version. We can also provide additional configuration data, if necessary. We create the deployment by invoking the <a href=\"https:\/\/boto3.amazonaws.com\/v1\/documentation\/api\/latest\/reference\/services\/greengrassv2.html#GreengrassV2.Client.create_deployment\" target=\"_blank\" rel=\"noopener noreferrer\">create_deployment<\/a> API. 
See the following code:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-bash\">ggv2_deployment = ggv2_client.create_deployment(\n    targetArn=wind_turbine_thing_group_arn,\n    deploymentName=\"Deployment for \" + project_id,\n    components={\n        \"aws.greengrass.Cli\": {\n            \"componentVersion\": \"2.5.3\"\n        },\n        \"aws.greengrass.SageMakerEdgeManager\": {\n            \"componentVersion\": \"1.1.0\",\n            \"configurationUpdate\": {\n                \"merge\": json.dumps({\"DeviceFleetName\": wind_turbine_device_fleet_name, \"BucketName\": bucket_name})\n            },\n            \"runWith\": {}\n        },\n        \"aws.samples.windturbine.detector\": {\n            \"componentVersion\": component_version\n        },\n        \"aws.samples.windturbine.model\": {\n            \"componentVersion\": component_version\n        }\n    }\n)<\/code><\/pre>\n<\/p><\/div>\n<p>The <code>targetArn<\/code> argument defines where to run the deployment. We specify the thing group ARN to deploy this configuration to all devices belonging to the thing group. The thing group was already created as part of the setup of the solution architecture.<\/p>\n<p>The <code>aws.greengrass.SageMakerEdgeManager<\/code> component is provided by AWS. At the time of writing, the latest version is 1.1.0. You need to configure this component with the SageMaker edge device fleet name and S3 bucket location. You can find these parameters on the Edge Manager console, where the fleet was created during the setup of the solution architecture.<\/p>\n<p><code>aws.samples.windturbine.detector<\/code> is the inference application component created earlier.<\/p>\n<p><code>aws.samples.windturbine.model<\/code> is the anomaly detection ML model component created earlier.<\/p>\n<h2>Run the simulator<\/h2>\n<p>Now that everything is in place, we can start the simulator. 
The simulator is run from a Python notebook and performs two tasks:<\/p>\n<ol>\n<li>Simulate the physical wind turbine and display a dashboard for each wind turbine.<\/li>\n<li>Exchange data with the devices via AWS IoT MQTT using the following topics:\n<ol type=\"a\">\n<li><code>wind-turbine\/{turbine_id}\/raw-data<\/code> \u2013 Publishes the raw turbine data.<\/li>\n<li><code>wind-turbine\/{turbine_id}\/label\/update<\/code> \u2013 Receives model loaded or not loaded status from the inference application.<\/li>\n<li><code>wind-turbine\/{turbine_id}\/anomalies<\/code> \u2013 Receives anomalies published by the inference application.<\/li>\n<li><code>wind-turbine\/{turbine_id}\/dashboard\/update<\/code> \u2013 Receives recent buffered data from the turbines.<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<p>We can use the simulator UI to start and stop the virtual wind turbine and inject noise into the <code>Volt<\/code>, <code>Rot<\/code>, and <code>Vib<\/code> measurements to simulate anomalies that are detected by the application running on the device. In the following screenshot, the simulator shows a virtual representation of five wind turbines that are currently running. We can choose <strong>Stop <\/strong>to stop any of the turbines, or choose <strong>Volt<\/strong>, <strong>Rot<\/strong>, or <strong>Vib<\/strong> to inject noise into the turbines. 
For example, if we choose <strong>Volt <\/strong>for the turbine with ID 0, the <strong>Voltage <\/strong>status changes from a green check mark to a red x, indicating that the voltage readings of the turbine are anomalous.<\/p>\n<p><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2022\/02\/17\/ML-6786-image002.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-33152\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2022\/02\/17\/ML-6786-image002.jpg\" alt=\"\" width=\"2456\" height=\"1434\"><\/a><\/p>\n<h2>Conclusion<\/h2>\n<p>Securely and reliably maintaining the lifecycle of an ML model deployed across a fleet of devices isn\u2019t an easy task. However, with Edge Manager and AWS IoT Greengrass, we can reduce the implementation effort and operational cost of such a solution. This solution increases the agility in experimenting and optimizing the ML model with full automation of the ML pipelines, from data acquisition, data preparation, model training, and model validation to deployment on the devices.<\/p>\n<p>In addition to the benefits described, Edge Manager offers further benefits, like having access to a device fleet dashboard on the Edge Manager console, which can display near-real-time health of the devices by capturing heartbeat requests. You can also use the inference data captured on the devices with <a href=\"https:\/\/docs.aws.amazon.com\/sagemaker\/latest\/dg\/model-monitor.html\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon SageMaker Model Monitor<\/a> to check for data and model quality drift issues.<\/p>\n<p>To build a solution for your own needs, get the code and artifacts from the <a href=\"https:\/\/github.com\/aws-samples\/amazon-sagemaker-edge-manager-workshop\" target=\"_blank\" rel=\"noopener noreferrer\">GitHub repo<\/a>. 
The repository shows two different ways of deploying the models:<\/p>\n<ul>\n<li>Using IoT jobs<\/li>\n<li>Using AWS IoT Greengrass (covered in this post)<\/li>\n<\/ul>\n<p>Although this post focuses on deployment using AWS IoT Greengrass, interested readers can also look at the solution using IoT jobs to better understand the differences.<\/p>\n<hr>\n<h3>About the Authors<\/h3>\n<p><strong><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/06\/07\/Vikesh-Pandey.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-25068 alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/06\/07\/Vikesh-Pandey.jpg\" alt=\"\" width=\"101\" height=\"134\"><\/a>Vikesh Pandey<\/strong> is a Machine Learning Specialist Solutions Architect at AWS, helping customers in the Nordics and wider EMEA region design and build ML solutions. Outside of work, Vikesh enjoys trying out different cuisines and playing outdoor sports.<\/p>\n<p><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2022\/02\/17\/Massimiliano-Angelino.png\"><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-33157 alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2022\/02\/17\/Massimiliano-Angelino.png\" alt=\"\" width=\"100\" height=\"133\"><\/a><strong>Massimiliano Angelino<\/strong> is Lead Architect for the EMEA Prototyping team. For the last three and a half years, he has been an IoT Specialist Solution Architect with a particular focus on edge computing, and he contributed to the launch of the AWS IoT Greengrass v2 service and its integration with Amazon SageMaker Edge Manager. 
Based in Stockholm, he enjoys skating on frozen lakes.<\/p>\n<p>       <!-- '\"` -->\n      <\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/aws.amazon.com\/blogs\/machine-learning\/anomaly-detection-with-amazon-sagemaker-edge-manager-using-aws-iot-greengrass-v2\/<\/p>\n","protected":false},"author":0,"featured_media":1869,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/1868"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=1868"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/1868\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/1869"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=1868"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=1868"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=1868"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}