{"id":728,"date":"2021-01-04T23:44:02","date_gmt":"2021-01-04T23:44:02","guid":{"rendered":"https:\/\/machine-learning.webcloning.com\/2021\/01\/04\/implementing-a-custom-labeling-gui-with-built-in-processing-logic-with-amazon-sagemaker-ground-truth\/"},"modified":"2021-01-04T23:44:02","modified_gmt":"2021-01-04T23:44:02","slug":"implementing-a-custom-labeling-gui-with-built-in-processing-logic-with-amazon-sagemaker-ground-truth","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2021\/01\/04\/implementing-a-custom-labeling-gui-with-built-in-processing-logic-with-amazon-sagemaker-ground-truth\/","title":{"rendered":"Implementing a custom labeling GUI with built-in processing logic with Amazon SageMaker Ground Truth"},"content":{"rendered":"<div id=\"\">\n<p><a href=\"https:\/\/aws.amazon.com\/sagemaker\/groundtruth\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon SageMaker Ground Truth<\/a> is a fully managed <a href=\"https:\/\/aws.amazon.com\/sagemaker\/groundtruth\/what-is-data-labeling\/\" target=\"_blank\" rel=\"noopener noreferrer\">data labeling<\/a> service that makes it easy to build highly accurate training datasets for machine learning. 
It offers easy access to <a href=\"https:\/\/www.mturk.com\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Mechanical Turk<\/a> and private human labelers, and provides them with built-in workflows and interfaces for common labeling tasks.<\/p>\n<p>A labeling team may wish to use the powerful customization features in Ground Truth to modify:<\/p>\n<ul>\n<li>The look and feel of the workers\u2019 graphical user interface (GUI)<\/li>\n<li>The backend <a href=\"https:\/\/aws.amazon.com\/lambda\/\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Lambda<\/a> functions that perform the preprocessing and postprocessing logic.<\/li>\n<\/ul>\n<p>Depending on the nature of your labeling job and your use case, your customization requirements may vary.<\/p>\n<p>In this post, via a custom workflow, I show you how to implement a text classification labeling job consisting of a custom GUI, built-in preprocessing and postprocessing logic, and encrypted output. I also provide you with an overview of the prerequisites, the code, and estimated costs of implementing the solution.<\/p>\n<h2>Understanding task types and processing logic<\/h2>\n<p>In this section, I\u2019ll discuss the use cases surrounding built-in vs custom task types and processing logic.<\/p>\n<h3>Built-in task types that implement built-in GUIs and built-in processing logic<\/h3>\n<p>Ground Truth provides several <a href=\"https:\/\/docs.aws.amazon.com\/sagemaker\/latest\/dg\/sms-task-types.html\" target=\"_blank\" rel=\"noopener noreferrer\">built-in task types<\/a>\u00a0that cover many image, text, video, video frame, and 3D point cloud labeling use cases.<\/p>\n<p>If you want to implement one of these built-in task types, along with a default labeling GUI, creating a labeling job requires no customization steps.<\/p>\n<h3>Custom task types that implement custom GUIs and custom processing logic<\/h3>\n<p>If the built-in task types don\u2019t satisfy your labeling job requirements, the options for 
customizing the GUI as well as the preprocessing and postprocessing logic are nearly endless by way of the <a href=\"https:\/\/docs.aws.amazon.com\/sagemaker\/latest\/dg\/sms-custom-templates.html\" target=\"_blank\" rel=\"noopener noreferrer\">custom labeling workflow<\/a> feature.<\/p>\n<p>With this feature, instead of choosing a built-in task type, you define the preprocessing and postprocessing logic via your own Lambda functions. You also have <a href=\"https:\/\/docs.aws.amazon.com\/sagemaker\/latest\/dg\/sms-custom-templates-step2.html\" target=\"_blank\" rel=\"noopener noreferrer\">full control over the labeling GUI<\/a> using HTML elements and the <a href=\"https:\/\/github.com\/aws-samples\/amazon-sagemaker-ground-truth-task-uis\" target=\"_blank\" rel=\"noopener noreferrer\">Liquid-based template system<\/a>. This enables you to do some really cool customization, including Angular framework integration. For more information, see <a href=\"https:\/\/aws.amazon.com\/blogs\/machine-learning\/building-a-custom-angular-application-for-labeling-jobs-with-amazon-sagemaker-ground-truth\/\" target=\"_blank\" rel=\"noopener noreferrer\">Building a custom Angular application for labeling jobs with Amazon SageMaker Ground Truth<\/a>.<\/p>\n<p>For more details on custom workflows, see <a href=\"https:\/\/docs.aws.amazon.com\/sagemaker\/latest\/dg\/sms-custom-templates.html\" target=\"_blank\" rel=\"noopener noreferrer\">Creating Custom Labeling Workflows<\/a> and <a href=\"https:\/\/aws.amazon.com\/blogs\/machine-learning\/creating-custom-labeling-jobs-with-aws-lambda-and-amazon-sagemaker-ground-truth\/\" target=\"_blank\" rel=\"noopener noreferrer\">Creating custom labeling jobs with AWS Lambda and Amazon SageMaker Ground Truth<\/a>.<\/p>\n<h3>Built-in task types that implement custom GUIs and built-in processing logic<\/h3>\n<p>So far, I\u2019ve discussed the built-in (100% out-of-the-box) option and the custom workflow (100% custom GUI and logic) option for 
running a job.<\/p>\n<p>What if you want to implement a custom GUI but keep the built-in preprocessing and postprocessing logic that the built-in task types provide? This way, you can adjust the GUI just the way you want while still relying on the latest AWS-provided preprocessing and postprocessing logic (and without having to maintain another codebase).<\/p>\n<p>You can, and I\u2019ll show you how, step by step.<\/p>\n<h2>Prerequisites<\/h2>\n<p>To complete this solution, you need to set up the following prerequisites:<\/p>\n<h3>Setting up an AWS account<\/h3>\n<p>In this post, you work directly with IAM, SageMaker, AWS KMS, and Amazon S3, so if you haven\u2019t already, <a href=\"https:\/\/aws.amazon.com\/free\/\" target=\"_blank\" rel=\"noopener noreferrer\">create an AWS account<\/a>. Following along with this post incurs AWS usage charges, so be sure to shut down and delete resources when you\u2019re finished.<\/p>\n<h3>Setting up the AWS CLI<\/h3>\n<p>Because we use some parameters not available (as of this writing) on the <a href=\"http:\/\/aws.amazon.com\/console\">AWS Management Console<\/a>, you need access to the AWS CLI. For more information, see <a href=\"https:\/\/docs.aws.amazon.com\/cli\/latest\/userguide\/cli-chap-install.html\" target=\"_blank\" rel=\"noopener noreferrer\">Installing, updating, and uninstalling the AWS CLI<\/a>.<\/p>\n<p>All Ground Truth, Amazon S3, and Lambda configurations for this post must be set up within the same Region. This post assumes you\u2019re operating all services out of the <code>us-west-2<\/code> Region. If you\u2019re operating within another Region, be sure to modify your setup so that everything stays in that one Region.<\/p>\n<h3>Setting up IAM permissions<\/h3>\n<p>If you created labeling jobs in the past with Ground Truth, you may already have the permissions needed to implement this solution. 
Those permissions include the following policies:<\/p>\n<ul>\n<li>\n<strong>AmazonSageMakerFullAccess <\/strong>\u2013 To have access to the SageMaker console and S3 buckets to perform the steps outlined in this post, you need the <a href=\"https:\/\/console.aws.amazon.com\/iam\/home?#\/policies\/arn:aws:iam::aws:policy\/AmazonSageMakerFullAccess%24jsonEditor\" target=\"_blank\" rel=\"noopener noreferrer\">AmazonSageMakerFullAccess<\/a> policy applied to the user, group, or role assumed for this post.<\/li>\n<li>\n<strong>AmazonSageMakerGroundTruthExecution <\/strong>\u2013 The Ground Truth labeling jobs you create in this post need to run with an execution role that has the <a href=\"https:\/\/console.aws.amazon.com\/iam\/home?#\/policies\/arn:aws:iam::aws:policy\/AmazonSageMakerGroundTruthExecution%24jsonEditor\" target=\"_blank\" rel=\"noopener noreferrer\">AmazonSageMakerGroundTruthExecution<\/a> policy attached.<\/li>\n<\/ul>\n<p>If you have the permissions required to create these roles yourself, the SageMaker console walks you through a wizard to set them up. If you don\u2019t have access to create these roles, ask your administrator to create them for you to use during job creation and management.<\/p>\n<h3>Setting up an S3 bucket<\/h3>\n<p>You need an S3 bucket in the <code>us-west-2<\/code> Region to host the SageMaker manifest and categories files for the labeling job. 
By default, the <a href=\"https:\/\/console.aws.amazon.com\/iam\/home?#\/policies\/arn:aws:iam::aws:policy\/AmazonSageMakerFullAccess%24jsonEditor\" target=\"_blank\" rel=\"noopener noreferrer\">AmazonSageMakerFullAccess<\/a> and <a href=\"https:\/\/console.aws.amazon.com\/iam\/home?#\/policies\/arn:aws:iam::aws:policy\/AmazonSageMakerGroundTruthExecution%24jsonEditor\" target=\"_blank\" rel=\"noopener noreferrer\">AmazonSageMakerGroundTruthExecution<\/a> policies only grant access to S3 buckets containing <code>sagemaker<\/code> or <code>groundtruth<\/code> in their name (for example, buckets named <code>my-awesome-bucket-sagemaker<\/code> or <code>marketing-groundtruth-datasets<\/code>).<\/p>\n<p>Be sure to name your bucket accordingly, or modify the policy to grant the appropriate access.<\/p>\n<p>For more information on creating a bucket, see <a href=\"https:\/\/docs.aws.amazon.com\/sagemaker\/latest\/dg\/gs-config-permissions.html\" target=\"_blank\" rel=\"noopener noreferrer\">Step 1: Create an Amazon S3 Bucket<\/a>. There is no need for public access to this bucket, so don\u2019t grant it.<\/p>\n<p>As mentioned earlier, all the Ground Truth, Amazon S3, and Lambda configurations for this solution must be in the same Region. For this post, we use <code>us-west-2<\/code>.<\/p>\n<h3>Setting up the Ground Truth work team<\/h3>\n<p>When you create a labeling job, you need to assign it to a predefined work team that works on it. If you haven\u2019t created a work team already (or want to create a specific one just for this post), see <a href=\"https:\/\/docs.aws.amazon.com\/sagemaker\/latest\/dg\/sms-workforce-management.html\" target=\"_blank\" rel=\"noopener noreferrer\">Create and Manage Workforces<\/a>.<\/p>\n<h3>Setting up AWS KMS<\/h3>\n<p>With security as job zero, make sure to encrypt the output manifest file created by the job\u2019s output. 
To do this, at job creation time, you need to reference a KMS key ID to encrypt the output of the custom Ground Truth job in your S3 bucket.<\/p>\n<p>By default, each account has an AWS managed key (<code>aws\/s3<\/code>) created automatically. For this post, you can use the key ID of the AWS managed key, or you can create and use your own customer managed key.<\/p>\n<p>For more information about creating and using keys with AWS KMS, see <a href=\"https:\/\/docs.aws.amazon.com\/kms\/latest\/developerguide\/getting-started.html\" target=\"_blank\" rel=\"noopener noreferrer\">Getting started<\/a>.<\/p>\n<h2>Estimated costs<\/h2>\n<p>Running this solution incurs costs for the following:<\/p>\n<ul>\n<li>\n<strong>Ground Truth labeling<\/strong> \u2013 Labeling costs for each job are $0.56 when using your own private workforce (other workforce types, including Mechanical Turk, may have additional costs). For more information, see <a href=\"https:\/\/aws.amazon.com\/sagemaker\/groundtruth\/pricing\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon SageMaker Ground Truth pricing<\/a>.<\/li>\n<li>\n<strong>Amazon S3 storage, retrieval, and data transfer<\/strong> \u2013 These costs are less than $0.05 (this assumes you delete all files when you\u2019re finished, and operate the solution for a day or less). For more information, see <a href=\"https:\/\/aws.amazon.com\/s3\/pricing\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon S3 pricing<\/a>.<\/li>\n<li>\n<strong>Key usage<\/strong> \u2013 The cost of an AWS managed KMS key is less than $0.02 for a day\u2019s worth of usage. Storage and usage costs for a customer managed key may be higher. 
For more information, see <a href=\"https:\/\/aws.amazon.com\/kms\/pricing\/\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Key Management Service pricing<\/a>.<\/li>\n<\/ul>\n<h2>Setting up the manifest, category, and GUI files<\/h2>\n<p>Now that you have met the prerequisites, you can create the manifest, categories, and GUI files.<\/p>\n<h3>Creating the files<\/h3>\n<p>We first create the <code>dataset.manifest<\/code> file, which we use as the input dataset for the labeling job.<\/p>\n<p>Each object in <code>dataset.manifest<\/code> contains a line of text describing a person, animal, or plant. One or more of these lines of text are presented as tasks to your workers; they\u2019re responsible for correctly identifying which of the three classifications the line of text best fits.<\/p>\n<p>For this post, <code>dataset.manifest<\/code> only has seven lines (workers can label up to seven objects), but this input dataset file could have up to 100,000 entries.<\/p>\n<p>Create a file locally named <code>dataset.manifest<\/code> that contains the following text:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-json\">{\"source\":\"His nose could detect over 1 trillion odors!\"}\r\n{\"source\":\"Why do fish live in salt water? Because pepper makes them sneeze!\"}\r\n{\"source\":\"What did the buffalo say to his son when he went away on a trip? Bison!\"}\r\n{\"source\":\"Why do plants go to therapy? To get to the roots of their problems!\"}\r\n{\"source\":\"What do you call a nervous tree? A sweaty palm!\"}\r\n{\"source\":\"Some kids in my family really like birthday cakes and stars!\"}\r\n{\"source\":\"A small portion of the human population carries a fabella bone.\"}<\/code><\/pre>\n<\/div>\n<p>Next, we create the <code>categories.json<\/code> file. 
This file is used by Ground Truth to define the categories used to label the data objects.<\/p>\n<p>Create a file locally named <code>categories.json<\/code> that contains the following code:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-json\">{\r\n    \"document-version\": \"2018-11-28\",\r\n    \"labels\": [{\r\n            \"label\": \"person\"\r\n        },\r\n        {\r\n            \"label\": \"animal\"\r\n        },\r\n        {\r\n            \"label\": \"plant\"\r\n        }\r\n    ]\r\n}<\/code><\/pre>\n<\/div>\n<p>Finally, we create the <code>worker_gui.html<\/code> file. This file, when rendered, provides the GUI for the workers\u2019 labeling tasks. The options are endless, but for this post, we create a custom GUI that adds the following custom features:<\/p>\n<ul>\n<li>An additional <strong>Submit<\/strong> button that is styled larger than the default.<\/li>\n<li>Shortcut keys for submitting and resetting the form.<\/li>\n<li>JavaScript logic to programmatically apply a CSS style (<code>word-break: break-all<\/code>) to the task text output.<\/li>\n<\/ul>\n<p>Make this custom GUI by creating a file locally named <code>worker_gui.html<\/code> containing the following code:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-html\">&lt;script src=\"https:\/\/assets.crowd.aws\/crowd-html-elements.js\"&gt;&lt;\/script&gt;\r\n\r\n&lt;crowd-form&gt;\r\n  &lt;crowd-classifier\r\n    name=\"crowd-classifier\"\r\n    categories=\"{{ task.input.labels | to_json | escape }}\"\r\n    header=\"Please classify\"\r\n  &gt;\r\n\r\n    &lt;classification-target&gt;\r\n      &lt;strong&gt;{{ task.input.taskObject }}&lt;\/strong&gt;\r\n    &lt;\/classification-target&gt;\r\n    &lt;full-instructions header=\"Full Instructions\"&gt;\r\n      &lt;div&gt;\r\n        &lt;p&gt;Based on the general subject or topic of each sentence presented, please classify it as only one of the following: person, animal, or plant. 
&lt;\/p&gt;\r\n      &lt;\/div&gt;\r\n    &lt;\/full-instructions&gt;\r\n\r\n    &lt;short-instructions&gt;\r\n      Complete tasks\r\n    &lt;\/short-instructions&gt;\r\n  &lt;\/crowd-classifier&gt;\r\n&lt;\/crowd-form&gt;\r\n\r\n&lt;script&gt;\r\n\r\n  document.addEventListener('all-crowd-elements-ready', () =&gt; {\r\n    \/\/ Creating new button to inject in label pane\r\n    const button = document.createElement('button');\r\n    button.textContent = 'Submit';\r\n    button.classList.add('awsui-button', 'awsui-button-variant-primary', 'awsui-hover-child-icons');\r\n\r\n    \/\/ Editing styling to make it larger\r\n    button.style.height = '60px';\r\n    button.style.width = '100px';\r\n    button.style.margin = '15px';\r\n\r\n    \/\/ Adding onclick for submission\r\n    const crowdForm = document.querySelector('crowd-form');\r\n    button.onclick = () =&gt; crowdForm.submit();\r\n\r\n    \/\/ Injecting\r\n    const crowdClassifier = document.querySelector('crowd-classifier').shadowRoot;\r\n    const labelPane = crowdClassifier.querySelector('.category-picker-wrapper');\r\n    labelPane.appendChild(button);\r\n\r\n    \/\/ Adding hotkeys: Enter submits the form, r resets it\r\n    document.addEventListener('keydown', e =&gt; {\r\n      if (e.key === 'Enter') {\r\n        crowdForm.submit();\r\n      }\r\n      if (e.key === 'r') {\r\n        crowdForm.reset();\r\n      }\r\n\r\n    });\r\n\r\n    \/\/ Implement break-all style in the layout to handle long text tasks\r\n    const annotationTarget = crowdClassifier.querySelector('.annotation-area.target');\r\n    annotationTarget.style.wordBreak = 'break-all';\r\n  });\r\n&lt;\/script&gt;<\/code><\/pre>\n<\/div>\n<h3>Previewing the GUI in your web browser<\/h3>\n<p>While working on the <code>worker_gui.html<\/code> file, you may find it useful to preview what you\u2019re building.<\/p>\n<p>At any time, you can open the <code>worker_gui.html<\/code> file from your local file system in your browser for a limited preview of the GUI you\u2019re 
creating. Some dynamic data, such as that provided by the Lambda preprocessing functions, may not be visible until you run the job from the job status preview page or worker portal.<\/p>\n<p>To preview with real data, you can create a custom job with Lambda functions. For instructions, see <a href=\"https:\/\/aws.amazon.com\/blogs\/machine-learning\/creating-custom-labeling-jobs-with-aws-lambda-and-amazon-sagemaker-ground-truth\/\" target=\"_blank\" rel=\"noopener noreferrer\">Creating custom labeling jobs with AWS Lambda and Amazon SageMaker Ground Truth<\/a>. You can preview live from the Ground Truth console\u2019s <strong>Create labeling job<\/strong> flow.<\/p>\n<p>For more information about the <a href=\"https:\/\/github.com\/aws-samples\/amazon-sagemaker-ground-truth-task-uis\" target=\"_blank\" rel=\"noopener noreferrer\">Liquid-based template system<\/a>, see <a href=\"https:\/\/docs.aws.amazon.com\/sagemaker\/latest\/dg\/sms-custom-templates-step2.html\" target=\"_blank\" rel=\"noopener noreferrer\">Step 2: Creating your custom labeling task template<\/a>.<\/p>\n<h3>Uploading the files to Amazon S3<\/h3>\n<p>You can now upload all three files to the root directory of your S3 bucket. When uploading these files to Amazon S3, accept all defaults. For more information, see <a href=\"https:\/\/docs.aws.amazon.com\/AmazonS3\/latest\/user-guide\/upload-objects.html\" target=\"_blank\" rel=\"noopener noreferrer\">How Do I Upload Files and Folders to an S3 Bucket?<\/a><\/p>\n<h2>Creating the custom labeling job<\/h2>\n<p>After you upload the files to Amazon S3, you can create your labeling job. For some use cases, the SageMaker console provides the needed interface for creating both built-in and custom workflows. In our use case, we use the AWS CLI because it provides additional options not yet available (as of this writing) on the SageMaker console.<\/p>\n<p>The following scripting instructions assume you\u2019re on macOS or Linux. 
If you\u2019re on Windows, you may need to modify the extension and contents of the script for it to work, depending on your environment.<\/p>\n<p>Create a file called <code>createCustom.sh<\/code> (provide your bucket name, execution role ARN, KMS key ID, and work team ARN):<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-bash\">aws sagemaker create-labeling-job \\\r\n--labeling-job-name \"$1\" \\\r\n--label-attribute-name \"aws-blog-demo\" \\\r\n--label-category-config-s3-uri \"s3:\/\/<span><strong><em>YOUR_BUCKET_NAME<\/em><\/strong><\/span>\/categories.json\" \\\r\n--role-arn \"<span><strong><em>YOUR_SAGEMAKER_GROUNDTRUTH_EXECUTION_ROLE_ARN<\/em><\/strong><\/span>\" \\\r\n--input-config '{\r\n  \"DataSource\": {\r\n    \"S3DataSource\": {\r\n      \"ManifestS3Uri\": \"s3:\/\/<span><strong><em>YOUR_BUCKET_NAME<\/em><\/strong><\/span>\/dataset.manifest\"\r\n    }\r\n  }\r\n}' \\\r\n--output-config '{\r\n        \"KmsKeyId\": \"<span><strong><em>YOUR_KMS_KEY_ID<\/em><\/strong><\/span>\",\r\n        \"S3OutputPath\": \"s3:\/\/<span><strong><em>YOUR_BUCKET_NAME<\/em><\/strong><\/span>\/output\"\r\n}' \\\r\n--human-task-config '{\r\n        \"AnnotationConsolidationConfig\": {\r\n            \"AnnotationConsolidationLambdaArn\": \"arn:aws:lambda:us-west-2:081040173940:function:ACS-TextMultiClass\"\r\n        },\r\n        \"TaskAvailabilityLifetimeInSeconds\": 21600,\r\n        \"TaskTimeLimitInSeconds\": 3600,\r\n        \"NumberOfHumanWorkersPerDataObject\": 1,\r\n        \"PreHumanTaskLambdaArn\":  \"arn:aws:lambda:us-west-2:081040173940:function:PRE-TextMultiClass\",\r\n        \"WorkteamArn\": \"<strong><em><span>YOUR_WORKTEAM_ARN<\/span><\/em><\/strong>\",\r\n        \"TaskDescription\": \"Select all labels that apply\",\r\n        \"MaxConcurrentTaskCount\": 1000,\r\n        \"TaskTitle\": \"Text classification task\",\r\n        \"UiConfig\": {\r\n            \"UiTemplateS3Uri\": 
\"s3:\/\/<span><strong><em>YOUR_BUCKET_NAME<\/em><\/strong><\/span>\/worker_gui.html\"\r\n        }\r\n    }'<\/code><\/pre>\n<\/div>\n<p>Make sure to use your work team ARN, not your workforce ARN. For your KMS key, use the key ID of the AWS managed or customer managed key you want to encrypt the output with. For instructions on retrieving your key, see <a href=\"https:\/\/docs.aws.amazon.com\/kms\/latest\/developerguide\/find-cmk-id-arn.html\" target=\"_blank\" rel=\"noopener noreferrer\">Finding the key ID and ARN<\/a>. For more information about types of KMS keys, see <a href=\"https:\/\/docs.aws.amazon.com\/kms\/latest\/developerguide\/concepts.html#master_keys\" target=\"_blank\" rel=\"noopener noreferrer\">Customer master keys (CMKs)<\/a>.<\/p>\n<p>Make the file executable via the command <code>chmod 700 createCustom.sh<\/code>.<\/p>\n<p>Almost done! But before we run the script, let\u2019s step through what this script is doing in more detail. The script runs the <code>aws sagemaker create-labeling-job<\/code> CLI command with the following parameters:<\/p>\n<ul>\n<li>\n<strong>--labeling-job-name<\/strong> \u2013 We set this value to <code>$1<\/code>, which expands to the first argument we pass on the command line when we run the script.<\/li>\n<li>\n<strong>--label-attribute-name <\/strong>\u2013 The attribute name to use for the label in the output manifest file.<\/li>\n<li>\n<strong>--label-category-config-s3-uri<\/strong> \u2013 The path to the <code>categories.json<\/code> file we previously uploaded to Amazon S3.<\/li>\n<li>\n<strong>--role-arn<\/strong> \u2013 The ARN of the IAM role SageMaker runs the job under. 
If you aren\u2019t sure what this value is, your administrator should be able to provide it to you.<\/li>\n<li>\n<strong>--input-config<\/strong> \u2013 Points to the location of the input dataset manifest file.<\/li>\n<li>\n<strong>--output-config<\/strong> \u2013 Points to a KMS key ID and the job\u2019s output path.<\/li>\n<li>\n<strong>--human-task-config <\/strong>\u2013 Provides the following parameters: <\/p>\n<ul>\n<li>\n<strong>PreHumanTaskLambdaArn<\/strong> \u2013 The built-in AWS-provided Lambda function that performs the same preprocessing logic as that found in the built-in text classification job type. It handles reading the dataset manifest file in Amazon S3, parsing it, and providing the GUI with the appropriate task data.<\/li>\n<li>\n<strong>AnnotationConsolidationLambdaArn<\/strong> \u2013 The built-in AWS-provided Lambda function that performs the same postprocessing logic as that found in the built-in text classification job type. It handles postprocessing of the data after each labeler submits an answer. As a reminder, all Ground Truth, Amazon S3, and Lambda configurations for this post must be set up within the same Region (for this post, <code>us-west-2<\/code>). For the Lambda ARNs to use in Regions other than <code>us-west-2<\/code>, see <a href=\"https:\/\/docs.aws.amazon.com\/cli\/latest\/reference\/sagemaker\/create-labeling-job.html\" target=\"_blank\" rel=\"noopener noreferrer\">create-labeling-job<\/a>.<\/li>\n<li>\n<strong>TaskAvailabilityLifetimeInSeconds<\/strong> \u2013 The length of time that a task remains available for labeling by human workers.<\/li>\n<li>\n<strong>TaskTimeLimitInSeconds<\/strong> \u2013 The amount of time that a worker has to complete a task.<\/li>\n<li>\n<strong>NumberOfHumanWorkersPerDataObject<\/strong> \u2013 The number of human workers that label an object.<\/li>\n<li>\n<strong>WorkteamArn<\/strong> \u2013 The ARN of the work team assigned to complete the tasks. 
Make sure to use your work team ARN and not your workforce ARN in the script.<\/li>\n<li>\n<strong>TaskDescription<\/strong> \u2013 A description of the task for your human workers.<\/li>\n<li>\n<strong>MaxConcurrentTaskCount<\/strong> \u2013 Defines the maximum number of data objects that can be labeled by human workers at the same time.<\/li>\n<li>\n<strong>TaskTitle<\/strong> \u2013 A title for the task for your human workers.<\/li>\n<li>\n<strong>UiTemplateS3Uri<\/strong> \u2013 The S3 bucket location of the GUI template that we uploaded earlier. This is the HTML template used to render the worker GUI for labeling job tasks.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>For more information about the options available when creating a labeling job from the AWS CLI, see <a href=\"https:\/\/docs.aws.amazon.com\/cli\/latest\/reference\/sagemaker\/create-labeling-job.html\" target=\"_blank\" rel=\"noopener noreferrer\">create-labeling-job<\/a>.<\/p>\n<h2>Running the job<\/h2>\n<p>Now that you\u2019ve created the script with all the proper parameters, it\u2019s time to run it! 
To run the script, enter <code>.\/createCustom.sh<\/code> <span><strong><em>JOBNAME<\/em><\/strong><\/span> from the command line, providing a unique name for the job.<\/p>\n<p>In my example, I named the job <code>gec-custom-template-300<\/code>, and my command line looked like the following:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-json\">gcohen $: .\/createCustom.sh gec-custom-template-300\r\n\r\n{\r\n\"LabelingJobArn\": \"arn:aws:sagemaker:us-west-2:xxyyzz:labeling-job\/gec-custom-template-300\"\r\n}<\/code><\/pre>\n<\/div>\n<h2>Checking the job status and previewing the GUI<\/h2>\n<p>Now that we\u2019ve submitted the job, we can easily check its status on the console.<\/p>\n<ol>\n<li>On the SageMaker console, under <strong>Ground Truth<\/strong>, choose <strong>Labeling jobs<\/strong>.<\/li>\n<\/ol>\n<p>You should see the job we just submitted.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-18983\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/11\/25\/Implementing-a-custom-1.jpg\" alt=\"\" width=\"800\" height=\"121\"><\/p>\n<ol start=\"2\">\n<li>Choose the job to get more details.<\/li>\n<\/ol>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-18984\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/11\/25\/Implementing-a-custom-2.jpg\" alt=\"\" width=\"800\" height=\"912\"><\/p>\n<ol start=\"3\">\n<li>Choose <strong>View labeling tool<\/strong> to preview what our labeling workers see when they take the job.<\/li>\n<\/ol>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-18985\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/11\/25\/Implementing-a-custom-3.jpg\" alt=\"\" width=\"800\" height=\"162\"><\/p>\n<p>In addition, by using AWS KMS encryption, you can specify authorized users who can 
decrypt the output manifest file. Who exactly is authorized to decrypt this file varies depending on whether the key is customer managed or AWS managed. For specifics on access permissions for a given key, review the key\u2019s <a href=\"https:\/\/docs.aws.amazon.com\/kms\/latest\/developerguide\/concepts.html#key_permissions\" target=\"_blank\" rel=\"noopener noreferrer\">key policy<\/a>.<\/p>\n<h2>Conclusion<\/h2>\n<p>In this post, I demonstrated how to implement a custom labeling GUI with built-in preprocessing and postprocessing logic by way of a custom workflow. I also demonstrated how to encrypt the output with AWS KMS. The prerequisites, code, and estimated costs of running it all were also provided.<\/p>\n<p>The code was provided to get you running quickly, but don\u2019t stop there! Try experimenting by adding additional functionality to your workers\u2019 labeling GUIs, either with your own custom libraries or third-party logic. If you get stuck, don\u2019t hesitate to reach out directly, or post an issue on our <a href=\"https:\/\/github.com\/aws-samples\/amazon-sagemaker-ground-truth-task-uis\/issues\" target=\"_blank\" rel=\"noopener noreferrer\">GitHub repo issues<\/a> page.<\/p>\n<hr>\n<h3>About the Author<\/h3>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-10862 alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/01\/29\/geremy-cohen-100.jpg\" alt=\"\" width=\"100\" height=\"132\"><strong>Geremy Cohen<\/strong>\u00a0is a Solutions Architect with AWS where he helps customers build cutting-edge, cloud-based solutions. 
In his spare time, he enjoys short walks on the beach, exploring the Bay Area with his family, fixing things around the house, breaking things around the house, and BBQing.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/aws.amazon.com\/blogs\/machine-learning\/implementing-a-custom-labeling-gui-with-built-in-processing-logic-with-amazon-sagemaker-ground-truth\/<\/p>\n","protected":false},"author":0,"featured_media":729,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/728"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=728"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/728\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/729"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=728"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=728"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=728"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}