{"id":867,"date":"2021-09-17T06:56:47","date_gmt":"2021-09-17T06:56:47","guid":{"rendered":"https:\/\/salarydistribution.com\/machine-learning\/2021\/09\/17\/introducing-pii-identification-and-redaction-in-streaming-transcriptions-using-amazon-transcribe\/"},"modified":"2021-09-17T06:56:47","modified_gmt":"2021-09-17T06:56:47","slug":"introducing-pii-identification-and-redaction-in-streaming-transcriptions-using-amazon-transcribe","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2021\/09\/17\/introducing-pii-identification-and-redaction-in-streaming-transcriptions-using-amazon-transcribe\/","title":{"rendered":"Introducing PII identification and redaction in streaming transcriptions using Amazon Transcribe"},"content":{"rendered":"<div id=\"\">\n<p><a href=\"https:\/\/aws.amazon.com\/transcribe\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Transcribe<\/a> is an automatic speech recognition (ASR) service that makes it easy for developers to add <a href=\"https:\/\/aws.amazon.com\/transcribe\/\" target=\"_blank\" rel=\"noopener noreferrer\">speech to text<\/a> capabilities to their applications. Since launching in 2017, Amazon Transcribe has added numerous features to enhance its capabilities around converting speech to text. Some of these features include automatic language detection, custom language models, vocabulary filtering, speaker identification, streaming transcriptions, and more.<\/p>\n<p>One popular use case of Amazon Transcribe is transcribing customer support calls or any customer interaction that involves voice. You can use these transcripts to record customer conversations and extract insights such as sentiment, call drivers, or agent performance. Therefore, call transcripts are a valuable dataset that is crucial to effectively addressing customer needs and improving operational performance. 
However, it\u2019s critical to ensure that the right safeguards are put in place to protect customer identity and privacy when using this data.<\/p>\n<p>To enable privacy protection, Amazon Transcribe launched <a href=\"https:\/\/docs.aws.amazon.com\/transcribe\/latest\/dg\/pii-redaction.html\" target=\"_blank\" rel=\"noopener noreferrer\">automatic redaction of personally identifiable information (PII)<\/a> in transcription jobs. Companies can use this feature to redact sensitive personal information such as credit card or Social Security numbers in their call and voice recordings. However, we heard from customers that they also want to mask such information from real-time transcription results on agent desktops. With PII redaction, supervisors can view a dashboard that highlights trends in ongoing conversations, while helping to make sure that the identity of each customer is protected.<\/p>\n<p>Today, we\u2019re excited to announce a new feature of Amazon Transcribe that can help achieve this: <a href=\"https:\/\/docs.aws.amazon.com\/transcribe\/latest\/dg\/pii-redaction-stream.html\" target=\"_blank\" rel=\"noopener noreferrer\">PII identification and redaction in streaming transcriptions<\/a>. With this feature, you can redact sensitive data in your streaming transcriptions and display the output according to your requirements. Let\u2019s look at how this feature works.<\/p>\n<h2>Feature overview<\/h2>\n<p>This feature extends the existing <code>StartStreamTranscription<\/code> operation of Amazon Transcribe. 
You just add a few more parameters to customize the stream (see the following code):<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-json\">POST \/stream-transcription HTTP\/2\nx-amzn-transcribe-language-code: en-US\nx-amzn-transcribe-sample-rate: MediaSampleRateHertz\nx-amzn-transcribe-media-encoding: MediaEncoding\nx-amzn-transcribe-vocabulary-name: VocabularyName\nx-amzn-transcribe-session-id: SessionId\nx-amzn-transcribe-vocabulary-filter-name: VocabularyFilterName\nx-amzn-transcribe-vocabulary-filter-method: VocabularyFilterMethod\nx-amzn-transcribe-language-model-name: LanguageModelName\nx-amzn-transcribe-enable-channel-identification: EnableChannelIdentification\nx-amzn-transcribe-number-of-channels: NumberOfChannels\nx-amzn-transcribe-show-speaker-label: ShowSpeakerLabel\nx-amzn-transcribe-enable-partial-results-stabilization: EnablePartialResultsStabilization\nx-amzn-transcribe-partial-results-stability: PartialResultsStability\nx-amzn-transcribe-content-identification-type: ContentIdentificationType (or x-amzn-transcribe-content-redaction-type: ContentRedactionType)\nx-amzn-transcribe-pii-entity-types: PiiEntityTypes\nContent-type: application\/json<\/code><\/pre>\n<\/p><\/div>\n<p>You can select the behavior you want in the streaming transcription. You can choose from two options when starting a streaming session: identify PII or redact PII. These options let you either highlight or mask the sensitive information that is identified.<\/p>\n<p>In addition, you can now specify PII types by setting a value for the <code>x-amzn-transcribe-pii-entity-types<\/code> parameter. This parameter supports identifying the following PII types: <code>BANK_ACCOUNT_NUMBER<\/code>, <code>BANK_ROUTING<\/code>, <code>CREDIT_DEBIT_NUMBER<\/code>, <code>CREDIT_DEBIT_CVV<\/code>, <code>CREDIT_DEBIT_EXPIRY<\/code>, <code>PIN<\/code>, <code>EMAIL<\/code>, <code>ADDRESS<\/code>, <code>NAME<\/code>, <code>PHONE<\/code>, <code>SSN<\/code>, and <code>ALL<\/code>. 
This is an optional parameter with a default value of <code>ALL<\/code>. Also, it supports selecting multiple types in the form of a comma-separated list like <code>NAME<\/code>, <code>ADDRESS<\/code>.<\/p>\n<p>We wanted to give you the flexibility to select the PII types you want to redact or identify. For example, you might want to protect your customers\u2019 Social Security number and credit card details, but might need other PII fields like name, email, and phone to create or update customer profiles in CRM systems for marketing and analytics purposes.<\/p>\n<p>When parsing responses generated by the service, you see a JSON response similar to the following. The most important field to note is the <code>Entities<\/code> field.<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-json\">{\n   \"TranscriptResultStream\":{\n      \"TranscriptEvent\":{\n         \"Transcript\":{\n            \"Results\":[\n               {\n                  \"Alternatives\":[\n                     {\n                        \"Transcript\":\"My name is [NAME].\",\n                        \"Items\":[\n                        {\n                           \"Confidence\":1,\n                           \"Content\":\"My\",\n                           \"EndTime\":0.67,\n                           \"StartTime\":0.6,\n                           \"Type\":\"pronunciation\",\n                           \"VocabularyFilterMatch\":false\n                        },\n                        {\n                           \"Confidence\":1,\n                           \"Content\":\"name\",\n                           \"EndTime\":0.95,\n                           \"StartTime\":0.68,\n                           \"Type\":\"pronunciation\",\n                           \"VocabularyFilterMatch\":false\n                        },\n                        {\n                           \"Confidence\":1,\n                           \"Content\":\"is\",\n                           \"EndTime\":1.14,\n    
                       \"StartTime\":0.96,\n                           \"Type\":\"pronunciation\",\n                           \"VocabularyFilterMatch\":false\n                        },\n                        {\n                           \"Confidence\":0.96,\n                           \"Content\":\"[NAME]\",\n                           \"EndTime\":1.71,\n                           \"StartTime\":1.15,\n                           \"Type\":\"pronunciation\",\n                           \"VocabularyFilterMatch\":false\n                        },\n                        {\n                           \"Content\":\".\",\n                           \"EndTime\":1.71,\n                           \"StartTime\":1.71,\n                           \"Type\":\"punctuation\",\n                           \"VocabularyFilterMatch\":false\n                        }\n                     ],\n                        \"Entities\":[\n                           {\n                              \"Content\":\"[NAME]\",\n                              \"Category\":\"PII\",\n                              \"Type\":\"NAME\",\n                              \"StartTime\":1.15,\n                              \"EndTime\":1.71,\n                              \"Confidence\":0.9989\n                           }\n                        ]\n                     }\n                  ],\n                  \"EndTime\":1.71,\n                  \"IsPartial\":false,\n                  \"ResultId\":\"751d4068-6a90-4cf1-b301-78d9f9778a8f\",\n                  \"StartTime\":0.6\n               }\n            ]\n         }\n      }\n   }\n}<\/code><\/pre>\n<\/p><\/div>\n<p>In the preceding example response, the behavior desired was redaction, therefore when PII data was detected (in this case, a name), it was replaced with the tag <code>[NAME]<\/code>. 
This is also reflected in the <code>Entities<\/code> array in the response, which provides the category, the PII type, and a confidence value (between 0\u20131, where 1 indicates the highest confidence) for each identification.<\/p>\n<p>You can also request to just identify PII data by setting <code>x-amzn-transcribe-content-identification-type<\/code> to <code>PII<\/code> in the <code>StartStreamTranscription<\/code> action. It returns a response similar to the following:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-json\">{\n   \"TranscriptResultStream\":{\n      \"TranscriptEvent\":{\n         \"Transcript\":{\n            \"Results\":[\n               {\n                  \"Alternatives\":[\n                     {\n                        \"Transcript\":\"My name is John.\",\n                        \"Items\":[\n                        {\n                           \"Confidence\":1,\n                           \"Content\":\"My\",\n                           \"EndTime\":0.67,\n                           \"StartTime\":0.6,\n                           \"Type\":\"pronunciation\",\n                           \"VocabularyFilterMatch\":false\n                        },\n                        {\n                           \"Confidence\":1,\n                           \"Content\":\"name\",\n                           \"EndTime\":0.95,\n                           \"StartTime\":0.68,\n                           \"Type\":\"pronunciation\",\n                           \"VocabularyFilterMatch\":false\n                        },\n                        {\n                           \"Confidence\":1,\n                           \"Content\":\"is\",\n                           \"EndTime\":1.14,\n                           \"StartTime\":0.96,\n                           \"Type\":\"pronunciation\",\n                           \"VocabularyFilterMatch\":false\n                        },\n                        {\n                           
\"Confidence\":0.96,\n                           \"Content\":\"John\",\n                           \"EndTime\":1.71,\n                           \"StartTime\":1.15,\n                           \"Type\":\"pronunciation\",\n                           \"VocabularyFilterMatch\":false\n                        },\n                        {\n                           \"Content\":\".\",\n                           \"EndTime\":1.71,\n                           \"StartTime\":1.71,\n                           \"Type\":\"punctuation\",\n                           \"VocabularyFilterMatch\":false\n                        }\n                     ],\n                        \"Entities\":[\n                           {\n                              \"Content\":\"John\",\n                              \"Category\":\"PII\",\n                              \"Type\":\"NAME\",\n                              \"StartTime\":1.15,\n                              \"EndTime\":1.71,\n                              \"Confidence\":0.9989\n                           }\n                        ]\n                     }\n                  ],\n                  \"EndTime\":1.71,\n                  \"IsPartial\":false,\n                  \"ResultId\":\"751d4068-6a90-4cf1-b301-78d9f9778a8f\",\n                  \"StartTime\":0.6\n               }\n            ]\n         }\n      }\n   }\n}<\/code><\/pre>\n<\/p><\/div>\n<p>This feature is available to customers programmatically using HTTP\/2 streaming and WebSocket streaming. For more information about this feature, see the <a href=\"https:\/\/docs.aws.amazon.com\/transcribe\/latest\/dg\/pii-redaction.html\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Transcribe documentation<\/a>.<\/p>\n<h2>How to use the feature<\/h2>\n<p>Let\u2019s explore how we can use this feature. 
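Before walking through the console and SDK options, note that the <code>Entities</code> array shown in the responses above can also be consumed client-side, for example to apply your own masking to an identification-mode transcript. The following is a minimal Python sketch; the <code>mask_pii</code> helper is our own illustration, not part of any AWS SDK, and its plain substring replacement is naive (it would also mask unrelated occurrences of the same text):

```python
import json

# A sample shaped like the identification-mode response shown above,
# trimmed to the fields this sketch uses.
event = json.loads("""
{
  "Transcript": {
    "Results": [
      {
        "IsPartial": false,
        "Alternatives": [
          {
            "Transcript": "My name is John.",
            "Entities": [
              {"Content": "John", "Category": "PII", "Type": "NAME",
               "StartTime": 1.15, "EndTime": 1.71, "Confidence": 0.9989}
            ]
          }
        ]
      }
    ]
  }
}
""")

def mask_pii(transcript: str, entities: list) -> str:
    """Replace each identified entity's text with a [TYPE] tag,
    mimicking client-side what redaction mode does server-side."""
    for entity in entities:
        transcript = transcript.replace(entity["Content"], f"[{entity['Type']}]")
    return transcript

for result in event["Transcript"]["Results"]:
    for alternative in result["Alternatives"]:
        masked = mask_pii(alternative["Transcript"],
                          alternative.get("Entities", []))
        print(masked)  # My name is [NAME].
```

In practice you would prefer server-side redaction when you never want the raw PII to reach the client at all; client-side masking like this is only useful when the same stream must serve both redacted and unredacted views.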
You can try it out in three different ways: through the AWS Management Console, with HTTP\/2 streaming, or with WebSocket streaming.<\/p>\n<p>In this post, we discuss only the AWS Management Console and HTTP\/2 streaming options.<\/p>\n<h3>Using the <a href=\"http:\/\/aws.amazon.com\/console\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Management Console<\/a><\/h3>\n<p>To test out this feature from the AWS Management Console, navigate to the Amazon Transcribe console. You can do this by typing \u201ctranscribe\u201d in the search bar on the console.<\/p>\n<p><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/30\/ML-5087-image014.png\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-27512\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/30\/ML-5087-image014.png\" alt=\"\" width=\"1621\" height=\"324\"><\/a><\/p>\n<p>Once there, choose \u201cReal Time Transcription\u201d.<\/p>\n<p><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/30\/ML-5087-image015.png\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-27513\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/30\/ML-5087-image015.png\" alt=\"\" width=\"1777\" height=\"1301\"><\/a><\/p>\n<p>Now let\u2019s test this out, but first let\u2019s configure the content removal settings. For demonstration purposes, we\u2019ll just identify all types of PII data in the stream.<\/p>\n<p><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/30\/ML-5087-image016.png\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-27514\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/30\/ML-5087-image016.png\" alt=\"\" width=\"944\" height=\"1204\"><\/a><\/p>\n<p>Now, choose \u201cStart streaming\u201d. 
For the input voice, I said the following lines, which contain PII data, into my computer\u2019s microphone (we use the same lines later in this post when testing the feature programmatically):<\/p>\n<p><em>Hello. My name is John Smith. I live at 999 ABC Street X Y Z, Virginia. My phone number is 999-888-7777. My email address is <\/em>employee@amazon.com<em>. My social security number is 123-45-6789. And finally, my credit card number is 6543 6543 6543 6543 and the expiry is 07\/2021. The CVV code is 210. <\/em><\/p>\n<p>We see the following output, where the identified PII data is highlighted with an underline. You can also download the full transcript to further analyze the results.<\/p>\n<p><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/30\/ML-5087-image017.png\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-27515\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/30\/ML-5087-image017.png\" alt=\"\" width=\"1381\" height=\"749\"><\/a><\/p>\n<h3>Using HTTP\/2 streaming<\/h3>\n<p>In this section, we focus specifically on how to enable this feature using HTTP\/2 streaming, and we use the AWS SDK for Java v2. For an example, see the following <a href=\"https:\/\/github.com\/awsdocs\/aws-doc-sdk-examples\/tree\/master\/javav2\/example_code\/transcribe\/src\/main\/java\/com\/amazonaws\/transcribestreaming\" target=\"_blank\" rel=\"noopener noreferrer\">GitHub repo<\/a>. In this project, the example application <code>TranscribeStreamingDemoApp.java<\/code> uses the <code>StartStreamTranscription<\/code> action of Amazon Transcribe (implemented in Java).<\/p>\n<p>To test out this feature, we need to make sure that the AWS SDK for Java is updated to the latest version so that we can use this feature. 
For more information, see <a href=\"https:\/\/docs.aws.amazon.com\/sdk-for-java\/latest\/developer-guide\/setup.html#setup-install\" target=\"_blank\" rel=\"noopener noreferrer\">Set up the AWS SDK for Java<\/a>.<\/p>\n<p>Next, we need to make sure that the right permissions are in place. The following JSON snippet is a permissions policy that illustrates the minimum permissions required to use this feature:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-json\">{\n    \"Version\": \"2012-10-17\",\n    \"Statement\": [\n        {\n            \"Sid\": \"transcribestreaming\",\n            \"Effect\": \"Allow\",\n            \"Action\": \"transcribe:StartStreamTranscription\",\n            \"Resource\": \"*\"\n        }\n    ]\n}<\/code><\/pre>\n<\/p><\/div>\n<p>Now let\u2019s explore the code we use to test this feature (<a href=\"https:\/\/github.com\/awsdocs\/aws-doc-sdk-examples\/blob\/master\/javav2\/example_code\/transcribe\/src\/main\/java\/com\/amazonaws\/transcribestreaming\/TranscribeStreamingDemoApp.java\" target=\"_blank\" rel=\"noopener noreferrer\">TranscribeStreamingDemoApp.java<\/a>). All the classes under this <a href=\"https:\/\/github.com\/awsdocs\/aws-doc-sdk-examples\/tree\/master\/javav2\/example_code\/transcribe\/src\/main\/java\/com\/amazonaws\/transcribestreaming\" target=\"_blank\" rel=\"noopener noreferrer\">directory<\/a> work in tandem to provide the bidirectional streaming functionality used to generate the live transcriptions. 
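The Java SDK builds the streaming request for you, but at the wire level the options from the Feature overview travel as HTTP\/2 headers. The following Python sketch only assembles that header map so the mapping is concrete; the helper name and defaults are our own, and a real request would additionally need SigV4 signing and an event-stream body:

```python
def build_streaming_headers(language_code="en-US",
                            sample_rate_hz=16000,
                            media_encoding="pcm",
                            redaction_type=None,
                            pii_entity_types=None):
    """Assemble the x-amzn-transcribe-* headers for a
    StartStreamTranscription request (illustrative only)."""
    headers = {
        "x-amzn-transcribe-language-code": language_code,
        "x-amzn-transcribe-sample-rate": str(sample_rate_hz),
        "x-amzn-transcribe-media-encoding": media_encoding,
    }
    if redaction_type:
        headers["x-amzn-transcribe-content-redaction-type"] = redaction_type
    if pii_entity_types:
        # Multiple types are sent as a comma-separated list.
        headers["x-amzn-transcribe-pii-entity-types"] = ",".join(pii_entity_types)
    return headers

# Redact every supported PII type, as in the walkthrough below.
headers = build_streaming_headers(redaction_type="PII",
                                  pii_entity_types=["ALL"])
print(headers["x-amzn-transcribe-pii-entity-types"])  # ALL
```

Use <code>x-amzn-transcribe-content-identification-type</code> instead of the redaction header if you only want the PII flagged in <code>Entities</code> rather than masked.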
If we explore <code>TranscribeStreamingDemoApp.java<\/code>, we can see that (at line 79) we instantiate the request to the <code>StartStreamTranscription<\/code> API.<br \/><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/10\/ML-5087-image001.png\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-27030 size-full\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/10\/ML-5087-image001.png\" alt=\"\" width=\"732\" height=\"228\"><\/a><\/p>\n<p>As we discussed earlier, PII identification and redaction in streaming adds a few more parameters to set streaming behavior. For this example, we try out content redaction for all PII entity types. Therefore, the modified code would look like the following screenshot.<a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/09\/10\/Screen-Shot-2021-09-10-at-3.59.54-PM.png\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-27952 size-full\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/09\/10\/Screen-Shot-2021-09-10-at-3.59.54-PM.png\" alt=\"\" width=\"1172\" height=\"246\"><\/a><\/p>\n<p>The original response handler code displays partial results of the streaming transcriptions.<br \/><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/10\/ML-5087-image005.png\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-27032\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/10\/ML-5087-image005.png\" alt=\"\" width=\"791\" height=\"511\"><\/a><\/p>\n<p>However, when masking sensitive data, it\u2019s best to display and use only the final transcription output. 
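In stream-processing terms, that means keeping only result events whose <code>IsPartial</code> flag is false, because partial hypotheses can still change before redaction is final. A minimal Python sketch over result dicts shaped like the JSON responses shown earlier (the <code>final_results</code> helper is our own):

```python
def final_results(results):
    """Yield only results that Amazon Transcribe marked final
    (IsPartial == False); partial hypotheses may still change."""
    for result in results:
        if not result.get("IsPartial", True):
            yield result

# A toy stream: one partial hypothesis followed by its final form.
stream = [
    {"IsPartial": True,  "Alternatives": [{"Transcript": "My name"}]},
    {"IsPartial": False, "Alternatives": [{"Transcript": "My name is [NAME]."}]},
]
finals = [r["Alternatives"][0]["Transcript"] for r in final_results(stream)]
print(finals)  # ['My name is [NAME].']
```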
Therefore, we modify the code to only display the final output using the <code>isPartial<\/code> field in the streaming responses. The following screenshot shows the implementation.<br \/><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/09\/10\/Screen-Shot-2021-09-10-at-4.27.10-PM.png\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-27955 size-full\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/09\/10\/Screen-Shot-2021-09-10-at-4.27.10-PM.png\" alt=\"\" width=\"1200\" height=\"231\"><\/a><\/p>\n<p>Now let\u2019s test this out. You can build and run this code using your IDE or the command line. For the input voice, we will use the same lines used earlier.<\/p>\n<p>The following screenshot shows our output:<br \/><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/10\/ML-5087-image009.png\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-27034\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/10\/ML-5087-image009.png\" alt=\"\" width=\"1114\" height=\"66\"><\/a><\/p>\n<p>Each line was streamed to the console as soon as Amazon Transcribe inferred it was the final result.<\/p>\n<p>Now, let\u2019s take the use case where we don\u2019t want to redact the name, email address, and phone number. We only want to redact the Social Security number, credit card number, its expiration date, and the CVV code. 
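Narrowing redaction to a subset of fields amounts to sending a shorter comma-separated value for <code>x-amzn-transcribe-pii-entity-types</code>. A small Python sketch that validates a selection against the supported types (listed in the Feature overview) and renders that value; the helper and its validation are our own illustration:

```python
# Supported values for x-amzn-transcribe-pii-entity-types.
SUPPORTED_PII_TYPES = {
    "BANK_ACCOUNT_NUMBER", "BANK_ROUTING", "CREDIT_DEBIT_NUMBER",
    "CREDIT_DEBIT_CVV", "CREDIT_DEBIT_EXPIRY", "PIN", "EMAIL",
    "ADDRESS", "NAME", "PHONE", "SSN", "ALL",
}

def pii_entity_types_value(selected):
    """Check a selection against the supported types and render
    the comma-separated header/parameter value."""
    unknown = set(selected) - SUPPORTED_PII_TYPES
    if unknown:
        raise ValueError(f"unsupported PII types: {sorted(unknown)}")
    return ",".join(selected)

value = pii_entity_types_value(
    ["SSN", "CREDIT_DEBIT_NUMBER", "CREDIT_DEBIT_EXPIRY", "CREDIT_DEBIT_CVV"])
print(value)  # SSN,CREDIT_DEBIT_NUMBER,CREDIT_DEBIT_EXPIRY,CREDIT_DEBIT_CVV
```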
To do so, we modify the request by listing the PII types we want to redact in the <code>PiiEntityTypes<\/code> parameter.<br \/><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/09\/10\/Screen-Shot-2021-09-10-at-4.05.04-PM.png\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-27953 size-full\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/09\/10\/Screen-Shot-2021-09-10-at-4.05.04-PM.png\" alt=\"\" width=\"1225\" height=\"246\"><\/a><\/p>\n<p>The following screenshot shows our output:<\/p>\n<p><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/31\/ML-5087-image013_resized3.png\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-27576\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/31\/ML-5087-image013_resized3.png\" alt=\"\" width=\"9876\" height=\"600\"><\/a><\/p>\n<h2>Conclusion<\/h2>\n<p>As demonstrated in this post, Amazon Transcribe can help identify and redact PII in streaming transcriptions. This feature can streamline and simplify customer data management across industries such as financial services, government, and retail.<\/p>\n<p>As of today, PII identification and redaction for streaming transcription is supported in the following <a href=\"https:\/\/aws.amazon.com\/about-aws\/global-infrastructure\/regions_az\/\">AWS Regions<\/a>: US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Seoul), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), EU (Frankfurt), EU (Ireland), and EU (London). 
For pricing information, see <a href=\"https:\/\/aws.amazon.com\/transcribe\/pricing\/\">Amazon Transcribe Pricing<\/a>.<\/p>\n<hr>\n<h3>About the Author<\/h3>\n<p><strong><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/Vishesh-Jha-resized.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-27070 alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/Vishesh-Jha-resized.jpg\" alt=\"\" width=\"100\" height=\"133\"><\/a>Vishesh Jha<\/strong> is a Solutions Architect at AWS working with Public Sector Partners. He specializes in AI\/ML and has helped customers and partners get started with NLP and CV using AWS services such as Amazon Lex, Amazon Transcribe, Amazon Translate, Amazon Comprehend, Amazon Kendra, Amazon Rekognition, and Amazon SageMaker. He is an avid soccer fan, and in his free time enjoys watching and playing the sport. 
He also loves cooking, gaming, and traveling with his family.<\/p>\n<p>       <!-- '\"` -->\n      <\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/aws.amazon.com\/blogs\/machine-learning\/introducing-pii-identification-and-redaction-in-streaming-transcriptions-using-amazon-transcribe\/<\/p>\n","protected":false},"author":0,"featured_media":868,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/867"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=867"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/867\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/868"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=867"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=867"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=867"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}