{"id":997,"date":"2021-10-06T08:38:58","date_gmt":"2021-10-06T08:38:58","guid":{"rendered":"https:\/\/salarydistribution.com\/machine-learning\/2021\/10\/06\/build-a-system-for-catching-adverse-events-in-real-time-using-amazon-sagemaker-and-amazon-quicksight\/"},"modified":"2021-10-06T08:38:58","modified_gmt":"2021-10-06T08:38:58","slug":"build-a-system-for-catching-adverse-events-in-real-time-using-amazon-sagemaker-and-amazon-quicksight","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2021\/10\/06\/build-a-system-for-catching-adverse-events-in-real-time-using-amazon-sagemaker-and-amazon-quicksight\/","title":{"rendered":"Build a system for catching adverse events in real-time using Amazon SageMaker and Amazon QuickSight"},"content":{"rendered":"<div id=\"\">\n<p>Social media platforms provide a channel of communication for consumers to talk about various products, including the medications they take. For pharmaceutical companies, monitoring and effectively tracking product performance provides customer feedback, which is vital to maintaining and improving patient safety. When an unexpected medical occurrence results from the administration of a pharmaceutical product, it\u2019s classified as an adverse event (AE). This includes medication errors, adverse drug reactions, allergic reactions, and overdoses. AEs can happen anywhere: in hospitals, long-term care settings, and outpatient settings.<\/p>\n<p>The objective of this post is to provide an example that showcases how to use <a href=\"https:\/\/aws.amazon.com\/sagemaker\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon SageMaker<\/a> and pre-trained transformer models to detect AEs mentioned on social media. The model is fine-tuned on domain-specific data to perform a text classification task. 
We also use <a href=\"https:\/\/aws.amazon.com\/quicksight\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon QuickSight<\/a> to create a monitoring dashboard. Importantly, this post requires a Twitter developer account to obtain tweets. For the purposes of this demonstration, we only use publicly available tweets. Although privacy and data governance are not explicitly discussed in this post, you should consider these processes; helpful resources are available in <a href=\"https:\/\/aws.amazon.com\/marketplace\/solutions\/data-analytics\/data-governance\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Marketplace<\/a> and through featured <a href=\"https:\/\/aws.amazon.com\/big-data\/featured-partner-solutions-data-governance-compliance\/\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Partner Solutions<\/a> for data governance. Following this demonstration, we deleted all the data used.<\/p>\n<p>This post is meant to support overarching pharmacovigilance activities for life sciences and pharmaceutical customers globally, though the reference architecture can be implemented for any customer. 
The model is trained to identify adverse events and is applicable to the biotech, healthcare, and life sciences domains.<\/p>\n<h2>Solution overview<\/h2>\n<p>The following architecture diagram illustrates the workflow of the solution.<br \/><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/ML-3952-image001.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-27088\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/ML-3952-image001.jpg\" alt=\"\" width=\"2260\" height=\"748\"><\/a><br \/>The workflow includes the following steps:<\/p>\n<ol>\n<li>Train a classification model using SageMaker training and deploy the trained model to an endpoint using SageMaker real-time inference.<\/li>\n<li>Create an <a href=\"https:\/\/aws.amazon.com\/cloud9\/\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Cloud9<\/a> stream listener.<\/li>\n<li>Store live tweets in <a href=\"https:\/\/aws.amazon.com\/dynamodb\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon DynamoDB<\/a>.<\/li>\n<li>Use DynamoDB Streams to trigger <a href=\"http:\/\/aws.amazon.com\/lambda\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Lambda<\/a> to invoke the SageMaker endpoint for AE classification and call <a href=\"https:\/\/aws.amazon.com\/comprehend\/medical\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Comprehend Medical<\/a> to detect symptoms and provide <a href=\"https:\/\/docs.aws.amazon.com\/comprehend\/latest\/dg\/ontology-linking-icd10.html\" target=\"_blank\" rel=\"noopener noreferrer\">ICD-10<\/a> descriptions.<\/li>\n<li>Save tweets with their AE classification and symptoms in <a href=\"http:\/\/aws.amazon.com\/s3\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Simple Storage Service<\/a> (Amazon S3) and use the <a href=\"https:\/\/aws.amazon.com\/glue\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Glue<\/a> Data 
Catalog to create table views.<\/li>\n<li>Analyze data from Amazon S3 using <a href=\"http:\/\/aws.amazon.com\/athena\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Athena<\/a>.<\/li>\n<li>Create a QuickSight dashboard to monitor tweets and their AE status.<\/li>\n<\/ol>\n<h2>Set up the environment and implement the solution<\/h2>\n<p>We have created a template for the adverse event detection app using the <a href=\"https:\/\/aws.amazon.com\/cdk\/\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Cloud Development Kit<\/a> (AWS CDK), an open-source software development framework to define your cloud application resources. Complete the following steps to run the solution end to end:<\/p>\n<ol>\n<li>Clone the <a href=\"https:\/\/github.com\/aws-samples\/aws-cdk-adverse-event-detection-app\" target=\"_blank\" rel=\"noopener noreferrer\">GitHub repo<\/a> either to your local machine that is configured with your AWS account and has the <a href=\"http:\/\/aws.amazon.com\/cli\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Command Line Interface<\/a> (AWS CLI) installed, or in an AWS Cloud9 environment within your AWS account:\n<div class=\"hide-language\">\n<pre><code class=\"lang-bash\">$ git clone https:\/\/github.com\/aws-samples\/aws-cdk-adverse-event-detection-app<\/code><\/pre>\n<\/p><\/div>\n<\/li>\n<\/ol>\n<p>After you clone the code repo, you can start the deployment process.<\/p>\n<ol start=\"2\">\n<li>Navigate to the project directory and create a virtual environment within this project that is stored under the <code>.venv<\/code> directory. 
To manually create a virtual environment on macOS or Linux, use the following code:\n<div class=\"hide-language\">\n<pre><code class=\"lang-bash\">$ python3 -m venv .venv<\/code><\/pre>\n<\/div>\n<\/li>\n<li>After the virtual environment is created, activate the virtual environment:\n<div class=\"hide-language\">\n<pre><code class=\"lang-bash\">$ source .venv\/bin\/activate<\/code><\/pre>\n<\/p><\/div>\n<\/li>\n<li>After the virtual environment is activated, install the required dependencies:\n<div class=\"hide-language\">\n<pre><code class=\"lang-bash\">$ pip install -r requirements.txt<\/code><\/pre>\n<\/p><\/div>\n<\/li>\n<li>You can now synthesize the <a href=\"http:\/\/aws.amazon.com\/cloudformation\" target=\"_blank\" rel=\"noopener noreferrer\">AWS CloudFormation<\/a> template:\n<div class=\"hide-language\">\n<pre><code class=\"lang-bash\">$ cdk synth\n$ cdk deploy --all<\/code><\/pre>\n<\/p><\/div>\n<\/li>\n<\/ol>\n<p><code>cdk synth<\/code> generates the CloudFormation template in JSON format as well as other necessary asset files for spinning up the resources. These files are stored in the <code>cdk.out<\/code> directory. The <code>cdk deploy<\/code> command then deploys the stack into your AWS account. You deploy two stacks: one is an S3 bucket stack and the other is the core adverse event app stack. The core app stack needs to be deployed after the Amazon S3 stack is successfully deployed. If you encounter any issues during the deployment process, refer to <a href=\"https:\/\/docs.aws.amazon.com\/cdk\/latest\/guide\/troubleshooting.html\" target=\"_blank\" rel=\"noopener noreferrer\">Troubleshooting common AWS CDK issues<\/a>.<\/p>\n<p>After the AWS CDK stacks are successfully deployed, you need to train and deploy a model. On the <strong>Notebooks <\/strong>page of the SageMaker console, you should find a notebook instance named <code>AdverseEventDetectionModeling<\/code>. 
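Before opening the notebook, it can help to see the data shape the rest of the walkthrough assumes: two columns, <code>text<\/code> and <code>label<\/code>, split 80\/20 into training and validation sets. The following is a minimal sketch of that structure using hypothetical stand-in rows, not actual ade_corpus_v2 records:

```python
# Illustrative sketch only: hypothetical rows standing in for the
# ade_corpus_v2 data, prepared in the text/label shape the model expects.
import random

rows = [
    {"text": "Severe rash after starting the medication", "label": "Adverse_Event"},
    {"text": "Picked up my prescription refill today", "label": "Not_AE"},
] * 10  # 20 toy rows

random.seed(42)
random.shuffle(rows)

split = int(len(rows) * 0.8)  # 80% train, 20% validation
train, validation = rows[:split], rows[split:]

print(len(train), len(validation))  # 16 4
```

In the actual notebook, the equivalent splits are written to CSV files and uploaded to Amazon S3 for the training job.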
When you run the entire notebook (<a href=\"https:\/\/github.com\/aws-samples\/aws-cdk-adverse-event-detection-app\/blob\/main\/sagemaker\/AE_model_train_deploy.ipynb\" target=\"_blank\" rel=\"noopener noreferrer\">AE_model_train_deploy.ipynb<\/a>), a SageMaker training job is launched and the model is deployed to a SageMaker endpoint. The model training data in this tutorial is based on the <a href=\"https:\/\/huggingface.co\/datasets\/ade_corpus_v2\" target=\"_blank\" rel=\"noopener noreferrer\">Adverse Drug Reaction<\/a> dataset from <a href=\"https:\/\/huggingface.co\/\" target=\"_blank\" rel=\"noopener noreferrer\">Hugging Face<\/a> but can be replaced with any other dataset.<\/p>\n<h2>Train and deploy a transformer model for adverse event classification<\/h2>\n<p>We fine-tune transformer models within the <a href=\"https:\/\/huggingface.co\/transformers\/index.html\" target=\"_blank\" rel=\"noopener noreferrer\">Hugging Face library<\/a> for adverse event (AE) classification. The training job is built using the SageMaker PyTorch estimator. For model deployment, we use the PyTorch model server. In this section, we walk through the major steps for model training and deployment.<\/p>\n<h3>Data preparation<\/h3>\n<p>We use the Adverse Drug Reaction Data (<a href=\"https:\/\/huggingface.co\/datasets\/ade_corpus_v2\" target=\"_blank\" rel=\"noopener noreferrer\">ade_corpus_v2<\/a>) from Hugging Face datasets as the training and validation data. The required data structure for our model training and inference has two columns:<\/p>\n<ul>\n<li>One column for text content as model input data.<\/li>\n<li>Another column for the label class. 
We have two possible classes for a text: the <code>Not_AE<\/code> class and the <code>Adverse_Event<\/code> class.<\/li>\n<\/ul>\n<p>We download the raw dataset and split it into training (80%) and validation (20%) datasets, rename the input and target columns to <code>text<\/code> and <code>label<\/code> respectively, and upload them to Amazon S3:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-python\">inputs_data = sagemaker_session.upload_data(path=data_dir, bucket=bucket, key_prefix=s3_prefix)<\/code><\/pre>\n<\/p><\/div>\n<p>Our model also supports multi-class classification, so you can bring your own dataset for model training.<\/p>\n<h3>Model training<\/h3>\n<p>We use the SageMaker built-in PyTorch estimator to fine-tune transformer models. The entry point script <code>.\/src\/hf_train_deploy.py<\/code> has the <code>train()<\/code> function for model training.<\/p>\n<p>We have added a <code>requirements.txt<\/code> file within the script source folder <code>.\/src<\/code> for a list of required packages. When you launch SageMaker training jobs, the SageMaker PyTorch container automatically looks for a <code>requirements.txt<\/code> file in the script source folder, and uses <code>pip install<\/code> to install the packages listed in that file.<\/p>\n<p>In addition to batch size, sequence length, and learning rate, you can also specify the <code>model_name<\/code> to choose any transformer models supported within the pre-trained model list of Hugging Face <code>AutoModelForSequenceClassification<\/code>. 
The column names for text and label also need to be specified through the <code>text_column<\/code> and <code>label_column<\/code> parameters.<\/p>\n<p>The following code is an example of setting hyperparameters for the model training:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-python\">hyperparameters = {'epochs': 4,\n                   'train_batch_size': 64,\n                   'max_seq_length': 128,\n                   'learning_rate': 5e-5,\n                   'model_name': 'distilbert-base-uncased',\n                   'text_column': 'text',\n                   'label_column': 'label'}<\/code><\/pre>\n<\/p><\/div>\n<p>Then we launch the training job:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-python\">from sagemaker.pytorch import PyTorch\n\ntrain_instance_type = 'ml.p3.2xlarge'\n\nbert_estimator = PyTorch(entry_point='hf_train_deploy.py',\n                         source_dir='src',\n                         role=role,\n                         framework_version='1.4.0',\n                         py_version='py3',\n                         instance_count=1,\n                         instance_type=train_instance_type,\n                         hyperparameters=hyperparameters)\n\nbert_estimator.fit({'training': inputs_data})<\/code><\/pre>\n<\/p><\/div>\n<h3>Model deployment<\/h3>\n<p>We can directly deploy the PyTorch trained model using SageMaker real-time inference to an endpoint as long as the following prerequisite functions are provided within the entry point script <code>hf_train_deploy.py<\/code>:<\/p>\n<ul>\n<li><code>model_fn(model_dir)<\/code> for loading a model object<\/li>\n<li><code>input_fn(request_body, request_content_type)<\/code> for loading an input text and tokenizing the text<\/li>\n<li><code>predict_fn(input_data, model)<\/code> for model prediction returning the probability value for each class<\/li>\n<\/ul>\n<p>We deploy the model to a SageMaker endpoint for real-time inference with the following code:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-python\">from sagemaker.pytorch.model 
import PyTorchModel\nfrom sagemaker.deserializers import JSONDeserializer\nfrom sagemaker.serializers import JSONSerializer\n\nmodel_data = bert_estimator.model_data\n\npytorch_model = PyTorchModel(model_data=model_data,\n                             role=role,\n                             framework_version='1.4.0',\n                             source_dir='.\/src',\n                             py_version='py3',\n                             entry_point='hf_train_deploy.py')\n\npredictor = pytorch_model.deploy(initial_instance_count=1,\n                                 instance_type='ml.m5.large',\n                                 endpoint_name='HF-BERT-AE-model',\n                                 serializer=JSONSerializer(),\n                                 deserializer=JSONDeserializer())<\/code><\/pre>\n<\/p><\/div>\n<h3>Model serving<\/h3>\n<p>After the SageMaker endpoint is created, we can invoke the endpoint for real-time model inference through services like Lambda:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-python\">import json\nimport boto3\n\nendpoint_name = 'HF-BERT-AE-model'\nruntime = boto3.client('runtime.sagemaker')\n\nquery = 'YOUR TEXT HERE'\n\nresponse = runtime.invoke_endpoint(EndpointName=endpoint_name,\n                                   ContentType='application\/json',\n                                   Body=json.dumps(query))\nprobabilities = json.loads(response['Body'].read())<\/code><\/pre>\n<\/p><\/div>\n<h2>Set up the Twitter API stream listener for real-time data streaming<\/h2>\n<p>During the initial <code>cdk deploy<\/code> process, AWS CDK should have spun up an AWS Cloud9 environment in your account and cloned the code repo into the environment. 
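As a note on the serving snippet shown earlier: the endpoint returns one probability per class, which the caller then maps to a predicted label. The following is a minimal sketch of that mapping; the helper function and the two-class label order are assumptions for illustration, not code from the repo:

```python
# Hypothetical helper (not part of the repo): map the endpoint's
# per-class probabilities to a predicted label.
# The label order below is an assumption for illustration.
LABELS = ["Not_AE", "Adverse_Event"]

def to_label(probabilities):
    # Index of the highest-probability class
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return LABELS[best]

print(to_label([0.12, 0.88]))  # Adverse_Event
```

A downstream consumer, such as a Lambda function, would apply the same argmax step to each response before storing the result.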
We use AWS Cloud9 to host the Twitter API stream listener for live data streaming.<\/p>\n<p>The Twitter API stream listener is composed of the following:<\/p>\n<ul>\n<li><strong>stream_config.py<\/strong> \u2013 Parameters that authenticate the Twitter API and a pre-determined list of drug names to search for<\/li>\n<li><strong>stream.py<\/strong> \u2013 Primarily used to keep the stream active, but also addresses other functionalities, such as processing users\u2019 shared attributes and assessing whether the drug mentions match those provided in <code>stream_config.py<\/code><\/li>\n<\/ul>\n<p>The next step is to set up the Twitter API stream listener. After you obtain your consumer keys and authentication tokens from the Twitter developer portal, open stream_config.py in AWS Cloud9 and provide the following information:<\/p>\n<ol>\n<li>Enter the Twitter API credentials.<\/li>\n<li>Add drug names and rules of interest to obtain associated tweets; we have provided an example of drug names in the code.<\/li>\n<li>Enter your <code>aws_access_key_id<\/code> and <code>aws_secret_access_key<\/code>.<br \/><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/ML-3952-image004.png\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-27089\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/ML-3952-image004.png\" alt=\"\" width=\"864\" height=\"337\"><\/a><\/li>\n<li>Back in the AWS Cloud9 terminal, run the following commands to install the necessary packages and download <code>en_core_web_sm<\/code>:\n<div class=\"hide-language\">\n<pre><code class=\"lang-bash\">$ pip install -r requirements.txt\n$ python -m spacy download en_core_web_sm<\/code><\/pre>\n<\/p><\/div>\n<\/li>\n<li>To activate the API stream listener, run the following command (make sure you\u2019re in the <code>ae-blog-cdk<\/code> folder):\n<div 
class=\"hide-language\">\n<pre><code class=\"lang-bash\">$ python cloud9\/stream.py<\/code><\/pre>\n<\/p><\/div>\n<\/li>\n<\/ol>\n<h2>Run inference and crawl model prediction results<\/h2>\n<p>When the stream listener is active, incoming tweet data is stored in the DynamoDB table <code>ae_tweets_ddb<\/code>.<br \/><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/ML-3952-image005.png\"><img decoding=\"async\" class=\"alignnone size-full wp-image-27090\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/ML-3952-image005.png\" alt=\"\" height=\"812\"><\/a><br \/>The Lambda function is triggered by <a href=\"https:\/\/docs.aws.amazon.com\/amazondynamodb\/latest\/developerguide\/Streams.html\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon DynamoDB Streams<\/a> and invokes the model endpoint deployed from the SageMaker step. The function runs inference through the deployed SageMaker endpoint <code>HF-BERT-AE-model<\/code> to classify the incoming tweets as adverse events or not.<\/p>\n<p>For all the tweets that are classified as adverse events, the Amazon Comprehend Medical API is used to obtain <a href=\"https:\/\/docs.aws.amazon.com\/comprehend\/latest\/dg\/extracted-med-info-V2.html\" target=\"_blank\" rel=\"noopener noreferrer\">entities<\/a> covering signs, symptoms, and diagnoses of medical conditions, along with the list of ICD-10 codes and descriptions. For simplicity, we extract entities based on maximum score. The ICD-10 code and description allow us to bin the symptoms into more normalized concepts (for more information, see <a href=\"https:\/\/docs.aws.amazon.com\/comprehend\/latest\/dg\/ontology-linking-icd10.html\" target=\"_blank\" rel=\"noopener noreferrer\">ICD-10-CM-linking<\/a>). 
See the following code:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-python\"># Retrieve AE type data only if marked as an AE by the model\nae_type = \"\"\nicd_codes = []\nif pred_label == 'Adverse_Event':\n    aetype_dict = {}\n    # Extract entities using Amazon Comprehend Medical\n    result_symptom = cm_client.detect_entities_v2(Text=text)\n    entities_symptom = result_symptom['Entities']\n    # Look for entities that detect signs, symptoms, and diagnosis of medical conditions\n    # Filter based on confidence score\n    for entity in entities_symptom:\n        if (entity['Category'] == 'MEDICAL_CONDITION') &amp; (entity['Score'] &gt;= 0.60):\n            aetype_dict[entity['Text']] = entity['Score']\n    # Extract the entity with the maximum score\n    if aetype_dict:\n        ae_type = max(aetype_dict, key=aetype_dict.get)\n\n    _dict = {}\n    icdc_list = []\n\n    # Amazon Comprehend Medical lists the matching ICD-10-CM codes\n    result_icd = cm_client.infer_icd10_cm(Text=text)\n    entities_icd = result_icd['Entities']\n    for entity in entities_icd:\n        for codes in entity['ICD10CMConcepts']:\n            # Filter based on confidence score\n            if codes['Score'] &gt;= 0.70:\n                _dict[codes['Description']] = codes['Score']\n                # Track the running maximum-score description\n                icd_ = max(_dict, key=_dict.get)\n                icdc_list.append(icd_)\n    icd_codes = list(set(icdc_list))<\/code><\/pre>\n<\/p><\/div>\n<p>The Lambda function processes tweets and outputs predictions, associated entities, and ICD-10 codes to the S3 bucket folder <code>lambda_predictions<\/code>.<br \/><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/ML-3952-image007.png\"><img decoding=\"async\" class=\"alignnone size-full wp-image-27091\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/ML-3952-image007.png\" alt=\"\" 
height=\"476\"><\/a><br \/>The <a href=\"https:\/\/docs.aws.amazon.com\/glue\/latest\/dg\/add-crawler.html\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Glue crawler<\/a> <code>s3_tweets_crawler<\/code> is created to crawl predictions in Amazon S3 and populate the Data Catalog, where the database <code>s3_tweets_db<\/code> and table <code>lambda_predictions<\/code> are created.<br \/><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/ML-3952-image009.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-27092\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/ML-3952-image009.jpg\" alt=\"\" width=\"1644\" height=\"758\"><\/a><\/p>\n<h2>View tabular processed data using Amazon Athena<\/h2>\n<p>To provide stakeholders with a holistic view of the tweets, you can use Athena to query the results from Amazon S3 (linked by the AWS Glue Data Catalog) and expand to create custom dashboards using QuickSight.<\/p>\n<ol>\n<li>If this is your first time using Athena, set up an Amazon S3 location to save query results.<\/li>\n<li>Otherwise, choose the <code>s3_tweets_db<\/code> database created by the crawler.<\/li>\n<li>Choose the <code>lambda_predictions<\/code> table and choose <strong>Preview table <\/strong>to view a sample of your processed tweets.<br \/><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/ML-3952-image011.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-27093\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/ML-3952-image011.jpg\" alt=\"\" width=\"1152\" height=\"378\"><\/a><\/li>\n<\/ol>\n<p>The following screenshot is a custom SQL command to preview tweets associated with a particular concept.<br \/><a 
href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/ML-3952-image013.png\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-27094\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/ML-3952-image013.png\" alt=\"\" width=\"889\" height=\"93\"><\/a><\/p>\n<h2>Create a dashboard with Amazon QuickSight<\/h2>\n<p>Building the QuickSight dashboard allows you to fully complete an end-to-end pipeline that publishes the analyses and inferences from our models. At a high level in QuickSight, you import the data using Athena and locate your Athena database and table that are linked to your S3 bucket. Make sure the user\u2019s account has <a href=\"http:\/\/aws.amazon.com\/iam\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Identity and Access Management<\/a> (IAM) permissions to access Athena and Amazon S3 when using QuickSight.<\/p>\n<ol>\n<li>On the QuickSight console, choose <strong>Datasets<\/strong> in the navigation pane.<\/li>\n<li>Choose <strong>New dataset<\/strong>.<\/li>\n<li>Choose <strong>Athena<\/strong> as your data source.<br \/><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/ML-3952-image015.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-27095\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/ML-3952-image015.jpg\" alt=\"\" width=\"1152\" height=\"616\"><\/a><\/li>\n<li>For <strong>Data source name<\/strong>, enter a name.<\/li>\n<li>When prompted, choose the database and table that contain the tweets that were processed through the Lambda function.<\/li>\n<li>Choose <strong>Use custom SQL<\/strong>.<br \/><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/ML-3952-image017.jpg\"><img decoding=\"async\" loading=\"lazy\" 
class=\"alignnone size-full wp-image-27096\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/ML-3952-image017.jpg\" width=\"798\" height=\"648\"><\/a><\/li>\n<li>Change the <strong>New custom SQL<\/strong> name to <code>TweetsData<\/code> (or your choice of name).<\/li>\n<li>Enter the following SQL query:\n<div class=\"hide-language\">\n<pre><code class=\"lang-sql\">SELECT * FROM \"s3_tweets_db\".\"lambda_predictions\"<\/code><\/pre>\n<\/p><\/div>\n<\/li>\n<li>Choose <strong>Edit\/Preview data<\/strong>.<\/li>\n<li>Select <strong>Import SPICE for quicker analysis<\/strong>.<\/li>\n<\/ol>\n<p>We recommend importing the data using the Super-fast, Parallel, In-memory Calculation Engine (SPICE). Upon import, you can edit and visualize the data, as well as edit data column types or rename columns to suit your visuals. Furthermore, the SPICE dataset can be <a href=\"https:\/\/docs.aws.amazon.com\/quicksight\/latest\/user\/refreshing-imported-data.html#schedule-data-refresh\" target=\"_blank\" rel=\"noopener noreferrer\">refreshed on a schedule<\/a>; make sure enough <a href=\"https:\/\/docs.aws.amazon.com\/quicksight\/latest\/user\/managing-spice-capacity.html\" target=\"_blank\" rel=\"noopener noreferrer\">SPICE capacity<\/a> is in place, because scheduled data refreshes consume capacity and can incur charges.<\/p>\n<ol start=\"11\">\n<li>Choose <strong>Visualize<\/strong>.<br \/><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/ML-3952-image019.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-27097\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/ML-3952-image019.jpg\" alt=\"\" width=\"1238\" height=\"708\"><\/a><\/li>\n<\/ol>\n<p>After the data is imported, you can begin to develop the analysis in the form of visuals along with <a 
href=\"https:\/\/docs.aws.amazon.com\/quicksight\/latest\/user\/quicksight-actions.html\" target=\"_blank\" rel=\"noopener noreferrer\">custom actions<\/a> for filtering and navigation to make panels more interactive. Lastly, you can <a href=\"https:\/\/docs.aws.amazon.com\/quicksight\/latest\/user\/creating-a-dashboard.html\" target=\"_blank\" rel=\"noopener noreferrer\">publish the developed dashboard<\/a> to be shared. The following screenshot shows example custom visualizations.<br \/><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/ML-3952-image021.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-27098\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/ML-3952-image021.jpg\" alt=\"\" width=\"1152\" height=\"354\"><\/a><\/p>\n<h2>Clean up<\/h2>\n<p>Back in your AWS CDK stack, you can run the <code>cdk destroy --all<\/code> command to clean up all the resources used during this tutorial. If for any reason the command doesn\u2019t run successfully, you can go to the AWS CloudFormation console and manually delete the stack. Also, if you created a dashboard using the data from this post, manually delete the data source and the associated dashboard within QuickSight.<\/p>\n<h2>Conclusion<\/h2>\n<p>With the expanding development of new pharmaceutical drugs comes an increase in the number of associated adverse events\u2014events that must be responsibly and efficiently monitored and reported. This post has detailed an end-to-end solution that uses SageMaker to build and deploy a classification model, Amazon Comprehend Medical to extract medical entities from tweets, and Amazon QuickSight to monitor possible adverse events from pharmaceutical products. This solution helps replace laborious manual review with an automated machine learning process. 
To learn more about Amazon SageMaker, please <a href=\"https:\/\/aws.amazon.com\/sagemaker\/\" target=\"_blank\" rel=\"noopener noreferrer\">visit the webpage<\/a>.<\/p>\n<hr>\n<h3>About the Authors<\/h3>\n<p><strong><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/05\/11\/Prithiviraj-Jothikumar.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-24383 alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/05\/11\/Prithiviraj-Jothikumar.jpg\" alt=\"\" width=\"100\" height=\"135\"><\/a>Prithiviraj Jothikumar<\/strong>,\u00a0PhD, is a Data Scientist with AWS Professional Services, where he helps customers build solutions using machine learning. He enjoys watching movies and sports and meditating.<\/p>\n<p><strong><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/01\/21\/jason-zhu-100.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-10798 alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/01\/21\/jason-zhu-100.jpg\" alt=\"\" width=\"100\" height=\"134\"><\/a>Jason Zhu <\/strong>is a Sr. Data Scientist with AWS Professional Services, where he leads building enterprise-level machine learning applications for customers. In his spare time, he enjoys being outdoors and growing his capabilities as a cook.<\/p>\n<p><strong><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/Rosa-Sun.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-27102 alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/08\/11\/Rosa-Sun.jpg\" alt=\"\" width=\"100\" height=\"133\"><\/a>Rosa Sun<\/strong> is a Professional Services Consultant at Amazon Web Services. 
Outside of work, she enjoys walks in the rain, painting portraits, and hugging her dog.<\/p>\n<p><strong><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/08\/26\/Sai-Sharanya-Nalla.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-15153 alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/08\/26\/Sai-Sharanya-Nalla.jpg\" alt=\"\" width=\"100\" height=\"136\"><\/a>Sai Sharanya Nalla\u00a0<\/strong>is a Data Scientist at AWS Professional Services. She works with customers to develop and implement AI and ML solutions on AWS. In her spare time, she enjoys listening to podcasts and audiobooks, long walks, and engaging in outreach activities.<\/p>\n<p><strong><a href=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/06\/10\/Shuai-Cao.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-25266 alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2021\/06\/10\/Shuai-Cao.jpg\" alt=\"\" width=\"100\" height=\"136\"><\/a>Shuai Cao\u00a0<\/strong>is a Data Scientist in the Professional Services team at Amazon Web Services. His expertise is building machine learning applications at scale for healthcare and life sciences customers. 
Outside of work, he loves traveling around the world and playing dozens of different instruments.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/aws.amazon.com\/blogs\/machine-learning\/build-a-system-for-catching-adverse-events-in-real-time-using-amazon-sagemaker-and-amazon-quicksight\/<\/p>\n","protected":false},"author":0,"featured_media":998,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/997"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=997"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/997\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/998"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=997"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=997"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=997"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}