{"id":404,"date":"2020-10-15T22:15:43","date_gmt":"2020-10-15T22:15:43","guid":{"rendered":"https:\/\/machine-learning.webcloning.com\/2020\/10\/15\/building-a-medical-image-search-platform-on-aws\/"},"modified":"2020-10-15T22:15:43","modified_gmt":"2020-10-15T22:15:43","slug":"building-a-medical-image-search-platform-on-aws","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2020\/10\/15\/building-a-medical-image-search-platform-on-aws\/","title":{"rendered":"Building a medical image search platform on AWS"},"content":{"rendered":"<div id=\"\">\n<p>Improving radiologist efficiency and preventing burnout are primary goals for healthcare providers. A nationwide study published in <em>Mayo Clinic Proceedings<\/em> in 2015 showed a concerning radiologist burnout rate of 61% [1]. In addition, the report concludes that \u201cburnout and satisfaction with work-life balance in US physicians worsened from 2011 to 2014. More than half of US physicians are now experiencing professional burnout.\u201d [2] As technologists, we\u2019re looking for ways to put new and innovative solutions in the hands of physicians to make them more efficient, reduce burnout, and improve care quality.<\/p>\n<p>To reduce burnout and improve value-based care through data-driven decision-making, artificial intelligence (AI) can be used to unlock the information trapped in the vast amount of unstructured data (for example, images, text, and voice) and create a clinically actionable knowledge base. AWS AI services can derive insights and relationships from free-form medical reports, automate the knowledge-sharing process, and eventually improve the personalized care experience.<\/p>\n<p>In this post, we use a convolutional neural network (CNN) as a feature extractor to convert medical images into a one-dimensional feature vector with a size of 1024. We call this process <em>medical image embedding<\/em>. 
Then we index the image feature vector using the <a href=\"https:\/\/docs.aws.amazon.com\/elasticsearch-service\/latest\/developerguide\/knn.html\" target=\"_blank\" rel=\"noopener noreferrer\">K-nearest neighbors (KNN) algorithm<\/a> in <a href=\"https:\/\/aws.amazon.com\/elasticsearch-service\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Elasticsearch Service<\/a> (Amazon ES) to build a similarity-based image retrieval system. Additionally, we use the AWS managed natural language processing (NLP) service <a href=\"https:\/\/aws.amazon.com\/comprehend\/medical\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Comprehend Medical<\/a> to perform <a href=\"https:\/\/en.wikipedia.org\/wiki\/Named-entity_recognition\" target=\"_blank\" rel=\"noopener noreferrer\">named entity recognition (NER)<\/a> against free text clinical reports. The detected named entities are also linked to medical ontology, ICD-10-CM, to enable simple aggregation and distribution analysis. The presented solution also includes a front-end React web application and backend GraphQL API managed by <a href=\"https:\/\/aws.amazon.com\/amplify\/\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Amplify<\/a> and <a href=\"https:\/\/aws.amazon.com\/appsync\/\" target=\"_blank\" rel=\"noopener noreferrer\">AWS AppSync<\/a>, and authentication is handled by <a href=\"https:\/\/aws.amazon.com\/cognito\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Cognito<\/a>.<\/p>\n<p>After deploying this working solution, the end-users (healthcare providers) can search through a repository of unstructured free text and medical images, conduct analytical operations, and use it in medical training and clinical decision support. This eliminates the need to manually analyze all the images and reports and get to the most relevant ones. Using a system like this improves the provider\u2019s efficiency. 
The following graphic shows an example end result of the deployed application.<\/p>\n<p>\u00a0<\/p>\n<h2>Dataset and architecture<\/h2>\n<p>We use the <a href=\"https:\/\/physionet.org\/content\/mimic-cxr\/2.0.0\/\" target=\"_blank\" rel=\"noopener noreferrer\">MIMIC CXR<\/a> dataset to demonstrate how this working solution can benefit healthcare providers, in particular, radiologists. MIMIC CXR is a publicly available database of chest X-ray images in DICOM format and the associated radiology reports as free text files [3]. The methods for data collection and the data structures in this dataset are documented in detail [3]. This is a restricted-access resource. To access the files, you must be a <a href=\"https:\/\/physionet.org\/register\/\" target=\"_blank\" rel=\"noopener noreferrer\">registered user<\/a> and <a href=\"https:\/\/physionet.org\/sign-dua\/mimic-cxr\/2.0.0\/\" target=\"_blank\" rel=\"noopener noreferrer\">sign the data use agreement<\/a>. The following sections provide more details on the components of the architecture.<\/p>\n<p>The following diagram illustrates the solution architecture.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-16011 size-full\" title=\"Solution architecture\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/09\/17\/2-Solution-Architecture.jpg\" alt=\"\" width=\"900\" height=\"415\"><\/p>\n<p>The architecture comprises offline data transformation and online query components. In the offline data transformation step, the unstructured data, including free text and image files, is converted into structured data.<\/p>\n<p>Electronic Health Record (EHR) radiology reports as free text are processed using Amazon Comprehend Medical, an NLP service that uses machine learning to extract relevant medical information from unstructured text, such as medical conditions including clinical signs, diagnoses, and symptoms. 
The named entities are identified and mapped to structured vocabularies, such as the ICD-10 Clinical Modification (ICD-10-CM) ontology. The unstructured text plus structured named entities are stored in Amazon ES to enable free text search and term aggregations.<\/p>\n<p>The medical images from the Picture Archiving and Communication System (PACS) are converted into vector representations using a pretrained deep learning model deployed in an <a href=\"https:\/\/aws.amazon.com\/ecs\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Elastic Container Service<\/a> (Amazon ECS) <a href=\"https:\/\/aws.amazon.com\/fargate\/\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Fargate<\/a> cluster. A similar <a href=\"https:\/\/aws.amazon.com\/blogs\/machine-learning\/visual-search-on-aws-part-1-engine-implementation-with-amazon-sagemaker\/\" target=\"_blank\" rel=\"noopener noreferrer\">visual search on AWS<\/a> has been published previously for online retail product image search. It used an <a href=\"https:\/\/aws.amazon.com\/sagemaker\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon SageMaker<\/a> <a href=\"https:\/\/aws.amazon.com\/blogs\/machine-learning\/amazon-sagemaker-supports-knn-classification-and-regression\/\" target=\"_blank\" rel=\"noopener noreferrer\">built-in KNN algorithm<\/a> for similarity search, which supports different index types and distance metrics.<\/p>\n<p>We took advantage of <a href=\"https:\/\/docs.aws.amazon.com\/elasticsearch-service\/latest\/developerguide\/knn.html\">KNN for Amazon ES<\/a> to find the <em>k<\/em> <em>closest<\/em> <em>images<\/em> from a feature space as demonstrated on the <a href=\"https:\/\/github.com\/aws-samples\/aws-visual-content-recommender\">GitHub repo<\/a>. KNN search is supported in Amazon ES version 7.4+. 
The container running on the ECS Fargate cluster reads medical images in DICOM format, carries out image embedding using a pretrained model, and saves a PNG thumbnail in an <a href=\"http:\/\/aws.amazon.com\/s3\">Amazon Simple Storage Service<\/a> (Amazon S3) bucket, which serves as the storage for <a href=\"https:\/\/docs.amplify.aws\/start\/q\/integration\/react\">AWS Amplify React<\/a> web application. It also parses out the DICOM image metadata and saves them in <a href=\"https:\/\/aws.amazon.com\/dynamodb\/\">Amazon DynamoDB<\/a>. The image vectors are saved in an Elasticsearch cluster and are used for the KNN visual search, which is implemented in an <a href=\"https:\/\/aws.amazon.com\/lambda\/\">AWS Lambda<\/a> function.<\/p>\n<p>The unstructured data from EHR and PACS needs to be transferred to Amazon S3 to trigger the serverless data processing pipeline through the Lambda functions. You can achieve this data transfer by using <a href=\"https:\/\/aws.amazon.com\/storagegateway\/\">AWS Storage Gateway<\/a> or <a href=\"https:\/\/aws.amazon.com\/datasync\/?whats-new-cards.sort-by=item.additionalFields.postDateTime&amp;whats-new-cards.sort-order=desc\">AWS DataSync<\/a>, which is out of the scope of this post. The online query API, including the GraphQL schemas and resolvers, was developed in AWS AppSync. The front-end web application was developed using the Amplify React framework, which can be deployed using the Amplify CLI. 
The detailed <a href=\"http:\/\/aws.amazon.com\/cloudformation\">AWS CloudFormation<\/a> templates and sample code are available in the <a href=\"https:\/\/github.com\/aws-samples\/medical-image-search\">Github repo<\/a>.<\/p>\n<h2>Solution overview<\/h2>\n<p>To deploy the solution, you complete the following steps:<\/p>\n<ol>\n<li>Deploy the Amplify React web application for online search.<\/li>\n<li>Deploy the image-embedding container to AWS Fargate.<\/li>\n<li>Deploy the data-processing pipeline and AWS AppSync API.<\/li>\n<\/ol>\n<h2>Deploying the Amplify React web application<\/h2>\n<p>The first step creates the Amplify React web application, as shown in the following diagram.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-16012 size-full\" title=\"Amplify React web application\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/09\/17\/3-Amplify-React.jpg\" alt=\"\" width=\"356\" height=\"111\"><\/p>\n<ol>\n<li>\n<a href=\"https:\/\/docs.aws.amazon.com\/cli\/latest\/userguide\/cli-chap-install.html\" target=\"_blank\" rel=\"noopener noreferrer\">Install<\/a> and <a href=\"https:\/\/docs.aws.amazon.com\/cli\/latest\/userguide\/cli-chap-configure.html\" target=\"_blank\" rel=\"noopener noreferrer\">configure<\/a> the <a href=\"https:\/\/aws.amazon.com\/cli\/\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Command Line Interface<\/a> (AWS CLI).<\/li>\n<li>Install the <a href=\"https:\/\/docs.amplify.aws\/cli\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Amplify CLI<\/a>.<\/li>\n<li>Clone the <a href=\"https:\/\/github.com\/aws-samples\/medical-image-search\" target=\"_blank\" rel=\"noopener noreferrer\">code base<\/a> with stepwise instructions.<\/li>\n<li>Go to your code base folder and initialize the Amplify app using the command <code>amplify init<\/code>. 
You must answer a series of questions, like the name of the Amplify app.<\/li>\n<\/ol>\n<p>After this step, you have the following changes in your local and cloud environments:<\/p>\n<ul>\n<li>A new folder named <code>amplify<\/code> is created in your local environment<\/li>\n<li>A file named <code>aws-exports.js<\/code> is created in the local <code>src<\/code> folder<\/li>\n<li>A new Amplify app is created on the AWS Cloud with the name provided during deployment (for example, <code>medical-image-search<\/code>)<\/li>\n<li>A CloudFormation stack is created on the AWS Cloud with the prefix <code>amplify-<em><span>&lt;AppName&gt;<\/span><\/em><\/code>\n<\/li>\n<\/ul>\n<p>You then create <a href=\"https:\/\/docs.amplify.aws\/lib\/auth\/getting-started\/q\/platform\/js\" target=\"_blank\" rel=\"noopener noreferrer\">authentication<\/a> and <a href=\"https:\/\/docs.amplify.aws\/lib\/storage\/getting-started\/q\/platform\/js\" target=\"_blank\" rel=\"noopener noreferrer\">storage<\/a> services for your Amplify app using the following commands:<\/p>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-code\">amplify add auth\r\namplify add storage\r\namplify push\r\n<\/code><\/pre>\n<\/div>\n<p>When the CloudFormation nested stacks for authentication and storage are successfully deployed, you can see that a new Amazon Cognito user pool (the authentication backend) and a new S3 bucket (the storage backend) have been created. 
Save the Amazon Cognito user pool ID and S3 bucket name from the <strong>Outputs<\/strong> tab of the corresponding CloudFormation nested stack (you use these later).<\/p>\n<p>The following screenshot shows the location of the user pool ID on the <strong>Outputs<\/strong> tab.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-16013 size-full\" title=\"Location of the user pool ID on the Outputs tab\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/09\/17\/4-Screenshot-2.jpg\" alt=\"\" width=\"900\" height=\"446\"><\/p>\n<p>The following screenshot shows the location of the bucket name on the <strong>Outputs<\/strong> tab.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-16014 size-full\" title=\"Location of the bucket name on the Outputs tab\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/09\/17\/5-Screenshot-3.jpg\" alt=\"\" width=\"900\" height=\"454\"><\/p>\n<h2>Deploying the image-embedding container to AWS Fargate<\/h2>\n<p>We use the <a href=\"https:\/\/github.com\/aws\/sagemaker-inference-toolkit\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon SageMaker Inference Toolkit<\/a> to serve the PyTorch inference model, which converts a medical image in DICOM format into a feature vector with the size of 1024. 
To create a container with all the dependencies, you can either use <a href=\"https:\/\/github.com\/aws\/deep-learning-containers\/blob\/master\/available_images.md\" target=\"_blank\" rel=\"noopener noreferrer\">pre-built deep learning container images<\/a> or derive a Dockerfile from the <a href=\"https:\/\/github.com\/aws\/sagemaker-pytorch-inference-toolkit\/blob\/master\/docker\/1.4.0\/py3\/Dockerfile.cpu\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon SageMaker PyTorch inference CPU container<\/a>, like the one from the <a href=\"https:\/\/github.com\/aws-samples\/medical-image-search\/blob\/master\/container\/Dockerfile\" target=\"_blank\" rel=\"noopener noreferrer\">GitHub repo<\/a>, in the <code>container<\/code> folder. You can <a href=\"https:\/\/github.com\/aws-samples\/medical-image-search\/blob\/master\/container\/build_and_push.sh\" target=\"_blank\" rel=\"noopener noreferrer\">build the Docker container and push it to Amazon ECR manually<\/a> or by running the shell script <a href=\"https:\/\/github.com\/aws-samples\/medical-image-search\/blob\/master\/container\/build_and_push.sh\" target=\"_blank\" rel=\"noopener noreferrer\">build_and_push.sh<\/a>. 
You use the repository image URI for the Docker container later to deploy the AWS Fargate cluster.<\/p>\n<p>The following screenshot shows the <code>sagemaker-pytorch-inference<\/code> repository on the Amazon ECR console.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-16015 size-full\" title=\"sagemaker-pytorch-inference repository on the Amazon ECR console\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/09\/17\/6-Screenshot-2.jpg\" alt=\"\" width=\"900\" height=\"311\"><\/p>\n<p>We use <a href=\"https:\/\/github.com\/awslabs\/multi-model-server\" target=\"_blank\" rel=\"noopener noreferrer\">Multi Model Server<\/a> (MMS) to <a href=\"https:\/\/aws.amazon.com\/blogs\/machine-learning\/deploying-pytorch-inference-with-mxnet-model-server\/\">serve the inference endpoint<\/a>. You need to <a href=\"https:\/\/github.com\/awslabs\/multi-model-server#installing-multi-model-server-with-pip\" target=\"_blank\" rel=\"noopener noreferrer\">install MMS with pip<\/a> locally, use the <a href=\"https:\/\/github.com\/awslabs\/multi-model-server\/tree\/master\/model-archiver\" target=\"_blank\" rel=\"noopener noreferrer\">Model archiver<\/a> CLI to package model artifacts into a single model archive <code>.mar<\/code> file, and upload it to an S3 bucket to be served by a containerized inference endpoint. The model inference handler is defined in <a href=\"https:\/\/github.com\/aws-samples\/medical-image-search\/blob\/master\/MMS\/dicom_featurization_service.py\" target=\"_blank\" rel=\"noopener noreferrer\">dicom_featurization_service.py<\/a> in the <code>MMS<\/code> folder. If you have a domain-specific pretrained PyTorch model, place the <code>model.pth<\/code> file in the <code>MMS<\/code> folder; otherwise, the handler uses a pretrained DenseNet121 [4] for image processing. 
See the following code:<\/p>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-python\"># Prefer a domain-specific model if model.pth was packaged in the archive\r\nmodel_file_path = os.path.join(model_dir, \"model.pth\")\r\nif os.path.isfile(model_file_path):\r\n    model = torch.load(model_file_path)\r\nelse:\r\n    # Otherwise fall back to DenseNet121 pretrained on ImageNet\r\n    model = models.densenet121(pretrained=True)\r\n    # Keep only the convolutional feature layers, then pool and flatten\r\n    model = model._modules.get('features')\r\n    model.add_module(\"end_relu\", nn.ReLU())\r\n    model.add_module(\"end_globpool\", nn.AdaptiveAvgPool2d((1, 1)))\r\n    model.add_module(\"end_flatten\", nn.Flatten())\r\nmodel = model.to(self.device)\r\nmodel.eval()\r\n<\/code><\/pre>\n<\/div>\n<p>The intermediate output of this CNN-based model represents each image as a feature vector. In other words, the output of the convolutional layers before the final classification layer is pooled and flattened into a vector representation. Run the following command in the <code>MMS<\/code> folder to package up the model archive file:<\/p>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-python\">model-archiver -f --model-name dicom_featurization_service --model-path .\/ --handler dicom_featurization_service:handle --export-path .\/<\/code><\/pre>\n<\/div>\n<p>The preceding command generates a package file named <code>dicom_featurization_service.mar<\/code>. Create a new S3 bucket and upload the package file to that bucket with a public-read access control list (ACL). 
See the following code:<\/p>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-python\">aws s3 cp .\/dicom_featurization_service.mar s3:\/\/<em><span>&lt;S3bucketname&gt;<\/span><\/em>\/ --acl public-read --profile <em><span>&lt;profilename&gt;<\/span><\/em><\/code><\/pre>\n<\/div>\n<p>You\u2019re now ready to deploy the image-embedding inference model to the AWS Fargate cluster using the CloudFormation template <a href=\"https:\/\/github.com\/aws-samples\/medical-image-search\/blob\/master\/CloudFormationTemplates\/ecsfargate.yaml\" target=\"_blank\" rel=\"noopener noreferrer\">ecsfargate.yaml<\/a> in the <code>CloudFormationTemplates<\/code> folder. You can deploy using the AWS CLI: go to the <code>CloudFormationTemplates<\/code> folder and copy the following command:<\/p>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-python\">aws cloudformation deploy --capabilities CAPABILITY_IAM --template-file .\/ecsfargate.yaml --stack-name <em><span>&lt;stackname&gt;<\/span><\/em> --parameter-overrides ImageUrl=<em><span>&lt;imageURI&gt;<\/span><\/em> InferenceModelS3Location=https:\/\/<em><span>&lt;S3bucketname&gt;<\/span><\/em>.s3.amazonaws.com\/dicom_featurization_service.mar --profile <em><span>&lt;profilename&gt;<\/span><\/em><\/code><\/pre>\n<\/div>\n<p>You need to replace the following placeholders:<\/p>\n<ul>\n<li>\n<strong>stackname<\/strong> \u2013 A unique name to refer to this CloudFormation stack<\/li>\n<li>\n<strong>imageURI<\/strong> \u2013 The image URI for the MMS Docker container uploaded in Amazon ECR<\/li>\n<li>\n<strong>S3bucketname<\/strong> \u2013 The MMS package in the S3 bucket, such as <code>https:\/\/<\/code><span><em>&lt;S3bucketname&gt;<\/em><\/span><code>.s3.amazonaws.com\/dicom_featurization_service.mar<\/code>\n<\/li>\n<li>\n<strong>profilename<\/strong> \u2013 Your AWS CLI profile name (default if not named)<\/li>\n<\/ul>\n<p>Alternatively, you can choose 
<strong>Launch stack<\/strong> for the following Regions:<\/p>\n<p><a href=\"https:\/\/console.aws.amazon.com\/cloudformation\/home?region=us-east-1#\/stacks\/create\/template?stackName=ImageSearchFargateInferenceEndpoint&amp;templateURL=https:\/\/medical-image-search-us-east-1.s3.amazonaws.com\/ecsfargate.yaml\" target=\"_blank\" rel=\"noopener noreferrer\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-16018\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/09\/17\/9-LaunchStack.jpg\" alt=\"\" width=\"141\" height=\"31\"><\/a><\/p>\n<p><a href=\"https:\/\/console.aws.amazon.com\/cloudformation\/home?region=us-west-2#\/stacks\/create\/template?stackName=ImageSearchFargateInferenceEndpoint&amp;templateURL=https:\/\/medical-image-search-us-west-2.s3-us-west-2.amazonaws.com\/ecsfargate.yaml\" target=\"_blank\" rel=\"noopener noreferrer\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-16018\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/09\/17\/9-LaunchStack.jpg\" alt=\"\" width=\"141\" height=\"31\"><\/a><\/p>\n<p>After the CloudFormation stack creation is complete, go to the stack <strong>Outputs<\/strong> tab on the AWS CloudFormation console and copy the <code>InferenceAPIUrl<\/code> for later deployment. 
See the following screenshot.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-16016 size-full\" title=\"Outputs tab on the AWS CloudFormation console\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/09\/17\/7-Screenshot-1.jpg\" alt=\"\" width=\"900\" height=\"304\"><\/p>\n<p>You can delete this stack after the offline image embedding jobs are finished to save costs, because it\u2019s not used for online queries.<\/p>\n<h2>Deploying the data-processing pipeline and AWS AppSync API<\/h2>\n<p>You deploy the image and free text data-processing pipeline and <a href=\"https:\/\/aws.amazon.com\/appsync\/\" target=\"_blank\" rel=\"noopener noreferrer\">AWS AppSync<\/a> API backend through another CloudFormation template named <a href=\"https:\/\/github.com\/aws-samples\/medical-image-search\/blob\/master\/CloudFormationTemplates\/AppSyncBackend.yaml\" target=\"_blank\" rel=\"noopener noreferrer\">AppSyncBackend.yaml<\/a> in the <code>CloudFormationTemplates<\/code> folder, which creates the AWS resources for this solution. 
See the following solution architecture.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-16017 size-full\" title=\"Solution architecture\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/09\/17\/8-Architecture.jpg\" alt=\"\" width=\"900\" height=\"555\"><\/p>\n<p>To deploy this stack using the AWS CLI, go to the <code>CloudFormationTemplates<\/code> folder and copy the following command:<\/p>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-python\">aws cloudformation deploy --capabilities CAPABILITY_NAMED_IAM --template-file .\/AppSyncBackend.yaml --stack-name <em><span>&lt;stackname&gt;<\/span><\/em> --parameter-overrides <em><span>AuthorizationUserPool<\/span><\/em>=&lt;CFN_output_auth&gt; <em><span>PNGBucketName<\/span><\/em>=&lt;CFN_output_storage&gt; InferenceEndpointURL=<em><span>&lt;inferenceAPIUrl&gt;<\/span><\/em> --profile <em><span>&lt;profilename&gt;<\/span><\/em><\/code><\/pre>\n<\/div>\n<p>Replace the following placeholders:<\/p>\n<ul>\n<li>\n<strong>stackname<\/strong> \u2013 A unique name to refer to this CloudFormation stack<\/li>\n<li>\n<strong>AuthorizationUserPool<\/strong> \u2013 Amazon Cognito user pool<\/li>\n<li>\n<strong>PNGBucketName<\/strong> \u2013 Amazon S3 bucket name<\/li>\n<li>\n<strong>InferenceEndpointURL<\/strong> \u2013 The inference API endpoint<\/li>\n<li>\n<strong>Profilename<\/strong> \u2013 The AWS CLI profile name (use default if not named)<\/li>\n<\/ul>\n<p>Alternatively, you can choose <strong>Launch stack <\/strong>for the following Regions:<\/p>\n<p><a href=\"https:\/\/console.aws.amazon.com\/cloudformation\/home?region=us-east-1#\/stacks\/create\/template?stackName=ImageSearchAppSyncBackend&amp;templateURL=https:\/\/medical-image-search-us-east-1.s3.amazonaws.com\/AppSyncBackend.yaml\" target=\"_blank\" rel=\"noopener noreferrer\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full 
wp-image-16018\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/09\/17\/9-LaunchStack.jpg\" alt=\"\" width=\"141\" height=\"31\"><\/a><\/p>\n<p><a href=\"https:\/\/console.aws.amazon.com\/cloudformation\/home?region=us-west-2#\/stacks\/create\/template?stackName=ImageSearchAppSyncBackend&amp;templateURL=https:\/\/medical-image-search-us-west-2.s3.amazonaws.com\/AppSyncBackend.yaml\" target=\"_blank\" rel=\"noopener noreferrer\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-16018\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/09\/17\/9-LaunchStack.jpg\" alt=\"\" width=\"141\" height=\"31\"><\/a><\/p>\n<p>You can download the <a href=\"https:\/\/medical-image-search-us-east-1.s3.amazonaws.com\/lambda.zip\" target=\"_blank\" rel=\"noopener noreferrer\">Lambda function<\/a> for medical image processing, <a href=\"https:\/\/github.com\/aws-samples\/medical-image-search\/blob\/master\/CloudFormationTemplates\/AppSyncBackend.yaml\" target=\"_blank\" rel=\"noopener noreferrer\">CMprocessLambdaFunction.py<\/a>, and its <a href=\"https:\/\/medical-image-search-us-east-1.s3.amazonaws.com\/python.zip\" target=\"_blank\" rel=\"noopener noreferrer\">dependency layer<\/a> separately if you deploy this stack in AWS Regions other than <code>us-east-1<\/code> and <code>us-west-2<\/code>. 
Because their file size exceeds the CloudFormation template limit, you need to upload them to your own S3 bucket (either create a new S3 bucket or use an existing one, like the aforementioned S3 bucket for hosting the MMS model package file) and override the <a href=\"https:\/\/github.com\/aws-samples\/medical-image-search\/blob\/master\/CloudFormationTemplates\/AppSyncBackend.yaml#L43\" target=\"_blank\" rel=\"noopener noreferrer\">LambdaBucket<\/a> mapping parameter using your own bucket name.<\/p>\n<p>Save the AWS AppSync API URL and AWS Region from the settings on the AWS AppSync console.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-16879 size-full\" title=\"Saving the AWS AppSync API URL and AWS Region from the settings on the AWS AppSync console\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/10\/08\/Settings.jpg\" alt=\"\" width=\"900\" height=\"725\"><\/p>\n<p>Edit the <code>src\/aws-exports.js<\/code> file in your local environment and replace the placeholders with those values:<\/p>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-python\">const awsmobile = {\r\n  \"aws_appsync_graphqlEndpoint\": \"&lt;AppSync API URL&gt;\",\r\n  \"aws_appsync_region\": \"&lt;AWS AppSync Region&gt;\",\r\n  \"aws_appsync_authenticationType\": \"AMAZON_COGNITO_USER_POOLS\"\r\n};\r\n<\/code><\/pre>\n<\/div>\n<p>After this stack is successfully deployed, you\u2019re ready to use this solution. 
If you have in-house EHR and PACS databases, you can set up the <a href=\"https:\/\/aws.amazon.com\/storagegateway\/\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Storage Gateway<\/a> to transfer data to the S3 bucket to trigger the transformation jobs.<\/p>\n<p>Alternatively, you can use the public dataset MIMIC CXR: download the <a href=\"https:\/\/physionet.org\/content\/mimic-cxr\/2.0.0\/\" target=\"_blank\" rel=\"noopener noreferrer\">MIMIC CXR dataset<\/a> from PhysioNet (to access the files, you must be a <a href=\"https:\/\/physionet.org\/settings\/credentialing\/\" target=\"_blank\" rel=\"noopener noreferrer\">credentialed user<\/a> and <a href=\"https:\/\/physionet.org\/sign-dua\/mimic-cxr\/2.0.0\/\" target=\"_blank\" rel=\"noopener noreferrer\">sign the data use agreement<\/a> for the project) and upload the DICOM files to the S3 bucket <code>mimic-cxr-dicom-<\/code> and the free text radiology reports to the S3 bucket <code>mimic-cxr-report-<\/code>. If everything works as expected, you should see the new records created in the DynamoDB table <code>medical-image-metadata<\/code> and the Amazon ES domain <code>medical-image-search<\/code>.<\/p>\n<p>You can test the Amplify React web application locally by running the following command:<\/p>\n<p>Or you can publish the React web app by deploying it in Amazon S3 with an Amazon CloudFront distribution, by first entering the following code:<\/p>\n<p>Then, enter the following code:<\/p>\n<p>You can see the hosting endpoint for the Amplify React web application after deployment.<\/p>\n<h2>Conclusion<\/h2>\n<p>We have demonstrated how to deploy a solution for indexing and searching medical images on AWS that separates the offline data ingestion and online search query functions. 
You can use AWS AI services to transform unstructured data, for example the medical images and radiology reports, into structured data.<\/p>\n<p>By default, the solution uses a general-purpose model trained on <a href=\"http:\/\/www.image-net.org\/\" target=\"_blank\" rel=\"noopener noreferrer\">ImageNet<\/a> to extract features from images. However, this default model may not be accurate enough to extract medical image features because there are fundamental differences in appearance, size, and features between medical images in their raw form. Such differences make it hard to train commonly adopted triplet-based learning networks [5], where semantically relevant images or objects can be easily defined or ranked.<\/p>\n<p>To improve search relevancy, we performed an experiment by using the same MIMIC CXR dataset and the derived diagnosis labels to train a <a href=\"https:\/\/medical-image-search-us-east-1.s3.amazonaws.com\/model.pth\" target=\"_blank\" rel=\"noopener noreferrer\">weakly supervised disease classification network<\/a> similar to Wang et al. [6]. We found this domain-specific pretrained model yielded qualitatively better visual search results. We therefore recommend that you <a href=\"https:\/\/docs.aws.amazon.com\/sagemaker\/latest\/dg\/your-algorithms.html\" target=\"_blank\" rel=\"noopener noreferrer\">bring your own model (BYOM)<\/a> to this search platform for real-world implementation.<\/p>\n<p>The methods presented here enable you to perform indexing, searching, and aggregation against unstructured images in addition to free text. This sets the stage for future work that can combine these features into a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Multimodal_search\" target=\"_blank\" rel=\"noopener noreferrer\">multimodal<\/a> medical image search engine. Information retrieval from unstructured corpuses of clinical notes and images is a time-consuming and tedious task. 
Our solution allows radiologists to become more efficient and helps them reduce potential burnout.<\/p>\n<p>To find the latest developments in this solution, check out <a href=\"https:\/\/github.com\/aws-samples\/medical-image-search\" target=\"_blank\" rel=\"noopener noreferrer\">medical image search on GitHub<\/a>.<\/p>\n<h2>References<\/h2>\n<ol>\n<li><a href=\"https:\/\/www.radiologybusiness.com\/topics\/leadership\/radiologist-burnout-are-we-done-yet\" target=\"_blank\" rel=\"noopener noreferrer\">https:\/\/www.radiologybusiness.com\/topics\/leadership\/radiologist-burnout-are-we-done-yet<\/a><\/li>\n<li><a href=\"https:\/\/www.mayoclinicproceedings.org\/article\/S0025-6196(15)00716-8\/abstract#secsectitle0010\" target=\"_blank\" rel=\"noopener noreferrer\">https:\/\/www.mayoclinicproceedings.org\/article\/S0025-6196(15)00716-8\/abstract#secsectitle0010<\/a><\/li>\n<li>Johnson, Alistair EW, et al. \u201cMIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports.\u201d Scientific Data 6, 2019.<\/li>\n<li>Huang, Gao, et al. \u201cDensely connected convolutional networks.\u201d Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017.<\/li>\n<li>Wang, Jiang, et al. \u201cLearning fine-grained image similarity with deep ranking.\u201d Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014.<\/li>\n<li>Wang, Xiaosong, et al. 
\u201cChestX-ray8: Hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases.\u201d Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017.<\/li>\n<\/ol>\n<hr>\n<h3>About the Authors<\/h3>\n<p><strong>\u00a0<img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-16082 alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/09\/17\/GangFu_Headshot-1.jpg\" alt=\"\" width=\"100\" height=\"135\">Gang Fu<\/strong> is a Healthcare Solution Architect at AWS. He holds a PhD in Pharmaceutical Science from the University of Mississippi and has over ten years of technology and biomedical research experience. He is passionate about technology and the impact it can make on healthcare.<\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<p><strong><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-16080 alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/09\/17\/ujjwal-ratan-100-1.jpg\" alt=\"\" width=\"100\" height=\"134\"><\/strong><strong>Ujjwal Ratan<\/strong> is a Principal Machine Learning Specialist Solution Architect in the Global Healthcare and Lifesciences team at Amazon Web Services. He works on the application of machine learning and deep learning to real-world industry problems like medical imaging, unstructured clinical text, genomics, precision medicine, clinical trials, and quality of care improvement. He has expertise in scaling machine learning\/deep learning algorithms on the AWS cloud for accelerated training and inference. 
In his free time, he enjoys listening to (and playing) music and taking unplanned road trips with his family.<\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<p><strong><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-16916 alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/10\/08\/Erhan_Bas.jpg\" alt=\"\" width=\"100\" height=\"120\">Erhan Bas <\/strong>is a Senior Applied Scientist in the AWS Rekognition team, currently developing deep learning algorithms for computer vision applications. His expertise is in machine learning and large scale image analysis techniques, especially in biomedical, life sciences and industrial inspection technologies. He enjoys playing video games, drinking coffee, and traveling with his family.<\/p>\n<p>\u00a0<\/p>\n<\/p><\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/aws.amazon.com\/blogs\/machine-learning\/building-a-medical-image-search-platform-on-aws\/<\/p>\n","protected":false},"author":0,"featured_media":405,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/404"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=404"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/404\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/405"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=404"}],"wp:term
":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=404"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=404"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}