{"id":450,"date":"2020-10-27T00:32:34","date_gmt":"2020-10-27T00:32:34","guid":{"rendered":"https:\/\/machine-learning.webcloning.com\/2020\/10\/27\/building-an-nlu-powered-search-application-with-amazon-sagemaker-and-the-amazon-es-knn-feature\/"},"modified":"2020-10-27T00:32:34","modified_gmt":"2020-10-27T00:32:34","slug":"building-an-nlu-powered-search-application-with-amazon-sagemaker-and-the-amazon-es-knn-feature","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2020\/10\/27\/building-an-nlu-powered-search-application-with-amazon-sagemaker-and-the-amazon-es-knn-feature\/","title":{"rendered":"Building an NLU-powered search application with Amazon SageMaker and the Amazon ES KNN feature"},"content":{"rendered":"<div id=\"\">\n<p>The rise of semantic search engines has made ecommerce and retail businesses search easier for its consumers. Search engines powered by <a href=\"https:\/\/en.wikipedia.org\/wiki\/Natural-language_understanding\" target=\"_blank\" rel=\"noopener noreferrer\">natural language understanding<\/a> (NLU) allow you to speak or type into a device using your preferred conversational language rather than finding the right keywords for fetching the best results. You can query using words or sentences in your native language, leaving it to the search engine to deliver the best results.<\/p>\n<p><a href=\"https:\/\/aws.amazon.com\/sagemaker\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon SageMaker<\/a> is a fully managed service that provides every developer and data scientist with the ability to build, train, and deploy machine learning (ML) models quickly. <a href=\"https:\/\/aws.amazon.com\/elasticsearch-service\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Elasticsearch Service<\/a> (Amazon ES) is a fully managed service that makes it easy for you to deploy, secure, and run Elasticsearch cost-effectively at scale. 
Amazon ES offers k-nearest neighbor (KNN) search, which can enhance search in use cases such as product recommendations, fraud detection, image and video search, and specific semantic scenarios such as document and query similarity. Alternatively, you can also choose <a href=\"https:\/\/aws.amazon.com\/kendra\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Kendra<\/a>, a highly accurate and easy-to-use enterprise search service that\u2019s powered by machine learning, with no machine learning experience required. In this post, we explain how you can implement an NLU-based product search for certain types of applications using Amazon SageMaker\u00a0and the Amazon ES KNN feature.<\/p>\n<p>In the post <a href=\"https:\/\/aws.amazon.com\/blogs\/machine-learning\/building-a-visual-search-application-with-amazon-sagemaker-and-amazon-es\/\" target=\"_blank\" rel=\"noopener noreferrer\">Building a visual search application with Amazon SageMaker and Amazon ES<\/a>, we shared how to build a visual search application using Amazon SageMaker and the Amazon ES KNN feature with the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Euclidean_distance\" target=\"_blank\" rel=\"noopener noreferrer\">Euclidean distance<\/a> metric. Amazon ES now supports open-source Elasticsearch version 7.7 and includes the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Cosine_similarity\" target=\"_blank\" rel=\"noopener noreferrer\">cosine similarity<\/a> metric for KNN indexes. Cosine similarity measures the cosine of the angle between two vectors, where a smaller angle (and therefore a larger cosine value) denotes higher similarity between the vectors. Because cosine similarity measures the orientation of two vectors rather than their magnitude, it\u2019s an ideal choice for some specific semantic search applications. 
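<\/p>
<p>To make the metric concrete, the following toy Python sketch ranks a few stored toy embeddings by cosine similarity (3-dimensional vectors stand in for real sentence embeddings, and the values are made up for illustration; Amazon ES computes this for you inside a KNN index):<\/p>

```python
import math

def cosine(a, b):
    """Cosine similarity: higher means the vectors point in more similar directions."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy "embeddings" for three stored descriptions (not real BERT output)
corpus = {
    "summer dress":         [0.9, 0.1, 0.0],
    "summer flowery dress": [0.8, 0.2, 0.1],
    "wedding dress":        [0.1, 0.9, 0.3],
}

query = [0.85, 0.15, 0.05]  # stands in for the embedding of a similar summer query
ranked = sorted(corpus, key=lambda name: cosine(query, corpus[name]), reverse=True)
print(ranked)  # the two summer dresses rank above "wedding dress"
```

<p>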
The highly distributed architecture of Amazon ES enables you to implement an enterprise-grade search engine with enhanced KNN ranking, with high recall and performance.<\/p>\n<p>In this post, you build a simple search application that demonstrates the potential of using KNN with Amazon ES compared to the traditional Amazon ES ranking method, including a web application for testing the KNN-based search queries in your browser. The application also compares the search results with Elasticsearch match queries to demonstrate the difference between KNN search and full-text search.<\/p>\n<h2>Overview of solution<\/h2>\n<p>Regular Elasticsearch text-matching search is useful for keyword-based search, but KNN-based search is a more natural way to search for something. For example, when you search for a wedding dress using the KNN-based search application, it returns similar results whether you type \u201cwedding dress\u201d or \u201cmarriage dress.\u201d Implementing this KNN-based search application consists of two phases:<\/p>\n<ul>\n<li>\n<strong>KNN reference index<\/strong> \u2013 In this phase, you pass a set of corpus documents through a deep learning model to extract their features, or <em>embeddings<\/em>. Text embeddings are a numerical representation of the corpus. You save those features into a KNN index on Amazon ES. The concept underpinning KNN is that similar data points exist in close proximity in the vector space. As an example, \u201csummer dress\u201d and \u201csummer flowery dress\u201d are both similar, so these text embeddings are collocated, as opposed to \u201csummer dress\u201d vs. \u201cwedding dress.\u201d<\/li>\n<li>\n<strong>KNN index query<\/strong> \u2013 This is the inference phase of the application. In this phase, you submit a text search query through the deep learning model to extract the features. Then, you use those embeddings to query the reference KNN index. 
The KNN index returns similar text embeddings from the KNN vector space. For example, if you pass the feature vector for the text \u201cmarriage dress,\u201d it returns \u201cwedding dress\u201d embeddings as a similar item.<\/li>\n<\/ul>\n<p>Next, let\u2019s look at each phase in detail, with the associated AWS architecture.<\/p>\n<h3>KNN reference index creation<\/h3>\n<p>For this use case, you use dress images and their visual descriptions from the <a href=\"https:\/\/github.com\/zalandoresearch\/feidegger\" target=\"_blank\" rel=\"noopener noreferrer\">Feidegger<\/a> dataset. This dataset is a multi-modal corpus that focuses specifically on the domain of fashion items and their visual descriptions in German. The dataset was created as part of ongoing research at <a href=\"https:\/\/research.zalando.com\/welcome\/mission\/research-projects\/feidegger-dataset\/\" target=\"_blank\" rel=\"noopener noreferrer\">Zalando<\/a> into text-image multi-modality in the area of fashion.<\/p>\n<p>In this step, you translate each dress description from German to English using <a href=\"https:\/\/aws.amazon.com\/translate\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Translate<\/a>. From each English description, you extract the feature vector, which is an n-dimensional vector of numerical features that represents the dress. 
You use a <a href=\"https:\/\/github.com\/UKPLab\/sentence-transformers\" target=\"_blank\" rel=\"noopener noreferrer\">pre-trained BERT<\/a> model hosted in Amazon SageMaker to extract a 768-dimensional feature vector for each visual description of the dress, and store the vectors in a <a href=\"https:\/\/docs.aws.amazon.com\/elasticsearch-service\/latest\/developerguide\/knn.html\" target=\"_blank\" rel=\"noopener noreferrer\">KNN index in an Amazon ES domain<\/a>.<\/p>\n<p>The following diagram illustrates the workflow for creating the KNN index.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-16850\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/10\/07\/1-1-3.jpg\" alt=\"\" width=\"900\" height=\"463\"><\/p>\n<p>The process includes the following steps:<\/p>\n<ol>\n<li>Users interact with a Jupyter notebook on an Amazon SageMaker notebook instance. An Amazon SageMaker notebook instance is an ML compute instance running the Jupyter Notebook app. Amazon SageMaker manages creating the instance and related resources.<\/li>\n<li>Each item description, originally open-sourced in German, is translated to English using Amazon Translate.<\/li>\n<li>A pre-trained BERT model is downloaded, and the model artifact is serialized and stored in <a href=\"http:\/\/aws.amazon.com\/s3\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Simple Storage Service<\/a> (Amazon S3). 
The model is served from a PyTorch model server on an Amazon SageMaker real-time endpoint.<\/li>\n<li>Translated descriptions are pushed through the SageMaker endpoint to extract fixed-length features (embeddings).<\/li>\n<li>The notebook code writes the text embeddings to the KNN index, along with the product\u2019s Amazon S3 URI, in an Amazon ES domain.<\/li>\n<\/ol>\n<h3>KNN search from a query text<\/h3>\n<p>In this step, you present a search query text string from the application, which passes through the Amazon SageMaker hosted model to extract a 768-dimensional feature vector. You use these features to query the KNN index in Amazon ES. <a href=\"https:\/\/docs.aws.amazon.com\/elasticsearch-service\/latest\/developerguide\/knn.html\" target=\"_blank\" rel=\"noopener noreferrer\">KNN for Amazon ES<\/a> lets you search for points in a vector space and find the nearest neighbors for those points by cosine similarity (the default is Euclidean distance). When it finds the nearest neighbor vectors (for example, <em>k<\/em> = 3 nearest neighbors) for a given query text, it returns the associated Amazon S3 images to the application. 
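<\/p>
<p>The query itself is an ordinary Elasticsearch search body. The following sketch shows how such a body can be built in Python (the index and field names match the notebook used later in this post; treat the exact shape as illustrative of the KNN plugin query syntax):<\/p>

```python
def build_knn_query(query_vector, k=3):
    """Search body for a KNN query against the zalando_nlu_vector field."""
    return {
        "size": k,                       # return the top-k nearest neighbors
        "query": {
            "knn": {
                "zalando_nlu_vector": {  # knn_vector field from the index mapping
                    "vector": query_vector,
                    "k": k
                }
            }
        },
        "_source": ["image", "description"]  # skip echoing the stored vector back
    }

# Usage sketch: es.search(index='idx_zalando', body=build_knn_query(embedding))
```

<p>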
The following diagram illustrates the KNN search full-stack application architecture.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-16851\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/10\/07\/2-1-2.jpg\" alt=\"\" width=\"900\" height=\"589\"><\/p>\n<p>The process includes the following steps:<\/p>\n<ol>\n<li>The end-user accesses the web application from their browser or mobile device.<\/li>\n<li>A user-provided search query string is sent to <a href=\"https:\/\/aws.amazon.com\/api-gateway\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon API Gateway<\/a> and <a href=\"http:\/\/aws.amazon.com\/lambda\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Lambda<\/a>.<\/li>\n<li>The Lambda function invokes the Amazon SageMaker real-time endpoint, and the model returns a vector of the search query embeddings. Amazon SageMaker hosting provides a managed HTTPS endpoint for predictions and automatically scales to the performance needed for your application using Application Auto Scaling.<\/li>\n<li>The function passes the search query embedding vector as the search value for a KNN search against the index in the Amazon ES domain. 
A list of <em>k<\/em> similar items and their respective Amazon S3 URIs is returned.<\/li>\n<li>The function generates pre-signed Amazon S3 URLs to return to the client web application, which uses them to display similar items in the browser.<\/li>\n<\/ol>\n<h2>Prerequisites<\/h2>\n<p>For this walkthrough, you should have an <a href=\"https:\/\/signin.aws.amazon.com\/signin?redirect_uri=https%3A%2F%2Fportal.aws.amazon.com%2Fbilling%2Fsignup%2Fresume&amp;client_id=signup\" target=\"_blank\" rel=\"noopener noreferrer\">AWS account<\/a> with appropriate <a href=\"http:\/\/aws.amazon.com\/iam\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Identity and Access Management<\/a> (IAM) permissions to launch the <a href=\"http:\/\/aws.amazon.com\/cloudformation\" target=\"_blank\" rel=\"noopener noreferrer\">AWS CloudFormation<\/a> template.<\/p>\n<h2>Deploying your solution<\/h2>\n<p>You use a CloudFormation stack to deploy the solution. The stack creates all the necessary resources, including the following:<\/p>\n<ul>\n<li>An Amazon SageMaker notebook instance to run Python code in a Jupyter notebook<\/li>\n<li>An IAM role associated with the notebook instance<\/li>\n<li>An Amazon ES domain to store and retrieve sentence embedding vectors in a KNN index<\/li>\n<li>Two S3 buckets: one for storing the source fashion images and another for hosting a static website<\/li>\n<\/ul>\n<p>From the Jupyter notebook, you also deploy the following:<\/p>\n<ul>\n<li>An Amazon SageMaker endpoint for getting fixed-length sentence embedding vectors in real time.<\/li>\n<li>An <a href=\"https:\/\/aws.amazon.com\/serverless\/sam\/\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Serverless Application Model<\/a> (AWS SAM) template for a serverless backend using API Gateway and Lambda.<\/li>\n<li>A static front-end website hosted on an S3 bucket to demonstrate a real-world, end-to-end ML application. 
The front-end code uses ReactJS and the <a href=\"https:\/\/aws.amazon.com\/amplify\/\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Amplify<\/a> JavaScript library.<\/li>\n<\/ul>\n<p>To get started, complete the following steps:<\/p>\n<ol>\n<li>Sign in to the <a href=\"https:\/\/aws.amazon.com\/console\/\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Management Console<\/a> with your IAM user name and password.<\/li>\n<li>Choose <strong>Launch Stack<\/strong> and open it in a new tab:<\/li>\n<\/ol>\n<p><a href=\"https:\/\/console.aws.amazon.com\/cloudformation\/home?region=us-east-1#\/stacks\/quickcreate?templateUrl=https:\/\/aws-ml-blog.s3.amazonaws.com\/artifacts\/nlu-search\/initial_template.yaml&amp;stackName=nlu-search\" target=\"_blank\" rel=\"noopener noreferrer\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-16216\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/09\/23\/LaunchStack.jpg\" alt=\"\" width=\"107\" height=\"20\"><\/a><\/p>\n<ol start=\"3\">\n<li>On the <strong>Quick create stack<\/strong> page, select the check-box to acknowledge the creation of IAM resources.<\/li>\n<li>Choose <strong>Create stack.<\/strong>\n<\/li>\n<\/ol>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-16852\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/10\/07\/3-1-3.jpg\" alt=\"\" width=\"900\" height=\"872\"><\/p>\n<ol start=\"5\">\n<li>Wait for the stack to complete.<\/li>\n<\/ol>\n<p>You can examine various events from the stack creation process on the <strong>Events<\/strong> tab. 
When the stack creation is complete, you see the status CREATE_COMPLETE.<\/p>\n<p>You can look on the <strong>Resources<\/strong> tab to see all the resources the CloudFormation template created.<\/p>\n<ol start=\"6\">\n<li>On the <strong>Outputs<\/strong> tab, choose the <strong>SageMakerNotebookURL<\/strong> link.<\/li>\n<\/ol>\n<p>This hyperlink opens the Jupyter notebook on your Amazon SageMaker notebook instance that you use to complete the rest of the lab.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-16853\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/10\/07\/4-1-3.jpg\" alt=\"\" width=\"900\" height=\"283\"><\/p>\n<p>You should be on the Jupyter notebook landing page.<\/p>\n<ol start=\"7\">\n<li>Choose <strong>nlu-based-item-search.ipynb<\/strong>.<\/li>\n<\/ol>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-16854 size-large\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/10\/07\/5-1-2-810x1024.jpg\" alt=\"\" width=\"810\" height=\"1024\"><\/p>\n<h2>Building a KNN index on Amazon ES<\/h2>\n<p>For this step, you should be at the beginning of the notebook with the title <strong>NLU based Item Search<\/strong>. Follow the steps in the notebook and run each cell in order.<\/p>\n<p>You use a pre-trained BERT model (distilbert-base-nli-stsb-mean-tokens) from <a href=\"https:\/\/github.com\/UKPLab\/sentence-transformers\" target=\"_blank\" rel=\"noopener noreferrer\">sentence-transformers<\/a> and host it on an Amazon SageMaker PyTorch model server endpoint to generate fixed-length sentence embeddings. The embeddings are saved to the Amazon ES domain created in the CloudFormation stack. 
For more information, see the markdown cells in the notebook.<\/p>\n<p>Continue when you reach the cell <code>Deploying a full-stack NLU search application<\/code> in your notebook.<\/p>\n<p>The notebook contains several important cells; we walk you through a few of them.<\/p>\n<p>Download the multi-modal corpus dataset from <a href=\"https:\/\/github.com\/zalandoresearch\/feidegger\" target=\"_blank\" rel=\"noopener noreferrer\">Feidegger<\/a>, which contains fashion images and descriptions in German. See the following code:<\/p>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-python\">## Data Preparation\r\n\r\nimport os \r\nimport shutil\r\nimport json\r\nimport tqdm\r\nimport urllib.request\r\nfrom tqdm import notebook\r\nfrom multiprocessing import cpu_count\r\nfrom tqdm.contrib.concurrent import process_map\r\n\r\nimages_path = 'data\/feidegger\/fashion'\r\nfilename = 'metadata.json'\r\n\r\nmy_bucket = s3_resource.Bucket(bucket)\r\n\r\nif not os.path.isdir(images_path):\r\n    os.makedirs(images_path)\r\n\r\ndef download_metadata(url):\r\n    if not os.path.exists(filename):\r\n        urllib.request.urlretrieve(url, filename)\r\n        \r\n#download metadata.json to local notebook\r\ndownload_metadata('https:\/\/raw.githubusercontent.com\/zalandoresearch\/feidegger\/master\/data\/FEIDEGGER_release_1.1.json')\r\n\r\ndef generate_image_list(filename):\r\n    metadata = open(filename,'r')\r\n    data = json.load(metadata)\r\n    url_lst = []\r\n    for i in range(len(data)):\r\n        url_lst.append(data[i]['url'])\r\n    return url_lst\r\n\r\n\r\ndef download_image(url):\r\n    urllib.request.urlretrieve(url, images_path + '\/' + url.split(\"\/\")[-1])\r\n                    \r\n#generate image list            \r\nurl_lst = generate_image_list(filename)     \r\n\r\nworkers = 2 * cpu_count()\r\n\r\n#downloading images to local disk\r\nprocess_map(download_image, url_lst, 
max_workers=workers)\r\n<\/code><\/pre>\n<\/div>\n<p>Upload the dataset to Amazon S3:<\/p>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-python\"># Uploading dataset to S3\r\n\r\nfiles_to_upload = []\r\ndirName = 'data'\r\nfor path, subdirs, files in os.walk('.\/' + dirName):\r\n    path = path.replace(\"\\\\\", \"\/\")\r\n    directory_name = path.replace('.\/',\"\")\r\n    for file in files:\r\n        files_to_upload.append({\r\n            \"filename\": os.path.join(path, file),\r\n            \"key\": directory_name+'\/'+file\r\n        })\r\n        \r\n\r\ndef upload_to_s3(file):\r\n    my_bucket.upload_file(file['filename'], file['key'])\r\n        \r\n#uploading images to s3\r\nprocess_map(upload_to_s3, files_to_upload, max_workers=workers)\r\n<\/code><\/pre>\n<\/div>\n<p>This dataset has product descriptions in German, so you use Amazon Translate for the English translation for each German sentence:<\/p>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-python\">with open(filename) as json_file:\r\n    data = json.load(json_file)\r\n\r\n#Define translator function\r\ndef translate_txt(data):\r\n    results = {}\r\n    results['filename'] = f's3:\/\/{bucket}\/data\/feidegger\/fashion\/' + data['url'].split(\"\/\")[-1]\r\n    results['descriptions'] = []\r\n    translate = boto3.client(service_name='translate', use_ssl=True)\r\n    for i in data['descriptions']:\r\n        result = translate.translate_text(Text=str(i), \r\n            SourceLanguageCode=\"de\", TargetLanguageCode=\"en\")\r\n        results['descriptions'].append(result['TranslatedText'])\r\n    return results\r\n<\/code><\/pre>\n<\/div>\n<p>Save the sentence transformers model to the notebook instance:<\/p>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-python\">!pip install sentence-transformers\r\n\r\n#Save the model to disk, which we will host on SageMaker\r\nfrom 
sentence_transformers import models, SentenceTransformer\r\nsaved_model_dir = 'transformer'\r\nif not os.path.isdir(saved_model_dir):\r\n    os.makedirs(saved_model_dir)\r\n\r\nmodel = SentenceTransformer('distilbert-base-nli-stsb-mean-tokens')\r\nmodel.save(saved_model_dir)\r\n<\/code><\/pre>\n<\/div>\n<p>Upload the model artifact (model.tar.gz) to Amazon S3 with the following code:<\/p>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-python\">#zip the model in tar.gz format\r\nimport tarfile\r\nexport_dir = 'transformer'\r\nwith tarfile.open('model.tar.gz', mode='w:gz') as archive:\r\n    archive.add(export_dir, recursive=True)\r\n\r\n#Upload the model to S3\r\n\r\ninputs = sagemaker_session.upload_data(path='model.tar.gz', key_prefix='model')\r\ninputs\r\n<\/code><\/pre>\n<\/div>\n<p>Deploy the model into an Amazon SageMaker PyTorch model server using the Amazon SageMaker Python SDK. See the following code:<\/p>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-python\">from sagemaker.pytorch import PyTorch, PyTorchModel\r\nfrom sagemaker.predictor import RealTimePredictor\r\nfrom sagemaker import get_execution_role\r\n\r\nclass StringPredictor(RealTimePredictor):\r\n    def __init__(self, endpoint_name, sagemaker_session):\r\n        super(StringPredictor, self).__init__(endpoint_name, sagemaker_session, content_type='text\/plain')\r\n\r\npytorch_model = PyTorchModel(model_data = inputs, \r\n                             role=role, \r\n                             entry_point ='inference.py',\r\n                             source_dir = '.\/code', \r\n                             framework_version = '1.3.1',\r\n                             predictor_cls=StringPredictor)\r\n\r\npredictor = pytorch_model.deploy(instance_type='ml.m5.large', initial_instance_count=3)\r\n<\/code><\/pre>\n<\/div>\n<p>Define a cosine similarity Amazon ES KNN index mapping with the following code (to define cosine 
similarity KNN index mapping, you need Amazon ES 7.7 and above):<\/p>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-python\">#KNN index mapping\r\nknn_index = {\r\n    \"settings\": {\r\n        \"index.knn\": True,\r\n        \"index.knn.space_type\": \"cosinesimil\",\r\n        \"analysis\": {\r\n          \"analyzer\": {\r\n            \"default\": {\r\n              \"type\": \"standard\",\r\n              \"stopwords\": \"_english_\"\r\n            }\r\n          }\r\n        }\r\n    },\r\n    \"mappings\": {\r\n        \"properties\": {\r\n           \"zalando_nlu_vector\": {\r\n                \"type\": \"knn_vector\",\r\n                \"dimension\": 768\r\n            } \r\n        }\r\n    }\r\n}\r\n<\/code><\/pre>\n<\/div>\n<p>Each product has five visual descriptions, so you combine all five descriptions and get one fixed-length sentence embedding. See the following code:<\/p>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-python\"># For each product, we are concatenating all the \r\n# product descriptions into a single sentence,\r\n# so that we will have one embedding for each product\r\n\r\ndef concat_desc(results):\r\n    obj = {\r\n        'filename': results['filename'],\r\n    }\r\n    obj['descriptions'] = ' '.join(results['descriptions'])\r\n    return obj\r\n\r\nconcat_results = map(concat_desc, results)\r\nconcat_results = list(concat_results)\r\nconcat_results[0]\r\n<\/code><\/pre>\n<\/div>\n<p>Import the sentence embeddings and associated Amazon S3 image URI into the Amazon ES KNN index with the following code. 
You also load the translated descriptions in full text, so that later you can compare the difference between KNN search and standard match text queries in Elasticsearch.<\/p>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-python\"># Define a function to import the feature vector corresponding to each S3 URI into the Elasticsearch KNN index\r\n# This process takes around 10 minutes.\r\n\r\ndef es_import(concat_result):\r\n    vector = json.loads(predictor.predict(concat_result['descriptions']))\r\n    es.index(index='idx_zalando',\r\n             body={\"zalando_nlu_vector\": vector,\r\n                   \"image\": concat_result['filename'],\r\n                   \"description\": concat_result['descriptions']}\r\n            )\r\n        \r\nworkers = 8 * cpu_count()\r\n    \r\nprocess_map(es_import, concat_results, max_workers=workers)\r\n<\/code><\/pre>\n<\/div>\n<h2>Building a full-stack KNN search application<\/h2>\n<p>Now that you have a working Amazon SageMaker endpoint for extracting text features and a KNN index on Amazon ES, you\u2019re ready to build a real-world, full-stack ML-powered web app. You use an AWS SAM template to deploy a serverless REST API with API Gateway and Lambda. The REST API accepts new search strings, generates the embeddings, and returns similar relevant items to the client. Then you upload a front-end website that interacts with your new REST API to Amazon S3. 
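<\/p>
<p>One piece of that Lambda backend is worth sketching here: converting the S3 URIs stored in the KNN index into browser-loadable links. The helper names below are illustrative, not the exact code in the repo:<\/p>

```python
def s3_uri_to_bucket_key(uri):
    """Split an 's3://bucket/key' URI into (bucket, key) for pre-signing."""
    bucket, _, key = uri[len("s3://"):].partition("/")
    return bucket, key

def presign(uri, expires=300):
    """Pre-sign an S3 URI from a KNN hit so the front end can display the image."""
    import boto3  # deferred import; needs AWS credentials at runtime
    bucket, key = s3_uri_to_bucket_key(uri)
    return boto3.client("s3").generate_presigned_url(
        "get_object", Params={"Bucket": bucket, "Key": key}, ExpiresIn=expires)

# presign("s3://my-bucket/data/feidegger/fashion/0001.jpg") returns a time-limited HTTPS URL
```

<p>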
The front-end code uses Amplify to integrate with your REST API.<\/p>\n<ol>\n<li>In the following cell, prepopulate a CloudFormation template that creates necessary resources such as Lambda and API Gateway for the full-stack application:<\/li>\n<\/ol>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-python\">s3_resource.Object(bucket, 'backend\/template.yaml').upload_file('.\/backend\/template.yaml', ExtraArgs={'ACL':'public-read'})\r\n\r\n\r\nsam_template_url = f'https:\/\/{bucket}.s3.amazonaws.com\/backend\/template.yaml'\r\n\r\n# Generate the CloudFormation Quick Create Link\r\n\r\nprint(\"Click the URL below to create the backend API for NLU search:\\n\")\r\nprint((\r\n    'https:\/\/console.aws.amazon.com\/cloudformation\/home?region=us-east-1#\/stacks\/create\/review'\r\n    f'?templateURL={sam_template_url}'\r\n    '&amp;stackName=nlu-search-api'\r\n    f'&amp;param_BucketName={outputs[\"s3BucketTraining\"]}'\r\n    f'&amp;param_DomainName={outputs[\"esDomainName\"]}'\r\n    f'&amp;param_ElasticSearchURL={outputs[\"esHostName\"]}'\r\n    f'&amp;param_SagemakerEndpoint={predictor.endpoint}'\r\n))\r\n<\/code><\/pre>\n<\/div>\n<p>The following screenshot shows the output: a pre-generated CloudFormation template link.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-16855\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/10\/07\/6-1-1.jpg\" alt=\"\" width=\"900\" height=\"119\"><\/p>\n<ol start=\"2\">\n<li>Choose the link.<\/li>\n<\/ol>\n<p>You are sent to the <strong>Quick create stack<\/strong> page.<\/p>\n<ol start=\"3\">\n<li>Select the check boxes to acknowledge the creation of IAM resources, IAM resources with custom names, and CAPABILITY_AUTO_EXPAND.<\/li>\n<li>Choose <strong>Create stack<\/strong>.<\/li>\n<\/ol>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-16856\" 
src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/10\/07\/7-.jpg\" alt=\"\" width=\"900\" height=\"1139\"><\/p>\n<p>When the stack creation is complete, you see the status CREATE_COMPLETE. You can look on the <strong>Resources<\/strong> tab to see all the resources the CloudFormation template created.<\/p>\n<ol start=\"5\">\n<li>After the stack is created, proceed through the cells.<\/li>\n<\/ol>\n<p>The following cell indicates that your full-stack application, including front-end and backend code, is successfully deployed:<\/p>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-python\">print('Click the URL below:\\n')\r\nprint(outputs['S3BucketSecureURL'] + '\/index.html')\r\n<\/code><\/pre>\n<\/div>\n<p>The following screenshot shows the URL output.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-16857\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/10\/07\/8-.jpg\" alt=\"\" width=\"900\" height=\"137\"><\/p>\n<ol start=\"6\">\n<li>Choose the link.<\/li>\n<\/ol>\n<p>You are sent to the application page, where you can provide your own search text to find products using both the KNN approach and the regular full-text search approach.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-16966\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/10\/09\/blog.gif\" alt=\"\" width=\"1912\" height=\"2100\"><\/p>\n<ol start=\"7\">\n<li>When you\u2019re done testing and experimenting with your KNN search application, run the last two cells at the bottom of the notebook:<\/li>\n<\/ol>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-python\"># Delete the endpoint\r\npredictor.delete_endpoint()\r\n\r\n# Empty S3 Contents\r\ntraining_bucket_resource = 
s3_resource.Bucket(bucket)\r\ntraining_bucket_resource.objects.all().delete()\r\n\r\nhosting_bucket_resource = s3_resource.Bucket(outputs['s3BucketHostingBucketName'])\r\nhosting_bucket_resource.objects.all().delete()\r\n<\/code><\/pre>\n<\/div>\n<p>These cells delete your Amazon SageMaker endpoint and empty your S3 buckets in preparation for cleaning up your resources.<\/p>\n<h2>Cleaning up<\/h2>\n<p>To delete the rest of your AWS resources, go to the AWS CloudFormation console and delete the nlu-search-api and nlu-search stacks.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-16859\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/10\/07\/10-.jpg\" alt=\"\" width=\"900\" height=\"101\"><\/p>\n<h2>Conclusion<\/h2>\n<p>In this post, we showed you how to create a KNN-based search application using Amazon SageMaker and Amazon ES KNN index features. You used a pre-trained BERT model from the sentence-transformers Python library. You can also fine-tune your BERT model using your own dataset. For more information, see <a href=\"https:\/\/aws.amazon.com\/blogs\/machine-learning\/fine-tuning-a-pytorch-bert-model-and-deploying-it-with-amazon-elastic-inference-on-amazon-sagemaker\/\" target=\"_blank\" rel=\"noopener noreferrer\">Fine-tuning a PyTorch BERT model and deploying it with Amazon Elastic Inference on Amazon SageMaker<\/a>.<\/p>\n<p>A GPU instance is recommended for most deep learning purposes. In many cases, training new models is faster on GPU instances than CPU instances. You can scale sub-linearly when you have multi-GPU instances or if you use distributed training across many instances with GPUs. 
However, we used CPU instances for this use case so you can complete the walkthrough under the <a href=\"https:\/\/aws.amazon.com\/free\/\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Free Tier<\/a>.<\/p>\n<p>For more information about the code sample in the post, see the <a href=\"https:\/\/github.com\/aws-samples\/amazon-sagemaker-nlu-search\" target=\"_blank\" rel=\"noopener noreferrer\">GitHub repo<\/a>.<\/p>\n<hr>\n<h3>About the Authors<\/h3>\n<p><strong><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-16860 alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/10\/07\/Amit.jpg\" alt=\"\" width=\"100\" height=\"150\">Amit Mukherjee<\/strong> is a Sr. Partner Solutions Architect with a focus on data analytics and AI\/ML. He works with AWS partners and customers to provide them with architectural guidance for building highly secure and scalable data analytics platforms and adopting machine learning at a large scale.<\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<p><strong><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-16861 alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/10\/07\/Laith.jpg\" alt=\"\" width=\"100\" height=\"134\">Laith Al-Saadoon<\/strong> is a Principal Solutions Architect with a focus on data analytics at AWS. He spends his days obsessing over designing customer architectures to process enormous amounts of data at scale. 
In his free time, he follows the latest in machine learning and artificial intelligence.<\/p>\n<\/p><\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/aws.amazon.com\/blogs\/machine-learning\/building-an-nlu-powered-search-application-with-amazon-sagemaker-and-the-amazon-es-knn-feature\/<\/p>\n","protected":false},"author":0,"featured_media":451,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/450"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=450"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/450\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/451"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=450"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=450"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=450"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}