{"id":282,"date":"2020-09-24T22:25:15","date_gmt":"2020-09-24T22:25:15","guid":{"rendered":"https:\/\/machine-learning.webcloning.com\/2020\/09\/24\/streamline-modeling-with-amazon-sagemaker-studio-and-the-amazon-experiments-sdk\/"},"modified":"2020-09-24T22:25:15","modified_gmt":"2020-09-24T22:25:15","slug":"streamline-modeling-with-amazon-sagemaker-studio-and-the-amazon-experiments-sdk","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2020\/09\/24\/streamline-modeling-with-amazon-sagemaker-studio-and-the-amazon-experiments-sdk\/","title":{"rendered":"Streamline modeling with Amazon SageMaker Studio and the Amazon Experiments SDK"},"content":{"rendered":"<div id=\"\">\n<p>The modeling phase is a highly iterative process in machine learning (ML) projects, where data scientists experiment with various data preprocessing and feature engineering strategies, intertwined with different model architectures, which are then trained with disparate sets of hyperparameter values. This highly iterative process with many moving parts can, over time, manifest into a tremendous headache in terms of keeping track of the design decisions applied in each iteration and how the training and evaluation metrics of each iteration compare to the previous versions of the model.<\/p>\n<p>While your head may be spinning by now, fear not! <a href=\"https:\/\/aws.amazon.com\/sagemaker\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon SageMaker<\/a> has a solution!<\/p>\n<p>This post walks you through an end-to-end example of using <a href=\"https:\/\/docs.aws.amazon.com\/sagemaker\/latest\/dg\/gs-studio.html\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon SageMaker Studio<\/a> and the <a href=\"https:\/\/sagemaker-experiments.readthedocs.io\/en\/latest\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon SageMaker Experiments SDK<\/a> to organize, track, visualize, and compare our iterative experimentation with a Keras model. 
Although this use case is specific to the Keras framework, you can extend the same approach to other deep learning frameworks and ML algorithms.<\/p>\n<p>Amazon SageMaker is a fully managed service, created with the goal of democratizing ML by empowering developers and data scientists to quickly and cost-effectively build, train, deploy, and monitor ML models.<\/p>\n<h2>What is Amazon SageMaker Experiments?<\/h2>\n<p>Amazon SageMaker Experiments is a capability of Amazon SageMaker that lets you effortlessly organize, track, compare, and evaluate your ML experiments. Before we dive into the hands-on exercise, let\u2019s first take a step back and review the building blocks of an experiment and their referential relationships. The following diagram illustrates these building blocks.<\/p>\n<div id=\"attachment_16097\" class=\"wp-caption alignnone\">\n<p><img decoding=\"async\" loading=\"lazy\" aria-describedby=\"caption-attachment-16097\" class=\"wp-image-16097 size-full\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/09\/18\/1-Flowchart-1.jpg\" alt=\"\" width=\"900\" height=\"426\"><\/p>\n<p id=\"caption-attachment-16097\" class=\"wp-caption-text\">Figure 1. The building blocks of Amazon SageMaker Experiments<\/p>\n<\/div>\n<p>Amazon SageMaker Experiments is composed of the following components:<\/p>\n<ul>\n<li>\n<strong>Experiment \u2013 <\/strong>An ML problem that we want to solve. Each experiment consists of a collection of trials.<\/li>\n<li>\n<strong>Trial<\/strong> <strong>\u2013<\/strong> An iteration of a data science workflow related to an experiment. Each trial consists of several trial components.<\/li>\n<li>\n<strong>Trial component \u2013<\/strong> A stage in a given trial. For instance, as we see in our example, we create one trial component for the data preprocessing stage and one trial component for model training. 
In a similar fashion, we can also add a trial component for any data postprocessing.<\/li>\n<li>\n<strong>Tracker \u2013<\/strong> A mechanism that records various metadata about a particular trial component, including any parameters, inputs, outputs, artifacts, and metrics. A tracker can be linked to a particular trial component to assign the collected metadata to it.<\/li>\n<\/ul>\n<p>Now that we\u2019ve set a rock-solid foundation on the key building blocks of the Amazon SageMaker Experiments SDK, let\u2019s dive into the fun hands-on portion.<\/p>\n<h2>Prerequisites<\/h2>\n<p>You should have an <a href=\"https:\/\/signin.aws.amazon.com\/signin?redirect_uri=https%3A%2F%2Fportal.aws.amazon.com%2Fbilling%2Fsignup%2Fresume&amp;client_id=signup\" target=\"_blank\" rel=\"noopener noreferrer\">AWS account<\/a> and a sufficient level of access to create resources in Amazon SageMaker and Amazon S3.<\/p>\n<h2>Solution overview<\/h2>\n<p>As part of this post, we walk through the following high-level steps:<\/p>\n<ol>\n<li>Environment setup<\/li>\n<li>Data preprocessing and feature engineering<\/li>\n<li>Modeling with Amazon SageMaker Experiments<\/li>\n<li>Training and evaluation metric exploration<\/li>\n<li>Environment cleanup<\/li>\n<\/ol>\n<h2>Setting up the environment<\/h2>\n<p>We can set up our environment in a few simple steps:<\/p>\n<ol>\n<li>Clone the source code from the <a href=\"https:\/\/github.com\/aws-samples\/modeling-with-amazon-sagemaker-experiments\" target=\"_blank\" rel=\"noopener noreferrer\">GitHub repo<\/a>, which contains the complete demo, into your Amazon SageMaker Studio environment.<\/li>\n<li>Open the included Jupyter notebook and choose the Python 3 (TensorFlow 2 CPU Optimized) kernel.<\/li>\n<li>When the kernel is ready, install the sagemaker-experiments package, which enables us to work with the Amazon SageMaker Experiments SDK, and the s3fs package, which enables our pandas dataframes to integrate easily with objects in Amazon S3.<\/li>\n<li>Import all 
required packages and initialize the variables.<\/li>\n<\/ol>\n<p>The following screenshot shows the environment setup.<\/p>\n<div id=\"attachment_15841\" class=\"wp-caption alignnone\">\n<p><img decoding=\"async\" loading=\"lazy\" aria-describedby=\"caption-attachment-15841\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/09\/11\/2-Environment-setup.jpg\" alt=\"\" width=\"900\" height=\"379\"><\/p>\n<p id=\"caption-attachment-15841\" class=\"wp-caption-text\">Figure 2. Environment Setup<\/p>\n<\/div>\n<h2>Data preprocessing and feature engineering<\/h2>\n<p>Excellent! Now, let\u2019s dive into data preprocessing and feature engineering. In our use case, we use the <a href=\"https:\/\/archive.ics.uci.edu\/ml\/machine-learning-databases\/abalone\/\" target=\"_blank\" rel=\"noopener noreferrer\">abalone dataset<\/a> from the UCI Machine Learning Repository.<\/p>\n<p>Run the steps in the provided Jupyter notebook to complete all data preprocessing and feature engineering. After your data is preprocessed, it\u2019s time for us to seamlessly capture our preprocessing strategy! 
Let\u2019s create an experiment with the following code:<\/p>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-python\">sm = boto3.client('sagemaker') \r\nts = datetime.now().strftime('%Y-%m-%d-%H-%M-%S-%f')\r\n\r\nabalone_experiment = Experiment.create(\r\n    experiment_name = 'predict-abalone-age-' + ts,\r\n    description = 'Predicting the age of an abalone based on a set of features describing it',\r\n    sagemaker_boto_client=sm)\r\n<\/code><\/pre>\n<\/div>\n<p>Now, we can create a Tracker to describe the Pre-processing Trial Component, including the location of the artifacts:<\/p>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-python\">with Tracker.create(display_name='Pre-processing', sagemaker_boto_client=sm, artifact_bucket=sm_bucket, artifact_prefix=artifacts_path) as tracker:\r\n    tracker.log_parameters({\r\n        'train_test_split': 0.8\r\n    })\r\n    tracker.log_input(name='raw data', media_type='s3\/uri', value=source_url)\r\n    tracker.log_output(name='preprocessed data', media_type='s3\/uri', value=processed_data_path)\r\n    tracker.log_artifact(name='preprocessors', media_type='s3\/uri', file_path='preprocessors.pickle')\r\n    \r\nprocessing_component = tracker.trial_component\r\n<\/code><\/pre>\n<\/div>\n<p>Fantastic! We now have our experiment ready and we\u2019ve already done our due diligence to capture our data preprocessing strategy. Next, let\u2019s dive into the modeling phase.<\/p>\n<h2>Modeling with Amazon SageMaker Experiments<\/h2>\n<p>Our Keras model has two fully connected hidden layers with a variable number of neurons and variable activation functions. 
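<\/p>\n<p>Within the training script itself, these sizes and activations arrive as command-line arguments and are parsed into the <code>args<\/code> namespace that the model function consumes. The following is a minimal sketch of that parsing; the default values shown here are illustrative assumptions, not the values used in the notebook:<\/p>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-python\">import argparse\r\n\r\n# Sketch: parse the architecture hyperparameters passed to the training job.\r\n# Default values are illustrative only.\r\nparser = argparse.ArgumentParser()\r\nparser.add_argument('--l1_size', type=int, default=64)\r\nparser.add_argument('--l1_activation', type=str, default='relu')\r\nparser.add_argument('--l2_size', type=int, default=32)\r\nparser.add_argument('--l2_activation', type=str, default='relu')\r\nparser.add_argument('--learning_rate', type=float, default=0.001)\r\nparser.add_argument('--batch_size', type=int, default=32)\r\nparser.add_argument('--epochs', type=int, default=50)\r\nargs, _ = parser.parse_known_args()\r\n<\/code><\/pre>\n<\/div>\n<p>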
This flexibility enables us to pass these values as arguments to a training job and quickly parallelize our experimentation with several model architectures.<\/p>\n<p>The model uses <a href=\"https:\/\/keras.io\/api\/losses\/regression_losses\/#meansquaredlogarithmicerror-class\" target=\"_blank\" rel=\"noopener noreferrer\">mean squared logarithmic error<\/a> as the loss function and the <a href=\"https:\/\/keras.io\/api\/optimizers\/adam\/\" target=\"_blank\" rel=\"noopener noreferrer\">Adam optimization algorithm<\/a> as the optimizer. It also tracks mean squared logarithmic error as our metric, which automatically propagates into the training trial component of our experiment, as we see shortly:<\/p>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-python\">def model(x_train, y_train, x_test, y_test, args):\r\n    \"\"\"Generate a simple model\"\"\"\r\n    model = Sequential([\r\n                Dense(args.l1_size, activation=args.l1_activation, kernel_initializer='normal'),\r\n                Dense(args.l2_size, activation=args.l2_activation, kernel_initializer='normal'),\r\n                Dense(1, activation='linear')\r\n    ])\r\n\r\n    model.compile(optimizer=Adam(learning_rate=args.learning_rate),\r\n                  loss='mean_squared_logarithmic_error',\r\n                  metrics=['mean_squared_logarithmic_error'])\r\n    model.fit(x_train, y_train, batch_size=args.batch_size, epochs=args.epochs, verbose=1)\r\n    model.evaluate(x_test, y_test, verbose=1)\r\n\r\n    return model\r\n<\/code><\/pre>\n<\/div>\n<p>Great! Follow the steps in the provided notebook to define the hyperparameters for experimentation and instantiate the TensorFlow estimator. 
Finally, let\u2019s start our training jobs and supply the names of our experiment and trial via the <code>experiment_config<\/code> dictionary:<\/p>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-python\">abalone_estimator.fit(processed_data_path,\r\n                      job_name=job_name,\r\n                      wait=False,\r\n                      experiment_config={\r\n                          'ExperimentName': abalone_experiment.experiment_name,\r\n                          'TrialName': abalone_trial.trial_name,\r\n                          'TrialComponentDisplayName': 'Training',\r\n                      })\r\n<\/code><\/pre>\n<\/div>\n<h2>Exploring the training and evaluation metrics<\/h2>\n<p>Upon completion of the training jobs, we can quickly visualize how different variations of the model compare in terms of the metrics collected during model training. For instance, let\u2019s see how the loss has been decreasing by epoch for each variation of the model and observe which model architecture is most effective in decreasing the loss:<\/p>\n<ol>\n<li>Choose the <code>Amazon SageMaker Experiments List<\/code> icon on the left sidebar.<\/li>\n<li>Choose your experiment to open it and press <strong>Shift<\/strong> to select all four trials.<\/li>\n<li>Right-click any of the highlighted trials and choose <code>Open in trial component list<\/code>.<\/li>\n<li>Press <strong>Shift<\/strong> to select the four trial components representing the training jobs and choose <code>Add chart<\/code>.<\/li>\n<li>Choose <code>New chart<\/code> and customize it to plot the collected metrics that you want to analyze. 
For our use case, choose the following:\n<ol>\n<li>For <strong>Data type<\/strong>, choose <code>Time series<\/code>.<\/li>\n<li>For <strong>Chart type<\/strong>, choose <code>Line<\/code>.<\/li>\n<li>For <strong>X-axis dimension<\/strong>, choose <code>epoch<\/code>.<\/li>\n<li>For <strong>Y-axis<\/strong>, choose <code>loss_TRAIN_last<\/code>.<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<div id=\"attachment_15842\" class=\"wp-caption alignnone\">\n<p><img decoding=\"async\" loading=\"lazy\" aria-describedby=\"caption-attachment-15842\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/09\/11\/3-Generating-plots.jpg\" alt=\"\" width=\"900\" height=\"395\"><\/p>\n<p id=\"caption-attachment-15842\" class=\"wp-caption-text\">Figure 3. Generating plots based on the collected model training metrics<\/p>\n<\/div>\n<p>Wow! How quick and effortless was that?! I encourage you to further explore plotting various other metrics on your own. For instance, you can choose the <code>Summary<\/code> data type to generate a scatter plot and explore whether there is a relationship between the size of the first hidden layer in your neural network and the mean squared logarithmic error. See the following screenshot.<\/p>\n<div id=\"attachment_15843\" class=\"wp-caption alignnone\">\n<p><img decoding=\"async\" loading=\"lazy\" aria-describedby=\"caption-attachment-15843\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/09\/11\/4-Plot.jpg\" alt=\"\" width=\"900\" height=\"395\"><\/p>\n<p id=\"caption-attachment-15843\" class=\"wp-caption-text\">Figure 4. Plot of the relationship between the size of the first hidden layer in the neural network and Mean-Squared Logarithmic Error during model evaluation<\/p>\n<\/div>\n<p>Next, let\u2019s choose our best-performing trial (<code>abalone-trial-0<\/code>). As expected, we see two trial components. 
One represents our data <code>Pre-processing<\/code>, and the other reflects our model <code>Training<\/code>. When we open the <code>Training<\/code> trial component, we see that it contains all the hyperparameters, input data location, Amazon S3 location of this particular version of the model, and more.<\/p>\n<div id=\"attachment_15844\" class=\"wp-caption alignnone\">\n<p><img decoding=\"async\" loading=\"lazy\" aria-describedby=\"caption-attachment-15844\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/09\/11\/5-Trial-stages.jpg\" alt=\"\" width=\"900\" height=\"689\"><\/p>\n<p id=\"caption-attachment-15844\" class=\"wp-caption-text\">Figure 5. Metadata about model training, automatically collected by Amazon SageMaker Experiments<\/p>\n<\/div>\n<p>Similarly, when we open the\u00a0<code>Pre-processing<\/code> component, we see that it captures where the source data came from, where the processed data was stored in Amazon S3, and where we can easily find our trained encoder and scalers, which we\u2019ve packaged into the <code>preprocessors.pickle<\/code>\u00a0artifact.<\/p>\n<div id=\"attachment_15845\" class=\"wp-caption alignnone\">\n<p><img decoding=\"async\" loading=\"lazy\" aria-describedby=\"caption-attachment-15845\" class=\"wp-image-15845 size-full\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/09\/11\/6-Trial-stages2.jpg\" alt=\"\" width=\"900\" height=\"407\"><\/p>\n<p id=\"caption-attachment-15845\" class=\"wp-caption-text\">Figure 6. Metadata about data pre-processing and feature engineering, automatically collected by Amazon SageMaker Experiments<\/p>\n<\/div>\n<h2>Cleaning up<\/h2>\n<p>What a fun exploration this has been! 
Let\u2019s now clean up after ourselves by running the cleanup function provided at the end of the notebook to hierarchically delete all elements of the experiment that we created in this post:<\/p>\n<div class=\"hide-language\">\n<pre class=\"unlimited-height-code\"><code class=\"lang-python\">abalone_experiment.delete_all('--force')<\/code><\/pre>\n<\/div>\n<h2>Conclusion<\/h2>\n<p>You have now learned to seamlessly track the design decisions that you made during data preprocessing and model training, as well as rapidly compare and analyze the performance of various iterations of your model by using the tracked metrics of the trials in your experiment.<\/p>\n<p>I hope that you enjoyed diving into the intricacies of the Amazon SageMaker Experiments SDK and exploring how Amazon SageMaker Studio smoothly integrates with it, enabling you to lose yourself in experimentation with your ML model without losing track of the hard work you\u2019ve done! I highly encourage you to leverage the Amazon SageMaker Experiments Python SDK in your next ML engagement and I invite you to consider contributing to the further evolution of this open-source project.<\/p>\n<hr>\n<h3>About the Author<\/h3>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-15846 alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/09\/11\/IvanKopas.jpg\" alt=\"\" width=\"100\" height=\"134\">Ivan Kopas is a Machine Learning Engineer for AWS Professional Services, based out of the United States. Ivan is passionate about working closely with AWS customers from a variety of industries and helping them leverage AWS services to spearhead their toughest AI\/ML challenges. 
In his spare time, he enjoys spending time with his family, working out, hanging out with friends and diving deep into the fascinating realms of economics, psychology and philosophy.<\/p>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<\/p><\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/aws.amazon.com\/blogs\/machine-learning\/streamline-modeling-with-amazon-sagemaker-studio-and-amazon-experiments-sdk\/<\/p>\n","protected":false},"author":0,"featured_media":283,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/282"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=282"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/282\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/283"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=282"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=282"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=282"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}