{"id":718,"date":"2020-12-22T01:39:03","date_gmt":"2020-12-22T01:39:03","guid":{"rendered":"https:\/\/machine-learning.webcloning.com\/2020\/12\/22\/controlling-and-auditing-data-exploration-activities-with-amazon-sagemaker-studio-and-aws-lake-formation\/"},"modified":"2020-12-22T01:39:03","modified_gmt":"2020-12-22T01:39:03","slug":"controlling-and-auditing-data-exploration-activities-with-amazon-sagemaker-studio-and-aws-lake-formation","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2020\/12\/22\/controlling-and-auditing-data-exploration-activities-with-amazon-sagemaker-studio-and-aws-lake-formation\/","title":{"rendered":"Controlling and auditing data exploration activities with Amazon SageMaker Studio and AWS Lake Formation"},"content":{"rendered":"<div id=\"\">\n<p>Highly-regulated industries, such as financial services, are often required to audit all access to their data. This includes auditing exploratory activities performed by data scientists, who usually query data from within machine learning (ML) notebooks.<\/p>\n<p>This post walks you through the steps to implement access control and auditing capabilities on a per-user basis, using <a href=\"https:\/\/docs.aws.amazon.com\/sagemaker\/latest\/dg\/studio.html\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon SageMaker Studio<\/a> notebooks and <a href=\"https:\/\/aws.amazon.com\/lake-formation\/\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Lake Formation<\/a> access control policies. 
This is a how-to guide based on the <a href=\"https:\/\/d1.awsstatic.com\/whitepapers\/architecture\/wellarchitected-Machine-Learning-Lens.pdf\" target=\"_blank\" rel=\"noopener noreferrer\">Machine Learning Lens for the AWS Well-Architected Framework<\/a>, following the design principles described in the Security Pillar:<\/p>\n<ul>\n<li>Restrict access to ML systems<\/li>\n<li>Ensure data governance<\/li>\n<li>Enforce data lineage<\/li>\n<li>Enforce regulatory compliance<\/li>\n<\/ul>\n<p>Additional ML governance practices for experiments and models using <a href=\"https:\/\/aws.amazon.com\/sagemaker\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon SageMaker<\/a> are described in the whitepaper <a href=\"https:\/\/d1.awsstatic.com\/whitepapers\/machine-learning-in-financial-services-on-aws.pdf\" target=\"_blank\" rel=\"noopener noreferrer\">Machine Learning Best Practices in Financial Services<\/a>.<\/p>\n<h2>Overview of solution<\/h2>\n<p>This implementation uses <a href=\"http:\/\/aws.amazon.com\/athena\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Athena<\/a> and the <a href=\"https:\/\/pypi.org\/project\/PyAthena\/\" target=\"_blank\" rel=\"noopener noreferrer\">PyAthena<\/a> client on a Studio notebook to query data on a data lake registered with Lake Formation.<\/p>\n<p>SageMaker Studio is the first fully integrated development environment (IDE) for ML. Studio provides a single, web-based visual interface where you can perform all the steps required to build, train, and deploy ML models. 
<a href=\"https:\/\/docs.aws.amazon.com\/sagemaker\/latest\/dg\/notebooks.html\" target=\"_blank\" rel=\"noopener noreferrer\">Studio notebooks<\/a> are collaborative notebooks that you can launch quickly, without setting up compute instances or file storage beforehand.<\/p>\n<p>Athena is an interactive query service that makes it easy to analyze data directly in <a href=\"https:\/\/aws.amazon.com\/s3\/\">Amazon Simple Storage Service<\/a> (Amazon S3) using standard SQL. Athena is serverless, so there is no infrastructure to set up or manage, and you pay only for the queries you run.<\/p>\n<p>Lake Formation is a fully managed service that makes it easier for you to build, secure, and manage data lakes. Lake Formation simplifies and automates many of the complex manual steps that are usually required to create data lakes, including securely making that data available for analytics and ML.<\/p>\n<p>For an existing data lake registered with Lake Formation, the following diagram illustrates the proposed implementation.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-19972\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/12\/16\/ML-1106-1.jpg\" alt=\"For an existing data lake registered with Lake Formation, the following diagram illustrates the proposed implementation.\" width=\"800\" height=\"400\"><\/p>\n<p>The workflow includes the following steps:<\/p>\n<ol>\n<li>Data scientists access the <a href=\"http:\/\/aws.amazon.com\/console\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Management Console<\/a> using their <a href=\"http:\/\/aws.amazon.com\/iam\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Identity and Access Management<\/a> (IAM) user accounts and open Studio using individual user profiles. Each user profile has an associated execution role, which the user assumes while working on a Studio notebook. 
The diagram depicts two data scientists who require different permissions over data in the data lake. For example, in a data lake containing personally identifiable information (PII), user Data Scientist 1 has full access to every table in the Data Catalog, whereas Data Scientist 2 has limited access to a subset of tables (or columns) containing non-PII data.<\/li>\n<li>The Studio notebook is associated with a Python kernel. The PyAthena client allows you to run exploratory ANSI SQL queries on the data lake through Athena, using the execution role assumed by the user while working with Studio.<\/li>\n<li>Athena sends a data access request to Lake Formation, with the user profile execution role as principal. Data permissions in Lake Formation offer database-, table-, and column-level access control, restricting access to metadata and the corresponding data stored in Amazon S3. Lake Formation generates short-term credentials to be used for data access, and informs Athena which columns the principal is allowed to access.<\/li>\n<li>Athena uses the short-term credentials provided by Lake Formation to access the data lake storage in Amazon S3, and retrieves the data matching the SQL query. Before returning the query result, Athena filters out columns that aren\u2019t included in the data permissions reported by Lake Formation.<\/li>\n<li>Athena returns the SQL query result to the Studio notebook.<\/li>\n<li>Lake Formation records data access requests and other activity history for the registered data lake locations. <a href=\"https:\/\/aws.amazon.com\/cloudtrail\/\" target=\"_blank\" rel=\"noopener noreferrer\">AWS CloudTrail<\/a> also records these and other API calls made to AWS during the entire flow, including Athena query requests.<\/li>\n<\/ol>\n<h2>Walkthrough overview<\/h2>\n<p>In this walkthrough, I show you how to implement access control and auditing using a Studio notebook and Lake Formation. 
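<\/p>
<p>Workflow step 2 can be sketched in a few lines of Python. This is a minimal sketch, not the sample notebook used later in this post: it assumes PyAthena is installed in the notebook kernel, and the Athena query-results bucket and Region shown are placeholders you must replace.<\/p>

```python
# Minimal sketch of workflow step 2: exploratory SQL from a Studio notebook
# through Athena with PyAthena. PyAthena picks up the credentials of the
# Studio user profile's execution role, so Lake Formation decides which
# columns (if any) each query may return.

def reviews_query(limit: int = 10) -> str:
    """Build the exploratory SQL string (pure, so it is easy to test)."""
    return (
        "SELECT product_title, star_rating "
        "FROM amazon_reviews_db.amazon_reviews_parquet "
        f"LIMIT {limit}"
    )

def run_exploration(s3_staging_dir: str, region_name: str):
    """Run the query through Athena; both arguments are placeholders."""
    from pyathena import connect  # deferred: requires `pip install pyathena`

    cursor = connect(s3_staging_dir=s3_staging_dir,
                     region_name=region_name).cursor()
    cursor.execute(reviews_query())
    return cursor.fetchall()

# Example call (placeholder bucket and Region):
# rows = run_exploration("s3://<athena-query-results-bucket>/", "us-east-1")
```

<p>A user whose execution role lacks Lake Formation permissions on a referenced column typically receives an Athena permission error instead of data, which is the kind of difference you observe when testing the two user profiles later in this post.<\/p>
<p>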
You perform the following activities:<\/p>\n<ol>\n<li>Register a new database in Lake Formation.<\/li>\n<li>Create the required IAM policies, roles, group, and users.<\/li>\n<li>Grant data permissions with Lake Formation.<\/li>\n<li>Set up Studio.<\/li>\n<li>Test Lake Formation access control policies using a Studio notebook.<\/li>\n<li>Audit data access activity with Lake Formation and CloudTrail.<\/li>\n<\/ol>\n<p>If you prefer to skip the initial setup activities and jump directly to testing and auditing, you can deploy the following <a href=\"http:\/\/aws.amazon.com\/cloudformation\" target=\"_blank\" rel=\"noopener noreferrer\">AWS CloudFormation<\/a> template in a Region that supports <a href=\"https:\/\/aws.amazon.com\/sagemaker\/pricing\/#Amazon_SageMaker_Pricing_Calculator\" target=\"_blank\" rel=\"noopener noreferrer\">Studio<\/a> and <a href=\"https:\/\/docs.aws.amazon.com\/general\/latest\/gr\/lake-formation.html#lake-formation_region\" target=\"_blank\" rel=\"noopener noreferrer\">Lake Formation<\/a>:<\/p>\n<p><a href=\"https:\/\/console.aws.amazon.com\/cloudformation\/home#\/stacks\/create\/review?templateURL=https:\/\/aws-ml-blog.s3.amazonaws.com\/artifacts\/sagemaker-studio-audit-control\/SageMakerStudioAuditControlStack.yaml&amp;stackName=SageMakerStudioAuditControl\" target=\"_blank\" rel=\"noopener noreferrer\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-15948\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/09\/16\/2-LaunchStack.jpg\" alt=\"\" width=\"107\" height=\"20\"><\/a><\/p>\n<p>You can also deploy the template by <a href=\"https:\/\/aws-ml-blog.s3.amazonaws.com\/artifacts\/sagemaker-studio-audit-control\/SageMakerStudioAuditControlStack.yaml\" target=\"_blank\" rel=\"noopener noreferrer\">downloading the CloudFormation template<\/a>. 
When deploying the CloudFormation template, you provide the following parameters:<\/p>\n<ul>\n<li>User name and password for a data scientist with full access to the dataset. The default user name is <code>data-scientist-full<\/code>.<\/li>\n<li>User name and password for a data scientist with limited access to the dataset. The default user name is <code>data-scientist-limited<\/code>.<\/li>\n<li>Names for the database and table to be created for the dataset. The default names are <code>amazon_reviews_db<\/code> and <code>amazon_reviews_parquet<\/code>, respectively.<\/li>\n<li>VPC and subnets that are used by Studio to communicate with the <a href=\"https:\/\/aws.amazon.com\/efs\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Elastic File System<\/a> (Amazon EFS) volume associated to Studio.<\/li>\n<\/ul>\n<p>If you decide to deploy the CloudFormation template, after the CloudFormation stack is complete, you can go directly to the section <strong>Testing Lake Formation access control policies<\/strong> in this post.<\/p>\n<h2>Prerequisites<\/h2>\n<p>For this walkthrough, you should have the following prerequisites:<\/p>\n<h2>Registering a new database in Lake Formation<\/h2>\n<p>For this post, I use the <a href=\"https:\/\/s3.amazonaws.com\/amazon-reviews-pds\/readme.html\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Customer Reviews Dataset<\/a> to demonstrate how to provide granular access to the data lake for different data scientists. 
If you already have a dataset registered with Lake Formation that you want to use, you can skip this section and go to <strong>Creating required IAM roles and users for data scientists<\/strong>.<\/p>\n<p>To register the Amazon Customer Reviews Dataset in Lake Formation, complete the following steps:<\/p>\n<ol>\n<li>Sign in to the console with the IAM user configured as Lake Formation Admin.<\/li>\n<li>On the Lake Formation console, in the navigation pane, under <strong>Data catalog<\/strong>, choose <strong>Databases<\/strong>.<\/li>\n<li>Choose <strong>Create Database<\/strong>.<\/li>\n<li>In <strong>Database details<\/strong>, select <strong>Database<\/strong> to create the database in your own account.<\/li>\n<li>For <strong>Name<\/strong>, enter a name for the database, such as <code>amazon_reviews_db<\/code>.<\/li>\n<li>For <strong>Location<\/strong>, enter <code>s3:\/\/amazon-reviews-pds<\/code>.<\/li>\n<li>Under <strong>Default permissions for newly created tables<\/strong>, make sure to clear the option <strong>Use only IAM access control for new tables in this database<\/strong>.<\/li>\n<\/ol>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-19973\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/12\/16\/ML-1106-2.jpg\" alt=\"Under Default permissions for newly created tables, make sure to clear the option Use only IAM access control for new tables in this database.\" width=\"800\" height=\"607\"><\/p>\n<ol start=\"8\">\n<li>Choose <strong>Create database<\/strong>.<\/li>\n<\/ol>\n<p>The Amazon Customer Reviews Dataset is currently available in TSV and Parquet formats. The Parquet dataset is partitioned on Amazon S3 by <code>product_category<\/code>. 
To create a table in the data lake for the Parquet dataset, you can use an <a href=\"https:\/\/aws.amazon.com\/glue\" target=\"_blank\" rel=\"noopener noreferrer\">AWS Glue<\/a> crawler or manually create the table using Athena, as described in Amazon Customer Reviews Dataset <a href=\"https:\/\/s3.amazonaws.com\/amazon-reviews-pds\/readme.html\" target=\"_blank\" rel=\"noopener noreferrer\">README file<\/a>.<\/p>\n<ol start=\"9\">\n<li>On the Athena console, create the table.<\/li>\n<\/ol>\n<p>If you haven\u2019t specified a query result location before, follow the instructions in <a href=\"https:\/\/docs.aws.amazon.com\/athena\/latest\/ug\/querying.html#query-results-specify-location\" target=\"_blank\" rel=\"noopener noreferrer\">Specifying a Query Result Location<\/a>.<\/p>\n<ol start=\"10\">\n<li>Choose the data source <code>AwsDataCatalog<\/code>.<\/li>\n<li>Choose the database created in the previous step.<\/li>\n<li>In the Query Editor, enter the following query:\n<div class=\"hide-language\">\n<pre><code class=\"lang-python\">CREATE EXTERNAL TABLE amazon_reviews_parquet(\r\n  marketplace string, \r\n  customer_id string, \r\n  review_id string, \r\n  product_id string, \r\n  product_parent string, \r\n  product_title string, \r\n  star_rating int, \r\n  helpful_votes int, \r\n  total_votes int, \r\n  vine string, \r\n  verified_purchase string, \r\n  review_headline string, \r\n  review_body string, \r\n  review_date bigint, \r\n  year int)\r\nPARTITIONED BY (product_category string)\r\nROW FORMAT SERDE \r\n  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' \r\nSTORED AS INPUTFORMAT \r\n  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' \r\nOUTPUTFORMAT \r\n  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'\r\nLOCATION\r\n  's3:\/\/amazon-reviews-pds\/parquet\/'<\/code><\/pre>\n<\/div>\n<\/li>\n<\/ol>\n<ol start=\"13\">\n<li>Choose <strong>Run query<\/strong>.<\/li>\n<\/ol>\n<p>You should receive a Query 
successful response when the table is created.<\/p>\n<ol start=\"14\">\n<li>Enter the following query to load the table partitions:\n<div class=\"hide-language\">\n<pre><code class=\"lang-python\">MSCK REPAIR TABLE amazon_reviews_parquet<\/code><\/pre>\n<\/div>\n<\/li>\n<\/ol>\n<ol start=\"15\">\n<li>Choose <strong>Run query<\/strong>.<\/li>\n<li>On the Lake Formation console, in the navigation pane, under <strong>Data catalog<\/strong>, choose <strong>Tables<\/strong>.<\/li>\n<li>For <strong>Table name<\/strong>, enter the name of the table you created earlier, such as <code>amazon_reviews_parquet<\/code>.<\/li>\n<li>Verify that you can see the table details.<\/li>\n<\/ol>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-19974\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/12\/16\/ML-1106-3.jpg\" alt=\"18. Verify that you can see the table details. \" width=\"800\" height=\"343\"><\/p>\n<ol start=\"19\">\n<li>Scroll down to see the table schema and partitions.<\/li>\n<\/ol>\n<p>Finally, you register the database location with Lake Formation so the service can start enforcing data permissions on the database.<\/p>\n<ol start=\"20\">\n<li>On the Lake Formation console, in the navigation pane, under <strong>Register and ingest<\/strong>, choose <strong>Data lake locations<\/strong>.<\/li>\n<li>On the <strong>Data lake locations<\/strong> page, choose <strong>Register location<\/strong>.<\/li>\n<li>For <strong>Amazon S3 path<\/strong>, enter <code>s3:\/\/amazon-reviews-pds\/<\/code>.<\/li>\n<li>For <strong>IAM role<\/strong>, you can keep the default role.<\/li>\n<li>Choose <strong>Register location<\/strong>.<\/li>\n<\/ol>\n<h2>Creating required IAM roles and users for data scientists<\/h2>\n<p>To demonstrate how you can provide differentiated access to the dataset registered in the previous step, you first need to create IAM policies, roles, a group, and users. 
The following diagram illustrates the resources you configure in this section.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-19975\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/12\/16\/ML-1106-4.jpg\" alt=\"The following diagram illustrates the resources you configure in this section.\" width=\"800\" height=\"400\"><\/p>\n<p>In this section, you complete the following high-level steps:<\/p>\n<ol>\n<li>Create an IAM group named <code>DataScientists<\/code> containing two users: <code>data-scientist-full<\/code> and <code>data-scientist-limited<\/code>, to control their access to the console and to Studio.<\/li>\n<li>Create a managed policy named <code>DataScientistGroupPolicy<\/code> and assign it to the group.<\/li>\n<\/ol>\n<p>The policy allows users in the group to access Studio, but only using a SageMaker user profile that matches their IAM user name. It also denies the use of SageMaker notebook instances, allowing Studio notebooks only.<\/p>\n<ol start=\"3\">\n<li>For each IAM user, create individual IAM roles, which are used as user profile execution roles in Studio later.<\/li>\n<\/ol>\n<p>The naming convention for these roles consists of a common prefix followed by the corresponding IAM user name. This allows you to audit activities on Studio notebooks\u2014which are logged using Studio\u2019s execution roles\u2014and trace them back to the individual IAM users who performed the activities. For this post, I use the prefix <code>SageMakerStudioExecutionRole_<\/code>.<\/p>\n<ol start=\"4\">\n<li>Create a managed policy named <code>SageMakerUserProfileExecutionPolicy<\/code> and assign it to each of the IAM roles.<\/li>\n<\/ol>\n<p>The policy establishes coarse-grained access permissions to the data lake.<\/p>\n<p>Follow the remainder of this section to create the IAM resources described. 
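<\/p>
<p>The traceability argument above rests entirely on the role-naming convention. It can be captured in two small helper functions; this is an illustrative sketch using the prefix chosen for this post:<\/p>

```python
# Helpers for the role-naming convention used in this post: execution role
# name = common prefix + IAM user name. The inverse mapping lets you trace a
# role seen in CloudTrail or Lake Formation logs back to an individual user.

ROLE_PREFIX = "SageMakerStudioExecutionRole_"

def execution_role_name(iam_user_name: str) -> str:
    """Derive the Studio execution role name for an IAM user."""
    return ROLE_PREFIX + iam_user_name

def iam_user_from_role(role_name: str) -> str:
    """Recover the IAM user name from a logged execution role name."""
    if not role_name.startswith(ROLE_PREFIX):
        raise ValueError(f"{role_name!r} does not follow the naming convention")
    return role_name[len(ROLE_PREFIX):]
```

<p>For example, activity logged under <code>SageMakerStudioExecutionRole_data-scientist-limited<\/code> maps back to the IAM user <code>data-scientist-limited<\/code>.<\/p>
<p>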
The permissions configured in this section grant common, coarse-grained access to data lake resources for all the IAM roles. In a later section, you use Lake Formation to establish fine-grained access permissions to Data Catalog resources and Amazon S3 locations for individual roles.<\/p>\n<h3>Creating the required IAM group and users<\/h3>\n<p>To create your group and users, complete the following steps:<\/p>\n<ol>\n<li>Sign in to the console using an IAM user with permissions to create groups, users, roles, and policies.<\/li>\n<li>On the IAM console, <a href=\"https:\/\/docs.aws.amazon.com\/IAM\/latest\/UserGuide\/access_policies_create-console.html#access_policies_create-json-editor\" target=\"_blank\" rel=\"noopener noreferrer\">create policies on the JSON tab<\/a> to create a new IAM managed policy named <code>DataScientistGroupPolicy<\/code>.\n<ol type=\"a\">\n<li>Use the following JSON policy document to provide permissions, providing your AWS Region and AWS account ID:\n<div class=\"hide-language\">\n<pre><code class=\"lang-json\">{\r\n    \"Version\": \"2012-10-17\",\r\n    \"Statement\": [\r\n        {\r\n            \"Action\": [\r\n                \"sagemaker:DescribeDomain\",\r\n                \"sagemaker:ListDomains\",\r\n                \"sagemaker:ListUserProfiles\",\r\n                \"sagemaker:ListApps\"\r\n            ],\r\n            \"Resource\": \"*\",\r\n            \"Effect\": \"Allow\"\r\n        },\r\n        {\r\n            \"Action\": [\r\n                \"sagemaker:CreatePresignedDomainUrl\",\r\n                \"sagemaker:DescribeUserProfile\"\r\n            ],\r\n            \"Resource\": \"arn:aws:sagemaker:<em><span>&lt;AWSREGION&gt;<\/span><\/em>:<span><em>&lt;AWSACCOUNT&gt;<\/em><\/span>:user-profile\/*\/${aws:username}\",\r\n            \"Effect\": \"Allow\"\r\n        },\r\n        {\r\n            \"Action\": [\r\n                \"sagemaker:CreatePresignedDomainUrl\",\r\n                
\"sagemaker:DescribeUserProfile\"\r\n            ],\r\n            \"Effect\": \"Deny\",\r\n            \"NotResource\": \"arn:aws:sagemaker:<em><span>&lt;AWSREGION&gt;<\/span><\/em>:<em><span>&lt;AWSACCOUNT&gt;<\/span><\/em>:user-profile\/*\/${aws:username}\"\r\n        },\r\n        {\r\n            \"Action\": \"sagemaker:*App\",\r\n            \"Resource\": \"arn:aws:sagemaker:<em><span>&lt;AWSREGION&gt;<\/span><\/em>:<em><span>&lt;AWSACCOUNT&gt;<\/span><\/em>:app\/*\/${aws:username}\/*\",\r\n            \"Effect\": \"Allow\"\r\n        },\r\n        {\r\n            \"Action\": \"sagemaker:*App\",\r\n            \"Effect\": \"Deny\",\r\n            \"NotResource\": \"arn:aws:sagemaker:<em><span>&lt;AWSREGION&gt;<\/span><\/em>:<em><span>&lt;AWSACCOUNT&gt;<\/span><\/em>:app\/*\/${aws:username}\/*\"\r\n        },\r\n        {\r\n            \"Action\": [\r\n                \"sagemaker:CreatePresignedNotebookInstanceUrl\",\r\n                \"sagemaker:*NotebookInstance\",\r\n                \"sagemaker:*NotebookInstanceLifecycleConfig\",\r\n                \"sagemaker:CreateUserProfile\",\r\n                \"sagemaker:DeleteDomain\",\r\n                \"sagemaker:DeleteUserProfile\"\r\n            ],\r\n            \"Resource\": \"*\",\r\n            \"Effect\": \"Deny\"\r\n        }\r\n    ]\r\n}<\/code><\/pre>\n<\/div>\n<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<p>This policy forces an IAM user to open Studio using a SageMaker user profile with the same name. It also denies the use of SageMaker notebook instances, allowing Studio notebooks only.<\/p>\n<ol start=\"3\">\n<li>\n<a href=\"https:\/\/docs.aws.amazon.com\/IAM\/latest\/UserGuide\/id_groups_create.html\" target=\"_blank\" rel=\"noopener noreferrer\">Create an IAM group<\/a>. 
<\/p>\n<ol type=\"a\">\n<li>For <strong>Group name<\/strong>, enter <code>DataScientists<\/code>.<\/li>\n<li>Search and attach the AWS managed policy named <code>DataScientist<\/code> and the IAM policy created in the previous step.<\/li>\n<\/ol>\n<\/li>\n<li>\n<a href=\"https:\/\/docs.aws.amazon.com\/IAM\/latest\/UserGuide\/id_users_create.html#id_users_create_console\" target=\"_blank\" rel=\"noopener noreferrer\">Create two IAM users<\/a> named <code>data-scientist-full<\/code> and <code>data-scientist-limited<\/code>.<\/li>\n<\/ol>\n<p>Alternatively, you can provide names of your choice, as long as they\u2019re a combination of lowercase letters, numbers, and hyphen (-). Later, you also give these names to their corresponding SageMaker user profiles, which at the time of writing <a href=\"https:\/\/docs.aws.amazon.com\/sagemaker\/latest\/APIReference\/API_CreateUserProfile.html#sagemaker-CreateUserProfile-request-UserProfileName\" target=\"_blank\" rel=\"noopener noreferrer\">only support those characters<\/a>.<\/p>\n<h3>Creating the required IAM roles<\/h3>\n<p>To create your roles, complete the following steps:<\/p>\n<ol>\n<li>On the IAM console, <a href=\"https:\/\/docs.aws.amazon.com\/IAM\/latest\/UserGuide\/access_policies_create-console.html#access_policies_create-json-editor\" target=\"_blank\" rel=\"noopener noreferrer\">create a new managed policy<\/a> named <code>SageMakerUserProfileExecutionPolicy<\/code>.\n<ol type=\"a\">\n<li>Use the following policy code:\n<div class=\"hide-language\">\n<pre><code class=\"lang-json\">{\r\n    \"Version\": \"2012-10-17\",\r\n    \"Statement\": [\r\n        {\r\n            \"Action\": [\r\n                \"lakeformation:GetDataAccess\",\r\n                \"glue:GetTable\",\r\n                \"glue:GetTables\",\r\n                \"glue:SearchTables\",\r\n                \"glue:GetDatabase\",\r\n                \"glue:GetDatabases\",\r\n                \"glue:GetPartitions\"\r\n            ],\r\n            
\"Resource\": \"*\",\r\n            \"Effect\": \"Allow\"\r\n        },\r\n        {\r\n            \"Action\": \"sts:AssumeRole\",\r\n            \"Resource\": \"*\",\r\n            \"Effect\": \"Deny\"\r\n        }\r\n    ]\r\n}<\/code><\/pre>\n<\/div>\n<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<p>This policy provides common coarse-grained IAM permissions to the data lake, leaving Lake Formation permissions to control access to Data Catalog resources and Amazon S3 locations for individual users and roles. This is the recommended method for granting access to data in Lake Formation. For more information, see <a href=\"https:\/\/docs.aws.amazon.com\/lake-formation\/latest\/dg\/access-control-fine-grained.html\" target=\"_blank\" rel=\"noopener noreferrer\">Methods for Fine-Grained Access Control<\/a>.<\/p>\n<ol start=\"2\">\n<li>\n<a href=\"https:\/\/docs.aws.amazon.com\/glue\/latest\/dg\/create-an-iam-role-sagemaker-notebook.html\" target=\"_blank\" rel=\"noopener noreferrer\">Create an IAM role<\/a> for the first data scientist (<code>data-scientist-full<\/code>), which is used as the corresponding user profile\u2019s execution role. 
<\/p>\n<ol type=\"a\">\n<li>On the <strong>Attach permissions policy<\/strong> page, search and attach the AWS managed policy <code>AmazonSageMakerFullAccess<\/code>.<\/li>\n<li>For <strong>Role name<\/strong>, use the naming convention introduced at the beginning of this section to name the role <code>SageMakerStudioExecutionRole_data-scientist-full<\/code>.<\/li>\n<\/ol>\n<\/li>\n<li>To add the remaining policies, on the <strong>Roles<\/strong> page, choose the role name you just created.<\/li>\n<li>Under <strong>Permissions<\/strong>, choose <strong>Attach policies<\/strong>.<\/li>\n<li>Search and select the <code>SageMakerUserProfileExecutionPolicy<\/code> and <code>AmazonAthenaFullAccess<\/code> policies.<\/li>\n<li>Choose <strong>Attach policy<\/strong>.<\/li>\n<li>To restrict the Studio resources that can be created within Studio (such as image, kernel, or instance type) to only those belonging to the user profile associated to the first IAM role, <a href=\"https:\/\/docs.aws.amazon.com\/IAM\/latest\/UserGuide\/access_policies_manage-attach-detach.html#embed-inline-policy-console\" target=\"_blank\" rel=\"noopener noreferrer\">embed an inline policy to the IAM role<\/a>.\n<ol type=\"a\">\n<li>Use the following JSON policy document to scope down permissions for the user profile, providing the Region, account ID, and IAM user name associated to the first data scientist (<code>data-scientist-full<\/code>). 
You can name the inline policy <code>DataScientist1IAMRoleInlinePolicy<\/code>.\n<div class=\"hide-language\">\n<pre><code class=\"lang-json\">{\r\n    \"Version\": \"2012-10-17\",\r\n    \"Statement\": [\r\n        {\r\n            \"Action\": \"sagemaker:*App\",\r\n            \"Resource\": \"arn:aws:sagemaker:<em><span>&lt;AWSREGION&gt;<\/span><\/em>:<em><span>&lt;AWSACCOUNT&gt;<\/span><\/em>:app\/*\/<span><em>&lt;IAMUSERNAME&gt;<\/em><\/span>\/*\",\r\n            \"Effect\": \"Allow\"\r\n        },\r\n        {\r\n            \"Action\": \"sagemaker:*App\",\r\n            \"Effect\": \"Deny\",\r\n            \"NotResource\": \"arn:aws:sagemaker:<em><span>&lt;AWSREGION&gt;<\/span><\/em>:<em><span>&lt;AWSACCOUNT&gt;<\/span><\/em>:app\/*\/<em><span>&lt;IAMUSERNAME&gt;<\/span><\/em>\/*\"\r\n        }\r\n    ]\r\n}<\/code><\/pre>\n<\/div>\n<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<ol start=\"8\">\n<li>Repeat the previous steps to create an IAM role for the second data scientist (<code>data-scientist-limited<\/code>).\n<ol type=\"a\">\n<li>Name the role <code>SageMakerStudioExecutionRole_data-scientist-limited<\/code> and the second inline policy <code>DataScientist2IAMRoleInlinePolicy<\/code>.<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<h2>Granting data permissions with Lake Formation<\/h2>\n<p>Before data scientists are able to work on a Studio notebook, you grant the individual execution roles created in the previous section access to the Amazon Customer Reviews Dataset (or your own dataset). 
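<\/p>
<p>The grants in this section use the Lake Formation console, but they can equally be scripted. The following is an illustrative sketch using the <code>grant_permissions<\/code> API of the boto3 Lake Formation client; the role ARN is a placeholder, and the column list is the non-PII subset used in this post.<\/p>

```python
# Sketch of a column-limited Lake Formation grant with boto3. The request
# builder is pure (and testable); the role ARN passed to grant_column_access
# is a placeholder you must replace with a real execution role ARN.

# Non-PII subset of the Amazon Customer Reviews table used in this post.
NON_PII_COLUMNS = [
    "product_category", "product_id", "product_parent", "product_title",
    "star_rating", "review_headline", "review_body", "review_date",
]

def column_grant_request(role_arn, database, table, columns):
    """Build the keyword arguments for lakeformation.grant_permissions."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": role_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }

def grant_column_access(role_arn, database="amazon_reviews_db",
                        table="amazon_reviews_parquet",
                        columns=NON_PII_COLUMNS):
    import boto3  # deferred so the request builder stays dependency-free

    boto3.client("lakeformation").grant_permissions(
        **column_grant_request(role_arn, database, table, columns)
    )
```

<p>A full-table grant, like the one for the first data scientist, would use a <code>Table<\/code> resource instead of <code>TableWithColumns<\/code>. Grants made through the API are recorded in CloudTrail just like console grants.<\/p>
<p>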
For this post, we implement different data permission policies for each data scientist to demonstrate how to grant granular access using Lake Formation.<\/p>\n<ol>\n<li>Sign in to the console with the IAM user configured as Lake Formation Admin.<\/li>\n<li>On the Lake Formation console, in the navigation pane, choose <strong>Tables<\/strong>.<\/li>\n<li>On the <strong>Tables<\/strong> page, select the table you created earlier, such as <code>amazon_reviews_parquet<\/code>.<\/li>\n<li>On the <strong>Actions<\/strong> menu, under <strong>Permissions<\/strong>, choose <strong>Grant<\/strong>.<\/li>\n<li>Provide the following information to grant full access to the Amazon Customer Reviews Dataset table for the first data scientist:<\/li>\n<li>Select <strong>My account<\/strong>.<\/li>\n<li>For <strong>IAM users and roles<\/strong>, choose the execution role associated to the first data scientist, such as <code>SageMakerStudioExecutionRole_data-scientist-full<\/code>.<\/li>\n<li>For <strong>Table permissions<\/strong> and <strong>Grantable permissions<\/strong>, select <strong>Select<\/strong>.<\/li>\n<li>Choose <strong>Grant<\/strong>.<\/li>\n<li>Repeat the previous steps to grant limited access to the dataset for the second data scientist, providing the following information:<\/li>\n<li>Select <strong>My account<\/strong>.<\/li>\n<li>For <strong>IAM users and roles<\/strong>, choose the execution role associated to the second data scientist, such as <code>SageMakerStudioExecutionRole_data-scientist-limited<\/code>.<\/li>\n<li>For <strong>Columns<\/strong>, choose <strong>Include columns<\/strong>.<\/li>\n<li>Choose a subset of columns, such as <code>product_category<\/code>, <code>product_id<\/code>, <code>product_parent<\/code>, <code>product_title<\/code>, <code>star_rating<\/code>, <code>review_headline<\/code>, <code>review_body<\/code>, and <code>review_date<\/code>.<\/li>\n<li>For <strong>Table permissions<\/strong> and <strong>Grantable permissions<\/strong>, 
select <strong>Select<\/strong>.<\/li>\n<li>Choose <strong>Grant<\/strong>.<\/li>\n<li>To verify the data permissions you have granted, on the Lake Formation console, in the navigation pane, choose <strong>Tables<\/strong>.<\/li>\n<li>On the <strong>Tables<\/strong> page, select the table you created earlier, such as <code>amazon_reviews_parquet<\/code>.<\/li>\n<li>On the <strong>Actions<\/strong> menu, under <strong>Permissions<\/strong>, choose <strong>View permissions <\/strong>to open the<strong> Data permissions\u00a0<\/strong>menu.<\/li>\n<\/ol>\n<p>You see a list of permissions granted for the table, including the permissions you just granted and permissions for the Lake Formation Admin.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-19976\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/12\/16\/ML-1106-5.jpg\" alt=\"You see a list of permissions granted for the table, including the permissions you just granted and permissions for the Lake Formation Admin.\" width=\"800\" height=\"420\"><\/p>\n<p>If you see the principal <code>IAMAllowedPrincipals<\/code> listed on the <strong>Data permissions<\/strong> menu for the table, you must remove it. Select the principal and choose <strong>Revoke<\/strong>. On the <strong>Revoke permissions<\/strong> page, choose <strong>Revoke<\/strong>.<\/p>\n<h2>Setting up SageMaker Studio<\/h2>\n<p>You now onboard to Studio and create two user profiles, one for each data scientist.<\/p>\n<p>When you onboard to Studio using IAM authentication, Studio creates a domain for your account. A domain consists of a list of authorized users, configuration settings, and an Amazon EFS volume, which contains data for the users, including notebooks, resources, and artifacts.<\/p>\n<p>Each user receives a private home directory within Amazon EFS for notebooks, Git repositories, and data files. 
All traffic between the domain and the Amazon EFS volume is communicated through specified subnet IDs. By default, all other traffic goes over the internet through a SageMaker system <a href=\"https:\/\/aws.amazon.com\/vpc\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Virtual Private Cloud<\/a> (Amazon VPC).<\/p>\n<p>Alternatively, instead of using the default SageMaker internet access, you could secure how Studio accesses resources by assigning a private VPC to the domain. This is beyond the scope of this post, but you can find additional details in <a href=\"https:\/\/aws.amazon.com\/blogs\/machine-learning\/securing-amazon-sagemaker-studio-connectivity-using-a-private-vpc\/\" target=\"_blank\" rel=\"noopener noreferrer\">Securing Amazon SageMaker Studio connectivity using a private VPC<\/a>.<\/p>\n<p>If you already have a Studio domain running, you can skip the onboarding process and follow the steps to create the SageMaker user profiles.<\/p>\n<h3>Onboarding to Studio<\/h3>\n<p>To onboard to Studio, complete the following steps:<\/p>\n<ol>\n<li>Sign in to the console using an IAM user with service administrator permissions for SageMaker.<\/li>\n<li>On the SageMaker console, in the navigation pane, choose <strong>Amazon SageMaker Studio<\/strong>.<\/li>\n<li>On the <strong>Studio <\/strong>menu, under <strong>Get started<\/strong>, choose <strong>Standard setup<\/strong>.<\/li>\n<li>For <strong>Authentication method<\/strong>, choose <strong>AWS Identity and Access Management (IAM)<\/strong>.<\/li>\n<li>Under <strong>Permission<\/strong>, for <strong>Execution role for all users<\/strong>, choose an option from the role selector.<\/li>\n<\/ol>\n<p>You\u2019re not using this execution role for the SageMaker user profiles that you create later. 
If you choose <strong>Create a new role<\/strong>, the <strong>Create an IAM role<\/strong> dialog opens.<\/p>\n<ol start=\"6\">\n<li>For <strong>S3 buckets you specify<\/strong>, choose <strong>None<\/strong>.<\/li>\n<li>Choose <strong>Create role<\/strong>.<\/li>\n<\/ol>\n<p>SageMaker creates a new IAM role named <code>AmazonSageMaker-ExecutionPolicy<\/code> with the <code>AmazonSageMakerFullAccess<\/code> policy attached.<\/p>\n<ol start=\"8\">\n<li>Under <strong>Network and storage<\/strong>, for <strong>VPC<\/strong>, choose the private VPC that is used for communication with the Amazon EFS volume.<\/li>\n<li>For <strong>Subnet(s)<\/strong>, choose multiple subnets in the VPC from different Availability Zones.<\/li>\n<li>Choose <strong>Submit<\/strong>.<\/li>\n<li>On the <strong>Studio Control Panel<\/strong>, under <strong>Studio Summary<\/strong>, wait for the status to change to <code>Ready<\/code> and the <strong>Add user<\/strong> button to be enabled.<\/li>\n<\/ol>\n<h3>Creating the SageMaker user profiles<\/h3>\n<p>To create your SageMaker user profiles, complete the following steps:<\/p>\n<ol>\n<li>On the SageMaker console, in the navigation pane, choose <strong>Amazon SageMaker Studio<\/strong>.<\/li>\n<li>On the <strong>Studio Control Panel<\/strong>, choose <strong>Add user<\/strong>.<\/li>\n<li>For <strong>User name<\/strong>, enter <code>data-scientist-full<\/code>.<\/li>\n<li>For <strong>Execution role<\/strong>, choose <strong>Enter a custom IAM role ARN<\/strong>.<\/li>\n<li>Enter <code>arn:aws:iam::<span><em>&lt;AWSACCOUNT&gt;<\/em><\/span>:role\/SageMakerStudioExecutionRole_data-scientist-full<\/code>, providing your AWS account ID.<\/li>\n<li>After creating the first user profile, repeat the previous steps to create a second user profile.\n<ol type=\"a\">\n<li>For <strong>User name<\/strong>, enter <code>data-scientist-limited<\/code>.<\/li>\n<li>For <strong>Execution role<\/strong>, enter the associated IAM role 
ARN.<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-19977\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/12\/16\/ML-1106-6.jpg\" alt=\"For Execution role, enter the associated IAM role ARN.\" width=\"800\" height=\"342\"><\/p>\n<h2>Testing Lake Formation access control policies<\/h2>\n<p>You now test the implemented Lake Formation access control policies by opening Studio using both user profiles. For each user profile, you run the same Studio notebook containing Athena queries. You should see different query outputs for each user profile, matching the data permissions implemented earlier.<\/p>\n<ol>\n<li>Sign in to the console with IAM user <code>data-scientist-full<\/code>.<\/li>\n<li>On the SageMaker console, in the navigation pane, choose <strong>Amazon SageMaker Studio<\/strong>.<\/li>\n<li>On the <strong>Studio Control Panel<\/strong>, choose user name <code>data-scientist-full<\/code><strong>.<\/strong>\n<\/li>\n<li>Choose <strong>Open Studio<\/strong>.<\/li>\n<li>Wait for SageMaker Studio to load.<\/li>\n<\/ol>\n<p>Due to the IAM policies attached to the IAM user, you can only open Studio with a user profile matching the IAM user name.<\/p>\n<ol start=\"6\">\n<li>In Studio, on the top menu, under <strong>File<\/strong>, under <strong>New<\/strong>, choose <strong>Terminal<\/strong>.<\/li>\n<li>At the command prompt, run the following command to import a sample notebook to test Lake Formation data permissions:\n<div class=\"hide-language\">\n<pre><code class=\"lang-bash\">git clone https:\/\/github.com\/aws-samples\/amazon-sagemaker-studio-audit.git<\/code><\/pre>\n<\/div>\n<\/li>\n<\/ol>\n<ol start=\"8\">\n<li>In the left sidebar, choose the <strong>file browser\u00a0<\/strong>icon.<\/li>\n<li>Navigate to <code>amazon-sagemaker-studio-audit<\/code>.<\/li>\n<li>Open the <code>notebook<\/code> folder.<\/li>\n<li>Choose 
<strong>sagemaker-studio-audit-control.ipynb<\/strong> to open the notebook.<\/li>\n<li>In the <strong>Select Kernel<\/strong> dialog, choose <strong>Python 3 (Data Science)<\/strong>.<\/li>\n<li>Choose <strong>Select<\/strong>.<\/li>\n<li>Wait for the kernel to load.<\/li>\n<\/ol>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-19978\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/12\/16\/ML-1106-7.jpg\" alt=\"Wait for the kernel to load.\" width=\"800\" height=\"220\"><\/p>\n<ol start=\"15\">\n<li>Starting from the first code cell in the notebook, press Shift + Enter to run the code cell.<\/li>\n<li>Continue running all the code cells, waiting for the previous cell to finish before running the following cell.<\/li>\n<\/ol>\n<p>After running the last <code>SELECT<\/code> query, because the user has full SELECT permissions for the table, the query output includes all the columns in the <code>amazon_reviews_parquet<\/code> table.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-19979\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/12\/16\/ML-1106-8.jpg\" alt=\"After running the last SELECT query, because the user has full SELECT permissions for the table, the query output includes all the columns in the amazon_reviews_parquet table.\" width=\"800\" height=\"235\"><\/p>\n<ol start=\"17\">\n<li>On the top menu, under <strong>File<\/strong>, choose <strong>Shut Down<\/strong>.<\/li>\n<li>Choose <strong>Shutdown All <\/strong>to shut down all the Studio apps.<\/li>\n<li>Close the Studio browser tab.<\/li>\n<li>Repeat the previous steps in this section, this time signing in as the user <code>data-scientist-limited<\/code> and opening Studio with this user.<\/li>\n<li>Don\u2019t run the code cell in the section <strong>Create S3 bucket for query output files<\/strong>.<\/li>\n<\/ol>\n<p>For this user, 
after running the same <code>SELECT<\/code> query in the Studio notebook, the query output only includes a subset of columns for the <code>amazon_reviews_parquet<\/code> table.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-19980\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/12\/16\/ML-1106-9.jpg\" alt=\"For this user, after running the same SELECT query in the Studio notebook, the query output only includes a subset of columns for the amazon_reviews_parquet table.\" width=\"800\" height=\"189\"><\/p>\n<h2>Auditing data access activity with Lake Formation and CloudTrail<\/h2>\n<p>In this section, we explore the events associated with the queries performed in the previous section. The Lake Formation console includes a dashboard that centralizes the CloudTrail logs specific to the service, such as <code>GetDataAccess<\/code>. These events can be correlated with other CloudTrail events, such as Athena query requests, to get a complete view of the queries users are running on the data lake.<\/p>\n<p>Alternatively, instead of filtering individual events in Lake Formation and CloudTrail, you could run SQL queries to correlate CloudTrail logs using Athena. 
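<\/p>
<p>For a quick programmatic pull of the same events, you can also call the CloudTrail <code>LookupEvents<\/code> API directly. The following boto3 sketch is illustrative and not part of the original walkthrough; the helper name is an assumption.<\/p>

```python
def build_lookup_request(event_name):
    """Assemble parameters for cloudtrail.lookup_events(), filtering
    the event history by a single event name."""
    return {
        'LookupAttributes': [
            {'AttributeKey': 'EventName', 'AttributeValue': event_name},
        ],
        'MaxResults': 50,
    }

# Example (not run here; requires AWS credentials with CloudTrail read
# access). Pull the Lake Formation data-access events and the Athena
# query submissions, which share the same Athena query ID:
# import boto3
# cloudtrail = boto3.client('cloudtrail')
# for name in ('GetDataAccess', 'StartQueryExecution'):
#     for event in cloudtrail.lookup_events(**build_lookup_request(name))['Events']:
#         print(event['EventTime'], event['EventName'], event['Username'])
```

<p>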
Such integration is beyond the scope of this post, but you can find additional details in <a href=\"https:\/\/docs.aws.amazon.com\/athena\/latest\/ug\/cloudtrail-logs.html#create-cloudtrail-table-ct\" target=\"_blank\" rel=\"noopener noreferrer\">Using the CloudTrail Console to Create an Athena Table for CloudTrail Logs<\/a> and <a href=\"https:\/\/aws.amazon.com\/blogs\/big-data\/aws-cloudtrail-and-amazon-athena-dive-deep-to-analyze-security-compliance-and-operational-activity\/\" target=\"_blank\" rel=\"noopener noreferrer\">Analyze Security, Compliance, and Operational Activity Using AWS CloudTrail and Amazon Athena<\/a>.<\/p>\n<h3>Auditing data access activity with Lake Formation<\/h3>\n<p>To review activity in Lake Formation, complete the following steps:<\/p>\n<ol>\n<li>Sign out of the AWS account.<\/li>\n<li>Sign in to the console with the IAM user configured as Lake Formation Admin.<\/li>\n<li>On the Lake Formation console, in the navigation pane, choose <strong>Dashboard<\/strong>.<\/li>\n<\/ol>\n<p>Under <strong>Recent access activity<\/strong>, you can find the events associated with the data access for both users.<\/p>\n<ol start=\"4\">\n<li>Choose the most recent event with event name <code>GetDataAccess<\/code>.<\/li>\n<li>Choose <strong>View event<\/strong>.<\/li>\n<\/ol>\n<p>Among other attributes, each event includes the following:<\/p>\n<ul>\n<li>Event date and time<\/li>\n<li>Event source (Lake Formation)<\/li>\n<li>Athena query ID<\/li>\n<li>Table being queried<\/li>\n<li>IAM user embedded in the Lake Formation principal, based on the chosen role name convention<\/li>\n<\/ul>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-19981\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/12\/16\/ML-1106-10.jpg\" alt=\"\u2022 IAM user embedded in the Lake Formation principal, based on the chosen role name convention\" width=\"800\" height=\"325\"><\/p>\n<h3>Auditing data access 
activity with CloudTrail<\/h3>\n<p>To review activity in CloudTrail, complete the following steps:<\/p>\n<ol>\n<li>On the CloudTrail console, in the navigation pane, choose <strong>Event history<\/strong>.<\/li>\n<li>In the <strong>Event history <\/strong>menu, for <strong>Filter<\/strong>, choose <strong>Event name<\/strong>.<\/li>\n<li>Enter <code>StartQueryExecution<\/code>.<\/li>\n<li>Expand the most recent event, then choose <strong>View event<\/strong>.<\/li>\n<\/ol>\n<p>This event includes additional parameters that are useful to complete the audit analysis, such as the following:<\/p>\n<ul>\n<li>Event source (Athena).<\/li>\n<li>Athena query ID, matching the query ID from Lake Formation\u2019s <code>GetDataAccess<\/code>\u00a0event.<\/li>\n<li>Query string.<\/li>\n<li>Output location. The query output is stored in CSV format in this Amazon S3 location. Files for each query are named using the query ID.<\/li>\n<\/ul>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-19982\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/12\/16\/ML-1106-11.jpg\" alt=\"Output location. The query output is stored in CSV format in this Amazon S3 location. 
Files for each query are named using the query ID.\" width=\"800\" height=\"368\"><\/p>\n<h2>Cleaning up<\/h2>\n<p>To avoid incurring future charges, delete the resources created during this walkthrough.<\/p>\n<p>If you followed this walkthrough using the CloudFormation template, after shutting down the Studio apps for each user profile, deleting the stack deletes the remaining resources.<\/p>\n<p>If you encounter any errors, open the Studio Control Panel and verify that all the apps for every user profile are in <code>Deleted<\/code> state before deleting the stack.<\/p>\n<p>If you didn\u2019t use the CloudFormation template, you can manually delete the resources you created:<\/p>\n<ol>\n<li>On the <strong>Studio Control Panel<\/strong>, for each user profile, choose <strong>User Details<\/strong>.<\/li>\n<li>Choose <strong>Delete user<\/strong>.<\/li>\n<li>When all users are deleted, choose <strong>Delete Studio<\/strong>.<\/li>\n<li>On the Amazon EFS console, delete the volume that was automatically created for Studio.<\/li>\n<li>On the Lake Formation console, delete the table and the database created for the Amazon Customer Reviews Dataset.<\/li>\n<li>Remove the data lake location for the dataset.<\/li>\n<li>On the IAM console, delete the IAM users, group, and roles created for this walkthrough.<\/li>\n<li>Delete the policies you created for these principals.<\/li>\n<li>On the Amazon S3 console, empty and delete the bucket created for storing Athena query results (starting with <code>sagemaker-audit-control-query-results-<\/code>), and the bucket created by Studio to share notebooks (starting with <code>sagemaker-studio-<\/code>).<\/li>\n<\/ol>\n<h2>Conclusion<\/h2>\n<p>This post described how to implement access control and auditing capabilities on a per-user basis in ML projects, using Studio notebooks, Athena, and Lake Formation to enforce access control policies when performing exploratory activities in a data lake.<\/p>\n<p>I thank you for following 
this walkthrough and I invite you to implement it using the associated <a href=\"https:\/\/console.aws.amazon.com\/cloudformation\/home#\/stacks\/create\/review?templateURL=https:\/\/aws-ml-blog.s3.amazonaws.com\/artifacts\/sagemaker-studio-audit-control\/SageMakerStudioAuditControlStack.yaml&amp;stackName=SageMakerStudioAuditControl\" target=\"_blank\" rel=\"noopener noreferrer\">CloudFormation template<\/a>. You\u2019re also welcome to visit the <a href=\"https:\/\/github.com\/aws-samples\/amazon-sagemaker-studio-audit\" target=\"_blank\" rel=\"noopener noreferrer\">GitHub repo<\/a> for the project.<\/p>\n<hr>\n<h3>About the Author<\/h3>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-19987 alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2020\/12\/16\/Rodrigo-Alarcon.jpg\" alt=\"\" width=\"100\" height=\"133\"><strong>Rodrigo Alarcon<\/strong> is a Sr. Solutions Architect with AWS based out of Santiago, Chile. Rodrigo has over 10 years of experience in IT security and network infrastructure. 
His interests include machine learning and cybersecurity.<\/p>\n<\/p><\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/aws.amazon.com\/blogs\/machine-learning\/controlling-and-auditing-data-exploration-activities-with-amazon-sagemaker-studio-and-aws-lake-formation\/<\/p>\n","protected":false},"author":0,"featured_media":719,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/718"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=718"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/718\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/719"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=718"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=718"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=718"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}