{"id":274,"date":"2020-09-24T19:52:03","date_gmt":"2020-09-24T19:52:03","guid":{"rendered":"https:\/\/machine-learning.webcloning.com\/2020\/09\/24\/how-to-hill-climb-the-test-set-for-machine-learning\/"},"modified":"2020-09-24T19:52:03","modified_gmt":"2020-09-24T19:52:03","slug":"how-to-hill-climb-the-test-set-for-machine-learning","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2020\/09\/24\/how-to-hill-climb-the-test-set-for-machine-learning\/","title":{"rendered":"How to Hill Climb the Test Set for Machine Learning"},"content":{"rendered":"<div id=\"\">\n<p><strong>Hill climbing the test set<\/strong> is an approach to achieving good or perfect predictions on a machine learning competition without touching the training set or even developing a predictive model.<\/p>\n<p>As an approach to machine learning competitions, it is rightfully frowned upon, and most competition platforms impose limitations to prevent it, which is important.<\/p>\n<p>Nevertheless, hill climbing the test set is something that a machine learning practitioner accidentally does as part of participating in a competition. 
Developing an explicit implementation to hill climb a test set helps to show how easy it is to overfit a test dataset by overusing it to evaluate modeling pipelines.<\/p>\n<p>In this tutorial, you will discover how to hill climb the test set for machine learning.<\/p>\n<p>After completing this tutorial, you will know:<\/p>\n<ul>\n<li>That perfect predictions can be made by hill climbing the test set without even looking at the training dataset.<\/li>\n<li>How to hill climb the test set for classification and regression tasks.<\/li>\n<li>That we implicitly hill climb the test set when we overuse it to evaluate our modeling pipelines.<\/li>\n<\/ul>\n<p>Let\u2019s get started.<\/p>\n<div id=\"attachment_10974\" class=\"wp-caption aligncenter\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-10974\" loading=\"lazy\" class=\"size-full wp-image-10974\" src=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/09\/41335312700_7d413118d3_c.jpg\" alt=\"How to Hill Climb the Test Set for Machine Learning\" width=\"799\" height=\"533\"><\/p>\n<p id=\"caption-attachment-10974\" class=\"wp-caption-text\">How to Hill Climb the Test Set for Machine Learning<br \/>Photo by <a href=\"https:\/\/www.flickr.com\/photos\/stignygaard\/41335312700\/\">Stig Nygaard<\/a>, some rights reserved.<\/p>\n<\/div>\n<h2>Tutorial Overview<\/h2>\n<p>This tutorial is divided into five parts; they are:<\/p>\n<ol>\n<li>Hill Climb the Test Set<\/li>\n<li>Hill Climbing Algorithm<\/li>\n<li>How to Implement Hill Climbing<\/li>\n<li>Hill Climb Diabetes Classification Dataset<\/li>\n<li>Hill Climb Housing Regression 
Dataset<\/li>\n<\/ol>\n<h2>Hill Climb the Test Set<\/h2>\n<p>Machine learning competitions, like those on Kaggle, provide a complete training dataset as well as only the inputs for the test set.<\/p>\n<p>The objective for a given competition is to predict target values, such as labels or numerical values, for the test set. Solutions are evaluated against the hidden test set target values and scored appropriately. The submission with the best score against the test set wins the competition.<\/p>\n<p>The challenge of a machine learning competition can be framed as an optimization problem. Traditionally, the competition participant acts as the optimization algorithm, exploring different modeling pipelines that result in different sets of predictions, scoring the predictions, then making changes to the pipeline that are expected to result in an improved score.<\/p>\n<p>This process can also be modeled directly with an optimization algorithm where candidate predictions are generated and evaluated without ever looking at the training set.<\/p>\n<p>Generally, this is referred to as <a href=\"https:\/\/en.wikipedia.org\/wiki\/Hill_climbing\">hill climbing<\/a> the test set, as one of the simplest optimization algorithms to implement to solve this problem is the hill climbing algorithm.<\/p>\n<p>Although <strong>hill climbing the test set is rightfully frowned upon<\/strong> in actual machine learning competitions, it can be an interesting exercise to implement it in order to learn about its limitations and the dangers of overfitting the test set. Additionally, the fact that the test set can be predicted perfectly without ever touching the training dataset often shocks beginner machine learning practitioners.<\/p>\n<p>Most importantly, we implicitly hill climb the test set when we repeatedly evaluate different modeling pipelines. The risk is that the score on the test set improves at the cost of increased generalization error, i.e. 
worse performance on the broader problem.<\/p>\n<p>People who run machine learning competitions are well aware of this problem and impose limitations on prediction evaluation to counter it, such as limiting submissions to one or a few evaluations per day and reporting scores on a hidden subset of the test set rather than the entire test set. For more on this, see the papers listed in the further reading section.<\/p>\n<p>Next, let\u2019s look at how we can implement the hill climbing algorithm to optimize predictions for a test set.<\/p>\n<h2>Hill Climbing Algorithm<\/h2>\n<p>The <strong>hill climbing algorithm<\/strong> is a very simple optimization algorithm.<\/p>\n<p>It involves generating a candidate solution and evaluating it. This is the starting point that is then incrementally improved until either no further improvement can be achieved or we run out of time, resources, or interest.<\/p>\n<p>New candidate solutions are generated from the existing candidate solution. Typically, this involves making a single change to the candidate solution, evaluating it, and accepting the candidate solution as the new \u201c<em>current<\/em>\u201d solution if it is as good as or better than the previous current solution. Otherwise, it is discarded.<\/p>\n<p>We might think that it is a good idea to accept only candidates that have a better score. 
This is a reasonable approach for many simple problems, although on more complex problems it is desirable to accept different candidates with the same score to help the search process cross flat areas (plateaus) in the search space.<\/p>\n<p>When hill climbing the test set, a candidate solution is a list of predictions. For a binary classification task, this is a list of 0 and 1 values for the two classes. For a regression task, this is a list of numbers in the range of the target variable.<\/p>\n<p>A modification to a candidate solution for classification would be to select one prediction and flip it from 0 to 1 or 1 to 0. A modification to a candidate solution for regression would be to add Gaussian noise to one value in the list or replace a value in the list with a new value.<\/p>\n<p>Scoring of solutions involves calculating a scoring metric, such as classification accuracy for a classification task or mean absolute error for a regression task.<\/p>\n<p>Now that we are familiar with the algorithm, let\u2019s implement it.<\/p>\n<h2>How to Implement Hill Climbing<\/h2>\n<p>We will develop our hill climbing algorithm on a synthetic classification task.<\/p>\n<p>First, let\u2019s create a binary classification task with many input variables and 5,000 rows of examples. 
We can then split the dataset into train and test sets.<\/p>\n<p>The complete example is listed below.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d3b335160127\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# example of a synthetic dataset.<br \/>\nfrom sklearn.datasets import make_classification<br \/>\nfrom sklearn.model_selection import train_test_split<br \/>\n# define dataset<br \/>\nX, y = make_classification(n_samples=5000, n_features=20, n_informative=15, n_redundant=5, random_state=1)<br \/>\nprint(X.shape, y.shape)<br \/>\n# split dataset<br \/>\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)<br \/>\nprint(X_train.shape, X_test.shape, y_train.shape, y_test.shape)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># example of a synthetic dataset.<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">datasets <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">make_classification<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">model_selection <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-v\">train_test<\/span><span class=\"crayon-sy\">_<\/span>split<\/p>\n<p><span class=\"crayon-p\"># 
define dataset<\/span><\/p>\n<p><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">make_classification<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_samples<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">5000<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">n_features<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">20<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">n_informative<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">15<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">n_redundant<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">5<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">random_state<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># split dataset<\/span><\/p>\n<p><span class=\"crayon-v\">X_train<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_train<\/span><span 
class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">train_test_split<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">test_size<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">0.33<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">random_state<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_train<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_train<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0005 seconds] --><\/p>\n<p>Running the example first reports the shape of the created dataset, showing 5,000 rows and 20 input variables.<\/p>\n<p>The dataset is then split into train and test sets with about 3,300 for training and about 1,600 for testing.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 
--><\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d41828749015\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n(5000, 20) (5000,)<br \/>\n(3350, 20) (1650, 20) (3350,) (1650,)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>(5000, 20) (5000,)<\/p>\n<p>(3350, 20) (1650, 20) (3350,) (1650,)<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0000 seconds] --><\/p>\n<p>Now we can develop a hill climber.<\/p>\n<p>First, we can create a function that will load, or in this case, define the dataset. 
We can update this function later when we want to change the dataset.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d42118109817\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# load or prepare the classification dataset<br \/>\ndef load_dataset():<br \/>\n\treturn make_classification(n_samples=5000, n_features=20, n_informative=15, n_redundant=5, random_state=1)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># load or prepare the classification dataset<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">load_dataset<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">make_classification<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_samples<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">5000<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">n_features<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">20<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">n_informative<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">15<\/span><span class=\"crayon-sy\">,<\/span><span 
class=\"crayon-h\"> <\/span><span class=\"crayon-v\">n_redundant<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">5<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">random_state<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0002 seconds] --><\/p>\n<p>Next, we need a function to evaluate candidate solutions\u2013that is, lists of predictions.<\/p>\n<p>We will use classification accuracy where scores range between 0 for the worst possible solution to 1 for a perfect set of predictions.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d43120045580\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# evaluate a set of predictions<br \/>\ndef evaluate_predictions(y_test, yhat):<br \/>\n\treturn accuracy_score(y_test, yhat)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># evaluate a set of predictions<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">evaluate_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">yhat<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span 
class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">accuracy_score<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">yhat<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0001 seconds] --><\/p>\n<p>Next, we need a function to create an initial candidate solution.<\/p>\n<p>That is a list of predictions for 0 and 1 class labels, long enough to match the number of examples in the test set, in this case, 1650.<\/p>\n<p>We can use the <a href=\"https:\/\/docs.python.org\/3\/library\/random.html#random.randint\">randint() function<\/a> to generate random values of 0 and 1.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d44219391696\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# create a random set of predictions<br \/>\ndef random_predictions(n_examples):<br \/>\n\treturn [randint(0, 1) for _ in range(n_examples)]<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># create a random set of predictions<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">random_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_examples<\/span><span 
class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-e\">randint<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">_<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_examples<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0001 seconds] --><\/p>\n<p>Next, we need a function to create a modified version of a candidate solution.<\/p>\n<p>In this case, this involves selecting one value in the solution and flipping it from 0 to 1 or 1 to 0.<\/p>\n<p>Typically, we make a single change for each new candidate solution during hill climbing, but I have parameterized the function so you can explore making more than one change if you want.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d45715284015\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# modify the current set of predictions<br \/>\ndef modify_predictions(current, n_changes=1):<br \/>\n\t# copy current solution<br \/>\n\tupdated = current.copy()<br 
\/>\n\tfor i in range(n_changes):<br \/>\n\t\t# select a point to change<br \/>\n\t\tix = randint(0, len(updated)-1)<br \/>\n\t\t# flip the class label<br \/>\n\t\tupdated[ix] = 1 - updated[ix]<br \/>\n\treturn updated<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># modify the current set of predictions<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">modify_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">current<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">n_changes<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-p\"># copy current solution<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">updated<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">current<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">copy<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">i<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_changes<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># select a 
point to change<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">ix<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">randint<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">len<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">updated<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># flip the class label<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">updated<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">ix<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">updated<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">ix<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">updated<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0003 seconds] --><\/p>\n<p>So far, so good.<\/p>\n<p>Next, we can develop the function that performs the search.<\/p>\n<p>First, an initial solution is created and evaluated by calling the <em>random_predictions()<\/em> function followed by the <em>evaluate_predictions()<\/em> function.<\/p>\n<p>Then we loop for a fixed number of iterations and generate a new candidate by calling <em>modify_predictions()<\/em>, evaluate it, 
and if the score is as good as or better than the current solution, replace it.<\/p>\n<p>The loop ends when we finish the pre-set number of iterations (chosen arbitrarily) or when a perfect score is achieved, which we know in this case is an accuracy of 1.0 (100 percent).<\/p>\n<p>The function <em>hill_climb_testset()<\/em> below implements this, taking the test set as input and returning the best set of predictions found during the hill climbing, along with the list of scores recorded during the search.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d46771792039\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# run a hill climb for a set of predictions<br \/>\ndef hill_climb_testset(X_test, y_test, max_iterations):<br \/>\n\tscores = list()<br \/>\n\t# generate the initial solution<br \/>\n\tsolution = random_predictions(X_test.shape[0])<br \/>\n\t# evaluate the initial solution<br \/>\n\tscore = evaluate_predictions(y_test, solution)<br \/>\n\tscores.append(score)<br \/>\n\t# hill climb to a solution<br \/>\n\tfor i in range(max_iterations):<br \/>\n\t\t# record scores<br \/>\n\t\tscores.append(score)<br \/>\n\t\t# stop once we achieve the best score<br \/>\n\t\tif score == 1.0:<br \/>\n\t\t\tbreak<br \/>\n\t\t# generate new candidate<br \/>\n\t\tcandidate = modify_predictions(solution)<br \/>\n\t\t# evaluate candidate<br \/>\n\t\tvalue = evaluate_predictions(y_test, candidate)<br \/>\n\t\t# check if it is as good or better<br \/>\n\t\tif value &gt;= score:<br \/>\n\t\t\tsolution, score = candidate, value<br \/>\n\t\t\tprint('&gt;%d, score=%.3f' % (i, score))<br \/>\n\treturn solution, scores<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table 
class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># run a hill climb for a set of predictions<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">hill_climb_testset<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">max_iterations<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">scores<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">list<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-p\"># generate the initial solution<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">random_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span 
class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-p\"># evaluate the initial solution<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">evaluate_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">scores<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">append<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-p\"># hill climb to a solution<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">i<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">max_iterations<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># record scores<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">scores<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">append<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># stop once we achieve the best score<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-st\">if<\/span><span class=\"crayon-h\"> 
<\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">==<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1.0<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t\t<\/span><span class=\"crayon-st\">break<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># generate new candidate<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">candidate<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">modify_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># evaluate candidate<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">value<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">evaluate_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">candidate<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># check if it is as good or better<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-st\">if<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">value<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">&gt;=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t\t<\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-h\"> 
<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">candidate<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">value<\/span><\/p>\n<p><span class=\"crayon-e\">\t\t\t<\/span><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">'&gt;%d, score=%.3f'<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">%<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">i<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">scores<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0006 seconds] --><\/p>\n<p>That\u2019s all there is to it.<\/p>\n<p>The complete example of hill climbing the test set is listed below.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d47475878821\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# example of hill climbing the test set for a classification task<br \/>\nfrom random import randint<br \/>\nfrom sklearn.datasets import make_classification<br \/>\nfrom sklearn.model_selection import train_test_split<br \/>\nfrom sklearn.metrics import accuracy_score<br \/>\nfrom matplotlib import 
pyplot<\/p>\n<p># load or prepare the classification dataset<br \/>\ndef load_dataset():<br \/>\n\treturn make_classification(n_samples=5000, n_features=20, n_informative=15, n_redundant=5, random_state=1)<\/p>\n<p># evaluate a set of predictions<br \/>\ndef evaluate_predictions(y_test, yhat):<br \/>\n\treturn accuracy_score(y_test, yhat)<\/p>\n<p># create a random set of predictions<br \/>\ndef random_predictions(n_examples):<br \/>\n\treturn [randint(0, 1) for _ in range(n_examples)]<\/p>\n<p># modify the current set of predictions<br \/>\ndef modify_predictions(current, n_changes=1):<br \/>\n\t# copy current solution<br \/>\n\tupdated = current.copy()<br \/>\n\tfor i in range(n_changes):<br \/>\n\t\t# select a point to change<br \/>\n\t\tix = randint(0, len(updated)-1)<br \/>\n\t\t# flip the class label<br \/>\n\t\tupdated[ix] = 1 - updated[ix]<br \/>\n\treturn updated<\/p>\n<p># run a hill climb for a set of predictions<br \/>\ndef hill_climb_testset(X_test, y_test, max_iterations):<br \/>\n\tscores = list()<br \/>\n\t# generate the initial solution<br \/>\n\tsolution = random_predictions(X_test.shape[0])<br \/>\n\t# evaluate the initial solution<br \/>\n\tscore = evaluate_predictions(y_test, solution)<br \/>\n\tscores.append(score)<br \/>\n\t# hill climb to a solution<br \/>\n\tfor i in range(max_iterations):<br \/>\n\t\t# record scores<br \/>\n\t\tscores.append(score)<br \/>\n\t\t# stop once we achieve the best score<br \/>\n\t\tif score == 1.0:<br \/>\n\t\t\tbreak<br \/>\n\t\t# generate new candidate<br \/>\n\t\tcandidate = modify_predictions(solution)<br \/>\n\t\t# evaluate candidate<br \/>\n\t\tvalue = evaluate_predictions(y_test, candidate)<br \/>\n\t\t# check if it is as good or better<br \/>\n\t\tif value &gt;= score:<br \/>\n\t\t\tsolution, score = candidate, value<br \/>\n\t\t\tprint('&gt;%d, score=%.3f' % (i, score))<br \/>\n\treturn solution, scores<\/p>\n<p># load the dataset<br \/>\nX, y = load_dataset()<br \/>\nprint(X.shape, 
y.shape)<br \/>\n# split dataset into train and test sets<br \/>\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)<br \/>\nprint(X_train.shape, X_test.shape, y_train.shape, y_test.shape)<br \/>\n# run hill climb<br \/>\nyhat, scores = hill_climb_testset(X_test, y_test, 20000)<br \/>\n# plot the scores vs iterations<br \/>\npyplot.plot(scores)<br \/>\npyplot.show()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<div class=\"urvanov-syntax-highlighter-nums-content\">\n<\/div>\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># example of hill climbing the test set for a classification task<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">random <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">randint<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">datasets <\/span><span 
class=\"crayon-e\">import <\/span><span class=\"crayon-e\">make_classification<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">model_selection <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">train_test_split<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">metrics <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">accuracy_score<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">matplotlib <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-i\">pyplot<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># load or prepare the classification dataset<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">load_dataset<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">make_classification<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_samples<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">5000<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">n_features<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">20<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">n_informative<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">15<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">n_redundant<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">5<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> 
<\/span><span class=\"crayon-v\">random_state<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># evaluate a set of predictions<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">evaluate_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">yhat<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">accuracy_score<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">yhat<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># create a random set of predictions<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">random_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_examples<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-e\">randint<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">_<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span 
class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_examples<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># modify the current set of predictions<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">modify_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">current<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">n_changes<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-p\"># copy current solution<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">updated<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">current<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">copy<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">i<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_changes<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># select a point to change<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">ix<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">randint<\/span><span class=\"crayon-sy\">(<\/span><span 
class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">len<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">updated<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># flip the class label<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">updated<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">ix<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">updated<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">ix<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">updated<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># run a hill climb for a set of predictions<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">hill_climb_testset<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">max_iterations<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">scores<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">list<\/span><span 
class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-p\"># generate the initial solution<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">random_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-p\"># evaluate the initial solution<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">evaluate_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">scores<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">append<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-p\"># hill climb to a solution<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">i<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span 
class=\"crayon-v\">max_iterations<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># record scores<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">scores<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">append<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># stop once we achieve the best score<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-st\">if<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">==<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1.0<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t\t<\/span><span class=\"crayon-st\">break<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># generate new candidate<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">candidate<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">modify_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># evaluate candidate<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">value<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">evaluate_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">candidate<\/span><span 
class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># check if it is as good or better<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-st\">if<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">value<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">&gt;=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t\t<\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">candidate<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">value<\/span><\/p>\n<p><span class=\"crayon-e\">\t\t\t<\/span><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">'&gt;%d, score=%.3f'<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">%<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">i<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">scores<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># load the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-h\"> <\/span><span 
class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">load_dataset<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># split dataset into train and test sets<\/span><\/p>\n<p><span class=\"crayon-v\">X_train<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_train<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">train_test_split<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">test_size<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">0.33<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">random_state<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_train<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">,<\/span><span 
class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_train<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># run hill climb<\/span><\/p>\n<p><span class=\"crayon-v\">yhat<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">scores<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">hill_climb_testset<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">20000<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># plot the scores vs iterations<\/span><\/p>\n<p><span class=\"crayon-v\">pyplot<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">plot<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">scores<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">pyplot<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0017 seconds] --><\/p>\n<p>Running the example will run the search for 20,000 iterations or stop if a perfect accuracy is achieved.<\/p>\n<p><strong>Note<\/strong>: Your <a 
href=\"https:\/\/machinelearningmastery.com\/different-results-each-time-in-machine-learning\/\">results may vary<\/a> given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.<\/p>\n<p>In this case, we found a perfect set of predictions for the test set in about 12,900 iterations.<\/p>\n<p>Recall that this was achieved without touching the training dataset and without inspecting the test set target values directly; the search only used the score reported for each candidate set of predictions. In effect, we simply optimized a set of numbers.<\/p>\n<p>The lesson here is that repeatedly evaluating a modeling pipeline against a test set does the same thing, with you acting as the hill climbing optimization algorithm. The resulting solution will be overfit to the test set.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d48272769698\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n&#8230;<br \/>\n&gt;8092, score=0.996<br \/>\n&gt;8886, score=0.997<br \/>\n&gt;9202, score=0.998<br \/>\n&gt;9322, score=0.998<br \/>\n&gt;9521, score=0.999<br \/>\n&gt;11046, score=0.999<br \/>\n&gt;12932, score=1.000<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>&#8230;<\/p>\n<p>&gt;8092, score=0.996<\/p>\n<p>&gt;8886, score=0.997<\/p>\n<p>&gt;9202, score=0.998<\/p>\n<p>&gt;9322, score=0.998<\/p>\n<p>&gt;9521, score=0.999<\/p>\n<p>&gt;11046, score=0.999<\/p>\n<p>&gt;12932, 
score=1.000<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0000 seconds] --><\/p>\n<p>A plot is also created of the progress of the optimization.<\/p>\n<p>This plot helps show how changes to the optimization algorithm, such as the choice of what to change and how it is changed during the hill climb, impact the convergence of the search.<\/p>\n<div id=\"attachment_10971\" class=\"wp-caption aligncenter\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-10971\" loading=\"lazy\" class=\"size-full wp-image-10971\" src=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/05\/Line-Plot-of-Accuracy-vs-Hill-Climb-Optimization-Iteration-for-a-Classification-Task.png\" alt=\"Line Plot of Accuracy vs. Hill Climb Optimization Iteration for a Classification Task\" width=\"1280\" height=\"960\"><\/p>\n<p id=\"caption-attachment-10971\" class=\"wp-caption-text\">Line Plot of Accuracy vs. Hill Climb Optimization Iteration for a Classification Task<\/p>\n<\/div>\n<p>Now that we are familiar with hill climbing the test set, let\u2019s try the approach on a real dataset.<\/p>\n<h2>Hill Climb Diabetes Classification Dataset<\/h2>\n<p>We will use the diabetes dataset as the basis for exploring hill climbing the test set for a classification problem.<\/p>\n<p>Each record describes the medical details of a female patient, and the prediction target is the onset of diabetes within the next five years.<\/p>\n<p>The dataset has eight input variables and 768 rows of data; the input variables are all numeric and the target has two class labels, i.e. 
it is a binary classification task.<\/p>\n<p>A sample of the first five rows of the dataset is shown below.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d4c408685937\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n6,148,72,35,0,33.6,0.627,50,1<br \/>\n1,85,66,29,0,26.6,0.351,31,0<br \/>\n8,183,64,0,0,23.3,0.672,32,1<br \/>\n1,89,66,23,94,28.1,0.167,21,0<br \/>\n0,137,40,35,168,43.1,2.288,33,1<br \/>\n&#8230;<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>6,148,72,35,0,33.6,0.627,50,1<\/p>\n<p>1,85,66,29,0,26.6,0.351,31,0<\/p>\n<p>8,183,64,0,0,23.3,0.672,32,1<\/p>\n<p>1,89,66,23,94,28.1,0.167,21,0<\/p>\n<p>0,137,40,35,168,43.1,2.288,33,1<\/p>\n<p>&#8230;<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0000 seconds] --><\/p>\n<p>We can load the dataset directly using Pandas, as follows.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d4d678072270\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# load or prepare the classification dataset<br \/>\ndef load_dataset():<br \/>\n\turl = 
'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/pima-indians-diabetes.csv'<br \/>\n\tdf = read_csv(url, header=None)<br \/>\n\tdata = df.values<br \/>\n\treturn data[:, :-1], data[:, -1]<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># load or prepare the classification dataset<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">load_dataset<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/pima-indians-diabetes.csv'<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">df<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">read_csv<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">header<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">None<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">df<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">values<\/span><\/p>\n<p><span class=\"crayon-e\">\t<\/span><span class=\"crayon-st\">return<\/span><span 
class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0002 seconds] --><\/p>\n<p>The rest of the code remains unchanged.<\/p>\n<p>The example is structured this way so that you can drop in your own binary classification task and try it out.<\/p>\n<p>The complete example is listed below.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d4e766566552\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# example of hill climbing the test set for the diabetes dataset<br \/>\nfrom random import randint<br \/>\nfrom pandas import read_csv<br \/>\nfrom sklearn.model_selection import train_test_split<br \/>\nfrom sklearn.metrics import accuracy_score<br \/>\nfrom matplotlib import pyplot<\/p>\n<p># load or prepare the classification dataset<br \/>\ndef load_dataset():<br \/>\n\turl = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/pima-indians-diabetes.csv'<br \/>\n\tdf = read_csv(url, header=None)<br \/>\n\tdata = 
df.values<br \/>\n\treturn data[:, :-1], data[:, -1]<\/p>\n<p># evaluate a set of predictions<br \/>\ndef evaluate_predictions(y_test, yhat):<br \/>\n\treturn accuracy_score(y_test, yhat)<\/p>\n<p># create a random set of predictions<br \/>\ndef random_predictions(n_examples):<br \/>\n\treturn [randint(0, 1) for _ in range(n_examples)]<\/p>\n<p># modify the current set of predictions<br \/>\ndef modify_predictions(current, n_changes=1):<br \/>\n\t# copy current solution<br \/>\n\tupdated = current.copy()<br \/>\n\tfor i in range(n_changes):<br \/>\n\t\t# select a point to change<br \/>\n\t\tix = randint(0, len(updated)-1)<br \/>\n\t\t# flip the class label<br \/>\n\t\tupdated[ix] = 1 - updated[ix]<br \/>\n\treturn updated<\/p>\n<p># run a hill climb for a set of predictions<br \/>\ndef hill_climb_testset(X_test, y_test, max_iterations):<br \/>\n\tscores = list()<br \/>\n\t# generate the initial solution<br \/>\n\tsolution = random_predictions(X_test.shape[0])<br \/>\n\t# evaluate the initial solution<br \/>\n\tscore = evaluate_predictions(y_test, solution)<br \/>\n\tscores.append(score)<br \/>\n\t# hill climb to a solution<br \/>\n\tfor i in range(max_iterations):<br \/>\n\t\t# record scores<br \/>\n\t\tscores.append(score)<br \/>\n\t\t# stop once we achieve the best score<br \/>\n\t\tif score == 1.0:<br \/>\n\t\t\tbreak<br \/>\n\t\t# generate new candidate<br \/>\n\t\tcandidate = modify_predictions(solution)<br \/>\n\t\t# evaluate candidate<br \/>\n\t\tvalue = evaluate_predictions(y_test, candidate)<br \/>\n\t\t# check if it is as good or better<br \/>\n\t\tif value &gt;= score:<br \/>\n\t\t\tsolution, score = candidate, value<br \/>\n\t\t\tprint('&gt;%d, score=%.3f' % (i, score))<br \/>\n\treturn solution, scores<\/p>\n<p># load the dataset<br \/>\nX, y = load_dataset()<br \/>\nprint(X.shape, y.shape)<br \/>\n# split dataset into train and test sets<br \/>\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, 
random_state=1)<br \/>\nprint(X_train.shape, X_test.shape, y_train.shape, y_test.shape)<br \/>\n# run hill climb<br \/>\nyhat, scores = hill_climb_testset(X_test, y_test, 5000)<br \/>\n# plot the scores vs iterations<br \/>\npyplot.plot(scores)<br \/>\npyplot.show()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># example of hill climbing the test set for the diabetes dataset<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">random <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">randint<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">pandas <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">read_csv<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span 
class=\"crayon-e\">model_selection <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">train_test_split<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">metrics <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">accuracy_score<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">matplotlib <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-i\">pyplot<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># load or prepare the classification dataset<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">load_dataset<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">&#8216;https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/pima-indians-diabetes.csv&#8217;<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">df<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">read_csv<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">header<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">None<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">df<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">values<\/span><\/p>\n<p><span 
class=\"crayon-e\">\t<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># evaluate a set of predictions<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">evaluate_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">yhat<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">accuracy_score<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">yhat<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># create a random set of predictions<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">random_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_examples<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span 
class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-e\">randint<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">_<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_examples<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># modify the current set of predictions<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">modify_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">current<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">n_changes<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-p\"># copy current solution<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">updated<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">current<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">copy<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">i<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span 
class=\"crayon-h\"> <\/span><span class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_changes<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># select a point to change<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">ix<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">randint<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">len<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">updated<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># flip the class label<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">updated<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">ix<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">updated<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">ix<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">updated<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># run a hill climb for a set of predictions<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">hill_climb_testset<\/span><span 
class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">max_iterations<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">scores<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">list<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-p\"># generate the initial solution<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">random_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-p\"># evaluate the initial solution<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">evaluate_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">scores<\/span><span class=\"crayon-sy\">.<\/span><span 
class=\"crayon-e\">append<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-p\"># hill climb to a solution<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">i<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">max_iterations<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># record scores<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">scores<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">append<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># stop once we achieve the best score<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-st\">if<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">==<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1.0<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t\t<\/span><span class=\"crayon-st\">break<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># generate new candidate<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">candidate<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">modify_predictions<\/span><span class=\"crayon-sy\">(<\/span><span 
class=\"crayon-v\">solution<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># evaluate candidate<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">value<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">evaluate_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">candidate<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># check if it is as good or better<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-st\">if<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">value<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">&gt;=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t\t<\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">candidate<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">value<\/span><\/p>\n<p><span class=\"crayon-e\">\t\t\t<\/span><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8216;&gt;%d, score=%.3f&#8217;<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">%<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">i<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span 
class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">scores<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># load the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">load_dataset<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># split dataset into train and test sets<\/span><\/p>\n<p><span class=\"crayon-v\">X_train<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_train<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">train_test_split<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">,<\/span><span 
class=\"crayon-h\"> <\/span><span class=\"crayon-v\">test_size<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">0.33<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">random_state<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_train<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_train<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># run hill climb<\/span><\/p>\n<p><span class=\"crayon-v\">yhat<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">scores<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">hill_climb_testset<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">5000<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># plot the scores vs iterations<\/span><\/p>\n<p><span class=\"crayon-v\">pyplot<\/span><span class=\"crayon-sy\">.<\/span><span 
class=\"crayon-e\">plot<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">scores<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">pyplot<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0017 seconds] --><\/p>\n<p>Running the example reports the iteration number and accuracy each time an improvement is seen during the search.<\/p>\n<p>We use fewer iterations in this case because it is a simpler problem to optimize as we have fewer predictions to make.<\/p>\n<p><strong>Note<\/strong>: Your <a href=\"https:\/\/machinelearningmastery.com\/different-results-each-time-in-machine-learning\/\">results may vary<\/a> given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.<\/p>\n<p>In this case, we can see that we achieved perfect accuracy in about 1,500 iterations.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d4f510238919\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n&#8230;<br \/>\n&gt;617, score=0.961<br \/>\n&gt;627, score=0.965<br \/>\n&gt;650, score=0.969<br \/>\n&gt;683, score=0.972<br \/>\n&gt;743, score=0.976<br \/>\n&gt;803, score=0.980<br \/>\n&gt;817, score=0.984<br \/>\n&gt;945, score=0.988<br \/>\n&gt;1350, score=0.992<br \/>\n&gt;1387, score=0.996<br \/>\n&gt;1565, score=1.000<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr 
class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>&#8230;<\/p>\n<p>&gt;617, score=0.961<\/p>\n<p>&gt;627, score=0.965<\/p>\n<p>&gt;650, score=0.969<\/p>\n<p>&gt;683, score=0.972<\/p>\n<p>&gt;743, score=0.976<\/p>\n<p>&gt;803, score=0.980<\/p>\n<p>&gt;817, score=0.984<\/p>\n<p>&gt;945, score=0.988<\/p>\n<p>&gt;1350, score=0.992<\/p>\n<p>&gt;1387, score=0.996<\/p>\n<p>&gt;1565, score=1.000<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0000 seconds] --><\/p>\n<p>A line plot of the search progress is also created showing that convergence was rapid.<\/p>\n<div id=\"attachment_10972\" class=\"wp-caption aligncenter\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-10972\" loading=\"lazy\" class=\"size-full wp-image-10972\" src=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/05\/Line-Plot-of-Accuracy-vs-Hill-Climb-Optimization-Iteration-for-the-Diabetes-Dataset.png\" alt=\"Line Plot of Accuracy vs. Hill Climb Optimization Iteration for the Diabetes Dataset\" width=\"1280\" height=\"960\"><\/p>\n<p id=\"caption-attachment-10972\" class=\"wp-caption-text\">Line Plot of Accuracy vs. Hill Climb Optimization Iteration for the Diabetes Dataset<\/p>\n<\/div>\n<h2>Hill Climb Housing Regression Dataset<\/h2>\n<p>We will use the housing dataset as the basis for exploring hill climbing the test set on a regression problem.<\/p>\n<p>The housing dataset involves the prediction of a house price in thousands of dollars given details of the house and its neighborhood.<\/p>\n<p>It is a regression problem, meaning we are predicting a numerical value. 
There are 506 observations with 13 input variables and one output variable.<\/p>\n<p>A sample of the first five rows is listed below.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d50822985995\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n0.00632,18.00,2.310,0,0.5380,6.5750,65.20,4.0900,1,296.0,15.30,396.90,4.98,24.00<br \/>\n0.02731,0.00,7.070,0,0.4690,6.4210,78.90,4.9671,2,242.0,17.80,396.90,9.14,21.60<br \/>\n0.02729,0.00,7.070,0,0.4690,7.1850,61.10,4.9671,2,242.0,17.80,392.83,4.03,34.70<br \/>\n0.03237,0.00,2.180,0,0.4580,6.9980,45.80,6.0622,3,222.0,18.70,394.63,2.94,33.40<br \/>\n0.06905,0.00,2.180,0,0.4580,7.1470,54.20,6.0622,3,222.0,18.70,396.90,5.33,36.20<br \/>\n&#8230;<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>0.00632,18.00,2.310,0,0.5380,6.5750,65.20,4.0900,1,296.0,15.30,396.90,4.98,24.00<\/p>\n<p>0.02731,0.00,7.070,0,0.4690,6.4210,78.90,4.9671,2,242.0,17.80,396.90,9.14,21.60<\/p>\n<p>0.02729,0.00,7.070,0,0.4690,7.1850,61.10,4.9671,2,242.0,17.80,392.83,4.03,34.70<\/p>\n<p>0.03237,0.00,2.180,0,0.4580,6.9980,45.80,6.0622,3,222.0,18.70,394.63,2.94,33.40<\/p>\n<p>0.06905,0.00,2.180,0,0.4580,7.1470,54.20,6.0622,3,222.0,18.70,396.90,5.33,36.20<\/p>\n<p>&#8230;<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0000 seconds] --><\/p>\n<p>First, we can update the <em>load_dataset()<\/em> function to load the housing dataset.<\/p>\n<p>As part of loading the dataset, 
we will normalize the target value. This will make hill climbing the predictions simpler as we can limit the floating-point values to the range 0 to 1.<\/p>\n<p>This is not generally required; it is just the approach taken here to simplify the search algorithm.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d51709177452\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# load or prepare the regression dataset<br \/>\ndef load_dataset():<br \/>\n\turl = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/housing.csv'<br \/>\n\tdf = read_csv(url, header=None)<br \/>\n\tdata = df.values<br \/>\n\tX, y = data[:, :-1], data[:, -1]<br \/>\n\t# normalize the target<br \/>\n\tscaler = MinMaxScaler()<br \/>\n\ty = y.reshape((len(y), 1))<br \/>\n\ty = scaler.fit_transform(y)<br \/>\n\treturn X, y<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># load or prepare the regression dataset<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">load_dataset<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span 
class=\"crayon-s\">&#8216;https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/housing.csv&#8217;<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">df<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">read_csv<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">header<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">None<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">df<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-i\">values<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-p\"># normalize the target<\/span><\/p>\n<p><span 
class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">scaler<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">MinMaxScaler<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">reshape<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-e\">len<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">scaler<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">fit_transform<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0004 seconds] --><\/p>\n<p>Next, we can update the scoring function to use the mean absolute error between the expected and predicted values.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d52289778518\" 
class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# evaluate a set of predictions<br \/>\ndef evaluate_predictions(y_test, yhat):<br \/>\n\treturn mean_absolute_error(y_test, yhat)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># evaluate a set of predictions<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">evaluate_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">yhat<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">mean_absolute_error<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">yhat<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0001 seconds] --><\/p>\n<p>We must also update the representation for a solution from 0 and 1 labels to floating-point values between 0 and 1.<\/p>\n<p>The generation of the initial candidate solution must be changed to create a list of random floats.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d53812809392\" 
class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# create a random set of predictions<br \/>\ndef random_predictions(n_examples):<br \/>\n\treturn [random() for _ in range(n_examples)]<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># create a random set of predictions<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">random_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_examples<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-e\">random<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">_<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_examples<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0001 seconds] --><\/p>\n<p>The single change made to a solution to create a new candidate solution, in this case, involves simply replacing a randomly chosen prediction 
in the list with a new random float.<\/p>\n<p>I chose this because it was simple.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d54823193759\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# modify the current set of predictions<br \/>\ndef modify_predictions(current, n_changes=1):<br \/>\n\t# copy current solution<br \/>\n\tupdated = current.copy()<br \/>\n\tfor i in range(n_changes):<br \/>\n\t\t# select a point to change<br \/>\n\t\tix = randint(0, len(updated)-1)<br \/>\n\t\t# replace the prediction with a new random float<br \/>\n\t\tupdated[ix] = random()<br \/>\n\treturn updated<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># modify the current set of predictions<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">modify_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">current<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">n_changes<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-p\"># copy current solution<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">updated<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span 
class=\"crayon-v\">current<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">copy<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">i<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_changes<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># select a point to change<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">ix<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">randint<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">len<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">updated<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># replace the prediction with a new random float<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">updated<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">ix<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">random<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span 
class=\"crayon-v\">updated<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0003 seconds] --><\/p>\n<p>A better approach would be to add Gaussian noise to an existing value, so that each candidate is a small perturbation of the current prediction; I leave this to you as an extension. If you try it, let me know in the comments below.<\/p>\n<p>For example:<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d55626940669\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n...<br \/>\n# add gaussian noise (requires: from random import gauss)<br \/>\nupdated[ix] += gauss(0, 0.1)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># add gaussian noise (requires: from random import gauss)<\/span><\/p>\n<p><span class=\"crayon-v\">updated<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">ix<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">+=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">gauss<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">0.1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0001 seconds] --><\/p>\n<p>Finally, the search must be updated.<\/p>\n<p>The best value is now an error of 0.0, 
which is used to stop the search if it is found.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d56027982432\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n...<br \/>\n# stop once we achieve the best score<br \/>\nif score == 0.0:<br \/>\n\tbreak<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># stop once we achieve the best score<\/span><\/p>\n<p><span class=\"crayon-st\">if<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">==<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">0.0<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">break<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0001 seconds] --><\/p>\n<p>We also need to change the search from maximizing the score to minimizing it, because a lower error is better.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d57885409750\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain 
print-no\" data-settings=\"dblclick\" readonly><br \/>\n...<br \/>\n# check if it is as good or better<br \/>\nif value &lt;= score:<br \/>\n\tsolution, score = candidate, value<br \/>\n\tprint('&gt;%d, score=%.3f' % (i, score))<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># check if it is as good or better<\/span><\/p>\n<p><span class=\"crayon-st\">if<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">value<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">&lt;=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">candidate<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">value<\/span><\/p>\n<p><span class=\"crayon-e\">\t<\/span><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">'&gt;%d, score=%.3f'<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">%<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">i<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-sy\">)<\/span><span 
class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0002 seconds] --><\/p>\n<p>The updated search function with both of these changes is listed below.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d58728176099\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# run a hill climb for a set of predictions<br \/>\ndef hill_climb_testset(X_test, y_test, max_iterations):<br \/>\n\tscores = list()<br \/>\n\t# generate the initial solution<br \/>\n\tsolution = random_predictions(X_test.shape[0])<br \/>\n\t# evaluate the initial solution<br \/>\n\tscore = evaluate_predictions(y_test, solution)<br \/>\n\tprint('&gt;%.3f' % score)<br \/>\n\t# hill climb to a solution<br \/>\n\tfor i in range(max_iterations):<br \/>\n\t\t# record scores<br \/>\n\t\tscores.append(score)<br \/>\n\t\t# stop once we achieve the best score<br \/>\n\t\tif score == 0.0:<br \/>\n\t\t\tbreak<br \/>\n\t\t# generate new candidate<br \/>\n\t\tcandidate = modify_predictions(solution)<br \/>\n\t\t# evaluate candidate<br \/>\n\t\tvalue = evaluate_predictions(y_test, candidate)<br \/>\n\t\t# check if it is as good or better<br \/>\n\t\tif value &lt;= score:<br \/>\n\t\t\tsolution, score = candidate, value<br \/>\n\t\t\tprint('&gt;%d, score=%.3f' % (i, score))<br \/>\n\treturn solution, scores<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<div 
class=\"urvanov-syntax-highlighter-nums-content\">\n<\/div>\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># run a hill climb for a set of predictions<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">hill_climb_testset<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">max_iterations<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">scores<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">list<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-p\"># generate the initial solution<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">random_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-p\"># evaluate the initial 
solution<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">evaluate_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">'&gt;%.3f'<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">%<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-p\"># hill climb to a solution<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">i<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">max_iterations<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># record scores<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">scores<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">append<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># stop once we achieve the best score<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-st\">if<\/span><span class=\"crayon-h\"> <\/span><span 
class=\"crayon-v\">score<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">==<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">0.0<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t\t<\/span><span class=\"crayon-st\">break<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># generate new candidate<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">candidate<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">modify_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># evaluate candidate<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">value<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">evaluate_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">candidate<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># check if it is as good or better<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-st\">if<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">value<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">&lt;=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t\t<\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-h\"> <\/span><span 
class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">candidate<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">value<\/span><\/p>\n<p><span class=\"crayon-e\">\t\t\t<\/span><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">'&gt;%d, score=%.3f'<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">%<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">i<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">scores<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0006 seconds] --><\/p>\n<p>Tying this together, the complete example of hill climbing the test set for a regression task is listed below.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d59992126266\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# example of hill climbing the test set for the housing dataset<br \/>\nfrom random import random<br \/>\nfrom random import randint<br \/>\nfrom pandas import read_csv<br \/>\nfrom sklearn.model_selection import train_test_split<br \/>\nfrom sklearn.metrics import mean_absolute_error<br \/>\nfrom sklearn.preprocessing 
import MinMaxScaler<br \/>\nfrom matplotlib import pyplot<\/p>\n<p># load or prepare the regression dataset<br \/>\ndef load_dataset():<br \/>\n\turl = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/housing.csv'<br \/>\n\tdf = read_csv(url, header=None)<br \/>\n\tdata = df.values<br \/>\n\tX, y = data[:, :-1], data[:, -1]<br \/>\n\t# normalize the target<br \/>\n\tscaler = MinMaxScaler()<br \/>\n\ty = y.reshape((len(y), 1))<br \/>\n\ty = scaler.fit_transform(y)<br \/>\n\treturn X, y<\/p>\n<p># evaluate a set of predictions<br \/>\ndef evaluate_predictions(y_test, yhat):<br \/>\n\treturn mean_absolute_error(y_test, yhat)<\/p>\n<p># create a random set of predictions<br \/>\ndef random_predictions(n_examples):<br \/>\n\treturn [random() for _ in range(n_examples)]<\/p>\n<p># modify the current set of predictions<br \/>\ndef modify_predictions(current, n_changes=1):<br \/>\n\t# copy current solution<br \/>\n\tupdated = current.copy()<br \/>\n\tfor i in range(n_changes):<br \/>\n\t\t# select a point to change<br \/>\n\t\tix = randint(0, len(updated)-1)<br \/>\n\t\t# replace the prediction with a new random float<br \/>\n\t\tupdated[ix] = random()<br \/>\n\treturn updated<\/p>\n<p># run a hill climb for a set of predictions<br \/>\ndef hill_climb_testset(X_test, y_test, max_iterations):<br \/>\n\tscores = list()<br \/>\n\t# generate the initial solution<br \/>\n\tsolution = random_predictions(X_test.shape[0])<br \/>\n\t# evaluate the initial solution<br \/>\n\tscore = evaluate_predictions(y_test, solution)<br \/>\n\tprint('&gt;%.3f' % score)<br \/>\n\t# hill climb to a solution<br \/>\n\tfor i in range(max_iterations):<br \/>\n\t\t# record scores<br \/>\n\t\tscores.append(score)<br \/>\n\t\t# stop once we achieve the best score<br \/>\n\t\tif score == 0.0:<br \/>\n\t\t\tbreak<br \/>\n\t\t# generate new candidate<br \/>\n\t\tcandidate = modify_predictions(solution)<br \/>\n\t\t# evaluate candidate<br \/>\n\t\tvalue = evaluate_predictions(y_test, 
candidate)<br \/>\n\t\t# check if it is as good or better<br \/>\n\t\tif value &lt;= score:<br \/>\n\t\t\tsolution, score = candidate, value<br \/>\n\t\t\tprint('&gt;%d, score=%.3f' % (i, score))<br \/>\n\treturn solution, scores<\/p>\n<p># load the dataset<br \/>\nX, y = load_dataset()<br \/>\nprint(X.shape, y.shape)<br \/>\n# split dataset into train and test sets<br \/>\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)<br \/>\nprint(X_train.shape, X_test.shape, y_train.shape, y_test.shape)<br \/>\n# run hill climb<br \/>\nyhat, scores = hill_climb_testset(X_test, y_test, 100000)<br \/>\n# plot the scores vs iterations<br \/>\npyplot.plot(scores)<br \/>\npyplot.show()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<div class=\"urvanov-syntax-highlighter-nums-content\">\n<\/div>\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span 
class=\"crayon-p\"># example of hill climbing the test set for the housing dataset<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">random <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">random<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">random <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">randint<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">pandas <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">read_csv<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">model_selection <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">train_test_split<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">metrics <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">mean_absolute_error<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">preprocessing <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">MinMaxScaler<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">matplotlib <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-i\">pyplot<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># load or prepare the regression dataset<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">load_dataset<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span 
class=\"crayon-h\"> <\/span><span class=\"crayon-s\">'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/housing.csv'<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">df<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">read_csv<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">header<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">None<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">df<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-i\">values<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-p\"># normalize the 
target<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">scaler<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">MinMaxScaler<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">reshape<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-e\">len<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">scaler<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">fit_transform<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">y<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># evaluate a set of predictions<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">evaluate_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span 
class=\"crayon-v\">yhat<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">mean_absolute_error<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">yhat<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># create a random set of predictions<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">random_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_examples<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-e\">random<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">_<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_examples<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># modify the current set of predictions<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">modify_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">current<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">n_changes<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span 
class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-p\"># copy current solution<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">updated<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">current<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">copy<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">i<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_changes<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># select a point to change<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">ix<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">randint<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">len<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">updated<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># replace the prediction with a new random value<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">updated<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">ix<\/span><span class=\"crayon-sy\">]<\/span><span 
class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">random<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">updated<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># run a hill climb for a set of predictions<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">hill_climb_testset<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">max_iterations<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">scores<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">list<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-p\"># generate the initial solution<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">random_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-p\"># evaluate the initial solution<\/span><\/p>\n<p><span 
class=\"crayon-h\">\t<\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">evaluate_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">'&gt;%.3f'<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">%<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-p\"># hill climb to a solution<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">i<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">max_iterations<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># record scores<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">scores<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">append<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># stop once we achieve the best score<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-st\">if<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-h\"> 
<\/span><span class=\"crayon-o\">==<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">0.0<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t\t<\/span><span class=\"crayon-st\">break<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># generate new candidate<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">candidate<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">modify_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># evaluate candidate<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-v\">value<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">evaluate_predictions<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">candidate<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-p\"># check if it is as good or better<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t<\/span><span class=\"crayon-st\">if<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">value<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">&lt;=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t\t\t<\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> 
<\/span><span class=\"crayon-v\">candidate<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">value<\/span><\/p>\n<p><span class=\"crayon-e\">\t\t\t<\/span><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">'&gt;%d, score=%.3f'<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">%<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">i<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">solution<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">scores<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># load the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">load_dataset<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># split dataset into train and test sets<\/span><\/p>\n<p><span class=\"crayon-v\">X_train<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> 
<\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_train<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">train_test_split<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">test_size<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">0.33<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">random_state<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_train<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_train<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># run hill climb<\/span><\/p>\n<p><span class=\"crayon-v\">yhat<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">scores<\/span><span 
class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">hill_climb_testset<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">100000<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># plot the scores vs iterations<\/span><\/p>\n<p><span class=\"crayon-v\">pyplot<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">plot<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">scores<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">pyplot<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p>Running the example reports the iteration number and MAE each time a candidate that is as good as or better than the current solution is accepted during the search.<\/p>\n<p>We use many more iterations in this case because it is a harder problem to optimize: each prediction is a real value rather than one of two class labels. Replacing a prediction with an entirely new random value, rather than adjusting the existing one, also makes the search slower and makes a perfect error less likely.<\/p>\n<p>In fact, we are unlikely to ever reach an error of exactly zero; instead, it would be better to stop once the error falls below a small threshold, such as 1e-7 or a value that is meaningful in the target domain. 
This, too, is left as an exercise for the reader.<\/p>\n<p>For example, the stopping condition in the search loop could become:<\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d5a190999233\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n&#8230;<br \/>\n# stop once we achieve a good enough score<br \/>\nif score &lt;= 1e-7:<br \/>\n\tbreak<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># stop once we achieve a good enough score<\/span><\/p>\n<p><span class=\"crayon-st\">if<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">score<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">&lt;=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1e<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">7<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\t<\/span><span class=\"crayon-st\">break<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><strong>Note<\/strong>: Your <a href=\"https:\/\/machinelearningmastery.com\/different-results-each-time-in-machine-learning\/\">results may vary<\/a> given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. 
Consider running the example a few times and compare the average outcome.<\/p>\n<p>In this case, we can see that we achieved a low error by the end of the run: an MAE of about 0.001 on the normalized target.<\/p>\n<div id=\"urvanov-syntax-highlighter-5f6cf6f6b7d5b456242069\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-mac print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n&#8230;<br \/>\n&gt;95991, score=0.001<br \/>\n&gt;96011, score=0.001<br \/>\n&gt;96295, score=0.001<br \/>\n&gt;96366, score=0.001<br \/>\n&gt;96585, score=0.001<br \/>\n&gt;97575, score=0.001<br \/>\n&gt;98828, score=0.001<br \/>\n&gt;98947, score=0.001<br \/>\n&gt;99712, score=0.001<br \/>\n&gt;99913, score=0.001<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>&#8230;<\/p>\n<p>&gt;95991, score=0.001<\/p>\n<p>&gt;96011, score=0.001<\/p>\n<p>&gt;96295, score=0.001<\/p>\n<p>&gt;96366, score=0.001<\/p>\n<p>&gt;96585, score=0.001<\/p>\n<p>&gt;97575, score=0.001<\/p>\n<p>&gt;98828, score=0.001<\/p>\n<p>&gt;98947, score=0.001<\/p>\n<p>&gt;99712, score=0.001<\/p>\n<p>&gt;99913, score=0.001<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p>A line plot of the search progress is also created, showing that the error falls rapidly at first and then stays nearly flat for most of the iterations.<\/p>\n<div id=\"attachment_10973\" class=\"wp-caption aligncenter\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-10973\" loading=\"lazy\" class=\"size-full wp-image-10973\" 
src=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/05\/Line-Plot-of-Accuracy-vs-Hill-Climb-Optimization-Iteration-for-the-Housing-Dataset.png\" alt=\"Line Plot of MAE vs. Hill Climb Optimization Iteration for the Housing Dataset\" width=\"1280\" height=\"960\"><\/p>\n<p id=\"caption-attachment-10973\" class=\"wp-caption-text\">Line Plot of MAE vs. Hill Climb Optimization Iteration for the Housing Dataset<\/p>\n<\/div>\n<h2>Summary<\/h2>\n<p>In this tutorial, you discovered how to hill climb the test set for machine learning.<\/p>\n<p>Specifically, you learned:<\/p>\n<ul>\n<li>Perfect predictions can be made by hill climbing the test set without even looking at the training dataset.<\/li>\n<li>How to hill climb the test set for classification and regression tasks.<\/li>\n<li>We implicitly hill climb the test set when we overuse the test set to evaluate our modeling pipelines.<\/li>\n<\/ul>\n<p><strong>Do you have any questions?<\/strong><br \/>Ask your questions in the comments below and I will do my best to answer.<\/p>\n<\/div>\n","protected":false},
"excerpt":{"rendered":"<p>https:\/\/machinelearningmastery.com\/hill-climb-the-test-set-for-machine-learning\/<\/p>\n","protected":false},"author":0,"featured_media":275,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/274"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=274"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/274\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/275"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=274"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=274"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=274"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}