{"id":592,"date":"2020-11-21T02:55:56","date_gmt":"2020-11-21T02:55:56","guid":{"rendered":"https:\/\/machine-learning.webcloning.com\/2020\/11\/21\/a-gentle-introduction-to-pycaret-for-machine-learning\/"},"modified":"2020-11-21T02:55:56","modified_gmt":"2020-11-21T02:55:56","slug":"a-gentle-introduction-to-pycaret-for-machine-learning","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2020\/11\/21\/a-gentle-introduction-to-pycaret-for-machine-learning\/","title":{"rendered":"A Gentle Introduction to PyCaret for Machine Learning"},"content":{"rendered":"<div id=\"\">\n<p><strong>PyCaret<\/strong> is a Python open source machine learning library designed to make performing standard tasks in a machine learning project easy.<\/p>\n<p>It is a Python version of the Caret machine learning package in R, popular because it allows models to be evaluated, compared, and tuned on a given dataset with just a few lines of code.<\/p>\n<p>The PyCaret library provides these features, allowing the machine learning practitioner in Python to spot check a suite of standard machine learning algorithms on a classification or regression dataset with a single function call.<\/p>\n<p>In this tutorial, you will discover the PyCaret Python open source library for machine learning.<\/p>\n<p>After completing this tutorial, you will know:<\/p>\n<ul>\n<li>PyCaret is a Python version of the popular and widely used caret machine learning package in R.<\/li>\n<li>How to use PyCaret to easily evaluate and compare standard machine learning models on a dataset.<\/li>\n<li>How to use PyCaret to easily tune the hyperparameters of a well-performing machine learning model.<\/li>\n<\/ul>\n<p>Let\u2019s get started.<\/p>\n<div id=\"attachment_11862\" class=\"wp-caption aligncenter\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-11862\" loading=\"lazy\" class=\"size-full wp-image-11862\" 
src=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2021\/03\/A-Gentle-Introduction-to-PyCaret-for-Machine-Learning.jpg\" alt=\"A Gentle Introduction to PyCaret for Machine Learning\" width=\"799\" height=\"383\"><\/p>\n<p id=\"caption-attachment-11862\" class=\"wp-caption-text\">A Gentle Introduction to PyCaret for Machine Learning<br \/>Photo by <a href=\"https:\/\/www.flickr.com\/photos\/photommo\/35017076361\/\">Thomas<\/a>, some rights reserved.<\/p>\n<\/div>\n<h2>Tutorial Overview<\/h2>\n<p>This tutorial is divided into four parts; they are:<\/p>\n<ol>\n<li>What Is PyCaret?<\/li>\n<li>Sonar Dataset<\/li>\n<li>Comparing Machine Learning Models<\/li>\n<li>Tuning Machine Learning Models<\/li>\n<\/ol>\n<h2>What Is PyCaret?<\/h2>\n<p><a href=\"https:\/\/pycaret.org\/\">PyCaret<\/a> is an open source Python machine learning library inspired by the <a href=\"https:\/\/topepo.github.io\/caret\/\">caret R package<\/a>.<\/p>\n<p>The goal of the caret package is to automate the major steps for evaluating and comparing machine learning algorithms for classification and regression. The main benefit of the library is that a lot can be achieved with very few lines of code and little manual configuration. The PyCaret library brings these capabilities to Python.<\/p>\n<blockquote>\n<p>PyCaret is an open-source, low-code machine learning library in Python that aims to reduce the cycle time from hypothesis to insights. 
It is well suited for seasoned data scientists who want to increase the productivity of their ML experiments by using PyCaret in their workflows or for citizen data scientists and those new to data science with little or no background in coding.<\/p>\n<\/blockquote>\n<p>\u2014 <a href=\"https:\/\/pycaret.org\/\">PyCaret Homepage<\/a><\/p>\n<p>The PyCaret library automates many steps of a machine learning project, such as:<\/p>\n<ul>\n<li>Defining the data transforms to perform (<em>setup()<\/em>)<\/li>\n<li>Evaluating and comparing standard models (<em>compare_models()<\/em>)<\/li>\n<li>Tuning model hyperparameters (<em>tune_model()<\/em>)<\/li>\n<\/ul>\n<p>It also provides many more features, including creating ensembles, saving models, and deploying models.<\/p>\n<p>The PyCaret library has a wealth of documentation for using the API; you can get started here:<\/p>\n<p>We will not explore all of the features of the library in this tutorial; instead, we will focus on simple machine learning model comparison and hyperparameter tuning.<\/p>\n<p>You can install PyCaret using your Python package manager, such as pip. 
For example:<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<p><!-- [Format Time: 0.0001 seconds] --><\/p>\n<p>Once installed, we can confirm that the library is available in your development environment and is working correctly by printing the installed version.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5fb881455bab5536374940\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# check pycaret version<br \/>\nimport pycaret<br \/>\nprint('PyCaret: %s' % pycaret.__version__)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># check pycaret version<\/span><\/p>\n<p><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">pycaret<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">'PyCaret: %s'<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">%<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pycaret<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">__version__<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0002 seconds] --><\/p>\n<p>Running the example will load the PyCaret library and print the installed version number.<\/p>\n<p>Your version number should be the same or higher.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<p><!-- [Format Time: 0.0000 
seconds] --><\/p>\n<p>If you need help installing PyCaret for your system, you can see the installation instructions here:<\/p>\n<p>Now that we are familiar with what PyCaret is, let\u2019s explore how we might use it on a machine learning project.<\/p>\n<h2>Sonar Dataset<\/h2>\n<p>We will use the Sonar dataset, a standard binary classification dataset. You can learn more about it here:<\/p>\n<p>We can download the dataset directly from the URL and load it as a Pandas DataFrame.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5fb881455bab7344605961\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n...<br \/>\n# define the location of the dataset<br \/>\nurl = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/sonar.csv'<br \/>\n# load the dataset<br \/>\ndf = read_csv(url, header=None)<br \/>\n# summarize the shape of the dataset<br \/>\nprint(df.shape)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># define the location of the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">url<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/sonar.csv'<\/span><\/p>\n<p><span class=\"crayon-p\"># load the 
dataset<\/span><\/p>\n<p><span class=\"crayon-v\">df<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">read_csv<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">header<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">None<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># summarize the shape of the dataset<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">df<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0002 seconds] --><\/p>\n<p>PyCaret appears to require that a dataset has column names, and ours does not, so we can use each column's number as its name.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5fb881455bab8883924241\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n...<br \/>\n# set column names as the column number<br \/>\nn_cols = df.shape[1]<br \/>\ndf.columns = [str(i) for i in range(n_cols)]<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span 
class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># set column names as the column number<\/span><\/p>\n<p><span class=\"crayon-v\">n_cols<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">df<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-v\">df<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">columns<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-e\">str<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">i<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">i<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_cols<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0002 seconds] --><\/p>\n<p>Finally, we can summarize the first few rows of data.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5fb881455bab9600597248\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n...<br \/>\n# summarize the first few rows of data<br 
\/>\nprint(df.head())<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># summarize the first few rows of data<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">df<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">head<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0001 seconds] --><\/p>\n<p>Tying this together, the complete example of loading and summarizing the Sonar dataset is listed below.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5fb881455baba822822221\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# load the sonar dataset<br \/>\nfrom pandas import read_csv<br \/>\n# define the location of the dataset<br \/>\nurl = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/sonar.csv'<br \/>\n# load the dataset<br \/>\ndf = read_csv(url, header=None)<br \/>\n# summarize the shape of the dataset<br \/>\nprint(df.shape)<br \/>\n# set column names as the column number<br \/>\nn_cols = df.shape[1]<br \/>\ndf.columns = [str(i) for i in range(n_cols)]<br \/>\n# summarize the first few rows of data<br 
\/>\nprint(df.head())<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># load the sonar dataset<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">pandas <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-v\">read<\/span><span class=\"crayon-sy\">_<\/span>csv<\/p>\n<p><span class=\"crayon-p\"># define the location of the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">url<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/sonar.csv'<\/span><\/p>\n<p><span class=\"crayon-p\"># load the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">df<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">read_csv<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">header<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">None<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># summarize the shape of the dataset<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">df<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span 
class=\"crayon-p\"># set column names as the column number<\/span><\/p>\n<p><span class=\"crayon-v\">n_cols<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">df<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-v\">df<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">columns<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-e\">str<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">i<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">i<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_cols<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-p\"># summarize the first few rows of data<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">df<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">head<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0005 seconds] --><\/p>\n<p>Running the example first loads the dataset and reports the shape, showing it has 208 rows and 61 columns.<\/p>\n<p>The first five rows are then printed showing that the input variables are all numeric and the target variable is column \u201c60\u201d and has 
string labels.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5fb881455babb774862062\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n(208, 61)<br \/>\n0 1 2 3 4 &#8230; 56 57 58 59 60<br \/>\n0 0.0200 0.0371 0.0428 0.0207 0.0954 &#8230; 0.0180 0.0084 0.0090 0.0032 R<br \/>\n1 0.0453 0.0523 0.0843 0.0689 0.1183 &#8230; 0.0140 0.0049 0.0052 0.0044 R<br \/>\n2 0.0262 0.0582 0.1099 0.1083 0.0974 &#8230; 0.0316 0.0164 0.0095 0.0078 R<br \/>\n3 0.0100 0.0171 0.0623 0.0205 0.0205 &#8230; 0.0050 0.0044 0.0040 0.0117 R<br \/>\n4 0.0762 0.0666 0.0481 0.0394 0.0590 &#8230; 0.0072 0.0048 0.0107 0.0094 R<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>(208, 61)<\/p>\n<p>0 1 2 3 4 &#8230; 56 57 58 59 60<\/p>\n<p>0 0.0200 0.0371 0.0428 0.0207 0.0954 &#8230; 0.0180 0.0084 0.0090 0.0032 R<\/p>\n<p>1 0.0453 0.0523 0.0843 0.0689 0.1183 &#8230; 0.0140 0.0049 0.0052 0.0044 R<\/p>\n<p>2 0.0262 0.0582 0.1099 0.1083 0.0974 &#8230; 0.0316 0.0164 0.0095 0.0078 R<\/p>\n<p>3 0.0100 0.0171 0.0623 0.0205 0.0205 &#8230; 0.0050 0.0044 0.0040 0.0117 R<\/p>\n<p>4 0.0762 0.0666 0.0481 0.0394 0.0590 &#8230; 0.0072 0.0048 0.0107 0.0094 R<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0000 seconds] --><\/p>\n<p>Next, we can use PyCaret to evaluate and compare a suite of standard machine learning algorithms to quickly discover what works well on this dataset.<\/p>\n<h2>PyCaret for Comparing Machine Learning 
Models<\/h2>\n<p>In this section, we will evaluate and compare the performance of standard machine learning models on the Sonar classification dataset.<\/p>\n<p>First, we must register the dataset with the PyCaret library via the <a href=\"https:\/\/pycaret.org\/classification\/\">setup() function<\/a>. This requires that we provide the Pandas DataFrame and specify the name of the column that contains the target variable.<\/p>\n<p>The <em>setup()<\/em> function also allows you to configure simple data preparation, such as scaling, power transforms, missing data handling, and PCA transforms.<\/p>\n<p>We will specify the data and the target variable, and turn off HTML output, verbose output, and requests for user feedback. Note that <em>silent<\/em> is an argument from the PyCaret 2.x API; newer releases may not accept it.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5fb881455babc896304950\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n...<br \/>\n# setup the dataset<br \/>\ngrid = setup(data=df, target=df.columns[-1], html=False, silent=True, verbose=False)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># setup the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">grid<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">setup<\/span><span class=\"crayon-sy\">(<\/span><span 
class=\"crayon-v\">data<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">df<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">target<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">df<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">columns<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">html<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">False<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">silent<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">True<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">verbose<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">False<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0002 seconds] --><\/p>\n<p>Next, we can compare standard machine learning models by calling the <em>compare_models()<\/em> function.<\/p>\n<p>By default, it will evaluate models using 10-fold cross-validation, sort results by classification accuracy, and return the single best model.<\/p>\n<p>These are good defaults, and we don\u2019t need to change a thing.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5fb881455babe224488976\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n...<br \/>\n# evaluate models and compare 
models<br \/>\nbest = compare_models()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># evaluate models and compare models<\/span><\/p>\n<p><span class=\"crayon-v\">best<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">compare_models<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0001 seconds] --><\/p>\n<p>Calling the <em>compare_models()<\/em> function will also report a table of results summarizing all of the models that were evaluated and their performance.<\/p>\n<p>Finally, we can report the best-performing model and its configuration.<\/p>\n<p>Tying this together, the complete example of evaluating a suite of standard models on the Sonar classification dataset is listed below.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5fb881455babf290351700\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# compare machine learning algorithms on the sonar classification dataset<br \/>\nfrom pandas import read_csv<br \/>\nfrom pycaret.classification import setup<br \/>\nfrom pycaret.classification import compare_models<br \/>\n# define the location of the dataset<br \/>\nurl = 
'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/sonar.csv'<br \/>\n# load the dataset<br \/>\ndf = read_csv(url, header=None)<br \/>\n# set column names as the column number<br \/>\nn_cols = df.shape[1]<br \/>\ndf.columns = [str(i) for i in range(n_cols)]<br \/>\n# setup the dataset<br \/>\ngrid = setup(data=df, target=df.columns[-1], html=False, silent=True, verbose=False)<br \/>\n# evaluate models and compare models<br \/>\nbest = compare_models()<br \/>\n# report the best model<br \/>\nprint(best)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># compare machine learning algorithms on the sonar classification dataset<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">pandas <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">read_csv<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">pycaret<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">classification <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">setup<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">pycaret<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">classification <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-v\">compare<\/span><span class=\"crayon-sy\">_<\/span>models<\/p>\n<p><span class=\"crayon-p\"># define the location of the dataset<\/span><\/p>\n<p><span 
class=\"crayon-v\">url<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/sonar.csv'<\/span><\/p>\n<p><span class=\"crayon-p\"># load the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">df<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">read_csv<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">header<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">None<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># set column names as the column number<\/span><\/p>\n<p><span class=\"crayon-v\">n_cols<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">df<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-v\">df<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">columns<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-e\">str<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">i<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">i<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_cols<\/span><span 
class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-p\"># setup the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">grid<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">setup<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">df<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">target<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">df<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">columns<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">html<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">False<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">silent<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">True<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">verbose<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">False<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># evaluate models and compare models<\/span><\/p>\n<p><span class=\"crayon-v\">best<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">compare_models<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># report the best model<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">best<\/span><span 
class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0005 seconds] --><\/p>\n<p>Running the example will load the dataset, configure the PyCaret library, evaluate a suite of standard models, and report the best model found for the dataset.<\/p>\n<p><strong>Note<\/strong>: Your <a href=\"https:\/\/machinelearningmastery.com\/different-results-each-time-in-machine-learning\/\">results may vary<\/a> given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.<\/p>\n<p>In this case, we can see that the \u201c<em>Extra Trees Classifier<\/em>\u201d tops the table with an accuracy of about 86.95 percent, tied with the \u201c<em>CatBoost Classifier<\/em>\u201d to four decimal places.<\/p>\n<p>The configuration of the best model is then printed, and it appears to use default hyperparameter values.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5fb881455bac0351179999\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n                              Model  Accuracy     AUC  Recall   Prec.      
F1<br \/>\n0            Extra Trees Classifier    0.8695  0.9497  0.8571  0.8778  0.8631<br \/>\n1               CatBoost Classifier    0.8695  0.9548  0.8143  0.9177  0.8508<br \/>\n2   Light Gradient Boosting Machine    0.8219  0.9096  0.8000  0.8327  0.8012<br \/>\n3      Gradient Boosting Classifier    0.8010  0.8801  0.7690  0.8110  0.7805<br \/>\n4              Ada Boost Classifier    0.8000  0.8474  0.7952  0.8071  0.7890<br \/>\n5            K Neighbors Classifier    0.7995  0.8613  0.7405  0.8276  0.7773<br \/>\n6         Extreme Gradient Boosting    0.7995  0.8934  0.7833  0.8095  0.7802<br \/>\n7          Random Forest Classifier    0.7662  0.8778  0.6976  0.8024  0.7345<br \/>\n8          Decision Tree Classifier    0.7533  0.7524  0.7119  0.7655  0.7213<br \/>\n9                  Ridge Classifier    0.7448  0.0000  0.6952  0.7574  0.7135<br \/>\n10                      Naive Bayes    0.7214  0.8159  0.8286  0.6700  0.7308<br \/>\n11              SVM &#8211; Linear Kernel    0.7181  0.0000  0.6286  0.7146  0.6309<br \/>\n12              Logistic Regression    0.7100  0.8104  0.6357  0.7263  0.6634<br \/>\n13     Linear Discriminant Analysis    0.6924  0.7510  0.6667  0.6762  0.6628<br \/>\n14  Quadratic Discriminant Analysis    0.5800  0.6308  0.1095  0.5000  0.1750<\/p>\n<p>     Kappa     MCC  TT (Sec)<br \/>\n0   0.7383  0.7446    0.1415<br \/>\n1   0.7368  0.7552    1.9930<br \/>\n2   0.6410  0.6581    0.0134<br \/>\n3   0.5989  0.6090    0.1413<br \/>\n4   0.5979  0.6123    0.0726<br \/>\n5   0.5957  0.6038    0.0019<br \/>\n6   0.5970  0.6132    0.0287<br \/>\n7   0.5277  0.5438    0.1107<br \/>\n8   0.5028  0.5192    0.0035<br \/>\n9   0.4870  0.5003    0.0030<br \/>\n10  0.4488  0.4752    0.0019<br \/>\n11  0.4235  0.4609    0.0024<br \/>\n12  0.4143  0.4285    0.0059<br \/>\n13  0.3825  0.3927    0.0034<br \/>\n14  0.1172  0.1792    0.0033<br \/>\nExtraTreesClassifier(bootstrap=False, ccp_alpha=0.0, class_weight=None,<br \/>\n                    
 criterion=&#8217;gini&#8217;, max_depth=None, max_features=&#8217;auto&#8217;,<br \/>\n                     max_leaf_nodes=None, max_samples=None,<br \/>\n                     min_impurity_decrease=0.0, min_impurity_split=None,<br \/>\n                     min_samples_leaf=1, min_samples_split=2,<br \/>\n                     min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=-1,<br \/>\n                     oob_score=False, random_state=2728, verbose=0,<br \/>\n                     warm_start=False)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<div class=\"urvanov-syntax-highlighter-nums-content\">\n<p>1<\/p>\n<p>2<\/p>\n<p>3<\/p>\n<p>4<\/p>\n<p>5<\/p>\n<p>6<\/p>\n<p>7<\/p>\n<p>8<\/p>\n<p>9<\/p>\n<p>10<\/p>\n<p>11<\/p>\n<p>12<\/p>\n<p>13<\/p>\n<p>14<\/p>\n<p>15<\/p>\n<p>16<\/p>\n<p>17<\/p>\n<p>18<\/p>\n<p>19<\/p>\n<p>20<\/p>\n<p>21<\/p>\n<p>22<\/p>\n<p>23<\/p>\n<p>24<\/p>\n<p>25<\/p>\n<p>26<\/p>\n<p>27<\/p>\n<p>28<\/p>\n<p>29<\/p>\n<p>30<\/p>\n<p>31<\/p>\n<p>32<\/p>\n<p>33<\/p>\n<p>34<\/p>\n<p>35<\/p>\n<p>36<\/p>\n<p>37<\/p>\n<p>38<\/p>\n<p>39<\/p>\n<p>40<\/p>\n<p>41<\/p>\n<\/div>\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Model\u00a0\u00a0Accuracy\u00a0\u00a0\u00a0\u00a0 AUC\u00a0\u00a0Recall\u00a0\u00a0 Prec.\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0F1\u00a0\u00a0<\/p>\n<p>0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Extra Trees Classifier\u00a0\u00a0\u00a0\u00a00.8695\u00a0\u00a00.9497\u00a0\u00a00.8571\u00a0\u00a00.8778\u00a0\u00a00.8631<\/p>\n<p>1\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 CatBoost 
Classifier\u00a0\u00a0\u00a0\u00a00.8695\u00a0\u00a00.9548\u00a0\u00a00.8143\u00a0\u00a00.9177\u00a0\u00a00.8508<\/p>\n<p>2\u00a0\u00a0 Light Gradient Boosting Machine\u00a0\u00a0\u00a0\u00a00.8219\u00a0\u00a00.9096\u00a0\u00a00.8000\u00a0\u00a00.8327\u00a0\u00a00.8012<\/p>\n<p>3\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Gradient Boosting Classifier\u00a0\u00a0\u00a0\u00a00.8010\u00a0\u00a00.8801\u00a0\u00a00.7690\u00a0\u00a00.8110\u00a0\u00a00.7805<\/p>\n<p>4\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Ada Boost Classifier\u00a0\u00a0\u00a0\u00a00.8000\u00a0\u00a00.8474\u00a0\u00a00.7952\u00a0\u00a00.8071\u00a0\u00a00.7890<\/p>\n<p>5\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0K Neighbors Classifier\u00a0\u00a0\u00a0\u00a00.7995\u00a0\u00a00.8613\u00a0\u00a00.7405\u00a0\u00a00.8276\u00a0\u00a00.7773<\/p>\n<p>6\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Extreme Gradient Boosting\u00a0\u00a0\u00a0\u00a00.7995\u00a0\u00a00.8934\u00a0\u00a00.7833\u00a0\u00a00.8095\u00a0\u00a00.7802<\/p>\n<p>7\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Random Forest Classifier\u00a0\u00a0\u00a0\u00a00.7662\u00a0\u00a00.8778\u00a0\u00a00.6976\u00a0\u00a00.8024\u00a0\u00a00.7345<\/p>\n<p>8\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Decision Tree Classifier\u00a0\u00a0\u00a0\u00a00.7533\u00a0\u00a00.7524\u00a0\u00a00.7119\u00a0\u00a00.7655\u00a0\u00a00.7213<\/p>\n<p>9\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Ridge Classifier\u00a0\u00a0\u00a0\u00a00.7448\u00a0\u00a00.0000\u00a0\u00a00.6952\u00a0\u00a00.7574\u00a0\u00a00.7135<\/p>\n<p>10\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Naive 
Bayes\u00a0\u00a0\u00a0\u00a00.7214\u00a0\u00a00.8159\u00a0\u00a00.8286\u00a0\u00a00.6700\u00a0\u00a00.7308<\/p>\n<p>11\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0SVM &#8211; Linear Kernel\u00a0\u00a0\u00a0\u00a00.7181\u00a0\u00a00.0000\u00a0\u00a00.6286\u00a0\u00a00.7146\u00a0\u00a00.6309<\/p>\n<p>12\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Logistic Regression\u00a0\u00a0\u00a0\u00a00.7100\u00a0\u00a00.8104\u00a0\u00a00.6357\u00a0\u00a00.7263\u00a0\u00a00.6634<\/p>\n<p>13\u00a0\u00a0\u00a0\u00a0 Linear Discriminant Analysis\u00a0\u00a0\u00a0\u00a00.6924\u00a0\u00a00.7510\u00a0\u00a00.6667\u00a0\u00a00.6762\u00a0\u00a00.6628<\/p>\n<p>14\u00a0\u00a0Quadratic Discriminant Analysis\u00a0\u00a0\u00a0\u00a00.5800\u00a0\u00a00.6308\u00a0\u00a00.1095\u00a0\u00a00.5000\u00a0\u00a00.1750<\/p>\n<p>\u00a0<\/p>\n<p>\u00a0\u00a0\u00a0\u00a0 Kappa\u00a0\u00a0\u00a0\u00a0 MCC\u00a0\u00a0TT (Sec)<\/p>\n<p>0\u00a0\u00a0 0.7383\u00a0\u00a00.7446\u00a0\u00a0\u00a0\u00a00.1415<\/p>\n<p>1\u00a0\u00a0 0.7368\u00a0\u00a00.7552\u00a0\u00a0\u00a0\u00a01.9930<\/p>\n<p>2\u00a0\u00a0 0.6410\u00a0\u00a00.6581\u00a0\u00a0\u00a0\u00a00.0134<\/p>\n<p>3\u00a0\u00a0 0.5989\u00a0\u00a00.6090\u00a0\u00a0\u00a0\u00a00.1413<\/p>\n<p>4\u00a0\u00a0 0.5979\u00a0\u00a00.6123\u00a0\u00a0\u00a0\u00a00.0726<\/p>\n<p>5\u00a0\u00a0 0.5957\u00a0\u00a00.6038\u00a0\u00a0\u00a0\u00a00.0019<\/p>\n<p>6\u00a0\u00a0 0.5970\u00a0\u00a00.6132\u00a0\u00a0\u00a0\u00a00.0287<\/p>\n<p>7\u00a0\u00a0 0.5277\u00a0\u00a00.5438\u00a0\u00a0\u00a0\u00a00.1107<\/p>\n<p>8\u00a0\u00a0 0.5028\u00a0\u00a00.5192\u00a0\u00a0\u00a0\u00a00.0035<\/p>\n<p>9\u00a0\u00a0 
0.4870\u00a0\u00a00.5003\u00a0\u00a0\u00a0\u00a00.0030<\/p>\n<p>10\u00a0\u00a00.4488\u00a0\u00a00.4752\u00a0\u00a0\u00a0\u00a00.0019<\/p>\n<p>11\u00a0\u00a00.4235\u00a0\u00a00.4609\u00a0\u00a0\u00a0\u00a00.0024<\/p>\n<p>12\u00a0\u00a00.4143\u00a0\u00a00.4285\u00a0\u00a0\u00a0\u00a00.0059<\/p>\n<p>13\u00a0\u00a00.3825\u00a0\u00a00.3927\u00a0\u00a0\u00a0\u00a00.0034<\/p>\n<p>14\u00a0\u00a00.1172\u00a0\u00a00.1792\u00a0\u00a0\u00a0\u00a00.0033<\/p>\n<p>ExtraTreesClassifier(bootstrap=False, ccp_alpha=0.0, class_weight=None,<\/p>\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 criterion=&#8217;gini&#8217;, max_depth=None, max_features=&#8217;auto&#8217;,<\/p>\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 max_leaf_nodes=None, max_samples=None,<\/p>\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 min_impurity_decrease=0.0, min_impurity_split=None,<\/p>\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 min_samples_leaf=1, min_samples_split=2,<\/p>\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=-1,<\/p>\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 oob_score=False, random_state=2728, verbose=0,<\/p>\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 warm_start=False)<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0001 seconds] --><\/p>\n<p>We could use this configuration directly and fit a model on the entire dataset and use it to make predictions on new 
data.<\/p>\n<p>We can also use the table of results to get an idea of the types of models that perform well on the dataset, in this case, ensembles of decision trees.<\/p>\n<p>Now that we are familiar with how to compare machine learning models using PyCaret, let\u2019s look at how we might use the library to tune model hyperparameters.<\/p>\n<h2>Tuning Machine Learning Models<\/h2>\n<p>In this section, we will tune the hyperparameters of a machine learning model on the Sonar classification dataset.<\/p>\n<p>We must load and set up the dataset as we did before when comparing models.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5fb881455bac1491177615\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n...<br \/>\n# setup the dataset<br \/>\ngrid = setup(data=df, target=df.columns[-1], html=False, silent=True, verbose=False)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># setup the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">grid<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">setup<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">df<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> 
<\/span><span class=\"crayon-v\">target<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">df<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">columns<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">html<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">False<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">silent<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">True<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">verbose<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">False<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0002 seconds] --><\/p>\n<p>We can tune model hyperparameters using the <em>tune_model()<\/em> function in the PyCaret library.<\/p>\n<p>The function takes an instance of the model to tune as input and knows what hyperparameters to tune automatically. 
A random search of model hyperparameters is performed, and the total number of evaluations can be controlled via the \u201c<em>n_iter<\/em>\u201d argument.<\/p>\n<p>By default, the function optimizes \u2018<em>Accuracy<\/em>\u2019 and evaluates each configuration using 10-fold cross-validation, although these sensible defaults can be changed.<\/p>\n<p>We can perform a random search of hyperparameters for the extra trees classifier as follows:<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5fb881455bac8823494696\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n...<br \/>\n# tune model hyperparameters<br \/>\nbest = tune_model(ExtraTreesClassifier(), n_iter=200)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># tune model hyperparameters<\/span><\/p>\n<p><span class=\"crayon-v\">best<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">tune_model<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-e\">ExtraTreesClassifier<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">n_iter<\/span><span class=\"crayon-o\">=<\/span><span 
class=\"crayon-cn\">200<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0001 seconds] --><\/p>\n<p>The function will return the best-performing model, which can be used directly or printed to determine the hyperparameters that were selected.<\/p>\n<p>It will also print a table of the results for the best configuration on each fold of the k-fold cross-validation (e.g. 10 folds).<\/p>\n<p>Tying this together, the complete example of tuning the hyperparameters of the extra trees classifier on the Sonar dataset is listed below.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5fb881455baca077848087\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# tune model hyperparameters on the sonar classification dataset<br \/>\nfrom pandas import read_csv<br \/>\nfrom sklearn.ensemble import ExtraTreesClassifier<br \/>\nfrom pycaret.classification import setup<br \/>\nfrom pycaret.classification import tune_model<br \/>\n# define the location of the dataset<br \/>\nurl = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/sonar.csv'<br \/>\n# load the dataset<br \/>\ndf = read_csv(url, header=None)<br \/>\n# set column names as the column number<br \/>\nn_cols = df.shape[1]<br \/>\ndf.columns = [str(i) for i in range(n_cols)]<br \/>\n# setup the dataset<br \/>\ngrid = setup(data=df, target=df.columns[-1], html=False, silent=True, verbose=False)<br \/>\n# tune model hyperparameters<br \/>\nbest = tune_model(ExtraTreesClassifier(), n_iter=200, choose_better=True)<br \/>\n# report the best model<br \/>\nprint(best)<\/textarea><\/p>\n<div 
class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<div class=\"urvanov-syntax-highlighter-nums-content\">\n<p>1<\/p>\n<p>2<\/p>\n<p>3<\/p>\n<p>4<\/p>\n<p>5<\/p>\n<p>6<\/p>\n<p>7<\/p>\n<p>8<\/p>\n<p>9<\/p>\n<p>10<\/p>\n<p>11<\/p>\n<p>12<\/p>\n<p>13<\/p>\n<p>14<\/p>\n<p>15<\/p>\n<p>16<\/p>\n<p>17<\/p>\n<p>18<\/p>\n<\/div>\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># tune model hyperparameters on the sonar classification dataset<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">pandas <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">read_csv<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">ensemble <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">ExtraTreesClassifier<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">pycaret<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">classification <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">setup<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">pycaret<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">classification <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">tune_model<\/span><\/p>\n<p><span class=\"crayon-p\"># define the location of the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">url<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/sonar.csv'<\/span><\/p>\n<p><span class=\"crayon-p\"># load the 
dataset<\/span><\/p>\n<p><span class=\"crayon-v\">df<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">read_csv<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">header<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">None<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># set column names as the column number<\/span><\/p>\n<p><span class=\"crayon-v\">n_cols<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">df<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-v\">df<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">columns<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-e\">str<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">i<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">i<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_cols<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-p\"># setup the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">grid<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span 
class=\"crayon-e\">setup<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">df<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">target<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">df<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">columns<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">html<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">False<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">silent<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">True<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">verbose<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">False<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># tune model hyperparameters<\/span><\/p>\n<p><span class=\"crayon-v\">best<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">tune_model<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-e\">ExtraTreesClassifier<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">n_iter<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">200<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">choose_better<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">True<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span 
class=\"crayon-p\"># report the best model<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">best<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0006 seconds] --><\/p>\n<p>Running the example first loads the dataset and configures the PyCaret library.<\/p>\n<p>A random search is then performed, reporting the performance of the best-performing configuration on each of the 10 cross-validation folds along with the mean accuracy.<\/p>\n<p><strong>Note<\/strong>: Your <a href=\"https:\/\/machinelearningmastery.com\/different-results-each-time-in-machine-learning\/\">results may vary<\/a> given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.<\/p>\n<p>In this case, we can see that the random search found a configuration with an accuracy of about 75.29 percent, which is worse than the default configuration from the previous section, which achieved a score of about 86.95 percent.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5fb881455bacb905490781\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n      Accuracy     AUC  Recall   Prec.      
F1   Kappa     MCC<br \/>\n0       0.8667  1.0000  1.0000  0.7778  0.8750  0.7368  0.7638<br \/>\n1       0.6667  0.8393  0.4286  0.7500  0.5455  0.3119  0.3425<br \/>\n2       0.6667  0.8036  0.2857  1.0000  0.4444  0.2991  0.4193<br \/>\n3       0.7333  0.7321  0.4286  1.0000  0.6000  0.4444  0.5345<br \/>\n4       0.6667  0.5714  0.2857  1.0000  0.4444  0.2991  0.4193<br \/>\n5       0.8571  0.8750  0.6667  1.0000  0.8000  0.6957  0.7303<br \/>\n6       0.8571  0.9583  0.6667  1.0000  0.8000  0.6957  0.7303<br \/>\n7       0.7857  0.8776  0.5714  1.0000  0.7273  0.5714  0.6325<br \/>\n8       0.6429  0.7959  0.2857  1.0000  0.4444  0.2857  0.4082<br \/>\n9       0.7857  0.8163  0.5714  1.0000  0.7273  0.5714  0.6325<br \/>\nMean    0.7529  0.8270  0.5190  0.9528  0.6408  0.4911  0.5613<br \/>\nSD      0.0846  0.1132  0.2145  0.0946  0.1571  0.1753  0.1485<br \/>\nExtraTreesClassifier(bootstrap=False, ccp_alpha=0.0, class_weight=None,<br \/>\n                     criterion=&#8217;gini&#8217;, max_depth=1, max_features=&#8217;auto&#8217;,<br \/>\n                     max_leaf_nodes=None, max_samples=None,<br \/>\n                     min_impurity_decrease=0.0, min_impurity_split=None,<br \/>\n                     min_samples_leaf=4, min_samples_split=2,<br \/>\n                     min_weight_fraction_leaf=0.0, n_estimators=120,<br \/>\n                     n_jobs=None, oob_score=False, random_state=None, verbose=0,<br \/>\n                     warm_start=False)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<div 
class=\"urvanov-syntax-highlighter-nums-content\">\n<p>1<\/p>\n<p>2<\/p>\n<p>3<\/p>\n<p>4<\/p>\n<p>5<\/p>\n<p>6<\/p>\n<p>7<\/p>\n<p>8<\/p>\n<p>9<\/p>\n<p>10<\/p>\n<p>11<\/p>\n<p>12<\/p>\n<p>13<\/p>\n<p>14<\/p>\n<p>15<\/p>\n<p>16<\/p>\n<p>17<\/p>\n<p>18<\/p>\n<p>19<\/p>\n<p>20<\/p>\n<p>21<\/p>\n<\/div>\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Accuracy\u00a0\u00a0\u00a0\u00a0 AUC\u00a0\u00a0Recall\u00a0\u00a0 Prec.\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0F1\u00a0\u00a0 Kappa\u00a0\u00a0\u00a0\u00a0 MCC<\/p>\n<p>0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 0.8667\u00a0\u00a01.0000\u00a0\u00a01.0000\u00a0\u00a00.7778\u00a0\u00a00.8750\u00a0\u00a00.7368\u00a0\u00a00.7638<\/p>\n<p>1\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 0.6667\u00a0\u00a00.8393\u00a0\u00a00.4286\u00a0\u00a00.7500\u00a0\u00a00.5455\u00a0\u00a00.3119\u00a0\u00a00.3425<\/p>\n<p>2\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 0.6667\u00a0\u00a00.8036\u00a0\u00a00.2857\u00a0\u00a01.0000\u00a0\u00a00.4444\u00a0\u00a00.2991\u00a0\u00a00.4193<\/p>\n<p>3\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 0.7333\u00a0\u00a00.7321\u00a0\u00a00.4286\u00a0\u00a01.0000\u00a0\u00a00.6000\u00a0\u00a00.4444\u00a0\u00a00.5345<\/p>\n<p>4\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 0.6667\u00a0\u00a00.5714\u00a0\u00a00.2857\u00a0\u00a01.0000\u00a0\u00a00.4444\u00a0\u00a00.2991\u00a0\u00a00.4193<\/p>\n<p>5\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 0.8571\u00a0\u00a00.8750\u00a0\u00a00.6667\u00a0\u00a01.0000\u00a0\u00a00.8000\u00a0\u00a00.6957\u00a0\u00a00.7303<\/p>\n<p>6\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 0.8571\u00a0\u00a00.9583\u00a0\u00a00.6667\u00a0\u00a01.0000\u00a0\u00a00.8000\u00a0\u00a00.6957\u00a0\u00a00.7303<\/p>\n<p>7\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 0.7857\u00a0\u00a00.8776\u00a0\u00a00.5714\u00a0\u00a01.0000\u00a0\u00a00.7273\u00a0\u00a00.5714\u00a0\u00a00.6325<\/p>\n<p>8\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 
0.6429\u00a0\u00a00.7959\u00a0\u00a00.2857\u00a0\u00a01.0000\u00a0\u00a00.4444\u00a0\u00a00.2857\u00a0\u00a00.4082<\/p>\n<p>9\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 0.7857\u00a0\u00a00.8163\u00a0\u00a00.5714\u00a0\u00a01.0000\u00a0\u00a00.7273\u00a0\u00a00.5714\u00a0\u00a00.6325<\/p>\n<p>Mean\u00a0\u00a0\u00a0\u00a00.7529\u00a0\u00a00.8270\u00a0\u00a00.5190\u00a0\u00a00.9528\u00a0\u00a00.6408\u00a0\u00a00.4911\u00a0\u00a00.5613<\/p>\n<p>SD\u00a0\u00a0\u00a0\u00a0\u00a0\u00a00.0846\u00a0\u00a00.1132\u00a0\u00a00.2145\u00a0\u00a00.0946\u00a0\u00a00.1571\u00a0\u00a00.1753\u00a0\u00a00.1485<\/p>\n<p>ExtraTreesClassifier(bootstrap=False, ccp_alpha=0.0, class_weight=None,<\/p>\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 criterion=&#8217;gini&#8217;, max_depth=1, max_features=&#8217;auto&#8217;,<\/p>\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 max_leaf_nodes=None, max_samples=None,<\/p>\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 min_impurity_decrease=0.0, min_impurity_split=None,<\/p>\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 min_samples_leaf=4, min_samples_split=2,<\/p>\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 min_weight_fraction_leaf=0.0, n_estimators=120,<\/p>\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 n_jobs=None, oob_score=False, random_state=None, verbose=0,<\/p>\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 warm_start=False)<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 
0.0001 seconds] --><\/p>\n<p>We might be able to improve upon the random search by telling the <em>tune_model()<\/em> function which hyperparameters to search and over what ranges.<\/p>\n<h2>Summary<\/h2>\n<p>In this tutorial, you discovered the PyCaret Python open source library for machine learning.<\/p>\n<p>Specifically, you learned:<\/p>\n<ul>\n<li>PyCaret is a Python version of the popular and widely used caret machine learning package in R.<\/li>\n<li>How to use PyCaret to easily evaluate and compare standard machine learning models on a dataset.<\/li>\n<li>How to use PyCaret to easily tune the hyperparameters of a well-performing machine learning model.<\/li>\n<\/ul>\n<p><strong>Do you have any questions?<\/strong><br \/>Ask your questions in the comments below and I will do my best to answer.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/machinelearningmastery.com\/pycaret-for-machine-learning\/<\/p>\n","protected":false},"author":0,"featured_media":593,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/592"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=592"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/592\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/593"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=592"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=592"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=592"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}