{"id":97,"date":"2020-08-17T07:54:47","date_gmt":"2020-08-17T07:54:47","guid":{"rendered":"https:\/\/machine-learning.webcloning.com\/2020\/08\/17\/how-to-use-seaborn-data-visualization-for-machine-learning\/"},"modified":"2020-08-17T07:54:47","modified_gmt":"2020-08-17T07:54:47","slug":"how-to-use-seaborn-data-visualization-for-machine-learning","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2020\/08\/17\/how-to-use-seaborn-data-visualization-for-machine-learning\/","title":{"rendered":"How to use Seaborn Data Visualization for Machine Learning"},"content":{"rendered":"<div id=\"\">\n<p>Data visualization provides insight into the distribution and relationships between variables in a dataset.<\/p>\n<p>This insight can be helpful in selecting data preparation techniques to apply prior to modeling and the types of algorithms that may be most suited to the data.<\/p>\n<p>Seaborn is a data visualization library for Python that runs on top of the popular Matplotlib data visualization library, although it provides a simple interface and aesthetically better-looking plots.<\/p>\n<p>In this tutorial, you will discover a gentle introduction to Seaborn data visualization for machine learning.<\/p>\n<p>After completing this tutorial, you will know:<\/p>\n<ul>\n<li>How to summarize the distribution of variables using bar charts, histograms, and box and whisker plots.<\/li>\n<li>How to summarize relationships using line plots and scatter plots.<\/li>\n<li>How to compare the distribution and relationships of variables for different class values on the same plot.<\/li>\n<\/ul>\n<p>Let\u2019s get started.<\/p>\n<div id=\"attachment_10415\" class=\"wp-caption aligncenter\" readability=\"29.423076923077\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-10415\" loading=\"lazy\" class=\"size-full wp-image-10415\" 
src=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/06\/How-to-use-Seaborn-Data-Visualization-for-Machine-Learning.jpg\" alt=\"How to use Seaborn Data Visualization for Machine Learning\" width=\"800\" height=\"536\" srcset=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/06\/How-to-use-Seaborn-Data-Visualization-for-Machine-Learning.jpg 800w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/06\/How-to-use-Seaborn-Data-Visualization-for-Machine-Learning-300x201.jpg 300w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/06\/How-to-use-Seaborn-Data-Visualization-for-Machine-Learning-768x515.jpg 768w\" sizes=\"(max-width: 800px) 100vw, 800px\"><\/p>\n<p id=\"caption-attachment-10415\" class=\"wp-caption-text\">How to use Seaborn Data Visualization for Machine Learning<br \/>Photo by <a href=\"https:\/\/flickr.com\/photos\/mdpettitt\/2743243609\/\">Martin Pettitt<\/a>, some rights reserved.<\/p>\n<\/div>\n<h2>Tutorial Overview<\/h2>\n<p>This tutorial is divided into six parts; they are:<\/p>\n<ul>\n<li>Seaborn Data Visualization Library<\/li>\n<li>Line Plots<\/li>\n<li>Bar Chart Plots<\/li>\n<li>Histogram Plots<\/li>\n<li>Box and Whisker Plots<\/li>\n<li>Scatter Plots<\/li>\n<\/ul>\n<h2>Seaborn Data Visualization Library<\/h2>\n<p>The primary plotting library for Python is called <a href=\"https:\/\/matplotlib.org\/\">Matplotlib<\/a>.<\/p>\n<p><a href=\"https:\/\/seaborn.pydata.org\/\">Seaborn<\/a> is a plotting library that offers a simpler interface, sensible defaults for plots needed for machine learning, and most importantly, the plots are aesthetically better looking than those in Matplotlib.<\/p>\n<p>Seaborn requires that Matplotlib is installed first.<\/p>\n<p>You can install Matplotlib directly using <a href=\"https:\/\/en.wikipedia.org\/wiki\/Pip_(package_manager)\">pip<\/a>, as 
follows:<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.12 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f3a32b12c94d549227448\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\" readability=\"7\">\n<p><textarea wrap=\"soft\" class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\nsudo pip install matplotlib<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\" readability=\"1\">\n<tr class=\"urvanov-syntax-highlighter-row\" readability=\"2\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\" readability=\"4\">\n<div class=\"crayon-pre\" readability=\"7\">\n<p><span class=\"crayon-e\">sudo <\/span><span class=\"crayon-e\">pip <\/span><span class=\"crayon-e\">install <\/span><span class=\"crayon-v\">matplotlib<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0001 seconds] --><\/p>\n<p>Once installed, you can confirm that the library can be loaded and used by printing the version number, as follows:<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.12 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f3a32b12c953087800892\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\" readability=\"7\">\n<p><textarea wrap=\"soft\" class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# matplotlib<br \/>\nimport matplotlib<br \/>\nprint(&#8216;matplotlib: %s&#8217; % matplotlib.__version__)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\" readability=\"1\">\n<tr class=\"urvanov-syntax-highlighter-row\" 
readability=\"2\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\" readability=\"4\">\n<div class=\"crayon-pre\" readability=\"7\">\n<p><span class=\"crayon-p\"># matplotlib<\/span><\/p>\n<p><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">matplotlib<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8216;matplotlib: %s&#8217;<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">%<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">matplotlib<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">__version__<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0002 seconds] --><\/p>\n<p>Running the example prints the current version of the Matplotlib library.<\/p>\n<p>Next, the Seaborn library can be installed, also using pip, for example with <em>sudo pip install seaborn<\/em>.<\/p>\n<p>Once installed, we can also confirm the library can be loaded and used by printing the version number, as follows:<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.12 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f3a32b12c957386750761\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\" readability=\"7\">\n<p><textarea wrap=\"soft\" class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# seaborn<br \/>\nimport seaborn<br \/>\nprint(&#8216;seaborn: %s&#8217; % seaborn.__version__)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\" 
readability=\"1\">\n<tr class=\"urvanov-syntax-highlighter-row\" readability=\"2\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\" readability=\"4\">\n<div class=\"crayon-pre\" readability=\"7\">\n<p><span class=\"crayon-p\"># seaborn<\/span><\/p>\n<p><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">seaborn<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8216;seaborn: %s&#8217;<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">%<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">seaborn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">__version__<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0001 seconds] --><\/p>\n<p>Running the example prints the current version of the Seaborn library.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.12 --><\/p>\n<p><!-- [Format Time: 0.0000 seconds] --><\/p>\n<p>To create Seaborn plots, you must import the Seaborn library and call functions to create the plots.<\/p>\n<p>Importantly, Seaborn plotting functions expect data to be provided as <a href=\"https:\/\/pandas.pydata.org\/pandas-docs\/stable\/reference\/api\/pandas.DataFrame.html\">Pandas DataFrames<\/a>. This means that if you are loading your data from CSV files, you must use Pandas functions like <a href=\"https:\/\/pandas.pydata.org\/pandas-docs\/stable\/reference\/api\/pandas.read_csv.html\">read_csv()<\/a> to load your data as a DataFrame. 
When plotting, columns can then be specified via the column name or column index.<\/p>\n<p>To show the plot, you can call the <a href=\"https:\/\/matplotlib.org\/api\/_as_gen\/matplotlib.pyplot.show.html\">show() function<\/a> from the Matplotlib library.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.12 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f3a32b12c959170580189\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\" readability=\"7\">\n<p><textarea wrap=\"soft\" class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n&#8230;<br \/>\n# display the plot<br \/>\npyplot.show()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\" readability=\"1\">\n<tr class=\"urvanov-syntax-highlighter-row\" readability=\"2\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># display the plot<\/span><\/p>\n<p><span class=\"crayon-v\">pyplot<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0001 seconds] --><\/p>\n<p>Alternatively, the plots can be saved to a file, such as a PNG-formatted image. 
The <a href=\"https:\/\/matplotlib.org\/api\/_as_gen\/matplotlib.pyplot.savefig.html\">savefig() Matplotlib function<\/a> can be used to save images.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.12 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f3a32b12c95a548223138\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\" readability=\"7\">\n<p><textarea wrap=\"soft\" class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n&#8230;<br \/>\n# save the plot<br \/>\npyplot.savefig(&#8216;my_image.png&#8217;)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\" readability=\"1\">\n<tr class=\"urvanov-syntax-highlighter-row\" readability=\"2\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\" readability=\"4\">\n<div class=\"crayon-pre\" readability=\"7\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># save the plot<\/span><\/p>\n<p><span class=\"crayon-v\">pyplot<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">savefig<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8216;my_image.png&#8217;<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0001 seconds] --><\/p>\n<p>Now that we have Seaborn installed, let\u2019s look at some common plots we may need when working with machine learning data.<\/p>\n<h2>Line Plots<\/h2>\n<p>A line plot is generally used to present observations collected at regular intervals.<\/p>\n<p>The x-axis represents the regular interval, such as time. 
The y-axis shows the observations, ordered by the x-axis and connected by a line.<\/p>\n<p>A line plot can be created in Seaborn by calling the <a href=\"https:\/\/seaborn.pydata.org\/generated\/seaborn.lineplot.html\">lineplot() function<\/a> and passing the x-axis data for the regular interval, and y-axis for the observations.<\/p>\n<p>We can demonstrate a line plot using a time series dataset of <a href=\"https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/monthly-car-sales.csv\">monthly car sales<\/a>.<\/p>\n<p>The dataset has two columns: \u201c<em>Month<\/em>\u201d and \u201c<em>Sales<\/em>.\u201d Month will be used as the x-axis and Sales will be plotted on the y-axis.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.12 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f3a32b12c95b366158588\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\" readability=\"9\">\n<p><textarea wrap=\"soft\" class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n&#8230;<br \/>\n# create line plot<br \/>\nlineplot(x=&#8217;Month&#8217;, y=&#8217;Sales&#8217;, data=dataset)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\" readability=\"2\">\n<tr class=\"urvanov-syntax-highlighter-row\" readability=\"4\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\" readability=\"5\">\n<div class=\"crayon-pre\" readability=\"9\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># create line plot<\/span><\/p>\n<p><span class=\"crayon-e\">lineplot<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">x<\/span><span class=\"crayon-o\">=<\/span><span 
class=\"crayon-s\">&#8216;Month&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8216;Sales&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">dataset<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0002 seconds] --><\/p>\n<p>Tying this together, the complete example is listed below.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.12 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f3a32b12c95d279350016\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\" readability=\"13\">\n<p><textarea wrap=\"soft\" class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# line plot of a time series dataset<br \/>\nfrom pandas import read_csv<br \/>\nfrom seaborn import lineplot<br \/>\nfrom matplotlib import pyplot<br \/>\n# load the dataset<br \/>\nurl = &#8216;https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/monthly-car-sales.csv&#8217;<br \/>\ndataset = read_csv(url, header=0)<br \/>\n# create line plot<br \/>\nlineplot(x=&#8217;Month&#8217;, y=&#8217;Sales&#8217;, data=dataset)<br \/>\n# show plot<br \/>\npyplot.show()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\" readability=\"4\">\n<tr class=\"urvanov-syntax-highlighter-row\" readability=\"8\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\" readability=\"11.5\">\n<div class=\"crayon-pre\" readability=\"22\">\n<p><span class=\"crayon-p\"># line plot of a time series 
dataset<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">pandas <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">read_csv<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">seaborn <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">lineplot<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">matplotlib <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-i\">pyplot<\/span><\/p>\n<p><span class=\"crayon-p\"># load the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">url<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">&#8216;https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/monthly-car-sales.csv&#8217;<\/span><\/p>\n<p><span class=\"crayon-v\">dataset<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">read_csv<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">header<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># create line plot<\/span><\/p>\n<p><span class=\"crayon-e\">lineplot<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">x<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8216;Month&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8216;Sales&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">dataset<\/span><span 
class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># show plot<\/span><\/p>\n<p><span class=\"crayon-v\">pyplot<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0003 seconds] --><\/p>\n<p>Running the example first loads the time series dataset and creates a line plot of the data, clearly showing a trend and seasonality in the sales data.<\/p>\n<div id=\"attachment_10407\" class=\"wp-caption aligncenter\" readability=\"32\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-10407\" loading=\"lazy\" class=\"size-full wp-image-10407\" src=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Line-Plot-of-a-Time-Series-Dataset.png\" alt=\"Line Plot of a Time Series Dataset\" width=\"1280\" height=\"960\" srcset=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Line-Plot-of-a-Time-Series-Dataset.png 1280w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Line-Plot-of-a-Time-Series-Dataset-300x225.png 300w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Line-Plot-of-a-Time-Series-Dataset-1024x768.png 1024w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Line-Plot-of-a-Time-Series-Dataset-768x576.png 768w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p id=\"caption-attachment-10407\" class=\"wp-caption-text\">Line Plot of a Time Series Dataset<\/p>\n<\/div>\n<p>For more great examples of line plots with Seaborn, see: <a href=\"https:\/\/seaborn.pydata.org\/tutorial\/relational.html\">Visualizing statistical relationships<\/a>.<\/p>\n<h2>Bar Chart Plots<\/h2>\n<p>A bar chart is generally used to present relative quantities for multiple 
categories.<\/p>\n<p>The x-axis represents the categories, which are spaced evenly. The y-axis represents the quantity for each category and is drawn as a bar from the baseline to the appropriate level on the y-axis.<\/p>\n<p>A bar chart can be created in Seaborn by calling the <a href=\"https:\/\/seaborn.pydata.org\/generated\/seaborn.countplot.html\">countplot() function<\/a> and passing the data.<\/p>\n<p>We will demonstrate a bar chart with a variable from the <a href=\"https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/breast-cancer.csv\">breast cancer classification dataset<\/a>, which is composed of categorical input variables.<\/p>\n<p>We will plot just one variable: the first, which is the age bracket.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.12 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f3a32b12c95e340462994\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\" readability=\"8\">\n<p><textarea wrap=\"soft\" class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n&#8230;<br \/>\n# create bar chart plot<br \/>\ncountplot(x=0, data=dataset)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\" readability=\"1.5\">\n<tr class=\"urvanov-syntax-highlighter-row\" readability=\"3\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\" readability=\"4.5\">\n<div class=\"crayon-pre\" readability=\"8\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># create bar chart plot<\/span><\/p>\n<p><span class=\"crayon-e\">countplot<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">x<\/span><span class=\"crayon-o\">=<\/span><span 
class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">dataset<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0001 seconds] --><\/p>\n<p>Tying this together, the complete example is listed below.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.12 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f3a32b12c95f778722147\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\" readability=\"12\">\n<p><textarea wrap=\"soft\" class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# bar chart plot of a categorical variable<br \/>\nfrom pandas import read_csv<br \/>\nfrom seaborn import countplot<br \/>\nfrom matplotlib import pyplot<br \/>\n# load the dataset<br \/>\nurl = &#8216;https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/breast-cancer.csv&#8217;<br \/>\ndataset = read_csv(url, header=None)<br \/>\n# create bar chart plot<br \/>\ncountplot(x=0, data=dataset)<br \/>\n# show plot<br \/>\npyplot.show()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\" readability=\"3.5\">\n<tr class=\"urvanov-syntax-highlighter-row\" readability=\"7\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\" readability=\"11\">\n<div class=\"crayon-pre\" readability=\"21\">\n<p><span class=\"crayon-p\"># bar chart plot of a categorical variable<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">pandas <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">read_csv<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span 
class=\"crayon-e\">seaborn <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">countplot<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">matplotlib <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-i\">pyplot<\/span><\/p>\n<p><span class=\"crayon-p\"># load the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">url<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">&#8216;https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/breast-cancer.csv&#8217;<\/span><\/p>\n<p><span class=\"crayon-v\">dataset<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">read_csv<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">header<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">None<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># create bar chart plot<\/span><\/p>\n<p><span class=\"crayon-e\">countplot<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">x<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">dataset<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># show plot<\/span><\/p>\n<p><span class=\"crayon-v\">pyplot<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0002 seconds] --><\/p>\n<p>Running the example first loads the breast cancer dataset and 
creates a bar chart plot of the data, showing each age group and the number of individuals (samples) that fall within each group.<\/p>\n<div id=\"attachment_10408\" class=\"wp-caption aligncenter\" readability=\"32\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-10408\" loading=\"lazy\" class=\"size-full wp-image-10408\" src=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Bar-Chart-Plot-of-Age-Range-Categorical-Variable.png\" alt=\"Bar Chart Plot of Age Range Categorical Variable\" width=\"1280\" height=\"960\" srcset=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Bar-Chart-Plot-of-Age-Range-Categorical-Variable.png 1280w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Bar-Chart-Plot-of-Age-Range-Categorical-Variable-300x225.png 300w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Bar-Chart-Plot-of-Age-Range-Categorical-Variable-1024x768.png 1024w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Bar-Chart-Plot-of-Age-Range-Categorical-Variable-768x576.png 768w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p id=\"caption-attachment-10408\" class=\"wp-caption-text\">Bar Chart Plot of Age Range Categorical Variable<\/p>\n<\/div>\n<p>We might also want to plot the counts for each category for a variable, such as the first variable, against the class label.<\/p>\n<p>This can be achieved using the <em>countplot()<\/em> function and specifying the class variable (column index 9) via the \u201c<em>hue<\/em>\u201d argument, as follows:<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.12 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f3a32b12c960750998972\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" 
data-settings=\" minimize scroll-mouseover\" readability=\"9\">\n<p><textarea wrap=\"soft\" class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n&#8230;<br \/>\n# create bar chart plot<br \/>\ncountplot(x=0, hue=9, data=dataset)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\" readability=\"2\">\n<tr class=\"urvanov-syntax-highlighter-row\" readability=\"4\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\" readability=\"5\">\n<div class=\"crayon-pre\" readability=\"9\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># create bar chart plot<\/span><\/p>\n<p><span class=\"crayon-e\">countplot<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">x<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">hue<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">9<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">dataset<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0001 seconds] --><\/p>\n<p>Tying this together, the complete example is listed below.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.12 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f3a32b12c961505061019\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\" readability=\"13\">\n<p><textarea wrap=\"soft\" class=\"urvanov-syntax-highlighter-plain print-no\" 
data-settings=\"dblclick\" readonly><br \/>\n# bar chart plot of a categorical variable against a class variable<br \/>\nfrom pandas import read_csv<br \/>\nfrom seaborn import countplot<br \/>\nfrom matplotlib import pyplot<br \/>\n# load the dataset<br \/>\nurl = &#8216;https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/breast-cancer.csv&#8217;<br \/>\ndataset = read_csv(url, header=None)<br \/>\n# create bar chart plot<br \/>\ncountplot(x=0, hue=9, data=dataset)<br \/>\n# show plot<br \/>\npyplot.show()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\" readability=\"4\">\n<tr class=\"urvanov-syntax-highlighter-row\" readability=\"8\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\" readability=\"11.5\">\n<div class=\"crayon-pre\" readability=\"22\">\n<p><span class=\"crayon-p\"># bar chart plot of a categorical variable against a class variable<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">pandas <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">read_csv<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">seaborn <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">countplot<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">matplotlib <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-i\">pyplot<\/span><\/p>\n<p><span class=\"crayon-p\"># load the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">url<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">&#8216;https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/breast-cancer.csv&#8217;<\/span><\/p>\n<p><span class=\"crayon-v\">dataset<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span 
class=\"crayon-e\">read_csv<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">header<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">None<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># create bar chart plot<\/span><\/p>\n<p><span class=\"crayon-e\">countplot<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">x<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">hue<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">9<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">dataset<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># show plot<\/span><\/p>\n<p><span class=\"crayon-v\">pyplot<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0003 seconds] --><\/p>\n<p>Running the example first loads the breast cancer dataset and creates a bar chart plot of the data, showing each age group and the number of individuals (samples) that fall within each group separated by the two class labels for the dataset.<\/p>\n<div id=\"attachment_10409\" class=\"wp-caption aligncenter\" readability=\"32\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-10409\" loading=\"lazy\" class=\"size-full wp-image-10409\" src=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Bar-Chart-Plot-of-Age-Range-Categorical-Variable-by-Class-Label.png\" alt=\"Bar Chart Plot of Age Range Categorical Variable 
by Class Label\" width=\"1280\" height=\"960\" srcset=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Bar-Chart-Plot-of-Age-Range-Categorical-Variable-by-Class-Label.png 1280w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Bar-Chart-Plot-of-Age-Range-Categorical-Variable-by-Class-Label-300x225.png 300w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Bar-Chart-Plot-of-Age-Range-Categorical-Variable-by-Class-Label-1024x768.png 1024w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Bar-Chart-Plot-of-Age-Range-Categorical-Variable-by-Class-Label-768x576.png 768w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p id=\"caption-attachment-10409\" class=\"wp-caption-text\">Bar Chart Plot of Age Range Categorical Variable by Class Label<\/p>\n<\/div>\n<p>For more great examples of bar chart plots with Seaborn, see: <a href=\"https:\/\/seaborn.pydata.org\/tutorial\/categorical.html\">Plotting with categorical data<\/a>.<\/p>\n<h2>Histogram Plots<\/h2>\n<p>A histogram plot is generally used to summarize the distribution of a numerical data sample.<\/p>\n<p>The x-axis represents discrete bins or intervals for the observations. 
For example, observations with values between 1 and 10 may be split into five bins: the values [1,2] would be allocated to the first bin, [3,4] to the second bin, and so on.<\/p>\n<p>The y-axis represents the frequency or count of the number of observations in the dataset that belong to each bin.<\/p>\n<p>Essentially, a data sample is transformed into a bar chart where each category on the x-axis represents an interval of observation values.<\/p>\n<p>A histogram can be created in Seaborn by calling the <a href=\"https:\/\/seaborn.pydata.org\/generated\/seaborn.distplot.html\">distplot() function<\/a> and passing the variable. Note that distplot() is deprecated as of Seaborn 0.11; newer versions provide histplot() and displot() instead.<\/p>\n<p>We will demonstrate a histogram with a numerical variable from the <a href=\"https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/pima-indians-diabetes.csv\">diabetes classification dataset<\/a>. We will plot just one variable: the first, which is the number of times that a patient was pregnant.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.12 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f3a32b12c962406018425\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\" readability=\"7\">\n<p><textarea wrap=\"soft\" class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n&#8230;<br \/>\n# create histogram plot<br \/>\ndistplot(dataset[[0]])<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\" readability=\"1\">\n<tr class=\"urvanov-syntax-highlighter-row\" readability=\"2\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># 
create histogram plot<\/span><\/p>\n<p><span class=\"crayon-e\">distplot<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">dataset<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0001 seconds] --><\/p>\n<p>Tying this together, the complete example is listed below.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.12 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f3a32b12c964474146446\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\" readability=\"11\">\n<p><textarea wrap=\"soft\" class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# histogram plot of a numerical variable<br \/>\nfrom pandas import read_csv<br \/>\nfrom seaborn import distplot<br \/>\nfrom matplotlib import pyplot<br \/>\n# load the dataset<br \/>\nurl = &#8216;https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/pima-indians-diabetes.csv&#8217;<br \/>\ndataset = read_csv(url, header=None)<br \/>\n# create histogram plot<br \/>\ndistplot(dataset[[0]])<br \/>\n# show plot<br \/>\npyplot.show()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\" readability=\"3\">\n<tr class=\"urvanov-syntax-highlighter-row\" readability=\"6\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\" readability=\"9.5\">\n<div class=\"crayon-pre\" readability=\"18\">\n<p><span class=\"crayon-p\"># histogram plot of a numerical variable<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">pandas <\/span><span 
class=\"crayon-e\">import <\/span><span class=\"crayon-e\">read_csv<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">seaborn <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">distplot<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">matplotlib <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-i\">pyplot<\/span><\/p>\n<p><span class=\"crayon-p\"># load the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">url<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">&#8216;https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/pima-indians-diabetes.csv&#8217;<\/span><\/p>\n<p><span class=\"crayon-v\">dataset<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">read_csv<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">header<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">None<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># create histogram plot<\/span><\/p>\n<p><span class=\"crayon-e\">distplot<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">dataset<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># show plot<\/span><\/p>\n<p><span class=\"crayon-v\">pyplot<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0002 seconds] 
--><\/p>\n<p>Running the example first loads the diabetes dataset and creates a histogram plot of the variable, showing the distribution of the values with a hard cut-off at zero.<\/p>\n<p>The plot shows both the histogram (counts of bins) as well as a smooth estimate of the probability density function.<\/p>\n<div id=\"attachment_10410\" class=\"wp-caption aligncenter\" readability=\"32\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-10410\" loading=\"lazy\" class=\"size-full wp-image-10410\" src=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Histogram-Plot-of-Number-of-Times-Pregnant-Numerical-Variable.png\" alt=\"Histogram Plot of Number of Times Pregnant Numerical Variable\" width=\"1280\" height=\"960\" srcset=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Histogram-Plot-of-Number-of-Times-Pregnant-Numerical-Variable.png 1280w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Histogram-Plot-of-Number-of-Times-Pregnant-Numerical-Variable-300x225.png 300w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Histogram-Plot-of-Number-of-Times-Pregnant-Numerical-Variable-1024x768.png 1024w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Histogram-Plot-of-Number-of-Times-Pregnant-Numerical-Variable-768x576.png 768w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p id=\"caption-attachment-10410\" class=\"wp-caption-text\">Histogram Plot of Number of Times Pregnant Numerical Variable<\/p>\n<\/div>\n<p>For more great examples of histogram plots with Seaborn, see: <a href=\"https:\/\/seaborn.pydata.org\/tutorial\/distributions.html\">Visualizing the distribution of a dataset<\/a>.<\/p>\n<h2>Box and Whisker Plots<\/h2>\n<p>A box and whisker plot, or boxplot for short, is generally used to summarize the distribution of a 
data sample.<\/p>\n<p>The x-axis is used to represent the data sample, where multiple boxplots can be drawn side by side on the x-axis if desired.<\/p>\n<p>The y-axis represents the observation values. A box is drawn to summarize the middle 50 percent of the dataset, starting at the 25th percentile and ending at the 75th percentile. The distance between these two percentiles is called the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Interquartile_range\">interquartile range<\/a>, or IQR. The median, or 50th percentile, is drawn with a line.<\/p>\n<p>Lines called whiskers extend from both ends of the box to the most extreme observations that lie within (1.5 * IQR) of the box edges, indicating the expected range of sensible values in the distribution. Observations outside the whiskers might be outliers and are drawn as individual points.<\/p>\n<p>A boxplot can be created in Seaborn by calling the <a href=\"https:\/\/seaborn.pydata.org\/generated\/seaborn.boxplot.html\">boxplot() function<\/a> and passing the data.<\/p>\n<p>We will demonstrate a boxplot with a numerical variable from the <a href=\"https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/pima-indians-diabetes.csv\">diabetes classification dataset<\/a>. 
We will just plot one variable, in this case, the first variable, which is the number of times that a patient was pregnant.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.12 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f3a32b12c965534656613\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\" readability=\"8\">\n<p><textarea wrap=\"soft\" class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n&#8230;<br \/>\n# create box and whisker plot<br \/>\nboxplot(x=0, data=dataset)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\" readability=\"1.5\">\n<tr class=\"urvanov-syntax-highlighter-row\" readability=\"3\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\" readability=\"5.5\">\n<div class=\"crayon-pre\" readability=\"10\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># create box and whisker plot<\/span><\/p>\n<p><span class=\"crayon-e\">boxplot<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">x<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">dataset<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0001 seconds] --><\/p>\n<p>Tying this together, the complete example is listed below.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.12 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f3a32b12c966185472045\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic 
urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\" readability=\"12\">\n<p><textarea wrap=\"soft\" class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# box and whisker plot of a numerical variable<br \/>\nfrom pandas import read_csv<br \/>\nfrom seaborn import boxplot<br \/>\nfrom matplotlib import pyplot<br \/>\n# load the dataset<br \/>\nurl = &#8216;https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/pima-indians-diabetes.csv&#8217;<br \/>\ndataset = read_csv(url, header=None)<br \/>\n# create box and whisker plot<br \/>\nboxplot(x=0, data=dataset)<br \/>\n# show plot<br \/>\npyplot.show()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\" readability=\"3.5\">\n<tr class=\"urvanov-syntax-highlighter-row\" readability=\"7\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\" readability=\"12\">\n<div class=\"crayon-pre\" readability=\"23\">\n<p><span class=\"crayon-p\"># box and whisker plot of a numerical variable<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">pandas <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">read_csv<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">seaborn <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">boxplot<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">matplotlib <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-i\">pyplot<\/span><\/p>\n<p><span class=\"crayon-p\"># load the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">url<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span 
class=\"crayon-s\">&#8216;https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/pima-indians-diabetes.csv&#8217;<\/span><\/p>\n<p><span class=\"crayon-v\">dataset<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">read_csv<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">header<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">None<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># create box and whisker plot<\/span><\/p>\n<p><span class=\"crayon-e\">boxplot<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">x<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">dataset<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># show plot<\/span><\/p>\n<p><span class=\"crayon-v\">pyplot<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0002 seconds] --><\/p>\n<p>Running the example first loads the diabetes dataset and creates a boxplot plot of the first input variable, showing the distribution of the number of times patients were pregnant.<\/p>\n<p>We can see the median just above 2.5 times, some outliers up around 15 times (wow!).<\/p>\n<div id=\"attachment_10411\" class=\"wp-caption aligncenter\" readability=\"32\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-10411\" loading=\"lazy\" class=\"size-full wp-image-10411\" 
src=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Box-and-Whisker-Plot-of-Number-of-Times-Pregnant-Numerical-Variable.png\" alt=\"Box and Whisker Plot of Number of Times Pregnant Numerical Variable\" width=\"1280\" height=\"960\" srcset=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Box-and-Whisker-Plot-of-Number-of-Times-Pregnant-Numerical-Variable.png 1280w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Box-and-Whisker-Plot-of-Number-of-Times-Pregnant-Numerical-Variable-300x225.png 300w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Box-and-Whisker-Plot-of-Number-of-Times-Pregnant-Numerical-Variable-1024x768.png 1024w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Box-and-Whisker-Plot-of-Number-of-Times-Pregnant-Numerical-Variable-768x576.png 768w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p id=\"caption-attachment-10411\" class=\"wp-caption-text\">Box and Whisker Plot of Number of Times Pregnant Numerical Variable<\/p>\n<\/div>\n<p>We might also want to plot the distribution of the numerical variable for each value of a categorical variable, such as the first variable, against the class label.<\/p>\n<p>This can be achieved by calling the <em>boxplot()<\/em> function and passing the class variable as the x-axis and the numerical variable as the y-axis.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.12 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f3a32b12c967551799531\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\" readability=\"9\">\n<p><textarea wrap=\"soft\" class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br 
\/>\n&#8230;<br \/>\n# create box and whisker plot<br \/>\nboxplot(x=8, y=0, data=dataset)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\" readability=\"2\">\n<tr class=\"urvanov-syntax-highlighter-row\" readability=\"4\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\" readability=\"6\">\n<div class=\"crayon-pre\" readability=\"11\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># create box and whisker plot<\/span><\/p>\n<p><span class=\"crayon-e\">boxplot<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">x<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">8<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">dataset<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0001 seconds] --><\/p>\n<p>Tying this together, the complete example is listed below.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.12 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f3a32b12c968616212526\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\" readability=\"13\">\n<p><textarea wrap=\"soft\" class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# box and whisker plot of a numerical variable vs class label<br \/>\nfrom pandas import read_csv<br \/>\nfrom seaborn import boxplot<br \/>\nfrom matplotlib 
import pyplot<br \/>\n# load the dataset<br \/>\nurl = &#8216;https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/pima-indians-diabetes.csv&#8217;<br \/>\ndataset = read_csv(url, header=None)<br \/>\n# create box and whisker plot<br \/>\nboxplot(x=8, y=0, data=dataset)<br \/>\n# show plot<br \/>\npyplot.show()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\" readability=\"4\">\n<tr class=\"urvanov-syntax-highlighter-row\" readability=\"8\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\" readability=\"12.5\">\n<div class=\"crayon-pre\" readability=\"24\">\n<p><span class=\"crayon-p\"># box and whisker plot of a numerical variable vs class label<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">pandas <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">read_csv<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">seaborn <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">boxplot<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">matplotlib <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-i\">pyplot<\/span><\/p>\n<p><span class=\"crayon-p\"># load the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">url<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">&#8216;https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/pima-indians-diabetes.csv&#8217;<\/span><\/p>\n<p><span class=\"crayon-v\">dataset<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">read_csv<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span 
class=\"crayon-v\">header<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">None<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># create box and whisker plot<\/span><\/p>\n<p><span class=\"crayon-e\">boxplot<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">x<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">8<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">dataset<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># show plot<\/span><\/p>\n<p><span class=\"crayon-v\">pyplot<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0003 seconds] --><\/p>\n<p>Running the example first loads the diabetes dataset and creates a boxplot of the data, showing the distribution of the number of times pregnant for each of the two class labels.<\/p>\n<div id=\"attachment_10412\" class=\"wp-caption aligncenter\" readability=\"32\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-10412\" loading=\"lazy\" class=\"size-full wp-image-10412\" src=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Box-and-Whisker-Plot-of-Number-of-Times-Pregnant-Numerical-Variable-by-Class-Label.png\" alt=\"Box and Whisker Plot of Number of Times Pregnant Numerical Variable by Class Label\" width=\"1280\" height=\"960\" 
srcset=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Box-and-Whisker-Plot-of-Number-of-Times-Pregnant-Numerical-Variable-by-Class-Label.png 1280w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Box-and-Whisker-Plot-of-Number-of-Times-Pregnant-Numerical-Variable-by-Class-Label-300x225.png 300w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Box-and-Whisker-Plot-of-Number-of-Times-Pregnant-Numerical-Variable-by-Class-Label-1024x768.png 1024w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Box-and-Whisker-Plot-of-Number-of-Times-Pregnant-Numerical-Variable-by-Class-Label-768x576.png 768w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p id=\"caption-attachment-10412\" class=\"wp-caption-text\">Box and Whisker Plot of Number of Times Pregnant Numerical Variable by Class Label<\/p>\n<\/div>\n<h2>Scatter Plots<\/h2>\n<p>A scatter plot, or scatterplot, is generally used to summarize the relationship between two paired data samples.<\/p>\n<p>Paired data samples mean that two measures were recorded for a given observation, such as the weight and height of a person.<\/p>\n<p>The x-axis represents observation values for the first sample, and the y-axis represents the observation values for the second sample. Each point on the plot represents a single observation.<\/p>\n<p>A scatterplot can be created in Seaborn by calling the <a href=\"https:\/\/seaborn.pydata.org\/generated\/seaborn.scatterplot.html\">scatterplot() function<\/a> and passing the two numerical variables.<\/p>\n<p>We will demonstrate a scatterplot with two numerical variables from the <a href=\"https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/pima-indians-diabetes.csv\">diabetes classification dataset<\/a>. 
We will plot the first variable against the second: the first is the number of times that a patient was pregnant, and the second is the plasma glucose concentration after a two-hour oral glucose tolerance test (<a href=\"https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/pima-indians-diabetes.names\">more details of the variables here<\/a>).<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.12 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f3a32b12c969146033990\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\" readability=\"9\">\n<p><textarea wrap=\"soft\" class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n&#8230;<br \/>\n# create scatter plot<br \/>\nscatterplot(x=0, y=1, data=dataset)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\" readability=\"2\">\n<tr class=\"urvanov-syntax-highlighter-row\" readability=\"4\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\" readability=\"5\">\n<div class=\"crayon-pre\" readability=\"9\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># create scatter plot<\/span><\/p>\n<p><span class=\"crayon-e\">scatterplot<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">x<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-o\">=<\/span><span 
class=\"crayon-v\">dataset<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0001 seconds] --><\/p>\n<p>Tying this together, the complete example is listed below.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.12 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f3a32b12c96a019116690\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\" readability=\"13\">\n<p><textarea wrap=\"soft\" class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# scatter plot of two numerical variables<br \/>\nfrom pandas import read_csv<br \/>\nfrom seaborn import scatterplot<br \/>\nfrom matplotlib import pyplot<br \/>\n# load the dataset<br \/>\nurl = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/pima-indians-diabetes.csv'<br \/>\ndataset = read_csv(url, header=None)<br \/>\n# create scatter plot<br \/>\nscatterplot(x=0, y=1, data=dataset)<br \/>\n# show plot<br \/>\npyplot.show()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\" readability=\"4\">\n<tr class=\"urvanov-syntax-highlighter-row\" readability=\"8\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\" readability=\"11.5\">\n<div class=\"crayon-pre\" readability=\"22\">\n<p><span class=\"crayon-p\"># scatter plot of two numerical variables<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">pandas <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">read_csv<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">seaborn <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">scatterplot<\/span><\/p>\n<p><span class=\"crayon-e\">from 
<\/span><span class=\"crayon-e\">matplotlib <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-i\">pyplot<\/span><\/p>\n<p><span class=\"crayon-p\"># load the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">url<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/pima-indians-diabetes.csv'<\/span><\/p>\n<p><span class=\"crayon-v\">dataset<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">read_csv<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">header<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">None<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># create scatter plot<\/span><\/p>\n<p><span class=\"crayon-e\">scatterplot<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">x<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">dataset<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># show plot<\/span><\/p>\n<p><span class=\"crayon-v\">pyplot<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0003 seconds] --><\/p>\n<p>Running the example first loads the 
diabetes dataset and then creates a scatter plot of the first two input variables.<\/p>\n<p>The points appear fairly evenly spread across the plot, suggesting no strong relationship between the two variables.<\/p>\n<div id=\"attachment_10413\" class=\"wp-caption aligncenter\" readability=\"32\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-10413\" loading=\"lazy\" class=\"size-full wp-image-10413\" src=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Scatter-Plot-of-Number-of-Times-Pregnant-vs-Plasma-Glucose-Numerical-Variables.png\" alt=\"Scatter Plot of Number of Times Pregnant vs. Plasma Glucose Numerical Variables\" width=\"1280\" height=\"960\" srcset=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Scatter-Plot-of-Number-of-Times-Pregnant-vs-Plasma-Glucose-Numerical-Variables.png 1280w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Scatter-Plot-of-Number-of-Times-Pregnant-vs-Plasma-Glucose-Numerical-Variables-300x225.png 300w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Scatter-Plot-of-Number-of-Times-Pregnant-vs-Plasma-Glucose-Numerical-Variables-1024x768.png 1024w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Scatter-Plot-of-Number-of-Times-Pregnant-vs-Plasma-Glucose-Numerical-Variables-768x576.png 768w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p id=\"caption-attachment-10413\" class=\"wp-caption-text\">Scatter Plot of Number of Times Pregnant vs. 
Plasma Glucose Numerical Variables<\/p>\n<\/div>\n<p>We might also want to plot the relationship for the pair of numerical variables against the class label.<\/p>\n<p>This can be achieved using the scatterplot() function and specifying the class variable (column index 8) via the \u201c<em>hue<\/em>\u201d argument, as follows:<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.12 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f3a32b12c96b679890520\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\" readability=\"10\">\n<p><textarea wrap=\"soft\" class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n&#8230;<br \/>\n# create scatter plot<br \/>\nscatterplot(x=0, y=1, hue=8, data=dataset)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\" readability=\"2.5\">\n<tr class=\"urvanov-syntax-highlighter-row\" readability=\"5\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\" readability=\"5.5\">\n<div class=\"crayon-pre\" readability=\"10\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># create scatter plot<\/span><\/p>\n<p><span class=\"crayon-e\">scatterplot<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">x<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">hue<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">8<\/span><span class=\"crayon-sy\">,<\/span><span 
class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">dataset<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0001 seconds] --><\/p>\n<p>Tying this together, the complete example is listed below.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.12 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f3a32b12c96c771766316\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\" readability=\"14\">\n<p><textarea wrap=\"soft\" class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# scatter plot of two numerical variables vs class label<br \/>\nfrom pandas import read_csv<br \/>\nfrom seaborn import scatterplot<br \/>\nfrom matplotlib import pyplot<br \/>\n# load the dataset<br \/>\nurl = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/pima-indians-diabetes.csv'<br \/>\ndataset = read_csv(url, header=None)<br \/>\n# create scatter plot<br \/>\nscatterplot(x=0, y=1, hue=8, data=dataset)<br \/>\n# show plot<br \/>\npyplot.show()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\" readability=\"4.5\">\n<tr class=\"urvanov-syntax-highlighter-row\" readability=\"9\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\" readability=\"12\">\n<div class=\"crayon-pre\" readability=\"23\">\n<p><span class=\"crayon-p\"># scatter plot of two numerical variables vs class label<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">pandas <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">read_csv<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span 
class=\"crayon-e\">seaborn <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">scatterplot<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">matplotlib <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-i\">pyplot<\/span><\/p>\n<p><span class=\"crayon-p\"># load the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">url<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/pima-indians-diabetes.csv'<\/span><\/p>\n<p><span class=\"crayon-v\">dataset<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">read_csv<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">header<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">None<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># create scatter plot<\/span><\/p>\n<p><span class=\"crayon-e\">scatterplot<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">x<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">hue<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">8<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">dataset<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># show 
plot<\/span><\/p>\n<p><span class=\"crayon-v\">pyplot<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0003 seconds] --><\/p>\n<p>Running the example first loads the diabetes dataset and creates a scatter plot of the first two variables vs. class label.<\/p>\n<div id=\"attachment_10414\" class=\"wp-caption aligncenter\" readability=\"32\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-10414\" loading=\"lazy\" class=\"size-full wp-image-10414\" src=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Scatter-Plot-of-Number-of-Times-Pregnant-vs-Plasma-Glucose-Numerical-Variables-by-Class-Label.png\" alt=\"Scatter Plot of Number of Times Pregnant vs. Plasma Glucose Numerical Variables by Class Label\" width=\"1280\" height=\"960\" srcset=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Scatter-Plot-of-Number-of-Times-Pregnant-vs-Plasma-Glucose-Numerical-Variables-by-Class-Label.png 1280w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Scatter-Plot-of-Number-of-Times-Pregnant-vs-Plasma-Glucose-Numerical-Variables-by-Class-Label-300x225.png 300w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Scatter-Plot-of-Number-of-Times-Pregnant-vs-Plasma-Glucose-Numerical-Variables-by-Class-Label-1024x768.png 1024w, https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/02\/Scatter-Plot-of-Number-of-Times-Pregnant-vs-Plasma-Glucose-Numerical-Variables-by-Class-Label-768x576.png 768w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p id=\"caption-attachment-10414\" class=\"wp-caption-text\">Scatter Plot of Number of Times Pregnant vs. 
Plasma Glucose Numerical Variables by Class Label<\/p>\n<\/div>\n<h2>Further Reading<\/h2>\n<p>This section provides more resources on the topic if you are looking to go deeper.<\/p>\n<h3>Tutorials<\/h3>\n<h3>APIs<\/h3>\n<h2>Summary<\/h2>\n<p>In this tutorial, you discovered a gentle introduction to Seaborn data visualization for machine learning.<\/p>\n<p>Specifically, you learned:<\/p>\n<ul>\n<li>How to summarize the distribution of variables using bar charts, histograms, and box and whisker plots.<\/li>\n<li>How to summarize relationships using line plots and scatter plots.<\/li>\n<li>How to compare the distribution and relationships of variables for different class values on the same plot.<\/li>\n<\/ul>\n<p><strong>Do you have any questions?<\/strong><br \/>Ask your questions in the comments below and I will do my best to answer.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/machinelearningmastery.com\/seaborn-data-visualization-for-machine-learning\/<\/p>\n","protected":false},"author":1,"featured_media":98,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/97"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=97"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/97\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/98"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=97"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=97"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=97"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}