{"id":1079,"date":"2021-10-23T08:42:10","date_gmt":"2021-10-23T08:42:10","guid":{"rendered":"https:\/\/salarydistribution.com\/machine-learning\/2021\/10\/23\/principal-component-analysis-for-visualization\/"},"modified":"2021-10-23T08:42:10","modified_gmt":"2021-10-23T08:42:10","slug":"principal-component-analysis-for-visualization","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2021\/10\/23\/principal-component-analysis-for-visualization\/","title":{"rendered":"Principal Component Analysis for Visualization"},"content":{"rendered":"<div id=\"\">\n<p id=\"last-modified-info\">Last Updated on October 20, 2021<\/p>\n<p>Principal component analysis (PCA) is an unsupervised machine learning technique. Perhaps the most popular use of principal component analysis is dimensionality reduction. Besides using PCA as a data preparation technique, we can also use it to help visualize data. A picture is worth a thousand words. With the data visualized, it is easier for us to get some insight and decide on the next step in our machine learning models.<\/p>\n<p>In this tutorial, you will discover how to visualize data using PCA, as well as how to use visualization to help determine the parameter for dimensionality reduction.<\/p>\n<p>After completing this tutorial, you will know:<\/p>\n<ul>\n<li>How to visualize high dimensional data<\/li>\n<li>What explained variance is in PCA<\/li>\n<li>How to visually observe the explained variance from the result of PCA on high dimensional data<\/li>\n<\/ul>\n<p>Let\u2019s get started.<\/p>\n<div id=\"attachment_12962\" class=\"wp-caption aligncenter\"><img decoding=\"async\" aria-describedby=\"caption-attachment-12962\" loading=\"lazy\" 
class=\"aligncenter size-full wp-image-12998\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/10\/6099561804_5207dc5fe3_o.jpg\" alt=\"\" width=\"900\" height=\"600\"><\/p>\n<p id=\"caption-attachment-12962\" class=\"wp-caption-text\">Principal Component Analysis for Visualization<br \/>Photo by <a href=\"https:\/\/www.flickr.com\/photos\/vampa__\/6099561804\/\">Levan Gokadze<\/a>, some rights reserved.<\/p>\n<\/div>\n<h2><b>Tutorial Overview<\/b><\/h2>\n<p>This tutorial is divided into two parts; they are:<\/p>\n<ul>\n<li>Scatter plot of high dimensional data<\/li>\n<li>Visualizing the explained variance<\/li>\n<\/ul>\n<h2><b>Prerequisites<\/b><\/h2>\n<p>For this tutorial, we assume that you are already familiar with:<\/p>\n<h2><b>Scatter plot of high dimensional data<\/b><\/h2>\n<p>Visualization is a crucial step in gaining insight from data. From a visualization, we can learn whether a pattern can be observed and hence estimate which machine learning model is suitable.<\/p>\n<p>It is easy to depict things in two dimensions. Normally, a scatter plot with x- and y-axes is two-dimensional. Depicting things in three dimensions is a bit challenging but not impossible. Matplotlib, for example, can plot in 3D. The only problem is that on paper or on screen, we can only look at a 3D plot from one viewpoint or projection at a time. In matplotlib, this is controlled by the degrees of elevation and azimuth. Depicting things in four or five dimensions is impossible because we live in a three-dimensional world and have no idea what things in such high dimensions would look like.<\/p>\n<p>This is where a dimensionality reduction technique such as PCA comes into play. We can reduce the dimension to two or three so we can visualize it. 
Let\u2019s start with an example.<\/p>\n<p>We start with the <a href=\"https:\/\/scikit-learn.org\/stable\/datasets\/toy_dataset.html#wine-dataset\">wine dataset<\/a>, which is a classification dataset with 13 features and 3 classes. There are 178 samples:<\/p>\n<div id=\"urvanov-syntax-highlighter-617372c1ccf81867125857\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\nfrom sklearn.datasets import load_wine<br \/>\nwinedata = load_wine()<br \/>\nX, y = winedata['data'], winedata['target']<br \/>\nprint(X.shape)<br \/>\nprint(y.shape)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">datasets <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">load_wine<\/span><\/p>\n<p><span class=\"crayon-v\">winedata<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">load_wine<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">winedata<\/span><span class=\"crayon-sy\">[<\/span><span 
class=\"crayon-s\">&#8216;data&#8217;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">winedata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8216;target&#8217;<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p>Among the 13 features, we can pick any two and plot with matplotlib (we color-coded the different classes using the <code>c<\/code> argument):<\/p>\n<div id=\"urvanov-syntax-highlighter-617372c1ccf88090780337\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n&#8230;<br \/>\nimport matplotlib.pyplot as plt<br \/>\nplt.scatter(X[:,1], X[:,2], c=y)<br \/>\nplt.show()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-e\">import <\/span><span class=\"crayon-v\">matplotlib<\/span><span 
class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">pyplot <\/span><span class=\"crayon-st\">as<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">plt<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">scatter<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">2<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">c<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-12989\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/10\/output_5_0.png\" alt=\"\" width=\"490\" height=\"357\"><\/p>\n<p>Or we can pick any three and show them in 3D:<\/p>\n<div id=\"urvanov-syntax-highlighter-617372c1ccf89849908196\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize 
scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n...<br \/>\nfig = plt.figure()<br \/>\nax = fig.add_subplot(projection='3d')<br \/>\nax.scatter(X[:,1], X[:,2], X[:,3], c=y)<br \/>\nplt.show()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-v\">fig<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">figure<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">ax<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">fig<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">add_subplot<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">projection<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">'3d'<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">ax<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">scatter<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">2<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span 
class=\"crayon-cn\">3<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">c<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-12990\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/10\/output_7_0.png\" alt=\"\" width=\"458\" height=\"449\"><\/p>\n<p>But these plots don\u2019t reveal much of what the data looks like, because the majority of the features are not shown. 
We now turn to principal component analysis:<\/p>\n<div id=\"urvanov-syntax-highlighter-617372c1ccf8a357910955\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n...<br \/>\nfrom sklearn.decomposition import PCA<br \/>\npca = PCA()<br \/>\nXt = pca.fit_transform(X)<br \/>\nplot = plt.scatter(Xt[:,0], Xt[:,1], c=y)<br \/>\nplt.legend(handles=plot.legend_elements()[0], labels=list(winedata['target_names']))<br \/>\nplt.show()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">decomposition <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">PCA<\/span><\/p>\n<p><span class=\"crayon-v\">pca<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">PCA<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">Xt<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pca<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">fit_transform<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span 
class=\"crayon-v\">plot<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">scatter<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">Xt<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">Xt<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">c<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">legend<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">handles<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">plot<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">legend_elements<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">labels<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-e\">list<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">winedata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8216;target_names&#8217;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span 
class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-12991\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/10\/output_9_0.png\" alt=\"\" width=\"500\" height=\"357\"><\/p>\n<p>Here we transform the input data <code>X<\/code> by PCA into <code>Xt<\/code>. We consider only the first two columns, which contain the most information, and plot them in two dimensions. We can see that the purple class is quite distinctive, but there is still some overlap. If we scale the data before PCA, however, the result is different:<\/p>\n<div id=\"urvanov-syntax-highlighter-617372c1ccf8b958931887\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n...<br \/>\nfrom sklearn.preprocessing import StandardScaler<br \/>\nfrom sklearn.pipeline import Pipeline<br \/>\npca = PCA()<br \/>\npipe = Pipeline([('scaler', StandardScaler()), ('pca', pca)])<br \/>\nXt = pipe.fit_transform(X)<br \/>\nplot = plt.scatter(Xt[:,0], Xt[:,1], c=y)<br \/>\nplt.legend(handles=plot.legend_elements()[0], labels=list(winedata['target_names']))<br \/>\nplt.show()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td 
class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">preprocessing <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">StandardScaler<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">pipeline <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">Pipeline<\/span><\/p>\n<p><span class=\"crayon-v\">pca<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">PCA<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">pipe<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">Pipeline<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8216;scaler&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">StandardScaler<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8216;pca&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pca<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">Xt<\/span><span 
class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pipe<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">fit_transform<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plot<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">scatter<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">Xt<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">Xt<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">c<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">legend<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">handles<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">plot<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">legend_elements<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">labels<\/span><span class=\"crayon-o\">=<\/span><span 
class=\"crayon-e\">list<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">winedata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">'target_names'<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-12992\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/10\/output_11_0.png\" alt=\"\" width=\"482\" height=\"357\"><\/p>\n<p>Because PCA is sensitive to the scale of the features, we get a better result if we standardize each feature with <code>StandardScaler<\/code>. Here the different classes are more distinctive. 
By looking at this plot, we can be fairly confident that a simple model such as an SVM can classify this dataset with high accuracy.<\/p>\n<p>Putting these together, the following is the complete code to generate the visualizations:<\/p>\n<div id=\"urvanov-syntax-highlighter-617372c1ccf8c991466943\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\nfrom sklearn.datasets import load_wine<br \/>\nfrom sklearn.decomposition import PCA<br \/>\nfrom sklearn.preprocessing import StandardScaler<br \/>\nfrom sklearn.pipeline import Pipeline<br \/>\nimport matplotlib.pyplot as plt<\/p>\n<p># Load dataset<br \/>\nwinedata = load_wine()<br \/>\nX, y = winedata['data'], winedata['target']<br \/>\nprint(\"X shape:\", X.shape)<br \/>\nprint(\"y shape:\", y.shape)<\/p>\n<p># Show any two features<br \/>\nplt.figure(figsize=(8,6))<br \/>\nplt.scatter(X[:,1], X[:,2], c=y)<br \/>\nplt.xlabel(winedata[\"feature_names\"][1])<br \/>\nplt.ylabel(winedata[\"feature_names\"][2])<br \/>\nplt.title(\"Two particular features of the wine dataset\")<br \/>\nplt.show()<\/p>\n<p># Show any three features<br \/>\nfig = plt.figure(figsize=(10,8))<br \/>\nax = fig.add_subplot(projection='3d')<br \/>\nax.scatter(X[:,1], X[:,2], X[:,3], c=y)<br \/>\nax.set_xlabel(winedata[\"feature_names\"][1])<br \/>\nax.set_ylabel(winedata[\"feature_names\"][2])<br \/>\nax.set_zlabel(winedata[\"feature_names\"][3])<br \/>\nax.set_title(\"Three particular features of the wine dataset\")<br \/>\nplt.show()<\/p>\n<p># Show first two principal components without scaler<br \/>\npca = PCA()<br \/>\nplt.figure(figsize=(8,6))<br \/>\nXt = 
pca.fit_transform(X)<br \/>\nplot = plt.scatter(Xt[:,0], Xt[:,1], c=y)<br \/>\nplt.legend(handles=plot.legend_elements()[0], labels=list(winedata['target_names']))<br \/>\nplt.xlabel(\"PC1\")<br \/>\nplt.ylabel(\"PC2\")<br \/>\nplt.title(\"First two principal components\")<br \/>\nplt.show()<\/p>\n<p># Show first two principal components with scaler<br \/>\npca = PCA()<br \/>\npipe = Pipeline([('scaler', StandardScaler()), ('pca', pca)])<br \/>\nplt.figure(figsize=(8,6))<br \/>\nXt = pipe.fit_transform(X)<br \/>\nplot = plt.scatter(Xt[:,0], Xt[:,1], c=y)<br \/>\nplt.legend(handles=plot.legend_elements()[0], labels=list(winedata['target_names']))<br \/>\nplt.xlabel(\"PC1\")<br \/>\nplt.ylabel(\"PC2\")<br \/>\nplt.title(\"First two principal components after scaling\")<br \/>\nplt.show()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span 
class=\"crayon-e\">datasets <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">load_wine<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">decomposition <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">PCA<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">preprocessing <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">StandardScaler<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">pipeline <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">Pipeline<\/span><\/p>\n<p><span class=\"crayon-e\">import <\/span><span class=\"crayon-v\">matplotlib<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">pyplot <\/span><span class=\"crayon-st\">as<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">plt<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Load dataset<\/span><\/p>\n<p><span class=\"crayon-v\">winedata<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">load_wine<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">winedata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8216;data&#8217;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span 
class=\"crayon-v\">winedata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8216;target&#8217;<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;X shape:&#8221;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;y shape:&#8221;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Show any two features<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">figure<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">figsize<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">8<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">6<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">scatter<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">2<\/span><span 
class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">c<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">xlabel<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">winedata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8220;feature_names&#8221;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">ylabel<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">winedata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8220;feature_names&#8221;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">2<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">title<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;Two particular features of the wine dataset&#8221;<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Show any three features<\/span><\/p>\n<p><span class=\"crayon-v\">fig<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">figure<\/span><span 
class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">figsize<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">10<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">8<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">ax<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">fig<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">add_subplot<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">projection<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8216;3d&#8217;<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">ax<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">scatter<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">2<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">3<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">c<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">ax<\/span><span class=\"crayon-sy\">.<\/span><span 
class=\"crayon-e\">set_xlabel<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">winedata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8220;feature_names&#8221;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">ax<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">set_ylabel<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">winedata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8220;feature_names&#8221;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">2<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">ax<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">set_zlabel<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">winedata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8220;feature_names&#8221;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">3<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">ax<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">set_title<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;Three particular features of the wine dataset&#8221;<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Show first two principal components without scaler<\/span><\/p>\n<p><span class=\"crayon-v\">pca<\/span><span class=\"crayon-h\"> <\/span><span 
class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">PCA<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">figure<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">figsize<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">8<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">6<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">Xt<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pca<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">fit_transform<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plot<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">scatter<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">Xt<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">Xt<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">c<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span 
class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">legend<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">handles<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">plot<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">legend_elements<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">labels<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-e\">list<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">winedata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8216;target_names&#8217;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">xlabel<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;PC1&#8221;<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">ylabel<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;PC2&#8221;<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">title<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;First two principal components&#8221;<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Show first two principal components with 
scaler<\/span><\/p>\n<p><span class=\"crayon-v\">pca<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">PCA<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">pipe<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">Pipeline<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8216;scaler&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">StandardScaler<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8216;pca&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pca<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">figure<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">figsize<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">8<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">6<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">Xt<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pipe<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">fit_transform<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span 
class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plot<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">scatter<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">Xt<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">Xt<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">c<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">legend<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">handles<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">plot<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">legend_elements<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">labels<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-e\">list<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">winedata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8216;target_names&#8217;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><span 
class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">xlabel<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;PC1&#8221;<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">ylabel<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;PC2&#8221;<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">title<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;First two principal components after scaling&#8221;<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p>If we apply the same method to a different dataset, such as the handwritten digits dataset from scikit-learn (a small, MNIST-like dataset), the scatter plot does not show distinct boundaries between the classes, which suggests that a more complicated model, such as a neural network, is needed to classify them:<\/p>\n<div id=\"urvanov-syntax-highlighter-617372c1ccf8d930417812\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\nfrom sklearn.datasets import load_digits<br \/>\nfrom sklearn.decomposition import PCA<br \/>\nfrom sklearn.preprocessing import StandardScaler<br \/>\nfrom sklearn.pipeline import Pipeline<br \/>\nimport matplotlib.pyplot as plt<\/p>\n<p>digitsdata = load_digits()<br \/>\nX, y = digitsdata[&#8216;data&#8217;],
digitsdata[&#8216;target&#8217;]<br \/>\npca = PCA()<br \/>\npipe = Pipeline([(&#8216;scaler&#8217;, StandardScaler()), (&#8216;pca&#8217;, pca)])<br \/>\nplt.figure(figsize=(8,6))<br \/>\nXt = pipe.fit_transform(X)<br \/>\nplot = plt.scatter(Xt[:,0], Xt[:,1], c=y)<br \/>\nplt.legend(handles=plot.legend_elements()[0], labels=list(digitsdata[&#8216;target_names&#8217;]))<br \/>\nplt.show()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">datasets <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">load_digits<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">decomposition <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">PCA<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">preprocessing <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">StandardScaler<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">pipeline <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">Pipeline<\/span><\/p>\n<p><span class=\"crayon-e\">import <\/span><span class=\"crayon-v\">matplotlib<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">pyplot <\/span><span class=\"crayon-st\">as<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">plt<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span 
class=\"crayon-v\">digitsdata<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">load_digits<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">digitsdata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8216;data&#8217;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">digitsdata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8216;target&#8217;<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-v\">pca<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">PCA<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">pipe<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">Pipeline<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8216;scaler&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">StandardScaler<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8216;pca&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pca<\/span><span 
class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">figure<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">figsize<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">8<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">6<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">Xt<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pipe<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">fit_transform<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plot<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">scatter<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">Xt<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">Xt<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">c<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span 
class=\"crayon-e\">legend<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">handles<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">plot<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">legend_elements<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">labels<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-e\">list<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">digitsdata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8216;target_names&#8217;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p><img loading=\"lazy\" class=\"aligncenter wp-image-12993\" data-cfsrc=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/10\/output_14_0.png\" alt=\"\" width=\"492\" height=\"357\"><\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-12993\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/10\/output_14_0.png\" alt=\"\" width=\"492\" height=\"357\"><\/p>\n<h2><b>Visualizing the explained variance<\/b><\/h2>\n<p>In essence, PCA re-expresses the features as linear combinations of the original ones, which is why it is called a feature extraction technique. One characteristic of PCA is that the first principal component holds the most information about the dataset.
The second principal component is more informative than the third, and so on.<\/p>\n<p>To illustrate this idea, we can remove the principal components from the original dataset in steps and see what the dataset looks like at each step. Let\u2019s consider a dataset with fewer features, and show two features in a plot:<\/p>\n<div id=\"urvanov-syntax-highlighter-617372c1ccf8e864654087\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\nfrom sklearn.datasets import load_iris<br \/>\nirisdata = load_iris()<br \/>\nX, y = irisdata['data'], irisdata['target']<br \/>\nplt.figure(figsize=(8,6))<br \/>\nplt.scatter(X[:,0], X[:,1], c=y)<br \/>\nplt.show()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">datasets <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">load_iris<\/span><\/p>\n<p><span class=\"crayon-v\">irisdata<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">load_iris<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span
class=\"crayon-v\">irisdata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8216;data&#8217;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">irisdata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8216;target&#8217;<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">figure<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">figsize<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">8<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">6<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">scatter<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">c<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p><img loading=\"lazy\" class=\"aligncenter wp-image-12994\" 
data-cfsrc=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/10\/output_30_0.png\" alt=\"\" width=\"483\" height=\"359\"><\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-12994\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/10\/output_30_0.png\" alt=\"\" width=\"483\" height=\"359\"><\/p>\n<p>This is the <a href=\"https:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.datasets.load_iris.html\">iris dataset<\/a>, which has only four features. The features are on comparable scales, so we can skip the scaler. With four features, PCA can produce at most four principal components:<\/p>\n<div id=\"urvanov-syntax-highlighter-617372c1ccf8f582587089\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n&#8230;<br \/>\npca = PCA().fit(X)<br \/>\nprint(pca.components_)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-v\">pca<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">PCA<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">fit<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span
class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">pca<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">components_<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<div id=\"urvanov-syntax-highlighter-617372c1ccf90683044803\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n[[ 0.36138659 -0.08452251  0.85667061  0.3582892 ]<br \/>\n [ 0.65658877  0.73016143 -0.17337266 -0.07548102]<br \/>\n [-0.58202985  0.59791083  0.07623608  0.54583143]<br \/>\n [-0.31548719  0.3197231   0.47983899 -0.75365743]]<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>[[ 0.36138659 -0.08452251\u00a0\u00a00.85667061\u00a0\u00a00.3582892 ]<\/p>\n<p> [ 0.65658877\u00a0\u00a00.73016143 -0.17337266 -0.07548102]<\/p>\n<p> [-0.58202985\u00a0\u00a00.59791083\u00a0\u00a00.07623608\u00a0\u00a00.54583143]<\/p>\n<p> [-0.31548719\u00a0\u00a00.3197231\u00a0\u00a0 0.47983899 -0.75365743]]<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p>For example, the first row is the first principal axis, along which the first principal component is computed. For any data point $p$ with features $p=(a,b,c,d)$, since the principal axis is denoted by the vector $v=(0.36,-0.08,0.86,0.36)$, the first principal component of this data point has the value $0.36 \\times a - 0.08 \\times b + 0.86 \\times c + 0.36 \\times d$ on the principal axis.
Using the vector dot product, this value can be denoted by<br \/>$$<br \/>p \\cdot v<br \/>$$<br \/>Therefore, with the dataset $X$ as a $150 \\times 4$ matrix (150 data points, each with 4 features), we can map each data point to its value on this principal axis by matrix-vector multiplication:<br \/>$$<br \/>X \\times v<br \/>$$<br \/>and the result is a vector of length 150. Now if we remove from each data point its corresponding value along the principal axis vector, that would be<br \/>$$<br \/>X - (X \\times v) \\times v^T<br \/>$$<br \/>where the transposed vector $v^T$ is a row and $X \\times v$ is a column. The product $(X \\times v) \\times v^T$ follows matrix-matrix multiplication and the result is a $150 \\times 4$ matrix, the same dimension as $X$.<\/p>\n<p>If we plot the first two features of $X - (X \\times v) \\times v^T$, it looks like this:<\/p>\n<div id=\"urvanov-syntax-highlighter-617372c1ccf91687830566\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n&#8230;<br \/>\n# Remove PC1<br \/>\nXmean = X - X.mean(axis=0)<br \/>\nvalue = Xmean @ pca.components_[0]<br \/>\npc1 = value.reshape(-1,1) @ pca.components_[0].reshape(1,-1)<br \/>\nXremove = X - pc1<br \/>\nplt.scatter(Xremove[:,0], Xremove[:,1], c=y)<br \/>\nplt.show()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># Remove PC1<\/span><\/p>\n<p><span
class=\"crayon-v\">Xmean<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">mean<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">axis<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">value<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">Xmean<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">@<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pca<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">components_<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-v\">pc1<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">value<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">reshape<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">@<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pca<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">components_<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">reshape<\/span><span class=\"crayon-sy\">(<\/span><span 
class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">Xremove<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">pc1<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">scatter<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">Xremove<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">Xremove<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">c<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p><img loading=\"lazy\" class=\"aligncenter wp-image-12995\" data-cfsrc=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/10\/output_34_0.png\" alt=\"\" width=\"483\" height=\"357\"><\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-12995\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/10\/output_34_0.png\" 
alt=\"\" width=\"483\" height=\"357\"><\/p>\n<p>The NumPy array <code>Xmean<\/code> holds the features of <code>X<\/code> shifted so that they are centered at zero, which PCA requires. The array <code>value<\/code> is then computed by matrix-vector multiplication.<br \/>The array <code>value<\/code> is the signed length of each data point's projection onto the principal axis. If we multiply this value by the principal axis vector, we get the array <code>pc1<\/code>. Removing this from the original dataset <code>X<\/code>, we get a new array <code>Xremove<\/code>. In the plot, we observe that the points on the scatter plot are squeezed together and the cluster of each class is less distinct than before. This means we removed a lot of information by removing the first principal component. If we repeat the same process, the points are squeezed together further:<\/p>\n<div id=\"urvanov-syntax-highlighter-617372c1ccf92580313136\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n&#8230;<br \/>\n# Remove PC2<br \/>\nvalue = Xmean @ pca.components_[1]<br \/>\npc2 = value.reshape(-1,1) @ pca.components_[1].reshape(1,-1)<br \/>\nXremove = Xremove - pc2<br \/>\nplt.scatter(Xremove[:,0], Xremove[:,1], c=y)<br \/>\nplt.show()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># Remove PC2<\/span><\/p>\n<p><span class=\"crayon-v\">value<\/span><span 
class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">Xmean<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">@<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pca<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">components_<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-v\">pc2<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">value<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">reshape<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">@<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pca<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">components_<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">reshape<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">Xremove<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">Xremove<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">pc2<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span 
class=\"crayon-e\">scatter<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">Xremove<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">Xremove<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">c<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p><img loading=\"lazy\" class=\"aligncenter wp-image-12999\" data-cfsrc=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/10\/output_36_0-1.png\" alt=\"\" width=\"483\" height=\"357\"><\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-12999\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/10\/output_36_0-1.png\" alt=\"\" width=\"483\" height=\"357\"><\/p>\n<p>This looks like a straight line, but it is not quite one. 
If we repeat the process once more, all points collapse onto a straight line:<\/p>\n<div id=\"urvanov-syntax-highlighter-617372c1ccf93629822669\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n&#8230;<br \/>\n# Remove PC3<br \/>\nvalue = Xmean @ pca.components_[2]<br \/>\npc3 = value.reshape(-1,1) @ pca.components_[2].reshape(1,-1)<br \/>\nXremove = Xremove - pc3<br \/>\nplt.scatter(Xremove[:,0], Xremove[:,1], c=y)<br \/>\nplt.show()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># Remove PC3<\/span><\/p>\n<p><span class=\"crayon-v\">value<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">Xmean<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">@<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pca<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">components_<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">2<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-v\">pc3<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">value<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">reshape<\/span><span class=\"crayon-sy\">(<\/span><span 
class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">@<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pca<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">components_<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">2<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">reshape<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">Xremove<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">Xremove<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">pc3<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">scatter<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">Xremove<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">Xremove<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">c<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span 
class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p><img loading=\"lazy\" class=\"aligncenter wp-image-13000\" data-cfsrc=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/10\/output_38_0-1.png\" alt=\"\" width=\"490\" height=\"357\"><\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-13000\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/10\/output_38_0-1.png\" alt=\"\" width=\"490\" height=\"357\"><\/p>\n<p>The points all fall on a straight line because we removed three principal components from data that has only four features. Hence our data matrix, once centered, becomes <b>rank 1<\/b>. You can try repeating this process once more, and the result would be that all points collapse into a single point. The amount of information removed in each step, as we remove the principal components, is given by the corresponding <b>explained variance ratio<\/b> from the PCA:<\/p>\n<div id=\"urvanov-syntax-highlighter-617372c1ccf94928162225\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n&#8230;<br \/>\nprint(pca.explained_variance_ratio_)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span 
class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">pca<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">explained_variance_ratio_<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<div id=\"urvanov-syntax-highlighter-617372c1ccf95062283207\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n[0.92461872 0.05306648 0.01710261 0.00521218]<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>[0.92461872 0.05306648 0.01710261 0.00521218]<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p>Here we can see that the first component explains 92.5% of the variance and the second component explains 5.3%. If we remove the first two principal components, the remaining variance is only 2.2%; hence, visually, the plot after removing two components looks like a straight line. In fact, when we check the plots above, not only are the points squeezed together, but the ranges of the x- and y-axes also become smaller as we remove the components.<\/p>\n<p>In terms of machine learning, we can consider using only a single feature for classification in this dataset, namely the first principal component. 
Because it explains over 90% of the variance, we can hope to retain most of the accuracy obtained with the full set of features, although explained variance alone does not guarantee any particular accuracy:<\/p>\n<div id=\"urvanov-syntax-highlighter-617372c1ccf96908119252\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n&#8230;<br \/>\nfrom sklearn.model_selection import train_test_split<br \/>\nfrom sklearn.metrics import f1_score<br \/>\nfrom collections import Counter<\/p>\n<p>X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)<br \/>\nfrom sklearn.svm import SVC<br \/>\nclf = SVC(kernel=\"linear\", gamma='auto').fit(X_train, y_train)<br \/>\nprint(\"Using all features, accuracy: \", clf.score(X_test, y_test))<br \/>\nprint(\"Using all features, F1: \", f1_score(y_test, clf.predict(X_test), average=\"macro\"))<\/p>\n<p>mean = X_train.mean(axis=0)<br \/>\nX_train2 = X_train - mean<br \/>\nX_train2 = (X_train2 @ pca.components_[0]).reshape(-1,1)<br \/>\nclf = SVC(kernel=\"linear\", gamma='auto').fit(X_train2, y_train)<br \/>\nX_test2 = X_test - mean<br \/>\nX_test2 = (X_test2 @ pca.components_[0]).reshape(-1,1)<br \/>\nprint(\"Using PC1, accuracy: \", clf.score(X_test2, y_test))<br \/>\nprint(\"Using PC1, F1: \", f1_score(y_test, clf.predict(X_test2), average=\"macro\"))<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<div 
class=\"urvanov-syntax-highlighter-nums-content\">\n<\/div>\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">model_selection <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">train_test_split<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">metrics <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">f1_score<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">collections <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">Counter<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-v\">X_train<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_train<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">train_test_split<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">test_size<\/span><span 
class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">0.33<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">svm <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">SVC<\/span><\/p>\n<p><span class=\"crayon-v\">clf<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">SVC<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">kernel<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">\"linear\"<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">gamma<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">'auto'<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">fit<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_train<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_train<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">\"Using all features, accuracy: \"<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">clf<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">score<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">\"Using all features, F1: \"<\/span><span class=\"crayon-sy\">,<\/span><span 
class=\"crayon-h\"> <\/span><span class=\"crayon-e\">f1_score<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">clf<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">predict<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">average<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">\"macro\"<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-v\">mean<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X_train<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">mean<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">axis<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">X_train2<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X_train<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">mean<\/span><\/p>\n<p><span class=\"crayon-v\">X_train2<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">_<\/span>train2<span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">@<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pca<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">components_<\/span><span 
class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">reshape<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">clf<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">SVC<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">kernel<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">\"linear\"<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">gamma<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">'auto'<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">fit<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_train2<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_train<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">X_test2<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">mean<\/span><\/p>\n<p><span class=\"crayon-v\">X_test2<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">_<\/span>test2<span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">@<\/span><span class=\"crayon-h\"> <\/span><span 
class=\"crayon-v\">pca<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">components_<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">reshape<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">\"Using PC1, accuracy: \"<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">clf<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">score<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_test2<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">\"Using PC1, F1: \"<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">f1_score<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">clf<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">predict<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_test2<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">average<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">\"macro\"<\/span><span class=\"crayon-sy\">)<\/span><span 
class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<div id=\"urvanov-syntax-highlighter-617372c1ccf97922343930\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\nUsing all features, accuracy:  1.0<br \/>\nUsing all features, F1:  1.0<br \/>\nUsing PC1, accuracy:  0.96<br \/>\nUsing PC1, F1:  0.9645191409897292<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>Using all features, accuracy:\u00a0\u00a01.0<\/p>\n<p>Using all features, F1:\u00a0\u00a01.0<\/p>\n<p>Using PC1, accuracy:\u00a0\u00a00.96<\/p>\n<p>Using PC1, F1:\u00a0\u00a00.9645191409897292<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p>Another use of the explained variance is compression. Given that the explained variance of the first principal component is large, if we need to store the dataset, we can store only the projected values on the first principal axis ($X \\times v$), as well as the vector $v$ of the principal axis. Then we can approximately reproduce the original dataset by multiplying them (and, if the data was centered before projecting, adding the feature means back):<br \/>$$<br \/>X \\approx (X \\times v) \\times v^T<br \/>$$<br \/>In this way, we need storage for only one value per data point instead of four values for four features. 
The approximation is more accurate if we store the projected values on multiple principal axes and add up multiple principal components.<\/p>\n<p>Putting these together, the following is the complete code to generate the visualizations:<\/p>\n<div id=\"urvanov-syntax-highlighter-617372c1ccf98877937920\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\nfrom sklearn.datasets import load_iris<br \/>\nfrom sklearn.model_selection import train_test_split<br \/>\nfrom sklearn.decomposition import PCA<br \/>\nfrom sklearn.metrics import f1_score<br \/>\nfrom sklearn.svm import SVC<br \/>\nimport matplotlib.pyplot as plt<\/p>\n<p># Load iris dataset<br \/>\nirisdata = load_iris()<br \/>\nX, y = irisdata['data'], irisdata['target']<br \/>\nplt.figure(figsize=(8,6))<br \/>\nplt.scatter(X[:,0], X[:,1], c=y)<br \/>\nplt.xlabel(irisdata[\"feature_names\"][0])<br \/>\nplt.ylabel(irisdata[\"feature_names\"][1])<br \/>\nplt.title(\"Two features from the iris dataset\")<br \/>\nplt.show()<\/p>\n<p># Show the principal components<br \/>\npca = PCA().fit(X)<br \/>\nprint(\"Principal components:\")<br \/>\nprint(pca.components_)<\/p>\n<p># Remove PC1<br \/>\nXmean = X - X.mean(axis=0)<br \/>\nvalue = Xmean @ pca.components_[0]<br \/>\npc1 = value.reshape(-1,1) @ pca.components_[0].reshape(1,-1)<br \/>\nXremove = X - pc1<br \/>\nplt.figure(figsize=(8,6))<br \/>\nplt.scatter(Xremove[:,0], Xremove[:,1], c=y)<br \/>\nplt.xlabel(irisdata[\"feature_names\"][0])<br \/>\nplt.ylabel(irisdata[\"feature_names\"][1])<br \/>\nplt.title(\"Two features from the iris dataset after removing PC1\")<br 
\/>\nplt.show()<\/p>\n<p># Remove PC2<br \/>\nXmean = X - X.mean(axis=0)<br \/>\nvalue = Xmean @ pca.components_[1]<br \/>\npc2 = value.reshape(-1,1) @ pca.components_[1].reshape(1,-1)<br \/>\nXremove = Xremove - pc2<br \/>\nplt.figure(figsize=(8,6))<br \/>\nplt.scatter(Xremove[:,0], Xremove[:,1], c=y)<br \/>\nplt.xlabel(irisdata[\"feature_names\"][0])<br \/>\nplt.ylabel(irisdata[\"feature_names\"][1])<br \/>\nplt.title(\"Two features from the iris dataset after removing PC1 and PC2\")<br \/>\nplt.show()<\/p>\n<p># Remove PC3<br \/>\nXmean = X - X.mean(axis=0)<br \/>\nvalue = Xmean @ pca.components_[2]<br \/>\npc3 = value.reshape(-1,1) @ pca.components_[2].reshape(1,-1)<br \/>\nXremove = Xremove - pc3<br \/>\nplt.figure(figsize=(8,6))<br \/>\nplt.scatter(Xremove[:,0], Xremove[:,1], c=y)<br \/>\nplt.xlabel(irisdata[\"feature_names\"][0])<br \/>\nplt.ylabel(irisdata[\"feature_names\"][1])<br \/>\nplt.title(\"Two features from the iris dataset after removing PC1 to PC3\")<br \/>\nplt.show()<\/p>\n<p># Print the explained variance ratio<br \/>\nprint(\"Explained variance ratios:\")<br \/>\nprint(pca.explained_variance_ratio_)<\/p>\n<p># Split data<br \/>\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)<\/p>\n<p># Run classifier on all features<br \/>\nclf = SVC(kernel=\"linear\", gamma='auto').fit(X_train, y_train)<br \/>\nprint(\"Using all features, accuracy: \", clf.score(X_test, y_test))<br \/>\nprint(\"Using all features, F1: \", f1_score(y_test, clf.predict(X_test), average=\"macro\"))<\/p>\n<p># Run classifier on PC1<br \/>\nmean = X_train.mean(axis=0)<br \/>\nX_train2 = X_train - mean<br \/>\nX_train2 = (X_train2 @ pca.components_[0]).reshape(-1,1)<br \/>\nclf = SVC(kernel=\"linear\", gamma='auto').fit(X_train2, y_train)<br \/>\nX_test2 = X_test - mean<br 
\/>\nX_test2 = (X_test2 @ pca.components_[0]).reshape(-1,1)<br \/>\nprint(\"Using PC1, accuracy: \", clf.score(X_test2, y_test))<br \/>\nprint(\"Using PC1, F1: \", f1_score(y_test, clf.predict(X_test2), average=\"macro\"))<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<div class=\"urvanov-syntax-highlighter-nums-content\">\n<\/div>\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">datasets <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">load_iris<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">model_selection <\/span><span class=\"crayon-e\">import <\/span><span 
class=\"crayon-e\">train_test_split<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">decomposition <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">PCA<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">metrics <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">f1_score<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">svm <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">SVC<\/span><\/p>\n<p><span class=\"crayon-e\">import <\/span><span class=\"crayon-v\">matplotlib<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">pyplot <\/span><span class=\"crayon-st\">as<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">plt<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Load iris dataset<\/span><\/p>\n<p><span class=\"crayon-v\">irisdata<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">load_iris<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">irisdata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">'data'<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">irisdata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">'target'<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">figure<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">figsize<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">8<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">6<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">scatter<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">c<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">xlabel<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">irisdata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">\"feature_names\"<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">ylabel<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">irisdata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">\"feature_names\"<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">title<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">\"Two features from the iris dataset\"<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Show the principal components<\/span><\/p>\n<p><span class=\"crayon-v\">pca<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">PCA<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">fit<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">\"Principal components:\"<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">pca<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">components_<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Remove PC1<\/span><\/p>\n<p><span class=\"crayon-v\">Xmean<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-h\"> <\/span><span 
class=\"crayon-o\">-<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">mean<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">axis<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">value<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">Xmean<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">@<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pca<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">components_<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-v\">pc1<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">value<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">reshape<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">@<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pca<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">components_<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">reshape<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">Xremove<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">pc1<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">figure<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">figsize<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">8<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">6<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">scatter<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">Xremove<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">Xremove<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">c<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">xlabel<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">irisdata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">\"feature_names\"<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">ylabel<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">irisdata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">\"feature_names\"<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">title<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">\"Two features from the iris dataset after removing PC1\"<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Remove PC2<\/span><\/p>\n<p><span class=\"crayon-v\">Xmean<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">mean<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">axis<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">value<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">Xmean<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">@<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pca<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">components_<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-v\">pc2<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">value<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">reshape<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">@<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pca<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">components_<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">reshape<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">Xremove<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">Xremove<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">pc2<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">figure<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">figsize<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">8<\/span><span class=\"crayon-sy\">,<\/span><span 
class=\"crayon-cn\">6<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">scatter<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">Xremove<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">Xremove<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">c<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">xlabel<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">irisdata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">\"feature_names\"<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">ylabel<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">irisdata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">\"feature_names\"<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">title<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">\"Two features from the iris dataset after removing PC1 and PC2\"<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Remove PC3<\/span><\/p>\n<p><span class=\"crayon-v\">Xmean<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">mean<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">axis<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">value<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">Xmean<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">@<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pca<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">components_<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">2<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-v\">pc3<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">value<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">reshape<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">@<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pca<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">components_<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">2<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">reshape<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">Xremove<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">Xremove<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">pc3<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">figure<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">figsize<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">8<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">6<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">scatter<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">Xremove<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">Xremove<\/span><span class=\"crayon-sy\">[<\/span><span 
class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">c<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">xlabel<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">irisdata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">\"feature_names\"<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">ylabel<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">irisdata<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">\"feature_names\"<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">title<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">\"Two features from the iris dataset after removing PC1 to PC3\"<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">plt<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">show<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Print the explained variance ratio<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">\"Explained variance ratios:\"<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">pca<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">explained_variance_ratio_<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Split data<\/span><\/p>\n<p><span class=\"crayon-v\">X_train<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_train<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">train_test_split<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">test_size<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">0.33<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Run classifier on all features<\/span><\/p>\n<p><span class=\"crayon-v\">clf<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">SVC<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">kernel<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">\"linear\"<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">gamma<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">'auto'<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">.<\/span><span 
class=\"crayon-e\">fit<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_train<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_train<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">\"Using all features, accuracy: \"<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">clf<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">score<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">\"Using all features, F1: \"<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">f1_score<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">clf<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">predict<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">average<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">\"macro\"<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Run classifier on PC1<\/span><\/p>\n<p><span class=\"crayon-v\">mean<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X_train<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">mean<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">axis<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">X_train2<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X_train<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">mean<\/span><\/p>\n<p><span class=\"crayon-v\">X_train2<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">_<\/span>train2<span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">@<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pca<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">components_<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">reshape<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">clf<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">SVC<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">kernel<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">\"linear\"<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">gamma<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">'auto'<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">fit<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_train2<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_train<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">X_test2<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X_test<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">mean<\/span><\/p>\n<p><span class=\"crayon-v\">X_test2<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">_<\/span>test2<span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">@<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pca<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">components_<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">reshape<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">\"Using PC1, accuracy: \"<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">clf<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">score<\/span><span 
class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_test2<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">\"Using PC1, F1: \"<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">f1_score<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">y_test<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">clf<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">predict<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X_test2<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">average<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">\"macro\"<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<h2>Further reading<\/h2>\n<p>This section provides more resources on the topic if you are looking to go deeper.<\/p>\n<h3>Books<\/h3>\n<h3>Tutorials<\/h3>\n<h3>APIs<\/h3>\n<h2 id=\"Summary\u00b6\">Summary<\/h2>\n<p>In this tutorial, you discovered how to visualize data using principal component analysis.<\/p>\n<p>Specifically, you learned:<\/p>\n<ul>\n<li>How to visualize a high-dimensional dataset in 2D using PCA<\/li>\n<li>How to use the plot in PCA dimensions to help choose an appropriate machine learning model<\/li>\n<li>How to observe the explained variance ratio of PCA<\/li>\n<li>What the explained variance ratio means for machine learning<\/li>\n<\/ul>\n<p>\u00a0<\/p>\n<div class=\"widget_text awac-wrapper\" id=\"custom_html-69\">\n<div class=\"widget_text awac widget 
custom_html-69\">\n<\/div>\n<\/div><\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/machinelearningmastery.com\/principal-component-analysis-for-visualization\/<\/p>\n","protected":false},"author":0,"featured_media":1080,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/1079"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=1079"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/1079\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/1080"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=1079"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=1079"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=1079"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}