{"id":410,"date":"2020-10-15T22:15:58","date_gmt":"2020-10-15T22:15:58","guid":{"rendered":"https:\/\/machine-learning.webcloning.com\/2020\/10\/15\/how-to-develop-lars-regression-models-in-python\/"},"modified":"2020-10-15T22:15:58","modified_gmt":"2020-10-15T22:15:58","slug":"how-to-develop-lars-regression-models-in-python","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2020\/10\/15\/how-to-develop-lars-regression-models-in-python\/","title":{"rendered":"How to Develop LARS Regression Models in Python"},"content":{"rendered":"<div id=\"\">\n<p>Regression is a modeling task that involves predicting a numeric value given an input.<\/p>\n<p>Linear regression is the standard algorithm for regression that assumes a linear relationship between inputs and the target variable. An extension to linear regression involves adding penalties to the loss function during training that encourage simpler models that have smaller coefficient values. These extensions are referred to as regularized linear regression or penalized linear regression.<\/p>\n<p>Lasso Regression is a popular type of regularized linear regression that includes an L1 penalty. This has the effect of shrinking the coefficients for those input variables that do not contribute much to the prediction task.<\/p>\n<p><strong>Least Angle Regression<\/strong> or <strong>LARS<\/strong> for short\u00a0provides an alternate, efficient way of fitting a Lasso regularized regression model that does not require any hyperparameters.<\/p>\n<p>In this tutorial, you will discover how to develop and evaluate LARS Regression models in Python.<\/p>\n<p>After completing this tutorial, you will know:<\/p>\n<ul>\n<li>LARS Regression provides an alternate way to train a Lasso regularized linear regression model that adds a penalty to the loss function during training.<\/li>\n<li>How to evaluate a LARS Regression model and use a final model to make predictions for new data.<\/li>\n<li>How to configure the LARS Regression model for a new dataset automatically using a cross-validation version of the estimator.<\/li>\n<\/ul>\n<p>Let\u2019s get started.<\/p>\n<div id=\"attachment_10595\" class=\"wp-caption aligncenter\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-10595\" loading=\"lazy\" class=\"size-full wp-image-10595\" src=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/07\/How-to-Develop-LARS-Regression-Models-in-Python.jpg\" alt=\"How to Develop LARS Regression Models in Python\" width=\"799\" height=\"533\"><\/p>\n<p id=\"caption-attachment-10595\" class=\"wp-caption-text\">How to Develop LARS Regression Models in Python<br \/>Photo by <a href=\"https:\/\/flickr.com\/photos\/82955120@N05\/20449610383\/\">Nicolas Raymond<\/a>, some rights reserved.<\/p>\n<\/div>\n<h2>Tutorial Overview<\/h2>\n<p>This tutorial is divided into three parts; they are:<\/p>\n<ol>\n<li>LARS Regression<\/li>\n<li>Example of LARS Regression<\/li>\n<li>Tuning LARS Hyperparameters<\/li>\n<\/ol>\n<h2>LARS Regression<\/h2>\n<p><a href=\"https:\/\/machinelearningmastery.com\/solve-linear-regression-using-linear-algebra\/\">Linear regression<\/a> refers to a model that assumes a linear relationship between input variables and the target variable.<\/p>\n<p>With a single input variable, this relationship is a line, and with higher dimensions, this relationship can be thought of as a hyperplane that connects the input variables to the target variable. The coefficients of the model are found via an optimization process that seeks to minimize the sum squared error between the predictions (<em>yhat<\/em>) and the expected target values (<em>y<\/em>).<\/p>\n<ul>\n<li>loss = sum i=0 to n (y_i \u2013 yhat_i)^2<\/li>\n<\/ul>\n<p>A problem with linear regression is that estimated coefficients of the model can become large, making the model sensitive to inputs and possibly unstable. This is particularly true for problems with few observations (samples) or more samples (<em>n<\/em>) than input predictors (<em>p<\/em>) or variables (so-called <em>p &gt;&gt; n problems<\/em>).<\/p>\n<p>One approach to address the stability of regression models is to change the loss function to include additional costs for a model that has large coefficients. Linear regression models that use these modified loss functions during training are referred to collectively as penalized linear regression.<\/p>\n<p>A popular penalty is to penalize a model based on the sum of the absolute coefficient values. This is called the L1 penalty. An L1 penalty minimizes the size of all coefficients and allows some coefficients to be minimized to the value zero, which removes the predictor from the model.<\/p>\n<ul>\n<li>l1_penalty = sum j=0 to p abs(beta_j)<\/li>\n<\/ul>\n<p>An L1 penalty minimizes the size of all coefficients and allows any coefficient to go to the value of zero, effectively removing input features from the model. This acts as a type of automatic feature selection method.<\/p>\n<blockquote>\n<p>\u2026 a consequence of penalizing the absolute values is that some parameters are actually set to 0 for some value of lambda. Thus the lasso yields models that simultaneously use regularization to improve the model and to conduct feature selection.<\/p>\n<\/blockquote>\n<p>\u2014 Page 125, <a href=\"https:\/\/amzn.to\/38uGJOl\">Applied Predictive Modeling<\/a>, 2013.<\/p>\n<p>This penalty can be added to the cost function for linear regression and is referred to as <a href=\"https:\/\/en.wikipedia.org\/wiki\/Lasso_(statistics)\">Least Absolute Shrinkage And Selection Operator<\/a>\u00a0(LASSO), or more commonly, \u201c<em>Lasso<\/em>\u201d (with title case) for short.<\/p>\n<p>The Lasso trains the model using a least-squares loss training procedure.<\/p>\n<p><strong>Least Angle Regression<\/strong>, LAR or LARS for short, is an alternative approach to solving the optimization problem of fitting the penalized model. Technically, LARS is a forward stepwise version of feature selection for regression that can be adapted for the Lasso model.<\/p>\n<p>Unlike the Lasso, it does not require a hyperparameter that controls the weighting of the penalty in the loss function. Instead, the weighting is discovered automatically by LARS.<\/p>\n<blockquote>\n<p>\u2026 least angle regression (LARS), is a broad framework that encompasses the lasso and similar models. The LARS model can be used to fit lasso models more efficiently, especially in high-dimensional problems.<\/p>\n<\/blockquote>\n<p>\u2014 Page 126, <a href=\"https:\/\/amzn.to\/38uGJOl\">Applied Predictive Modeling<\/a>, 2013.<\/p>\n<p>Now that we are familiar with LARS penalized regression, let\u2019s look at a worked example.<\/p>\n<h2>Example of LARS Regression<\/h2>\n<p>In this section, we will demonstrate how to use the LARS Regression algorithm.<\/p>\n<p>First, let\u2019s introduce a standard regression dataset. We will use the housing dataset.<\/p>\n<p>The housing dataset is a standard machine learning dataset comprising 506 rows of data with 13 numerical input variables and a numerical target variable.<\/p>\n<p>Using a test harness of repeated stratified 10-fold cross-validation with three repeats, a naive model can achieve a mean absolute error (MAE) of about 6.6. A top-performing model can achieve a MAE on this same test harness of about 1.9. This provides the bounds of expected performance on this dataset.<\/p>\n<p>The dataset involves predicting the house price given details of the house\u2019s suburb in the American city of Boston.<\/p>\n<p>No need to download the dataset; we will download it automatically as part of our worked examples.<\/p>\n<p>The example below downloads and loads the dataset as a Pandas DataFrame and summarizes the shape of the dataset and the first five rows of data.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f88c7ceded33809471294\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# load and summarize the housing dataset<br \/>\nfrom pandas import read_csv<br \/>\nfrom matplotlib import pyplot<br \/>\n# load dataset<br \/>\nurl = &#8216;https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/housing.csv&#8217;<br \/>\ndataframe = read_csv(url, header=None)<br \/>\n# summarize shape<br \/>\nprint(dataframe.shape)<br \/>\n# summarize first few lines<br \/>\nprint(dataframe.head())<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># load and summarize the housing dataset<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">pandas <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">read_csv<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">matplotlib <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-i\">pyplot<\/span><\/p>\n<p><span class=\"crayon-p\"># load dataset<\/span><\/p>\n<p><span class=\"crayon-v\">url<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">&#8216;https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/housing.csv&#8217;<\/span><\/p>\n<p><span class=\"crayon-v\">dataframe<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">read_csv<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">header<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">None<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># summarize shape<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">dataframe<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># summarize first few lines<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">dataframe<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">head<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0003 seconds] --><\/p>\n<p>Running the example confirms the 506 rows of data and 13 input variables and a single numeric target variable (14 in total). We can also see that all input variables are numeric.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f88c7ceded38972043245\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n(506, 14)<br \/>\n        0     1     2   3      4      5   &#8230;  8      9     10      11    12    13<br \/>\n0  0.00632  18.0  2.31   0  0.538  6.575  &#8230;   1  296.0  15.3  396.90  4.98  24.0<br \/>\n1  0.02731   0.0  7.07   0  0.469  6.421  &#8230;   2  242.0  17.8  396.90  9.14  21.6<br \/>\n2  0.02729   0.0  7.07   0  0.469  7.185  &#8230;   2  242.0  17.8  392.83  4.03  34.7<br \/>\n3  0.03237   0.0  2.18   0  0.458  6.998  &#8230;   3  222.0  18.7  394.63  2.94  33.4<br \/>\n4  0.06905   0.0  2.18   0  0.458  7.147  &#8230;   3  222.0  18.7  396.90  5.33  36.2<\/p>\n<p>[5 rows x 14 columns]<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>(506, 14)<\/p>\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a00\u00a0\u00a0\u00a0\u00a0 1\u00a0\u00a0\u00a0\u00a0 2\u00a0\u00a0 3\u00a0\u00a0\u00a0\u00a0\u00a0\u00a04\u00a0\u00a0\u00a0\u00a0\u00a0\u00a05\u00a0\u00a0 &#8230;\u00a0\u00a08\u00a0\u00a0\u00a0\u00a0\u00a0\u00a09\u00a0\u00a0\u00a0\u00a0 10\u00a0\u00a0\u00a0\u00a0\u00a0\u00a011\u00a0\u00a0\u00a0\u00a012\u00a0\u00a0\u00a0\u00a013<\/p>\n<p>0\u00a0\u00a00.00632\u00a0\u00a018.0\u00a0\u00a02.31\u00a0\u00a0 0\u00a0\u00a00.538\u00a0\u00a06.575\u00a0\u00a0&#8230;\u00a0\u00a0 1\u00a0\u00a0296.0\u00a0\u00a015.3\u00a0\u00a0396.90\u00a0\u00a04.98\u00a0\u00a024.0<\/p>\n<p>1\u00a0\u00a00.02731\u00a0\u00a0 0.0\u00a0\u00a07.07\u00a0\u00a0 0\u00a0\u00a00.469\u00a0\u00a06.421\u00a0\u00a0&#8230;\u00a0\u00a0 2\u00a0\u00a0242.0\u00a0\u00a017.8\u00a0\u00a0396.90\u00a0\u00a09.14\u00a0\u00a021.6<\/p>\n<p>2\u00a0\u00a00.02729\u00a0\u00a0 0.0\u00a0\u00a07.07\u00a0\u00a0 0\u00a0\u00a00.469\u00a0\u00a07.185\u00a0\u00a0&#8230;\u00a0\u00a0 2\u00a0\u00a0242.0\u00a0\u00a017.8\u00a0\u00a0392.83\u00a0\u00a04.03\u00a0\u00a034.7<\/p>\n<p>3\u00a0\u00a00.03237\u00a0\u00a0 0.0\u00a0\u00a02.18\u00a0\u00a0 0\u00a0\u00a00.458\u00a0\u00a06.998\u00a0\u00a0&#8230;\u00a0\u00a0 3\u00a0\u00a0222.0\u00a0\u00a018.7\u00a0\u00a0394.63\u00a0\u00a02.94\u00a0\u00a033.4<\/p>\n<p>4\u00a0\u00a00.06905\u00a0\u00a0 0.0\u00a0\u00a02.18\u00a0\u00a0 0\u00a0\u00a00.458\u00a0\u00a07.147\u00a0\u00a0&#8230;\u00a0\u00a0 3\u00a0\u00a0222.0\u00a0\u00a018.7\u00a0\u00a0396.90\u00a0\u00a05.33\u00a0\u00a036.2<\/p>\n<p>\u00a0<\/p>\n<p>[5 rows x 14 columns]<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0000 seconds] --><\/p>\n<p>The scikit-learn Python machine learning library provides an implementation of the LARS penalized regression algorithm via the <a href=\"https:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.linear_model.Lars.html\">Lars class<\/a>.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f88c7ceded39201482457\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n&#8230;<br \/>\n# define model<br \/>\nmodel = Lars()<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># define model<\/span><\/p>\n<p><span class=\"crayon-v\">model<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">Lars<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0001 seconds] --><\/p>\n<p>We can evaluate the LARS Regression model on the housing dataset using <a href=\"https:\/\/machinelearningmastery.com\/k-fold-cross-validation\/\">repeated 10-fold cross-validation<\/a> and report the average mean absolute error (MAE) on the dataset.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f88c7ceded3a255389114\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# evaluate an lars regression model on the dataset<br \/>\nfrom numpy import mean<br \/>\nfrom numpy import std<br \/>\nfrom numpy import absolute<br \/>\nfrom pandas import read_csv<br \/>\nfrom sklearn.model_selection import cross_val_score<br \/>\nfrom sklearn.model_selection import RepeatedKFold<br \/>\nfrom sklearn.linear_model import Lars<br \/>\n# load the dataset<br \/>\nurl = &#8216;https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/housing.csv&#8217;<br \/>\ndataframe = read_csv(url, header=None)<br \/>\ndata = dataframe.values<br \/>\nX, y = data[:, :-1], data[:, -1]<br \/>\n# define model<br \/>\nmodel = Lars()<br \/>\n# define model evaluation method<br \/>\ncv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)<br \/>\n# evaluate model<br \/>\nscores = cross_val_score(model, X, y, scoring=&#8217;neg_mean_absolute_error&#8217;, cv=cv, n_jobs=-1)<br \/>\n# force scores to be positive<br \/>\nscores = absolute(scores)<br \/>\nprint(&#8216;Mean MAE: %.3f (%.3f)&#8217; % (mean(scores), std(scores)))<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<div class=\"urvanov-syntax-highlighter-nums-content\">\n<p>1<\/p>\n<p>2<\/p>\n<p>3<\/p>\n<p>4<\/p>\n<p>5<\/p>\n<p>6<\/p>\n<p>7<\/p>\n<p>8<\/p>\n<p>9<\/p>\n<p>10<\/p>\n<p>11<\/p>\n<p>12<\/p>\n<p>13<\/p>\n<p>14<\/p>\n<p>15<\/p>\n<p>16<\/p>\n<p>17<\/p>\n<p>18<\/p>\n<p>19<\/p>\n<p>20<\/p>\n<p>21<\/p>\n<p>22<\/p>\n<\/div>\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># evaluate an lars regression model on the dataset<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">numpy <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">mean<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">numpy <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">std<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">numpy <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">absolute<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">pandas <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">read_csv<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">model_selection <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">cross_val_score<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">model_selection <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">RepeatedKFold<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">linear_model <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-i\">Lars<\/span><\/p>\n<p><span class=\"crayon-p\"># load the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">url<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">&#8216;https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/housing.csv&#8217;<\/span><\/p>\n<p><span class=\"crayon-v\">dataframe<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">read_csv<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">header<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">None<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">data<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">dataframe<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-i\">values<\/span><\/p>\n<p><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-p\"># define model<\/span><\/p>\n<p><span class=\"crayon-v\">model<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">Lars<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># define model evaluation method<\/span><\/p>\n<p><span class=\"crayon-v\">cv<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">RepeatedKFold<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_splits<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">10<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">n_repeats<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">3<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">random_state<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># evaluate model<\/span><\/p>\n<p><span class=\"crayon-v\">scores<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">cross_val_score<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">model<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">scoring<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8216;neg_mean_absolute_error&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">cv<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">cv<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">n_jobs<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># force scores to be positive<\/span><\/p>\n<p><span class=\"crayon-v\">scores<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">absolute<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">scores<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8216;Mean MAE: %.3f (%.3f)&#8217;<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">%<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-e\">mean<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">scores<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">std<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">scores<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0007 seconds] --><\/p>\n<p>Running the example evaluates the LARS Regression algorithm on the housing dataset and reports the average MAE across the three repeats of 10-fold cross-validation.<\/p>\n<p>Your specific results may vary given the stochastic nature of the learning algorithm. Consider running the example a few times.<\/p>\n<p>In this case, we can see that the model achieved a MAE of about 3.432.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<p><!-- [Format Time: 0.0000 seconds] --><\/p>\n<p>We may decide to use the LARS Regression as our final model and make predictions on new data.<\/p>\n<p>This can be achieved by fitting the model on all available data and calling the <em>predict()<\/em> function, passing in a new row of data.<\/p>\n<p>We can demonstrate this with a complete example, listed below.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f88c7ceded3c749543201\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# make a prediction with a lars regression model on the dataset<br \/>\nfrom pandas import read_csv<br \/>\nfrom sklearn.linear_model import Lars<br \/>\n# load the dataset<br \/>\nurl = &#8216;https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/housing.csv&#8217;<br \/>\ndataframe = read_csv(url, header=None)<br \/>\ndata = dataframe.values<br \/>\nX, y = data[:, :-1], data[:, -1]<br \/>\n# define model<br \/>\nmodel = Lars()<br \/>\n# fit model<br \/>\nmodel.fit(X, y)<br \/>\n# define new data<br \/>\nrow = [0.00632,18.00,2.310,0,0.5380,6.5750,65.20,4.0900,1,296.0,15.30,396.90,4.98]<br \/>\n# make a prediction<br \/>\nyhat = model.predict([row])<br \/>\n# summarize prediction<br \/>\nprint(&#8216;Predicted: %.3f&#8217; % yhat)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<div class=\"urvanov-syntax-highlighter-nums-content\">\n<p>1<\/p>\n<p>2<\/p>\n<p>3<\/p>\n<p>4<\/p>\n<p>5<\/p>\n<p>6<\/p>\n<p>7<\/p>\n<p>8<\/p>\n<p>9<\/p>\n<p>10<\/p>\n<p>11<\/p>\n<p>12<\/p>\n<p>13<\/p>\n<p>14<\/p>\n<p>15<\/p>\n<p>16<\/p>\n<p>17<\/p>\n<p>18<\/p>\n<\/div>\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># make a prediction with a lars regression model on the dataset<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">pandas <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">read_csv<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">linear_model <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-i\">Lars<\/span><\/p>\n<p><span class=\"crayon-p\"># load the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">url<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">&#8216;https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/housing.csv&#8217;<\/span><\/p>\n<p><span class=\"crayon-v\">dataframe<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">read_csv<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">header<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">None<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">data<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">dataframe<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-i\">values<\/span><\/p>\n<p><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-p\"># define model<\/span><\/p>\n<p><span class=\"crayon-v\">model<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">Lars<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># fit model<\/span><\/p>\n<p><span class=\"crayon-v\">model<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">fit<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># define new data<\/span><\/p>\n<p><span class=\"crayon-v\">row<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0.00632<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">18.00<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">2.310<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">0.5380<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">6.5750<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">65.20<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">4.0900<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">296.0<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">15.30<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">396.90<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">4.98<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-p\"># make a prediction<\/span><\/p>\n<p><span class=\"crayon-v\">yhat<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">model<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">predict<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">row<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># summarize prediction<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8216;Predicted: %.3f&#8217;<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">%<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">yhat<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0005 seconds] --><\/p>\n<p>Running the example fits the model and makes a prediction for the new rows of data.<\/p>\n<p>Your specific results may vary given the stochastic nature of the learning algorithm. Try running the example a few times.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<p><!-- [Format Time: 0.0000 seconds] --><\/p>\n<p>Next, we can look at configuring the model hyperparameters.<\/p>\n<h2>Tuning LARS Hyperparameters<\/h2>\n<p>As part of the LARS training algorithm, it automatically discovers the best value for the lambda hyperparameter used in the Lasso algorithm.<\/p>\n<p>This hyperparameter is referred to as the \u201c<em>alpha<\/em>\u201d argument in the scikit-learn implementation of Lasso and LARS.<\/p>\n<p>Nevertheless, the process of automatically discovering the best model and <em>alpha<\/em> hyperparameter is still based on a single training dataset.<\/p>\n<p>An alternative approach is to fit the model on multiple subsets of the training dataset and choose the best internal model configuration across the folds, in this case, the value of <em>alpha<\/em>. Generally, this is referred to as a cross-validation estimator.<\/p>\n<p>The scikit-learn libraries offer a cross-validation version of the LARS for finding a more robust value for <em>alpha<\/em> via the <a href=\"https:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.linear_model.LarsCV.html\">LarsCV class<\/a>.<\/p>\n<p>The example below demonstrates how to fit a <em>LarsCV<\/em> model and report the <em>alpha<\/em> value found via cross-validation<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f88c7ceded3e496443510\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# use automatically configured the lars regression algorithm<br \/>\nfrom numpy import arange<br \/>\nfrom pandas import read_csv<br \/>\nfrom sklearn.linear_model import LarsCV<br \/>\nfrom sklearn.model_selection import RepeatedKFold<br \/>\n# load the dataset<br \/>\nurl = &#8216;https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/housing.csv&#8217;<br \/>\ndataframe = read_csv(url, header=None)<br \/>\ndata = dataframe.values<br \/>\nX, y = data[:, :-1], data[:, -1]<br \/>\n# define model evaluation method<br \/>\ncv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)<br \/>\n# define model<br \/>\nmodel = LarsCV(cv=cv, n_jobs=-1)<br \/>\n# fit model<br \/>\nmodel.fit(X, y)<br \/>\n# summarize chosen configuration<br \/>\nprint(&#8216;alpha: %f&#8217; % model.alpha_)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<div class=\"urvanov-syntax-highlighter-nums-content\">\n<p>1<\/p>\n<p>2<\/p>\n<p>3<\/p>\n<p>4<\/p>\n<p>5<\/p>\n<p>6<\/p>\n<p>7<\/p>\n<p>8<\/p>\n<p>9<\/p>\n<p>10<\/p>\n<p>11<\/p>\n<p>12<\/p>\n<p>13<\/p>\n<p>14<\/p>\n<p>15<\/p>\n<p>16<\/p>\n<p>17<\/p>\n<p>18<\/p>\n<\/div>\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># use automatically configured the lars regression algorithm<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">numpy <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">arange<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">pandas <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">read_csv<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">linear_model <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">LarsCV<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">model_selection <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-i\">RepeatedKFold<\/span><\/p>\n<p><span class=\"crayon-p\"># load the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">url<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">&#8216;https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/housing.csv&#8217;<\/span><\/p>\n<p><span class=\"crayon-v\">dataframe<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">read_csv<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">header<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">None<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">data<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">dataframe<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-i\">values<\/span><\/p>\n<p><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-p\"># define model evaluation method<\/span><\/p>\n<p><span class=\"crayon-v\">cv<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">RepeatedKFold<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_splits<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">10<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">n_repeats<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">3<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">random_state<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># define model<\/span><\/p>\n<p><span class=\"crayon-v\">model<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">LarsCV<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">cv<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">cv<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">n_jobs<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># fit model<\/span><\/p>\n<p><span class=\"crayon-v\">model<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">fit<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># summarize chosen configuration<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8216;alpha: %f&#8217;<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">%<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">model<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">alpha_<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0006 seconds] --><\/p>\n<p>Running the example fits the <em>LarsCV<\/em> model using repeated cross-validation and reports an optimal <em>alpha<\/em> value found across the runs.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<p><!-- [Format Time: 0.0000 seconds] --><\/p>\n<p>This version of the LARS model may prove more robust in practice.<\/p>\n<p>We can evaluate it using the same procedure we did in the previous section, although in this case, each model fit is based on the hyperparameters found via repeated k-fold cross-validation internally (e.g. cross-validation of a cross-validation estimator).<\/p>\n<p>The complete example is listed below.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<div id=\"urvanov-syntax-highlighter-5f88c7ceded40158751726\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n# evaluate an lars cross-validation regression model on the dataset<br \/>\nfrom numpy import mean<br \/>\nfrom numpy import std<br \/>\nfrom numpy import absolute<br \/>\nfrom pandas import read_csv<br \/>\nfrom sklearn.model_selection import cross_val_score<br \/>\nfrom sklearn.model_selection import RepeatedKFold<br \/>\nfrom sklearn.linear_model import LarsCV<br \/>\n# load the dataset<br \/>\nurl = &#8216;https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/housing.csv&#8217;<br \/>\ndataframe = read_csv(url, header=None)<br \/>\ndata = dataframe.values<br \/>\nX, y = data[:, :-1], data[:, -1]<br \/>\n# define model evaluation method<br \/>\ncv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)<br \/>\n# define model<br \/>\nmodel = LarsCV(cv=cv, n_jobs=-1)<br \/>\n# evaluate model<br \/>\nscores = cross_val_score(model, X, y, scoring=&#8217;neg_mean_absolute_error&#8217;, cv=cv, n_jobs=-1)<br \/>\n# force scores to be positive<br \/>\nscores = absolute(scores)<br \/>\nprint(&#8216;Mean MAE: %.3f (%.3f)&#8217; % (mean(scores), std(scores)))<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<div class=\"urvanov-syntax-highlighter-nums-content\">\n<p>1<\/p>\n<p>2<\/p>\n<p>3<\/p>\n<p>4<\/p>\n<p>5<\/p>\n<p>6<\/p>\n<p>7<\/p>\n<p>8<\/p>\n<p>9<\/p>\n<p>10<\/p>\n<p>11<\/p>\n<p>12<\/p>\n<p>13<\/p>\n<p>14<\/p>\n<p>15<\/p>\n<p>16<\/p>\n<p>17<\/p>\n<p>18<\/p>\n<p>19<\/p>\n<p>20<\/p>\n<p>21<\/p>\n<p>22<\/p>\n<\/div>\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-p\"># evaluate an lars cross-validation regression model on the dataset<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">numpy <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">mean<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">numpy <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">std<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">numpy <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">absolute<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-e\">pandas <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">read_csv<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">model_selection <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">cross_val_score<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">model_selection <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">RepeatedKFold<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">sklearn<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">linear_model <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-i\">LarsCV<\/span><\/p>\n<p><span class=\"crayon-p\"># load the dataset<\/span><\/p>\n<p><span class=\"crayon-v\">url<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">&#8216;https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/housing.csv&#8217;<\/span><\/p>\n<p><span class=\"crayon-v\">dataframe<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">read_csv<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">url<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">header<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">None<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">data<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">dataframe<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-i\">values<\/span><\/p>\n<p><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">data<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-p\"># define model evaluation method<\/span><\/p>\n<p><span class=\"crayon-v\">cv<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">RepeatedKFold<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">n_splits<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">10<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">n_repeats<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">3<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">random_state<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># define model<\/span><\/p>\n<p><span class=\"crayon-v\">model<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">LarsCV<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">cv<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">cv<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">n_jobs<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># evaluate model<\/span><\/p>\n<p><span class=\"crayon-v\">scores<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">cross_val_score<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">model<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">X<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">y<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">scoring<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8216;neg_mean_absolute_error&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">cv<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-v\">cv<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">n_jobs<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-o\">&#8211;<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-p\"># force scores to be positive<\/span><\/p>\n<p><span class=\"crayon-v\">scores<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">absolute<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">scores<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8216;Mean MAE: %.3f (%.3f)&#8217;<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">%<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-e\">mean<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">scores<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">std<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">scores<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table>\n<\/div>\n<\/div>\n<p><!-- [Format Time: 0.0008 seconds] --><\/p>\n<p>Running the example will evaluate the cross-validated estimation of model hyperparameters using repeated cross-validation.<\/p>\n<p>Your specific results may vary given the stochastic nature of the learning algorithm. Try running the example a few times.<\/p>\n<p>In this case, we can see that we achieved slightly better results with 3.374 vs. 3.432 in the previous section.<\/p>\n<p><!-- Urvanov Syntax Highlighter v2.8.14 --><\/p>\n<p><!-- [Format Time: 0.0000 seconds] --><\/p>\n<h2>Further Reading<\/h2>\n<p>This section provides more resources on the topic if you are looking to go deeper.<\/p>\n<h3>Books<\/h3>\n<h3>APIs<\/h3>\n<h3>Articles<\/h3>\n<h2>Summary<\/h2>\n<p>In this tutorial, you discovered how to develop and evaluate LARS Regression models in Python.<\/p>\n<p>Specifically, you learned:<\/p>\n<ul>\n<li>LARS Regression provides an alternate way to train a Lasso regularized linear regression model that adds a penalty to the loss function during training.<\/li>\n<li>How to evaluate a LARS Regression model and use a final model to make predictions for new data.<\/li>\n<li>How to configure the LARS Regression model for a new dataset automatically using a cross-validation version of the estimator.<\/li>\n<\/ul>\n<p><strong>Do you have any questions?<\/strong><br \/>Ask your questions in the comments below and I will do my best to answer.<\/p>\n<div class=\"widget_text awac-wrapper\" id=\"custom_html-78\">\n<div class=\"widget_text awac widget custom_html-78\">\n<div class=\"textwidget custom-html-widget\">\n<div>\n<h2>Discover Fast Machine Learning in Python!<\/h2>\n<p><a href=\"\/machine-learning-with-python\/\" rel=\"nofollow\"><img decoding=\"async\" src=\"https:\/\/3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com\/wp-content\/uploads\/2014\/07\/MachineLearningMasteryWithPython-220px.png\" alt=\"Master Machine Learning With Python\" align=\"left\"><\/a><\/p>\n<h4>Develop Your Own Models in Minutes<\/h4>\n<p>&#8230;with just a few lines of scikit-learn code<\/p>\n<p>Learn how in my new Ebook:<br \/><a href=\"\/machine-learning-with-python\/\" rel=\"nofollow\">Machine Learning Mastery With Python<\/a><\/p>\n<p>Covers <strong>self-study tutorials<\/strong> and <strong>end-to-end projects<\/strong> like:<br \/><em>Loading data<\/em>, <em>visualization<\/em>, <em>modeling<\/em>, <em>tuning<\/em>, and much more&#8230;<\/p>\n<h4>Finally Bring Machine Learning To<br \/>Your Own Projects<\/h4>\n<p>Skip the Academics. Just Results.<\/p>\n<p><a href=\"\/machine-learning-with-python\/\" class=\"woo-sc-button  red\"><span class=\"woo-\">See What&#8217;s Inside<\/span><\/a><\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div><\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/machinelearningmastery.com\/lars-regression-with-python\/<\/p>\n","protected":false},"author":0,"featured_media":411,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/410"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=410"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/410\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/411"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=410"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=410"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=410"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}