{"id":1087,"date":"2021-10-27T08:42:49","date_gmt":"2021-10-27T08:42:49","guid":{"rendered":"https:\/\/salarydistribution.com\/machine-learning\/2021\/10\/27\/using-singular-value-decomposition-to-build-a-recommender-system\/"},"modified":"2021-10-27T08:42:49","modified_gmt":"2021-10-27T08:42:49","slug":"using-singular-value-decomposition-to-build-a-recommender-system","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2021\/10\/27\/using-singular-value-decomposition-to-build-a-recommender-system\/","title":{"rendered":"Using Singular Value Decomposition to Build a Recommender System"},"content":{"rendered":"<div id=\"\">\n<p id=\"last-modified-info\">Last Updated on October 27, 2021<\/p>\n<p>Singular value decomposition is a very popular linear algebra technique to break down a matrix into the product of a few smaller matrices. In fact, it is a technique that has many uses. One example is that we can use SVD to discover relationships between items. 
A recommender system can be built easily from this.<\/p>\n<p>In this tutorial, we will see how a recommender system can be built just using linear algebra techniques.<\/p>\n<p>After completing this tutorial, you will know:<\/p>\n<ul>\n<li>What singular value decomposition does to a matrix<\/li>\n<li>How to interpret the result of singular value decomposition<\/li>\n<li>What data a recommender system requires, and how we can make use of SVD to analyze it<\/li>\n<li>How we can make use of the result from SVD to make recommendations<\/li>\n<\/ul>\n<p>Let\u2019s get started.<\/p>\n<div id=\"attachment_4963\" class=\"wp-caption aligncenter\"><img decoding=\"async\" aria-describedby=\"caption-attachment-4963\" loading=\"lazy\" class=\"size-full wp-image-4963\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/10\/roberto-arias-ihpiRgog1vs-unsplash.jpg\" alt=\"Using Singular Value Decomposition to Build a Recommender System\" width=\"640\" height=\"480\"><\/p>\n<p id=\"caption-attachment-4963\" class=\"wp-caption-text\">Using Singular Value Decomposition to Build a Recommender System<br \/>Photo by <a href=\"https:\/\/unsplash.com\/photos\/ihpiRgog1vs\">Roberto Arias<\/a>, some rights reserved.<\/p>\n<\/div>\n<h2>Tutorial overview<\/h2>\n<p>This tutorial is divided into 3 parts; they are:<\/p>\n<ul>\n<li>Review of Singular Value Decomposition<\/li>\n<li>The Meaning of Singular Value Decomposition in Recommender System<\/li>\n<li>Implementing a Recommender System<\/li>\n<\/ul>\n<h2>Review of Singular Value Decomposition<\/h2>\n<p>Just like a number such as 24 can be decomposed as factors 24=2\u00d73\u00d74, a matrix can 
also be expressed as the product of some other matrices. Because matrices are arrays of numbers, they have their own rules of multiplication. Consequently, they can be factorized in different ways, known as <strong>decomposition<\/strong>. QR decomposition and LU decomposition are common examples. Another example is <strong>singular value decomposition<\/strong>, which places no restriction on the shape or properties of the matrix to be decomposed.<\/p>\n<p>Singular value decomposition assumes a matrix $M$ (for example, an $m\\times n$ matrix) is decomposed as<br \/>$$<br \/>M = U\\cdot \\Sigma \\cdot V^T<br \/>$$<br \/>where $U$ is an $m\\times m$ matrix, $\\Sigma$ is an $m\\times n$ diagonal matrix, and $V^T$ is an $n\\times n$ matrix. The diagonal matrix $\\Sigma$ is an interesting one: it can be non-square, but only the entries on its diagonal can be non-zero. The matrices $U$ and $V^T$ are <strong>orthonormal<\/strong> matrices, meaning the columns of $U$ and the rows of $V^T$ are (1) orthogonal to each other and (2) unit vectors. Vectors are orthogonal to each other if the dot product of any two of them is zero. A vector is a unit vector if its L2-norm is 1. An orthonormal matrix has the property that its transpose is its inverse. In other words, since $U$ is an orthonormal matrix, $U^T = U^{-1}$ or $U\\cdot U^T=U^T\\cdot U=I$, where $I$ is the identity matrix.<\/p>\n<p>Singular value decomposition gets its name from the diagonal entries of $\\Sigma$, which are called the singular values of matrix $M$. The non-zero ones are, in fact, the square roots of the non-zero eigenvalues of matrix $M\\cdot M^T$. Just like a number factorized into primes, the singular value decomposition of a matrix reveals a lot about the structure of that matrix.<\/p>\n<p>But what is described above is actually called the <strong>full SVD<\/strong>. There is another version called <strong>reduced SVD<\/strong> or <strong>compact SVD<\/strong>. 
We still write $M = U\\cdot\\Sigma\\cdot V^T$, but now $\\Sigma$ is an $r\\times r$ square diagonal matrix, with $r$ the <strong>rank<\/strong> of matrix $M$, which is always less than or equal to the smaller of $m$ and $n$. The matrix $U$ is then an $m\\times r$ matrix and $V^T$ is an $r\\times n$ matrix. Because matrices $U$ and $V^T$ are non-square, they are called <strong>semi-orthonormal<\/strong>, meaning $U^T\\cdot U=I$ and $V^T\\cdot V=I$, with $I$ in both cases an $r\\times r$ identity matrix.<\/p>\n<h2>The Meaning of Singular Value Decomposition in Recommender System<\/h2>\n<p>If the matrix $M$ is rank $r$, then we can prove that the matrices $M\\cdot M^T$ and $M^T\\cdot M$ are both rank $r$. In singular value decomposition (the reduced SVD), the columns of matrix $U$ are eigenvectors of $M\\cdot M^T$ and the rows of matrix $V^T$ are eigenvectors of $M^T\\cdot M$. What\u2019s interesting is that $M\\cdot M^T$ and $M^T\\cdot M$ are potentially of different sizes (because matrix $M$ can be non-square), but they have the same set of non-zero eigenvalues, which are the squares of the values on the diagonal of $\\Sigma$.<\/p>\n<p>This is why the result of singular value decomposition can reveal a lot about the matrix $M$.<\/p>\n<p>Imagine we collected some book reviews such that books are columns and people are rows, and the entries are the ratings that a person gave to a book. In that case, $M\\cdot M^T$ would be a person-to-person table in which each entry is the sum, over all books, of the products of the ratings two people gave to the same book. Similarly, $M^T\\cdot M$ would be a book-to-book table in which each entry is the sum, over all people, of the products of the ratings two books received from the same person. What can be the hidden connection between people and books? That could be the genre, or the author, or something of a similar nature.<\/p>\n<h2>Implementing a Recommender System<\/h2>\n<p>Let\u2019s see how we can make use of the result from SVD to build a recommender system. 
Firstly, let\u2019s download the dataset (caution: it is 600MB in size); its URL is given in the comment at the top of the code below.<\/p>\n<p>This dataset is the \u201cSocial Recommendation Data\u201d from \u201c<a href=\"https:\/\/cseweb.ucsd.edu\/~jmcauley\/datasets.html#social_data\">Recommender Systems and Personalization Datasets<\/a>\u201d. It contains the reviews given by users on books on <a href=\"https:\/\/www.librarything.com\/\">Librarything<\/a>. What we are interested in is the number of \u201cstars\u201d a user gave to a book.<\/p>\n<p>If we open up this tar file we will see a large file named \u201creviews.json\u201d. We can extract it, or read the included file on the fly. The code below lists the contents of the archive and prints the first four lines of reviews.json:<\/p>\n<div id=\"urvanov-syntax-highlighter-617903a206e88239641678\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\nimport tarfile<\/p>\n<p># Read downloaded file from:<br \/>\n# http:\/\/deepyeti.ucsd.edu\/jmcauley\/datasets\/librarything\/lthing_data.tar.gz<br \/>\nwith tarfile.open(&quot;lthing_data.tar.gz&quot;) as tar:<br \/>\n    print(&quot;Files in tar archive:&quot;)<br \/>\n    tar.list()<\/p>\n<p>    with tar.extractfile(&quot;lthing_data\/reviews.json&quot;) as file:<br \/>\n        count = 0<br \/>\n        for line in file:<br \/>\n            print(line)<br \/>\n            count += 1<br \/>\n            if count &gt; 3:<br \/>\n                break<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-e\">import <\/span><span 
class=\"crayon-i\">tarfile<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Read downloaded file from:<\/span><\/p>\n<p><span class=\"crayon-p\"># http:\/\/deepyeti.ucsd.edu\/jmcauley\/datasets\/librarything\/lthing_data.tar.gz<\/span><\/p>\n<p><span class=\"crayon-e\">with <\/span><span class=\"crayon-v\">tarfile<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">open<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&quot;lthing_data.tar.gz&quot;<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">as<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">tar<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&quot;Files in tar archive:&quot;<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">tar<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">list<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-e\">with <\/span><span class=\"crayon-v\">tar<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">extractfile<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&quot;lthing_data\/reviews.json&quot;<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">as<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">file<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">count<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span 
class=\"crayon-cn\">0<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">line <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">file<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">line<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">count<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">+=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">if<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">count<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">&gt;<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">3<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">break<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p>The above will print:<\/p>\n<div id=\"urvanov-syntax-highlighter-617903a206e8e438751994\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\nFiles in tar 
archive:<br \/>\n?rwxr-xr-x julian\/julian 0 2016-09-30 17:58:55 lthing_data\/<br \/>\n?rw-r&#8211;r&#8211; julian\/julian 4824989 2014-01-02 13:55:12 lthing_data\/edges.txt<br \/>\n?rw-rw-r&#8211; julian\/julian 1604368260 2016-09-30 17:58:25 lthing_data\/reviews.json<br \/>\nb&#8221;{&#8216;work&#8217;: &#8216;3206242&#8217;, &#8216;flags&#8217;: [], &#8216;unixtime&#8217;: 1194393600, &#8216;stars&#8217;: 5.0, &#8216;nhelpful&#8217;: 0, &#8216;time&#8217;: &#8216;Nov 7, 2007&#8217;, &#8216;comment&#8217;: &#8216;This a great book for young readers to be introduced to the world of Middle Earth. &#8216;, &#8216;user&#8217;: &#8216;van_stef&#8217;}n&#8221;<br \/>\nb&#8221;{&#8216;work&#8217;: &#8216;12198649&#8217;, &#8216;flags&#8217;: [], &#8216;unixtime&#8217;: 1333756800, &#8216;stars&#8217;: 5.0, &#8216;nhelpful&#8217;: 0, &#8216;time&#8217;: &#8216;Apr 7, 2012&#8217;, &#8216;comment&#8217;: &#8216;Help Wanted: Tales of On The Job Terror from Evil Jester Press is a fun and scary read. This book is edited by Peter Giglio and has short stories by Joe McKinney, Gary Brandner, Henry Snider and many more. As if work wasnt already scary enough, this book gives you more reasons to be scared. Help Wanted is an excellent anthology that includes some great stories by some master storytellers.\\nOne of the stories includes Agnes: A Love Story by David C. Hayes, which tells the tale of a lawyer named Jack who feels unappreciated at work and by his wife so he starts a relationship with a photocopier. They get along well until the photocopier starts wanting the lawyer to kill for it. The thing I liked about this story was how the author makes you feel sorry for Jack. His two co-workers are happily married and love their jobs while Jack is married to a paranoid alcoholic and he hates and works at a job he cant stand. 
You completely understand how he can fall in love with a copier because he is a lonely soul that no one understands except the copier of course.\\nAnother story in Help Wanted is Work Life Balance by Jeff Strand. In this story a man works for a company that starts to let their employees do what they want at work. It starts with letting them come to work a little later than usual, then the employees are allowed to hug and kiss on the job. Things get really out of hand though when the company starts letting employees carry knives and stab each other, as long as it doesnt interfere with their job. This story is meant to be more funny then scary but still has its scary moments. Jeff Strand does a great job mixing humor and horror in this story.\\nAnother good story in Help Wanted: On The Job Terror is The Chapel Of Unrest by Stephen Volk. This is a gothic horror story that takes place in the 1800s and has to deal with an undertaker who has the duty of capturing and embalming a ghoul who has been eating dead bodies in a graveyard. Stephen Volk through his use of imagery in describing the graveyard, the chapel and the clothes of the time, transports you into an 1800s gothic setting that reminded me of Bram Stokers Dracula.\\nOne more story in this anthology that I have to mention is Expulsion by Eric Shapiro which tells the tale of a mad man going into a office to kill his fellow employees. This is a very short but very powerful story that gets you into the mind of a disgruntled employee but manages to end on a positive note. Though there were stories I didnt like in Help Wanted, all in all its a very good anthology. I highly recommend this book &#8216;, &#8216;user&#8217;: &#8216;dwatson2&#8217;}n&#8221;<br \/>\nb&#8221;{&#8216;work&#8217;: &#8216;12533765&#8217;, &#8216;flags&#8217;: [], &#8216;unixtime&#8217;: 1352937600, &#8216;nhelpful&#8217;: 0, &#8216;time&#8217;: &#8216;Nov 15, 2012&#8217;, &#8216;comment&#8217;: &#8216;Magoon, K. (2012). Fire in the streets. 
New York: Simon and Schuster\/Aladdin. 336 pp. ISBN: 978-1-4424-2230-8. (Hardcover); $16.99.\\nKekla Magoon is an author to watch (http:\/\/www.spicyreads.org\/Author_Videos.html- scroll down). One of my favorite books from 2007 is Magoons The Rock and the River. At the time, I mentioned in reviews that we have very few books that even mention the Black Panther Party, let alone deal with them in a careful, thorough way. Fire in the Streets continues the story Magoon began in her debut book. While her familys financial fortunes drip away, not helped by her mothers drinking and assortment of boyfriends, the Panthers provide a very real respite for Maxie. Sam is still dealing with the death of his brother. Maxies relationship with Sam only serves to confuse and upset them both. Her friends, Emmalee and Patrice, are slowly drifting away. The Panther Party is the only thing that seems to make sense and she basks in its routine and consistency. She longs to become a full member of the Panthers and constantly battles with her Panther brother Raheem over her maturity and ability to do more than office tasks. Maxie wants to have her own gun. When Maxie discovers that there is someone working with the Panthers that is leaking information to the government about Panther activity, Maxie investigates. Someone is attempting to destroy the only place that offers her shelter. Maxie is determined to discover the identity of the traitor, thinking that this will prove her worth to the organization. However, the truth is not simple and it is filled with pain. Unfortunately we still do not have many teen books that deal substantially with the Democratic National Convention in Chicago, the Black Panther Party, and the social problems in Chicago that lead to the civil unrest. Thankfully, Fire in the Streets lives up to the standard Magoon set with The Rock and the River. Readers will feel like they have stepped back in time. 
Magoons factual tidbits add journalistic realism to the story and only improves the atmosphere. Maxie has spunk. Readers will empathize with her Atlas-task of trying to hold onto her world. Fire in the Streets belongs in all middle school and high school libraries. While readers are able to read this story independently of The Rock and the River, I strongly urge readers to read both and in order. Magoons recognition by the Coretta Scott King committee and the NAACP Image awards are NOT mistakes!&#8217;, &#8216;user&#8217;: &#8216;edspicer&#8217;}n&#8221;<br \/>\nb'{&#8216;work&#8217;: &#8216;12981302&#8217;, &#8216;flags&#8217;: [], &#8216;unixtime&#8217;: 1364515200, &#8216;stars&#8217;: 4.0, &#8216;nhelpful&#8217;: 0, &#8216;time&#8217;: &#8216;Mar 29, 2013&#8217;, &#8216;comment&#8217;: &#8220;Well, I definitely liked this book better than the last in the series. There was less fighting and more story. I liked both Toni and Ricky Lee and thought they were pretty good together. The banter between the two was sweet and often times funny. I enjoyed seeing some of the past characters and of course it&#8217;s always nice to be introduced to new ones. I just wonder how many more of these books there will be. At least two hopefully, one each for Rory and Reece. 
&#8220;, &#8216;user&#8217;: &#8216;amdrane2&#8242;}n&#8217;<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>Files in tar archive:<\/p>\n<p>?rwxr-xr-x julian\/julian 0 2016-09-30 17:58:55 lthing_data\/<\/p>\n<p>?rw-r&#8211;r&#8211; julian\/julian 4824989 2014-01-02 13:55:12 lthing_data\/edges.txt<\/p>\n<p>?rw-rw-r&#8211; julian\/julian 1604368260 2016-09-30 17:58:25 lthing_data\/reviews.json<\/p>\n<p>b&#8221;{&#8216;work&#8217;: &#8216;3206242&#8217;, &#8216;flags&#8217;: [], &#8216;unixtime&#8217;: 1194393600, &#8216;stars&#8217;: 5.0, &#8216;nhelpful&#8217;: 0, &#8216;time&#8217;: &#8216;Nov 7, 2007&#8217;, &#8216;comment&#8217;: &#8216;This a great book for young readers to be introduced to the world of Middle Earth. &#8216;, &#8216;user&#8217;: &#8216;van_stef&#8217;}n&#8221;<\/p>\n<p>b&#8221;{&#8216;work&#8217;: &#8216;12198649&#8217;, &#8216;flags&#8217;: [], &#8216;unixtime&#8217;: 1333756800, &#8216;stars&#8217;: 5.0, &#8216;nhelpful&#8217;: 0, &#8216;time&#8217;: &#8216;Apr 7, 2012&#8217;, &#8216;comment&#8217;: &#8216;Help Wanted: Tales of On The Job Terror from Evil Jester Press is a fun and scary read. This book is edited by Peter Giglio and has short stories by Joe McKinney, Gary Brandner, Henry Snider and many more. As if work wasnt already scary enough, this book gives you more reasons to be scared. Help Wanted is an excellent anthology that includes some great stories by some master storytellers.\\nOne of the stories includes Agnes: A Love Story by David C. Hayes, which tells the tale of a lawyer named Jack who feels unappreciated at work and by his wife so he starts a relationship with a photocopier. They get along well until the photocopier starts wanting the lawyer to kill for it. 
The thing I liked about this story was how the author makes you feel sorry for Jack. His two co-workers are happily married and love their jobs while Jack is married to a paranoid alcoholic and he hates and works at a job he cant stand. You completely understand how he can fall in love with a copier because he is a lonely soul that no one understands except the copier of course.\\nAnother story in Help Wanted is Work Life Balance by Jeff Strand. In this story a man works for a company that starts to let their employees do what they want at work. It starts with letting them come to work a little later than usual, then the employees are allowed to hug and kiss on the job. Things get really out of hand though when the company starts letting employees carry knives and stab each other, as long as it doesnt interfere with their job. This story is meant to be more funny then scary but still has its scary moments. Jeff Strand does a great job mixing humor and horror in this story.\\nAnother good story in Help Wanted: On The Job Terror is The Chapel Of Unrest by Stephen Volk. This is a gothic horror story that takes place in the 1800s and has to deal with an undertaker who has the duty of capturing and embalming a ghoul who has been eating dead bodies in a graveyard. Stephen Volk through his use of imagery in describing the graveyard, the chapel and the clothes of the time, transports you into an 1800s gothic setting that reminded me of Bram Stokers Dracula.\\nOne more story in this anthology that I have to mention is Expulsion by Eric Shapiro which tells the tale of a mad man going into a office to kill his fellow employees. This is a very short but very powerful story that gets you into the mind of a disgruntled employee but manages to end on a positive note. Though there were stories I didnt like in Help Wanted, all in all its a very good anthology. 
I highly recommend this book &#8216;, &#8216;user&#8217;: &#8216;dwatson2&#8217;}n&#8221;<\/p>\n<p>b&#8221;{&#8216;work&#8217;: &#8216;12533765&#8217;, &#8216;flags&#8217;: [], &#8216;unixtime&#8217;: 1352937600, &#8216;nhelpful&#8217;: 0, &#8216;time&#8217;: &#8216;Nov 15, 2012&#8217;, &#8216;comment&#8217;: &#8216;Magoon, K. (2012). Fire in the streets. New York: Simon and Schuster\/Aladdin. 336 pp. ISBN: 978-1-4424-2230-8. (Hardcover); $16.99.\\nKekla Magoon is an author to watch (http:\/\/www.spicyreads.org\/Author_Videos.html- scroll down). One of my favorite books from 2007 is Magoons The Rock and the River. At the time, I mentioned in reviews that we have very few books that even mention the Black Panther Party, let alone deal with them in a careful, thorough way. Fire in the Streets continues the story Magoon began in her debut book. While her familys financial fortunes drip away, not helped by her mothers drinking and assortment of boyfriends, the Panthers provide a very real respite for Maxie. Sam is still dealing with the death of his brother. Maxies relationship with Sam only serves to confuse and upset them both. Her friends, Emmalee and Patrice, are slowly drifting away. The Panther Party is the only thing that seems to make sense and she basks in its routine and consistency. She longs to become a full member of the Panthers and constantly battles with her Panther brother Raheem over her maturity and ability to do more than office tasks. Maxie wants to have her own gun. When Maxie discovers that there is someone working with the Panthers that is leaking information to the government about Panther activity, Maxie investigates. Someone is attempting to destroy the only place that offers her shelter. Maxie is determined to discover the identity of the traitor, thinking that this will prove her worth to the organization. However, the truth is not simple and it is filled with pain. 
Unfortunately we still do not have many teen books that deal substantially with the Democratic National Convention in Chicago, the Black Panther Party, and the social problems in Chicago that lead to the civil unrest. Thankfully, Fire in the Streets lives up to the standard Magoon set with The Rock and the River. Readers will feel like they have stepped back in time. Magoons factual tidbits add journalistic realism to the story and only improves the atmosphere. Maxie has spunk. Readers will empathize with her Atlas-task of trying to hold onto her world. Fire in the Streets belongs in all middle school and high school libraries. While readers are able to read this story independently of The Rock and the River, I strongly urge readers to read both and in order. Magoons recognition by the Coretta Scott King committee and the NAACP Image awards are NOT mistakes!&#8217;, &#8216;user&#8217;: &#8216;edspicer&#8217;}n&#8221;<\/p>\n<p>b'{&#8216;work&#8217;: &#8216;12981302&#8217;, &#8216;flags&#8217;: [], &#8216;unixtime&#8217;: 1364515200, &#8216;stars&#8217;: 4.0, &#8216;nhelpful&#8217;: 0, &#8216;time&#8217;: &#8216;Mar 29, 2013&#8217;, &#8216;comment&#8217;: &#8220;Well, I definitely liked this book better than the last in the series. There was less fighting and more story. I liked both Toni and Ricky Lee and thought they were pretty good together. The banter between the two was sweet and often times funny. I enjoyed seeing some of the past characters and of course it&#8217;s always nice to be introduced to new ones. I just wonder how many more of these books there will be. At least two hopefully, one each for Rory and Reece. &#8220;, &#8216;user&#8217;: &#8216;amdrane2&#8242;}n&#8217;<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p>Each line in reviews.json is a record. We are going to extract the \u201cuser\u201d, \u201cwork\u201d, and \u201cstars\u201d fields of each record, skipping any record in which one of these three is missing. 
Despite the name, the records are not well-formed JSON strings (most notably, they use single quotes rather than double quotes). Therefore, we cannot use the <code>json<\/code> package from Python; instead, we use <code>ast<\/code> to decode such strings:<\/p>\n<div id=\"urvanov-syntax-highlighter-617903a206e8f925394157\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n...<br \/>\nimport ast<\/p>\n<p>reviews = []<br \/>\nwith tarfile.open(&quot;lthing_data.tar.gz&quot;) as tar:<br \/>\n    with tar.extractfile(&quot;lthing_data\/reviews.json&quot;) as file:<br \/>\n        for line in file:<br \/>\n            record = ast.literal_eval(line.decode(&quot;utf8&quot;))<br \/>\n            if any(x not in record for x in [&#39;user&#39;, &#39;work&#39;, &#39;stars&#39;]):<br \/>\n                continue<br \/>\n            reviews.append([record[&#39;user&#39;], record[&#39;work&#39;], record[&#39;stars&#39;]])<br \/>\nprint(len(reviews), &quot;records retrieved&quot;)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">ast<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">[<\/span><span 
class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-e\">with <\/span><span class=\"crayon-v\">tarfile<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">open<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&quot;lthing_data.tar.gz&quot;<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">as<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">tar<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-e\">with <\/span><span class=\"crayon-v\">tar<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">extractfile<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&quot;lthing_data\/reviews.json&quot;<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">as<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">file<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">line <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">file<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">record<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">ast<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">literal_eval<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">line<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">decode<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&quot;utf8&quot;<\/span><span 
class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">if<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">any<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-i\">x<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">not<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">record <\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">x<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#39;user&#39;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">&#39;work&#39;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">&#39;stars&#39;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">continue<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">append<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">record<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#39;user&#39;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">record<\/span><span class=\"crayon-sy\">[<\/span><span 
class=\"crayon-s\">&#8216;work&#8217;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">record<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8216;stars&#8217;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-e\">len<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">&#8220;records retrieved&#8221;<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<div id=\"urvanov-syntax-highlighter-617903a206e93676318072\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n1387209 records retrieved<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>1387209 records retrieved<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p>Now we should make a matrix of how different users rate each book. 
We make use of the pandas library to help convert the data we collected into a table:<\/p>\n<div id=\"urvanov-syntax-highlighter-617903a206e94279545852\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n&#8230;<br \/>\nimport pandas as pd<br \/>\nreviews = pd.DataFrame(reviews, columns=[&#8220;user&#8221;, &#8220;work&#8221;, &#8220;stars&#8221;])<br \/>\nprint(reviews.head())<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">pandas <\/span><span class=\"crayon-st\">as<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">pd<\/span><\/p>\n<p><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pd<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">DataFrame<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">columns<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8220;user&#8221;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">&#8220;work&#8221;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> 
<\/span><span class=\"crayon-s\">&#8220;stars&#8221;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">head<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<div id=\"urvanov-syntax-highlighter-617903a206e95518557137\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n            user      work  stars<br \/>\n0       van_stef   3206242    5.0<br \/>\n1       dwatson2  12198649    5.0<br \/>\n2       amdrane2  12981302    4.0<br \/>\n3  Lila_Gustavus   5231009    3.0<br \/>\n4      skinglist    184318    2.0<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0user\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0work\u00a0\u00a0stars<\/p>\n<p>0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 van_stef\u00a0\u00a0 3206242\u00a0\u00a0\u00a0\u00a05.0<\/p>\n<p>1\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 dwatson2\u00a0\u00a012198649\u00a0\u00a0\u00a0\u00a05.0<\/p>\n<p>2\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 amdrane2\u00a0\u00a012981302\u00a0\u00a0\u00a0\u00a04.0<\/p>\n<p>3\u00a0\u00a0Lila_Gustavus\u00a0\u00a0 
5231009\u00a0\u00a0\u00a0\u00a03.0<\/p>\n<p>4\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0skinglist\u00a0\u00a0\u00a0\u00a0184318\u00a0\u00a0\u00a0\u00a02.0<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p>To save time and memory in this example, we do not use all the data. Here we consider only those users who reviewed at least 50 books and those books that were reviewed by at least 50 users. This trims our dataset to less than 15% of its original size:<\/p>\n<div id=\"urvanov-syntax-highlighter-617903a206e96825887960\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n...<br \/>\n# Look for the users who reviewed at least 50 books<br \/>\nusercount = reviews[[&quot;work&quot;,&quot;user&quot;]].groupby(&quot;user&quot;).count()<br \/>\nusercount = usercount[usercount[&quot;work&quot;] &gt;= 50]<br \/>\nprint(usercount.head())<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># Look for the users who reviewed at least 50 books<\/span><\/p>\n<p><span class=\"crayon-v\">usercount<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&quot;work&quot;<\/span><span 
class=\"crayon-sy\">,<\/span><span class=\"crayon-s\">&#8220;user&#8221;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">groupby<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;user&#8221;<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">count<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">usercount<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">usercount<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">usercount<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8220;work&#8221;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">&gt;=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">50<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">usercount<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">head<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<div id=\"urvanov-syntax-highlighter-617903a206e97410157330\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n            work<br \/>\nuser<br \/>\n              84<br \/>\n-Eva-        602<br \/>\n06nwingert   370<br \/>\n1983mk        63<br \/>\n1dragones    194<\/textarea><\/p>\n<div 
class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0work<\/p>\n<p>user<\/p>\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a084<\/p>\n<p>-Eva-\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0602<\/p>\n<p>06nwingert\u00a0\u00a0 370<\/p>\n<p>1983mk\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a063<\/p>\n<p>1dragones\u00a0\u00a0\u00a0\u00a0194<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<div id=\"urvanov-syntax-highlighter-617903a206e98852915718\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n...<br \/>\n# Look for the books reviewed by at least 50 users<br \/>\nworkcount = reviews[[&quot;work&quot;,&quot;user&quot;]].groupby(&quot;work&quot;).count()<br \/>\nworkcount = workcount[workcount[&quot;user&quot;] &gt;= 50]<br \/>\nprint(workcount.head())<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># Look for the books reviewed by at least 50 users<\/span><\/p>\n<p><span class=\"crayon-v\">workcount<\/span><span class=\"crayon-h\"> <\/span><span 
class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8220;work&#8221;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-s\">&#8220;user&#8221;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">groupby<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;work&#8221;<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">count<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">workcount<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">workcount<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">workcount<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8220;user&#8221;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">&gt;=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">50<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">workcount<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">head<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<div id=\"urvanov-syntax-highlighter-617903a206e99982691667\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain 
print-no\" data-settings=\"dblclick\" readonly><br \/>\n          user<br \/>\nwork<br \/>\n10000      106<br \/>\n10001       53<br \/>\n1000167    186<br \/>\n10001797    53<br \/>\n10005525   134<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0user<\/p>\n<p>work<\/p>\n<p>10000\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0106<\/p>\n<p>10001\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 53<\/p>\n<p>1000167\u00a0\u00a0\u00a0\u00a0186<\/p>\n<p>10001797\u00a0\u00a0\u00a0\u00a053<\/p>\n<p>10005525\u00a0\u00a0 134<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<div id=\"urvanov-syntax-highlighter-617903a206e9a495757259\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n&#8230;<br \/>\n# Keep only the popular books and active users<br \/>\nreviews = reviews[reviews[&#8220;user&#8221;].isin(usercount.index) &amp; reviews[&#8220;work&#8221;].isin(workcount.index)]<br \/>\nprint(reviews)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-p\"># Keep only the popular books and active users<\/span><\/p>\n<p><span 
class=\"crayon-v\">reviews<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8220;user&#8221;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">isin<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">usercount<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">index<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">&amp;<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8220;work&#8221;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">isin<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">workcount<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">index<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<div id=\"urvanov-syntax-highlighter-617903a206e9b988732215\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n                user     work  stars<br \/>\n0           van_stef  3206242    5.0<br \/>\n6            justine     3067    4.5<br \/>\n18           stephmo  1594925    4.0<br \/>\n19         
Eyejaybee  2849559    5.0<br \/>\n35       LisaMaria_C   452949    4.5<br \/>\n&#8230;              &#8230;      &#8230;    &#8230;<br \/>\n1387161     connie53     1653    4.0<br \/>\n1387177   BruderBane    24623    4.5<br \/>\n1387192  StuartAston  8282225    4.0<br \/>\n1387202      danielx  9759186    4.0<br \/>\n1387206     jclark88  8253945    3.0<\/p>\n<p>[205110 rows x 3 columns]<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0user\u00a0\u00a0\u00a0\u00a0 work\u00a0\u00a0stars<\/p>\n<p>0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 van_stef\u00a0\u00a03206242\u00a0\u00a0\u00a0\u00a05.0<\/p>\n<p>6\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0justine\u00a0\u00a0\u00a0\u00a0 3067\u00a0\u00a0\u00a0\u00a04.5<\/p>\n<p>18\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 stephmo\u00a0\u00a01594925\u00a0\u00a0\u00a0\u00a04.0<\/p>\n<p>19\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Eyejaybee\u00a0\u00a02849559\u00a0\u00a0\u00a0\u00a05.0<\/p>\n<p>35\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 LisaMaria_C\u00a0\u00a0 452949\u00a0\u00a0\u00a0\u00a04.5<\/p>\n<p>&#8230;\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0&#8230;\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0&#8230;\u00a0\u00a0\u00a0\u00a0&#8230;<\/p>\n<p>1387161\u00a0\u00a0\u00a0\u00a0 connie53\u00a0\u00a0\u00a0\u00a0 1653\u00a0\u00a0\u00a0\u00a04.0<\/p>\n<p>1387177\u00a0\u00a0 
BruderBane\u00a0\u00a0\u00a0\u00a024623\u00a0\u00a0\u00a0\u00a04.5<\/p>\n<p>1387192\u00a0\u00a0StuartAston\u00a0\u00a08282225\u00a0\u00a0\u00a0\u00a04.0<\/p>\n<p>1387202\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0danielx\u00a0\u00a09759186\u00a0\u00a0\u00a0\u00a04.0<\/p>\n<p>1387206\u00a0\u00a0\u00a0\u00a0 jclark88\u00a0\u00a08253945\u00a0\u00a0\u00a0\u00a03.0<\/p>\n<p>\u00a0<\/p>\n<p>[205110 rows x 3 columns]<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p>Then we can use the \u201cpivot table\u201d function in pandas to convert this table into a matrix:<\/p>\n<div id=\"urvanov-syntax-highlighter-617903a206e9c860361667\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n...<br \/>\nreviewmatrix = reviews.pivot(index=&quot;user&quot;, columns=&quot;work&quot;, values=&quot;stars&quot;).fillna(0)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-v\">reviewmatrix<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">pivot<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">index<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&quot;user&quot;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span 
class=\"crayon-v\">columns<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&quot;work&quot;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">values<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&quot;stars&quot;<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">fillna<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p>The result is a matrix of 5593 rows and 2898 columns:<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-13020\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/10\/svd-matrix-1024x466.png\" alt=\"\" width=\"1024\" height=\"466\"><br \/>Here we have represented 5593 users (rows) and 2898 books (columns) in a matrix. 
Then we apply the SVD (this will take a while):<\/p>\n<div id=\"urvanov-syntax-highlighter-617903a206e9d282509642\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n&#8230;<br \/>\nfrom numpy.linalg import svd<br \/>\nmatrix = reviewmatrix.values<br \/>\nu, s, vh = svd(matrix, full_matrices=False)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-e\">from <\/span><span class=\"crayon-v\">numpy<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">linalg <\/span><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">svd<\/span><\/p>\n<p><span class=\"crayon-v\">matrix<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">reviewmatrix<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-i\">values<\/span><\/p>\n<p><span class=\"crayon-v\">u<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">s<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">vh<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">svd<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">matrix<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> 
<\/span><span class=\"crayon-v\">full_matrices<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">False<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p>By default, <code>svd()<\/code> returns a full singular value decomposition. We choose the reduced version so that the matrices are smaller and use less memory. The columns of <code>vh<\/code> correspond to the books. Based on the vector space model, we can find which book is most similar to the one we are looking at:<\/p>\n<div id=\"urvanov-syntax-highlighter-617903a206e9e292096821\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\n...<br \/>\nimport numpy as np<br \/>\ndef cosine_similarity(v,u):<br \/>\n    return (v @ u)\/ (np.linalg.norm(v) * np.linalg.norm(u))<\/p>\n<p>highest_similarity = -np.inf<br \/>\nhighest_sim_col = -1<br \/>\nfor col in range(1,vh.shape[1]):<br \/>\n    similarity = cosine_similarity(vh[:,0], vh[:,col])<br \/>\n    if similarity &gt; highest_similarity:<br \/>\n        highest_similarity = similarity<br \/>\n        highest_sim_col = col<\/p>\n<p>print(&quot;Column %d is most similar to column 0&quot; % highest_sim_col)<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-sy\">.<\/span><\/p>\n<p><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">numpy <\/span><span class=\"crayon-st\">as<\/span><span 
class=\"crayon-h\"> <\/span><span class=\"crayon-e\">np<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">cosine_similarity<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">v<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-v\">u<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-i\">v<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">@<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">u<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">\/<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">np<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">linalg<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">norm<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">v<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">*<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">np<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">linalg<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">norm<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">u<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-v\">highest_similarity<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-v\">np<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">inf<\/span><\/p>\n<p><span class=\"crayon-v\">highest_sim_col<\/span><span class=\"crayon-h\"> <\/span><span 
class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><\/p>\n<p><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">col <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-v\">vh<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">similarity<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">cosine_similarity<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">vh<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">vh<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-v\">col<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">if<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">similarity<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">&gt;<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">highest_similarity<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span 
class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">highest_similarity<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">similarity<\/span><\/p>\n<p><span class=\"crayon-e\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">highest_sim_col<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">col<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&quot;Column %d is most similar to column 0&quot;<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">%<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">highest_sim_col<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p>In the above example, we try to find the book that is the best match to the first column. 
The result is:<\/p>\n<div id=\"urvanov-syntax-highlighter-617903a206e9f619375293\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\nColumn 906 is most similar to column 0<\/textarea><\/p>\n<div class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p>Column 906 is most similar to column 0<\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<p>In a recommendation system, when a user picks a book, we may show her a few other books that are similar to the one she picked, based on the cosine similarity calculated above.<\/p>\n<p>Depending on the dataset, we may use truncated SVD to reduce the dimension of matrix <code>vh<\/code>. In essence, this means we remove the rows of <code>vh<\/code> whose corresponding singular values in <code>s<\/code> are small before we use it to compute the similarity. This would likely make the prediction more accurate, as the less important features of a book are removed from consideration.<\/p>\n<p>Note that in the decomposition $M=U\\cdot\\Sigma\\cdot V^T$, we know that the rows of $U$ are the users and the columns of $V^T$ are the books, but we cannot identify the meaning of the columns of $U$ or the rows of $V^T$ (or, equivalently, that of $\\Sigma$). They could be genres, for example, that provide some underlying connection between the users and the books, but we cannot be sure what exactly they are. 
However, this does not stop us from using them as <strong>features<\/strong>\u00a0in our recommendation system.<\/p>\n<p>Tying it all together, the following is the complete code:<\/p>\n<div id=\"urvanov-syntax-highlighter-617903a206ea0033537136\" class=\"urvanov-syntax-highlighter-syntax crayon-theme-classic urvanov-syntax-highlighter-font-monaco urvanov-syntax-highlighter-os-pc print-yes notranslate\" data-settings=\" minimize scroll-mouseover disable-anim\">\n<p><textarea class=\"urvanov-syntax-highlighter-plain print-no\" data-settings=\"dblclick\" readonly><br \/>\nimport tarfile<br \/>\nimport ast<br \/>\nimport pandas as pd<br \/>\nimport numpy as np<\/p>\n<p># Read downloaded file from:<br \/>\n# http:\/\/deepyeti.ucsd.edu\/jmcauley\/datasets\/librarything\/lthing_data.tar.gz<br \/>\nwith tarfile.open(&#8220;lthing_data.tar.gz&#8221;) as tar:<br \/>\n    print(&#8220;Files in tar archive:&#8221;)<br \/>\n    tar.list()<\/p>\n<p>    print(&#8220;\\nSample records:&#8221;)<br \/>\n    with tar.extractfile(&#8220;lthing_data\/reviews.json&#8221;) as file:<br \/>\n        count = 0<br \/>\n        for line in file:<br \/>\n            print(line)<br \/>\n            count += 1<br \/>\n            if count &gt; 3:<br \/>\n                break<\/p>\n<p># Collect records<br \/>\nreviews = []<br \/>\nwith tarfile.open(&#8220;lthing_data.tar.gz&#8221;) as tar:<br \/>\n    with tar.extractfile(&#8220;lthing_data\/reviews.json&#8221;) as file:<br \/>\n        for line in file:<br \/>\n            try:<br \/>\n                record = ast.literal_eval(line.decode(&#8220;utf8&#8221;))<br \/>\n            except:<br \/>\n                print(line.decode(&#8220;utf8&#8221;))<br \/>\n                raise<br \/>\n            if any(x not in record for x in [&#8216;user&#8217;, &#8216;work&#8217;, &#8216;stars&#8217;]):<br \/>\n                continue<br \/>\n            reviews.append([record[&#8216;user&#8217;], record[&#8216;work&#8217;], record[&#8216;stars&#8217;]])<br 
\/>\nprint(len(reviews), &#8220;records retrieved&#8221;)<\/p>\n<p># Print a few samples of what we collected<br \/>\nreviews = pd.DataFrame(reviews, columns=[&#8220;user&#8221;, &#8220;work&#8221;, &#8220;stars&#8221;])<br \/>\nprint(reviews.head())<\/p>\n<p># Look for the users who reviewed at least 50 books<br \/>\nusercount = reviews[[&#8220;work&#8221;,&#8221;user&#8221;]].groupby(&#8220;user&#8221;).count()<br \/>\nusercount = usercount[usercount[&#8220;work&#8221;] &gt;= 50]<\/p>\n<p># Look for the books reviewed by at least 50 users<br \/>\nworkcount = reviews[[&#8220;work&#8221;,&#8221;user&#8221;]].groupby(&#8220;work&#8221;).count()<br \/>\nworkcount = workcount[workcount[&#8220;user&#8221;] &gt;= 50]<\/p>\n<p># Keep only the popular books and active users<br \/>\nreviews = reviews[reviews[&#8220;user&#8221;].isin(usercount.index) &amp; reviews[&#8220;work&#8221;].isin(workcount.index)]<br \/>\nprint(&#8220;\\nSubset of data:&#8221;)<br \/>\nprint(reviews)<\/p>\n<p># Convert records into user-book review score matrix<br \/>\nreviewmatrix = reviews.pivot(index=&#8221;user&#8221;, columns=&#8221;work&#8221;, values=&#8221;stars&#8221;).fillna(0)<br \/>\nmatrix = reviewmatrix.values<\/p>\n<p># Singular value decomposition<br \/>\nu, s, vh = np.linalg.svd(matrix, full_matrices=False)<\/p>\n<p># Find the highest similarity<br \/>\ndef cosine_similarity(v,u):<br \/>\n    return (v @ u)\/ (np.linalg.norm(v) * np.linalg.norm(u))<\/p>\n<p>highest_similarity = -np.inf<br \/>\nhighest_sim_col = -1<br \/>\nfor col in range(1,vh.shape[1]):<br \/>\n    similarity = cosine_similarity(vh[:,0], vh[:,col])<br \/>\n    if similarity &gt; highest_similarity:<br \/>\n        highest_similarity = similarity<br \/>\n        highest_sim_col = col<\/p>\n<p>print(&#8220;Column %d (book id %s) is most similar to column 0 (book id %s)&#8221; %<br \/>\n        (highest_sim_col, reviewmatrix.columns[highest_sim_col], reviewmatrix.columns[0])<br \/>\n)<\/textarea><\/p>\n<div 
class=\"urvanov-syntax-highlighter-main\">\n<table class=\"crayon-table\">\n<tr class=\"urvanov-syntax-highlighter-row\">\n<td class=\"crayon-nums \" data-settings=\"show\">\n<div class=\"urvanov-syntax-highlighter-nums-content\">\n<\/div>\n<\/td>\n<td class=\"urvanov-syntax-highlighter-code\">\n<div class=\"crayon-pre\">\n<p><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">tarfile<\/span><\/p>\n<p><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">ast<\/span><\/p>\n<p><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">pandas <\/span><span class=\"crayon-st\">as<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">pd<\/span><\/p>\n<p><span class=\"crayon-e\">import <\/span><span class=\"crayon-e\">numpy <\/span><span class=\"crayon-st\">as<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">np<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Read downloaded file from:<\/span><\/p>\n<p><span class=\"crayon-p\"># http:\/\/deepyeti.ucsd.edu\/jmcauley\/datasets\/librarything\/lthing_data.tar.gz<\/span><\/p>\n<p><span 
class=\"crayon-e\">with <\/span><span class=\"crayon-v\">tarfile<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">open<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;lthing_data.tar.gz&#8221;<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">as<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">tar<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;Files in tar archive:&#8221;<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">tar<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">list<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;\\nSample records:&#8221;<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-e\">with <\/span><span class=\"crayon-v\">tar<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">extractfile<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;lthing_data\/reviews.json&#8221;<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">as<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">file<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">count<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span 
class=\"crayon-cn\">0<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">line <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">file<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">line<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">count<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">+=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">1<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">if<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">count<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">&gt;<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">3<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">break<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Collect records<\/span><\/p>\n<p><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-e\">with <\/span><span class=\"crayon-v\">tarfile<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">open<\/span><span class=\"crayon-sy\">(<\/span><span 
class=\"crayon-s\">&#8220;lthing_data.tar.gz&#8221;<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">as<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">tar<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-e\">with <\/span><span class=\"crayon-v\">tar<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">extractfile<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;lthing_data\/reviews.json&#8221;<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">as<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">file<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">line <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">file<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">try<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">record<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">ast<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">literal_eval<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">line<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">decode<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;utf8&#8221;<\/span><span 
class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">except<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">line<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">decode<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;utf8&#8221;<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-e\">raise<\/span><\/p>\n<p><span class=\"crayon-e\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">if<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">any<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-i\">x<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">not<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">record <\/span><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-i\">x<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8216;user&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">&#8216;work&#8217;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">&#8216;stars&#8217;<\/span><span class=\"crayon-sy\">]<\/span><span 
class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">continue<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">append<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">record<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8216;user&#8217;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">record<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8216;work&#8217;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">record<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8216;stars&#8217;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-e\">len<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">&#8220;records retrieved&#8221;<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Print a few samples of what we collected<\/span><\/p>\n<p><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">pd<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">DataFrame<\/span><span 
class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">columns<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8220;user&#8221;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">&#8220;work&#8221;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-s\">&#8220;stars&#8221;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">head<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Look for the users who reviewed at least 50 books<\/span><\/p>\n<p><span class=\"crayon-v\">usercount<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8220;work&#8221;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-s\">&#8220;user&#8221;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">groupby<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;user&#8221;<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">count<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">usercount<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span 
class=\"crayon-v\">usercount<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">usercount<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8220;work&#8221;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">&gt;=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">50<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Look for the books reviewed by at least 50 users<\/span><\/p>\n<p><span class=\"crayon-v\">workcount<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8220;work&#8221;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-s\">&#8220;user&#8221;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">groupby<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;work&#8221;<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">count<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">workcount<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">workcount<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">workcount<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8220;user&#8221;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">&gt;=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-cn\">50<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Keep only the popular books and active 
users<\/span><\/p>\n<p><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8220;user&#8221;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">isin<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">usercount<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">index<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">&amp;<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-s\">&#8220;work&#8221;<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">isin<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">workcount<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">index<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">]<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;\\nSubset of data:&#8221;<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Convert records into user-book review score matrix<\/span><\/p>\n<p><span class=\"crayon-v\">reviewmatrix<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">reviews<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">pivot<\/span><span class=\"crayon-sy\">(<\/span><span 
class=\"crayon-v\">index<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8220;user&#8221;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">columns<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8220;work&#8221;<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">values<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-s\">&#8220;stars&#8221;<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">fillna<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-v\">matrix<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">reviewmatrix<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-i\">values<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Singular value decomposition<\/span><\/p>\n<p><span class=\"crayon-v\">u<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">s<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">vh<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">np<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">linalg<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">svd<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">matrix<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">full_matrices<\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-t\">False<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-p\"># Find the highest 
similarity<\/span><\/p>\n<p><span class=\"crayon-e\">def <\/span><span class=\"crayon-e\">cosine_similarity<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">v<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-v\">u<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">return<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-i\">v<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">@<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">u<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">\/<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">np<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">linalg<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">norm<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">v<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">*<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">np<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">linalg<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">norm<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">u<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-v\">highest_similarity<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-v\">np<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-e\">inf<\/span><\/p>\n<p><span class=\"crayon-v\">highest_sim_col<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> 
<\/span><span class=\"crayon-o\">-<\/span><span class=\"crayon-cn\">1<\/span><\/p>\n<p><span class=\"crayon-st\">for<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">col <\/span><span class=\"crayon-st\">in<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">range<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-v\">vh<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">shape<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">1<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">similarity<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">cosine_similarity<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">vh<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">vh<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-o\">:<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-v\">col<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-st\">if<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">similarity<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">&gt;<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">highest_similarity<\/span><span class=\"crayon-o\">:<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span 
class=\"crayon-v\">highest_similarity<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">similarity<\/span><\/p>\n<p><span class=\"crayon-e\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-v\">highest_sim_col<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">=<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-e\">col<\/span><\/p>\n<p>\u00a0<\/p>\n<p><span class=\"crayon-e\">print<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-s\">&#8220;Column %d (book id %s) is most similar to column 0 (book id %s)&#8221;<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-o\">%<\/span><\/p>\n<p><span class=\"crayon-h\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/span><span class=\"crayon-sy\">(<\/span><span class=\"crayon-v\">highest_sim_col<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">reviewmatrix<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">columns<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-v\">highest_sim_col<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">,<\/span><span class=\"crayon-h\"> <\/span><span class=\"crayon-v\">reviewmatrix<\/span><span class=\"crayon-sy\">.<\/span><span class=\"crayon-v\">columns<\/span><span class=\"crayon-sy\">[<\/span><span class=\"crayon-cn\">0<\/span><span class=\"crayon-sy\">]<\/span><span class=\"crayon-sy\">)<\/span><\/p>\n<p><span class=\"crayon-sy\">)<\/span><\/p>\n<\/div>\n<\/td>\n<\/tr>\n<\/table><\/div>\n<\/p><\/div>\n<h2>Further reading<\/h2>\n<p>This section provides more resources on the topic if you are looking to go deeper.<\/p>\n<h3>Books<\/h3>\n<h3>APIs<\/h3>\n<h3>Articles<\/h3>\n<h2>Summary<\/h2>\n<p>In this tutorial, you discovered how to build a recommender system using singular value decomposition.<\/p>\n<p>Specifically, you 
learned:<\/p>\n<ul>\n<li>What a singular value decomposition means for a matrix<\/li>\n<li>How to interpret the result of a singular value decomposition<\/li>\n<li>How to find similarity from the columns of matrix $V^T$ obtained from singular value decomposition, and make recommendations based on the similarity<\/li>\n<\/ul>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/machinelearningmastery.com\/using-singular-value-decomposition-to-build-a-recommender-system\/<\/p>\n","protected":false},"author":0,"featured_media":1088,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/1087"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=1087"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/1087\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/1088"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=1087"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=1087"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=1087"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}