{"id":2543,"date":"2022-09-20T16:40:04","date_gmt":"2022-09-20T16:40:04","guid":{"rendered":"https:\/\/salarydistribution.com\/machine-learning\/2022\/09\/20\/why-the-new-nvidia-grace-hopper-superchip-is-ideal-for-next-gen-recommender-systems\/"},"modified":"2022-09-20T16:40:04","modified_gmt":"2022-09-20T16:40:04","slug":"why-the-new-nvidia-grace-hopper-superchip-is-ideal-for-next-gen-recommender-systems","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2022\/09\/20\/why-the-new-nvidia-grace-hopper-superchip-is-ideal-for-next-gen-recommender-systems\/","title":{"rendered":"Why the New NVIDIA Grace Hopper Superchip Is Ideal for Next-Gen Recommender Systems"},"content":{"rendered":"<div data-url=\"https:\/\/blogs.nvidia.com\/blog\/2022\/09\/20\/grace-hopper-recommender-systems\/\" data-title=\"Why the New NVIDIA Grace Hopper Superchip Is Ideal for Next-Gen Recommender Systems\" data-hashtags=\"\">\n<p>Recommender systems, the economic engines of the internet, are getting a new turbocharger: the <a href=\"https:\/\/nvidianews.nvidia.com\/news\/nvidia-introduces-grace-cpu-superchip\">NVIDIA Grace Hopper Superchip<\/a>.<\/p>\n<p>Every day, recommenders serve up trillions of search results, ads, products, music and news stories to billions of people. They\u2019re among the most important AI models of our time because they\u2019re incredibly effective at finding in the internet\u2019s pandemonium the pearls users want.<\/p>\n<p>These machine learning pipelines run on data, terabytes of it. The more data recommenders consume, the more accurate their results and the more return on investment they deliver.<\/p>\n<p>To process this data tsunami, companies are already adopting <a href=\"https:\/\/blogs.nvidia.com\/blog\/2021\/09\/01\/what-is-accelerated-computing\/\">accelerated computing<\/a> to personalize services for their customers. 
Grace Hopper will take their advances to the next level.<\/p>\n<h2><b>GPUs Drive 16% More Engagement<\/b><\/h2>\n<p>Pinterest, the image-sharing social media company, was able to move to 100x larger recommender models by adopting NVIDIA GPUs. That increased engagement by 16% for its more than 400 million users.<\/p>\n<p>\u201cNormally, we would be happy with a 2% increase, and 16% is just a beginning,\u201d a software engineer at the company said in <a href=\"https:\/\/blogs.nvidia.com\/blog\/2022\/08\/04\/pinterest-gpu-acceleration-recommenders\/\">a recent blog<\/a>. \u201cWe see additional gains \u2014 it opens a lot of doors for opportunities.\u201d<\/p>\n<figure id=\"attachment_59627\" aria-describedby=\"caption-attachment-59627\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2022\/09\/Recommendation-system-1-scaled.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"size-large wp-image-59627\" src=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2022\/09\/Recommendation-system-1-672x359.jpg\" alt=\"Recommendation systems on Grace Hopper\" width=\"672\" height=\"359\"><\/a><figcaption id=\"caption-attachment-59627\" class=\"wp-caption-text\">Recommenders consume tens of terabytes of embeddings, data tables that provide context for making accurate predictions.<\/figcaption><\/figure>\n<p>The next generation of the NVIDIA AI platform promises even greater gains for companies processing massive datasets with super-sized recommender models.<\/p>\n<p>Because data is the fuel of AI, Grace Hopper is designed to pump more data through recommender systems than any other processor on the planet.<\/p>\n<h2><b>NVLink Accelerates Grace Hopper<\/b><\/h2>\n<p>Grace Hopper achieves this because it\u2019s a superchip \u2014 two chips in one unit, sharing a superfast chip-to-chip interconnect. 
It\u2019s an Arm-based <a href=\"https:\/\/www.nvidia.com\/en-us\/data-center\/grace-cpu\/\">NVIDIA Grace CPU<\/a> and a Hopper GPU that communicate over <a href=\"https:\/\/www.nvidia.com\/en-us\/data-center\/nvlink-c2c\/\">NVIDIA NVLink-C2C<\/a>.<\/p>\n<p>What\u2019s more, NVLink also connects many superchips into a super system, a computing <a href=\"https:\/\/blogs.nvidia.com\/blog\/2021\/03\/05\/what-is-a-cluster-pod\/\">cluster<\/a> built to run terabyte-class recommender systems.<\/p>\n<p>NVLink carries data at a whopping 900 gigabytes per second \u2014 7x the bandwidth of PCIe Gen 5, the interconnect most upcoming leading-edge systems will use.<\/p>\n<p>That means Grace Hopper feeds recommenders 7x more of the embeddings \u2014 data tables packed with context \u2014 that they need to personalize results for users.<\/p>\n<h2><b>More Memory, Greater Efficiency<\/b><\/h2>\n<p>The Grace CPU uses LPDDR5X, a type of memory that strikes the optimal balance of bandwidth, energy efficiency, capacity and cost for recommender systems and other demanding workloads. It provides 50% more bandwidth than traditional DDR5 memory subsystems while using an eighth of the power per gigabyte.<\/p>\n<p>Any Hopper GPU in a cluster can access Grace\u2019s memory over NVLink. 
This feature gives Grace Hopper the largest pools of GPU memory ever.<\/p>\n<p>In addition, NVLink-C2C requires just 1.3 picojoules per bit transferred, giving it more than 5x the energy efficiency of PCIe Gen 5.<\/p>\n<p>The overall result is that recommenders get up to 4x more performance and greater efficiency using Grace Hopper than using Hopper with traditional CPUs (see chart below).<\/p>\n<p><a href=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2022\/09\/Grace-Hopper-recsys-perf-FINAL.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-large wp-image-59630\" src=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2022\/09\/Grace-Hopper-recsys-perf-FINAL-672x428.jpg\" alt=\"Grace Hopper accelerates recommenders\" width=\"672\" height=\"428\"><\/a><\/p>\n<h2><b>All the Software You Need<\/b><\/h2>\n<p>The Grace Hopper Superchip runs the full stack of NVIDIA AI software used in some of the world\u2019s largest recommender systems today.<\/p>\n<p><a href=\"https:\/\/developer.nvidia.com\/nvidia-merlin\">NVIDIA Merlin<\/a> is the rocket fuel of recommenders, a collection of models, methods and libraries for building AI systems that can provide better predictions and increase clicks.<\/p>\n<p><a href=\"https:\/\/developer.nvidia.com\/blog\/accelerating-embedding-with-the-hugectr-tensorflow-embedding-plugin\/\">NVIDIA Merlin HugeCTR<\/a>, a recommender framework, helps users process massive datasets fast across distributed GPU clusters with help from the <a href=\"https:\/\/developer.nvidia.com\/nccl\">NVIDIA Collective Communications Library<\/a>.<\/p>\n<p>Learn more about Grace Hopper and NVLink in this <a href=\"https:\/\/developer.nvidia.com\/blog\/inside-nvidia-grace-cpu-nvidia-amps-up-superchip-engineering-for-hpc-and-ai\/?ncid=so-nvsh-307816#cid=hpc06_so-nvsh_en-us\">technical blog<\/a>. 
Watch <a href=\"https:\/\/register.nvidia.com\/flow\/nvidia\/gtcfall2022\/attendeeportal\/page\/sessioncatalog\/session\/1657686623816001KO2t?tab.catalogallsessionstab=16566177511100015Kus\">this GTC session<\/a> to learn more about building recommender systems.<\/p>\n<p>You can also hear NVIDIA CEO and co-founder Jensen Huang provide perspective on recommenders <a href=\"https:\/\/youtu.be\/PWcNlRI00jo?t=4559\">here<\/a>\u00a0or watch the full GTC keynote below.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/blogs.nvidia.com\/blog\/2022\/09\/20\/grace-hopper-recommender-systems\/<\/p>\n","protected":false},"author":0,"featured_media":2544,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/2543"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=2543"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/2543\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/2544"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=2543"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=2543"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=2543"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}
}