{"id":1215,"date":"2021-11-18T08:32:46","date_gmt":"2021-11-18T08:32:46","guid":{"rendered":"https:\/\/salarydistribution.com\/machine-learning\/2021\/11\/18\/mlperf-hpc-benchmarks-show-the-power-of-hpcai\/"},"modified":"2021-11-18T08:32:46","modified_gmt":"2021-11-18T08:32:46","slug":"mlperf-hpc-benchmarks-show-the-power-of-hpcai","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2021\/11\/18\/mlperf-hpc-benchmarks-show-the-power-of-hpcai\/","title":{"rendered":"MLPerf HPC Benchmarks Show the Power of HPC+AI\u00a0"},"content":{"rendered":"<div data-url=\"https:\/\/blogs.nvidia.com\/blog\/2021\/11\/17\/mlperf-hpc-ai\/\" data-title=\"MLPerf HPC Benchmarks Show the Power of HPC+AI\u00a0\" data-hashtags=\"\">\n<p>NVIDIA-powered systems won four of five tests in MLPerf HPC 1.0, an industry benchmark for AI performance on scientific applications in high performance computing.<\/p>\n<p>They\u2019re the latest results from <a href=\"https:\/\/www.nvidia.com\/en-us\/data-center\/mlperf\/\">MLPerf<\/a>, a set of industry benchmarks for deep learning first released in May 2018. MLPerf HPC addresses a style of computing that speeds and augments simulations on supercomputers with AI.<\/p>\n<p>Recent advances in <a href=\"https:\/\/blogs.nvidia.com\/blog\/2020\/11\/19\/covid-ai-gordon-bell-winner\/\">molecular dynamics<\/a>, <a href=\"https:\/\/blogs.nvidia.com\/blog\/2020\/09\/23\/hpc-ai-black-holes\/\">astronomy<\/a> and <a href=\"https:\/\/www.ecmwf.int\/en\/about\/media-centre\/science-blog\/2021\/large-scale-machine-learning-applications-weather-and\">climate simulation<\/a> all used HPC+AI to make scientific breakthroughs. 
It’s a trend driving the adoption of <a href="https://blogs.nvidia.com/blog/2021/10/18/exascale-day-ai/">exascale AI</a> for users in both science and industry.</p>
<h2><b>What the Benchmarks Measure</b></h2>
<p>MLPerf HPC 1.0 measured training of AI models in three typical workloads for HPC centers.</p>
<ul>
<li>CosmoFlow estimates details of objects in images from telescopes.</li>
<li>DeepCAM tests detection of hurricanes and atmospheric rivers in climate data.</li>
<li>OpenCatalyst tracks how well systems predict forces among atoms in molecules.</li>
</ul>
<p>Each test has two parts. Strong scaling measures how fast a system trains a single model. Its counterpart, weak scaling, measures maximum system throughput, that is, how many models a system can train in a given time.</p>
<p>Compared to the best strong-scaling results from last year’s MLPerf 0.7 round, NVIDIA delivered 5x better results for CosmoFlow. In DeepCAM, we delivered nearly 7x more performance.</p>
<p>The <a href="https://blogs.nvidia.com/blog/2021/05/27/nersc-perlmutter-ai-supercomputer/">Perlmutter</a> Phase 1 system at Lawrence Berkeley National Lab led in strong scaling on the OpenCatalyst benchmark, using 512 of its 6,144 <a href="https://www.nvidia.com/en-us/data-center/a100/">NVIDIA A100 Tensor Core GPUs</a>.</p>
<p>In the weak-scaling category, we led DeepCAM using 16 nodes per job and 256 simultaneous jobs.
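</p>
<p>To make the two metrics concrete, here is a minimal sketch: strong scaling is a speedup in time-to-train for one model, while weak scaling is aggregate throughput across simultaneous jobs. All numbers below are invented for illustration, not MLPerf results.</p>

```python
# Hypothetical illustration of MLPerf HPC's two scaling metrics.
# All numbers are invented for the example; they are not MLPerf results.

def strong_scaling_speedup(baseline_minutes: float, new_minutes: float) -> float:
    """Strong scaling: how much faster one model now trains end to end."""
    return baseline_minutes / new_minutes

def weak_scaling_throughput(models_trained: int, wall_clock_minutes: float) -> float:
    """Weak scaling: models trained per minute across all simultaneous jobs."""
    return models_trained / wall_clock_minutes

# Cutting time-to-train from 35 min to 7 min is a 5x strong-scaling gain.
print(strong_scaling_speedup(35.0, 7.0))      # prints 5.0
# 256 simultaneous jobs all finishing within 128 min is 2 models/min.
print(weak_scaling_throughput(256, 128.0))    # prints 2.0
```

<p>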
All our tests ran on <a href="https://blogs.nvidia.com/blog/2020/08/14/making-selene-pandemic-ai/">NVIDIA Selene</a> (pictured above), our in-house system and the world’s largest industrial supercomputer.</p>
<figure id="attachment_54172" aria-describedby="caption-attachment-54172" class="wp-caption aligncenter"><a href="https://blogs.nvidia.com/wp-content/uploads/2021/11/MLPerf-HPC-results-FINAL-1-scaled.jpg"><img decoding="async" loading="lazy" src="https://blogs.nvidia.com/wp-content/uploads/2021/11/MLPerf-HPC-results-FINAL-1-672x353.jpg" alt="NVIDIA wins MLPerf HPC, Nov 2021" width="672" height="353"></a><figcaption id="caption-attachment-54172" class="wp-caption-text">NVIDIA delivered leadership results in both the speed of training a model and per-chip efficiency.</figcaption></figure>
<p>The latest results demonstrate another dimension of the NVIDIA AI platform and its performance leadership. They mark the eighth straight time NVIDIA has delivered top scores in MLPerf benchmarks, which span AI training and inference in the data center, the cloud and the network’s edge.</p>
<h2><b>A Broad Ecosystem</b></h2>
<p>Seven of the eight participants in this round submitted results using NVIDIA GPUs.</p>
<p>They include the Jülich Supercomputing Centre in Germany, the Swiss National Supercomputing Centre and, in the U.S., the Argonne and Lawrence Berkeley National Laboratories, the National Center for Supercomputing Applications and the Texas Advanced Computing Center.</p>
<p>“With the benchmark test, we have shown that our machine can unfold its potential in practice and contribute to keeping Europe on the ball when it comes to AI,” said Thomas Lippert, director of the Jülich Supercomputing Centre, in <a href="https://www.fz-juelich.de/SharedDocs/Meldungen/PORTAL/EN/2021/2021-11-18-mlperf-hpc">a blog</a>.</p>
<p>The MLPerf benchmarks are backed by <a
href=\"https:\/\/mlcommons.org\/en\/\">MLCommons<\/a>, an industry group led by Alibaba, Google, Intel, Meta, NVIDIA and others.<\/p>\n<h2><b>How We Did It<\/b><\/h2>\n<p>The strong showing is the result of a mature NVIDIA AI platform that includes a full stack of software.<\/p>\n<p>In this round, we tuned our code with tools available to everyone, such as <a href=\"https:\/\/docs.nvidia.com\/deeplearning\/dali\/user-guide\/docs\/\">NVIDIA DALI<\/a> to accelerate data processing and <a href=\"https:\/\/developer.nvidia.com\/blog\/cuda-graphs\/\">CUDA Graphs<\/a> to reduce small-batch latency for efficiently scaling up to 1,024 or more GPUs.<\/p>\n<p>We also applied <a href=\"https:\/\/www.youtube.com\/watch?v=uzYZP_z_5WE&amp;t=3s\">NVIDIA SHARP<\/a>, a key component within <a href=\"https:\/\/www.nvidia.com\/en-us\/data-center\/magnum-io\/\">NVIDIA MagnumIO<\/a>. It provides in-network computing to accelerate communications and offload data operations to the <a href=\"https:\/\/www.nvidia.com\/en-us\/networking\/infiniband-switching\/\">NVIDIA Quantum InfiniBand switch<\/a>.<\/p>\n<p>For a deeper dive into how we used these tools see our developer <a href=\"https:\/\/developer.nvidia.com\/blog\/mlperf-hpc-v1-0-deep-dive-into-optimizations-leading-to-record-setting-nvidia-performance\/\">blog<\/a>.<\/p>\n<p>All the software we used for our submissions is available from the MLPerf repository. 
We regularly add such code to the <a href="https://ngc.nvidia.com/catalog">NGC catalog</a>, our software hub for pretrained AI models, industry application frameworks, GPU applications and other software resources.</p>
</div>