{"id":558,"date":"2020-11-19T02:26:04","date_gmt":"2020-11-19T02:26:04","guid":{"rendered":"https:\/\/machine-learning.webcloning.com\/2020\/11\/19\/nvidia-ampere-computing-raise-arm-26x-in-supercomputing\/"},"modified":"2020-11-19T02:26:04","modified_gmt":"2020-11-19T02:26:04","slug":"nvidia-ampere-computing-raise-arm-26x-in-supercomputing","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2020\/11\/19\/nvidia-ampere-computing-raise-arm-26x-in-supercomputing\/","title":{"rendered":"NVIDIA, Ampere Computing Raise Arm 26x in Supercomputing"},"content":{"rendered":"<div data-url=\"https:\/\/blogs.nvidia.com\/blog\/2020\/11\/17\/arm-ampere-hpc-sc20\/\" data-title=\"NVIDIA, Ampere Computing Raise Arm 26x in Supercomputing\">\n<p>In the past 18 months, researchers have witnessed a whopping 25.5x performance boost for Arm-based platforms in high performance computing, thanks to the combined efforts of the Arm and NVIDIA ecosystems.<\/p>\n<p>Many engineers deserve a round of applause for the gains.<\/p>\n<ul>\n<li>The Arm Neoverse N1 core gave systems-on-a-chip like Ampere Computing\u2019s Altra an estimated 2.3x improvement over last year\u2019s designs.<\/li>\n<li>NVIDIA\u2019s <a href=\"https:\/\/www.nvidia.com\/en-us\/data-center\/a100\/\">A100 Tensor Core GPUs<\/a> delivered its largest ever gains in a single generation.<\/li>\n<li>The latest platforms upshifted to more and faster cores, input\/output lanes and memory.<\/li>\n<li>And application developers tuned their software with many new optimizations.<\/li>\n<\/ul>\n<p>As a result, NVIDIA\u2019s Arm-based reference design for HPC, with two Ampere Altra SoCs and two A100 GPUs, just delivered 25.5x the muscle of the dual-SoC servers researchers were using in June 2019. 
Our GPU-accelerated, Arm-based reference platform alone saw a 2.5x performance gain in 12 months.<\/p>\n<p>The results span applications \u2014 including GROMACS, LAMMPS, MILC, NAMD and Quantum Espresso \u2014 that are key to work like <a href=\"https:\/\/blogs.nvidia.com\/blog\/2020\/09\/28\/drug-discovery-covid-19\/\">drug discovery<\/a>, a top priority during the pandemic. These and many other applications ready to run on Arm-based systems are available in containers on <a href=\"https:\/\/ngc.nvidia.com\/\">NGC<\/a>, our hub for GPU-accelerated software.<\/p>\n<p>Companies and researchers pushing the limits in areas such as molecular dynamics and quantum chemistry can harness these apps to drive advances not only in basic science but in fields such as healthcare.<\/p>\n<h2><b>Under the Hood with Arm and HPC<\/b><\/h2>\n<p>The latest reference architecture marries the energy-efficient throughput of Ampere Computing\u2019s <a href=\"https:\/\/amperecomputing.com\/wp-content\/uploads\/2020\/11\/Mt._Jade_PB_v0.65_20201102.pdf\">Mt. Jade<\/a>, a 2U-sized server platform, with NVIDIA\u2019s HGX A100 that\u2019s already accelerating <a href=\"https:\/\/blogs.nvidia.com\/blog\/2020\/05\/15\/hpc-supercomputers-a100-gpus\/\">several supercomputers around the world<\/a>. It\u2019s the successor to a design that debuted last year based on the Marvell ThunderX2 and NVIDIA V100 GPUs.<\/p>\n<p>Mt. Jade consists of two Ampere Altra SoCs, each packing 80 Arm Neoverse N1 cores, all running at up to 3 GHz. They provide a combined 192 PCI Express Gen4 lanes and up to 8TB of memory to feed two A100 GPUs.<\/p>\n<figure id=\"attachment_47834\" aria-describedby=\"caption-attachment-47834\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2020\/11\/Mt-Jade-x1280.jpg\"><img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2020\/11\/Mt-Jade-x1280-400x194.jpg\" alt=\"Ampere Computing Mt. 
Jade reference design\" width=\"400\" height=\"194\"><\/a><figcaption id=\"caption-attachment-47834\" class=\"wp-caption-text\">The Mt. Jade server platform supports 192 PCIe Gen4 lanes.<\/figcaption><\/figure>\n<p>The combination creates a compelling node for next-generation supercomputers. Ampere Computing has already attracted support from nine original equipment and design manufacturers and systems integrators, including Gigabyte, Lenovo and Wiwynn.<\/p>\n<h2><b>A Rising Arm HPC Ecosystem<\/b><\/h2>\n<p>In another sign of an expanding ecosystem, the <a href=\"https:\/\/a-hug.org\/\">Arm HPC User Group<\/a> hosted a virtual event ahead of SC20 with more than three dozen talks from organizations including AWS, Hewlett Packard Enterprise, the Juelich Supercomputing Center, RIKEN in Japan, and Oak Ridge and Sandia National Labs in the U.S. Most of the talks are available on <a href=\"https:\/\/www.youtube.com\/channel\/UCFLVQ8FeIElHKEWRZuMSQVw\/videos\">its YouTube channel<\/a>.<\/p>\n<p>In June, Arm made its biggest splash in supercomputing to date. That\u2019s when the Fugaku system in Japan debuted at No. 1 on the TOP500 list of the world\u2019s fastest supercomputers with a stunning 415.5 petaflops using the Arm-based A64FX CPU from Fujitsu.<\/p>\n<p>At the time, it was one of four Arm-powered supercomputers on the list, and the first using Arm\u2019s Scalable Vector Extensions, technology embedded in Arm\u2019s next-generation Neoverse designs that NVIDIA will support in its software.<\/p>\n<p>Meanwhile, AWS is already running HPC jobs like genomics, financial risk modeling and computational fluid dynamics in the cloud on its Arm-based Graviton2 processors.<\/p>\n<h2><b>NVIDIA Accelerates Arm in HPC<\/b><\/h2>\n<p>Arm\u2019s growing HPC presence is part of a broad ecosystem of 13 million developers in areas that span smartphones to supercomputers. 
It\u2019s a community NVIDIA aims to expand with <a href=\"https:\/\/nvidianews.nvidia.com\/news\/nvidia-to-acquire-arm-for-40-billion-creating-worlds-premier-computing-company-for-the-age-of-ai\">our deal to acquire Arm<\/a> to create the world\u2019s premier computing company for the age of AI.<\/p>\n<p>We\u2019re extending the ecosystem with Arm support built into our NVIDIA AI, HPC, networking and graphics software. At last year\u2019s supercomputing event, NVIDIA CEO Jensen Huang announced our work <a href=\"https:\/\/blogs.nvidia.com\/blog\/2019\/11\/18\/hpc-nvidia-gpu-acceleration-arm\/\">accelerating Arm in HPC<\/a> in addition to our ongoing support for IBM POWER and x86 architectures.<\/p>\n<figure id=\"attachment_47837\" aria-describedby=\"caption-attachment-47837\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2020\/11\/Nv-support-for-Arm.jpg\"><img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2020\/11\/Nv-support-for-Arm-672x391.jpg\" alt=\"NVIDIA support for Arm ecosystem\" width=\"672\" height=\"391\"><\/a><figcaption id=\"caption-attachment-47837\" class=\"wp-caption-text\">NVIDIA has expanded its support for the Arm ecosystem.<\/figcaption><\/figure>\n<p>Since then, we\u2019ve announced our <a href=\"https:\/\/www.mellanox.com\/products\/bluefield2-overview\">BlueField-2 DPUs<\/a> that use Arm IP to accelerate and secure networking and storage jobs for cloud, embedded and enterprise applications. And for more than a decade, we\u2019ve been an avid user of Arm designs inside products such as our <a href=\"https:\/\/developer.nvidia.com\/embedded\/jetson-nano\">Jetson Nano<\/a> modules for robotics and other embedded systems.<\/p>\n<p>We\u2019re excited to be part of dramatic performance gains for Arm in HPC. 
It\u2019s the latest page in the story of an open, thriving Arm ecosystem that keeps getting better.<\/p>\n<p>Learn more in the <a href=\"http:\/\/www.nvidia.com\/sc20\">NVIDIA SC20 Special Address<\/a>.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>http:\/\/feedproxy.google.com\/~r\/nvidiablog\/~3\/6xqMEUBnz84\/<\/p>\n","protected":false},"author":0,"featured_media":559,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/558"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=558"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/558\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/559"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=558"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=558"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=558"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}