{"id":4241,"date":"2025-08-22T16:47:47","date_gmt":"2025-08-22T16:47:47","guid":{"rendered":"https:\/\/salarydistribution.com\/machine-learning\/2025\/08\/22\/hot-topics-at-hot-chips-inference-networking-ai-innovation-at-every-scale-all-built-on-nvidia\/"},"modified":"2025-08-22T16:47:47","modified_gmt":"2025-08-22T16:47:47","slug":"hot-topics-at-hot-chips-inference-networking-ai-innovation-at-every-scale-all-built-on-nvidia","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2025\/08\/22\/hot-topics-at-hot-chips-inference-networking-ai-innovation-at-every-scale-all-built-on-nvidia\/","title":{"rendered":"Hot Topics at Hot Chips: Inference, Networking, AI Innovation at Every Scale \u2014 All Built on NVIDIA"},"content":{"rendered":"<div>\n<p>AI reasoning, inference and networking will be top of mind for attendees of next week\u2019s Hot Chips conference.<\/p>\n<p>A key forum for processor and system architects from industry and academia, Hot Chips \u2014 running Aug. 24-26 at Stanford University \u2014 showcases the latest innovations poised to advance <a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/glossary\/ai-factory\/\" rel=\"noopener\">AI factories<\/a> and drive revenue for the trillion-dollar data center computing market.<\/p>\n<p>At the conference, NVIDIA will join industry leaders including Google and Microsoft in a \u201ctutorial\u201d session \u2014 taking place on Sunday, Aug. 
24 \u2014 that discusses designing rack-scale architecture for data centers.<\/p>\n<p>In addition, NVIDIA experts will present at four sessions and one tutorial detailing how:<\/p>\n<ul>\n<li>NVIDIA networking, including the <a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/networking\/products\/ethernet\/supernic\/\" rel=\"noopener\">NVIDIA ConnectX-8 SuperNIC<\/a>, delivers AI reasoning at rack- and data-center scale. <em>(Featuring Idan Burstein, principal architect of network adapters and systems-on-a-chip at NVIDIA)<\/em><\/li>\n<li>Neural rendering advancements and massive leaps in inference \u2014 powered by the NVIDIA Blackwell architecture, including the <a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/geforce\/graphics-cards\/50-series\/rtx-5090\/\" rel=\"noopener\">NVIDIA GeForce RTX 5090 GPU<\/a> \u2014 provide next-level graphics and simulation capabilities. <em>(Featuring Marc Blackstein, senior director of architecture at NVIDIA)<\/em><\/li>\n<li><a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/networking\/products\/silicon-photonics\/\" rel=\"noopener\">Co-packaged optics (CPO) switches<\/a> with integrated silicon photonics \u2014 built with light-speed fiber rather than copper wiring to send information faster while using less power \u2014 enable efficient, high-performance, gigawatt-scale AI factories. 
The talk will also highlight <a target=\"_blank\" href=\"https:\/\/nvidianews.nvidia.com\/news\/nvidia-introduces-spectrum-xgs-ethernet-to-connect-distributed-data-centers-into-giga-scale-ai-super-factories\" rel=\"noopener\">NVIDIA <\/a><a target=\"_blank\" href=\"https:\/\/nvidianews.nvidia.com\/news\/nvidia-introduces-spectrum-xgs-ethernet-to-connect-distributed-data-centers-into-giga-scale-ai-super-factories\" rel=\"noopener\">Spectrum-XGS<\/a><a target=\"_blank\" href=\"https:\/\/nvidianews.nvidia.com\/news\/nvidia-introduces-spectrum-xgs-ethernet-to-connect-distributed-data-centers-into-giga-scale-ai-super-factories\" rel=\"noopener\"> Ethernet<\/a>, a new scale-across technology for unifying distributed data centers into AI super-factories. <em>(Featuring Gilad Shainer, senior vice president of networking at NVIDIA)<\/em><\/li>\n<li>The NVIDIA GB10 Superchip serves as the engine within the <a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/products\/workstations\/dgx-spark\/\" rel=\"noopener\">NVIDIA DGX Spark<\/a> desktop supercomputer. 
<em>(Featuring Andi Skende, senior distinguished engineer at NVIDIA)<\/em><\/li>\n<\/ul>\n<p>It\u2019s all part of how NVIDIA\u2019s latest technologies are accelerating inference to drive AI innovation everywhere, at every scale.<\/p>\n<h2>NVIDIA Networking Fosters AI Innovation at Scale<\/h2>\n<p><a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/glossary\/ai-reasoning\/\" rel=\"noopener\">AI reasoning<\/a> \u2014 when artificial intelligence systems can analyze and solve complex problems through multiple AI inference passes \u2014 requires rack-scale performance to deliver optimal user experiences efficiently.<\/p>\n<p>In data centers powering today\u2019s AI workloads, networking acts as the central nervous system, connecting all the components \u2014 servers, storage devices and other hardware \u2014 into a single, cohesive, powerful computing unit.<\/p>\n<figure id=\"attachment_84155\" aria-describedby=\"caption-attachment-84155\" class=\"wp-caption alignleft\"><img decoding=\"async\" loading=\"lazy\" class=\"wp-image-84155 size-full\" src=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2025\/08\/two-nvidia-connectx08-supernic.png\" alt=\"\" width=\"303\" height=\"171\"><figcaption id=\"caption-attachment-84155\" class=\"wp-caption-text\">NVIDIA ConnectX-8 SuperNIC<\/figcaption><\/figure>\n<p>Burstein\u2019s Hot Chips session will dive into how NVIDIA networking technologies \u2014 particularly NVIDIA ConnectX-8 SuperNICs \u2014 enable high-speed, low-latency, multi-GPU communication to deliver market-leading AI reasoning performance at scale.<\/p>\n<p>As part of the NVIDIA networking platform, NVIDIA NVLink, NVLink Switch and NVLink Fusion deliver scale-up connectivity \u2014 linking GPUs and compute elements within and across servers for ultra low-latency, high-bandwidth data exchange.<\/p>\n<p><a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/networking\/spectrumx\/\" rel=\"noopener\">NVIDIA Spectrum-X Ethernet<\/a> provides the 
scale-out fabric to connect entire clusters, rapidly streaming massive datasets into AI models and orchestrating GPU-to-GPU communication across the data center. <a target=\"_blank\" href=\"https:\/\/nvidianews.nvidia.com\/news\/nvidia-introduces-spectrum-xgs-ethernet-to-connect-distributed-data-centers-into-giga-scale-ai-super-factories\" rel=\"noopener\">Spectrum-XGS<\/a> <a target=\"_blank\" href=\"https:\/\/nvidianews.nvidia.com\/news\/nvidia-introduces-spectrum-xgs-ethernet-to-connect-distributed-data-centers-into-giga-scale-ai-super-factories\" rel=\"noopener\">Ethernet<\/a> scale-across technology extends the extreme performance and scale of Spectrum-X Ethernet to interconnect multiple, distributed data centers to form AI super-factories capable of giga-scale intelligence.<\/p>\n<figure id=\"attachment_84152\" aria-describedby=\"caption-attachment-84152\" class=\"wp-caption alignright\"><img decoding=\"async\" loading=\"lazy\" class=\"wp-image-84152 size-full\" src=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2025\/08\/three-distributed-ai-data-centers.png\" alt=\"\" width=\"423\" height=\"235\"><figcaption id=\"caption-attachment-84152\" class=\"wp-caption-text\">Connecting distributed AI data centers with NVIDIA Spectrum-XGS Ethernet.<\/figcaption><\/figure>\n<p>At the heart of Spectrum-X Ethernet, CPO switches push the limits of performance and efficiency for AI infrastructure at scale, and will be covered in detail by Shainer in his talk.<\/p>\n<p><a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/data-center\/gb200-nvl72\/\" rel=\"noopener\">NVIDIA GB200 NVL72<\/a> \u2014 an exascale computer in a single rack \u2014 features 36 NVIDIA GB200 Superchips, each containing two NVIDIA B200 GPUs and an NVIDIA Grace CPU, interconnected by the largest NVLink domain ever offered, with NVLink Switch providing 130 terabytes per second of low-latency GPU communications for AI and high-performance computing workloads.<\/p>\n<figure 
id=\"attachment_84143\" aria-describedby=\"caption-attachment-84143\" class=\"wp-caption alignleft\"><img decoding=\"async\" loading=\"lazy\" class=\"wp-image-84143 size-full\" src=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2025\/08\/four-rack-scale-system.png\" alt=\"\" width=\"282\" height=\"159\"><figcaption id=\"caption-attachment-84143\" class=\"wp-caption-text\">An NVIDIA rack-scale system.<\/figcaption><\/figure>\n<p>Built with the NVIDIA Blackwell architecture, GB200 NVL72 systems deliver massive leaps in reasoning inference performance.<\/p>\n<h2>NVIDIA Blackwell and CUDA Bring AI to Millions of Developers<\/h2>\n<p>The NVIDIA GeForce RTX 5090 GPU \u2014 also powered by Blackwell and to be covered in Blackstein\u2019s talk \u2014 doubles performance in today\u2019s games with <a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/geforce\/technologies\/dlss\/\" rel=\"noopener\">NVIDIA DLSS 4<\/a> technology.<\/p>\n<figure id=\"attachment_84140\" aria-describedby=\"caption-attachment-84140\" class=\"wp-caption alignright\"><img decoding=\"async\" loading=\"lazy\" class=\"wp-image-84140 size-full\" src=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2025\/08\/five-geforce.jpg\" alt=\"\" width=\"256\" height=\"144\"><figcaption id=\"caption-attachment-84140\" class=\"wp-caption-text\">NVIDIA GeForce RTX 5090 GPU<\/figcaption><\/figure>\n<p>It can also add neural rendering features for games to deliver up to 10x performance, 10x footprint amplification and a 10x reduction in design cycles, helping enhance realism in computer graphics and simulation. 
This offers smooth, responsive visual experiences at low energy consumption and improves the lifelike simulation of characters and effects.<\/p>\n<p><a target=\"_blank\" href=\"https:\/\/developer.nvidia.com\/cuda-toolkit\" rel=\"noopener\">NVIDIA CUDA<\/a>, the world\u2019s most widely available computing infrastructure, lets users deploy and run AI models using NVIDIA Blackwell anywhere.<\/p>\n<p>Hundreds of millions of GPUs run CUDA across the globe, from NVIDIA GB200 NVL72 rack-scale systems to <a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/geforce\/graphics-cards\/50-series\/\" rel=\"noopener\">GeForce RTX<\/a>\u2013 and <a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/products\/workstations\/\" rel=\"noopener\">NVIDIA RTX PRO<\/a>-powered PCs and workstations, with <a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/products\/workstations\/dgx-spark\/\" rel=\"noopener\">NVIDIA DGX Spark<\/a> powered by NVIDIA GB10 \u2014 discussed in Skende\u2019s session \u2014 coming soon.<\/p>\n<h2>From Algorithms to AI Supercomputers \u2014 Optimized for LLMs<\/h2>\n<figure id=\"attachment_84149\" aria-describedby=\"caption-attachment-84149\" class=\"wp-caption alignleft\"><img decoding=\"async\" loading=\"lazy\" class=\"wp-image-84149 size-full\" src=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2025\/08\/six-dgx-spark.png\" alt=\"\" width=\"292\" height=\"164\"><figcaption id=\"caption-attachment-84149\" class=\"wp-caption-text\">NVIDIA DGX Spark<\/figcaption><\/figure>\n<p>Delivering powerful performance and capabilities in a compact package, DGX Spark lets developers, researchers, data scientists and students push the boundaries of <a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/glossary\/generative-ai\/\" rel=\"noopener\">generative AI<\/a> right at their desktops, and accelerate workloads across industries.<\/p>\n<p>As part of the NVIDIA Blackwell platform, DGX Spark brings support for NVFP4, a low-precision 
numerical format to enable efficient <a href=\"https:\/\/blogs.nvidia.com\/blog\/what-is-agentic-ai\/\">agentic AI<\/a> inference, particularly of large language models (<a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/glossary\/large-language-models\/\" rel=\"noopener\">LLMs<\/a>). Learn more about NVFP4 in this NVIDIA Technical Blog.<\/p>\n<h2>Open-Source Collaborations Propel Inference Innovation<\/h2>\n<p>NVIDIA accelerates several open-source libraries and frameworks that optimize AI workloads for LLMs and distributed inference. These include <a target=\"_blank\" href=\"https:\/\/docs.nvidia.com\/tensorrt-llm\/index.html\" rel=\"noopener\">NVIDIA TensorRT-LLM<\/a>, <a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/ai\/dynamo\/\" rel=\"noopener\">NVIDIA Dynamo<\/a>, TileIR, CUTLASS, the <a target=\"_blank\" href=\"https:\/\/developer.nvidia.com\/nccl\" rel=\"noopener\">NVIDIA Collective Communications Library<\/a> and NIX \u2014 which are integrated into millions of workflows.<\/p>\n<p>Allowing developers to build with their framework of choice, NVIDIA has collaborated with top open framework providers to offer model optimizations for FlashInfer, PyTorch, SGLang, vLLM and others.<\/p>\n<p>Plus, <a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/en-us\/ai-data-science\/products\/nim-microservices\/\" rel=\"noopener\">NVIDIA NIM microservices<\/a> are available for popular open models like OpenAI\u2019s gpt-oss and Llama 4, making it easy for developers to operate managed application programming interfaces with the flexibility and security of self-hosting models on their preferred infrastructure.<\/p>\n<p><em>Learn more about the latest advancements in inference and accelerated computing by joining <\/em><a target=\"_blank\" href=\"https:\/\/hotchips.org\/\" rel=\"noopener\"><em>NVIDIA at Hot Chips<\/em><\/a><em>. 
<\/em><\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/blogs.nvidia.com\/blog\/hot-chips-inference-networking\/<\/p>\n","protected":false},"author":0,"featured_media":4242,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/4241"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=4241"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/4241\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/4242"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=4241"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=4241"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=4241"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}