{"id":2925,"date":"2023-03-22T19:40:19","date_gmt":"2023-03-22T19:40:19","guid":{"rendered":"https:\/\/salarydistribution.com\/machine-learning\/2023\/03\/22\/ai-opener-openais-sutskever-in-conversation-with-jensen-huang\/"},"modified":"2023-03-22T19:40:19","modified_gmt":"2023-03-22T19:40:19","slug":"ai-opener-openais-sutskever-in-conversation-with-jensen-huang","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2023\/03\/22\/ai-opener-openais-sutskever-in-conversation-with-jensen-huang\/","title":{"rendered":"AI Opener: OpenAI\u2019s Sutskever in Conversation With Jensen Huang"},"content":{"rendered":"<div data-url=\"https:\/\/blogs.nvidia.com\/blog\/2023\/03\/22\/sutskever-openai-gtc\/\" data-title=\"AI Opener: OpenAI\u2019s Sutskever in Conversation With Jensen Huang\" data-hashtags=\"\">\n<p>Like old friends catching up over coffee, two industry icons reflected on how modern AI got its start, where it\u2019s at today and where it needs to go next.<\/p>\n<p>Jensen Huang, founder and CEO of NVIDIA, interviewed AI pioneer Ilya Sutskever in a <a href=\"https:\/\/www.nvidia.com\/gtc\/session-catalog\/?tab.catalogallsessionstab=16566177511100015Kus#\/session\/1669748941314001t6Nv\">fireside chat<\/a> at <a href=\"https:\/\/www.nvidia.com\/gtc\/\">GTC<\/a>. The talk was recorded a day after the launch of GPT-4, the most powerful AI model to date from OpenAI, the research company Sutskever co-founded.<\/p>\n<p>They talked at length about GPT-4 and its forerunners, including ChatGPT. 
That generative AI model, though only a few months old, is already the fastest-growing consumer application in history.<\/p>\n<p>Their conversation touched on the capabilities, limits and inner workings of the deep neural networks that are capturing the imaginations of hundreds of millions of users.<\/p>\n<p>Compared to ChatGPT, GPT-4 marks a \u201cpretty substantial improvement across many dimensions,\u201d said Sutskever, noting the new model can read images as well as text.<\/p>\n<p>\u201cIn some future version, [users] might get a diagram back\u201d in response to a query, he said.<\/p>\n<h2><b>Under the Hood With GPT<\/b><\/h2>\n<p>\u201cThere\u2019s a misunderstanding that ChatGPT is one large language model, but there\u2019s a system around it,\u201d said Huang.<\/p>\n<p>In a sign of that complexity, Sutskever said OpenAI uses two levels of training.<\/p>\n<p>The first stage focuses on accurately predicting the next word in a series. Here, \u201cwhat the neural net learns is some representation of the process that produced the text, and that\u2019s a projection of the world,\u201d he said.<\/p>\n<p>The second \u201cis where we communicate to the neural network what we want, including guardrails \u2026 so it becomes more reliable and precise,\u201d he added.<\/p>\n<h2><b>Present at the Creation<\/b><\/h2>\n<p>While he\u2019s at the swirling center of modern AI today, Sutskever was also present at its creation.<\/p>\n<p>In 2012, he was among the first to show the power of deep neural networks trained on massive datasets. 
In an academic contest, the AlexNet model he demonstrated with AI pioneers Geoff Hinton and Alex Krizhevsky recognized images with unprecedented accuracy.<\/p>\n<p>Huang referred to their work as the <a href=\"https:\/\/blogs.nvidia.com\/blog\/2016\/01\/12\/accelerating-ai-artificial-intelligence-gpus\/\">Big Bang of AI<\/a>.<\/p>\n<p>The results \u201cbroke the record by such a large margin, it was clear there was a discontinuity here,\u201d Huang said.<\/p>\n<h2><b>The Power of Parallel Processing<\/b><\/h2>\n<p>Part of that breakthrough came from the parallel processing the team applied to its model with GPUs.<\/p>\n<p>\u201cThe ImageNet dataset and a convolutional neural network were a great fit for GPUs that made it unbelievably fast to train something unprecedented,\u201d Sutskever said.<\/p>\n<p><a href=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2023\/03\/Ilya-wide-arms-scaled.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-large wp-image-63224\" src=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2023\/03\/Ilya-wide-arms-672x354.jpg\" alt=\"Another image from the fireside chat between Ilya Sutskever of OpenAI and Jensen Huang.\" width=\"672\" height=\"354\"><\/a><\/p>\n<p>That early work ran on a few GeForce GTX 580 GPUs in a University of Toronto lab. Today, <a href=\"https:\/\/news.microsoft.com\/source\/features\/ai\/how-microsofts-bet-on-azure-unlocked-an-ai-revolution\/\">tens of thousands<\/a> of the latest NVIDIA A100 and H100 Tensor Core GPUs in the Microsoft Azure cloud service handle training and inference on models like ChatGPT.<\/p>\n<p>\u201cIn the 10 years we\u2019ve known each other, the models you\u2019ve trained [have grown by] about a million times,\u201d Huang said. 
\u201cNo one in computer science would have believed the computation done in that time would be a million times larger.\u201d<\/p>\n<p>\u201cI had a very strong belief that bigger is better, and a goal at OpenAI was to scale,\u201d said Sutskever.<\/p>\n<h2><b>A Billion Words<\/b><\/h2>\n<p>Along the way, the two shared a laugh.<\/p>\n<p>\u201cHumans hear a billion words in a lifetime,\u201d Sutskever said.<\/p>\n<p>\u201cDoes that include the words in my own head?\u201d Huang shot back.<\/p>\n<p>\u201cMake it 2 billion,\u201d Sutskever deadpanned.<\/p>\n<h2><b>The Future of AI<\/b><\/h2>\n<p>They ended their nearly hour-long talk discussing the outlook for AI.<\/p>\n<p>Asked if GPT-4 has reasoning capabilities, Sutskever suggested the term is hard to define and the capability may still be on the horizon.<\/p>\n<p>\u201cWe\u2019ll keep seeing systems that astound us with what they can do,\u201d he said. \u201cThe frontier is in reliability, getting to a point where we can trust what it can do, and that if it doesn\u2019t know something, it says so,\u201d he added.<\/p>\n<p>\u201cYour body of work is incredible \u2026 truly remarkable,\u201d said Huang in closing the session. \u201cThis has been one of the best beyond Ph.D. 
descriptions of the state of the art of large language models,\u201d he said.<\/p>\n<p>To get all the news from GTC, watch the keynote below.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/blogs.nvidia.com\/blog\/2023\/03\/22\/sutskever-openai-gtc\/<\/p>\n","protected":false},"author":0,"featured_media":2926,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/2925"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=2925"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/2925\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/2926"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=2925"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=2925"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=2925"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}