{"id":358,"date":"2020-10-06T10:24:22","date_gmt":"2020-10-06T10:24:22","guid":{"rendered":"https:\/\/machine-learning.webcloning.com\/2020\/10\/06\/ai-can-see-clearly-now-gans-take-the-jitters-out-of-video-calls\/"},"modified":"2020-10-06T10:24:22","modified_gmt":"2020-10-06T10:24:22","slug":"ai-can-see-clearly-now-gans-take-the-jitters-out-of-video-calls","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2020\/10\/06\/ai-can-see-clearly-now-gans-take-the-jitters-out-of-video-calls\/","title":{"rendered":"AI Can See Clearly Now: GANs Take the Jitters Out of Video Calls"},"content":{"rendered":"<div data-url=\"https:\/\/blogs.nvidia.com\/blog\/2020\/10\/05\/gan-video-conferencing-maxine\/\" data-title=\"AI Can See Clearly Now: GANs Take the Jitters Out of Video Calls\">\n<p>Ming-Yu Liu and Arun Mallya were on a video call when one of them started to break up, then freeze.<\/p>\n<p>It\u2019s an irksome reality of life in the pandemic that most of us have shared. But unlike most of us, Liu and Mallya could do something about it.<\/p>\n<p>They are AI researchers at NVIDIA and specialists in computer vision. Working with colleague Ting-Chun Wang, they realized they could use a neural network in place of the software called a video codec typically used to compress and decompress video for transmission over the net.<\/p>\n<p>Their work enables a video call with one-tenth the network bandwidth users typically need. It promises to reduce bandwidth consumption by orders of magnitude in the future.<\/p>\n<p>\u201cWe want to provide a better experience for video communications with AI so even people who only have access to extremely low bandwidth can still upgrade from voice to video calls,\u201d said Mallya.<\/p>\n<h2><b>Better Connections Thanks to GANs<\/b><\/h2>\n<p>The technique works even when callers are wearing a hat, glasses, headphones or a mask. 
And just for fun, they spiced up their demo with a couple of bells and whistles so users can change their hairstyles or clothes digitally or create an avatar.<\/p>\n<p>A more serious feature in the works (shown at top) uses the neural network to align the position of users\u2019 faces for a more natural experience. Callers watch their video feeds, but they appear to be looking directly at their cameras, enhancing the feeling of a face-to-face connection.<\/p>\n<p>\u201cWith computer vision techniques, we can locate a person\u2019s head over a wide range of angles, and we think this will help people have more natural conversations,\u201d said Wang.<\/p>\n<p>Say hello to the latest way AI is making virtual life more real.<\/p>\n<h2><b>How AI-Assisted Video Calls Work<\/b><\/h2>\n<p>The mechanism behind AI-assisted video calls is simple.<\/p>\n<p>A sender first transmits a reference image of the caller, just like today\u2019s systems that typically use a compressed video stream. Then, rather than sending a fat stream of pixel-packed images, it sends data on the locations of a few key points around the user\u2019s eyes, nose and mouth.<\/p>\n<p>A <a href=\"https:\/\/blogs.nvidia.com\/blog\/2017\/05\/17\/generative-adversarial-network\/\">generative adversarial network<\/a> on the receiver\u2019s side uses the initial image and the facial key points to reconstruct subsequent images on a local GPU. As a result, much less data is sent over the network.<\/p>\n<p>Liu\u2019s work in GANs hit the spotlight last year with <a href=\"https:\/\/blogs.nvidia.com\/blog\/2019\/03\/18\/gaugan-photorealistic-landscapes-nvidia-research\/\">GauGAN<\/a>, an AI tool that turns anyone\u2019s doodles into photorealistic works of art. 
GauGAN has already been used to create more than <a href=\"https:\/\/blogs.nvidia.com\/blog\/2019\/07\/30\/gaugan-ai-painting\/\">a million images<\/a> and is available at the <a href=\"https:\/\/www.nvidia.com\/en-us\/research\/ai-playground\/\">AI Playground<\/a>.<\/p>\n<p>\u201cThe pandemic motivated us because everyone is doing video conferencing now, so we explored how we can ease the bandwidth bottlenecks so providers can serve more people at the same time,\u201d said Liu.<\/p>\n<h2><b>GPUs Bust Bandwidth Bottlenecks<\/b><\/h2>\n<p>The approach is part of an industry trend of shifting network bottlenecks into computational tasks that can be more easily tackled with local or cloud resources.<\/p>\n<p>\u201cThese days lots of companies want to turn bandwidth problems into compute problems because it\u2019s often hard to add more bandwidth and easier to add more compute,\u201d said Andrew Page, a director of advanced products in NVIDIA\u2019s media group.<\/p>\n<figure id=\"attachment_47265\" aria-describedby=\"caption-attachment-47265\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2020\/10\/Maxine-Video-Converence-Workflow.jpg\"><img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2020\/10\/Maxine-Video-Converence-Workflow-672x438.jpg\" alt=\"\" width=\"672\" height=\"438\"><\/a><figcaption id=\"caption-attachment-47265\" class=\"wp-caption-text\">NVIDIA Maxine bundles a suite of tools for video conferencing and streaming services.<\/figcaption><\/figure>\n<h2><b>AI Instruments Tune Video Services<\/b><\/h2>\n<p>GAN video compression is one of several capabilities coming to <a href=\"http:\/\/developer.nvidia.com\/maxine\">NVIDIA Maxine<\/a>, a cloud-AI video-streaming platform to enhance video conferencing and calls. 
It packs audio, video and conversational AI features in a single toolkit that supports a broad range of devices.<\/p>\n<p>Announced this week at GTC, Maxine lets service providers deliver video at super resolution with real-time translation, background noise removal and context-aware closed captioning. Users can enjoy features such as face alignment, support for virtual assistants and realistic animation of avatars.<\/p>\n<p>\u201cVideo conferencing is going through a renaissance,\u201d said Page. \u201cThrough the pandemic, we\u2019ve all lived through its warts, but video is here to stay now as a part of our lives going forward because we are visual creatures.\u201d<\/p>\n<p>Maxine harnesses the power of NVIDIA GPUs with <a href=\"https:\/\/developer.nvidia.com\/tensor-cores\">Tensor Cores<\/a> running software such as <a href=\"https:\/\/developer.nvidia.com\/nvidia-jarvis\">NVIDIA Jarvis<\/a>, an SDK for conversational AI that delivers a suite of speech and text capabilities. Together, they deliver AI capabilities that are useful today and serve as building blocks for tomorrow\u2019s video products and services.<\/p>\n<p>Learn more about <a href=\"https:\/\/www.nvidia.com\/en-us\/research\/\">NVIDIA Research<\/a>. And watch NVIDIA CEO Jensen Huang recap all the news at GTC in the video below.<\/p>\n<p><i>It\u2019s not too late to get access to hundreds of live and on-demand talks at GTC.<\/i> <a href=\"https:\/\/reg.rainfocus.com\/flow\/nvidia\/gtcfall20\/reg\/login\"><i>Register now<\/i><\/a> <i>through Oct. 
9 using promo code <\/i><i>CMB4KN<\/i><i> to get 20 percent off.<\/i><\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>http:\/\/feedproxy.google.com\/~r\/nvidiablog\/~3\/eRfnRCPSq0g\/<\/p>\n","protected":false},"author":0,"featured_media":359,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/358"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=358"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/358\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/359"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=358"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=358"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=358"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}