{"id":985,"date":"2021-10-02T08:48:38","date_gmt":"2021-10-02T08:48:38","guid":{"rendered":"https:\/\/salarydistribution.com\/machine-learning\/2021\/10\/02\/exaggeration-detector-could-lead-to-more-accurate-health-science-journalism\/"},"modified":"2021-10-02T08:48:38","modified_gmt":"2021-10-02T08:48:38","slug":"exaggeration-detector-could-lead-to-more-accurate-health-science-journalism","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2021\/10\/02\/exaggeration-detector-could-lead-to-more-accurate-health-science-journalism\/","title":{"rendered":"\u2018Exaggeration Detector\u2019 Could Lead to More Accurate Health Science Journalism"},"content":{"rendered":"<div data-url=\"https:\/\/blogs.nvidia.com\/blog\/2021\/10\/01\/exaggeration-detector\/\" data-title=\"\u2018Exaggeration Detector\u2019 Could Lead to More Accurate Health Science Journalism\">\n<p>It would be an exaggeration to say you\u2019ll never again read a news article overhyping a medical breakthrough. But, thanks to researchers at the University of Copenhagen, spotting hyperbole may one day get more manageable.<\/p>\n<p>In a new paper, Dustin Wright and Isabelle Augenstein explain how they used NVIDIA GPUs to train an \u201cexaggeration detection system\u201d to identify overenthusiastic claims in health science reporting.<\/p>\n<p>The paper comes amid a pandemic that has fueled demand for understandable, accurate information. 
And social media has made health misinformation more widespread.<\/p>\n<p>Research like Wright and Augenstein\u2019s could speed more precise health sciences news to more people.<\/p>\n<p><b>Read the full paper\u00a0here: <\/b><a href=\"https:\/\/arxiv.org\/pdf\/2108.13493.pdf\"><b>https:\/\/arxiv.org\/pdf\/2108.13493.pdf<\/b><\/a>.<\/p>\n<h2>A \u2018Sobering Realization\u2019<\/h2>\n<p>\u201cPart of the reason why things in popular journalism tend to get sensationalized is some of the journalists don\u2019t read the papers they\u2019re writing about,\u201d Wright says. \u201cIt\u2019s a bit of a sobering realization.\u201d<\/p>\n<p>It\u2019s hard to blame them. Many journalists need to summarize a lot of information fast and often don\u2019t have the time to dig deeper.<\/p>\n<figure id=\"attachment_53116\" aria-describedby=\"caption-attachment-53116\" class=\"wp-caption alignleft\"><a href=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2021\/09\/dustin-wright.jpg\"><\/p>\n<p><img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2021\/09\/dustin-wright.jpg\" alt=\"\" width=\"400\" height=\"400\"><\/p>\n<p><\/a><figcaption id=\"caption-attachment-53116\" class=\"wp-caption-text\">University of Copenhagen researcher <a href=\"https:\/\/dustinbwright.com\/\">Dustin Wright<\/a>.<\/figcaption><\/figure>\n<p>That task falls on the press offices of universities and research institutions. They employ writers to produce press releases \u2014 short, news-style summaries \u2014 relied on by news outlets.<\/p>\n<h2>Shot On<\/h2>\n<p>That makes the problem of detecting exaggeration in health sciences press releases a good \u201cfew-shot learning\u201d use case.<\/p>\n<p>Few-shot learning techniques can train AI in areas where data isn\u2019t plentiful \u2014 there are only a few items to learn from.<\/p>\n<p>It\u2019s not the first time researchers have put natural language techniques to work detecting hype. 
Wright points to the earlier work of colleagues in scientific exaggeration detection and misinformation.<\/p>\n<p>Wright and Augenstein\u2019s contribution is to reframe the problem and apply a novel, multitask-capable version of a technique called Pattern Exploiting Training, which they dubbed MT-PET.<\/p>\n<p>The co-authors started by curating a collection that included both the releases and the papers they were summarizing.<\/p>\n<p>Each pair, or \u201ctuple,\u201d has annotations from experts comparing claims made in the papers with those in corresponding press releases.<\/p>\n<p>These 563 tuples gave them a strong base of training data.<\/p>\n<p>They then broke the problem of detecting exaggeration into two related issues.<\/p>\n<p>First, seeing the strength of claims made in press releases and the scientific papers they summarized. Then, identifying the level of exaggeration.<\/p>\n<h2>Teacher\u2019s PET<\/h2>\n<p>They then ran this data through a novel kind of PET model, which learns much the way some second-grade students learn reading comprehension.<\/p>\n<p>The training procedure relies on cloze-style phrases \u2014 phrases that mask a keyword an AI needs to fill \u2014 to ensure it understands a task.<\/p>\n<p>For example, a teacher might ask a student to fill in the blanks in a sentence such as \u201cI ride a big ____ bus to school.\u201d<\/p>\n<figure id=\"attachment_53122\" aria-describedby=\"caption-attachment-53122\" class=\"wp-caption alignright\"><a href=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2021\/09\/mt-pet-pet-exaggeration.png\"><\/p>\n<p><img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2021\/09\/mt-pet-pet-exaggeration.png\" alt=\"\" width=\"307\" height=\"272\"><\/p>\n<p><\/a><figcaption id=\"caption-attachment-53122\" class=\"wp-caption-text\">Researchers Dustin Wright and Isabelle Augenstein created complementary pattern-verbalizer pairs for a main task and an auxiliary task. 
These pairs are then used to train a machine learning model on data from both tasks (source: https:\/\/arxiv.org\/pdf\/2108.13493.pdf).<\/figcaption><\/figure>\n<p>If they answer \u201cyellow,\u201d the teacher knows they understand what they see. If not, the teacher knows the student needs more help.<\/p>\n<p>Wright and Augenstein expanded on the idea to train a PET model to both detect the strength of claims made in press releases and assess whether a press release overstates a paper\u2019s claims.<\/p>\n<p>The researchers trained their models on a shared computing cluster, using four Intel Xeon CPUs and a single NVIDIA TITAN X GPU.<\/p>\n<p>As a result, Wright and Augenstein were able to show how MT-PET outperforms PET and supervised learning.<\/p>\n<p>Such technology could allow researchers to spot exaggeration in fields with a limited amount of expertise to classify training data.<\/p>\n<p>AI-enabled grammar checkers can already help writers polish the quality of their prose.<\/p>\n<p>One day, similar tools could help journalists summarize new findings more accurately, Wright says.<\/p>\n<h2>Not Easy<\/h2>\n<p>To be sure, putting this research to work would need investment in production, marketing and usability, Wright says.<\/p>\n<p>Wright\u2019s also realistic about the human factors that can lead to exaggeration.<\/p>\n<p>Press releases convey information. But they also need to be bold enough to generate interest from reporters. Not always easy.<\/p>\n<p>\u201cWhenever I tweet about stuff, I think, \u2018how can I get this tweet out without exaggeration,\u2019\u201d Wright says. \u201cIt\u2019s hard.\u201d<\/p>\n<p><em>You can catch Dustin Wright and Isabelle Augenstein on Twitter at @dustin_wright37 and @IAugenstein. 
<\/em><i>Read their full paper, \u201cSemi-Supervised Exaggeration Detection of Health Science Press Releases,\u201d here: <\/i><a href=\"https:\/\/arxiv.org\/pdf\/2108.13493.pdf\"><i>https:\/\/arxiv.org\/pdf\/2108.13493.pdf<\/i><\/a><i>.<\/i><i><br \/><\/i><i><br \/><\/i><i>Featured image credit: <\/i><a href=\"https:\/\/www.flickr.com\/photos\/shookphotos\/4244271798\/in\/photostream\/\"><i>Vintage postcard, copyright expired<\/i><\/a><i>.\u00a0<\/i><\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>http:\/\/feedproxy.google.com\/~r\/nvidiablog\/~3\/5iQjOChnSwU\/<\/p>\n","protected":false},"author":0,"featured_media":986,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/985"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=985"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/985\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/986"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=985"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=985"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=985"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}