# NVIDIA's AI Masters Sweep KDD Cup 2024 Data Science Competition

Team NVIDIA has triumphed at the Amazon [KDD Cup 2024](https://kdd2024.kdd.org/), securing first place Friday across all five competition tracks.

The team — consisting of NVIDIANs [Ahmet Erdem](https://www.linkedin.com/in/aerdem4/), [Benedikt Schifferer](https://www.linkedin.com/in/benedikt-schifferer/), [Chris Deotte](https://www.kaggle.com/cdeotte), [Gilberto Titericz](https://www.linkedin.com/in/giba1/), [Ivan Sorokin](https://www.linkedin.com/in/lytic/) and [Simon Jegou](https://www.linkedin.com/in/simon-jegou/) — demonstrated its prowess in generative AI, winning in categories that included text generation, multiple-choice questions, named entity
recognition, ranking, and retrieval.

[Leaderboard image](https://blogs.nvidia.com/wp-content/uploads/2024/07/image1.png)

The competition, themed "[Multi-Task Online Shopping Challenge for LLMs](https://www.aicrowd.com/challenges/amazon-kdd-cup-2024-multi-task-online-shopping-challenge-for-llms)," asked participants to solve various challenges using limited datasets.

"The new trend in LLM competitions is that they don't give you training data," said Deotte, a senior data scientist at NVIDIA. "They give you 96 example questions — not enough to train a model — so we came up with 500,000 questions on our own."

Deotte explained that the NVIDIA team generated a variety of questions by writing some themselves, using a [large language model](https://www.nvidia.com/en-us/glossary/large-language-models/) to create others, and transforming existing e-commerce datasets.

"Once we had our questions, it was straightforward to use existing frameworks to fine-tune a language model," he said.

The competition organizers hid the test questions to ensure participants couldn't exploit previously known answers.
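The data-expansion step Deotte describes — turning a handful of examples and existing e-commerce records into a large training set — can be sketched roughly as follows. The product records and question templates here are hypothetical illustrations, not the team's actual prompts or data; the article only says the team wrote some questions by hand, generated others with an LLM, and transformed existing datasets.

```python
import random

# Hypothetical product records standing in for an existing e-commerce dataset.
PRODUCTS = [
    {"name": "Trailblazer Hiking Boot", "category": "Footwear", "brand": "Acme"},
    {"name": "Stovetop Espresso Maker", "category": "Kitchen", "brand": "BrewCo"},
    {"name": "Nimbus Running Jacket", "category": "Apparel", "brand": "CloudWear"},
]

# Illustrative (question, answer) templates; in practice an LLM can be
# prompted to produce many more variations.
TEMPLATES = [
    ("What category does the product '{name}' belong to?", "{category}"),
    ("Which brand makes '{name}'?", "{brand}"),
]

def synthesize_questions(products, templates, n, seed=0):
    """Expand a handful of records into many (question, answer) training pairs."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        product = rng.choice(products)
        q_tmpl, a_tmpl = rng.choice(templates)
        pairs.append((q_tmpl.format(**product), a_tmpl.format(**product)))
    return pairs

dataset = synthesize_questions(PRODUCTS, TEMPLATES, n=10)
for question, answer in dataset[:3]:
    print(question, "->", answer)
```

A fixed seed keeps the generated set reproducible, which matters when the synthetic data, rather than an official training set, defines the whole fine-tuning corpus.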
This approach encourages models that generalize well to any question about e-commerce, proving the model's capability to handle real-world scenarios effectively.

Despite these constraints, Team NVIDIA's innovative approach outperformed all competitors by using Qwen2-72B, a just-released LLM with 72 billion parameters, fine-tuned on eight NVIDIA A100 Tensor Core GPUs, and employing QLoRA, a technique for efficiently fine-tuning large models by training small low-rank adapters on top of a quantized base model.

## About the KDD Cup 2024

The KDD Cup, organized by the Association for Computing Machinery's Special Interest Group on Knowledge Discovery and Data Mining, or ACM SIGKDD, is a prestigious annual competition that promotes research and development in the field.

This year's challenge, hosted by Amazon, focused on mimicking the complexities of online shopping with the goal of making it a more intuitive and satisfying experience using large language models. Organizers evaluated participants' models on the test dataset ShopBench — a benchmark that replicates the scale and complexity of real-world online shopping, with 57 tasks and about 20,000 questions derived from real-world Amazon shopping data.

The ShopBench benchmark focused on four key shopping skills, along with a fifth "all-in-one" challenge:

1. Shopping Concept Understanding: Decoding complex shopping concepts and terminologies.
2. Shopping Knowledge Reasoning: Making informed decisions with shopping knowledge.
3. User Behavior Alignment: Understanding dynamic customer behavior.
4. Multilingual Abilities: Shopping across languages.
5. All-Around: Solving all tasks from the previous tracks in a unified solution.

## NVIDIA's Winning Solution

NVIDIA's winning solution involved creating a single model for each track.

The team fine-tuned the just-released Qwen2-72B model using
eight NVIDIA A100 Tensor Core GPUs for approximately 24 hours. The GPUs provided fast and efficient processing, significantly reducing the time required for fine-tuning.

[Solution diagram](https://blogs.nvidia.com/wp-content/uploads/2024/07/image2.jpg)

First, the team generated training datasets based on the provided examples and synthesized additional data using Llama 3 70B hosted on [build.nvidia.com](http://build.nvidia.com).

Next, they employed QLoRA (Quantized Low-Rank Adaptation), a training process using the data created in step one. QLoRA freezes the quantized base model and updates only a small set of low-rank adapter weights, allowing efficient training and fine-tuning.

The model was then quantized — making it smaller and able to run on a system with a smaller hard drive and less memory — with AWQ 4-bit, and the team used the vLLM inference library to predict the test datasets on four NVIDIA T4 Tensor Core GPUs within the time constraints.

This approach secured the top spot in each individual track and overall first place in the competition — a clean sweep for NVIDIA for the second year in a row.

The team plans to submit a detailed paper on its solution next month and to present its findings at KDD 2024 in Barcelona.

Source: https://blogs.nvidia.com/blog/nvidia-ai-masters-kdd-cup-2024/
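The core idea behind QLoRA — a frozen, quantized base weight matrix plus a small trainable low-rank update — can be illustrated with a toy NumPy sketch. This is not the team's training code, and real QLoRA uses NF4 quantization with block-wise scales rather than the single-scale 4-bit scheme simulated here; the point is only to show why so few parameters need gradients.

```python
import numpy as np

def quantize_4bit(w):
    """Toy symmetric 4-bit quantization: round weights onto 15 integer levels
    with one per-matrix scale (real QLoRA uses NF4 with block-wise scales)."""
    scale = np.abs(w).max() / 7.0
    q = np.clip(np.round(w / scale), -7, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
d_in, d_out, r = 512, 512, 8          # r is the low-rank adapter rank

# Frozen base weight, stored in quantized form.
w_base = rng.standard_normal((d_out, d_in)).astype(np.float32)
q, scale = quantize_4bit(w_base)

# Trainable low-rank adapter: only A and B receive gradient updates.
lora_a = rng.standard_normal((r, d_in)).astype(np.float32) * 0.01
lora_b = np.zeros((d_out, r), dtype=np.float32)  # zero init: no change at start

def forward(x):
    # Effective weight is the dequantized base plus the low-rank update B @ A.
    return x @ (dequantize(q, scale) + lora_b @ lora_a).T

full_params = d_out * d_in
adapter_params = lora_a.size + lora_b.size
print(f"trainable adapter params: {adapter_params} "
      f"({100 * adapter_params / full_params:.1f}% of the full matrix)")
```

For this 512x512 matrix the adapter holds 8,192 trainable values against 262,144 in the full weight, a ratio that shrinks further at the dimensions of a 72B-parameter model — which is how fine-tuning fits on eight A100 GPUs.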