{"id":4507,"date":"2026-03-17T15:55:39","date_gmt":"2026-03-17T15:55:39","guid":{"rendered":"https:\/\/salarydistribution.com\/machine-learning\/2026\/03\/17\/snap-decisions-how-open-libraries-for-accelerated-data-processing-boost-a-b-testing-for-snapchat\/"},"modified":"2026-03-17T15:55:39","modified_gmt":"2026-03-17T15:55:39","slug":"snap-decisions-how-open-libraries-for-accelerated-data-processing-boost-a-b-testing-for-snapchat","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2026\/03\/17\/snap-decisions-how-open-libraries-for-accelerated-data-processing-boost-a-b-testing-for-snapchat\/","title":{"rendered":"Snap Decisions: How Open Libraries for Accelerated Data Processing Boost A\/B Testing for Snapchat"},"content":{"rendered":"<div>\n<p><span>The features on social media apps like Snapchat evolve nearly as fast as what\u2019s trending. To keep pace, its parent company Snap has adopted open data processing libraries from NVIDIA on Google Cloud services to boost development.\u00a0<\/span><\/p>\n<p><span>Every new feature rolled out to Snapchat\u2019s more than 940 million monthly active users goes through a set of controlled experiments before it\u2019s launched. During this A\/B testing cycle, the development team studies different variables with a subset of users, measuring nearly 6,000 metrics that analyze engagement, app performance and monetization.\u00a0<\/span><\/p>\n<p><span>Snap runs thousands of these experiments each month \u2014 processing over 10 petabytes of data within a three-hour window each morning using the Apache Spark distributed framework. 
By adopting <\/span><a target=\"_blank\" href=\"https:\/\/developer.nvidia.com\/topics\/ai\/data-science\/cuda-x-data-science-libraries\/cudf#section-accelerate-apache-spark\" rel=\"noopener\"><span>Apache Spark accelerated by NVIDIA cuDF<\/span><\/a><span>, the company is boosting these data processing workloads on NVIDIA GPUs to achieve 4x speedups in runtime with the same number of machines, providing a cost-effective path to scale.<\/span><\/p>\n<p><span>By pairing NVIDIA\u2019s GPU-optimized software, including NVIDIA CUDA-X libraries, with Google\u2019s infrastructure management services such as Google Kubernetes Engine, Snap is harnessing a full-stack platform for data processing at scale.\u00a0<\/span><\/p>\n<p><span>\u201cExperimentation is at the core of our company. Changing our data infrastructure from CPUs to GPUs allows us to efficiently scale this experimentation to more features, more metrics and more users over time,\u201d said Prudhvi Vatala, senior engineering manager at Snap. \u201cThe more experiments we\u2019re able to run, the more innovative experiences we can deliver for Snapchat users.\u201d<\/span><\/p>\n<h2><b>A Sustainable Way to Scale<\/b><\/h2>\n<p><span>Snapchat fans frequently see new features in the app \u2014 from arrival notifications to AI-generated stickers \u2014 but Snap is also continuously rolling out behind-the-scenes updates such as performance optimizations and compatibility updates for new operating system versions.\u00a0<\/span><\/p>\n<p><span>The A\/B testing for all these new features now runs on cuDF, which allows developers to run existing Apache Spark applications on NVIDIA GPUs with no code changes for easy deployment. 
The open library for accelerated data processing builds on the power of the NVIDIA <\/span><a target=\"_blank\" href=\"https:\/\/developer.nvidia.com\/topics\/ai\/data-science\/cuda-x-data-science-libraries\/cudf#section-accelerate-apache-spark\" rel=\"noopener\"><span>cuDF<\/span><\/a><span> GPU DataFrame library while scaling it for the Apache Spark distributed computing framework.<\/span><\/p>\n<p><span>Based on Snap internal data collected between January 1 and February 28, the migration has delivered 76% daily cost savings using NVIDIA GPUs on Google Kubernetes Engine compared with CPU-only workflows.<\/span><\/p>\n<p><span>\u201cWe were projecting an ambitious roadmap to scale up experimentation that would have blown up our computing costs based on our existing infrastructure,\u201d Vatala said. \u201cSwitching to GPU-accelerated pipelines with cuDF gave us a way to flatten the scaling curve, and the results were tremendous.\u201d<\/span><\/p>\n<p><span><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-91207 size-full\" src=\"https:\/\/blogs.nvidia.com\/wp-content\/uploads\/2026\/03\/Snap_pullquote-scaled.jpg\" alt=\"\" width=\"2048\" height=\"819\">To support workload migration, the team also harnessed a cuDF suite of microservices that automatically qualify, test, configure and optimize Spark workloads for GPU acceleration at scale.<\/span><\/p>\n<p><span>Working with NVIDIA experts, the Snap team optimized its pipelines on Google Cloud\u2019s G2 virtual machines powered by NVIDIA L4 GPUs so they required just 2,100 GPUs running concurrently, rather than the roughly 5,500 initially projected, according to data Snap collected between January 1 and March 13.<\/span><\/p>\n<p><span>\u201cWhen I saw the results of the initial experiments, they were pretty crazy \u2014 we saw much higher cost savings than we had expected,\u201d said Joshua Sambasivam, a backend 
engineer on the A\/B testing team. \u201cThe Spark accelerator is a perfect match for our workloads.\u201d<\/span><\/p>\n<p><span>Looking ahead, the Snap team plans to integrate the Spark accelerator beyond the A\/B team to a broader range of production workloads.\u00a0<\/span><\/p>\n<p><span>\u201cWe didn\u2019t realize we were sitting on this gold mine,\u201d Vatala said. \u201cWe\u2019ve so far migrated our two biggest pipelines, but there\u2019s a lot of opportunity ahead.\u201d\u00a0<\/span><\/p>\n<p><span>Learn more by tuning into <\/span><a target=\"_blank\" href=\"https:\/\/www.nvidia.com\/gtc\/session-catalog\/sessions\/gtc26-s81678\/\" rel=\"noopener\"><span>Vatala\u2019s session at NVIDIA GTC<\/span><\/a><span>, taking place <\/span><span>Tuesday, March 17 at 1 p.m. PT<\/span><span>.\u00a0<\/span><\/p>\n<p><i><span>Read more about <\/span><\/i><a target=\"_blank\" href=\"https:\/\/developer.nvidia.com\/topics\/ai\/data-science\/cuda-x-data-science-libraries\/cudf\" rel=\"noopener\"><i><span>NVIDIA cuDF <\/span><\/i><\/a><i><span>and get started with <\/span><\/i><a target=\"_blank\" href=\"https:\/\/developer.nvidia.com\/topics\/ai\/data-science\/cuda-x-data-science-libraries\/cudf#section-accelerate-apache-spark\" rel=\"noopener\"><i><span>GPU acceleration for Apache Spark<\/span><\/i><\/a><i><span>.<\/span><\/i><\/p>\n<p><i><span>Main image above courtesy of Snap, depicting A\/B test of its Maps 
feature.<\/span><\/i><\/p>\n<\/p><\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/blogs.nvidia.com\/blog\/snap-accelerated-data-processing\/<\/p>\n","protected":false},"author":0,"featured_media":4508,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/4507"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=4507"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/4507\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/4508"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=4507"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=4507"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=4507"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}