{"id":262,"date":"2020-09-24T19:51:33","date_gmt":"2020-09-24T19:51:33","guid":{"rendered":"https:\/\/machine-learning.webcloning.com\/2020\/09\/24\/ingest-data-from-apache-kafka\/"},"modified":"2020-09-24T19:51:33","modified_gmt":"2020-09-24T19:51:33","slug":"ingest-data-from-apache-kafka","status":"publish","type":"post","link":"https:\/\/salarydistribution.com\/machine-learning\/2020\/09\/24\/ingest-data-from-apache-kafka\/","title":{"rendered":"Ingest data from Apache Kafka"},"content":{"rendered":"<div id=\"\">\n<!-- begin main body content --><\/p>\n<p><strong>This is part of the <a href=\"https:\/\/developer.ibm.com\/series\/getting-started-with-ibm-streams-learning-path\">Learning path: Get started with IBM Streams<\/a><\/strong>.<\/p>\n<h2 id=\"summary\">Summary<\/h2>\n<p>In this developer code pattern, we walk you through the basics of creating a streaming application powered by Apache Kafka, one of the most popular open-source distributed event-streaming platforms used for creating real-time data pipelines and streaming apps. The application will be built using IBM Streams on IBM Cloud Pak\u00ae for Data.<\/p>\n<h2 id=\"description\">Description<\/h2>\n<p>In this pattern, we walk you through the basics of creating a streaming application powered by Apache Kafka. Our app will be built using IBM Streams on IBM Cloud Pak for Data. IBM Streams provides a built-in IDE (Streams Flows) that allows you to visually create a streaming app. The IBM Cloud Pak for Data platform provides additional support, such as integration with multiple data sources, built-in analytics, Jupyter Notebooks, and machine learning.<\/p>\n<p>For our Apache Kafka service, we will be using IBM Event Streams on IBM Cloud, which is a high-throughput message bus built on the Kafka platform. 
In the following examples, we will show it as both a source and a target of clickstream data \u2014 data captured from user clicks as they browsed online shopping websites.<\/p>\n<h2 id=\"flow\">Flow<\/h2>\n<p><img class=\"lazycontent\" data-src=\"https:\/\/developer.ibm.com\/developer\/default\/patterns\/add-event-streams-and-a-db-in-python-to-clickstream\/images\/flow.png\" alt=\"flow\"><\/p>\n<ol>\n<li>User creates streaming app in IBM Streams.<\/li>\n<li>Streaming app uses Kafka service via IBM Event Streams to send\/receive messages.<\/li>\n<li>Jupyter notebook is generated from IBM Streams app.<\/li>\n<li>User executes streaming app in Jupyter notebook.<\/li>\n<li>Jupyter notebook accesses Kafka service via IBM Event Streams to send\/receive messages.<\/li>\n<\/ol>\n<h2 id=\"instructions\">Instructions<\/h2>\n<p>Ready to get started? The <a href=\"https:\/\/github.com\/IBM\/ibm-streams-with-kafka\">README<\/a> explains the steps to:<\/p>\n<ol>\n<li>Clone the repo<\/li>\n<li>Provision Event Streams on IBM Cloud<\/li>\n<li>Create sample Kafka console Python app<\/li>\n<li>Add IBM Streams service to Cloud Pak for Data<\/li>\n<li>Create a new project in Cloud Pak for Data<\/li>\n<li>Create a Streams Flow in Cloud Pak for Data<\/li>\n<li>Create a Streams Flow with Kafka as source<\/li>\n<li>Use Streams Flow option to generate a notebook<\/li>\n<li>Run the generated Streams Flow notebook<\/li>\n<\/ol>\n<p>This pattern is part of the <a href=\"https:\/\/developer.ibm.com\/series\/getting-started-with-ibm-streams-learning-path\">Learning path: Get started with IBM Streams<\/a>. To continue the series and learn more about IBM Streams, check out a code pattern titled <a href=\"https:\/\/developer.ibm.com\/patterns\/build-a-streaming-app-using-ibm-streams-python-api\">Build a streaming app using a Python API<\/a>. 
<\/p>\n<\/p><\/div>\n","protected":false},"excerpt":{"rendered":"<p>https:\/\/developer.ibm.com\/patterns\/add-event-streams-and-a-db-in-python-to-clickstream\/<\/p>\n","protected":false},"author":0,"featured_media":263,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/262"}],"collection":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/comments?post=262"}],"version-history":[{"count":0,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/posts\/262\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media\/263"}],"wp:attachment":[{"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/media?parent=262"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/categories?post=262"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/salarydistribution.com\/machine-learning\/wp-json\/wp\/v2\/tags?post=262"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}