Analyzing Streaming Data from Kafka Topics

Now you can create streaming data topics in the Watson Data Platform UI, receive data from diverse sources, and create connections to these topics in order to analyze streaming data in your analytic projects and notebooks.

From the Data Science Experience UI, go to “Watson Data Platform” -> “Data Services”. If you don’t have one yet, create an instance of “Message Hub”. This may take a little while; once it is ready, the instance appears in your list of data service instances. Click the Message Hub service instance entry to open its list of streaming data topics. Here you can create one or more Apache Kafka topics, to which your apps, devices, or other streaming data sources can send data messages via the Apache Kafka API. See “Analyze streaming data from Kafka topics” for more details.
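
To make the "send data messages via the Apache Kafka API" step a bit more concrete, here is a minimal sketch of a producer in Python. It rests on several assumptions that are not from this post: it uses the third-party kafka-python package, it assumes the service credentials are a dict with the fields `kafka_brokers_sasl`, `user`, and `password`, and it assumes the brokers accept SASL/PLAIN over TLS. The helper names (`producer_settings`, `encode_reading`, `send_readings`) are made up for illustration. Check your own service credentials before relying on any of it.

```python
import json


def producer_settings(credentials):
    """Map a credentials dict to kafka-python keyword arguments.

    Assumption: the dict has 'kafka_brokers_sasl', 'user' and 'password'
    fields; adjust the keys to match your own service credentials.
    """
    return {
        "bootstrap_servers": credentials["kafka_brokers_sasl"],
        "security_protocol": "SASL_SSL",   # assumed: TLS plus SASL
        "sasl_mechanism": "PLAIN",         # assumed: username/password auth
        "sasl_plain_username": credentials["user"],
        "sasl_plain_password": credentials["password"],
    }


def encode_reading(reading):
    """Serialize one message (a dict) as UTF-8 encoded JSON bytes."""
    return json.dumps(reading, sort_keys=True).encode("utf-8")


def send_readings(credentials, topic, readings):
    """Send each reading to the given topic, then flush and close."""
    # Imported here so the helpers above work without kafka-python installed.
    from kafka import KafkaProducer

    producer = KafkaProducer(
        value_serializer=encode_reading,
        **producer_settings(credentials)
    )
    try:
        for reading in readings:
            producer.send(topic, reading)
        producer.flush()  # block until buffered messages are delivered
    finally:
        producer.close()
```

An app or device gateway could then call something like `send_readings(credentials, "sensor-readings", [{"device": "d1", "temp": 21.5}])`, where the topic name and the message fields are placeholders for whatever you created in the UI.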

Next, from within your project, create a connection and pick the Message Hub service instance as the data service to connect to. Then, from within a notebook, pick that connection and click “Insert to Code”. This inserts the access information needed to connect to the topic, which you can use in your Python, R, or Scala code to read data from the streaming data topic, either directly or through Spark Streaming within the notebook.
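
As a rough illustration of reading from the topic directly in a Python notebook (the Spark Streaming route is not shown), the sketch below consumes a bounded number of messages. The same caveats apply: kafka-python is my choice of client, not something this post prescribes; the credential field names `kafka_brokers_sasl`, `user`, and `password` are assumptions about what “Insert to Code” gives you; and `consumer_settings`, `decode_message`, and `read_messages` are hypothetical helper names. It also assumes the messages are JSON.

```python
import json


def consumer_settings(credentials, max_wait_ms=10000):
    """Map a credentials dict to kafka-python consumer keyword arguments.

    Assumption: the dict has 'kafka_brokers_sasl', 'user' and 'password'
    fields; rename the keys to match the inserted access information.
    """
    return {
        "bootstrap_servers": credentials["kafka_brokers_sasl"],
        "security_protocol": "SASL_SSL",   # assumed: TLS plus SASL
        "sasl_mechanism": "PLAIN",         # assumed: username/password auth
        "sasl_plain_username": credentials["user"],
        "sasl_plain_password": credentials["password"],
        "auto_offset_reset": "earliest",   # start from the oldest retained message
        "consumer_timeout_ms": max_wait_ms,  # stop iterating when the topic is idle
    }


def decode_message(raw_bytes):
    """Parse one message body, assumed to be UTF-8 encoded JSON."""
    return json.loads(raw_bytes.decode("utf-8"))


def read_messages(credentials, topic, limit=10):
    """Return up to `limit` decoded messages from the topic."""
    # Imported here so the helpers above work without kafka-python installed.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        topic,
        value_deserializer=decode_message,
        **consumer_settings(credentials)
    )
    messages = []
    try:
        for record in consumer:
            messages.append(record.value)
            if len(messages) >= limit:
                break
    finally:
        consumer.close()
    return messages
```

Capping the read with `limit` and `consumer_timeout_ms` keeps a notebook cell from blocking forever on an idle topic; for continuous processing you would drop the cap or move to Spark Streaming.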


Originally published at datascience.ibm.com on September 27, 2016 by Thomas Schaeck.