Apache Kafka can look complex at first, but let’s make it easy for you to learn and set up. Kafka works much like a messaging queue: a data pipeline that takes data in at one end and delivers it at the other, the way a pipe does. Kafka is an open-source platform developed by the Apache Software Foundation and written in Java and Scala.
The key points of Kafka are as follows:
  • Allows you to publish and consume data streams
     - Publish data streams (Publisher)
Before discussing the publisher, let’s discuss what a topic is. A topic is a named category or feed in which data streams are stored. One topic can have multiple consumers subscribed to it. Topics are further divided into partitions, which allow work on the topic to proceed in parallel.
A publisher lets you publish data streams to the topic or topics of your choice. The publisher is responsible for choosing the topic, and the partition within that topic, to which each record is written. Partitions make the work easier to split up and also provide load balancing.
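To make the idea of choosing a partition concrete, here is a minimal sketch in plain Python. It does not talk to Kafka at all: the function name `choose_partition`, the sample records, and the use of CRC32 as the hash are assumptions made purely for illustration (Kafka's own default partitioner uses a different hash function).

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a record key to a partition index (illustrative only)."""
    # CRC32 is a stand-in hash chosen for this sketch; it is not the
    # hash that a real Kafka client uses.
    return zlib.crc32(key) % num_partitions


# Hypothetical topic with 3 partitions and a few keyed records.
records = [(b"user-1", "login"), (b"user-2", "click"), (b"user-1", "logout")]
partitions = {p: [] for p in range(3)}
for key, value in records:
    partitions[choose_partition(key, 3)].append(value)

print(partitions)
```

Because the partition depends only on the key, records with the same key always land in the same partition and keep their relative order there, while different keys spread the load across partitions.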
     - Subscribe to data streams (Consumer)
Consumers subscribe to topics and read data from them. Consumers are organized into consumer groups: each record stored in a topic is delivered to one consumer instance within each subscribing consumer group. If the number of consumers in a group is greater than the number of partitions, some of the consumers will be idle.
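The idle-consumer case is easier to see with a small simulation. The sketch below is plain Python and only mimics the idea: `assign_partitions` is a hypothetical helper using simple round-robin, whereas a real Kafka cluster coordinates assignment itself with configurable strategies.

```python
def assign_partitions(partitions, consumers):
    """Round-robin assignment of partitions to the consumers of one group."""
    assignment = {c: [] for c in consumers}
    for i, partition in enumerate(partitions):
        # Each partition goes to exactly one consumer in the group.
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment


# Hypothetical group of 4 consumers reading a topic with 3 partitions.
assignment = assign_partitions([0, 1, 2], ["c1", "c2", "c3", "c4"])
idle = [c for c, parts in assignment.items() if not parts]

print(assignment)     # {'c1': [0], 'c2': [1], 'c3': [2], 'c4': []}
print("idle:", idle)  # idle: ['c4']
```

With three partitions and four consumers, the fourth consumer receives nothing, which is exactly the idle case described above.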
     - Stream data
This allows data to be processed as it flows through Kafka. Processing involves filtering, transforming, or aggregating the data. Example: filtering tweets on the basis of a particular keyword.
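The tweet-filtering example can be sketched in a few lines of plain Python. This is only an illustration of the filtering step itself, not of Kafka's streaming API; the function name `filter_by_keyword` and the sample tweets are made up for the example.

```python
def filter_by_keyword(tweets, keyword):
    """Yield only the tweets whose text contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    for tweet in tweets:
        if needle in tweet.lower():
            yield tweet


# Hypothetical incoming stream of tweets.
tweets = [
    "Learning Kafka today",
    "Lunch was great",
    "kafka partitions explained",
]
matching = list(filter_by_keyword(tweets, "kafka"))

print(matching)  # ['Learning Kafka today', 'kafka partitions explained']
```

The function is a generator, so it handles one tweet at a time instead of waiting for the whole input, which is the spirit of processing a stream.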
Let’s move on to the installation steps for Kafka:
  1. Download the latest version of Kafka from here: https://kafka.apache.org/downloads
  • Extract the tar file with the following command: tar -xzf kafka_2.11-
  • Go to the Kafka directory: cd kafka_2.11-

  2. Kafka uses ZooKeeper on the back end. First, start the ZooKeeper server with the following command: bin/zookeeper-server-start.sh config/zookeeper.properties


  3. Now, let’s start the Kafka server: bin/kafka-server-start.sh config/server.properties
By default, Kafka listens on port 9092 and ZooKeeper on port 2181.
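If you want a quick check that both servers came up, one option is to test whether those ports accept TCP connections. The sketch below uses only Python's standard library; the helper name `is_port_open` is made up for this example, and it assumes both servers run on localhost with the default ports mentioned above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Default ports as stated above; change them if your configuration differs.
    for name, port in [("ZooKeeper", 2181), ("Kafka", 9092)]:
        state = "listening" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

An open port only shows that something is listening there; it does not prove the broker is healthy, so treat this as a first sanity check.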
In the next blog post, we’ll walk you through an example of publishing and subscribing to data.