Running BigData Applications with Microservices and Airflow

Sibasish Brahma
Mar 29, 2019 · 2 min read
The setup has four parts:
  1. Build a Kafka cluster for middleware and message passing.
  2. Build a microservice with Kafka using Spring Boot.
  3. Build a Spark job repository so the jobs can be exposed as a service.
  4. Set up Airflow and read the parameter JSON files from the microservice application.

Kafka cluster start:

Start ZooKeeper.
Start the Kafka brokers.

Create a topic in Kafka that can be used to pass the JSON messages.
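
For a local single-broker setup, these steps map to Kafka's bundled scripts roughly as follows. The topic name job-params is just an example, and on Kafka versions from around this time topic creation still goes through ZooKeeper:

```bash
# start ZooKeeper (run from the Kafka installation directory)
bin/zookeeper-server-start.sh config/zookeeper.properties

# start a Kafka broker
bin/kafka-server-start.sh config/server.properties

# create a topic used to pass the parameter JSON messages
bin/kafka-topics.sh --create --zookeeper localhost:2181 \
  --replication-factor 1 --partitions 1 --topic job-params
```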

Spring Boot Kafka Microservice:

A Spring Boot UI is used to pass the job parameters; the service publishes them to the Kafka topic as a JSON message.
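
The Spring Boot code itself isn't shown in the post. Purely to illustrate the shape of the message that ends up on the topic, here is a minimal Python stand-in (kafka-python) for what the Spring Boot producer would send, e.g. via a KafkaTemplate; the topic and field names are assumptions, not from the original:

```python
# Hypothetical stand-in for the Spring Boot producer: publishes the job
# parameters collected from the UI to the Kafka topic as a JSON message.
import json
from kafka import KafkaProducer  # pip install kafka-python

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Example parameter payload; the field names are illustrative only.
job_params = {
    "dag_id": "spark_wordcount_demo",
    "schedule": "@daily",
    "spark_job": "wordcount.py",
    "input_path": "hdfs:///datalake/raw/events",
    "output_path": "hdfs:///datalake/processed/events",
}

producer.send("job-params", job_params)
producer.flush()
```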

Spark Jobs as a Service:

The job reads the Kafka topic, prints the message, and also writes it to a parameter file in the Airflow execution directory.
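
In the post this step is a Spark job. As a simplified sketch of the same idea, the snippet below uses a plain kafka-python consumer that reads the topic, prints each message, and writes it to a parameter file in a directory that Airflow scans; the paths and names are assumptions for illustration:

```python
# Minimal stand-in for the job that consumes the parameter messages:
# print each JSON message and persist it where the Airflow DAG factory
# (see the last section) can pick it up. Paths and topic names are examples.
import json
import os
from kafka import KafkaConsumer  # pip install kafka-python

PARAMS_DIR = os.path.expanduser("~/airflow/dags/params")
os.makedirs(PARAMS_DIR, exist_ok=True)

consumer = KafkaConsumer(
    "job-params",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="earliest",
)

for message in consumer:
    params = message.value
    print("Received parameters:", params)
    # one parameter file per DAG, named after the dag_id in the payload
    out_path = os.path.join(PARAMS_DIR, "{}.json".format(params["dag_id"]))
    with open(out_path, "w") as f:
        json.dump(params, f, indent=2)
```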

Airflow setup and DAG creation through the parameter file:

Start the Airflow cluster.
A dynamic DAG is created from the parameter file passed from the microservice.

This DAG will run and execute Spark jobs against your data in the data lake. We can create dynamic pipelines and trigger them from the front end with ease.
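
A minimal sketch of how such a DAG could be generated dynamically from the parameter files is below (Airflow 1.x import paths, a BashOperator calling spark-submit; the directory layout and field names follow the assumptions used above, not the original code):

```python
# dags/dynamic_spark_dags.py -- one DAG per parameter JSON file.
# Airflow re-parses this file and registers each generated DAG.
import glob
import json
import os
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator  # Airflow 1.x path

PARAMS_DIR = os.path.expanduser("~/airflow/dags/params")


def build_dag(params):
    dag = DAG(
        dag_id=params["dag_id"],
        schedule_interval=params.get("schedule"),
        start_date=datetime(2019, 3, 1),
        catchup=False,
    )
    # Submit the Spark job with the parameters supplied by the microservice.
    BashOperator(
        task_id="spark_submit",
        bash_command="spark-submit {job} {inp} {out}".format(
            job=params["spark_job"],
            inp=params["input_path"],
            out=params["output_path"],
        ),
        dag=dag,
    )
    return dag


for path in glob.glob(os.path.join(PARAMS_DIR, "*.json")):
    with open(path) as f:
        params = json.load(f)
    # expose each DAG at module level so Airflow can discover it
    globals()[params["dag_id"]] = build_dag(params)
```

Once the scheduler picks up the generated DAG, it can be triggered from the Airflow UI or from the front end, closing the loop from the microservice to the data lake.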
