Kafka Visualization


Prerequisites

  • Experience working in a data platform engineering capacity with data streams
  • Experience integrating data pipelines

Challenge

You have been hired by ACME Corp to visualize streams of data coming from devices in the ocean. Your point of contact has asked you to ingest the raw streaming data from these devices, visualize it on a map, and show how the devices move in the ocean over time.

The Task

Architecture Diagram (image not reproduced here)

  • The data is streamed daily and you are provided data for Day 1 and Day 2.
  • The data can be picked up from here for Day 1 and Day 2
  • It is a fat JSON file that contains the location information from 90 devices in the ocean. Each device has a unique spotterId.
  • The structure of the JSON is as below (reconstructed; field order may differ):

    {
      "all_data": [
        {
          "data": {
            "spotterId": "...",
            "limit": ...,
            "waves": [
              {
                "significantWaveHeight": ...,
                "peakPeriod": ...,
                "meanPeriod": ...,
                "peakDirection": ...,
                "peakDirectionalSpread": ...,
                "meanDirection": ...,
                "meanDirectionalSpread": ...,
                "timestamp": "...",
                "latitude": ...,
                "longitude": ...
              }
            ]
          }
        }
      ]
    }
Requirements

  • Prepare a Docker Compose file that contains the following: Confluent Kafka, Apache Druid, and Apache Superset (a sketch follows this list)
  • Publish Day 1 data into the Kafka topic ocean-data (see the producer sketch below)
  • Connect this Kafka topic to Apache Druid (see the ingestion-spec sketch below)
  • Connect Apache Superset to the Druid database
  • Pick a map dashboard in Superset and plot the latitude / longitude locations of the ocean devices
  • Publish Day 2 data into the same topic
  • Refresh the map dashboard in Superset to see the Day 2 data on the map
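
One possible starting shape for the Compose file. Image tags, listener config, and especially the Druid topology are assumptions to adapt: Druid's own example compose file splits it into coordinator, broker, historical, middle-manager, and router containers plus a metadata store, which is trimmed here for brevity.

    # Sketch only: tags and topology are assumptions, not a vetted stack.
    version: "3.8"
    services:
      zookeeper:
        image: confluentinc/cp-zookeeper:7.4.0
        environment:
          ZOOKEEPER_CLIENT_PORT: 2181

      kafka:
        image: confluentinc/cp-kafka:7.4.0
        depends_on: [zookeeper]
        ports: ["9092:9092"]
        environment:
          KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
          # Reachable as kafka:29092 inside the network and localhost:9092
          # from the host (needed when the producer runs outside Docker).
          KAFKA_LISTENERS: INTERNAL://0.0.0.0:29092,EXTERNAL://0.0.0.0:9092
          KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka:29092,EXTERNAL://localhost:9092
          KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
          KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
          KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

      druid:
        # Placeholder for the full multi-service Druid setup; see the
        # docker-compose.yml shipped with the Apache Druid distribution.
        image: apache/druid:27.0.0
        command: ["router"]
        ports: ["8888:8888"]

      superset:
        # First-run admin setup and SECRET_KEY configuration omitted.
        image: apache/superset:latest
        ports: ["8088:8088"]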
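
Publishing Day 1 into the ocean-data topic could then look like this, assuming the kafka-python client and the flatten helper sketched earlier (imported here from a hypothetical module name):

    import json

    from kafka import KafkaProducer  # pip install kafka-python

    from flatten_day import flatten  # the helper sketched earlier (hypothetical module)

    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",  # matches the EXTERNAL listener above
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )

    # One message per wave reading, keyed by spotterId so all readings
    # from a device land on the same partition, preserving their order.
    for record in flatten("day1.json"):
        producer.send("ocean-data",
                      key=record["spotterId"].encode("utf-8"),
                      value=record)
    producer.flush()

Re-running the same script against the Day 2 file covers the "publish Day 2 data into the same topic" step.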
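
One common way to connect the topic to Druid is a Kafka ingestion supervisor spec POSTed to the Overlord at /druid/indexer/v1/supervisor. A trimmed sketch, with dimensions taken from the JSON structure above and everything else (datasource name, tuning) left as assumed defaults:

    {
      "type": "kafka",
      "spec": {
        "ioConfig": {
          "type": "kafka",
          "topic": "ocean-data",
          "consumerProperties": { "bootstrap.servers": "kafka:29092" },
          "inputFormat": { "type": "json" }
        },
        "dataSchema": {
          "dataSource": "ocean-data",
          "timestampSpec": { "column": "timestamp", "format": "iso" },
          "dimensionsSpec": {
            "dimensions": [
              "spotterId",
              { "type": "double", "name": "latitude" },
              { "type": "double", "name": "longitude" },
              { "type": "double", "name": "significantWaveHeight" }
            ]
          },
          "granularitySpec": { "rollup": false }
        }
      }
    }

For the Superset connection, the pydruid SQLAlchemy dialect uses a URI of the form druid://<router-or-broker-host>:8888/druid/v2/sql (the driver must be installed in the Superset image). A deck.gl Scatterplot chart over latitude / longitude is one way to satisfy the map requirement; note that Superset's deck.gl charts need a MAPBOX_API_KEY configured. After producing Day 2 into the same topic, refreshing the chart should show the devices' new positions.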

Deliverable

  • A GitHub repo with read permissions given to GitHub users rafty8s, bsneider, omnipresent07, and barakstout (how to invite collaborators)
  • The repo should include a README.md with instructions detailed enough to reproduce the setup
  • A docker-compose.yml with which everything can be spun up
  • Any additional commands that need to be executed, documented in the README