Session

Using Redis for Distributed & Fault-tolerant Data Plumbing Infrastructure

Speaker(s): Atul Gore

Context:
Altizon is an Industrial IoT PaaS company. We collect data from industrial machines, then process, analyze, aggregate and store it in a data lake. A number of our customers also request real-time or scheduled pushes of this raw and processed data into their own data infrastructure or cloud environments.

Solution:
To meet this need, we've built a distributed, fault-tolerant data push adapter service in Java, centered around Redis. We run a 3-node Sentinel setup to cover for infrastructure failures. The solution schedules future work items and detects ready-to-run work items using the 'Z' (sorted set) data structures provided by Redis. Ready-to-run work items are pushed to multiple Redis queues and then picked up by a number of worker threads (across multiple compute nodes), which allows massive parallelism. We need to guarantee data delivery with at-least-once semantics. This means we need to detect work item failures and reschedule the failed work items. To do this, we use the atomic move operations provided by Redis to move elements between queues.
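
The flow described above can be illustrated with a minimal in-memory sketch. This is not the service's actual code: the class name `PushPlumbingSketch`, its method names, and the use of a single ready queue are assumptions made for illustration. Plain Java collections stand in for Redis, and each method's comment names the Redis command it is meant to model (a real implementation would issue those commands through a Redis client and would need the sorted-set-to-queue promotion to be atomic as well).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of the scheduling and reliable-queue pattern.
// All names here are illustrative assumptions, not the authors' API.
class PushPlumbingSketch {
    // Stands in for a Redis sorted set: score = due time in ms, members = item ids.
    // (Unlike a real sorted set, this model does not deduplicate members.)
    private final TreeMap<Long, List<String>> scheduled = new TreeMap<>();
    // Stand in for two Redis lists: items ready to run, and items a worker has claimed.
    private final Deque<String> ready = new ArrayDeque<>();
    private final Deque<String> processing = new ArrayDeque<>();

    // Models: ZADD scheduled <dueAtMillis> <itemId>
    synchronized void schedule(String itemId, long dueAtMillis) {
        scheduled.computeIfAbsent(dueAtMillis, k -> new ArrayList<>()).add(itemId);
    }

    // Models: ZRANGEBYSCORE scheduled -inf <now>, then ZREM + LPUSH ready per item.
    // Returns how many items became ready.
    synchronized int promoteDue(long nowMillis) {
        int moved = 0;
        Iterator<Map.Entry<Long, List<String>>> it =
                scheduled.headMap(nowMillis, true).entrySet().iterator();
        while (it.hasNext()) {
            for (String id : it.next().getValue()) {
                ready.addFirst(id);
                moved++;
            }
            it.remove(); // drop the due entry from the schedule
        }
        return moved;
    }

    // Models the atomic move: RPOPLPUSH ready processing (LMOVE in newer Redis).
    // The item is never "in flight" outside a list, so a crashed worker cannot lose it.
    synchronized String claim() {
        String id = ready.pollLast();
        if (id != null) {
            processing.addFirst(id);
        }
        return id;
    }

    // Models: LREM processing 1 <itemId> after a successful push.
    synchronized boolean ack(String itemId) {
        return processing.remove(itemId);
    }

    // Models failure handling: remove from processing and re-schedule for a later time.
    synchronized void fail(String itemId, long retryAtMillis) {
        if (processing.remove(itemId)) {
            schedule(itemId, retryAtMillis);
        }
    }
}
```

In this sketch a worker would loop over `claim()`, attempt the push, and call `ack()` on success or `fail()` on error. Because a claimed item stays in the processing list until it is acknowledged, an item whose worker died can be found there and rescheduled, which is what yields at-least-once (rather than exactly-once) delivery.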

Result:
The data structures provided by Redis allowed us to create an extremely reliable, fast and fault-tolerant data push service.