The year is 2021. As the world is still deeply affected by the COVID-19 pandemic, the United Nations have decided that in order to prevent the same scenario of ever happening again, there should be a worldwide, real-time contact-tracing system in place capable of tracking every human being on the planet. Everyone is now equipped with a small, pebble-like device. If a person you have been in contact with is confirmed to carry a new kind of virus for which there is no remedy, the device starts glowing in red. Each device reports the anonymized contact information to one global backend system capable of handling this type of massive scale.
Two months ago when I first started working on this project, the paragraph above would have sounded to me like an unlikely future taken straight out of a dystopian science-fiction book. Fast-forward to today and I really have to wonder whether this scenario might not actually happen one day.
In the beginning of 2018 I started to get more and more interested in the building blocks necessary to create clusters. At the foundation of clustered systems are so-called membership protocols. The job of a membership protocol is to keep clustered applications up-to-date with the list of nodes that are part of the cluster, allowing all the individual nodes to act as one system.
Up until November 2018, I considered the SWIM failure detector and membership protocol together with Hashicorp’s Lifeguard extensions to be the state-of-the art in this domain. Then the paper Stable and Consistent Membership at Scale with Rapid (L. Suresh et al., 2018) struck my eye and I’ve been itching to try it out since then. I finally decided to make the time to try it out (that is to say integrating it with Akka Cluster) at the end of February. What I first thought would take a couple of weeks at most turned out to take two months. This is in big part because I had set the goal to create a 10000 node cluster on cheap AWS EC2 instances – I didn’t want to settle for 5000 nodes because that was too close to the already existing 2400 node Akka cluster. Coupled with the COVID-19 outbreak and the resulting change of lifestyle, these have been a very strange and exhausting, yet interesting, two months.
The rest of this article is structured as follows:
Akka Cluster was created in 2013. For reference, here’s Jonas Boner’s talk about The Road to Akka Cluster, and beyond which shows what motivated the design of the system.
The following is a summary of how the membership service works.
**SEE ALSO: **Tour of Akka Typed: Cluster Sharding
Propagating membership information is achieved via random biased push-pull gossip. At each tick of the clock (1 second by default) each node:
#articles #akka #reactive programming