ZooKeeper is an open-source distributed coordination service that is widely used across the big data ecosystem. This blog post introduces the basics of ZooKeeper and how to use Docker containers to build a cluster.

ZooKeeper History

As we all know, many problems in distributed systems stem from a lack of coordination mechanisms. Even big companies like Yahoo faced this dilemma.

ZooKeeper originated in a research group at Yahoo Research Institute. At that time, many large-scale systems within Yahoo relied on similar in-house systems for distributed coordination, but those systems often suffered from single points of failure.

Yahoo developers therefore set out to build a general-purpose distributed coordination framework with no single point of failure, so that developers could concentrate on business logic.

Google is present in almost every area of technology, and distributed coordination is no exception: Google built Chubby, but Chubby is not open source. Fortunately, Yahoo donated ZooKeeper to Apache as an open-source project, which makes our daily software development more convenient.

Because many of Yahoo's internal projects were named after animals at the time, Raghu Ramakrishnan, the chief scientist of Yahoo Research Institute, joked: “If this continues, our place will become a zoo!” This is the origin of the name “ZooKeeper”.


ZooKeeper Mechanism

ZooKeeper is a distributed service management framework built on the observer pattern. It stores and manages the data that clients care about and accepts registrations from observers. When the state of that data changes, ZooKeeper notifies the observers registered on it so that they can respond accordingly.
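To make the observer idea concrete, here is a minimal in-memory sketch. It is not the ZooKeeper client API: the `TinyRegistry` class, its `watch` and `set` methods, and the `/config/db` path are all hypothetical names chosen for illustration. The sketch also assumes one-shot notifications, in the spirit of ZooKeeper's classic watches, where an observer has to register again after it has been notified.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Callback = Callable[[str, bytes], None]


class TinyRegistry:
    """Hypothetical in-memory stand-in for a coordination service."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}
        self._watchers: Dict[str, List[Callback]] = defaultdict(list)

    def watch(self, path: str, callback: Callback) -> None:
        # Register an observer for a single path.
        self._watchers[path].append(callback)

    def set(self, path: str, value: bytes) -> None:
        changed = self._data.get(path) != value
        self._data[path] = value
        if changed:
            # Assumption: watches are one-shot, so remove them before notifying.
            for callback in self._watchers.pop(path, []):
                callback(path, value)


events = []
registry = TinyRegistry()
registry.watch("/config/db", lambda path, value: events.append((path, value)))
registry.set("/config/db", b"host=10.0.0.5")  # the observer is notified
registry.set("/config/db", b"host=10.0.0.6")  # no notification, the watch already fired
```

After both calls, `events` holds only the first change, because the observer did not register again.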

Features

  • Simple data model

ZooKeeper lets distributed programs coordinate with each other through a shared, tree-structured namespace. The data model in the ZooKeeper server's memory is composed of a series of data nodes called “ZNodes”. ZooKeeper keeps the full data set in memory in order to improve server throughput and reduce latency.
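The tree-structured namespace can be pictured with a small sketch. This is only an illustration of the idea, not ZooKeeper's internal data structure: the `ZNode` and `ZNodeTree` classes and the example paths are hypothetical. The sketch assumes that a node can only be created under a parent that already exists.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ZNode:
    data: bytes = b""
    children: Dict[str, "ZNode"] = field(default_factory=dict)


class ZNodeTree:
    """Hypothetical in-memory tree addressed by slash-separated paths."""

    def __init__(self) -> None:
        self.root = ZNode()

    def _walk(self, path: str) -> ZNode:
        node = self.root
        for part in filter(None, path.split("/")):
            node = node.children[part]  # raises KeyError if a node is missing
        return node

    def create(self, path: str, data: bytes = b"") -> None:
        parent_path, _, name = path.rpartition("/")
        if not name:
            raise ValueError(f"invalid path: {path!r}")
        parent = self._walk(parent_path)  # the parent must already exist
        if name in parent.children:
            raise ValueError(f"node already exists: {path}")
        parent.children[name] = ZNode(data)

    def get(self, path: str) -> bytes:
        return self._walk(path).data

    def children(self, path: str) -> List[str]:
        return sorted(self._walk(path).children)


tree = ZNodeTree()
tree.create("/app")
tree.create("/app/workers")
tree.create("/app/workers/w1", b"10.0.0.5:8080")
tree.create("/app/workers/w2", b"10.0.0.6:8080")
```

Every node can hold data and children at the same time, which is what separates this model from a plain file system, where a path is either a file or a directory.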

  • Clusters

A ZooKeeper cluster is usually composed of a group of machines. Each machine maintains the current server state in memory, and the machines communicate with one another.

  • Sequential access

For each update request from a client, ZooKeeper assigns a globally unique, monotonically increasing number that reflects the order of all transaction operations.
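A rough sketch of such a number is shown below. It assumes the commonly described layout of ZooKeeper's transaction id (zxid): a 64-bit value with the leader epoch in the high 32 bits and a per-epoch counter in the low 32 bits. The `TxnIdAllocator` class and the helper functions are hypothetical names, not part of any ZooKeeper API.

```python
from typing import Tuple

COUNTER_BITS = 32
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def make_zxid(epoch: int, counter: int) -> int:
    # Assumed layout: epoch in the high 32 bits, counter in the low 32 bits.
    return (epoch << COUNTER_BITS) | (counter & COUNTER_MASK)


def split_zxid(zxid: int) -> Tuple[int, int]:
    return zxid >> COUNTER_BITS, zxid & COUNTER_MASK


class TxnIdAllocator:
    """Hypothetical allocator used by the current leader for one epoch."""

    def __init__(self, epoch: int) -> None:
        self.epoch = epoch
        self.counter = 0

    def next(self) -> int:
        self.counter += 1
        return make_zxid(self.epoch, self.counter)


first_leader = TxnIdAllocator(epoch=1)
a = first_leader.next()
b = first_leader.next()

# After a new election the epoch increases and the counter restarts.
second_leader = TxnIdAllocator(epoch=2)
c = second_leader.next()
```

Because the epoch sits in the high bits, every id issued by a newer leader is larger than any id issued by an older one, so comparing two ids as plain integers gives the order of the updates.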

  • High performance

ZooKeeper keeps the full data set in memory and serves all non-transactional (read) requests from clients directly.

Election

A ZooKeeper cluster runs in a “one-master, multiple-slave” mode: the master is the leader and the slaves are followers. The leader is chosen through an election.

Leader election is the key to guaranteeing the consistency of distributed data. ZooKeeper needs to run a leader election in two situations: when the servers are initializing, and when the leader goes down.

The election during server initialization goes through the following steps.

  • The nodes in the cluster communicate with each other. Each machine tries to find the leader and, finding none, enters the election state.
  • Each node votes for itself as the leader and sends this vote to the other nodes in the cluster.
  • When a server in the cluster receives a vote, it first checks the vote's validity. For each vote that passes the check, the server compares the received vote with its own.
  • After each round of voting, every node tallies the votes to determine whether more than half of the machines have cast the same vote.
  • Once the leader is determined, each server updates its own state: a follower changes to “FOLLOWING” and the leader changes to “LEADING”.
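The steps above can be sketched as a small simulation. It rests on several assumptions that the steps do not spell out: votes are compared by epoch first, then by the latest transaction id (zxid), then by server id, which is how ZooKeeper's fast leader election is usually described; all reachable servers exchange votes in a single round; and validity checks and networking are left out. The `Vote`, `elect`, and `states` names are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True, order=True)
class Vote:
    # Field order defines the comparison order: epoch, then zxid, then server id.
    epoch: int
    zxid: int
    leader_id: int


def elect(self_votes: Dict[int, Vote], cluster_size: int) -> Optional[int]:
    """self_votes maps each reachable server id to its initial vote for itself."""
    if not self_votes:
        return None
    # Every server receives all votes and keeps the best one it has seen.
    best = max(self_votes.values())
    ballots = {sid: max(own, best) for sid, own in self_votes.items()}
    # Tally the ballots and require more than half of the whole cluster.
    tally = Counter(vote.leader_id for vote in ballots.values())
    leader_id, count = tally.most_common(1)[0]
    return leader_id if count > cluster_size // 2 else None


def states(server_ids, leader_id: int) -> Dict[int, str]:
    return {sid: "LEADING" if sid == leader_id else "FOLLOWING" for sid in server_ids}


votes = {
    1: Vote(epoch=1, zxid=5, leader_id=1),
    2: Vote(epoch=1, zxid=7, leader_id=2),
    3: Vote(epoch=1, zxid=7, leader_id=3),
}
leader = elect(votes, cluster_size=3)
roles = states(votes, leader)

# Only 2 of 5 servers are reachable, so no vote can reach a majority.
no_leader = elect({1: Vote(1, 0, 1), 2: Vote(1, 0, 2)}, cluster_size=5)
```

In this run servers 2 and 3 have seen the same latest transaction, so the higher server id breaks the tie and server 3 becomes the leader. The second call returns `None` because two ballots are not more than half of a five-node cluster.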
