Agreement in Faulty Processes, another reason to toast to Leslie

Matters are complicated if we demand that a process group to reach an agreement. This is done for electing a coord, deciding or not whether to commit a Tx, and synchronizing. This is not that simple when comm. and procs. are not perfect. The general goal is to reach consensus, even with the faulty nodes in a finite set of steps.

The two-army problem

5000 blue army troops are in a valley, let’s say Val d’Nuria in Catalunia.
Two groups from the red army troops are camped on the hills overlooking the valley, and are each of them has 3000 men.
If both red armies attack at the same time, they will result victorious, however, if only one attacks, math says:
5000 vs. 3000 = Slaughter!
The main problem here is that the communication channel is unreliable, the red army messenger can be captured by the blue army on his way down the hill to the other side. It’s all about ACKs! They may never know when an ACK is the final ACK, or when their messenger got captured.

Byzantine generals problem

Now comm. channels are reliable but the processes are not. Now there are n blue armies around Val d’Nuria. Communication is done by cellphone, as the generals all like Smartphones and so. However, among the generals there are m traitors (faulty) among the generals, and these are continually feeding the line with contradictory information.
The goal is to exchange information about the size of each army. So each general exchanges its information and by the end, they will each have a vector of length n corresponding to all the armies. If gral i is loyal then the element i is his troop strength, otherwise it’s undefined.
An algorithm by good old Leslie Lamport et al. is shown below, with n=4 and m=1:
  • Step 1(a): general 1 reports 1K troops, 2 reports 2K, 3 lies to everyone and 4 reports 4K troops
  • Step 2(b):  Results are collected as a vector
  • Step 3(c): Every gral. passes his vector from (b) to every other gral. Every gral gets 3 vectors, one from each general. Here general 3 is still a filthy little lier and lies more than ever, with values a-l
  • Step 4: Each gral examines the value of each ith value, if any value has a majority, that value is used. If no value has a majority, the position is marked UNKNOWN.
  • Result: (1,2,UNKNOWN,4)

In a system with m faulty processes, agreement can be achieved only if 2m+1 correctly functioning processes are present for a total of 3m+1.

If you like this post and nerdy apparel. Check this out!

About marianovalles

My name is Mariano Vallés. I'm originally from Mendoza, Argentina. I'm a software engineer and an EMDC (European Master in Distributed Computing) graduate. In 2010 I moved to Barcelona, then I moved to Stockholm and now I live in Berlin and work for Wooga as a Backend Engineer.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: