Tools of trade?

Seems interdependency from other projects will be my thing for the next few months.
Right now I’m just and not only looking and trying to grasp concepts from:

  • Gradle[1]: as a build automation tool
  • Giuce [2 ]: as a dependency injection framework
  • s4-meter[3]: as the stress tool that’s currently being used.
  • JMeter[4,5]: to try to and change the stress testing of the S4 platform
  • JMeter plugins[6]: as suggested by Marcus (log.scalethis.se)
  • Spring Framework[7]: as another dependency injection platform
  • and some others include:
    • Kryo[8]: serializer/deserializer
    • ZooKeeper [9]: coordination tool.

[1] http://gradle.org/docs/current/userguide/tutorial_java_projects.html

[2] http://code.google.com/p/google-guice/wiki/GettingStarted
[3] https://github.com/matthieumorel/s4-meter/

[4] http://jmeter.apache.org/usermanual/boss.html
[5] http://jmeter.apache.org/usermanual/build-web-test-plan.html

[6] http://code.google.com/p/jmeter-plugins/
[7] http://www.springsource.org/
[8] http://code.google.com/p/kryo/wiki/BenchmarksAndComparisons

[9] http://dl.acm.org/citation.cfm?id=1855851


Agreement in Faulty Processes, another reason to toast to Leslie

Matters are complicated if we demand that a process goruo reaches an agereement. This is done for electinfg a coord, deciding or not whether to commit a Tx, and synchronizing. This is not that simple when comm. and procs. are not perfect. The general goal is to reach consensus, even with the faulty nodes in a finite set of steps.

The two-army problem

Red army 5000 troops is in a valley, let’s say Val d’Nuria.
Two blue armies are camped on the hills overlooking the valley, and are 3000 strong
If both blue armies attack at the same time, they will result victorious, however if only one attacks, math says:
5000 vs. 3000 = Slaughter!
The problem is that the communication channel is unreliable, the blue army messenger can be captured by the red army on his way down the hill to the other side. It’s all about ACKs! They may never know when the ACK is the final ACK, or when this messenger got captured.

Byzantine generals problem

Now comm. channels are reliable but the processes are not. Now there are n blue armies around Val d’Nuria. Communication is done by cellphone, as the generals all like Smartphones and so. However, among the generals there are m traitors (faulty) among the generals, and these are continually feeding the line with contradictory information.
The goal is to exchange information about the size of each army. So each general exchanges its information and by the end, they will each have a vector of length n corresponding to all the armies. If gral i is loyal then the element i is his troop strength, otherwise it’s undefined.
An algorithm by good old Leslie Lamport et al. is shown below, with n=4 and m=1:
  • Step 1(a): general 1 reports 1K troops, 2 reports 2K, 3 lies to everyone and 4 reports 4K troops
  • Step 2(b):  Results are collected as a vector
  • Step 3(c): Every gral. passes his vector from (b) to every other gral. Every gral gets 3 vectors, one from each general. Here general 3 is still a filthy little lier and lies more than ever, with values a-l
  • Step 4: Each gral examines the value of each ith value, if any value has a majority, that value is used. If no value has a majority, the position is marked UNKNOWN.
  • Result: (1,2,UNKNOWN,4)

In a system with m faulty processes, agreement can be achieved only if 2m+1 correctly functioning processes are present for a total of 3m+1.

If you like this post and nerdy apparel. Check this out!
http://www.spreadshirt.com/never-trust-byzantine-generals-bags-C3376A7439092

Back to basics. Today: Recovery

My thesis work heavily involves dealing with the recovery process of the system.
Here on I have extracted some interesting points from Tanenbaum et al. “Distributed Systems
Recovery
Recovery, from an error (where the error is the part of the system that may lead to an failure), is fundamental to fault tolerant systems.
There can be two types of recovery mechanisms:
  • Backward recovery: Take a system from a current erroneous state to a previous correct state. In this scenario a checkpointing (record sys state from time to time) mechanism is needed.
  • Fwd recovery: When the sys finds an erroneous state, an attempt is made to take the sys to a correct new state. Woo! This one is a hard one to implement. This needs the system to know what errors can occur in advance.

For example. In case of a lost packet in a communication process, bwd recovery would be to retransmit the packet, thus returning to the previous correct state. On the other hand, erasure coding would be the fwd recovery approach. (We, Lalith and me) worked in an erasure coding toy application as part of our “Implementation of Distributed Systems || Advanced Topics in Distributed Systems” in KTH, Sweden. Code is here)

No hay problema? Yeah, right.

  • It is costly in terms of perfomance for a Dist. System to return to a previous state.
  • As recovery mechanisms are independent of the systems, it is hard to guarantee that the error/s will not happen again.
  • Some states cannot be rolled back. Imagine an error leading to a 1000EUR transfer off an account.

To balance some of these problems, some systems combine checkpointing with message logging, since this allows the system to completely recover from process crashes plus, it allows the checkpointing frequency to be lower. Two approaches can be taken regarding message logging:

  1. sender-based logging: after a checkpoint has been taken, a process logs its messages before sending them off.
  2. receiver-based logging: the receiving process first logs an incoming message before delivering to the app it’s executing.

A artistic ASCII art representation of the benefit of checkpointing + msg loggin is shown below:

—–0—|-|-|-X
where:
0 : checkpointed state
| : logged messages
X : process crash.

It is easy to observe that when the system crashes, it can be restored, not only to the checkpointed state, but also to replay the logged messages and actions to recover almost completely.

A survey (of 44 pages!) is available if interested in exploring this topic.

To handle checkpointing there are two main techniques:

  • Independent checkpointing: each process/ node handles their own independent checkpointing mechanism. This has the problem of handling inconsistent states when failures occur, since P1 may have checkpointed a state which is not consistent with the one saved by P2 and so on.
  • Coordinated Checkpointing: All proces synchronize jointly and write their state to local stable storage. This gives a globally consistent state, where no cascaded rollback can happen (domino effect) at the cost of handling coordination in a distributed system, which gets its own chapter in the book.

Follow

Get every new post delivered to your Inbox.