Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I am also a huge fan of Lamport. Paxos is so concise and of such beauty that once you understand it, all other consensus protocols seem just different ways of phrasing Paxos with additional features (e.g. membership changes, make it into a log, etc.). Paxos is the smallest, most elegant consensus building block that I have seen so far. Raft's protocol covers a much bigger area than just consensus (it gives you a log, supports reconfiguration, etc.). Paxos' entire protocol fits a single slide. Period.

I wrote a few posts about how to understand paxos, and this is my personal favorite – https://blog.the-pans.com/understanding-paxos/.

Lamport did comment on it (via email) saying that understanding Paxos is less important than proving it works (via TLA+).



Your view that Paxos is an algorithm for read-modify-write is a really deep insight. Once you adopt this point of view, things like Raft are just RMW updates of a log. (I like to think of Paxos as the fault-tolerant version of the load-linked/store-conditional pair of instructions.)

However, there is a difficulty in expressing exactly what RMW-Paxos does, because it may end up "modifying" a value that was never "committed".

For example, assume that I have three acceptors storing a replicated counter. The proposer executes phase-1, increments the value received from the acceptor with the highest ballot, and sends the value to all acceptors in phase-2. Consider now an execution where all acceptors initially hold 0. The proposer manages to store a new value 1 in one acceptor, and then crashes. Thus, 1 was not committed and a reader may conclude that the consensus value is still 0. Now a new proposer arrives, receives (1, 0, 0) from the three acceptors, chooses 1 as the newest value, and writes (2, 2, 2) to all acceptors. Thus "2" is now committed, and the commit of the "2" retroactively commits the "1". Thus, the condition for being committed is no longer "there exists a phase-2 quorum ...", but it is more complicated and depends on future history.

I have verified with TLA+ that this algorithm does indeed implement an atomic read-modify-write operation, but only under a complicated notion of being "committed" that boils down to either having a phase-2 quorum, or one of the successor operations having a phase-2 quorum.

Did you ever figure out a simpler way to say this?


Instead of Read-Modify-Write, I like to think of it as Read/Determine/Write cycle.

A note worthy point is that, Paxos actually expects proposers to drop their value and carry-forward someone-else's value to complete the consensus. This is typically not an intended behavior from client's point of view, but proposers role is to form the consensus, not push their value.

Read phase is basically understanding if there is any value that could've been chosen when F nodes are unavailable.

Determine phase is to choose to carry-forward any value (if-any) from the Read phase or use the new value from the client.

Write phase is to actually ask majority of the acceptors to save the value selected in the Determine phase.


You are totally correct that in Paxos the proposer is supposed to "carry-forward" the value proposed by somebody else. In this view, which is how everybody understands Paxos, Paxos is an algorithm for consensus.

However, the proposer can also modify the value proposed by somebody else, and it turns out that this is a perfectly valid algorithm for implementing a distributed read-modify-write. This variant does more than just consensus---it basically behaves like a fault-tolerant memory with an atomic update operation (think compare-and-swap or load-linked/store-conditional). In his blog post, GP also mentions that paxos is a read-modify-write transaction, but perhaps I read too much into what he is saying. Either way, this RMW variant is correct, I have verified it exhaustively with TLA+, and it is used in a few production systems that I have implemented.

The difficulty is now to say exactly what this RMW variant does. I was hoping that GP (or anybody else) may have some insights.


In case anyone wants to read more about the RMW variant, this is called CASPaxos: https://arxiv.org/abs/1802.07000

I first learned about this as a project called gryadka and I had trouble believing that extending Paxos from consensus (write once) to a full linearizable register worked until model checking a TLA+ spec for it that I wrote.


> Lamport did comment on it (via email) saying that understanding Paxos is less important than proving it works (via TLA+).

I suppose that's not wrong; if I know for a fact that something is going to behave how I expect it too, I don't generally care how it works. I can treat the algorithm as a black box and just go from there.

That said, I dont' completely agree. If I am implementing Paxos in NewLangOfTheMonth, it's likely that a straight one-to-one port of another version will not fully work due to tiny differences in the language semantics. If I don't understand Paxos, then it's unlikely I'll be able to correct any mistakes in the implementation.


Yeah I agree. I think understanding the protocol makes it much less likely to introduce protocol level bugs during implementation. E.g. why certain steps need to be done in certain order.


Agreed, being able to trust it is probably what I want in order to use it at my job to get shit done - being able to understand it/implement it is what I want to do on my own time.

Unless you're one of the rare group who gets paid to implement it in x language/library/technology, it's more of a mental exercise than anything, something nice to go "ahhh cool!" when you understand it, then move on.


Re: "proving it works", in a previous interview Lamport said that some folks measure simplicity by showing something to people and they say whether or not they understand it, while he measures simplicity by showing something to people and they can prove it's correct.


Allowing me to bikeshed for a second, was the progress bar a stylistic choice on this blog or part of some default styling? I keep seeing it pop up on random sites and I can't help but think people are developing on browsers that elide the scrollbar. Adding it back in seems like a signal that it was actually useful information for the user. I know safari removes it, not sure about chrome. Firefox does not remove it on desktop but does on mobile.


Their slide [1] looks like just a reworded version of yours [2]?

[1]: https://martinfowler.com/articles/patterns-of-distributed-sy...

[2]: https://blog.the-pans.com/paxos-explained/#paxos


I totally agree. Paxos is such a simple solution that it is so amazing.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: