
> Each client in the Jepsen test harness is (independently) scheduling n writes per second.

Umm... that's kind of an important detail. I'm betting that your mechanism for achieving this effectively synchronizes your clients to act in concert, or at least as close to "in concert" as is possible for your clock to measure. That explains the probability.

> There's an interesting probability anecdote called the Birthday Paradox

Yeah, I thought of the Birthday Paradox with this problem, but this is a different variant. The probability that two people in the room share a birthday *and* no one else in the room has a later birthday in the year follows a different distribution.

Try writing a program that spawns 5000 threads, has each one get the current time in microseconds, and write it to a file. You won't have any collisions unless you do some kind of precise coordination between them. In fact, you only have a shot at getting the same timestamp if you call from different threads, because just executing the instructions to read the current time takes long enough that two calls in a row will get different values.
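A quick sketch of that experiment in Python (using a thread pool rather than 5000 raw threads to keep it cheap; the timestamp reads are what matter, not the thread count):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def stamp(_):
    # Read the current wall-clock time in microseconds.
    return time.time_ns() // 1_000

# 5000 concurrent-ish reads of the microsecond clock.
with ThreadPoolExecutor(max_workers=200) as pool:
    stamps = list(pool.map(stamp, range(5000)))

collisions = len(stamps) - len(set(stamps))
print(f"{collisions} duplicate timestamps out of {len(stamps)}")
```

On a clock with real microsecond resolution you'll typically see few or no duplicates, since each read takes on the order of a microsecond anyway; exact results depend on your OS clock granularity and scheduler.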

> TL;DR: microsecond timestamps do not provide sufficient entropy for uniqueness constraints over common workloads.

See, that's the part I have a problem with, because I've had quite the opposite experience (without even having Cassandra involved).



Like I said, mean free paths will vary depending on your write profile. None of this alters my original assertion, which is that row operations are not isolated.


You've got a corner case that is far harder to hit than you think you've measured, and the scenarios where it could happen would almost certainly not have the design required to cause it. Even so, it is addressable.

Based on my own experiences with this scenario, I'd be surprised if you managed to experience any problems if you turned off throttling (and didn't force everything to the same timestamp).

So yeah, you have a scenario that can happen, and I'd recommend anyone who absolutely cannot have that happen either not use Cassandra or design their schema accordingly. Absent that scenario though, row operations are isolated.


You're presuming that all reads occur after the system has quiesced. This is not always the case. I'm happy your write pattern works for you, and that you measure your consistency; I'm just trying to keep folks honest about what their systems actually provide, and to give them tools to analyze those constraints.


> You're presuming that all reads occur after the system has quiesced.

It sure looked like they did, but maybe I misread the code.

> This is not always the case.

Even if that weren't the case, you might check out the probabilities on your birthday problem. What you've got is effectively a calendar with 100 million days (microseconds in the benchmark's 100 seconds) and 5 people (5 writes to the same record). You've managed to end up with those 5 people sharing a birthday well over 1% of the time.
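The textbook birthday-problem arithmetic for those numbers looks like this (a sketch: 10^8 microsecond "days" in the 100-second window, 5 writes):

```python
# Birthday-problem probability that at least two of 5 uniformly random
# microsecond timestamps in a 100-second window (1e8 possible values) collide.
days = 100 * 1_000_000   # microseconds in 100 seconds
writes = 5               # writes to the same record

p_all_distinct = 1.0
for i in range(writes):
    p_all_distinct *= (days - i) / days

p_collision = 1.0 - p_all_distinct
print(f"P(any collision) ~= {p_collision:.1e}")  # on the order of 1e-7
```

So for genuinely uncorrelated timestamps, the expected collision rate is around one in ten million, nowhere near the >1% observed, which is the point: the observed rate implies the writers' clocks were effectively synchronized.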

> I'm just trying to keep folks honest about what their systems actually provide

I appreciate that you've found an interesting corner case that I'd not considered.

Actually, it's not that I hadn't considered it. When creating client-side timestamps I do tend to think about this scenario, but with server-side timestamps I tend to think of it much like collisions on Type 1 UUIDs, when in truth the probabilities are higher.

I do think Cassandra ought to consider either a) using Type 1 UUIDs or a similar strategy to make the risk of these collisions all but ridiculous, or b) resolving ties in cell timestamps by choosing a winner based on node, thread, or something else tied to the update operation, rather than the value in the cell. That would avoid this scenario in a fashion more fitting with the rest of the semantics of the system.
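To illustrate what option (a) buys you: Python's stdlib `uuid.uuid1()` generates Type 1 UUIDs, which is roughly the shape of guarantee being suggested (Cassandra's `timeuuid` is the analogous column type; this sketch only shows the uniqueness mechanism, not anything Cassandra-specific):

```python
import uuid

# Type 1 UUIDs combine a 60-bit, 100-nanosecond-resolution timestamp with
# a node identifier and a 14-bit clock sequence, so writers that land on
# the same clock tick still produce distinct values.
a = uuid.uuid1()
b = uuid.uuid1()

print(a, b)
print(a.time, b.time)  # the embedded 60-bit timestamps
```

Two generators on different nodes differ in the node field; two calls on the same node differ in timestamp or clock sequence, so ties in the raw clock no longer produce ties in the ordering key.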

> give them tools to analyze those constraints

I think unfortunately in this case the analysis of those constraints is flawed.


Finding corner cases is the point of Jepsen :)


> Finding corner cases is the point of Jepsen :)

And with that it obviously did a great job. The probability of hitting those corner cases is, unfortunately, completely misrepresented.

I'd worry though that this distortion of the probabilities might mean it also doesn't find other kinds of corner cases.



