Two weeks ago, my client’s datastore indexes became corrupted. The failure was demonstrably AppEngine's. Attempts to manually rebuild indexes failed. The AppEngine team worked for several days to resolve the problem, which we appreciated. But such failures are disheartening when you’re busy signing up paid customers.
I suppose the term “beta” is so overused that it has nearly lost its meaning. Sometimes, however, it still means “warning, ye who enter here.”
Basically, some combination of puts to one of our models would cause queries to return inconsistent results. For example, the query:
SELECT * FROM MyModel WHERE property=True
would return N results, but
SELECT * FROM MyModel WHERE property=True ORDER BY foo
would return far fewer than N results.
It appears that _something_ we were doing exercised an AppEngine bug that caused the indexes to get corrupted. (Note that this would happen even from AppEngine's data viewer -- which is to say, we got N by paging through ALL results, not just one fetched batch.)
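To make the symptom concrete, here is a rough sketch of how one might list the rows that the ordered query drops. It uses a toy in-memory stand-in, not the AppEngine datastore API: `ToyStore`, `foo_index`, and `find_missing` are all made-up names, and the real datastore serves both queries from indexes in ways this toy does not model.

```python
# Toy stand-in for a datastore. This is NOT the AppEngine API; every name
# here is hypothetical and exists only to illustrate the symptom.
class ToyStore:
    def __init__(self):
        self.rows = {}          # key -> entity dict (the source of truth)
        self.foo_index = set()  # keys present in the index used by ORDER BY foo

    def put(self, key, entity):
        self.rows[key] = dict(entity)
        self.foo_index.add(key)  # a healthy put always updates the index

    def query(self):
        """Roughly: SELECT * FROM MyModel WHERE property=True"""
        return [k for k, e in self.rows.items() if e["property"]]

    def query_ordered(self):
        """Roughly: ... ORDER BY foo, answered only from the index."""
        keys = [k for k in self.foo_index if self.rows[k]["property"]]
        return sorted(keys, key=lambda k: self.rows[k]["foo"])


def find_missing(store):
    """Keys returned by the plain query but absent from the ordered one."""
    return sorted(set(store.query()) - set(store.query_ordered()))


store = ToyStore()
for i in range(5):
    store.put("k%d" % i, {"property": True, "foo": 5 - i})

# Simulate the corruption: two rows silently fall out of the index.
store.foo_index.discard("k1")
store.foo_index.discard("k3")

missing = find_missing(store)
print(missing)  # ['k1', 'k3']
```

The point is only that the difference between the two key sets identifies the "broken" rows; against a real datastore you would have to page through all results of both queries to build those sets.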
We worked with Google, but designing a simple repro case proved difficult -- the corruption happened only under high load, and only on one of our models.
Trial and error showed us that if we manually fetched and re-put() the "broken" rows in our table (that is, the ones that didn't show up when we used ORDER BY), the indexes would right themselves. So I wrote a bit of code to do this, and started running a cron job on a local box to invoke that code regularly. Lame, but only temporary.
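The workaround is roughly the following shape. This is a sketch under assumptions, not the code we actually ran: `repair_index` and its four hooks (`all_keys`, `ordered_keys`, `get`, `put`) are hypothetical names, and the demo uses plain dicts rather than the datastore.

```python
def repair_index(all_keys, ordered_keys, get, put):
    """Re-put every entity that the ordered query fails to return.

    all_keys and ordered_keys are zero-argument callables returning iterables
    of keys; get(key) returns an entity and put(key, entity) writes it back.
    All four are hypothetical hooks, not part of any real API.
    Returns the sorted list of keys that were rewritten.
    """
    broken = sorted(set(all_keys()) - set(ordered_keys()))
    for key in broken:
        # Writing the entity back unchanged is what nudges the index to include it.
        put(key, get(key))
    return broken


# In-memory demo: "b" and "c" have fallen out of the index.
rows = {"a": {"foo": 2}, "b": {"foo": 1}, "c": {"foo": 3}}
index = {"a"}


def put(key, entity):
    rows[key] = entity
    index.add(key)


def ordered():
    return sorted(index, key=lambda k: rows[k]["foo"])


fixed = repair_index(lambda: list(rows), ordered, rows.__getitem__, put)
print(fixed)  # ['b', 'c']
```

Running it a second time finds nothing to fix, which is why invoking it from a cron job on a schedule is harmless, if inelegant.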
Last time we exchanged mail with the AppEngine team, they told us they were aware that this was a bug on their end and were working on a fix. However, to my knowledge, they're still working on it...
I believe the AppEngine datastore is stated to be built on Bigtable. I suppose the index creation is "newer" code than the core of Bigtable, but I'd like to hope that Google's Bigtable code is bug-free at this point...
Nothing is ever bug-free. The only thing the Bigtable team can hope for is "statistically unlikely to contain a serious bug in the code paths commonly executed by our high traffic products".
I've also written some high-level thoughts about the "big three" cloud services on my blog:
http://davepeck.org/2008/11/30/the-clouds-spectrum/
http://davepeck.org/2008/12/03/the-clouds-dark-lining/