From the beginning I've understood MongoDB to be built with horizontal scaling as its approach to scaling, performance, redundancy, and backup. They recently added journaling for single-server durability, but before that, replication was how you made sure your data was safe.
It seems to me that most complaints about MongoDB come from people who don't want to scale it horizontally and instead believe vertical scaling should be more viable.
It just seems to me that people don't like how MongoDB is built, but used as intended, I think MongoDB performs as advertised. In most cases it's not the tool; it's the one using it.
I'm not at all against horizontal scaling. However, I don't believe that horizontal scaling should be necessary at a mere 200 updates per second to a data store that isn't even fsyncing writes to disk.
Think of it in terms of cost per op. Let's just say 200 update ops per second is the point at which you need to shard (not scientific, but let's use that as a benchmark, since that's what we saw at Kiip). MongoDB likes memory, so let's use high-memory AWS instances as a cost benchmark. I think this is fair, since MongoDB advertises itself as a DB built for the cloud. The cheapest high-memory instance is around $330/month.
That gives you a cost per op of 6.37e-5 cents per update operation.
Let's compare this to PostgreSQL, which we've had in production for a couple months at Kiip now. Our PostgreSQL master server has peaked at around 1000 updates per second without issue, and also with the bonus that it doesn't block reads for no reason. The cost per op for PostgreSQL is 1.27e-5 cents.
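The arithmetic above can be sketched quickly. This is a back-of-the-envelope calculation; the $330/month price and the 200 vs. 1000 updates/sec thresholds are the rough figures from our own experience, not benchmarks:

```python
# Cost per update op, given a monthly instance price and the sustained
# update rate at which that instance topped out. Figures are illustrative.

def cost_per_op_cents(monthly_dollars, ops_per_sec, days=30):
    """Cents per operation for a server at the given sustained rate."""
    ops_per_month = ops_per_sec * 86400 * days
    return monthly_dollars * 100 / ops_per_month

mongo = cost_per_op_cents(330, 200)       # ~6.4e-5 cents per update
postgres = cost_per_op_cents(330, 1000)   # ~1.3e-5 cents per update
print(mongo, postgres)
```

At the same instance price, the roughly 5x difference in sustained update rate is exactly the 5x difference in cost per op.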
Therefore, if you're swimming in money, then MongoDB seems like a great way to scale. However, we try to be more efficient with our infrastructure expenditures.
I like your post, and I agree with your conclusion, but I have to say I'm puzzled by your decision to back MongoDB with EBS. Were you running MongoDB atop EC2 instances as well? Can you elaborate on this a little?
We were running MongoDB atop EC2 instances. We chose to back MongoDB with EBS because that was the only reasonable way to get base backups (via snapshots) of the database. Although 10gen recommends using replica sets for backup, we also wanted a way to durably back up our database, since there was so much important data in it (user accounts, billing, and so on).
On the other hand, we run PostgreSQL straight on top of a RAID of ephemeral drives, which has had good throughput compared to EBS so far. We're able to do this because PostgreSQL provides a mechanism for doing base backups safely without having to snapshot the disk[1]. Therefore, we just do an S3 base backup of our entire data set (which uploads at 65 MB/s from within EC2) fairly infrequently, while doing WAL shipping very frequently.
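For anyone curious what the WAL-shipping side looks like, here is a minimal postgresql.conf sketch; `wal-to-s3` is a placeholder for whatever upload script you use, not an actual tool:

```
# postgresql.conf -- continuous archiving (WAL shipping)
wal_level = archive                    # generate WAL suitable for archiving
archive_mode = on
archive_command = 'wal-to-s3 %p %f'   # placeholder: ship each WAL segment to S3
```

PostgreSQL substitutes `%p` with the path of the segment file and `%f` with its file name, and only considers a segment archived when the command exits 0.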
You can either do LVM snapshots (with journaling) on the ephemeral drives, or use mongodump with the oplog option to get consistent "hot" backups. The downside of mongodump is that it churns your working set.
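As a rough sketch of the mongodump route (the host, port, and output path are illustrative; `--oplog` requires an oplog, i.e. a replica set member):

```python
# Build a mongodump invocation for a consistent "hot" backup.
# Assumes the mongodump binary is on PATH; values below are examples only.
import subprocess

def mongodump_command(host="localhost", port=27017, out_dir="/backups/mongo"):
    """mongodump with --oplog: captures oplog entries written during the
    dump, so mongorestore can replay them to a consistent point in time."""
    return [
        "mongodump",
        "--host", host,
        "--port", str(port),
        "--oplog",        # include the oplog slice for point-in-time consistency
        "--out", out_dir,
    ]

# subprocess.check_call(mongodump_command())  # uncomment to actually run it
print(mongodump_command())
```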
Interesting. Thanks for the reply and breaking it down the way you did. That provides some serious food for thought. Looking forward to your next post on the rationales for the other data stores.
Can you give any sort of indication of the value of a schemaless database and the flexibility it provided as the team fleshed out the data model? Was this a mere convenience over traditional schema migration or something more?
While I agree that you should definitely use your tools in the best way you're capable of, I think for most people there's a baseline expectation that if you save data in a database, that data will be safe (at the very least, recoverable) unless something happens like the server catching on fire.
Nearly every other major database has this as the default -- MySQL, PostgreSQL, CouchDB, and Berkeley DB to name a few. (Redis doesn't, but it's also very upfront about it, and has provided this kind of durability as an option since early on.) So when MongoDB breaks this expectation, and when asked to support it as an option, just says, "That's what more hardware is for," it's a pretty big turnoff.
The Mongo client's "safe" option causes the client to check whether the database reported an error and raise it itself. As someone mentioned, it mostly falls on the Mongo clients to implement this. We mostly use fire-and-forget for our application, since it's just logging stats and speed matters more than the occasional lost increment. There should probably be better documentation telling people to always use safe operations for important data.
There is also the durability issue. Early versions shipped with single-server durability turned off and required replication to keep data safe. Mongo has had the journal feature since 1.8 and has had it enabled by default since 1.9.2 (the current version is 2.0).
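For reference, turning the journal on explicitly looks like this in mongod.conf (on 1.9.2+ it's already the default, so this only matters on older versions):

```
# mongod.conf -- enable the write-ahead journal for single-server durability
journal = true
```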
So while mongo has definitely been unsafe in the past, both kinds of safety are now supported, and one is default. The other is either not that big a deal or egregious, depending on the way you view mongo.
Care to explain? I believe for Redis, "appendfsync everysec" is the default. The poster's point was that MySQL and Postgres both ship with something like "appendfsync always", and you have to opt in to the less safe mode if you want more performance. Redis ships with the less safe mode pre-selected, and so has higher performance out of the box.
You're right. The Postgres equivalent to "appendfsync always" is "synchronous_commit = on", which AFAIK is the default.
However, one of the nice things about Redis is that even if you run "appendfsync everysec", you never run the risk of corruption. Your only risk is losing a maximum of two seconds' worth of data.
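The trade-off being discussed maps directly onto the redis.conf AOF settings (all three are real appendfsync modes):

```
# redis.conf -- AOF durability settings
appendonly yes
appendfsync everysec   # the default: fsync roughly once per second
# appendfsync always   # fsync on every write -- the Postgres/MySQL-like choice
# appendfsync no       # never fsync; leave flushing entirely to the OS
```

With everysec, a crash loses at most the writes from the unsynced window, but the append-only file itself stays replayable rather than corrupted.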
The behavior described in the article, though, is just a fundamental misuse of the hardware resources. If I'm hitting write bottlenecks, I need to shard: that's just a fact of reality. However, write throughput and read throughput should be unrelated (durable writes are bottlenecked by disk, while reads of hot data are bottlenecked by CPU/RAM): if I have saturated five machines with writes, I should have five machines' worth of read throughput, and that is a /lot/ of throughput. With the behavior in this article, you have no read capacity, because you are blocked by writes, even if the writes are to unrelated collections of data (which is downright nonsensical behavior). Even with more machines (I guess, twice as many), the latency of your reads is now being affected horribly by having to wait for the write locks. It's just a bad solution.
I think the article is a fair criticism, and I think your response is likewise fair.
Mongo was built with horizontal scaling in mind, and to that end, it tends to suffer noticeably when you overload a single node. Things like the global write lock and single-threaded map/reduce are definitely problems, and shouldn't be pooh-pooh'd away as "oh, just scale horizontally". Uncompacted key names are a real problem, and a consequence of it is that Mongo tends to take more disk/RAM for the same data (+indexes) than a comparable RDBMS does. Maintenance is clunky - you end up having to repair or compact each slave individually, then step down your master and repair or compact it, then step it back up (which does work decently, but is time consuming!)
None of these are showstoppers, but they are pain points, especially if you're riding the "Mongo is magical scaling sauce" hype train. It takes a lot of work to really get it humming, and once you do, it's pretty damn good at the things it's good at, but there are gotchas and they will bite you if you aren't looking out for them.
Mongo was not designed with horizontal scaling in mind. Riak, Cassandra, HBase, Project Voldemort...these are the projects that were designed with horizontal scaling in mind (as evidenced by their architectures.) But not Mongo.
I have to respectfully disagree. Your comment is a bit sweeping, and I really don't think MongoDB's sharding solution is a bad one; it's simply that the strategies are different.
There is a large set of nice features that makes Mongo, for most people, nice to use long before you ever need to address sharding. The percentage of people who will need to shard is much lower than the percentage of people who can get considerable benefit from how you can query data in MongoDB, for example, vs. Riak.
Mongo's sort of a weird halfway point between "totally horizontal" stores like Riak or Cassandra, and "Hope you have a ton of RAM" stores like your traditional RDBMS setup. I think it's hard to look at it and say that it wasn't designed with horizontal scaling in mind, but it's totally fair to say that it wasn't designed purely for horizontal scaling, either.
You're right that MongoDB is designed from the start for horizontal scaling, and the author could have better articulated his reluctance to add machines.
However, his suggestions would improve MongoDB performance in all configurations, not just single-node ones.
The MongoDB guys did a great job of simplifying the internal architecture and focusing on a great API to get the product out quickly. I'm confident they can make these sorts of internal improvements to broaden the use of MongoDB.