- Run transactions to completion, single-threaded, in timestamp order [will use multiple cores, with each having a single thread]
- Only data changed within a single invocation of a stored procedure is in a transaction; transactions can't span multiple rounds of communication with a client.
- You are also discouraged from doing SUM operations because it would take a long time and block other transactions.
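To make the first two bullets concrete, here is a minimal Python sketch of a partition that runs queued procedures to completion, one at a time, in timestamp order. This is not VoltDB code: `Partition`, `deposit`, and `failing_withdrawal` are hypothetical names, and the snapshot-based rollback is an assumption chosen only to keep the example short.

```python
import heapq
from typing import Callable, Dict, List, Tuple

Procedure = Callable[[Dict[str, int]], None]


class Partition:
    """Hypothetical single-threaded partition; not VoltDB's actual implementation."""

    def __init__(self) -> None:
        self.data: Dict[str, int] = {}
        self._pending: List[Tuple[int, int, Procedure]] = []
        self._seq = 0  # tie-breaker so equal timestamps keep submission order

    def submit(self, timestamp: int, procedure: Procedure) -> None:
        heapq.heappush(self._pending, (timestamp, self._seq, procedure))
        self._seq += 1

    def run_all(self) -> List[int]:
        """Run every queued procedure to completion, lowest timestamp first."""
        executed = []
        while self._pending:
            timestamp, _, procedure = heapq.heappop(self._pending)
            snapshot = dict(self.data)  # toy undo log (assumption, for brevity)
            try:
                procedure(self.data)
            except Exception:
                # One invocation is the whole transaction, so undo all of it.
                self.data.clear()
                self.data.update(snapshot)
            executed.append(timestamp)
        return executed


def deposit(amount: int) -> Procedure:
    def procedure(data: Dict[str, int]) -> None:
        data["balance"] = data.get("balance", 0) + amount
    return procedure


def failing_withdrawal(data: Dict[str, int]) -> None:
    data["balance"] = data.get("balance", 0) - 1000
    raise ValueError("insufficient funds")
```

Because nothing else touches `data` while a procedure runs, there is nothing to lock. The flip side is the third bullet: a long-running procedure (a big SUM, say) holds up everything queued behind it.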
I don't see how this is different than a NoSQL database. You cut a number of features (some critical to certain applications) from a relational database and get a bastardized version of a major database. It's a NoSQL database that uses a subset of the SQL standard and an enforced schema!
I find these discussions extremely annoying. I'm currently using MongoDB for my application because:
1. I don't need join support
2. I prefer my current schema to be denormalized.
3. Documents store better than rows for my data.
I could have used a relational database just fine. My data will fit with a little nudging.
The point is, you use the technology that best fits your problem. My current problem fits well into MongoDB but it could be solved less nicely with a different database.
VoltDB is just another option if you have corners you can cut from the normal relational database model.
FYI: Relational databases don't require transactions of any kind. SQL-92 does require transactions; however, it does not require them to support multiple rounds of communication with the client.
Suggesting there is no difference between a key-value store and a SQL database capable of ad hoc queries, transactions, and joins is ridiculous.
> Suggesting there is no difference between a key-value store and a SQL database capable of ad hoc queries, transactions, and joins is ridiculous.
My intent was to claim that VoltDB is a feature subset of a relational database and is overhyped along the same lines as many NoSQL servers. It's just a different subset of features than the key/value store servers.
My point boils down to this: VoltDB is merely another option if you have corners you can cut from the normal relational database model.
I probably could have been clearer in stating that. Bringing MongoDB into the discussion muddied things.
I get where you are coming from; however, if your application is already using SQL, then the transition to a more limited but faster database is a lot simpler than going to NoSQL.
Also, one of the simple hacks to increase speed is to keep a smaller working-set database on a separate system to handle more recent items. Because it's under a much higher load, the "live" database tends to have a really simple usage pattern, and because of its smaller size it tends to fit into RAM. The "live" DB also tends to have different optimizations (e.g., fewer indexes because you have more writes). Based on this, a "stripped-down" but still SQL database seems like a perfect fit.
PS: It's a lot like Memcached: something that can speed up your application with minimal development time is worth a lot.
For many of my use cases, distributed transaction control is the key feature of any database product. The reason there are so many products out there with simple transaction models is that it is not that hard to write one.
It is really hard to do the distributed transaction thing [1] in a horizontally scalable fashion.
The problem I have with this approach is that I don't know from the start what my needs are.
For example, if I'm building a web app in my free time ... I'm starting with MySQL and an ORM client like Django's, and I'm doing data modelling that fits these tools.
Then if I want to switch to MongoDB when the need arises, I end up throwing away a lot of code and rewriting lots of logic. This means wasted hours, and my free time for working on such stuff is very limited.
In lots of scenarios it's easier for me to just do sharding where I need it on top of MySQL.
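For what it's worth, application-level sharding can start out very small. A rough Python sketch, assuming rows are keyed by a user ID and that a fixed list of MySQL hosts is enough (the host names and the `shard_for` helper are made up for illustration):

```python
import hashlib
from typing import Sequence

# Hypothetical shard hosts; replace with real connection settings.
SHARDS = (
    "mysql-shard-0.example.internal",
    "mysql-shard-1.example.internal",
    "mysql-shard-2.example.internal",
)


def shard_for(user_id: str, shards: Sequence[str] = SHARDS) -> str:
    """Map a key to one shard using a hash that is stable across processes."""
    digest = hashlib.md5(user_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]
```

The catch with plain modulo hashing is that changing the number of shards remaps most keys, so this only stays easy while the shard count is fixed.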
And that's what RDBMSs are good for ... they come with lots of crap you don't need but may eventually need, which makes them fit most problems unless you have extreme scalability needs, and even then you've got workarounds to turn to.
How do you know if / where you'll have scalability problems? And how do you know what to choose when you're doing exploratory programming?
> How do you know if / where you'll have scalability problems?
You won't.
Just code for the model that best fits your data. 95% of the time an RDBMS will work just fine for your data. If you have scaling issues down the road, chances are you'll be working on this full-time and will have the energy to devote to properly thinking about scaling. You aren't Google or Facebook (yet). Until you are, just work with what best fits your data or is fastest to code.
Well, the one-thread-per-region thing is an interesting way of sidestepping locks. So I'll give them that.
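A toy Python sketch of the idea, as I understand it: each partition gets exactly one worker thread that owns its data outright, and requests reach it through a queue. The names here (`PartitionWorker`, `route`, `run_demo`) are invented for illustration and say nothing about how VoltDB implements it; the only synchronization is inside the queue, never around the data.

```python
import queue
import threading
import zlib
from typing import Dict, List


class PartitionWorker(threading.Thread):
    """One thread that is the sole reader and writer of its own dict."""

    def __init__(self) -> None:
        super().__init__(daemon=True)
        self.inbox: queue.Queue = queue.Queue()
        self.data: Dict[str, int] = {}

    def run(self) -> None:
        while True:
            job = self.inbox.get()
            if job is None:  # sentinel: stop the worker
                break
            key, delta = job
            # No lock needed: only this thread ever touches self.data.
            self.data[key] = self.data.get(key, 0) + delta


def route(workers: List[PartitionWorker], key: str) -> PartitionWorker:
    """Send every operation on a given key to the same partition."""
    return workers[zlib.crc32(key.encode("utf-8")) % len(workers)]


def run_demo(num_partitions: int = 4, increments: int = 1000) -> Dict[str, int]:
    workers = [PartitionWorker() for _ in range(num_partitions)]
    for worker in workers:
        worker.start()
    for _ in range(increments):
        for key in ("a", "b", "c"):
            route(workers, key).inbox.put((key, 1))
    for worker in workers:
        worker.inbox.put(None)
    for worker in workers:
        worker.join()
    merged: Dict[str, int] = {}
    for worker in workers:
        merged.update(worker.data)  # each key lives on exactly one partition
    return merged
```

Anything that needs keys from two partitions at once falls outside this sketch, which is where the hard part of the design lives.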
But it's basically memcached with an SQL parser and some hand-wavy stuff about how, since it's replicated, the data never needs to make it to disk. That one doesn't sit well with me for some reason, even if it's theoretically true.
> Hey Mike, I'll get in contact with you shortly once I pull some logs. We've been analyzing the data and it seems that where we lose data coincides with some system reboots, we had a few problems with replication a few weeks ago, and had it disabled, so that's likely why we are seeing loss.