Stored procedures may be all of those things, but they don't have to - it's just that most of the time, developers don't really care, so they have fancy versioning, deployment and continuous integration for all their code, except for stored procedures.
Also, DB procedures are not easy to "debug" in the traditional way, but SQL client is basically the first REPL every programmer becomes familiar with. You can easily step stored procedure by running it's commands one by one, unless it's fancy Oracle forall loop with cursor or something (and the cursor select can still be selected as normal).
Also, databases tend to have more strong data types than programming languages in general so putting constraints in DB means, that bad data are not savable in the system.
> so they have fancy versioning, deployment and continuous integration for all their code, except for stored procedures.
Exactly, because those things are extremely important to how code gets shipped and delivers value to the business. Stored procedures become a huge risk to future development which ultimately means it's a risk to the businesses ability to deliver value. What happens when you need to change DB vendors because the business has been so successful that you've outgrown a relational database? You have to rewrite the ENTIRE MC portion of your MVC application. Why would someone ever do this?
Avoid state at all costs. Stored procedures are stateful. Schema and migrations is pain enough already.
Write me a check constraint that validates an email address being put in a varchar column and reports back a sensible message which can be bound to an entry field with metadata about the error.
Write me a constraint and key arrangement which is unique across two and three columns in separate groups.
> Avoid state at all costs. Stored procedures are stateful. Schema and migrations is pain enough already.
What do you mean by that? How is having a bunch of queries in a stored procedure more "stateful" than having the same queries in the application?
> Write me a check constraint that validates an email address being put in a varchar column and reports back a sensible message which can be bound to an entry field with metadata about the error.
Postgres gives you metadata about the error, though the error message will still be a generic "CHECK constraint violated" or some such.
> Write me a constraint and key arrangement which is unique across two and three columns in separate groups.
I'm not sure what you want to see based on that description, but surely you're not advocating enforcing unique constraints in the application?
If I have to load the stored procedure into the persistence engine then that step is required. This is no more stateful than queries in the application but it means that the relevant state in both the application and the database engine needs to be reloaded and constantly sychronised. Ergo, two times the work.
CHECK constraint violated is no good for humans. Prevention is better than cure here.
Why shouldn't I enforce unique constraints in the application?
1. Open a transaction
2. Get a user by name from the ORM.
3. Exists? Tell user that the username is already registered.
4. Doesn't exist? Save new User instance.
5. Commit transaction.
Steps 2 and 3 can be as arbitrarily complicated as you need them to be, are fully testable and cheap with anything that uses MVCC.
"Why shouldn't I enforce unique constraints in the application?"
You should to both. For all the reasons you mention, it's often cleaner to just do it in the application especially when you can use a framework with a simple "validate_uniqueness" flag.
But, what you're describing is also the very definition of a race condition. It's the same reason you don't increment counters by retrieving them, adding 1 to it and then saving the number back to the database and instead pass in an increment command.
Check it in the application but let the database make sure it doesn't get violated in a race condition. There's a significant amount of either/or in this entire conversation (not just you, the whole thread) when the database absolutely can and should be leveraged for certain things.
It's extremism and purism where the problems get introduced (in both directions).
I do most what you say and still think both of you are right.
I mean stored procedures have some use cases but i've seen people using it EVERYWHERE and I've seen people (including myself) NEVER use it.
I mean currently my dataset is so small I don't need stored procedures, I barely do anything more than CRUD. Okay I have a bigger GROUP BY query but that is all, and at one point I load a HUGE dataset into my application memory (1000 rows) but that works REALLY REALLY fast in scala and I tried to create a stored procedure around it, but I failed, and the application code uses the dataset to generate a big calculation. Currently I just have a Map<String, Map<String, List<Row>> which is easy accessible and usable for my calculation. I mean I could've done similar with stored procedures but the performance gains are really low.
For what you're describing it doesn't sound like stored procs are worth it. Avoid introducing them unless you find that they are necessary or beneficial, but don't avoid them entirely on principle.
Preserving data integrity tends to be a much more worthy use case for database logic than retrieval display.
There is a good chance that your proposed algorithm to enforce a uniqueness constraint in the application won't work. As in, you've left out enough details that would be critical for getting it right, and in my experience, a lot of programmers would only get this right by accident if they get it right at all.
First problem is that the SQL standard provides no way to make this work portably on any standards-compliant database. So right there you are going to have to code to the database to one degree or another.
So, let's say you want to make this work in Postgres. Now, you'll need to be using Postgres 9.0 at least; otherwise your uniqueness constraint won't be a uniqueness constraint.
Try this, in any version of Postgres. Open up two psql sessions. In one, run a `create table unique (x text);`. Then run `begin isolation level repeatable read; select * from unique where x = 'foo';` in one of the sessions. Repeat those two commands in the other sessions.
Neither session sees 'foo'. So now both can go ahead and run `insert into unique values ('foo'); commit;`. Both transactions will succeed, and you can confirm that there are now two instances of 'foo' in the table.
In fact, `begin isolation level serializable` in PostgreSQL 9.0 or later is the minimum isolation level to make this work. And, you will need retry logic around the transaction in case of a serialization failure. (Perhaps your DB access layer or language would hide this latter detail from you, or perhaps not.)
In PostgreSQL 8.4 and before, serializable and repeatable read were equivalent, and both were still SQL standards compliant. In PostgreSQL 9.0, the repeatable read isolation level stayed the same, while the serializable isolation level was strengthened.
Unless you can accept a certain level of degraded accuracy by using a probabilistic construct such as a Bloom filter, by far the biggest cost of maintaining uniqueness is the index. And you'll need that index whether you use the database or the application to enforce uniqueness.
And, judiciously pushing computation onto a database can actually be cheaper for the database as well as its clients. This scenario is likely to be one of those situations.
> CHECK constraint violated is no good for humans.
Well, sure, an application should respond to DB errors by presenting appropriate messages on the UI, just like any other errors it encounters. You should only see "CHECK constraint violated" if you are bypassing the app and using the DB. Otherwise, you should see something nice provided by the app.
> Why shouldn't I enforce unique constraints in the application?
Because you should do it in the database whether or not you do it in the application, and then once you have, well, DRY.
> Avoid state at all costs. Stored procedures are stateful.
Stored procedures are no more state than application code is.
> Write me a constraint and key arrangement which is unique across two and three columns in separate groups.
What does "unique across two and three columns in separate groups" mean? I get that its something more complex than a simple multicolumn uniqueness constraint, but not what it is supposed to do.
I suspect that whatever it is can be done with PostgreSQL -- possibly using the (relatively) new exclusion constraints -- but I can't quite be sure without more clarity on what you mean.
Here is interesting talk about database migrations and stored procedures and unit tests: http://www.pgcon.org/2013/schedule/events/615.en.html
Also, DB procedures are not easy to "debug" in the traditional way, but SQL client is basically the first REPL every programmer becomes familiar with. You can easily step stored procedure by running it's commands one by one, unless it's fancy Oracle forall loop with cursor or something (and the cursor select can still be selected as normal).
Also, databases tend to have more strong data types than programming languages in general so putting constraints in DB means, that bad data are not savable in the system.