We use stored procedures. Not by choice; this is a legacy we're stuck with. It is nothing but an unadulterated disaster of a technology regardless of what you use it for. I'm talking 45,000 stored procedures here. 2000 tables. TiB of data. 50,000 requests/second across SOAP/web/desktop etc. It's hell.
Problems with stored procedures:
1) Performance. More code running in the hard to scale-out black box. You're just hanging yourself with hardware and/or license costs in the long run.
2) Maintenance. All database objects are stateful i.e. they have to be loaded to work. The sheer complexity of managing that on top of the table state results in massive cost. Add to that tooling, version control costs as well. Have you tried merging a stored procedure in a language which has no compiler and very loose verification?
3) Orthoganality. Nothing inside the relational model matches the logical concepts in your application. Think of transactions, domain modelling etc.
4) Duplication. You still have to write something in the application to talk to every single one of those stored procedures and map it to and from parameters and back to collections.
5) Transaction scoping. Do you know how expensive it is to introduce distributed transactions? Well it's a ton more cash when EVERYTHING is inside that black box.
6) Lock in. Your stored procedures aren't portable. Good luck trying to shift vendor in the future when you hit a brick wall.
Now I know it's popular to bash on Rails and I wouldn't use it personally but there are people using the same model on top of other platforms, like us.
Sorry but databases are just a hole to put your shit in when you want it out of memory. If you start investing too much in all of the specific functionality you're hanging yourself.
Sorry but databases are just a hole to put your shit in when you want it out of memory.
You've got to be joking, right?
Data is an enterprise's single biggest asset. A robust, consistent and performant store is vital. SPs can be written as garbage like any other logic, but in the right hands they are a perfectly valid tool for providing useful access to complex data.
Interestingly some very enterprise products don't even try and use database features like foreign key relationships - it can be a bit of a shock to open a database with many thousands of tables and realise that there is no obvious way to work out how they relate without looking at application level structures.
I'm sorry to hear your anecdote. I would have to come see your particular situation to see exactly what you mean by 1, 2, 5, because those don't seem unsolvable, but in general:
> 3) Orthoganality
You've introduced that by treating a relational store as a "hole to put your shit in". It's not fair to blame the database for that.
> 4) Duplication
Not with the project that is the topic of this thread you don't.
> 6) Lock in
As I mentioned in my original comment, you can be locked to your database or you can be locked to your ORM/DAO/ActiveRecord/DB client library or whatever it is you're using.
Not using database features isn't the key to heaven anymore than using them is the key to hell. I just meant to point out that in my experience they are underused massively.
Certainly not an anedote. I was an Oracle and SQL Server DBA for a number of years amongst other hats on stupid big datasets and loads. Add to that 25 years' experience getting companies out of deep trouble that everyone else has given up on. I know my shit.
Orthogonality: I haven't introduced anything here. Very rarely does any conceptual model of reality fit into the relational model. It's more imperative than that. Everything is usually crudely shoehorned into it because it's a compromise that people are barely willing to make or because they don't understand how to model a system properly.
Duplication: there is duplication in there. The versioning is very inadequate and API stability is the key to success on this. Plus also, this is a minor part of the application to consider. It's no different to issuing SQL. The protocol is different, that is all.
Lock in: There is no ORM lock-in past the platform. If you isolate everything properly i.e. use command-query-separation then this is a non issue. It's trivial to replace the ORM. You can even do it piecemeal. We've done it. I yanked out bastardisd ADO and EF out and stuck Nhibernate in. If you couple all your logic into the database, no banana. That luxury goes out of the window.
They may be underused, but when your vendor pulls a 26% price hike on half a million quid's worth of kit, can you afford to bend over and take it?
It's a tradeoff, but not one I'm willing to make on medium to large scale systems where there is a capital risk.
Interesting, though as this is a way to interface with Postgres I'm not sure your warnings about licensing costs are useful. Still, there are the other reasons you mentioned.
Some organisations won't allow use of a product unless there's a support option. I can understand this after a complete system failure a few years back on an open source unsupported product. I was hired to fix it! :) Mostly though in comes EnterpriseDB and support costs then. Problem is always staff availability here in the UK though so we always end up with SQL Server and Orscle bodies.
How can you compare being locked into a DB versus locked into an ORM ?
One of the main features of an ORM has been abstraction from the intrinsic properties of that database. ORM was a concept that was popularised by the original Obj-C/EOF/WebObjects back in the day which supported retrieving data from any database you pointed it at. And it fully supported you enhancing it's access layer with database specific features.
This is a much longer topic but it boils down to 2 things:
1. Switching your database is not easy with or without stored procedures because it will involve down time for the application while the data is migrated, then verifying that it works as expected in the new database with that ORM. You hope for the best, but it's always more complicated to switch a database.
2. The ORM tends to lock you into the application stack. Switching a part of your application from something like Rails to Go when you need to performance tune is significantly easier and more common than switching the entire database backing the whole system.
Beyond those two are the harsh realities of working with large datasets. As soon as a dataset it non-trivially small relying on the application to do core work on it becomes self destructive by adding network latency and in many cases object creation (check some Rails benchmarks on object creation costs). It becomes a big deal.
This is not to say that doing the bulk of work in the ORM is bad or that everything should be done in the database, it's a matter of balance. The only dangerous opinions on the matter are the "purist to the detriment of all else."
Verifying uniqueness, exclusion and maintaining data integrity should be the job of the database in most cases. That is what it's good at. Performing actual business logic on that data should not unless there is a significant performance based reason for it in most cases.
In Postgres the "stored procedure" thing is a little bit different because they're significantly more valuable thanks to the volume of functionality built into PG. Everything is basically a function in PG.
In PG, you can use functions to create indexes and when the function is used in a where clause that index will be used. You can use functions to create constraints, unique indexes and even notify outside process that are listening of changes in the database with pubsub.
PG is a heck of a lot more than just a "datastore" and that's why these discussions are important. If you want a generic dumb datastore...there are databases built for that. PG is built for a whole lot more than that.
1) Makes no sense whatsoever; stored procedures are typically going to be much faster than the equivalent mix of application-level code written in a scripting language that needs to communicate with a database and is likely vastly slower than PL/pgSQL or PL/SQL. The hardware licensing costs having nothing to do with stored procedures and apply just as much to anything else (there's free as an beer options and costly proprietary ones in either case).
2) This is not a reasonable objection, I could replace "loaded" with "compiled" and your non-argument would make just as much sense. The alternative does not make the "complexity" go away, it just distributes across multiple languages in your application and database.
3) No.
4) Another non-argument against stored procedures. For example, suppose I have a table "time_series(series_id INT, tstamp TIMESTAMP, val NUMERIC)". A common need would be to accumulate all points (tstamp, val) associated with a series_id. Following your logic, you either end up with tons of con the application side sending similar variations of a query that looks like "SELECT tstamp, val FROM time_series WHERE series_id = $x ORDER BY tstamp" or you create one application-level module that acts as an abstraction around a query like that. In the first case, you're doing massive duplication. In the second case, you've essentially made a stored procedure that is distributed across your database and application and all the issues you raised of having to write something to talk to it apply just the same.
5) Again, no. DBs are precisely to place to deal with issues like this as they have means for dealing with things like foreign tables. The application-level alternative just means re-inventing it all yourself and you're probably going to make a lot more mistakes and write a lot more code that way.
6) Non-argument (applies just as well to Rails, Python, Linux, etc.)
You had me until your last sentence. Database are very important to (and very good at) store your data. And data is important (duh). All the issues you described above are related to working with the data, which should not happen in the database.
That's a poor argument because we're not discussing the arbitraty boundary of storing versus working with data as the lines between those are very blurred. This is even more the case when you use stored procedures which work with data close to the storage.
While we're on this subject, RDBMS are no better at storing data than any other technology out there[1]. In fact when you start thinking abstractly like this, other tech such as Riak makes sense for a lot of workloads.
The only real benefits of RDBMS' are fixed schema, fungibility of staff, the ability to issue completely random queries and get a result in a reasonable amount of time and the proliferation of ORMs.
[1] Caveated on insane design decisions like MyISAM storage engine and MongoDB as a whole.
Riak makes sense for some workloads. For the vast majority of workloads out there, you will save time and money with an ACID system. Good ACID systems are mostly RDBMSs, so I would say, for now at least, that typical RDBMSs do have a leg up on other technologies out there.
It seems to me that least in a part of you argument, you seem to be confusing the deficiencies of particular implementations or server-run code with the idea itself. Stored procedures are bad because your RDBMS has a crappy compiler? Hmm...
(One might argue, of course, that SQL in itself turned out to be a lousy interface protocol for relational data processing, and that it caused a lot of pain to begin with. But that's a different topic.)
I'm actually arguing that my RDBMS is no place for a compiler. I want to compile everything externally, test it and then deploy it. Not deploy it, compile it, cross fingers.
Some counter-points I've heard made:
1) Performance. Stored procedures are fast, meaning it will be longer before you need to scale out.
2) Security. If you only use stored procs, you're a lot less exposed to SQL injections etc.
I don't really have a firm opinion either way, but it's not as clear cut as you are making out.
1) Stored procedures aren't that much faster than issuing plain SQL over the connection. The main performance bottlenecks in a RDBMS are cache and IOPS. Regardless of where you execute those, they are all inside that black box after the query is parsed. You also get the added pain of cached query plans which fuck up performance when the query optimiser makes assumptions about table statistics and stores them with a stored procedure. (SQL Server and Oracle at least are guilty of this).
2) The only place I've had SQL injection attacks in the last few years is where people have used dynamic SQL inside stored procedures (sp_executesql) and not escaped them properly. Outside of that, both ORM and proper parameter binding make this entirely irrelevant.
We use stored procedures. Not by choice; this is a legacy we're stuck with. It is nothing but an unadulterated disaster of a technology regardless of what you use it for. I'm talking 45,000 stored procedures here. 2000 tables. TiB of data. 50,000 requests/second across SOAP/web/desktop etc. It's hell.
Problems with stored procedures:
1) Performance. More code running in the hard to scale-out black box. You're just hanging yourself with hardware and/or license costs in the long run.
2) Maintenance. All database objects are stateful i.e. they have to be loaded to work. The sheer complexity of managing that on top of the table state results in massive cost. Add to that tooling, version control costs as well. Have you tried merging a stored procedure in a language which has no compiler and very loose verification?
3) Orthoganality. Nothing inside the relational model matches the logical concepts in your application. Think of transactions, domain modelling etc.
4) Duplication. You still have to write something in the application to talk to every single one of those stored procedures and map it to and from parameters and back to collections.
5) Transaction scoping. Do you know how expensive it is to introduce distributed transactions? Well it's a ton more cash when EVERYTHING is inside that black box.
6) Lock in. Your stored procedures aren't portable. Good luck trying to shift vendor in the future when you hit a brick wall.
Now I know it's popular to bash on Rails and I wouldn't use it personally but there are people using the same model on top of other platforms, like us.
Sorry but databases are just a hole to put your shit in when you want it out of memory. If you start investing too much in all of the specific functionality you're hanging yourself.