I understand the primary premise about the difficulty with testing SQL and fully agree with it.
I do have a question though - while I understand how functors can help make the problem easier to tackle, I am not sure I fully understand how functors are different from a similar existing tool - stored procedures.
Some DB flavors:
- can take tables as arguments to stored procedures
- can return tables
- also offer the additional benefit of being able to run almost all flavors of SQL commands ( DDL, DML, DQL, DCL, TCL) in those stored procedures
Netezza stored procedures, for example, can do what you describe here:
Am I missing something fundamental here? How are functors different from stored procedures? To me, they seem to be just a sub-class / sub-type of stored procedures.
The goal is that the composable parts get woven into an efficient, planner-friendly query. Stored procedures completely undermine that unless something very exciting has happened since last I checked (SQL Server, but probably applies to all of them). You will likely end up in a "row by agonizing row" situation.
(Okay, maybe you can find a few cases where stored procs work decently, but in general both composability and performance will be much worse than the proposed functors.)
OK, this I understand, that is a good insight - cursors are row-processing based so its gonna be slow
I think Netezza, SQL Server and Oracle are all cursor-based processing "by default" so this makes a lot of sense. I suspect that they all have bulk operation capability but can't immediately think of how I would have worked bulk processing in a way that maps to this article - maybe something like analytic functions like windowing, partitioning etc. that is definitely not row by row.
Having said that, the examples I see for actual testing in the article are DQL / DML so would be multiple row processing by default .. yes, the functor definition / creation is a DDL process but it is a "do once and reuse the definition" thing (like, the author correctly observes, a view, which is the point of functors) and the functor in use would just be DML. In which case, functors go back to looking like stored procedures...
I also understood composability as being built in for SQL - for example, in Oracle, packages allow composability of stored procedures, triggers, sequences etc allow composability of DML and views allow composability of queries and tables - which the author points out in the article.
With functors, DDL, DML, DQL, DCL, TCL would still be the only command tools available unless a new command language was invented for SQL for testing - let call that something like DTL (Data Test Language), with a whole new bunch of associated new SQL keywords, capability and functionality that are built right into the core of the DB engine that are optimized for what functors are trying to achieve.
Regarding "can't immediately think of how I would have worked bulk processing in a way that maps to this article think of how to map composibility" ...
I believe stored procedures where you construct dynamic sql and execute the results can basically provide the composability/performance described with bulk non row-based logic. If you keep it simple it can work ok.
They seem somewhat like stored procedures, but not stored? As in a query can contain a functor in it and then use it immediately. I didn't see those `create functor` statements as anything other than ephemeral - or am I wrong?
EDIT: also stored procs that use imperative logic and cursors can be quite a bit slower than queries that achieve the same logic - the capability here is purposefully a subset of that and is just to help build standard SQL queries that can go through the standard query planner.
I think they have to be long lived else they cannot make sense for performant testing. ie they are created as DB objects, using DDL, in the same way tables, views, functions etc are made.
They can certainly be created at test run time but that would slow things down a lot - you would essentially be creating a ton of objects every time you run the test which means having a setup to test if they exist or not, take them down if they do or fix them if they don't match spec ( e.g. column and data type changes etc etc )
The more I think about this, the more complicated I realize it would be to manage this dynamically:
You essentially have to build a test harness enviroment that figures out your testing elements dynamically from your data environment (with some kind of parameterization engine and data set to tell it what to look for so as to "make functors and run them" (e.g. all PKs of FKs or all columns starting with a certain prefix or all columns of a certain data type etc etc), gets the most up to date definitions of those elements from system tables and uses that data to create or update or drop functor objects ... wow, ok, this is getting complicated, I am going to stop now before I see the void.
I do have a question though - while I understand how functors can help make the problem easier to tackle, I am not sure I fully understand how functors are different from a similar existing tool - stored procedures.
Some DB flavors:
- can take tables as arguments to stored procedures - can return tables - also offer the additional benefit of being able to run almost all flavors of SQL commands ( DDL, DML, DQL, DCL, TCL) in those stored procedures
Netezza stored procedures, for example, can do what you describe here:
https://www.ibm.com/docs/en/netezza?topic=nsg-return-result-...
As can SQL Server & Oracle (which both return cursors, which are just ordered tables):
https://learn.microsoft.com/en-us/sql/relational-databases/s...
https://docs.oracle.com/cd/B28359_01/appdev.111/b28843/tdddg...
Am I missing something fundamental here? How are functors different from stored procedures? To me, they seem to be just a sub-class / sub-type of stored procedures.