Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I wish SQL DBs had some focus on implementing stateless protocols. Is there any reason the "connection pool" could not be built into the same service as the SQL database, and start a move to an RPC protocol instead of stateful protocol?

So instead of "set xact_abort on" being a command setting one of a hundred possible 100 stateful connection flags, such flags would be part of every network roundtrip (as it is with HTTP).

Of course long lived DB transactions, temporary connection scoped tables, etc would need to be adjusted a bit in approach. But a shift towards SQL over RPC is certainly not impossible.

And long lived, client managed database transactions are worse performance wise than doing the transaction in one roundtrip; SQL is powerful enough that stored procedure-style transactions with parameter tables can do anything -- and for cases they cannot, optimistic concurrency control should usually be chosen instead anyway.

The cases where a long lived/client managed SQL transaction is NEEDED is rather seldom.



First off, a long-lived transaction is separate from a long-lived socket connection. They may correlate, but that's conflating two very different issues.

One could always use MySQL like HTTP; make a connection, authenticate, request a query, get the response, and close connection. On Postgres this is harder due to per-connection process overhead, but solutions like pgbouncer were made exactly for this.

But here's the fundamental difference: HTTP expects large numbers of disparate clients from literally anywhere with varying network quality/speed, and connection overhead would totally swamp your servers if they were stateful. Memory and file descriptors would quickly get in short supply.

Database access is inherently different. You shouldn't have random clients; you have a small, fixed set of client services and developers/admins connecting over a datacenter backplane, querying over and over and over again. The overhead of establishing/tearing down the sockets and re-authenticating becomes a performance bottleneck.

Folks don't create database connection pools because they're bored; they solve a very real and measurable architectural concern, just like HTTP pipelining did years ago and what HTTP/2 addresses through multiplexing: connection build up and tear down overhead. HTTP/3 goes further by avoiding TCP altogether in favor of custom connection handling over UDP.

But HTTP/3 just handles web assets. We regularly expect errors where we need to hit the reload button. SQL usage is markedly different. There is more of an emphasis on reliability at every step. Hitting a lock and retrying is very rare by comparison in the database world relative to the HTTP space. It happens (and it's a PITA), but it's nowhere near the norm.

As for transactions, you appear to regard them as a "sometimes" thing when if fact they're an "every time" thing. Every SELECT, INSERT, UPDATE, or DELETE, no matter how trivial involves a database transaction. They have to in order to enforce ACID.

I've written a few basic HTTP 1.x servers over the years in a few different languages. Fun exercise. I highly recommend it to get a better understanding of how the protocol works and common design tradeoffs. ACID databases are different. Very different. Mind bogglingly different. Orders of magnitude more complex even among the simplest. Compare the smallest, most lightweight web server you can find to SQLite or H2.


I understand all of this, I am not as ignorant as you seem to assume.

Well, you have a HTTP connection pool too, but they are not stateful in the way SQL connections are at all. You can send several requests after one another on the same HTTP connection, but the next request does not remember the previous one. Each request stands "on its own".

Which is very unlike SQL. And, client mamaged transactions build on this: They are stateful, each session can only hold one transaction at the time, and some buggy badly written code can thus easily go wild and fill up the connection pool and either a) take down that backend instance (if the size is capped) or b) use A LOT of connections and risk taking down the database.

With HTTP, the connection is held for as long as the network roundtrip then goes back to the pool.

With SQL, how long a piece of code holds on to a connection is unbounded. (If you use client managed transactions, or hold to a connection for temp tables or other reasons.)


> With HTTP, the connection is held for as long as the network roundtrip then goes back to the pool.

• Keep-Alive has entered the chat • HTTP pipelining popped in • Connection multiplexing has entered the room • Web Sockets pokes head in • Server-Sent Events chimes in as well

And yes, again, the connection model for thousands or millions of disparate IPs from the Internet will obviously be different from resources acting on a network within a closely controlled datacenter. You claim to know all this but seem to gloss over these important points.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: