ORMs vs SQL: The JPA Story

noelwelsh · on Aug 6, 2013

I like the newer Scala frameworks, like Slick (http://slick.typesafe.com/). They work with tuples rather than classes. Fits much better with the relational model.

stormbrew · on Aug 6, 2013

I am far more interested in this kind of approach to things these days than ORMs or object databases. The relational model is a very powerful way of looking at data and almost all ORMs wind up neutering all of that power, leaving you with the absolute worst of all worlds.

What's needed is good (possibly object oriented) interfaces to relational data, which is a very distinct concept from ORM.

jacques_chester · on Aug 6, 2013

In my opinion, what's actually needed is to realise that object orientation is a dead end. We need relational programming languages.

ORMs can't work for a simple reason. They are trying to map from the world of sets into the world of graphs. Sets are more expressive, so there is always a good chance that there will be a lossy transformation.

This remark constitutes 100% of your RDA of cryptic ranting.

dragonwriter · on Aug 6, 2013

> In my opinion, what's actually needed is to realise that object orientation is a dead end. We need relational programming languages.

The second part is true, the first is not. While certain popular OO programming languages and SQL are limited in incompatible ways (creating the so-called "object-relational impedance mismatch", which is more properly the "certain-OO-languages-to-SQL-impedance-mismatch"), there is no inherent mismatch between the OO and relational models, and there is no reason you can't have both in the same language.

This is pretty central to Date and Darwen's Third Manifesto [1] and the languages that implement the features it requires of "D" [2].

[1] http://www.dcs.warwick.ac.uk/~hugh/TTM/TTM-2013-02-07.pdf

[2] http://en.wikipedia.org/wiki/D_(data_language_specification)

mindcrime · on Aug 6, 2013

> ORMs can't work for a simple reason.

But they do. They work every freaking day, in millions of applications, all over the world.

I think it would be more correct to say "ORMs can't work perfectly for every conceivable scenario involving an RDBMS". But they certainly can, and do, work for many, many use cases.

> They are trying to map from the world of sets into the world of graphs. Sets are more expressive, so there is always a good chance that there will be a lossy transformation.

But if you started with a graph, transform it to the more expressive set world, then transform it back to a graph, why would you lose anything? And in my experience that's usually the way ORMs are used... to persist an object-graph and then query/retrieve parts of that graph.

Now if you're taking an existing schema, developed for different usage patterns, and trying to layer on top of it with an ORM, then, yes, you should expect problems.

> We need relational programming languages.

That's an interesting idea. What would such a language look like?

auvrw · on Aug 6, 2013

total tangent, but in defense of graphs: it's worth noting that graphs are actually a lot more expressive than they might appear at first glance. it's possible to interpret linear orders, algebraic groups, or any other theory written in finite language with graph theory (according to Marker's book on model theory, which references Hodges' for details on the proof)

jacques_chester · on Aug 6, 2013

Not tangential at all, and while I hate being wrong, I like learning new things. Graphs are very expressive. But most of the time, when you try to express the domain model as a tree (inheritance) it leads to muddiness because most domain models are not truly tree-like. Then we try to express it as graphs (composition), but that gives up the advantages of hierarchy.

If you instead express things as sets and subsets, you can enjoy the advantages of both inheritance (DRY logic) and composition (numerous). Some languages kinda sorta have this through mixins or modules. But not quite.

moron4hire · on Aug 6, 2013

Actually, that's been the philosophy behind my personal data access system, https://github.com/capnmidnight/SqlSiphon

It is all just SQL, and the classes it maps to for select statements are really just records, they can only map to raw system types and have no additional functionality. The delete and update statements take no class parameters, they only take raw system types.

It's basically meant to be as simple a way as possible to execute stored procedures, with an added layer of type safety for parameters and a little IntelliSense help along the way. Essentially, I treat the application-level method name--and its parameter names and types--as repeated information, and attempt to eliminate repetition. The system calls a stored procedure that has the same name as the method, passing parameters that have the same name and type as the method parameters. And that's about it! I've been using a version of this project in production systems for... well... almost 7 years now, and it's been a breeze.
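
For readers who want to see the shape of that convention, here is a minimal sketch in Java. It is not SqlSiphon's code: the `AuthorProcedures` interface, its method names, and the `bind` helper are all hypothetical, and it binds parameters by position rather than by name, because Java only exposes parameter names through reflection when compiled with `-parameters`. Instead of touching a database, the proxy just returns the JDBC call string it would execute.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Collections;

class ProcedureProxySketch {

    // Hypothetical data-access interface: each method name doubles as a
    // stored procedure name, so the name is written down only once.
    interface AuthorProcedures {
        String getAuthorNotes(long userId, String network);
        String deleteAuthor(long userId);
    }

    // Builds JDBC call-escape syntax such as "{call getAuthorNotes(?, ?)}".
    static String callSyntax(String procedureName, int argCount) {
        String placeholders = String.join(", ", Collections.nCopies(argCount, "?"));
        return "{call " + procedureName + "(" + placeholders + ")}";
    }

    // Returns a proxy whose methods report the call they would issue.
    // A real version would pass the string and args to a CallableStatement.
    @SuppressWarnings("unchecked")
    static <T> T bind(Class<T> procedures) {
        InvocationHandler handler = (proxy, method, args) ->
                callSyntax(method.getName(), args == null ? 0 : args.length);
        return (T) Proxy.newProxyInstance(
                procedures.getClassLoader(), new Class<?>[] {procedures}, handler);
    }

    public static void main(String[] args) {
        AuthorProcedures db = bind(AuthorProcedures.class);
        System.out.println(db.getAuthorNotes(42L, "twitter")); // {call getAuthorNotes(?, ?)}
        System.out.println(db.deleteAuthor(42L));              // {call deleteAuthor(?)}
    }
}
```

The point of the design is that renaming a method is the only edit needed on the application side; the procedure name is never repeated as a string literal.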

The project is a little bit out of date right now. I'm working on rebuilding a new feature that I prototyped poorly 6 months ago but was an incredible boost in productivity. The feature is the ability to manage the stored procedure create/alter statements from within the data layer, at the site of the method calls. It eliminates the last bit of repetitiveness in a database project: the repetition of the names of things in the database versus the application.

I had done the same for the table definitions, but that turned out to be a mistake (and thus why I need to rewrite the SP generator system, the two were too closely linked so I just ripped them out completely). I found myself going down that same old road of every ORM, trying to solve graphs of data types, graphs that the database is already pretty capable of solving with a simple join. Because that's the real problem with ORMs: even their developers don't trust the database.

lmm · on Aug 6, 2013

I'm just now moving away from Squeryl (slick was far too immature) to go back to Hibernate. I found the relational approach far too cumbersome, and unpleasant to use if my classes didn't exactly correspond to my database tables.

jacques_chester · on Aug 6, 2013

[2009], for the confused.

twic · on Aug 6, 2013

Quite a few of the detail-level problems in that post have indeed been fixed by now. Not all of them. In practice, i find using Hibernate pretty painless, but then i am the kind of person who actually reads documentation.

His final point still stands:

"If you're doing JPA, you still need to know databases and SQL. If you're using a Web application framework, you still need to know the servlets API and how HTTP works at least at a high level."

This is true of all sorts of ORM and noORM frameworks. You still need to understand how databases work.

fusiongyro · on Aug 6, 2013

I've been using Hibernate sans Java EE for about four years. I started using it under JPA (again, without Java EE) about two years ago. Only in the last few months have I been able to deploy a Java EE environment, and there, only for a very small project, and the Java EE aspect was pretty minor, although desirable and cool.

Hibernate does not require Java EE stuff. The "dynamic weaving" is provided by Javassist (formerly provided by cglib, but cglib is apparently deprecated). No complex setup here. Put the appropriate dependency in your pom and it "magically" works.

Indeed, my biggest objection to Hibernate is the degree to which it relies on magic. A lot of that magic is truly magic, in the sense that you are not supposed to worry too hard about how it all works. If it did always just work, it wouldn't be maddening when you try to figure out why it is going wrong only to get slapped in the face by a fistful of hard magic.

For instance, the batch annotation, in pure, non-JPA Hibernate, is the fetch "mode." There are several options like "join" and "subselect." Hibernate defaults to an N+1 queries situation, but supplying "join" or "subselect" instead isn't necessarily good enough to get the behavior you want. We had a situation where no matter what we did in the Hibernate config, the behavior was an N+1 query explosion. The problem turned out to be an innocuous-looking log statement in the object model. Tracing in, it turned out that this log statement was being invoked indirectly from the object's constructor, causing the list to be "forced" before construction was complete. For some reason this bypassed Hibernate's usual configuration. The solution was to delete the log statement and take a lot of care not to touch anything that might be a PersistentList from the constructor.
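
To make the "N+1" term concrete, here is a small plain-Java sketch. It does not use Hibernate at all: `FakeDb` is a hypothetical stand-in that only counts how many "queries" it serves, and the table names in the comments are made up. It shows why touching each lazily loaded collection costs one extra query per parent, while a join brings everything back in one round trip.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class NPlusOneSketch {

    // Stand-in for a database that counts the queries it serves (not a Hibernate API).
    static class FakeDb {
        private final Map<Long, List<String>> childrenByParent = new LinkedHashMap<>();
        int queryCount = 0;

        void insert(long parentId, String... children) {
            childrenByParent.put(parentId, List.of(children));
        }

        // Plays the role of: SELECT id FROM parent
        List<Long> selectParentIds() {
            queryCount++;
            return new ArrayList<>(childrenByParent.keySet());
        }

        // Plays the role of: SELECT name FROM child WHERE parent_id = ?
        List<String> selectChildren(long parentId) {
            queryCount++;
            return childrenByParent.get(parentId);
        }

        // Plays the role of: SELECT ... FROM parent JOIN child ON ...
        Map<Long, List<String>> selectParentsWithChildren() {
            queryCount++;
            return new LinkedHashMap<>(childrenByParent);
        }
    }

    // Lazy style: one query for the parents, then one more per parent touched.
    static int loadLazily(FakeDb db) {
        int before = db.queryCount;
        for (long id : db.selectParentIds()) {
            db.selectChildren(id).size(); // touching the collection forces a query
        }
        return db.queryCount - before;
    }

    // Join style: everything arrives in a single round trip.
    static int loadWithJoin(FakeDb db) {
        int before = db.queryCount;
        db.selectParentsWithChildren();
        return db.queryCount - before;
    }

    public static void main(String[] args) {
        FakeDb db = new FakeDb();
        db.insert(1L, "a", "b");
        db.insert(2L, "c");
        db.insert(3L, "d", "e", "f");
        System.out.println(loadLazily(db));   // 4 (1 parent query + 3 child queries)
        System.out.println(loadWithJoin(db)); // 1
    }
}
```

Anything that touches the collection early, such as a log statement reached from a constructor, behaves like the `size()` call in `loadLazily`.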

That kind of lesson, while trite, is very hard to apply in practice. Especially when you have a set of developers working on the database and object-relational mapping layer and another set working on the model. Hibernate brings a lot of "gotchas." Shield's article hits on one of the more onerous database-side ones, that you are informally forced into using artificial keys, but there are enough oddball ramifications and restrictions to go around that plenty of them spill out into the Java code.

I have positive feelings towards myBatis, but I have only used it for a few edge cases in my Hibernate projects. While it is actually pretty easy to trick Hibernate into returning real objects for native queries, the trouble doesn't end there. It's very hard to do a complex SQL query from Hibernate without running into the vague sensation that the Criteria API would be better. A few hours later, you're back to building SQL strings, having read unsettling absurdities in the documentation like "There is no explicit 'group by' necessary in a criteria query."

myBatis's major advantage, in my conjectural opinion, is that it does not pretend to liberate you from worrying about the database. As a database developer, I am free to write the best query I can for a given situation, liberally using the most esoteric features of my database. My peers in the model can write exactly the interfaces they want to use. But they are not free to imagine that they have the entire object graph available to them to manage to the minutia at all times, and the ramifications of this loss on a group working solely in the model must be great, and I haven't seen them yet. I see storm clouds looming on that side of this tradeoff. It's annoying when I go to a peer and say what they can and cannot do in a constructor or a bean property getter/setter, but from then on they can still pretend that they have everything, and the worst thing that happens is really untenable performance. But everything works. myBatis, in contrast, is only too happy to give you back an incomplete object tree. There is no "weaving" or "instrumentation" in what comes back to support on-the-fly querying just because you accessed some property. Again, it is a great strength (much, much less magic) but it's also a great weakness. Your model guys aren't going to be able to ignore the database with impunity.

I'd like to hear from a group that switched from Hibernate or JPA to myBatis and how it worked out.

brianmcc · on Aug 6, 2013

Very interesting; as someone who's been very happily using Ibatis with Java and Spring for many years I've always been wary of Hibernate for just the reasons you mention. The thought of ceding control over SQL to a framework so extensively would give me sleepless nights, and my hunch has always been that there will be a bunch of non-performant parent/child relation anti-patterns basically baked in and waiting to pounce (if that's not too much of a mixed metaphor...!).

fusiongyro · on Aug 7, 2013

You're absolutely right. In fact, our application has a pretty severely hierarchical nature. Initially, we just pulled out the top-level things and let Hibernate handle it, but performance was ghastly. We added a native query to pull out the first four levels at once, and a great deal of manual code in Java to process what came out. This is good, but the fifth level is self-referencing and recursive. I have an SQL query now that can retrieve everything at once, but the amount of Java code I'm going to have to write to drop it in is a significant barrier. Still, once I do, most of our egregious performance problems will be solved.
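
The "manual code in Java" for a self-referencing level usually boils down to rebuilding a tree from flat rows. A minimal sketch, assuming the recursive query returns one row per node with hypothetical `id`, `parentId` and `name` columns (the `Row` and `Node` classes are invented for illustration, not taken from the application described here):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TreeAssemblySketch {

    // One flat row as a recursive query might return it; parentId is null for roots.
    static class Row {
        final long id;
        final Long parentId;
        final String name;

        Row(long id, Long parentId, String name) {
            this.id = id;
            this.parentId = parentId;
            this.name = name;
        }
    }

    static class Node {
        final long id;
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Two passes, so the rows may arrive in any order.
    static List<Node> assemble(List<Row> rows) {
        Map<Long, Node> byId = new LinkedHashMap<>();
        for (Row row : rows) {
            byId.put(row.id, new Node(row.id, row.name));
        }
        List<Node> roots = new ArrayList<>();
        for (Row row : rows) {
            Node node = byId.get(row.id);
            Node parent = row.parentId == null ? null : byId.get(row.parentId);
            if (parent == null) {
                roots.add(node); // no parent, or parent not in the result set
            } else {
                parent.children.add(node);
            }
        }
        return roots;
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(
                new Row(1, null, "root"),
                new Row(3, 2L, "leaf"),
                new Row(2, 1L, "mid"));
        Node root = assemble(rows).get(0);
        System.out.println(root.name + " > " + root.children.get(0).name
                + " > " + root.children.get(0).children.get(0).name); // root > mid > leaf
    }
}
```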

You can make it work. But I think Hibernate and other ORMs work by selling you on the notion that they're an abstraction, that you don't have to know everything to use them. In my experience, Hibernate winds up creating a third domain of necessary expertise, rather than managing the two so you only have to have one. And this is the source of the anger.

brianmcc · on Aug 7, 2013

One observation - seems you can almost predict people's preference for Hibernate or Ibatis (or something else) based on whether they are "application folk", who see the DB as just the persistence mechanism, or whether they view the DB as the "critical data store" with the application used to get and store that data.

Despite being very much an app developer myself, not a DBA or particularly a DB development guy, I've seen DBs outlast many apps, coding languages etc, and always feel it's the most important thing at the end of the day.

Depends on your domain too I guess, I'm sure there are many important exceptions.

nobullet · on Aug 6, 2013

I understand why people don't like ORMs. But I don't understand people who like iBatis: it is boilerplate like hell. Imagine having to separately update the mappings for the SQL select, insert, update and delete queries (and these mappings are very long) for your entities.

For example, a prepared statement for an update looks like this:

  PreparedStatement ps = ...;
  ps.setLong(1, author.getUserId());
  ps.setLong(2, author.getStreamId());
  ps.setString(3, author.getNetwork());
  ps.setString(4, author.getIdInNetwork());
  ps.setString(5, author.getNote());
  ps.setString(6, author.getSocialId());

....

And manually counting question marks is normal practice when you add a field:

  static final String UPDATE = "UPDATE Table1 SET Domains = ?, TemplateId = ?, InternalJson = ? WHERE StreamID = ?";
  static final String INSERT = "INSERT INTO Config1 (UserID, StreamID, Domains, TemplateId, InternalJson) VALUES (?, ?, ?, ?, ?)";

fusiongyro · on Aug 7, 2013

You have to pick your battles. It's convenient that Hibernate will do these things for you, but I have seen lots of times when the order Hibernate wants to remove something causes an integrity violation. For instance, a not-null foreign key reference in a linking table; sometimes Hibernate seems to try setting the value to null before deleting the referenced entity, leading to an integrity violation. There have been times when I couldn't figure out how to make Hibernate do this the right way (reversing the order of its deletes and skipping the unnecessary set-null step) so I instead just lifted the constraint. I hate when I have to do that, because ideally Hibernate should just live with whatever schema I have given it.

Another example is trying to depend on CASCADE in the database. The settings you need to make to get Hibernate to accept this are quite arcane and force you to manually worry about list indexes in the Java code. Yet another example of a place where Hibernate, which ought not be telling me how to make my database or my Java code, instead winds up forcing me to take certain decisions in both.

brianmcc · on Aug 7, 2013

I can see your point, pretty valid. I guess I prefer control over magic. I can produce a complex SQL SELECT, use functions/joins/views, and populate either full entities with it or just teeny tiny DTOs, e.g. if I just need customer name and account number I can easily populate a CustomerStuffDto with two values from a surgical, precise, fast query. Spring manages all the datasource and transaction stuff and wiring everything together, and it's easily JUnit tested (this is an "integration test" rather than "unit test" I am told, but it works!)

Pretty sure ibatis helps you avoid the counting ? chars by naming params in a #param1# syntax, might be mis-recalling though.
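
A rough sketch of how named placeholders remove the counting, in plain Java. This is not iBATIS's implementation: the `parse` helper and `Parsed` class are hypothetical, and the `#name#` form is simply the syntax recalled above. The idea is to rewrite each named placeholder to `?` and remember the names in bind order, so adding a field never means recounting positions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedParamSketch {

    // Matches placeholders written as #name# (assumed syntax for this sketch).
    private static final Pattern PLACEHOLDER = Pattern.compile("#(\\w+)#");

    // JDBC-ready SQL plus the parameter names in the order they must be bound.
    static class Parsed {
        final String sql;
        final List<String> names;

        Parsed(String sql, List<String> names) {
            this.sql = sql;
            this.names = names;
        }
    }

    // Replaces every #name# with ? and records the names left to right.
    static Parsed parse(String template) {
        List<String> names = new ArrayList<>();
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer sql = new StringBuffer();
        while (matcher.find()) {
            names.add(matcher.group(1));
            matcher.appendReplacement(sql, "?");
        }
        matcher.appendTail(sql);
        return new Parsed(sql.toString(), names);
    }

    public static void main(String[] args) {
        Parsed parsed = parse("UPDATE Table1 SET Domains = #domains#, TemplateId = #templateId#, "
                + "InternalJson = #internalJson# WHERE StreamID = #streamId#");
        System.out.println(parsed.sql);
        // UPDATE Table1 SET Domains = ?, TemplateId = ?, InternalJson = ? WHERE StreamID = ?
        System.out.println(parsed.names);
        // [domains, templateId, internalJson, streamId]
    }
}
```

Binding then becomes a loop over `names`, with index `i + 1` for each entry, instead of a hand-numbered list of setter calls.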

electrotype · on Aug 6, 2013