Thoughts on OR/M

There is an updated post, written in August 2016: https://www.stum.de/2016/08/24/thoughts-on-orms-2016-edition/

It's hard to imagine a world without OR/Ms anymore. The old days of making a query, getting a reader, looping through it, getting every column into an object...

I'm a Linq2Sql user. I like it's relatively easy setup and good flexibility, and it has a really good query generator. However, I'm running into some issues with Unit Testing it, because you can't really emulate all the features that it has (EntityRefs with a composite Key or fields that are DbGenerated are a nightmare). Since it is a one-off technology for SQL Server, it doesn't work with any other database (not even SQL Compact 4, which sucks as it would be the perfect Unit Test drop in).

Some people are saying that OR/Ms are an anti-pattern, to which Ayende had an interesting reply. I don't fully agree with any of them. I think that OR/M is solving two different problems, one that it shouldn't solve.

The first problem is getting data from and to the database - the second problem is the mapping of the result to objects. The second problem is a good one, the first one is not a problem and thus doesn't require a solution.

Somehow, people seem to frown SQL, as if it were some evil or outdated concept. You know what? SQL is the language of your database, so you better learn it. Oh sure, it's nice to chain a gazillion Where/Select/ToDictionary/GroupBy calls together and don't have to worry about coming up with the SQL yourself. It's not so nice that the chance to produce a SELECT N+1 is pretty high. When was the last time you looked at the SQL that your OR/M generates and optimized it?

Speaking of optimizations: When was the last time your DBA improved your OR/M Code? Most DBAs I know speak one language very well, and that's the SQL Dialect of their database. They know these hidden tricks and tweaks, they can tweak the query plan and come up with ways to speed up your queries, thus lowering the load and therefore the cost of the system. I don't know many DBAs who would look at a piece of C# code and suggest moving the select outside of the foreach loop in order to turn a SELECT N+1 into a single select. They know the difference between the different JOIN Types (I use LEFT JOIN 99.99% of the time - who knows if there are better ways?) and are able to suggest putting in a READ UNCOMMITTED into a query that is okay to retreive slightly stale data. When was the last time you tweaked the TABLE LOCK setting on your OR/M query?

Granted, most developers don't seem to have access to DBA people and thus have to wear a DBA hat quite often. But you know what? If you have to know about DB stuff anyway, you can as well learn SQL. Even though you may not get everything highly optimized, in the moment a DBA looks at it and optimizes stuff you just have to change your query and call it a day. Maybe you two decide to create a Stored Procedure instead, in which case your SELECT becomes an EXEC - big deal, not.

Yes, I am aware that some queries can get pretty harsh. I have a query with 14 subselects (mainly because it has to do paging on the database) and I am glad that I can write it in Linq2Sql, creating an IQueryable and passing it through a dozen methods that apply more Where clauses based on some ridiculously huge filter class a user passes in. Not having to generate that through StringBuilders or other stuff is a great thing.

But you know what? That query is the one anomaly in a sea of standard CRUD queries. I can write those CRUD queries in my sleep, blindfolded. And I can look at them and immediately understand what this thing is doing. I can read up the T-SQL manual, run it in SQL Management Studio while the SQL profiler is running, look at the Execution plan, tweak the query, put it on Stack Overflow to get some input - I am talking to the DB in a language it understands and have no limitations in how I tweak it.

Really, all I want is the Mapping. I want to write a (parameterized) SQL query, but I don't want to do the ExecuteReader/foreach loop crap, because that's stupid work. I want to be able to insert a Parent Row and it's child rows in one go, without having to use client-side generated GUIDs for the primary key so that I can add it to the child rows. I love the EntitySet/EntityRef stuff in Linq2Sql, and that's what I want. Create an Object and it's children, put it in the database and not worry about capturing the @@IDENTITY and populating it on the children. I don't want an OR/M, I want an Object Mapper.

Luckily, there seems to be a small (but truly existing) movement into it, Dapper and Massive being two of the better known ones. Massive uses the new dynamic functionality in .net 4, a technology that could revolutionize OR/Ms once they start embracing it (rather than trying to stick to their Java roots or stay compatible with .net 2.0). I don't fully like how I have to create "empty" classes and it seems that it doesn't do "real" SQL but rather a DSL that looks like SQL. That would kinda suck.

Dapper is currently used as the OR/M for Stack Overflow, It's very bare bones (doesn't manage your connection, no UPDATE helpers), has some not-so-nice syntax quirks (Having to pass IsAnsi = true when querying a non-unicode - but that's an edge case I'd argue) and although I have not yet had a chance to use it, if Jeff Atwood is willing to use it on the thing that puts food on his table and rock band songs onto his Xbox, I'd assume it's stable.

Granted, one can argue that you're not paying your developers to write code that can be written faster in an OR/M, a claim I'd like to dispute when looking at the total lifetime of an application including maintenance, but if you want a catchy, quotable sentence: Traditional OR/M is like WebForms. It's rapid at first, does stuff just well, may even have a fancy GUI to drag/drop your tables. But just like WebForms limits you tweaking the genrated markup or core functionality ("I need my Data Grid to render the headers in a <thead> so that the jQuery TableSorter plugin works") and sometimes requires some hacks ("I know, I just put some javascript that wraps the generated table header rows into a <thead>"), an OR/M will put up some walls that you can't climb as well.

I think it's time for an OR/M that is to other OR/Ms what ASP.net MVC is to ASP.net WebForms: A Framework that allows and requires us to get deep into the guts, but does not limit the tweaking we can do, while still handling a lot of the stupid work, and I think it's great to see projects move into that direction. Just as MVC and WebForms both sit on top of an ASP.net core stack, ADO.net serves as a rock solid foundation for multiple philosophies of OR/M systems that can happily co-exist and serve both the people able and willing to write SQL and the people who'd rather spend money on an OR/M, profiling tools, and maintenance costs.

Comments (7)

SamSeptember 28th, 2011 at 07:17

Nice article.

We chose to go with DbString as opposed to an AnsiString class cause it gives you more flexibly and seemed more consistent, if you need the that level of control you probably care about stuff like varchar(12) vs char(78) vs nvarchar(max) ... the single class can allow you to specify all these param types.

One thing I strongly recommend you look at is my SqlBuilder API: http://samsaffron.com/archive/2011/09/05/Digging+ourselves+out+of+the+mess+Linq-2-SQL+created it is my dissertation against the showhorning of relational algebra on to SQL. When I first saw LINQ and ARel I was enamoured by the composability, however once you look a bit deeper you see there is an impedance mismatch between the two languages. I wanted to be able to use SQL server hints without having to resort to nasty string constructions something LINQ will never be able to do properly.

As to:

var id = cnn.Query(@"set nocount on
insert Post (Name) values (@Name)
set nocount off
select ScopeIdentity()", new {Name});

Internally we use a simple helper:

var id - cnn.Insert("Post",new {Name});

The thing is, due to DynamicParameters it is quite trivial to implement. There are even some examples in Dapper.Contrib.

Another point that is really important to mention is the MultiMapper, that allows you to split up a row that comes back from SQL into multiple objects.

Cheers,
Sam

RichBSeptember 28th, 2011 at 07:59

> The first problem is getting data from and to the database... the first one is not a problem and
> thus doesn't require a solution.

1) Database agnostic paging
2) Database agnostic write batching

These are both problems that a lot of ORMs don't solve - and really do require a solution.

mstumSeptember 28th, 2011 at 08:27

The problem with database agnostic approaches is that you usually have to decide between a simpler API (like Linq) or access to the full power of the DB. This is one of those situations where scaling problems only become apparent when a change is very costly (=when the Application is mature enough to have much data).

One problem with database-specific stuff: It's impossible to Unit Test. At that point, it's an integration test because you can't even bring in SQL Compact as a replacement anymore, but instead look at spinning up User Instances of a SQL Server.

I don't think it's possible to solve this for RDBMS, the dialects are too different and the vendors have zero incentive to move. In .net that's not so much a problem, because either you're using SQL Server or you're cursing about the horrible quality of third party ADO.net providers (at least that's what I heard last about Oracle and MySQL in .net).

I've peeked a bit into the SqlProvider in LinqToSql. This thing is insane, but it seems as optimized for SQL Server as an OR/M can get. I have not checked EF 4.1 to see if it still generates inferior SQL (the 3.5 Version did), but trying to get a third-party LinqToSql provider (which should be possible) or good, optimized, agnostic code seems pretty futile.

I guess what would really help: An OR/M that is as Simple as Linq2Sql in CRUD scenarios, but that can send off raw T-SQL queries and map the result set back into objects.

JohnSeptember 28th, 2011 at 13:59

Great post. You guys should check out PetaPoco also. It does all the crud stuff really easily and the mapping of custom sql's directly to objects with great performance. Its currently being used in Production with great success.

http://www.toptensoftware.com/petapoco/

Thomas WagnerSeptember 28th, 2011 at 14:05

Hmm I dont think dapper or massive are good substitutions for a decent ORM like llblgen . But I am not surprised we are seeing this mini backlash after the performance of Linq and the overall quality of EF. I have used neither in production and will use dapper or massive before either - but my tool of choice for this type of work has been llblgen . It simply cant be beat.

Duncan SmartSeptember 28th, 2011 at 23:08

"Massive... it seems that it doesn't do "real" SQL but rather a DSL that looks like SQL" - I think you missed Rob's joke here in the readme, he was teasing: it *is* real SQL.

[…] is a bit of a follow up to my 3 year old post about LINQ 2 SQL and my even older 2011 Thoughts on ORM post. I’ve changed my position several times over the last three years as I learned about issues […]