Some Unofficial ORM Benchmarks

Recently the RepoDb project was bragging about its performance numbers in the RawDataAccessBencher benchmark. Looking it over, I noticed that my ORM’s numbers had dropped significantly.

Now, raw performance isn’t my biggest selling point; our real focus is being able to access more of the database’s capabilities and to provide introspection (i.e. reflection) for things like code generators and data analysis tools. But still, that’s kinda embarrassing.

After digging around for a while I realized that I wasn’t putting my best foot forward. The benchmark he ran was for .NET Core, which didn’t include the ‘fast’ version of my ORM yet. (My fault; ORM devs are supposed to issue a PR when new capabilities are added.)

image

Much better. Slower than RepoDb, but I’m still beating Dapper on the set fetches.

EF Core is stupid fast

Those who know me know I dislike EF and hate EF Core. But I have to admit, they’ve really stepped up their performance.

Introspection beats assumption

Tortuga Chain can’t cache SQL. It caches a lot of things, but the actual SQL string isn’t one of them. So why is the normal version, which generates SQL, consistently faster than the version that uses a hard-coded SQL string?

And why is EF Core faster than Dapper?

Frans writes,

It can because it uses an assumption that’s also used by Linq to DB: it uses the schema of the resultset to determine which columns will be NULL and which won’t. This can help with the IsDBNull check. Typically you check each value if it’s NULL, and that takes time. If you can ‘reasonably’ assume it’s never NULL, you can go straight to the Get<type>() read on the reader.

There’s a problem with this however. The meta-data this is based on can be outdated and you could get a schema of the resultset which says “this column isn’t null” while there IS a NULL value in the resultset, causing a crash. I had to roll back the optimization in LLBLGen Pro because of this. (In case you’re wondering: create table T with a non nullable column, create a view V on T, just select * from T. Then make the non nullable column in T nullable. If you now select from V the metadata of V will still say the column is non nullable (on sql server at least) which could lead to problems at runtime if you rely on that and a NULL value is present in the nullable column)

So all in all, it’s ‘faster’ (by a few ms on 31000 rows, so it’s not that much of a deal), but the price to pay for that is IMHO too high. We didn’t want this optimization in LLBLGen Pro because it could hurt users (one of our tests failed because of this, so that was a red flag). Make your own decision, but I think dapper, and other ORMs are wise to stay away from this optimization.

Chain uses runtime introspection, meaning it asks the database at runtime whether or not a given column is nullable and skips the IsDBNull checks as appropriate. But if you give Chain raw SQL, it can’t ask for metadata, so the checks cannot be skipped.
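To make the trade-off concrete, here is a minimal sketch of the technique (my illustration, not Chain’s actual internals): when the result set’s metadata says a column cannot be NULL, the materializer goes straight to the typed getter and skips the IsDBNull branch. The helper names are hypothetical.

```csharp
using System.Data.Common;

// Hypothetical materializer helper. `columnIsNullable` would come from
// runtime introspection of the table, or from the reader's schema
// (e.g. the AllowDBNull column of DbDataReader.GetSchemaTable()).
static class MaterializerSketch
{
    public static int? ReadNullableInt(DbDataReader reader, int ordinal, bool columnIsNullable)
    {
        if (!columnIsNullable)
            return reader.GetInt32(ordinal); // metadata promises no NULLs; skip the check

        // Nullable column: pay for the IsDBNull check before the typed read.
        return reader.IsDBNull(ordinal) ? (int?)null : reader.GetInt32(ordinal);
    }
}
```

Per row and per column that saved branch is tiny, but over tens of thousands of rows it adds up — and as Frans notes, it crashes if the metadata lies.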

What about Frans’ warning about SQL Server? The answer to that is to use SCHEMABINDING. This ensures that your view metadata always matches the table’s metadata. Though I admit, if you aren’t using SQL Server Data Tools to manage your versioning then it’s a right pain in the ass.
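Frans’ repro and the SCHEMABINDING fix look roughly like this — a sketch with the T-SQL held in C# string constants, since you’d typically run these through a DbCommand. Table and column names are made up for illustration.

```csharp
// Frans' stale-metadata scenario: the view's cached metadata still claims
// the column is NOT NULL after the underlying table changes.
const string staleMetadataRepro = @"
    CREATE TABLE dbo.T (Id int NOT NULL, Amount int NOT NULL);
    GO
    CREATE VIEW dbo.V AS SELECT Id, Amount FROM dbo.T;  -- metadata captured here
    GO
    ALTER TABLE dbo.T ALTER COLUMN Amount int NULL;     -- V still reports NOT NULL
";

// The fix: a schema-bound view. SQL Server now rejects the ALTER COLUMN
// until the view is dropped, so the view's metadata can never silently go stale.
const string schemaBoundVersion = @"
    CREATE VIEW dbo.V WITH SCHEMABINDING AS
        SELECT Id, Amount FROM dbo.T;
";
```

With SCHEMABINDING, the nullability metadata an ORM introspects is guaranteed to be current, which is exactly what makes skipping the IsDBNull check safe.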

Single Fetches

When fetching single records, the numbers jump around a bit. RepoDb is still the fastest, followed closely by Tortuga Chain and Dapper.

But what struck me as interesting is the performance for dynamic objects. See, Chain has the ability to leverage the dynamic keyword in C# to create new objects on the fly. (Internally this uses ExpandoObject.) When working with a set of objects, it is as slow as you would expect for late-bound, dynamically typed code. But for a single fetch, it actually beat EF Core for both speed and memory usage. It shouldn’t, especially given how slow it is for sets of objects, but there you go.
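For readers unfamiliar with the trick, here is a minimal sketch of building a dynamic row from a result set via ExpandoObject (again my illustration, not Chain’s code; the function name is made up):

```csharp
using System.Collections.Generic;
using System.Dynamic;

// Build a dynamic row whose properties are the result set's column names.
// ExpandoObject implements IDictionary<string, object?>, so columns can be
// attached at runtime and then read back as late-bound properties.
static dynamic ToDynamicRow(IReadOnlyList<string> columns, IReadOnlyList<object?> values)
{
    var row = (IDictionary<string, object?>)new ExpandoObject();
    for (int i = 0; i < columns.Count; i++)
        row[columns[i]] = values[i];
    return row;
}

// Usage:
//   dynamic row = ToDynamicRow(new[] { "Id", "Name" }, new object?[] { 1, "Ada" });
//   var name = row.Name;  // late-bound property access through the DLR
```

Each property read goes through the DLR’s call-site machinery, which is why it scales poorly over large sets — making the strong single-fetch numbers all the more surprising.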

image

.NET Framework Multi-row Fetch: Win!

Only 0.61 ms slower than hand-coded and faster than EF Core, Dapper, and RepoDb. Too bad .NET Framework is going away, but I’ll take any victory I can get.

Though I do wonder why RepoDb dropped so far in the rankings. Something about .NET Framework makes it run 25% slower than on .NET Core, while the others don’t see such a big drop.

image

.NET Framework Single Fetch: Show

When only fetching a single row, Chain is tied for 3rd, which is not too bad.

You can see the 0.02 ms overhead from the SQL generator showing up in this one. I can think of some ways to speed that up, but really I don’t think it’s necessary at this point.

And yes, RepoDb is back to the top of the rankings.

image

What does this all mean?

Nothing really. ORMs are all pretty fast these days so unless you are doing something silly like using dynamic classes for large data sets, read performance for single table queries is no longer a major issue for ORMs.

Of course there are other aspects that need to be considered such as multi-table reads, write performance, feature sets, etc. But it’s good to know that there is one less thing server-side developers have to worry about.

One thought on “Some Unofficial ORM Benchmarks”

  1. In the last run of RDAB, RepoDb beat HandCoded by a few ms. Frans and I exchanged tweets and we both profiled RepoDb to see why it was running so much faster than HandCoded. We both found the trigger: it turns out to be a technique that speeds up the data conversion.

    New official result: https://github.com/FransBouma/RawDataAccessBencher/blob/master/Results/20190520_netcore.txt

    In .NET FX, on the first run (before Frans even replaced it), RepoDb was on top, which shocked even me. In any case, this mystery is now solved, as Frans applied the same logic in the HandCodedMaterializer.

