Why Entity Framework isn’t a Good Model, Part I: Lazy Loading

Jan 6, 2018


This is one I’ve been holding in for too long. To be fair, I’ve only been working with Entity Framework for less than a year, and I’m totally unqualified to say this, but Entity Framework just bothers me. It’s the fly in the object-oriented ointment. As much as I enjoy backend dev and database dev, it always bothers me just a little to use Entity Framework.

Why We Like It

I will be the last person to deny that Entity Framework is elegant. Database rows do tend to be, on a conceptual level, objects. The idea of magically calling them up as iterable data structures composed of nice, strongly-typed, encapsulated objects is downright seductive. There’s just no denying that. Plus, the SQL-but-much-better approach to LINQ query syntax is just flat delightful, and its beauty is somehow undiminished when combined with lambdas.

Why I Don’t, Part the First: Lazy Loading

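Lazy loading is a marvel. (Strictly speaking, what I’m describing here is what LINQ calls deferred execution; Entity Framework reserves the name “lazy loading” for navigation properties that load on first access. I’ll keep using the looser term.) It allows a programmer to easily take a layered approach. A single query can be used as a base for a number of different but similar queries, creating code that’s reusable in a way that SQL usually isn’t.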

A programmer won’t even know how the query is finally compiled down to SQL or when it executes. The data shows up in memory when it needs to.

On balance, I’ve come to the conclusion that this is a pretty bad thing.
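To make the mechanics concrete, here’s a minimal sketch of a base query being extended and then executed later. It’s an assumption-laden stand-in: it uses plain LINQ to Objects instead of Entity Framework, and the names (DeferredExecutionSketch, Users, ShortNames, RowsRead) are made up for illustration. With Entity Framework the query would be an IQueryable translated to SQL, but the timing behaves the same way: nothing is read until something enumerates the query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionSketch
{
    // Counts how many rows the fake "table" has handed out so far.
    public static int RowsRead;

    // Hypothetical stand-in for a database table: yields rows one at a time.
    public static IEnumerable<string> Users()
    {
        foreach (var name in new[] { "ada", "bob", "cy", "dee" })
        {
            RowsRead++;
            yield return name;
        }
    }

    // A reusable base query. Calling this method reads nothing.
    public static IEnumerable<string> ShortNames()
    {
        return Users().Where(n => n.Length <= 3);
    }

    public static void Main()
    {
        RowsRead = 0;

        // Layer another filter on top of the base query. Still nothing has run.
        var query = ShortNames().Where(n => n.StartsWith("b", StringComparison.Ordinal));
        Console.WriteLine(RowsRead); // prints 0

        // Enumeration (here via ToList) is what finally executes the query.
        var result = query.ToList();
        Console.WriteLine(RowsRead);                 // prints 4
        Console.WriteLine(string.Join(",", result)); // prints bob
    }
}
```

Notice that nothing in the type of `query` tells you whether the rows have been read yet. That’s the convenience, and it’s also the part that worries me.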

Liskov Substitution

Consider an interface with several methods and two implementations. FirstImplementation has intuitive implementations of all the methods, while SecondImplementation implements one of the methods but throws NotImplementedException for the others. Therefore, if you’re calling the other methods on that interface, you always have to know which implementation you’re using. As you know, this is a classic violation of the Liskov Substitution Principle.
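A minimal sketch of that setup might look like the following. FirstImplementation and SecondImplementation come from the description above; the interface name IDocumentStore, its two methods, and the TryDelete helper are hypothetical and exist only to make the example compile.

```csharp
using System;

// Hypothetical interface; the method names are made up for illustration.
public interface IDocumentStore
{
    string Load(int id);
    void Delete(int id);
}

public class FirstImplementation : IDocumentStore
{
    public string Load(int id) { return "document " + id; }
    public void Delete(int id) { /* pretend the document is removed */ }
}

public class SecondImplementation : IDocumentStore
{
    public string Load(int id) { return "document " + id; }

    // Satisfies the compiler, but not the contract callers expect.
    public void Delete(int id) { throw new NotImplementedException(); }
}

public static class LiskovSketch
{
    // Written against the interface only, yet the outcome depends on the concrete type.
    public static bool TryDelete(IDocumentStore store, int id)
    {
        try
        {
            store.Delete(id);
            return true;
        }
        catch (NotImplementedException)
        {
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(TryDelete(new FirstImplementation(), 1));  // prints True
        Console.WriteLine(TryDelete(new SecondImplementation(), 1)); // prints False
    }
}
```

Both classes compile against the same interface, so the type system is satisfied. The caller only finds out which one it was handed when Delete blows up at runtime.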

Now consider this Stack Overflow question about NotSupportedException. In this case, the issue is a custom .Equals() method. That C# method can’t be transpiled to SQL, so the developer/questioner is getting a runtime exception. If you type NotSupportedException into DuckDuckGo, you’ll probably see a lot of questions just like it. In order to avoid NotSupportedException, you have to do something to force the program to go ahead and evaluate the SQL expression before making the comparison. In the case of the linked example above, the developer calls .ToList() on the IEnumerable, and thus the .Equals() call fires successfully.

This is a hack. The ability or inability to call a particular method should be evaluated at compilation, like in the rest of C#. That’s why we all love strong type systems. It’s scary as all-get-out to not know until runtime whether your LINQ query is actually valid. Even semantically, there’s very little difference between a method that is “not supported” and a method that is “not implemented.”
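For what it’s worth, here’s a small sketch of why the compiler can’t help. When a lambda is passed where an Expression is expected, the C# compiler doesn’t compile the lambda body into executable code; it builds a data structure describing it, and the query provider decides at runtime whether it can translate each node. The User class and the DescribeBody helper are hypothetical, and the sketch deliberately stops short of Entity Framework itself: it only inspects the tree and says nothing about which nodes any particular provider supports.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical entity with a custom Equals, as in the scenario above.
public class User
{
    public string Name { get; set; }

    public override bool Equals(object obj)
    {
        var other = obj as User;
        return other != null && other.Name == Name;
    }

    public override int GetHashCode()
    {
        return Name == null ? 0 : Name.GetHashCode();
    }
}

public static class ExpressionTreeSketch
{
    // Reports what kind of node sits at the root of the predicate's body.
    public static string DescribeBody(Expression<Func<User, bool>> predicate)
    {
        var call = predicate.Body as MethodCallExpression;
        return call == null
            ? predicate.Body.NodeType.ToString()
            : "call to " + call.Method.Name;
    }

    public static void Main()
    {
        var target = new User { Name = "ada" };

        // Both lambdas type-check. Neither is executed here; each becomes a tree.
        Console.WriteLine(DescribeBody(u => u.Equals(target))); // prints call to Equals
        Console.WriteLine(DescribeBody(u => u.Name == "ada"));  // prints Equal
    }
}
```

As far as the compiler is concerned, both predicates are perfectly valid expressions of the same type. Whether a given node can become SQL is a question only the provider can answer, and it answers at runtime.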

Abstraction & Efficiency

Advancements in abstraction often result in cleaner code and better software at the expense of speed. I’d rather just write in a high-level language than try to optimize my own assembly code. In fact, I almost never even look under the hood. A compiler will abstract away things like individual instructions or memory addresses, and I trust the compiler to produce something pretty well optimized most of the time.

Entity Framework is a logical end to that, abstracting away the database work and bringing that logic up the stack and into the sexier, more robust, and ever-improving C#.

The trouble is that our current technology doesn’t support this abstraction, at least not yet. Users of Entity Framework still have to examine the runtime-generated SQL queries and try to optimize them, and Entity Framework just makes that harder to do.

Even SQL optimization aside, I need to know when my query executes so that I can optimize my application layer. For example, say I’m writing a controller method that returns a view that represents a user profile. To get my model, I call a repository method that returns an IQueryable of all users. Then, I put an additional where clause on the end to filter down to the requested user. This kind of software design is a huge strength of Entity Framework.

Proper object-oriented abstraction dictates that the caller shouldn’t have to know how the called code is implemented in order to use it, but here the point of query execution cannot be ignored. The controller has to count on the repository being implemented in a way that doesn’t execute the query before the additional where clause is tacked on, or else suffer the consequences of synchronously loading thousands of records. Even the writer of the repository may not be aware of that unless they are familiar with the nuances of how the framework is implemented.
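Here’s a rough sketch of that trap. It’s an in-memory stand-in under stated assumptions, not Entity Framework: UserRepository, its fake 1000-row table, GetAllDeferred, GetAllEager, and RowsLoadedFor are all hypothetical names, and rows are counted as they are produced so the difference is visible. With Entity Framework the deferred version would return an IQueryable so the filter could reach the SQL, but the point about timing is the same.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class User
{
    public int Id;
    public string Name;
}

public class UserRepository
{
    // How many rows the fake table has produced so far.
    public int RowsLoaded { get; private set; }

    // Hypothetical stand-in for a 1000-row users table.
    private IEnumerable<User> Table()
    {
        for (var id = 1; id <= 1000; id++)
        {
            RowsLoaded++;
            yield return new User { Id = id, Name = "user" + id };
        }
    }

    // Deferred: hands back the query without running it.
    public IEnumerable<User> GetAllDeferred()
    {
        return Table();
    }

    // Eager: one innocent-looking ToList() pulls every row before returning.
    public IEnumerable<User> GetAllEager()
    {
        return Table().ToList();
    }
}

public static class ControllerSketch
{
    // The "controller" code is the same either way: fetch all, then filter.
    public static int RowsLoadedFor(bool eager)
    {
        var repo = new UserRepository();
        var all = eager ? repo.GetAllEager() : repo.GetAllDeferred();
        var user = all.Where(u => u.Id == 42).First();
        return repo.RowsLoaded;
    }

    public static void Main()
    {
        Console.WriteLine(RowsLoadedFor(false)); // prints 42
        Console.WriteLine(RowsLoadedFor(true));  // prints 1000
    }
}
```

Both repository methods have the same signature, and the controller’s code is identical in both cases. Only the repository’s hidden choice decides how much data gets loaded.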

This is all because Entity Framework is hiding something that should be exposed. I hope to see the day when hardware and networks will be able to bear the burden of lazy loading without the need to expose the underlying runtime code, but for the time being, I’d like to stick to my own handcrafted, artisanal SQL.


Civic tech fanatic. Senior ASP.NET MVC / C# developer. Web Bureaucrat. Opinions are mine, but the bugs are all from the previous maintainer.