Demystifying The Code

Is Lazy Loading in EF 4 Evil or the Second Coming?

The Entity Framework provides you with various options for loading related entities. In Entity Framework 4, you can choose between eager loading, explicit loading and now… lazy loading. Lazy loading was not available in version 1. A quick search on ‘Lazy Loading’ will yield opinions from 2 very different camps. Some folks see lazy loading as a necessity for an ORM, while others believe it to be evil. So, who is right? This post will examine that question. In case you are not familiar with all of the load options, I will start the post with a brief description of each.

 

The Code Scenario

We will use the following simple scenario to examine the various load options. In this scenario, we have 2 entities that we are surfacing as objects: Customer and Order. Each customer has 0 to many orders. Imagine the following code accessing the customer and its orders (in this case in a console application):

using (NorthwindContext nw = new NorthwindContext())
{
    var query = from c in nw.Customers
                where c.CompanyName.StartsWith("A")
                select c;

    foreach (Customer c in query)
    {
        Console.WriteLine("{0}", c.CompanyName);

        foreach (Order order in c.Orders)
        {
            Console.WriteLine("\t{0}", order.OrderDate.ToString());
        }
    }

    Console.ReadLine();
}

For the sake of this example, assume that we have 3 customers, with a total of 24 orders between them.

Eager Loading

With eager loading, you structure the initial query in such a way that all of the required objects are returned by that one query. If you are a SQL person, you can think of this as building a join query. In fact, if you are using the SqlClient provider, that is exactly what eventually gets executed against SQL Server.

How Do I Do Eager Loading?

You implement eager loading in LINQ to Entities with the ‘Include’ method.  An include specifies the related objects to include in the query results.  So, given the previous example, we would re-structure the original query to look like this:

var query = from c in nw.Customers.Include("Orders")
            where c.CompanyName.StartsWith("A")
            select c;

The Result

Taking a look at the screenshot of SQL Profiler, you will note that in order to return the 3 customers and the 24 associated orders, only 1 query was executed.  You can also see the join to the Orders table in the SQL statement that was executed.

image

Discussion

So is eager loading good? Is it bad? It depends upon the scenario. On a positive note, only 1 query had to be executed. This means that we needed only one connection and incurred the overhead of only one network round trip. On the other hand, every order was returned, resulting in a potentially large result set, and the initial query included a join, which is more expensive than a query against a single table.

In a scenario where a high percentage of the related objects are traversed, eager loading would be a great choice.  In a scenario where the object graph can be quite large and you sparingly traverse the related objects, it may be a poor choice.

 

Explicit Loading

With explicit loading, you explicitly request to load the related objects.  In essence, you run an original query to return an object or collection of objects.  When you want to process the related objects for that object (or an object in the collection), you explicitly request to have the related objects returned.

How Do I Do Explicit Loading?

You implement explicit loading by calling the Load method on an EntityCollection or EntityReference.  So, given the previous example, we would re-structure the code to look like this:

using (NorthwindContext nw = new NorthwindContext())
{
    var query = from c in nw.Customers
                where c.CompanyName.StartsWith("A")
                select c;

    foreach (Customer c in query)
    {
        Console.WriteLine("{0}", c.CompanyName);

        c.Orders.Load();
        foreach (Order order in c.Orders)
        {
            Console.WriteLine("\t{0}", order.OrderDate.ToString());
        }
    }

    Console.ReadLine();
}

Notice that I removed the call to Include.  Further note that prior to iterating over the Orders collection, I am explicitly calling Load.

The Result

From the profiler screenshot below, you will notice that 4 queries were run in total. The first query returned just the customer information. Then, I explicitly called ‘Load’ prior to enumerating over the orders for each of the 3 customers. Each call to Load made a separate call to the database with all of the associated overhead: a connection, network round trip, etc.

image

Discussion

As with eager loading, explicit loading can be performant or not depending upon the scenario. Again, in the scenario where a high percentage of the related objects are traversed, eager loading would likely be a better choice. However, in the second scenario, where the object graph can be quite large and you sparingly traverse the related objects, explicit loading may be a better choice.
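The trade-off can be made concrete by counting round trips. The following is a minimal sketch of that arithmetic only: it is a plain in-memory model with no Entity Framework involved, and the names RoundTripModel, Eager and OnDemand are made up for illustration.

```csharp
using System;

// Illustrative model only: counts database round trips, no Entity Framework involved.
static class RoundTripModel
{
    // Eager loading: customers and their orders come back in one joined query.
    public static int Eager()
    {
        return 1;
    }

    // Explicit (or lazy) loading: one query for the customers, plus one
    // query per customer whose Orders collection is actually loaded.
    public static int OnDemand(int customers, int customersTraversed)
    {
        if (customersTraversed < 0 || customersTraversed > customers)
            throw new ArgumentOutOfRangeException("customersTraversed");
        return 1 + customersTraversed;
    }
}

static class Program
{
    static void Main()
    {
        Console.WriteLine(RoundTripModel.Eager());         // 1
        Console.WriteLine(RoundTripModel.OnDemand(3, 3));  // 4
        Console.WriteLine(RoundTripModel.OnDemand(3, 1));  // 2
    }
}
```

With the 3 customers from the example, traversing every customer costs 4 round trips against 1 for eager loading. In this model, loading on demand never saves round trips; what it can save is the order data that is never pulled back for customers you do not traverse.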

 

Lazy Loading

With lazy loading, related objects are loaded automatically for you when you access them. You can think of it like explicit loading, except that Load is called for you automatically when you access the object in question. To be clear, the query is only run if the objects are not already in the ObjectContext.

How Do I Implement Lazy Loading?

If you are using the Entity Framework designer with the default code generation strategy, you are already set up for lazy loading. The default value of LazyLoadingEnabled for a context is false. However, the default code generation template sets the ContextOptions.LazyLoadingEnabled property to true in each of the constructors for the context (see screenshot below). (You can easily implement lazy loading for POCOs, as well. I’ll illustrate how to do that in a post next week.)

image

Given that lazy loading is set up, you do not need to do anything else (that is the lazy part).  You simply access the objects in question.  If the objects are not already in the context and you have not eagerly loaded them with an Include and you have not explicitly loaded them with Load, they will be loaded for you automatically.  With lazy loading enabled, the code listed under ‘The Code Scenario’ at the top of this post will work as-is.
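Because LazyLoadingEnabled is just a property on the context's ContextOptions, it can also be set by hand on a single context instance. The sketch below assumes the designer-generated NorthwindContext, Customer and Order types from this post; the class and method names are hypothetical.

```csharp
using System;
using System.Linq;

static class LazyLoadingToggle
{
    // Assumes the designer-generated NorthwindContext from this post.
    public static void PrintWithoutLazyLoading()
    {
        using (NorthwindContext nw = new NorthwindContext())
        {
            // Override what the generated constructor set, for this instance only.
            nw.ContextOptions.LazyLoadingEnabled = false;

            var query = from c in nw.Customers
                        where c.CompanyName.StartsWith("A")
                        select c;

            foreach (Customer c in query)
            {
                // No query is issued when Orders is accessed here, so the
                // collection only contains orders already in the context.
                Console.WriteLine("{0}: {1} orders in memory",
                    c.CompanyName, c.Orders.Count);
            }
        }
    }
}
```

Setting the property back to true on the same context re-enables the automatic behavior described above.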

The Result

The result is the same as the example of explicit loading.  In our simple example, 4 queries were run in all.

Discussion

The issue of relative performance is the same as explicit loading. 

 

The History of Lazy Loading and the Entity Framework

As mentioned at the beginning of this post, lazy loading was not available in the first version of the Entity Framework.  The mindset was that, with lazy loading, it was possible for someone to unknowingly make network calls.  As these calls are expensive, this was viewed as bad.  The decision was made to force people to be explicit when making a network call. 

This decision was not well received by everyone and was not without its issues. Take a look at the following excerpt from the now infamous ADO.NET Entity Framework Vote of No Confidence:

image

The reality was that virtually every ORM on the market supported lazy loading. The excerpt relayed that without this functionality, the developer is required to write unnecessary code to get the expected results. Take the code laid out in ‘The Code Scenario’ at the top of this document. Without lazy loading, the results of this code are shown below:

image

That is not an accurate depiction of the data. To the untrained eye, it appears that each customer has no orders, which is a bit of a false negative. In v1, the developer would have been required to add code to check whether the orders were loaded and, if not, call Load.
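That guard might look roughly like the following. It is a sketch that again assumes the generated NorthwindContext model from this post; IsLoaded and Load are members of EntityCollection, while the class and method names are hypothetical.

```csharp
using System;
using System.Linq;

static class V1StyleLoading
{
    // Assumes the designer-generated NorthwindContext from this post.
    public static void PrintCustomersWithOrders()
    {
        using (NorthwindContext nw = new NorthwindContext())
        {
            var query = from c in nw.Customers
                        where c.CompanyName.StartsWith("A")
                        select c;

            foreach (Customer c in query)
            {
                Console.WriteLine("{0}", c.CompanyName);

                // Only go to the database if the orders have not
                // already been loaded for this customer.
                if (!c.Orders.IsLoaded)
                {
                    c.Orders.Load();
                }

                foreach (Order order in c.Orders)
                {
                    Console.WriteLine("\t{0}", order.OrderDate.ToString());
                }
            }
        }
    }
}
```

Forgetting the check and the call to Load does not raise an error. It just produces the empty order lists shown above.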

 

Is Lazy Loading Evil or the Second Coming?

It should be mentioned at the outset that the existence of lazy loading (or explicit loading for that matter) does not preclude you from using eager loading in scenarios where that makes sense.  Another point to make is that after all is said and done, explicit loading and lazy loading offer the exact same performance benefits and challenges.  What this means to me is that if you have implemented explicit loading in the appropriate places, lazy loading is not evil at all.  It will behave exactly as though you had made the correct explicit calls.

If you buy into that logic, the remaining issue is code where eager loading should have been used but was not. Opponents of lazy loading would argue that, in this case, the uninformed developer would be making unnecessary calls to the database without knowing it (assuming that they never profile their code). Proponents of lazy loading might reply that a) one should profile one’s code and b) the only way to make the code work without lazy loading is to write a bunch of otherwise unnecessary code that checks whether the items were loaded and, if not, loads them. They might add that if this same uninformed developer failed to write that ancillary code, their solution would be riddled with false negatives.

So, who is right?  Both? Neither?  It is simply a matter of opinion.  There are folks on both sides that will never buy into the contrary viewpoint.  Given that both camps are going to theoretically be consumers of the Entity Framework, it should be clear that lazy loading should be an option.

Now, should it be on by default? In beta 2, it is kind of on by default and kind of not. LazyLoadingEnabled actually defaults to false. However, if you use the default code generation in the EF, it is set to true in all of the constructors for the context object. That is the way it is. The question is: is that ok?

That is actually a tougher question to answer. IMHO, that was a questionable choice. I’m actually ok with having LazyLoadingEnabled default to false. I would have been equally ok with having it default to true. I would probably lean further toward having it default to true, if pushed. However, it has to be one or the other and there are valid reasons for both. At some point you have to make a decision and I assume that is what they did. What I don’t really like is that if you use the default code gen template, it reverses this decision. I think that once you take a stance on the default, you should remain consistent. I understand that it is nice to illustrate how to set this property in the generated code. Perhaps a better decision would have been to (re)set it to false in the generated code and add a comment explaining how to set it to true.

To conclude, lazy loading is neither evil nor the second coming. It is a necessary feature of any ORM today and I’m glad it is in the EF.

Comments

3 Responses to “Is Lazy Loading in EF 4 Evil or the Second Coming?”
  1. Raghuraman says:

    Liked the article and Loved your Conclusion.

    It is better to have a choice. As the cliche goes, when to use it ? It Depends.

    Thanks for the write up.

  2. I like using eager loading in cases I have to display on the same screen data from N tables…
    If I use lazy or Explicit, I end up with X number of access to database as lines in the grid.

    On others section I use Lazy Loading so I only fetch from DataBase the Data I need to show..

    So my rule of thumb is:
    Get all data you need to display as quick and with less access to database.
    Not need data stays in database …
