When you eagerly load related data in Entity Framework Core (EF Core) using Include and ThenInclude, the framework can either generate a single large SQL query with multiple JOIN operations or execute several smaller queries (known as split queries).
Selecting the appropriate strategy is essential, as an incorrect choice may lead to excessive row counts, high memory consumption, or additional network round trips. This article describes the concept of split queries, their purpose, safe usage practices, and scenarios where they outperform single-query approaches, along with practical patterns for production environments.
The problem: JOINs, duplication, and the "Cartesian explosion"
With a relational database, EF Core retrieves related data by combining tables with JOINs in a single query. JOINs are a basic SQL feature, but used carelessly they can cause significant performance overhead. The single-query approach works well until the query includes several collection navigations at the same level, at which point the result set can grow multiplicatively.
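The row growth can be illustrated without a database. The sketch below is not EF Core code; it uses hypothetical in-memory data (one customer with 10 orders and 5 addresses) and plain LINQ to mimic the row shape a single query with two sibling collection JOINs would return, and compares it with the total row count of three separate queries.

```csharp
using System;
using System.Linq;

// Hypothetical data: one customer with 10 orders and 5 addresses.
var customer = new { Id = 1, Name = "Contoso" };
var orders = Enumerable.Range(1, 10).ToArray();
var addresses = Enumerable.Range(1, 5).Select(i => $"Address {i}").ToArray();

// Joining both sibling collections to the same customer yields one row per
// (order, address) combination, and every row repeats the customer columns.
var joinedRows =
    from o in orders
    from a in addresses
    select new { customer.Id, customer.Name, OrderId = o, Address = a };

// Separate queries return the customer once, then each collection once.
var splitRows = 1 + orders.Length + addresses.Length;

Console.WriteLine($"Single query rows: {joinedRows.Count()}"); // 10 * 5 = 50
Console.WriteLine($"Split query rows:  {splitRows}");          // 1 + 10 + 5 = 16
```

Adding a third sibling collection multiplies the single-query row count again, while the split total only grows by the size of that collection.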

Even with a single collection include, a JOIN duplicates the principal entity's columns for every child row. When the principal has large columns, such as images or long text, this duplication gets expensive. To avoid retrieving unneeded large columns, the EF Core documentation suggests using projections.
var query = context.Customer
    .Include(o => o.Orders)
    .Include(a => a.Addresses)
    .Where(r => r.Id == 1);
Generated SQL
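One way to see the SQL for a query like the one above is ToQueryString(), available since EF Core 5.0. The sketch below assumes a hypothetical minimal model and an AppDbContext configured elsewhere with a relational provider; only Customer, Orders, Addresses, and Id come from the example, and every other name is illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

// Hypothetical minimal model; the foreign key names are assumptions.
public class Customer
{
    public int Id { get; set; }
    public List<Order> Orders { get; set; } = new();
    public List<Address> Addresses { get; set; } = new();
}
public class Order { public int Id { get; set; } public int CustomerId { get; set; } }
public class Address { public int Id { get; set; } public int CustomerId { get; set; } }

public class AppDbContext : DbContext
{
    public AppDbContext(DbContextOptions<AppDbContext> options) : base(options) { }
    public DbSet<Customer> Customer => Set<Customer>();
}

public static class QueryInspector
{
    public static void PrintGeneratedSql(AppDbContext context)
    {
        var query = context.Customer
            .Include(c => c.Orders)
            .Include(c => c.Addresses)
            .Where(c => c.Id == 1);

        // ToQueryString() returns the SQL text without executing the query.
        Console.WriteLine(query.ToQueryString());
    }
}
```

In single-query mode, the printed statement should contain a LEFT JOIN for each included collection, which is where the duplicated principal columns come from.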
Split queries instruct EF Core to divide a single LINQ query with Include statements into multiple SQL commands, typically one for each included collection. EF Core then assembles the results into the entity graph in memory, preventing the large, duplicated row sets that broad JOIN operations often produce.
Consider the example above:
Generated SQL
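As a minimal sketch, the example query can be switched to split mode by adding AsSplitQuery(), which ships with the EF Core relational packages. The model and context below are hypothetical stand-ins; only Customer, Orders, Addresses, and Id come from the example.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

// Hypothetical minimal model; the foreign key names are assumptions.
public class Customer
{
    public int Id { get; set; }
    public List<Order> Orders { get; set; } = new();
    public List<Address> Addresses { get; set; } = new();
}
public class Order { public int Id { get; set; } public int CustomerId { get; set; } }
public class Address { public int Id { get; set; } public int CustomerId { get; set; } }

public class AppDbContext : DbContext
{
    public AppDbContext(DbContextOptions<AppDbContext> options) : base(options) { }
    public DbSet<Customer> Customer => Set<Customer>();
}

public static class CustomerLoader
{
    public static List<Customer> LoadCustomer(AppDbContext context, int id) =>
        context.Customer
            .Include(c => c.Orders)
            .Include(c => c.Addresses)
            .Where(c => c.Id == id)
            .AsSplitQuery() // one SQL command per included collection instead of one wide JOIN
            .ToList();
}
```

With this model, EF Core would be expected to send three commands: one for the customer row, one for its orders, and one for its addresses. The entity graph returned to the caller has the same shape as in single-query mode.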
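When to Use Split Queries?
- Multiple sibling collection includes - When a query includes two or more collections at the same level, a single query multiplies their rows together, so split queries usually return far less data.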
- Large principal rows - If the principal entity contains large columns (such as images or BLOBs), JOIN duplication can significantly increase the payload size. In such cases, consider using split queries.
- Complex graphs with deep relationships - EF Core’s caution regarding single-query eager loading of collections still applies; for queries with heavy includes, split queries are generally the safer default.
Enabling split queries globally (at the Context level)
You can also make split queries the default behavior for your application's DbContext:
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
    optionsBuilder
        .UseSqlServer(
            connectionString,
            o => o.UseQuerySplittingBehavior(QuerySplittingBehavior.SplitQuery));
}
When split queries are set as the default, you can still configure individual queries to execute as single queries by calling AsSingleQuery().
Use caution when applying split queries in the following scenarios:
- Avoid Skip/Take with split queries unless ordering is unique - If you use split queries with Skip and Take in EF Core versions prior to 10, make sure the query has a fully unique ordering. Without one, each of the split SQL commands may select a different set of rows, and the results might be wrong.
- Prefer projections over Include for paged lists - Select only the data you need (e.g., project into DTOs) instead of including entire object graphs. This approach reduces payload size and prevents JOIN duplication.
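The three points above (a per-query override of the default, a unique ordering before Skip/Take, and a projection for paged lists) can be sketched together as follows. This is an illustration under assumptions: the Name property, the CustomerListItem DTO, the context class, and the zero-based page parameter are hypothetical and not part of the original example.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

// Hypothetical model: Name, the foreign keys, and the DTO are illustrative additions.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public List<Order> Orders { get; set; } = new();
    public List<Address> Addresses { get; set; } = new();
}
public class Order { public int Id { get; set; } public int CustomerId { get; set; } }
public class Address { public int Id { get; set; } public int CustomerId { get; set; } }

public class CustomerListItem
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public int OrderCount { get; set; }
}

public class AppDbContext : DbContext
{
    public AppDbContext(DbContextOptions<AppDbContext> options) : base(options) { }
    public DbSet<Customer> Customer => Set<Customer>();
}

public static class CustomerQueries
{
    // Opts this one query back into a single SQL statement
    // when SplitQuery is the context-wide default.
    public static Customer? GetWithOrders(AppDbContext context, int id) =>
        context.Customer
            .Include(c => c.Orders)
            .AsSingleQuery()
            .FirstOrDefault(c => c.Id == id);

    // Paged split query. Name alone may contain duplicates, so the key (Id)
    // is appended as a tie-breaker to make the ordering fully unique.
    public static List<Customer> GetPage(AppDbContext context, int page, int pageSize) =>
        context.Customer
            .Include(c => c.Orders)
            .Include(c => c.Addresses)
            .OrderBy(c => c.Name)
            .ThenBy(c => c.Id)
            .Skip(page * pageSize) // page is zero-based in this sketch
            .Take(pageSize)
            .AsSplitQuery()
            .ToList();

    // Paged projection: selects only the columns the list needs, no Include.
    public static List<CustomerListItem> GetPageProjected(AppDbContext context, int page, int pageSize) =>
        context.Customer
            .OrderBy(c => c.Name)
            .ThenBy(c => c.Id)
            .Skip(page * pageSize)
            .Take(pageSize)
            .Select(c => new CustomerListItem
            {
                Id = c.Id,
                Name = c.Name,
                OrderCount = c.Orders.Count
            })
            .ToList();
}
```

The projected variant never materializes Order or Address entities, so it sidesteps the single-versus-split decision for list pages where only summary data is shown.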
Conclusion
In EF Core, split queries are a useful and effective tool. When eagerly loading complicated object graphs, they help reduce JOIN-related performance problems and eliminate data duplication, but they also add extra round trips and possible consistency issues. The optimal strategy depends on the situation: whenever you combine split queries with pagination, make sure the ordering is unique; prefer projections for paging; and measure both strategies using production-like data.
Use Split Queries When:
- You include two or more collections at the same level.
- Principal entities contain large columns (e.g., binary or text) that cannot be avoided.
- You need predictable memory usage and reduced duplication.
Use Single Queries When:
- Includes are primarily references rather than collections.
- Network latency is high and additional round trips are expensive.
- Strong consistency is required and you prefer a single SQL statement.