概要
本文提供一个EF Core的优化案例,主要介绍一些EF Core常用的优化方法,以及在优化过程中,出现性能反复的时候的解决方法,并澄清一些对优化概念的误解,例如AsNoTracking并不包治百病。
本文使用的是Dotnet 6.0和EF Core 7.0。
代码及实现
背景介绍
本文主要使用一个图书和作者的案例,用于介绍优化过程。
- 一个作者Author有多本自己写的的图书Book
- 一本图书Book有一个发行商Publisher
- 一个作者Author是一个系统用户User
- 一个系统用户User有多个角色Role
本实例中Author表和Book数据量较大,记录数量全部过万条,其它数据表记录大概都是几十或几百条。具体实体类定义请见附录。
查询需求
我们需要查找写书最多的前两名作家,该作家需要年龄在20岁以上,国籍是法国。需要他们的FirstName, LastName, Email,UserName以及在1900年以前他们发行的图书信息,包括书名Name和发行日期Published。
基本优化思路
本人做EF Core的复杂查询优化,并不推荐直接查看生成的SQL代码,我习惯按照如下方式进行:
首先,进行EF的LINQ代码检查(初筛),找到明显的错误。
- 查看代码中是否有基本错误,主要针对全表载入的问题。例如EF需要每一步的LINQ扩展方法的返回值都是IQueryable类型,不能有IEnumerable类型;
- 查看是否有不需要的栏位;
- 根据情况决定是否加AsNoTracking,注意这个东西有时候加了也没用;
其次,找到数据量较大的表,进行代码整理和针对大数据表的优化(精细化调整)
- 在操作大数据表时候,先要进行基本的过滤;
- 投影操作Select应该放到排序操作后面;
- 减少返回值数量,推荐进行分页操作;
本人推荐一旦出现性能反复的时候或者代码整体基本完成的时候,再去查看生成的SQL代码。
初始查询代码
public List<AuthorWeb> GetAuthors() {using var dbContext = new AppDbContext();var authors = dbContext.Authors.Include(x => x.User).ThenInclude(x => x.UserRoles).ThenInclude(x => x.Role).Include(x => x.Books).ThenInclude(x => x.Publisher).ToList().Select(x => new AuthorWeb{UserCreated = x.User.Created,UserEmailConfirmed = x.User.EmailConfirmed,UserFirstName = x.User.FirstName,UserLastActivity = x.User.LastActivity,UserLastName = x.User.LastName,UserEmail = x.User.Email,UserName = x.User.UserName,UserId = x.User.Id,RoleId = x.User.UserRoles.FirstOrDefault(y => y.UserId == x.UserId).RoleId,BooksCount = x.BooksCount,AllBooks = x.Books.Select(y => new BookWeb{Id = y.Id,Name = y.Name,Published = y.Published,ISBN = y.ISBN,PublisherName = y.Publisher.Name}).ToList(),AuthorAge = x.Age,AuthorCountry = x.Country,AuthorNickName = x.NickName,Id = x.Id}).ToList().Where(x => x.AuthorCountry == "France" && x.AuthorAge == 20).ToList();var orderedAuthors = authors.OrderByDescending(x => x.BooksCount).ToList().Take(2).ToList();List<AuthorWeb> finalAuthors = new List<AuthorWeb>();foreach (var author in orderedAuthors){List<BookWeb> books = new List<BookWeb>();var allBooks = author.AllBooks;foreach (var book in allBooks){if (book.Published.Year < 1900){book.PublishedYear = book.Published.Year;books.Add(book);}}author.AllBooks = books;finalAuthors.Add(author);}return finalAuthors;}
Benchmark测试后,系统资源使用情况如下:
代码性能非常差,内存消耗很大,一次执行就要消耗190MB,执行时间超过2s。如果是放到WebAPI里面调用,用户会有明显的卡顿感觉;如果面临高并发的情况,很可得会造成服务器资源紧张,返回各种500错误。
优化代码
初筛
按照我们的优化思路,在查看上面的代码后,发现一个严重的问题。
虽然每次LINQ查询返回都是IQueryable类型,但是源码中有多个ToList(),尤其是第一个,它的意思是将Author, Book,User,Role,Publisher等多个数据表的数据全部载入,前面已经说了,Author, Book两张表的数据量很大,必然影响性能。
我们需要删除前面多余的ToList(),只保留最后的即可。请参考附录中的方法GetAuthors_RemoveToList()。
在GetAuthors_RemoveToList()基础上,对照用户的需求,发现查询结果中包含了Role相关的信息和很多Id信息,但是查询结果并不需要这些,因此必须删掉。请参考附录中的方法GetAuthorsOptimized_RemoveColumn()
在GetAuthorsOptimized_RemoveColumn的基础上,我们再加入AsNoTracking方法。请参考附录中的方法GetAuthorsOptimized_AsNoTracking()
我们在Benchmark中,测试上面提到的三个方法,直接结果如下:
从Benchmark的测试结果上看,删除多余ToList方法和删除多余的栏位,确实带来了性能的大幅提升。
但是增加AsNoTracking,性能提反而下降了一点。这也说明了AsNoTracking并不是适用所有场景。Select投影操作生成的AuthorWeb对像,并不是EF管理的,与DbContext无关,它只是作为前端API的返回值。相当于EF做了没有用的事,所以性能略有下降。
代码进一步调整
初筛阶段完成后,下面对代码进一步整理
下面Take和Order操作可以并入基本的查询中,Take可以帮助我们减少返回值的数量。请见 GetAuthorsOptimized_ChangeOrder()方法。
var orderedAuthors = authors.OrderByDescending(x => x.BooksCount).ToList().Take(2).ToList();
在GetAuthorsOptimized_ChangeOrder基础上,对于dbContext.Authors,Author是一张数据量很大的表,我们需要在其进行联表操作前,先过滤掉不需要的内容,所以我们可以把Where前提,还有就是将排序操作放到投影的Select前面完成。请见 GetAuthorsOptimized_ChangeOrder方法。
上面的两个优化方法的执行结果如下:
可以看到性略能有提升。
下面我们为了进一步提升性能,可以查看一下生成的SQL代码,看看是否还有优化的空间。
GetAuthorsOptimized_ChangeOrder方法生成的SQL如下:
SELECT [u].[FirstName], [u].[LastName], [u].[Email], [u].[UserName], [t].[
BooksCount], [t].[Id], [u].[Id], [b].[Name], [b].[Published], [b].[Id], [t].[Age
], [t].[Country]FROM (SELECT TOP(@__p_0) [a].[Id], [a].[Age], [a].[BooksCount], [a].[Country
], [a].[UserId]FROM [Authors] AS [a]WHERE [a].[Country] = N'France' AND [a].[Age] >= 20ORDER BY [a].[BooksCount] DESC) AS [t]INNER JOIN [Users] AS [u] ON [t].[UserId] = [u].[Id]LEFT JOIN [Books] AS [b] ON [t].[Id] = [b].[AuthorId]ORDER BY [t].[BooksCount] DESC, [t].[Id], [u].[Id]
从生成SQL来看,Author表在使用之前过滤掉了相关的内容,但是直接Left Join了[Books]这个大表。我们可以按照前面提到的1900年以前的查询要求,在左联之前先过滤一下,请参考 GetAuthorsOptimized_SelectFilter方法。
该方法执行后,生成的SQL如下:
SELECT [u].[FirstName], [u].[LastName], [u].[Email], [u].[UserName], [t].[
BooksCount], [t].[Id], [u].[Id], [t0].[Name], [t0].[Published], [t0].[Id], [t].[
Age], [t].[Country]FROM (SELECT TOP(@__p_1) [a].[Id], [a].[Age], [a].[BooksCount], [a].[Country
], [a].[UserId]FROM [Authors] AS [a]WHERE [a].[Country] = N'France' AND [a].[Age] >= 20ORDER BY [a].[BooksCount] DESC) AS [t]INNER JOIN [Users] AS [u] ON [t].[UserId] = [u].[Id]LEFT JOIN (SELECT [b].[Name], [b].[Published], [b].[Id], [b].[AuthorId]FROM [Books] AS [b]WHERE [b].[Published] < @__date_0) AS [t0] ON [t].[Id] = [t0].[AuthorId]ORDER BY [t].[BooksCount] DESC, [t].[Id], [u].[Id]
在左联之前,确实进行了过滤,该方法的性能测试如下:
在避免Book表直接进行左联后,性能有所提升。
最后一个优化点,是在EF Core 5.0里面提供了带Filter功能的Include方法,也可以用于本案例的优化,但是该特性但是存在一些局限性,具体请参考EF Core中带过滤器参数的Include方法
但是此方法又涉及了将IQueryable转换成IEnumerable的操作,最后要将生成的Author对象全部转换成AuthorWeb对象。代码过于繁琐,而且带来的性能提升也不明显。因此放弃这个点。
dbContext.Authors.AsNoTracking().Include(x => x.Books.Where(b => b.Published < date))......
结论
从这个优化过程来看,其实对性能提升最大的贡献就是删除多余的ToList(),避免全表载入和删除不需要的栏位两项。其它所谓更精细的优化,性能提升有限。
附录
实体类定义
public class Author{public int Id { get; set; }public int Age { get; set; }public string Country { get; set; }public int BooksCount { get; set; }public string NickName { get; set; }[ForeignKey("UserId")]public User User { get; set; }public int UserId { get; set; }public virtual List<Book> Books { get; set; } = new List<Book>();}public class Book{public int Id { get; set; }public string Name { get; set; }[ForeignKey("AuthorId")]public Author Author { get; set; }public int AuthorId { get; set; }public DateTime Published { get; set; }public string ISBN { get; set; }[ForeignKey("PublisherId")]public Publisher Publisher { get; set; }public int PublisherId { get; set; }}
public class Publisher{public int Id { get; set; }public string Name { get; set; }public DateTime Established { get; set; }}public class User{public int Id { get; set; }public string FirstName { get; set; }public string LastName { get; set; }public string UserName { get; set; }public string Email { get; set; }public virtual List<UserRole> UserRoles { get; set; } = new List<UserRole>();public DateTime Created { get; set; }public bool EmailConfirmed { get; set; }public DateTime LastActivity { get; set; }}public class Role{public int Id { get; set; }public virtual List<UserRole> UserRoles { get; set; } = new List<UserRole>();public string Name { get; set; }}public class AuthorWeb{public DateTime UserCreated { get; set; }public bool UserEmailConfirmed { get; set; }public string UserFirstName { get; set; }public DateTime UserLastActivity { get; set; }public string UserLastName { get; set; }public string UserEmail { get; set; }public string UserName { get; set; }public int UserId { get; set; }public int AuthorId { get; set; }public int Id { get; set; }public int RoleId { get; set; }public int BooksCount { get; set; }public List<BookWeb> AllBooks { get; set; }public int AuthorAge { get; set; }public string AuthorCountry { get; set; }public string AuthorNickName { get; set; }}public class BookWeb{public int Id { get; set; }public string Name { get; set; }public DateTime Published { get; set; }public int PublishedYear { get; set; }public string PublisherName { get; set; }public string ISBN { get; set; }}
优化方法
[Benchmark]
public List<AuthorWeb> GetAuthors() {using var dbContext = new AppDbContext();var authors = dbContext.Authors.Include(x => x.User).ThenInclude(x => x.UserRoles).ThenInclude(x => x.Role).Include(x => x.Books).ThenInclude(x => x.Publisher).ToList().Select(x => new AuthorWeb{UserCreated = x.User.Created,UserEmailConfirmed = x.User.EmailConfirmed,UserFirstName = x.User.FirstName,UserLastActivity = x.User.LastActivity,UserLastName = x.User.LastName,UserEmail = x.User.Email,UserName = x.User.UserName,UserId = x.User.Id,RoleId = x.User.UserRoles.FirstOrDefault(y => y.UserId == x.UserId).RoleId,BooksCount = x.BooksCount,AllBooks = x.Books.Select(y => new BookWeb{Id = y.Id,Name = y.Name,Published = y.Published,ISBN = y.ISBN,PublisherName = y.Publisher.Name}).ToList(),AuthorAge = x.Age,AuthorCountry = x.Country,AuthorNickName = x.NickName,Id = x.Id}).ToList().Where(x => x.AuthorCountry == "France" && x.AuthorAge >= 20).ToList();var orderedAuthors = authors.OrderByDescending(x => x.BooksCount).ToList().Take(2).ToList();List<AuthorWeb> finalAuthors = new List<AuthorWeb>();foreach (var author in orderedAuthors){List<BookWeb> books = new List<BookWeb>();var allBooks = author.AllBooks;foreach (var book in allBooks){if (book.Published.Year < 1900){book.PublishedYear = book.Published.Year;books.Add(book);}}author.AllBooks = books;finalAuthors.Add(author);}return finalAuthors;}[Benchmark]public List<AuthorWeb> GetAuthors_RemoveToList(){using var dbContext = new AppDbContext();var authors = dbContext.Authors.Include(x => x.User).ThenInclude(x => x.UserRoles).ThenInclude(x => x.Role).Include(x => x.Books).ThenInclude(x => x.Publisher)// .ToList().Select(x => new AuthorWeb{UserCreated = x.User.Created,UserEmailConfirmed = x.User.EmailConfirmed,UserFirstName = x.User.FirstName,UserLastActivity = x.User.LastActivity,UserLastName = x.User.LastName,UserEmail = x.User.Email,UserName = x.User.UserName,UserId = x.User.Id,RoleId = x.User.UserRoles.FirstOrDefault(y => y.UserId == x.UserId).RoleId,BooksCount = x.BooksCount,AllBooks = x.Books.Select(y => new BookWeb{Id = y.Id,Name = y.Name,Published = y.Published,ISBN = y.ISBN,PublisherName = y.Publisher.Name}).ToList(),AuthorAge = x.Age,AuthorCountry = x.Country,AuthorNickName = x.NickName,Id = x.Id})// .ToList().Where(x => x.AuthorCountry == "France" && x.AuthorAge >=20).ToList();var orderedAuthors = authors.OrderByDescending(x => x.BooksCount).ToList().Take(2).ToList();List<AuthorWeb> finalAuthors = new List<AuthorWeb>();foreach (var author in orderedAuthors){List<BookWeb> books = new List<BookWeb>();var allBooks = author.AllBooks;foreach (var book in allBooks){if (book.Published.Year < 1900){book.PublishedYear = book.Published.Year;books.Add(book);}}author.AllBooks = books;finalAuthors.Add(author);}return finalAuthors;}[Benchmark]public List<AuthorWeb> GetAuthorsOptimized_RemoveColumn(){using var dbContext = new AppDbContext();var authors = dbContext.Authors// .Include(x => x.User)//.ThenInclude(x => x.UserRoles)// .ThenInclude(x => x.Role)// .Include(x => x.Books)// .ThenInclude(x => x.Publisher).Select(x => new AuthorWeb{// UserCreated = x.User.Created,// UserEmailConfirmed = x.User.EmailConfirmed,UserFirstName = x.User.FirstName,// UserLastActivity = x.User.LastActivity,UserLastName = x.User.LastName,UserEmail = x.User.Email,UserName = x.User.UserName,// UserId = x.User.Id,// RoleId = x.User.UserRoles.FirstOrDefault(y => y.UserId == x.UserId).RoleId,BooksCount = x.BooksCount,AllBooks = x.Books.Select(y => new BookWeb{// Id = y.Id,Name = y.Name,Published = y.Published,// ISBN = y.ISBN,// PublisherName = y.Publisher.Name}).ToList(),AuthorAge = x.Age,AuthorCountry = x.Country,AuthorNickName = x.NickName,// Id = x.Id}).Where(x => x.AuthorCountry == "France" && x.AuthorAge >=20).ToList();var orderedAuthors = authors.OrderByDescending(x => x.BooksCount).ToList().Take(2).ToList();List<AuthorWeb> finalAuthors = new List<AuthorWeb>();foreach (var author in orderedAuthors){List<BookWeb> books = new List<BookWeb>();var allBooks = author.AllBooks;foreach (var book in allBooks){if (book.Published.Year < 1900){book.PublishedYear = book.Published.Year;books.Add(book);}}author.AllBooks = books;finalAuthors.Add(author);}return finalAuthors;}[Benchmark]public List<AuthorWeb> GetAuthorsOptimized_AsNoTracking(){using var dbContext = new AppDbContext();var authors = dbContext.Authors.AsNoTracking()// .Include(x => x.User)// .ThenInclude(x => x.UserRoles)// .ThenInclude(x => x.Role)// .Include(x => x.Books)// .ThenInclude(x => x.Publisher).Select(x => new AuthorWeb{//UserCreated = x.User.Created,// UserEmailConfirmed = x.User.EmailConfirmed,UserFirstName = x.User.FirstName,// UserLastActivity = x.User.LastActivity,UserLastName = x.User.LastName,UserEmail = x.User.Email,UserName = x.User.UserName,// UserId = x.User.Id,//RoleId = x.User.UserRoles.FirstOrDefault(y => y.UserId == x.UserId).RoleId,BooksCount = x.BooksCount,AllBooks = x.Books.Select(y => new BookWeb{// Id = y.Id,Name = y.Name,Published = y.Published,//ISBN = y.ISBN,//PublisherName = y.Publisher.Name}).ToList(),AuthorAge = x.Age,AuthorCountry = x.Country,//AuthorNickName = x.NickName,Id = x.Id}).Where(x => x.AuthorCountry == "France" && x.AuthorAge >=20).ToList();var orderedAuthors = authors.OrderByDescending(x => x.BooksCount).ToList().ToList();List<AuthorWeb> finalAuthors = new List<AuthorWeb>();foreach (var author in authors){List<BookWeb> books = new List<BookWeb>();var allBooks = author.AllBooks;foreach (var book in allBooks){if (book.Published.Year < 1900){book.PublishedYear = book.Published.Year;books.Add(book);}}author.AllBooks = books;finalAuthors.Add(author);}return finalAuthors;}[Benchmark]public List<AuthorWeb> GetAuthorsOptimized_Take_Order(){using var dbContext = new AppDbContext();var authors = dbContext.Authors.AsNoTracking().Select(x => new AuthorWeb{UserFirstName = x.User.FirstName,UserLastName = x.User.LastName,UserEmail = x.User.Email,UserName = x.User.UserName,BooksCount = x.BooksCount,AllBooks = x.Books.Select(y => new BookWeb{Name = y.Name,Published = y.Published,}).ToList(),AuthorAge = x.Age,AuthorCountry = x.Country,AuthorNickName = x.NickName,}).Where(x => x.AuthorCountry == "France" && x.AuthorAge >=20).OrderByDescending(x => x.BooksCount).Take(2).ToList();// var orderedAuthors = authors.OrderByDescending(x => x.BooksCount).ToList().ToList();List<AuthorWeb> finalAuthors = new List<AuthorWeb>();foreach (var author in authors){List<BookWeb> books = new List<BookWeb>();var allBooks = author.AllBooks;foreach (var book in allBooks){if (book.Published.Year < 1900){book.PublishedYear = book.Published.Year;books.Add(book);}}author.AllBooks = books;finalAuthors.Add(author);}return finalAuthors;}[Benchmark]public List<AuthorWeb> GetAuthorsOptimized_ChangeOrder(){using var dbContext = new AppDbContext();var authors = dbContext.Authors.AsNoTracking().Where(x => x.Country == "France" && x.Age >=20).OrderByDescending(x => x.BooksCount)// .Include(x => x.User)// .ThenInclude(x => x.UserRoles)// .ThenInclude(x => x.Role)// .Include(x => x.Books)// .ThenInclude(x => x.Publisher).Select(x => new AuthorWeb{//UserCreated = x.User.Created,// UserEmailConfirmed = x.User.EmailConfirmed,UserFirstName = x.User.FirstName,// UserLastActivity = x.User.LastActivity,UserLastName = x.User.LastName,UserEmail = x.User.Email,UserName = x.User.UserName,// UserId = x.User.Id,//RoleId = x.User.UserRoles.FirstOrDefault(y => y.UserId == x.UserId).RoleId,BooksCount = x.BooksCount,AllBooks = x.Books.Select(y => new BookWeb{// Id = y.Id,Name = y.Name,Published = y.Published,//ISBN = y.ISBN,//PublisherName = y.Publisher.Name}).ToList(),AuthorAge = x.Age,AuthorCountry = x.Country,//AuthorNickName = x.NickName,Id = x.Id}) .Take(2).ToList();// var orderedAuthors = authors.OrderByDescending(x => x.BooksCount).ToList().ToList();List<AuthorWeb> finalAuthors = new List<AuthorWeb>();foreach (var author in authors){List<BookWeb> books = new List<BookWeb>();var allBooks = author.AllBooks;foreach (var book in allBooks){if (book.Published.Year < 1900){book.PublishedYear = book.Published.Year;books.Add(book);}}author.AllBooks = books;finalAuthors.Add(author);}return finalAuthors;}// [Benchmark]public List<AuthorWeb> GetAuthorsOptimized_IncludeFilter(){using var dbContext = new AppDbContext();var date = new DateTime(1900, 1, 1);var authors = dbContext.Authors.AsNoTracking().Include(x => x.Books.Where(b => b.Published < date)).Include(x => x.User)// .IncludeFilter(x =>x.Books.Where(b =>b.Published.Year < 1900)).Where(x => x.Country == "France" && x.Age >=20).OrderByDescending(x => x.BooksCount)// .ThenInclude(x => x.UserRoles)// .ThenInclude(x => x.Role)// .Include(x => x.Books)// .ThenInclude(x => x.Publisher).Take(2).ToList();// var orderedAuthors = authors.OrderByDescending(x => x.BooksCount).ToList().ToList();var authorList = authors.AsEnumerable().Select(x => new AuthorWeb{//UserCreated = x.User.Created,// UserEmailConfirmed = x.User.EmailConfirmed,UserFirstName = x.User.FirstName,// UserLastActivity = x.User.LastActivity,UserLastName = x.User.LastName,UserEmail = x.User.Email,UserName = x.User.UserName,// UserId = x.User.Id,//RoleId = x.User.UserRoles.FirstOrDefault(y => y.UserId == x.UserId).RoleId,BooksCount = x.BooksCount,AllBooks = x.Books// .Where(b => b.Published < date).Select(y => new BookWeb{// Id = y.Id,Name = y.Name,Published = y.Published,//ISBN = y.ISBN,//PublisherName = y.Publisher.Name}).ToList(),AuthorAge = x.Age,AuthorCountry = x.Country,//AuthorNickName = x.NickName,// Id = x.Id}).ToList();return authorList;}[Benchmark]public List<AuthorWeb> GetAuthorsOptimized_SelectFilter(){using var dbContext = new AppDbContext();var date = new DateTime(1900, 1, 1);var authors = dbContext.Authors.AsNoTracking().Include(x => x.Books.Where(b => b.Published < date)).Where(x => x.Country == "France" && x.Age >=20).OrderByDescending(x => x.BooksCount).Select(x => new AuthorWeb{//UserCreated = x.User.Created,// UserEmailConfirmed = x.User.EmailConfirmed,UserFirstName = x.User.FirstName,// UserLastActivity = x.User.LastActivity,UserLastName = x.User.LastName,UserEmail = x.User.Email,UserName = x.User.UserName,// UserId = x.User.Id,//RoleId = x.User.UserRoles.FirstOrDefault(y => y.UserId == x.UserId).RoleId,BooksCount = x.BooksCount,AllBooks = x.Books.Where(b => b.Published < date).Select(y => new BookWeb{// Id = y.Id,Name = y.Name,Published = y.Published,//ISBN = y.ISBN,//PublisherName = y.Publisher.Name}).ToList(),AuthorAge = x.Age,AuthorCountry = x.Country,//AuthorNickName = x.NickName,// Id = x.Id}).Take(2).ToList();return authors;}