The N+1 problem is not a new challenge, yet it’s often overlooked until your system faces a deluge of data. Or you might have heard of it, but never grasped its full impact—until you’re forced to confront it head-on. It’s wise to understand and avoid this issue before it ruins your weekends, isn’t it?
All problems have early warning signs, you can easily find out thousands of definitions for this problem; However, have you ever wondered how and when you might notice this issue early?
Picture this: one beautiful day, your database server indicates it’s running out of CPU capacity while the server memory remains fine. If you find yourself in this situation, I’m sad to inform you that it’s a late sign of the N+1 problem.
Early important signs
As what I have experienced, it often comes from the way Hibernate retrieves the associated data along with the main entity.
Relationship | Fetch Type |
OneToOne | FetchType.EAGER |
ManyToOne | FetchType.EAGER |
OneToMany | FetchType.LAZY |
ManyToMany | FetchType.LAZY |
Since ManyToOne/ OneToOne relationship has the default fetch type in Hibernate that is EAGER, it will always run N+1 queries to eagerly fetch data unless you explicitly specify a different fetch type for a relationship.
Below is a practical example illustrating the difference between LAZY and EAGER fetch types.
When I call the findAll() function to get the list of jobs, one query is generated to find all the jobs, and N additional queries are used to fetch occupation data. Of course, I haven’t returned any information related to the location or occupation for users.
However, even with LAZY fetch, the N+1 problem still exists. LAZY fetch runs N queries to get data at the moment it is accessed, rather than when the initial query is executed. For instance, after the last query to calculate the total items, it runs N queries to get the location information if I want to return it to users.
Fixing N-1 queries
To address this issue and enhance performance, we need to eliminate those unnecessary N queries when retrieving data. Honestly, I’ve tried several common solutions found on the internet, but not all of them worked as expected. Below are some solutions that worked for my case, and I hope you find them useful too.
NamedEntityGraph
In Spring, a NamedEntityGraph is a feature used to define and manage the fetching strategy for entity associations more efficiently. It allows developers to specify the fetch type (EAGER or LAZY) for each attribute or association, ensuring that only the necessary data is retrieved in a single query at runtime. This optimizes query performance by reducing the number of database accesses. By controlling which related entities are loaded with the main entity, NamedEntityGraph provides fine-grained control over query performance and prevents the N+1 problem.
Additionally, NamedEntityGraph supports nested associations. This means developers can specify not only first-level associations but also second-level, third-level, and deeper associations to be fetched with the main entity.
Example:
@Entity @NamedEntityGraph(name = "graph.Job.details",attributeNodes = { @NamedAttributeNode("occupation"), @NamedAttributeNode("location") }) public class Job { @Id @GeneratedValue(strategy = GenerationType.IDENTITY) private Long id; private String title; @ManyToOne(fetch = FetchType.LAZY) @JoinColumn(name = "occupation_id") private Occupation occupation; @ManyToOne(fetch = FetchType.LAZY) @JoinColumn(name = "location_id") private Location location; // getters and setters }
Usage:
public interface JobRepository extends JpaRepository<Job, Long> { @EntityGraph(value = "graph.Job.details", type = EntityGraph.EntityGraphType.LOAD) List<Job> findAll(); }
Result:
Hibernate: select j1_0.id,j1_0.job_description,l1_0.id,l1_0.company_id,o1_0.id,o1_0.occupation_name,j1_0.title,j1_0.total_pv from jobs j1_0 left join company_locations l1_0 on l1_0.id=j1_0.location_id left join occupations o1_0 on o1_0.id=j1_0.occupation_id where (j1_0.is_deleted=0) limit ?,? Hibernate: select count(j1_0.id) from jobs j1_0 where (j1_0.is_deleted=0)
Join Fetch
Another approach you might have heard of is Join Fetch, which is used in JPQL (Java Persistence Query Language) or by simply adding the join operator in a native query. However, be cautious, as the behavior varies depending on the implementation you choose.
- When using the fetch function supported by CriteriaBuilder, if you have more than two associated EAGER entities, such as occupation and location, and you only specify one fetched join in your query, you might still encounter the issue of N queries being generated to fetch the remaining associated fields. This can happen if you don’t specify the LAZY fetch type for them, leading to unexpected behavior.
public List<Job> findLAllJobByCriteriaFetch() { CriteriaBuilder cb = entityManager.getCriteriaBuilder(); CriteriaQuery<Job> cq = cb.createQuery(Job.class); Root<Job> job = cq.from(Job.class); job.fetch("location", JoinType.LEFT); TypedQuery<Job> query = entityManager.createQuery(cq);return query.getResultList(); }
Result:
Hibernate: select j1_0.id,j1_0.job_description,l1_0.id,l1_0.company_id,o1_0.id,o1_0.occupation_name,j1_0.title,j1_0.total_pv from jobs j1_0 left join company_locations l1_0 on l1_0.id=j1_0.location_id left join occupations o1_0 on o1_0.id=j1_0.occupation_id where (j1_0.is_deleted=0) Hibernate: select o1_0.id,o1_0.occupation_name from occupations o1_0 where o1_0.id=? Hibernate: select o1_0.id,o1_0.occupation_name from occupations o1_0 where o1_0.id=?
- When using JPQL, whether in the repository layer or with CriteriaBuilder, the associated entities are fetched again at runtime. Regardless of how many associated entities were mentioned in the query, this can result in additional queries being executed.
public List<Job> findListAllByJPQL() { String jpql = "select job from jobs job left join fetch company_locations loc on job.location.id = loc.id"; TypedQuery<Job> query = entityManager.createQuery(jpql, Job.class); return query.getResultList(); }
Result:
Hibernate: select j1_0.id,j1_0.job_description,l1_0.id,l1_0.company_id,o1_0.id,o1_0.occupation_name,j1_0.title,j1_0.total_pv from jobs j1_0 left join company_locations cl1_0 on j1_0.location_id=cl1_0.id where (j1_0.is_deleted=0) Hibernate: select cl1_0.id,cl1_0.city,cl1_0.company_id from company_locations cl1_0 where cl1_0.id=? Hibernate: select o1_0.id,o1_0.occupation_name from occupations o1_0 where o1_0.id=? Hibernate: select cl1_0.id,cl1_0.city,cl1_0.company_id from company_locations cl1_0 where cl1_0.id=? Hibernate: select o1_0.id,o1_0.occupation_name from occupations o1_0 where o1_0.id=? Hibernate: select o1_0.id,o1_0.occupation_name from occupations o1_0 where o1_0.id=?
Customized DTOs
To simplify working with Hibernate entities and avoid N+1 queries, I recommend selecting only the necessary fields and returning them in a customized DTO (Data Transfer Object). This approach is especially useful for list APIs when you don’t want to fetch all the information.
@Query(value = """ select job.id as id,job.title as title,job.specialty as specialty,job.pr_text as prText,job.job_description as jobDescription, loc.location_name as locationName,job.posting_start_date as postingStartDate,job.posting_end_date as postingEndDate,job.total_pv as totalPv from jobs job left join company_locations loc on job.location_id = loc.idleft join occupations o on job.occupation_id = o.id""", nativeQuery = true) Page<IJobItem> findAllJobListByPagination(Pageable pageable);
Result:
Summary
In summary, there are various solutions to improve your application’s performance, ranging from basic to complex. Depending on your specific requirements, some solutions may be more suitable than others. However, it’s crucial to have a solid understanding of the impact of each implementation and to carefully manage every request to your database server. This will help prevent overloading your server before it accidentally ruins your weekend. Good luck, and I hope you find these suggestions helpful.
この情報は役に立ちましたか?
カテゴリー: