Skip to main content

Spring Data JPA: Custom Queries with @Query

Who this guide is for

What is Spring Data JPA?โ€‹

Spring Data JPA is a Spring module that eliminates boilerplate data-access code. Instead of writing EntityManager calls manually, you declare a repository interface and Spring generates the implementation at runtime.

Your code Spring Data JPA Database
โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
UserRepository โ”€โ”€โ†’ Generated proxy (runtime) โ”€โ”€โ†’ SQL
.findById(1) EntityManager.find() SELECT * FROM users WHERE id = 1

The building blocksโ€‹

LayerResponsibilityExample
EntityMaps a Java class to a DB table@Entity class User
RepositoryInterface declaring data-access operationsUserRepository extends JpaRepository
JpaRepositorySpring-provided base with save, findById, delete, pagination, etc.No code needed
@QueryOverride the generated SQL with your own JPQL or native SQLWhen derived methods aren't enough

What Spring generates without any custom codeโ€‹

// You write this interface โ€” nothing else
public interface UserRepository extends JpaRepository<User, Long> { }

// Spring generates at startup โ€” all of these work for free:
userRepository.findById(1L); // SELECT * FROM users WHERE id = 1
userRepository.save(user); // INSERT or UPDATE
userRepository.deleteById(1L); // DELETE FROM users WHERE id = 1
userRepository.findAll(pageable); // SELECT * FROM users LIMIT ? OFFSET ?
userRepository.count(); // SELECT COUNT(*) FROM users

Why @Query?โ€‹

Derived query methods โ€” the convenient starting pointโ€‹

Spring Data JPA can generate queries by parsing method names. No SQL or JPQL required:

// Spring reads the method name and generates the SQL automatically
List<User> findByStatusAndDepartmentId(String status, Long departmentId);
// โ†’ SELECT * FROM users WHERE status = ? AND department_id = ?

List<User> findByFirstNameContainingIgnoreCase(String keyword);
// โ†’ SELECT * FROM users WHERE LOWER(first_name) LIKE LOWER('%?%')

List<User> findTop5ByStatusOrderByCreatedAtDesc(String status);
// โ†’ SELECT * FROM users WHERE status = ? ORDER BY created_at DESC LIMIT 5

When derived methods break downโ€‹

As queries grow complex, method names become unreadable and unmaintainable:

// โŒ Derived method โ€” technically works, practically unreadable
List<User> findByStatusAndDepartmentIdAndCreatedAtAfterAndRoleIn(
String status, Long departmentId, LocalDate date, List<String> roles);

// โœ… @Query โ€” clear, explicit, maintainable
@Query("""
SELECT u FROM User u
WHERE u.status = :status
AND u.department.id = :deptId
AND u.createdAt > :since
AND u.role IN :roles
""")
List<User> findFilteredUsers(
@Param("status") String status,
@Param("deptId") Long deptId,
@Param("since") LocalDate since,
@Param("roles") List<String> roles);

When to use @Queryโ€‹

SituationUse @Query?Reason
Simple equality / range filter on one or two fieldsโŒDerived method is cleaner
Three or more filter conditionsโœ…Derived method name becomes unreadable
JOIN across related entitiesโœ…Derived methods can't express JOINs well
Aggregation (SUM, AVG, COUNT, GROUP BY)โœ…Not expressible as derived method
Database-specific function (PostgreSQL JSONB, window functions)โœ…Requires native SQL
Bulk UPDATE or DELETEโœ…Requires @Modifying + @Query
Return a DTO/projection instead of a full entityโœ…Constructor expression or interface projection

JPQL vs Native SQLโ€‹

Before writing queries, understand the difference between the two query modes:

JPQL (Java Persistence Query Language)โ€‹

JPQL operates on entity classes and their fields, not on database tables and columns. Hibernate translates JPQL into SQL at runtime.

// JPQL โ€” references the Java entity name "User" and field "firstName"
@Query("SELECT u FROM User u WHERE u.firstName = :name")

// Hibernate translates to SQL โ€” references the actual table "users" and column "first_name"
// โ†’ SELECT u.id, u.first_name, u.last_name, ... FROM users u WHERE u.first_name = ?

Advantages of JPQL:

  • Database-agnostic โ€” works on PostgreSQL, MySQL, Oracle, H2 without changes.
  • Hibernate can apply dialect-specific optimisations during AST translation.
  • Works with Hibernate's L1/L2 caches and lazy loading.
  • Refactoring entity field names (with an IDE) updates all JPQL references.

Native SQLโ€‹

Native SQL is sent directly to the database without JPQL parsing. Use it when you need database-specific features unavailable in JPQL.

// Native SQL โ€” references the actual table and column names
@Query(
value = "SELECT * FROM users WHERE properties->>'timezone' = :tz",
nativeQuery = true
)
List<User> findByTimezone(@Param("tz") String timezone);

When to use native SQL:

  • Database-specific functions: PostgreSQL JSONB, ARRAY, LATERAL, WITH (CTEs).
  • Window functions (ROW_NUMBER(), RANK(), LAG()).
  • Full-text search with tsvector / MATCH AGAINST.
  • Complex UNION, INTERSECT, or recursive CTEs.
Architectural trade-off with native SQL

Native queries bypass Hibernate's dialect translation. Migrating from MySQL to PostgreSQL requires rewriting every nativeQuery = true query. Always add a comment documenting the database dependency, and centralise native queries in one repository to make a future migration manageable.

Side-by-side comparisonโ€‹

JPQLNative SQL
Operates onEntity classes & fieldsDB tables & columns
Database portabilityโœ… PortableโŒ DB-specific
DB-specific functionsโŒ Not supportedโœ… Fully supported
Hibernate cache integrationโœ… Fullโš ๏ธ Limited
Performance optimisationHibernate ASTRaw driver
Refactoring safetyโœ… IDE-assistedโŒ Manual
When to useDefault choiceDB-specific features only

Parameter Bindingโ€‹

Injecting variables into queries safely is fundamental. Spring Data JPA prevents SQL injection by using prepared statement placeholders โ€” parameters are never concatenated into the query string.

Positional parametersโ€‹

Parameters are referenced by their position in the method signature โ€” ?1 for the first argument, ?2 for the second, and so on.

@Query("SELECT u FROM User u WHERE u.firstName = ?1 AND u.lastName = ?2")
Optional<User> findByFullName(String firstName, String lastName);
Positional parameters are fragile

If you reorder method parameters during a refactor, positional references silently break โ€” ?1 now maps to the wrong argument. Prefer named parameters.

Use @Param("name") to bind parameters explicitly by name. The query reference is :name. The order of method arguments no longer matters.

@Query("""
SELECT u FROM User u
WHERE u.firstName = :firstName
AND u.lastName = :lastName
AND u.status = :status
""")
Optional<User> findByFullNameAndStatus(
@Param("firstName") String firstName,
@Param("lastName") String lastName,
@Param("status") String status
);

Passing collectionsโ€‹

JPQL's IN clause accepts a Collection parameter directly โ€” no manual string-building needed:

@Query("SELECT u FROM User u WHERE u.role IN :roles AND u.status = :status")
List<User> findByRolesAndStatus(
@Param("roles") List<String> roles,
@Param("status") String status
);

// Usage
repository.findByRolesAndStatus(List.of("ADMIN", "MANAGER"), "ACTIVE");
// โ†’ SELECT ... FROM users WHERE role IN ('ADMIN', 'MANAGER') AND status = 'ACTIVE'

LIKE patternsโ€‹

JPQL does not auto-wrap :param in wildcards. You must either provide them from the caller or use CONCAT:

// Option 1: wrap at call site
@Query("SELECT u FROM User u WHERE u.email LIKE :pattern")
List<User> findByEmailPattern(@Param("pattern") String pattern);
// Caller: repository.findByEmailPattern("%@gmail.com")

// Option 2: wrap inside the query with CONCAT
@Query("SELECT u FROM User u WHERE u.email LIKE CONCAT('%', :domain, '%')")
List<User> findByEmailDomain(@Param("domain") String domain);
// Caller: repository.findByEmailDomain("gmail.com")

Sorting and Paginationโ€‹

Dynamic sortingโ€‹

Pass a Sort object as a parameter โ€” Spring Data appends an ORDER BY clause to your JPQL at runtime without you writing it:

@Query("SELECT u FROM User u WHERE u.status = :status")
List<User> findByStatus(@Param("status") String status, Sort sort);

// Usage โ€” build Sort programmatically
Sort sort = Sort.by(Sort.Direction.DESC, "createdAt")
.and(Sort.by(Sort.Direction.ASC, "lastName"));

repository.findByStatus("ACTIVE", sort);
// โ†’ SELECT ... FROM users WHERE status = 'ACTIVE'
// ORDER BY created_at DESC, last_name ASC

Pagination with Pageableโ€‹

Pageable adds LIMIT + OFFSET and returns a Page<T> that also contains total count metadata (used for building pagination UI):

@Query("SELECT u FROM User u WHERE u.role = :role")
Page<User> findByRolePaginated(@Param("role") String role, Pageable pageable);

// Usage
Pageable pageable = PageRequest.of(
0, // page number (0-indexed)
20, // page size
Sort.by("createdAt").descending() // optional sort
);

Page<User> page = repository.findByRolePaginated("ADMIN", pageable);
page.getContent(); // List<User> โ€” the 20 records
page.getTotalElements(); // total matching records (e.g. 847)
page.getTotalPages(); // 847 / 20 = 43 pages
page.hasNext(); // true
Pagination requires a count query

Page<T> automatically runs a SELECT COUNT(*) alongside your data query. If your main query is expensive (many JOINs), the count query may also be slow. You can provide a separate, optimised count query:

@Query(
value = "SELECT u FROM User u JOIN FETCH u.roles WHERE u.status = :status",
countQuery = "SELECT COUNT(u) FROM User u WHERE u.status = :status"
)
Page<User> findActiveUsersWithRoles(@Param("status") String status, Pageable pageable);

Data Modification โ€” @Modifyingโ€‹

@Query alone only reads data. To execute UPDATE or DELETE statements, pair it with @Modifying:

// โœ… UPDATE โ€” returns number of affected rows
@Modifying
@Transactional
@Query("UPDATE User u SET u.status = 'INACTIVE' WHERE u.lastLoginDate < :threshold")
int deactivateDormantAccounts(@Param("threshold") LocalDate threshold);

// โœ… DELETE โ€” returns number of deleted rows
@Modifying
@Transactional
@Query("DELETE FROM User u WHERE u.status = 'DELETED' AND u.updatedAt < :before")
int purgeDeletedUsers(@Param("before") LocalDate before);
Always add @Transactional to @Modifying methods

@Modifying without @Transactional throws InvalidDataAccessApiUsageException at runtime. Either annotate the repository method or ensure the calling service method is transactional.

Bulk operation rulesโ€‹

RuleWhy
Always use @Modifying for UPDATE/DELETEWithout it, Spring assumes read-only โ€” exception at runtime
Add @TransactionalDML requires an active transaction
Use clearAutomatically = trueClears the L1 cache after the query โ€” prevents stale entity reads (see senior section)
Use flushAutomatically = trueFlushes pending entity changes before executing โ€” prevents lost updates

SpEL Expressionsโ€‹

Spring Expression Language (SpEL) can be embedded in @Query strings inside #{ ... }. The most useful built-in SpEL variable is #entityName โ€” it resolves to the entity name declared on the @Entity annotation, enabling reusable base queries in generic repository code.

// Without SpEL โ€” hardcoded entity name breaks if the entity is renamed
@Query("SELECT u FROM User u WHERE u.email LIKE %:domain%")

// With SpEL โ€” resolves the entity name dynamically
@Query("SELECT e FROM #{#entityName} e WHERE e.email LIKE %:domain%")
List<T> findByEmailDomain(@Param("domain") String domain);

Most common SpEL use cases:

// In a generic base repository shared by many entities
@NoRepositoryBean
public interface BaseRepository<T, ID> extends JpaRepository<T, ID> {

// #{#entityName} โ†’ resolves to "User", "Product", "Order", etc. at runtime
@Query("SELECT e FROM #{#entityName} e WHERE e.createdBy = :userId")
List<T> findAllCreatedBy(@Param("userId") Long userId);
}

// Works for all entities that extend BaseRepository
public interface UserRepository extends BaseRepository<User, Long> { }
public interface ProductRepository extends BaseRepository<Product, Long> { }

Projections โ€” Return Only What You Needโ€‹

By default, Spring Data returns fully hydrated entity objects. This is wasteful when you only need 2โ€“3 fields from a 30-column table โ€” Hibernate must read, allocate, and dirty-track the entire entity graph.

Interface projections (Spring Data feature)โ€‹

Declare a Java interface with getter methods matching the fields you want. Spring generates a proxy that reads only those columns.

// Projection interface โ€” only id, firstName, email
public interface UserSummary {
Long getId();
String getFirstName();
String getEmail();
}

// Return the projection type directly โ€” no @Query needed for simple cases
public interface UserRepository extends JpaRepository<User, Long> {
List<UserSummary> findByStatus(String status);
// Spring generates: SELECT u.id, u.first_name, u.email FROM users WHERE status = ?
}

Class-based DTO projections with @Queryโ€‹

For complex queries (aggregations, computed fields, multiple entities), use JPQL constructor expressions to project results directly into a DTO class:

// UserSummaryDto.java โ€” a plain class, not an entity
@Getter
@AllArgsConstructor
public class UserSummaryDto {
private final Long id;
private final String firstName;
private final String email;
private final Long orderCount; // computed field โ€” not on the entity
}

// Repository
@Query("""
SELECT new dev.example.dto.UserSummaryDto(
u.id, u.firstName, u.email, COUNT(o.id)
)
FROM User u
LEFT JOIN u.orders o
WHERE u.status = 'ACTIVE'
GROUP BY u.id, u.firstName, u.email
""")
List<UserSummaryDto> getActiveUserSummaries();
Why DTO projections matter for performance
Full entityInterface projectionDTO projection
Columns fetchedAllOnly declared gettersOnly constructor args
Hibernate trackingโœ… Dirty-checkedโŒ No trackingโŒ No tracking
Proxy overheadHeavyLight proxyNone
Memory usageHighLowLowest
Use forWrite operationsSimple readsComplex reads / aggregations

Complete Use-Case Examplesโ€‹

@Query("""
SELECT u FROM User u
WHERE (:status IS NULL OR u.status = :status)
AND (:role IS NULL OR u.role = :role)
AND (:search IS NULL OR LOWER(u.firstName) LIKE LOWER(CONCAT('%', :search, '%'))
OR LOWER(u.email) LIKE LOWER(CONCAT('%', :search, '%')))
""")
Page<User> searchUsers(
@Param("status") String status,
@Param("role") String role,
@Param("search") String search,
Pageable pageable
);
// Caller passes null for fields that should not filter
// repository.searchUsers("ACTIVE", null, "alice", PageRequest.of(0, 20))

Persistence Context Internalsโ€‹

Senior deep-dive starts here

The sections below focus on JVM memory mechanics, Hibernate internals, and production performance patterns. Understanding these separates developers who write working queries from engineers who write efficient ones.

The L1 cache (First-Level Cache)โ€‹

Every JPA transaction has a persistence context โ€” Hibernate's L1 cache. It is a Map<EntityKey, Object> that lives for the duration of the transaction. Every entity you load is placed in this map.

Transaction starts
โ”‚
โ”œโ”€ em.find(User.class, 1L) โ†’ MISS โ†’ SELECT FROM db โ†’ stores in L1 cache
โ”œโ”€ em.find(User.class, 1L) โ†’ HIT โ†’ returns from cache, no SQL
โ”œโ”€ em.find(User.class, 2L) โ†’ MISS โ†’ SELECT FROM db โ†’ stores in L1 cache
โ”‚
โ”œโ”€ user.setName("Alice Updated") โ†’ entity is "dirty" in the cache
โ”‚
Transaction commits โ†’ Hibernate detects dirty entity โ†’ flushes UPDATE to DB
โ”‚
Cache discarded

Key implications:

  • Two calls to findById(1L) in the same transaction hit the DB only once โ€” safe to call freely.
  • Hibernate tracks every managed entity for changes (dirty checking) at flush time. Loading 10,000 entities means Hibernate compares 10,000 objects against their snapshots at commit โ€” expensive.
  • The L1 cache is not shared between transactions or threads.

The stale-cache problem with @Modifyingโ€‹

A bulk UPDATE query executes as a single SQL statement directly against the database, completely bypassing the L1 cache. If you had already loaded an entity in the same transaction, your Java object is now out of sync with the database:

@Transactional
public void example() {
// Step 1: loads User id=1 into L1 cache โ†’ status = "ACTIVE"
User user = userRepository.findById(1L).orElseThrow();
System.out.println(user.getStatus()); // "ACTIVE"

// Step 2: bulk UPDATE bypasses the cache entirely
userRepository.deactivateDormantAccounts(LocalDate.now().minusMonths(6));
// DB row for id=1 is now status = "INACTIVE"

// Step 3: reads from L1 cache โ€” returns stale data!
System.out.println(user.getStatus()); // โŒ still "ACTIVE" โ€” stale!

// Step 4: even re-fetching by ID returns from cache
User reloaded = userRepository.findById(1L).orElseThrow();
System.out.println(reloaded.getStatus()); // โŒ still "ACTIVE" โ€” cache hit!
}

The fix โ€” clearAutomatically = true:

@Modifying(
clearAutomatically = true, // evicts all entities from L1 cache after execution
flushAutomatically = true // flushes pending dirty entities to DB before execution
)
@Query("UPDATE User u SET u.status = 'INACTIVE' WHERE u.lastLoginDate < :threshold")
int deactivateDormantAccounts(@Param("threshold") LocalDate threshold);
With clearAutomatically = true:

Step 1: findById(1) โ†’ loads into L1 cache
Step 2: @Modifying UPDATE โ†’ executes SQL โ†’ clears L1 cache
Step 3: findById(1) โ†’ MISS (cache cleared) โ†’ fresh SELECT from DB โœ…
clearAutomatically clears the entire L1 cache

All currently tracked entities are evicted โ€” not just the ones affected by your query. Any entity loaded before this call that you planned to use afterward must be re-fetched. Design your transaction flow to run bulk modifications before loading entities you need afterward.


The N+1 Problemโ€‹

N+1 is the most common JPA performance mistake. It occurs when loading a list of N entities, then accessing a lazy-loaded relationship on each โ€” triggering N additional queries:

@Entity
public class User {
@OneToMany(fetch = FetchType.LAZY) // lazy โ€” not loaded with the user
private List<Order> orders;
}

// The N+1 scenario
@Transactional
public void printUserOrders() {
List<User> users = userRepository.findAll(); // 1 SELECT: loads 100 users

for (User u : users) {
// Accessing the lazy collection triggers a query per user:
System.out.println(u.getOrders().size());
// 100 SELECTs: SELECT * FROM orders WHERE user_id = ?
}
// Total: 1 + 100 = 101 queries โŒ
}

In a high-throughput environment, 101 queries for one endpoint depletes HikariCP connection pool threads, causing downstream timeouts for all other requests.

Solving N+1 with JOIN FETCHโ€‹

JOIN FETCH tells Hibernate to load the association in the same query as the parent:

// Single query fetches users AND their orders together
@Query("SELECT u FROM User u JOIN FETCH u.orders WHERE u.status = :status")
List<User> findActiveUsersWithOrders(@Param("status") String status);
// โ†’ SELECT u.*, o.* FROM users u INNER JOIN orders o ON o.user_id = u.id WHERE u.status = ?
// Total: 1 query โœ…

JOIN FETCH vs LEFT JOIN FETCHโ€‹

// INNER JOIN FETCH โ€” only returns users who HAVE at least one order
@Query("SELECT u FROM User u JOIN FETCH u.orders WHERE u.status = :status")

// LEFT JOIN FETCH โ€” returns all users, with empty list if they have no orders
@Query("SELECT u FROM User u LEFT JOIN FETCH u.orders WHERE u.status = :status")
JOIN FETCH with pagination โ€” the silent memory problem

You cannot safely combine JOIN FETCH with Pageable:

// โŒ This produces a Hibernate warning and loads ALL rows into memory first:
@Query("SELECT u FROM User u JOIN FETCH u.orders")
Page<User> findAllWithOrders(Pageable pageable);
// WARNING: HHH90003004: firstResult/maxResults specified with collection fetch;
// applying in memory!
// โ†’ All rows loaded into JVM, then sliced โ€” defeats the purpose of pagination

Solutions:

Option 1 โ€” Use @EntityGraph instead (works with pagination):

@EntityGraph(attributePaths = {"orders"})
Page<User> findByStatus(String status, Pageable pageable);

Option 2 โ€” Separate queries (batch fetch pattern):

// Step 1: paginate user IDs only
@Query("SELECT u.id FROM User u WHERE u.status = :status")
Page<Long> findUserIdsByStatus(@Param("status") String status, Pageable pageable);

// Step 2: fetch users + orders for just those IDs
@Query("SELECT u FROM User u JOIN FETCH u.orders WHERE u.id IN :ids")
List<User> findUsersWithOrdersByIds(@Param("ids") List<Long> ids);

@EntityGraph vs JOIN FETCHโ€‹

Both solve N+1, but in different ways:

// Defined on the entity:
@Entity
@NamedEntityGraph(
name = "User.withRolesAndDepartment",
attributeNodes = {
@NamedAttributeNode("roles"),
@NamedAttributeNode("department")
}
)
public class User { ... }

// Used in the repository:
@EntityGraph("User.withRolesAndDepartment")
Optional<User> findByUsername(String username);

// Or inline without @NamedEntityGraph:
@EntityGraph(attributePaths = {"roles", "department"})
List<User> findByStatus(String status);

Pros: works with derived methods and pagination. Spring generates a LEFT OUTER JOIN.

Cons: always LEFT JOIN โ€” cannot specify INNER JOIN. Less control over the generated SQL.

@EntityGraphJOIN FETCH
Works with Pageableโœ… YesโŒ No (in-memory warning)
Works with derived methodsโœ… YesโŒ No
JOIN type controlโŒ Always LEFT OUTERโœ… INNER or LEFT
Condition on joined tableโŒ Noโœ… Yes
Multiple collectionsโš ๏ธ Cartesian product riskโš ๏ธ Cartesian product risk
Multiple collections โ€” the Cartesian product trap

Fetching two @OneToMany collections in one query produces a Cartesian product โ€” if a user has 5 orders and 3 roles, the result set has 15 rows for that user. For JOIN FETCH, Hibernate deduplicates with DISTINCT. For large collections this is still wasteful:

// โŒ Cartesian product โ€” avoid for large collections
@Query("SELECT DISTINCT u FROM User u JOIN FETCH u.orders JOIN FETCH u.roles")

// โœ… Better: use @BatchSize on the entity collections (separate queries, but batched)
@OneToMany
@BatchSize(size = 50) // loads roles for 50 users in one IN query
private List<Role> roles;

DTO Projections for Memory Optimizationโ€‹

Fetching full entities forces Hibernate to:

  1. Read every column from the result set.
  2. Allocate a Java object for each entity.
  3. Store a deep snapshot copy for dirty checking at flush time.
  4. Maintain all of this in the L1 cache for the transaction duration.

For read-heavy endpoints, this is wasteful. DTO projections skip all of it:

// โŒ Fetches all 30 columns, allocates entity proxies, enables dirty tracking
List<User> users = userRepository.findByStatus("ACTIVE");

// โœ… Fetches 4 columns, no proxy, no dirty checking, no L1 cache tracking
@Query("""
SELECT new dev.example.dto.UserSummaryDto(
u.id, u.firstName, u.email, COUNT(o.id)
)
FROM User u
LEFT JOIN u.orders o
WHERE u.status = 'ACTIVE'
GROUP BY u.id, u.firstName, u.email
""")
List<UserSummaryDto> getActiveUserSummaries();

Memory impact at scale:

Full entity fetch โ€” 10,000 users:
Per entity: ~2KB (entity object + Hibernate snapshot + proxy metadata)
Total heap: ~20 MB held for the duration of the transaction

DTO projection โ€” 10,000 users (4 fields):
Per DTO: ~80 bytes (plain object, no tracking)
Total heap: ~800 KB โ€” 25ร— less memory pressure

Performance Tuningโ€‹

๐Ÿ”ฌ Senior deep-dive: query hints

@QueryHints passes hints to the underlying JPA provider to control caching, lock modes, and query timeouts:

@QueryHints({
@QueryHint(name = "org.hibernate.readOnly", value = "true"), // no dirty checking
@QueryHint(name = "org.hibernate.fetchSize", value = "50"), // JDBC fetch size
@QueryHint(name = "jakarta.persistence.query.timeout", value = "5000") // 5s timeout
})
@Query("SELECT u FROM User u WHERE u.status = 'ACTIVE'")
List<User> findActiveUsersReadOnly();
HintEffect
org.hibernate.readOnly = trueDisables dirty checking โ€” reduces memory, speeds up flush
org.hibernate.fetchSize = NJDBC streaming batch size โ€” reduces memory for large result sets
jakarta.persistence.query.timeoutCancels slow queries after N milliseconds
org.hibernate.cacheable = trueEnables the L2 (query result) cache for this query
๐Ÿ”ฌ Senior deep-dive: HikariCP and connection pool starvation

For detailed guidelines on pool sizing, parameter details, starvation patterns, and Micrometer instrumentation, see the centralized Database Connection Pooling guide.

๐Ÿ”ฌ Senior deep-dive: streaming large result sets

Loading 100,000 rows into a List blows up heap memory. Stream them instead:

// Returns a lazy Java Stream โ€” rows are fetched from JDBC in batches, not all at once
@QueryHints(@QueryHint(name = "org.hibernate.fetchSize", value = "100"))
@Query("SELECT u FROM User u WHERE u.status = 'ACTIVE'")
Stream<User> streamActiveUsers();

// Usage โ€” must be inside a transaction; close the stream when done
@Transactional(readOnly = true)
public void exportUsers(OutputStream out) {
try (Stream<User> stream = userRepository.streamActiveUsers()) {
stream.map(userMapper::toDto)
.forEach(dto -> csvWriter.write(out, dto));
}
// Stream closed โ†’ cursor closed โ†’ connection returned to pool
}
Always close the Stream

A Stream<T> from Spring Data holds an open JDBC cursor and therefore an active database connection. Not closing it leaks the connection. Always use try-with-resources or call .close() explicitly.

๐Ÿ”ฌ Senior deep-dive: @Query vs Specification vs QueryDSL

For dynamic queries where filters change per request, @Query with optional parameter tricks (IS NULL OR field = :param) is limited. Compare the alternatives:

@QuerySpecification (JPA Criteria)QueryDSL
Static queriesโœ… BestOverkillOverkill
Dynamic filtersโš ๏ธ Hacky (IS NULL tricks)โœ… Designed for thisโœ… Designed for this
Type safetyโŒ String-basedโœ… Type-safe at compile timeโœ… Type-safe
Readabilityโœ… SQL-like, familiarโŒ Verbose Criteria APIโœ… Fluent, readable
Pagination supportโœ… Nativeโœ… Nativeโœ… Native
Setup costNoneNone (built-in)Requires APT plugin
// Dynamic search with Specification โ€” composable predicates
public class UserSpecifications {
public static Specification<User> hasStatus(String status) {
return (root, query, cb) ->
status == null ? null : cb.equal(root.get("status"), status);
}
public static Specification<User> nameContains(String keyword) {
return (root, query, cb) ->
keyword == null ? null : cb.like(
cb.lower(root.get("firstName")),
"%" + keyword.toLowerCase() + "%"
);
}
}

// Compose at call site โ€” only active conditions are applied
Specification<User> spec = where(hasStatus("ACTIVE")).and(nameContains("alice"));
Page<User> results = userRepository.findAll(spec, PageRequest.of(0, 20));

Decision guide:

  • Simple, fixed query โ†’ @Query.
  • Dynamic filters that change per request โ†’ Specification.
  • Complex dynamic queries with joins across many tables โ†’ QueryDSL.

Testing @Query Methodsโ€‹

Test repository queries with @DataJpaTest โ€” it loads only the JPA slice (no controllers, no services) and uses an in-memory H2 database by default:

@DataJpaTest
class UserRepositoryTest {

@Autowired
private UserRepository userRepository;

@Autowired
private TestEntityManager em;

@BeforeEach
void setUp() {
em.persist(new User("alice", "[email protected]", "ACTIVE", "ADMIN"));
em.persist(new User("bob", "[email protected]", "ACTIVE", "USER"));
em.persist(new User("carol", "[email protected]","INACTIVE","USER"));
em.flush();
}

@Test
void findByStatus_returnsOnlyActiveUsers() {
List<User> result = userRepository.findByStatus("ACTIVE", Sort.unsorted());
assertThat(result).hasSize(2)
.extracting(User::getUsername)
.containsExactlyInAnyOrder("alice", "bob");
}

@Test
void getActiveUserSummaries_includesOrderCount() {
// persist an order for alice
em.persist(new Order("alice-id", BigDecimal.TEN));
em.flush();

List<UserSummaryDto> summaries = userRepository.getActiveUserSummaries();
assertThat(summaries).anyMatch(s ->
s.getFirstName().equals("alice") && s.getOrderCount() == 1L);
}

@Test
void deactivateDormantAccounts_updatesStatusAndReturnsCount() {
int affected = userRepository.deactivateDormantAccounts(
LocalDate.now().minusMonths(6));

assertThat(affected).isGreaterThanOrEqualTo(0);
em.clear(); // clear L1 cache before asserting DB state
User alice = em.find(User.class, /* alice's id */);
assertThat(alice.getStatus()).isEqualTo("INACTIVE");
}
}
Use a real database for native queries

H2 doesn't support PostgreSQL-specific syntax (JSONB, LATERAL, ON CONFLICT). Test native queries against a real database using Testcontainers:

@DataJpaTest
@AutoConfigureTestDatabase(replace = AutoConfigureTestDatabase.Replace.NONE)
@Testcontainers
class NativeQueryTest {

@Container
static PostgreSQLContainer<?> postgres =
new PostgreSQLContainer<>("postgres:16-alpine");

@DynamicPropertySource
static void configureProperties(DynamicPropertyRegistry registry) {
registry.add("spring.datasource.url", postgres::getJdbcUrl);
registry.add("spring.datasource.username", postgres::getUsername);
registry.add("spring.datasource.password", postgres::getPassword);
}
// ... test methods
}

Common Mistakesโ€‹

MistakeWhy it's a problemFix
Using @Modifying without @TransactionalThrows InvalidDataAccessApiUsageException at runtimeAdd @Transactional to the repository method
@Modifying without clearAutomatically = trueStale entities in L1 cache cause incorrect reads after bulk updateSet clearAutomatically = true
JOIN FETCH with PageableHibernate loads all rows into memory, then slices in JVM โ€” OutOfMemoryError riskUse @EntityGraph or two-query pattern
Fetching full entities for read-only endpointsUnnecessary heap pressure from dirty-checking snapshotsUse DTO projections or @QueryHint readOnly = true
N+1 on every request100 users ร— 1 lazy query = 101 DB round trips, pool starvationJOIN FETCH or @EntityGraph
SELECT * patterns in native queriesCouples result mapping to table schema โ€” breaks on column additionsSelect only the columns you need
Not closing Stream<T> resultOpen JDBC cursor leaks the database connection โ€” pool exhaustion over timeAlways use try-with-resources
Hardcoding entity name in SpEL-eligible base queriesBreaks when entity is renamedUse #{#entityName} in generic repositories

๐ŸŽฏ Interview Questionsโ€‹

Q1. What is the difference between JPQL and native SQL in Spring Data JPA? When do you choose each?

JPQL operates on JPA entity classes and their fields โ€” Hibernate translates it to SQL using the configured dialect. It is database-portable, integrates with the L1/L2 caches, and is refactoring-safe. Native SQL is sent directly to the database unchanged โ€” use it only for database-specific features unavailable in JPQL: JSONB operations, window functions, recursive CTEs, full-text search. Default to JPQL; switch to native only when you hit its limits.

Q2. What is the N+1 problem and how does JOIN FETCH solve it?

N+1 occurs when loading N parent entities and then accessing a lazy-loaded collection on each โ€” triggering one query per parent. 100 users with lazy orders = 1 + 100 = 101 queries. JOIN FETCH loads the parent and the association in a single SQL JOIN query, reducing 101 queries to 1. The trade-off: JOIN FETCH cannot be combined with Pageable without loading all rows into memory first.

Q3. Why must @Modifying queries use clearAutomatically = true?

Bulk UPDATE and DELETE queries execute directly against the database, bypassing the JPA persistence context (L1 cache). Any entities already loaded in the same transaction hold stale data. Without clearAutomatically = true, subsequent reads in the same transaction return cached (stale) values even after a findById. With it, the L1 cache is cleared after the bulk operation, forcing fresh DB reads.

Q4. When would you use a DTO projection instead of returning a full entity?

For read-only, high-frequency endpoints where you need only a subset of fields. Full entities incur: reading all columns, allocating the entity object, creating a dirty-checking snapshot, and storing everything in the L1 cache. DTO projections skip all of this โ€” Hibernate reads only the projected columns, allocates a plain object, and does no tracking. For a list of 10,000 records, this can reduce heap usage by 10โ€“25ร—.

Q5. (Senior) How does HikariCP connection pool starvation relate to N+1 queries?

Each query borrows a connection from HikariCP for the duration of the call. N+1 queries inside a single transaction hold a connection open while issuing serial queries, quickly depleting the pool and causing other requests to time out. For details, sizing, and solutions, see Database Connection Pooling.

Q6. (Senior) How do @EntityGraph and JOIN FETCH differ, and when do you choose each?

Both solve N+1 by eagerly loading associations in a JOIN. Differences: @EntityGraph always uses LEFT OUTER JOIN, works with Pageable and derived methods, but offers no control over JOIN type or conditions on the joined table. JOIN FETCH in @Query allows INNER or LEFT JOIN, supports WHERE conditions on the joined table, and is explicit โ€” but cannot be safely combined with Pageable. Rule of thumb: use @EntityGraph for paginated lists or when derived methods are sufficient; use JOIN FETCH in @Query when you need precise SQL control and are not paginating.

Q7. (Senior) What happens when you call @Modifying without flushAutomatically = true, and there are pending entity changes in the same transaction?

By default (flushAutomatically = false), the persistence context has not yet flushed dirty entity changes to the database when the bulk query runs. If you loaded a user, set user.setStatus("ACTIVE"), and then ran a bulk UPDATE affecting that same row โ€” the bulk UPDATE may overwrite the in-memory change with the old value from DB, or the subsequent flush may overwrite the bulk result. With flushAutomatically = true, Hibernate first flushes all pending changes to the DB, then executes the bulk query โ€” ensuring a consistent ordering. The safe pattern: always set both flushAutomatically = true and clearAutomatically = true on @Modifying queries.


See Alsoโ€‹