
Spring Data JPA Repository Layer

Learn Java Persistence, Database Integration, and JPA - Part 024

Spring Data JPA repository layer design for production systems: repository abstractions, derived queries, JPQL/native queries, specifications, projections, custom repositories, transaction boundaries, and anti-patterns.


Part 024 — Spring Data JPA Repository Layer

Part 023 covered cache strategy: when and how persisted state can be reused without corrupting correctness.

This part moves into the most common persistence abstraction used in Spring applications: Spring Data JPA repositories.

Spring Data JPA is powerful because it removes repetitive repository boilerplate. It is dangerous because it can make persistence look simpler than it is.

A repository method can hide:

  • transaction assumptions
  • fetch plan decisions
  • locking decisions
  • projection shape
  • pagination limits
  • bulk update side effects
  • persistence-context synchronization
  • provider-specific hints
  • data access policy
  • domain boundary leakage

The repository layer is not just a collection of database methods. In a serious system, it is a persistence boundary.


1. Kaufman Framing: Use Abstraction Without Losing Feedback

Kaufman's method says: learn enough to self-correct.

With Spring Data JPA, self-correction means you can look at a repository method and predict:

  • what SQL shape it likely emits
  • whether it returns managed entities or DTOs
  • whether it participates in an existing transaction
  • whether it can produce N+1
  • whether it can paginate safely
  • whether it can update stale data
  • whether it leaks domain internals to callers
  • whether it should be a repository method at all

The core skill:

Use Spring Data JPA to reduce mechanical code, not to hide persistence semantics from yourself.


2. Repository Is Not DAO Rebranded

A low-level DAO often means:

“Object with SQL methods.”

A domain-oriented repository means:

“Boundary for retrieving and persisting aggregate state according to domain use cases.”

Spring Data JPA can support either style. The team chooses the discipline.

Weak Repository Design

public interface CaseRepository extends JpaRepository<Case, UUID> {

    List<Case> findByStatus(CaseStatus status);

    List<Case> findByPriority(Priority priority);

    List<Case> findByAssigneeId(UUID assigneeId);

    List<Case> findByStatusAndPriorityAndAssigneeIdAndCreatedAtBetween(
        CaseStatus status,
        Priority priority,
        UUID assigneeId,
        Instant from,
        Instant to
    );
}

This repository becomes a query dumping ground.

Stronger Repository Design

public interface CaseRepository extends JpaRepository<Case, UUID>, CaseRepositoryCustom {

    @Lock(LockModeType.OPTIMISTIC)
    @Query("""
        select c
        from Case c
        where c.id = :id
    """)
    Optional<Case> findForDecision(@Param("id") UUID id);

    @EntityGraph(attributePaths = {"assignee", "workflowState"})
    @Query("""
        select c
        from Case c
        where c.id = :id
    """)
    Optional<Case> findDetailForRead(@Param("id") UUID id);
}

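Method names in this style describe use-case intent (the first two appear above; the last two are further examples of the same style, for instance in CaseRepositoryCustom):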

  • findForDecision
  • findDetailForRead
  • searchForBackoffice
  • findOverdueForEscalation

Good repository names encode persistence semantics.


3. Spring Data Repository Types

Common repository interfaces:

| Interface | Purpose |
| --- | --- |
| Repository<T, ID> | Marker/root abstraction |
| CrudRepository<T, ID> | Basic CRUD operations |
| PagingAndSortingRepository<T, ID> | Pagination and sorting support |
| JpaRepository<T, ID> | JPA-specific repository with flush/batch helpers |
| Custom repository interface | Complex query/write path or provider-specific behavior |

Most Spring JPA applications extend JpaRepository by default:

public interface OrderRepository extends JpaRepository<Order, UUID> {
}

But exposing JpaRepository everywhere means callers can access methods like:

  • findAll()
  • deleteAll()
  • flush()
  • saveAndFlush()
  • getReferenceById()

These methods are not always safe to expose at a domain boundary.

Restricting the Base Interface

For stricter systems, define a narrow base repository:

@NoRepositoryBean
public interface AggregateRepository<T, ID> extends Repository<T, ID> {

    Optional<T> findById(ID id);

    T save(T aggregate);
}

Then:

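public interface CaseRepository extends AggregateRepository<Case, UUID> {

    // Explicit @Query: "findForDecision" is not a name Spring Data can derive a query from.
    @Query("select c from Case c where c.id = :id")
    Optional<Case> findForDecision(@Param("id") UUID id);
}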

This prevents accidental findAll() or mass deletes from becoming available by default.

Practical Rule

Extend JpaRepository when convenience is worth it. Use a narrower base repository when domain safety matters more than convenience.


4. The save() Method Is Often Misunderstood

save() does not always mean SQL INSERT or UPDATE immediately.

Spring Data JPA delegates to JPA semantics:

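  • a new entity (as decided by Spring Data's isNew check) goes through persist
  • a detached or already-persisted entity goes through merge, which returns a managed copy rather than attaching the instance you passed in
  • SQL may be emitted at flush/commit, not at the method call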

Bad mental model:

repository.save(order); // database updated now

Better mental model:

repository.save(order); // entity scheduled/managed according to JPA lifecycle; SQL later

Inside a transaction, this is often unnecessary:

@Transactional
public void renameOrder(UUID id, String label) {
    Order order = repository.findById(id).orElseThrow();
    order.rename(label);

    // repository.save(order) usually unnecessary for managed entity
}

Dirty checking detects changes to managed entities at flush time, and the provider emits the UPDATE at flush or commit.
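To make the mechanism concrete, here is a toy model of dirty checking in plain Java. It is a sketch under loud assumptions: ToyPersistenceContext, OrderState, and the SQL log are invented for illustration, and a real JPA provider tracks far more state than one field. The point is only that the UPDATE comes from comparing current state with a load-time snapshot at flush, not from calling save().

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: not how Hibernate or any real provider is implemented.
final class ToyPersistenceContext {

    // Mutable stand-in for a managed entity (hypothetical type).
    static final class OrderState {
        final String id;
        String label;

        OrderState(String id, String label) {
            this.id = id;
            this.label = label;
        }
    }

    private final Map<String, OrderState> managed = new HashMap<>();
    private final Map<String, String> loadedLabels = new HashMap<>(); // snapshot taken at load time
    final List<String> sqlLog = new ArrayList<>();

    // "Loading" registers the entity as managed and remembers its original state.
    OrderState load(String id, String labelInDb) {
        OrderState order = new OrderState(id, labelInDb);
        managed.put(id, order);
        loadedLabels.put(id, labelInDb);
        return order;
    }

    // Flush compares current state with the snapshot and emits an UPDATE only for changed entities.
    void flush() {
        for (OrderState order : managed.values()) {
            if (!order.label.equals(loadedLabels.get(order.id))) {
                sqlLog.add("update orders set label = '" + order.label + "' where id = '" + order.id + "'");
                loadedLabels.put(order.id, order.label);
            }
        }
    }
}
```

Changing order.label between load and flush is enough to produce exactly one logged UPDATE; a second flush with no further change produces none.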

When save() Is Useful

  • persisting a new aggregate
  • reattaching/merging detached state deliberately
  • repository API consistency
  • explicit lifecycle transition in application service

When save() Is a Smell

@Transactional
public void approve(UUID id) {
    Case c = repository.findById(id).orElseThrow();
    c.approve();
    repository.save(c); // often noise
}

This line may reveal the engineer does not trust or understand managed state.


5. Derived Query Methods

Derived query methods are convenient:

List<Order> findByCustomerIdAndStatus(UUID customerId, OrderStatus status);

They are best for simple constraints.

Good candidates:

Optional<User> findByEmail(String email);

boolean existsByTenantIdAndCode(UUID tenantId, String code);

List<Currency> findByEnabledTrueOrderByCodeAsc();

Bad candidates:

List<Case> findByTenantIdAndStatusInAndPriorityInAndAssigneeDepartmentIdAndCreatedAtBetweenAndDeletedFalseOrderBySlaDeadlineAsc(...);

Long derived method names are a design smell.

Derived Query Strengths

  • low boilerplate
  • readable for simple lookup
  • easy to refactor with property names
  • supports common predicates

Derived Query Weaknesses

  • query shape is implicit
  • complex names become unreadable
  • fetch plan is not obvious
  • joins may surprise you
  • difficult to add hints/entity graphs/locks cleanly
  • not ideal for complex business queries

Rule

Use derived queries for simple lookup. Use explicit @Query, Specification, Querydsl, or custom repository when query intent matters.


6. @Query: Make Intent Explicit

JPQL query:

@Query("""
    select c
    from Case c
    where c.tenantId = :tenantId
      and c.status = :status
      and c.deleted = false
    order by c.slaDeadline asc
""")
List<Case> findOpenCasesForTenant(
    @Param("tenantId") UUID tenantId,
    @Param("status") CaseStatus status
);

Advantages:

  • explicit query shape
  • easier review
  • easier to add join fetch or projection
  • clearer parameter binding
  • can express domain-specific query names

Native query:

@Query(
    value = """
        select c.id, c.reference_no, c.status, c.sla_deadline
        from cases c
        where c.tenant_id = :tenantId
          and c.status = 'OPEN'
        order by c.sla_deadline asc
        limit :limit
    """,
    nativeQuery = true
)
List<CaseQueueRow> findQueueRowsNative(
    @Param("tenantId") UUID tenantId,
    @Param("limit") int limit
);

Native query is appropriate when:

  • database-specific feature is required
  • JPQL cannot express query efficiently
  • query is read-model/report oriented
  • SQL plan must be precise
  • window functions, CTEs, JSON operators, or vendor-specific syntax are needed

Rule

Native SQL in a repository is not a failure. Unreviewed native SQL hidden behind vague method names is the failure.


7. Return Type Is a Contract

Repository return type communicates semantics.

| Return Type | Meaning |
| --- | --- |
| Optional<Entity> | Zero or one entity expected |
| Entity | Must exist; absence exceptional or framework-thrown |
| List<Entity> | Bounded result expected |
| Page<T> | Need content plus total count |
| Slice<T> | Need page-like traversal without total count |
| Stream<T> | Large read, must close transaction/resource |
| DTO/record | Read model, not managed state |
| boolean exists... | Existence check, must consider race if used for decision |
| long count... | Count, can be expensive and stale immediately |

Avoid Ambiguous Lists

List<Case> findByStatus(CaseStatus status);

Can this return 5 rows or 5 million?

Better:

Page<CaseSummary> findBackofficeQueue(..., Pageable pageable);

List<Case> findTop100ByStatusOrderByCreatedAtAsc(CaseStatus status);

Stream<Case> streamByStatus(CaseStatus status);

Make cardinality explicit.


8. Entity Return vs Projection Return

Returning entities means returning managed state when inside a persistence context.

Optional<Case> findById(UUID id);

Use entity return for:

  • command/update path
  • aggregate mutation
  • invariant enforcement
  • lifecycle operations

Use projection return for:

  • list screens
  • reports
  • API read endpoints
  • dashboards
  • search results

Example record projection:

public record CaseQueueItem(
    UUID id,
    String referenceNo,
    CaseStatus status,
    Instant slaDeadline
) {}

@Query("""
    select new com.example.caseapp.CaseQueueItem(
        c.id,
        c.referenceNo,
        c.status,
        c.slaDeadline
    )
    from Case c
    where c.assignee.id = :assigneeId
      and c.status = com.example.caseapp.CaseStatus.OPEN
    order by c.slaDeadline asc
""")
Page<CaseQueueItem> findQueueItems(
    @Param("assigneeId") UUID assigneeId,
    Pageable pageable
);

Rule

Commands load aggregates. Queries return projections unless the caller truly needs entity behavior.

This is one of the simplest ways to prevent accidental graph loading and API/entity coupling.


9. Pagination in Repositories

Spring Data makes pagination easy:

Page<CaseQueueItem> findByAssigneeId(UUID assigneeId, Pageable pageable);

But Page requires a count query.

For large datasets, the count may be expensive.

Use Slice when you only need “has next”:

Slice<CaseQueueItem> findByAssigneeId(UUID assigneeId, Pageable pageable);
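Why a slice can skip the count: a common implementation, and to my understanding the one Spring Data JPA uses for Slice results, is to ask for one row more than the page size and treat the extra row as the "has next" signal. The sketch below models that trick over an in-memory list; SliceSketch and SliceResult are hypothetical names, and a real query would push the offset and limit down to the database.

```java
import java.util.List;

// Sketch of the "fetch one extra row" trick behind slice-style paging.
final class SliceSketch {

    // Hypothetical result holder: page content plus a has-next flag, no total count.
    record SliceResult<T>(List<T> content, boolean hasNext) {}

    // 'rows' stands in for an already ordered query result.
    static <T> SliceResult<T> slice(List<T> rows, int pageNumber, int pageSize) {
        int from = Math.min(pageNumber * pageSize, rows.size());
        int to = Math.min(from + pageSize + 1, rows.size()); // ask for one extra row
        List<T> fetched = rows.subList(from, to);
        boolean hasNext = fetched.size() > pageSize;
        List<T> content = hasNext ? fetched.subList(0, pageSize) : fetched;
        return new SliceResult<>(List.copyOf(content), hasNext);
    }
}
```

The extra row is never returned to the caller; it only answers whether another page exists.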

Use keyset/windowed access when offset becomes too expensive. Spring Data 3.1 and later provide a scroll API (Window, ScrollPosition) for this, but the underlying rule remains:

Stable ordering is mandatory for reliable pagination.

Bad:

PageRequest.of(0, 50)

No explicit sort means unstable order.

Better:

PageRequest.of(
    0,
    50,
    Sort.by(
        Sort.Order.asc("slaDeadline"),
        Sort.Order.asc("id")
    )
);

Always include a deterministic tie-breaker such as id.
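The same (slaDeadline, id) ordering is what makes keyset pagination possible: the next page is "everything strictly after the last row I saw". The following is a minimal in-memory sketch, not a repository implementation; KeysetSketch and QueueRow are hypothetical, and String ids are used instead of UUIDs to keep the example short.

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;

// In-memory sketch of keyset (seek) pagination over (slaDeadline, id).
final class KeysetSketch {

    // Hypothetical row type for illustration.
    record QueueRow(String id, Instant slaDeadline) {}

    // Same ordering the SQL would use: order by sla_deadline asc, id asc.
    static final Comparator<QueueRow> ORDER =
        Comparator.comparing(QueueRow::slaDeadline).thenComparing(QueueRow::id);

    // Returns up to 'limit' rows strictly after the last row of the previous page.
    // On databases with row-value comparison the filter corresponds to:
    // (sla_deadline, id) > (:lastDeadline, :lastId)
    static List<QueueRow> nextPage(List<QueueRow> table, QueueRow last, int limit) {
        return table.stream()
            .sorted(ORDER)
            .filter(row -> last == null || ORDER.compare(row, last) > 0)
            .limit(limit)
            .toList();
    }
}
```

Without the id tie-breaker, two rows sharing a deadline could be skipped or repeated across pages, because "after the last row" would no longer be well defined.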


10. Fetch Plans in Repository Methods

Repository methods must own fetch plans for their use case.

@EntityGraph

@EntityGraph(attributePaths = {"assignee", "workflowState"})
Optional<Case> findDetailById(UUID id);

Good for:

  • simple graph loading
  • avoiding N+1
  • keeping query concise

Be careful:

  • graph can become too wide
  • collection loading can multiply rows
  • it can hide performance decisions if overused

Join Fetch in @Query

@Query("""
    select c
    from Case c
    join fetch c.assignee
    left join fetch c.workflowState
    where c.id = :id
""")
Optional<Case> findDetail(@Param("id") UUID id);

Good when fetch shape is important and should be visible in code review.

DTO Projection

@Query("""
    select new com.example.CaseHeader(c.id, c.referenceNo, a.displayName)
    from Case c
    join c.assignee a
    where c.id = :id
""")
Optional<CaseHeader> findHeader(@Param("id") UUID id);

Often better for read endpoints.

Rule

A repository method that returns entities should make its fetch plan obvious or deliberately minimal.
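To see why the fetch plan matters, it helps to count queries. The toy below is an assumption-laden model, not JPA: FetchPlanSketch, its static data, and the query counter are invented, and a real persistence context would serve a repeated assignee from its first-level cache. It still shows the shape of the problem: one query for the list plus one per row, versus one planned query.

```java
import java.util.List;
import java.util.Map;

// Toy query counter: shows how per-row lazy loads turn 1 query into N+1.
final class FetchPlanSketch {

    // Hypothetical row type for illustration.
    record CaseRow(String id, String assigneeId) {}

    static int queryCount = 0;

    static final List<CaseRow> CASES = List.of(
        new CaseRow("c1", "u1"), new CaseRow("c2", "u2"), new CaseRow("c3", "u1"));
    static final Map<String, String> ASSIGNEE_NAMES = Map.of("u1", "Ana", "u2", "Ben");

    // Each call to these two methods stands for one round trip to the database.
    static List<CaseRow> loadCases() {
        queryCount++;
        return CASES;
    }

    static String loadAssigneeName(String assigneeId) {
        queryCount++;
        return ASSIGNEE_NAMES.get(assigneeId);
    }

    // Lazy style: one query for the cases, then one per case for its assignee.
    static List<String> namesLazily() {
        return loadCases().stream().map(c -> loadAssigneeName(c.assigneeId())).toList();
    }

    // Planned fetch: a single joined query returns everything the use case needs.
    static List<String> namesWithJoin() {
        queryCount++;
        return CASES.stream().map(c -> ASSIGNEE_NAMES.get(c.assigneeId())).toList();
    }
}
```

Both methods return the same names; only the number of round trips differs, which is exactly what a repository test with SQL counting should assert.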


11. Locking in Repository Methods

Spring Data supports JPA locks through @Lock.

@Lock(LockModeType.OPTIMISTIC)
@Query("""
    select c
    from Case c
    where c.id = :id
""")
Optional<Case> findForOptimisticDecision(@Param("id") UUID id);

Pessimistic example:

@Lock(LockModeType.PESSIMISTIC_WRITE)
@QueryHints({
    @QueryHint(name = "jakarta.persistence.lock.timeout", value = "1000")
})
@Query("""
    select r
    from Reservation r
    where r.id = :id
""")
Optional<Reservation> findForUpdate(@Param("id") UUID id);

Use lock-specific method names:

findForUpdate
findForDecision
findLockedById

Avoid hiding locks behind normal lookup methods.

Bad:

Optional<Reservation> findById(UUID id); // secretly pessimistic via redeclaration

Locks are operationally meaningful. Make them visible.
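What an optimistic lock ultimately checks can be modeled in a few lines. This is a sketch, assuming a single status column and an in-memory map in place of a table; OptimisticLockSketch, VersionedRow, and StaleStateException are hypothetical names, not JPA types. The update succeeds only if the version read earlier is still the current one.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of the version check behind optimistic locking.
final class OptimisticLockSketch {

    // Hypothetical row: business state plus a version counter.
    record VersionedRow(String status, long version) {}

    static final class StaleStateException extends RuntimeException {
        StaleStateException(String message) {
            super(message);
        }
    }

    private final Map<String, VersionedRow> table = new ConcurrentHashMap<>();

    void insert(String id, String status) {
        table.put(id, new VersionedRow(status, 0));
    }

    VersionedRow read(String id) {
        return table.get(id);
    }

    // Mirrors: update ... set status = ?, version = version + 1 where id = ? and version = ?
    void update(String id, String newStatus, long expectedVersion) {
        VersionedRow current = table.get(id);
        VersionedRow next = new VersionedRow(newStatus, expectedVersion + 1);
        boolean updated = current != null
            && current.version() == expectedVersion
            && table.replace(id, current, next);
        if (!updated) {
            throw new StaleStateException("Row " + id + " was changed by another transaction");
        }
    }
}
```

Two callers that both read version 0 cannot both win: the first update bumps the version to 1, and the second fails instead of silently overwriting.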


12. @Modifying Queries

Bulk update/delete methods require @Modifying.

@Modifying(clearAutomatically = true, flushAutomatically = true)
@Query("""
    update Case c
    set c.status = :newStatus
    where c.status = :oldStatus
""")
int transitionAll(
    @Param("oldStatus") CaseStatus oldStatus,
    @Param("newStatus") CaseStatus newStatus
);

Risks:

  • bypasses entity lifecycle callbacks
  • bypasses dirty checking
  • may bypass version checks, depending on the query (a JPQL bulk update does not increment @Version unless the statement sets it)
  • persistence context may become stale
  • L2 cache/query cache may need eviction
  • domain invariants can be skipped

Use bulk updates for administrative/maintenance paths or carefully designed write paths, not as a shortcut around domain logic.
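The stale persistence context risk is easy to underestimate, so here is a toy model of it. Everything in it is an assumption for illustration: BulkUpdateSketch, CaseEntity, and the two maps stand in for the database and the first-level cache, and clear() plays the role that clearAutomatically = true plays in the annotation above.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model: a bulk update changes rows in the "database" but not entities already loaded.
final class BulkUpdateSketch {

    // Hypothetical stand-in for a managed entity.
    static final class CaseEntity {
        final String id;
        String status;

        CaseEntity(String id, String status) {
            this.id = id;
            this.status = status;
        }
    }

    private final Map<String, String> database = new HashMap<>();            // id -> status
    private final Map<String, CaseEntity> firstLevelCache = new HashMap<>(); // loaded entities

    void insertRow(String id, String status) {
        database.put(id, status);
    }

    // Repeated finds return the cached instance, as a persistence context would.
    CaseEntity find(String id) {
        return firstLevelCache.computeIfAbsent(id, key -> new CaseEntity(key, database.get(key)));
    }

    // Bulk statement: goes straight to the rows and never touches loaded entities.
    int bulkTransition(String oldStatus, String newStatus) {
        int affected = 0;
        for (Map.Entry<String, String> row : database.entrySet()) {
            if (row.getValue().equals(oldStatus)) {
                row.setValue(newStatus);
                affected++;
            }
        }
        return affected;
    }

    // Dropping the cache forces the next find to re-read the row.
    void clear() {
        firstLevelCache.clear();
    }
}
```

After bulkTransition, an entity loaded earlier still reports the old status until the context is cleared; code that keeps working with it is operating on stale state.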

Safer Naming

int bulkExpireOverdueCases(Instant cutoff);

Include bulk in the method name to signal special semantics.


13. Transaction Boundaries: Repository vs Service

Typical recommendation:

  • service/application layer owns transaction boundary
  • repository performs persistence operations
  • repository methods may have read-only hints, but should not define business transaction orchestration

Example:

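@Service
public class CaseApprovalService {

    private final CaseRepository caseRepository;

    // Constructor injection; Spring wires the repository in.
    public CaseApprovalService(CaseRepository caseRepository) {
        this.caseRepository = caseRepository;
    }

    @Transactional
    public void approve(UUID caseId, UUID reviewerId) {
        Case c = caseRepository.findForDecision(caseId)
            .orElseThrow(CaseNotFoundException::new);

        // No save() call: c is managed, so the change is flushed at commit.
        c.approve(reviewerId);
    }
}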

Repository:

public interface CaseRepository extends JpaRepository<Case, UUID> {

    @Lock(LockModeType.OPTIMISTIC)
    @Query("select c from Case c where c.id = :id")
    Optional<Case> findForDecision(@Param("id") UUID id);
}

Why Service Owns Transaction

Because a use case may involve:

  • loading multiple aggregates
  • validating permissions
  • writing domain events
  • calling outbox writer
  • updating audit metadata
  • coordinating idempotency

A repository does not know the whole use case.

Repository-Level @Transactional

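Spring Data's inherited CRUD methods are transactional by default (read-only for reads), while query methods you declare yourself get no transaction configuration unless you annotate them. Either way, relying on repository methods as the transaction boundary can fragment a use case.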

Bad:

public void approve(UUID caseId) {
    Case c = repository.findById(caseId).orElseThrow(); // transaction A maybe
    c.approve();
    repository.save(c); // transaction B maybe
}

Better:

@Transactional
public void approve(UUID caseId) {
    Case c = repository.findById(caseId).orElseThrow();
    c.approve();
}

14. Read-Only Transactions

Use read-only transactions for query paths:

@Transactional(readOnly = true)
public Page<CaseQueueItem> loadQueue(UUID assigneeId, Pageable pageable) {
    return repository.findQueueItems(assigneeId, pageable);
}

Benefits may include:

  • communicates intent
  • avoids accidental writes by convention
  • may allow provider/framework optimizations
  • helps code review

But read-only is not a security boundary. Do not assume it makes writes impossible in every environment/provider combination.

Rule

readOnly = true is a semantic signal and possible optimization, not a replacement for good design.


15. Specifications in Repository Layer

From Part 015, Specification is useful for dynamic filtering.

public interface CaseRepository
    extends JpaRepository<Case, UUID>, JpaSpecificationExecutor<Case> {
}

Specification example:

public final class CaseSpecifications {

    public static Specification<Case> tenant(UUID tenantId) {
        return (root, query, cb) -> cb.equal(root.get("tenantId"), tenantId);
    }

    public static Specification<Case> statusIn(Collection<CaseStatus> statuses) {
        return (root, query, cb) -> root.get("status").in(statuses);
    }

    public static Specification<Case> slaBefore(Instant cutoff) {
        return (root, query, cb) -> cb.lessThan(root.get("slaDeadline"), cutoff);
    }
}

Usage:

Specification<Case> spec = Specification
    .where(CaseSpecifications.tenant(tenantId))
    .and(CaseSpecifications.statusIn(statuses))
    .and(CaseSpecifications.slaBefore(cutoff));

Page<Case> result = repository.findAll(spec, pageable);

Risks:

  • returning entities for dynamic search can trigger N+1
  • count query may become expensive
  • fetch joins inside specifications can break pagination/count
  • predicate composition can hide query complexity

Recommendation

Use Specification for moderate dynamic filters. For complex read models, implement custom repository with explicit query/projection.
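The composition idea behind Specification can be shown without the Spring API. The sketch below is a plain-Java analogy using java.util.function.Predicate, not Spring Data code: FilterCompositionSketch, CaseRow, and CaseFilter are hypothetical, and the choice to make tenant scope mandatory while skipping null or empty optional filters is an assumption about how such a filter object might behave.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Plain-Java analogy for composing optional filters; not the Spring Data API.
final class FilterCompositionSketch {

    // Hypothetical row type for illustration.
    record CaseRow(String tenantId, String status, Instant slaDeadline) {}

    // Hypothetical filter object: null or empty optional fields mean "do not filter on this".
    record CaseFilter(String tenantId, Set<String> statuses, Instant slaBefore) {}

    static Predicate<CaseRow> toPredicate(CaseFilter filter) {
        if (filter.tenantId() == null) {
            throw new IllegalArgumentException("tenant scope is mandatory");
        }
        List<Predicate<CaseRow>> parts = new ArrayList<>();
        parts.add(row -> row.tenantId().equals(filter.tenantId()));
        if (filter.statuses() != null && !filter.statuses().isEmpty()) {
            parts.add(row -> filter.statuses().contains(row.status()));
        }
        if (filter.slaBefore() != null) {
            parts.add(row -> row.slaDeadline().isBefore(filter.slaBefore()));
        }
        // AND all collected parts together, starting from "match everything".
        return parts.stream().reduce(row -> true, Predicate::and);
    }
}
```

The design point carries over to real specifications: mandatory scope is added unconditionally, optional criteria only when present, so the caller cannot accidentally build an unscoped query.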


16. Query By Example

Query By Example can be useful for simple exploratory matching.

Case probe = new Case();
probe.setStatus(CaseStatus.OPEN);

ExampleMatcher matcher = ExampleMatcher.matching()
    .withIgnoreNullValues();

List<Case> cases = repository.findAll(Example.of(probe, matcher));

Use it for:

  • admin screens with simple equality filters
  • prototypes
  • simple user directory matching

Avoid it for:

  • complex joins
  • range queries
  • advanced boolean logic
  • production-critical query plans
  • performance-sensitive screens

Query By Example is not a replacement for deliberate query design.


17. Custom Repository Implementations

When repository methods become complex, use a custom repository implementation.

Interface:

public interface CaseSearchRepository {

    Page<CaseQueueItem> searchQueue(CaseQueueFilter filter, Pageable pageable);
}

Main repository:

public interface CaseRepository
    extends JpaRepository<Case, UUID>, CaseSearchRepository {
}

Implementation:

@Repository
public class CaseSearchRepositoryImpl implements CaseSearchRepository {

    private final EntityManager entityManager;

    public CaseSearchRepositoryImpl(EntityManager entityManager) {
        this.entityManager = entityManager;
    }

    @Override
    public Page<CaseQueueItem> searchQueue(CaseQueueFilter filter, Pageable pageable) {
        // Criteria API, JPQL, Querydsl, or native SQL here.
        throw new UnsupportedOperationException("example");
    }
}

Custom implementation is appropriate when:

  • query has many optional filters
  • projection is non-trivial
  • query needs vendor-specific SQL
  • keyset pagination is needed
  • fetch plan must be explicit
  • query must be tested independently

Rule

When repository interface methods become unreadable, move complexity into a named custom query component.


18. Repository Method Naming Taxonomy

A naming taxonomy makes code review easier.

| Prefix | Intended Semantics |
| --- | --- |
| findBy... | Simple lookup, usually no special fetch/lock |
| findDetail... | Entity/detail graph for read use case |
| findForDecision... | Command path requiring fresh enough state/invariant check |
| findForUpdate... | Pessimistic lock or write-intent load |
| search... | Dynamic filter or user-driven query |
| load...View | DTO/read model |
| exists... | Existence check; race-prone if used alone for command |
| bulk... | Bulk update/delete bypassing entity lifecycle |
| stream... | Resource-bound large result traversal |
| count... | Aggregation/count query; may be expensive |

Bad:

Optional<Case> getCase(UUID id);

Better:

Optional<Case> findForDecision(UUID id);
Optional<CaseDetailView> loadDetailView(UUID id);
Optional<Case> findForUpdate(UUID id);

Names should reduce ambiguity.


19. Avoid Entity Leakage to API Layer

Bad architecture:

@RestController
class CaseController {

    @GetMapping("/cases/{id}")
    public Case get(@PathVariable UUID id) {
        return repository.findById(id).orElseThrow();
    }
}

Problems:

  • exposes persistence model
  • lazy loading during serialization
  • infinite recursion risk
  • overfetching/underfetching
  • accidental mutation path
  • API contract tied to entity mapping

Better:

@GetMapping("/cases/{id}")
public CaseDetailResponse get(@PathVariable UUID id) {
    return queryService.getCaseDetail(id);
}

Query service:

@Transactional(readOnly = true)
public CaseDetailResponse getCaseDetail(UUID id) {
    CaseDetailView view = repository.loadDetailView(id)
        .orElseThrow(CaseNotFoundException::new);

    return mapper.toResponse(view);
}

Entity is not the API contract.


20. Repository and Aggregate Boundaries

A repository should usually be per aggregate root.

Example:

CaseRepository        -> Case aggregate root
CustomerRepository    -> Customer aggregate root
InvoiceRepository     -> Invoice aggregate root

Avoid repositories for internal child entities unless they have an independent lifecycle.

Potential smell:

TaskRepository extends JpaRepository<CaseTask, UUID>

If CaseTask is owned by Case, direct task repository updates may bypass Case invariants.

Better:

@Transactional
public void completeTask(UUID caseId, UUID taskId, UUID actorId) {
    Case c = caseRepository.findForDecision(caseId).orElseThrow();
    c.completeTask(taskId, actorId);
}

The aggregate root enforces consistency.

Rule

If a child cannot be validly changed without its parent invariant, do not expose a general repository for the child.


21. getReferenceById() and Lazy References

JpaRepository#getReferenceById() returns a reference/proxy without necessarily hitting the database immediately.

Useful for setting associations when you know the target exists:

Customer customerRef = customerRepository.getReferenceById(customerId);
Order order = Order.create(customerRef, lines);
orderRepository.save(order);

Risks:

  • entity may not exist; failure delayed
  • accessing proxy outside transaction may fail
  • proxy equality/class issues
  • hides existence validation

Use when:

  • foreign key existence is guaranteed by prior validation or database constraint
  • you only need a reference
  • delayed failure is acceptable or handled

Avoid when:

  • you need to validate target state
  • you need authorization based on target fields
  • target may not exist and error must be clear

22. Existence Checks and Race Conditions

if (!repository.existsByTenantIdAndCode(tenantId, code)) {
    repository.save(new Policy(tenantId, code));
}

This is race-prone.

Two transactions can both observe absence, then both insert.

Correct protection:

  • unique constraint on (tenant_id, code)
  • handle duplicate key exception
  • or use serializable/pessimistic strategy if invariant cannot be expressed as constraint

The repository method is still useful:

boolean existsByTenantIdAndCode(UUID tenantId, String code);

But only as user feedback/precheck, not final correctness authority.

Rule

exists... can improve UX. It does not replace database constraints for uniqueness invariants.
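The difference between a precheck and a constraint can be modeled directly. In this sketch, which is illustrative only, a ConcurrentHashMap stands in for a table with a unique constraint on (tenant_id, code); UniqueCodeSketch and DuplicateCodeException are hypothetical names. The atomic putIfAbsent plays the role of the constraint: it decides the winner at write time, regardless of what any earlier exists call returned.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model: an existence check cannot guarantee uniqueness, an atomic constraint can.
final class UniqueCodeSketch {

    static final class DuplicateCodeException extends RuntimeException {
        DuplicateCodeException(String key) {
            super("Duplicate policy code: " + key);
        }
    }

    // Stands in for a table with a unique constraint on (tenant_id, code).
    private final Map<String, String> policiesByKey = new ConcurrentHashMap<>();

    // Precheck for user feedback only; the answer can be outdated as soon as it is returned.
    boolean exists(String tenantId, String code) {
        return policiesByKey.containsKey(tenantId + "/" + code);
    }

    // Exactly one insert for a given key wins; every other one fails loudly.
    void insert(String tenantId, String code, String policyName) {
        String key = tenantId + "/" + code;
        if (policiesByKey.putIfAbsent(key, policyName) != null) {
            throw new DuplicateCodeException(key);
        }
    }

    int size() {
        return policiesByKey.size();
    }
}
```

If two callers both see exists return false and then both insert, the second insert throws, which is the in-memory equivalent of catching the duplicate key exception.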


23. Delete Methods

Spring Data gives many delete options:

deleteById(id);
delete(entity);
deleteAll();
deleteAllInBatch();

Deletion is not trivial.

Before exposing delete methods, decide:

  • hard delete or soft delete?
  • cascade delete allowed?
  • orphan removal expected?
  • audit record required?
  • foreign key constraints?
  • regulatory retention?
  • domain event/outbox needed?

Dangerous:

repository.deleteById(caseId);

Better:

@Transactional
public void closeCase(UUID caseId, UUID actorId) {
    Case c = repository.findForDecision(caseId).orElseThrow();
    c.close(actorId, clock.instant());
}

For regulatory systems, deletion is usually a domain transition, not a technical operation.


24. Repository Exceptions and Error Translation

Spring translates many persistence exceptions into DataAccessException hierarchy.

However, domain code should not leak raw database errors to API consumers.

Example:

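try {
    // saveAndFlush forces the INSERT now; with plain save() the violation
    // may only surface at commit, outside this try block.
    repository.saveAndFlush(policy);
} catch (DataIntegrityViolationException ex) {
    throw new DuplicatePolicyCodeException(policy.code(), ex);
}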

Use exception translation to map infrastructure failure to domain/application error.

Common cases:

| Persistence Failure | Application Meaning |
| --- | --- |
| unique constraint violation | duplicate business key |
| FK violation | referenced object missing or invalid transition |
| optimistic lock exception | stale command/conflict |
| lock timeout | contention/retryable failure |
| deadlock | retryable transaction failure |
| query timeout | degraded dependency/performance incident |

Do not bury all persistence exceptions as RuntimeException("database error").
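Spring already performs this kind of classification when it builds the DataAccessException hierarchy. The sketch below only illustrates the underlying idea with the JDK's java.sql.SQLException: FailureClassifierSketch and FailureKind are hypothetical, and the SQLState values are the PostgreSQL codes, so other databases may need a different table.

```java
import java.sql.SQLException;

// Sketch: classify low-level failures by SQLState before mapping them to application errors.
final class FailureClassifierSketch {

    // Hypothetical application-level categories.
    enum FailureKind { DUPLICATE_KEY, MISSING_REFERENCE, RETRYABLE, UNKNOWN }

    // SQLState values below are the PostgreSQL codes; other databases may differ.
    static FailureKind classify(SQLException ex) {
        String state = ex.getSQLState();
        if (state == null) {
            return FailureKind.UNKNOWN;
        }
        return switch (state) {
            case "23505" -> FailureKind.DUPLICATE_KEY;     // unique_violation
            case "23503" -> FailureKind.MISSING_REFERENCE; // foreign_key_violation
            // serialization_failure, deadlock_detected, lock_not_available
            case "40001", "40P01", "55P03" -> FailureKind.RETRYABLE;
            default -> FailureKind.UNKNOWN;
        };
    }
}
```

Keeping an explicit UNKNOWN category matters: unclassified failures should stay visible as incidents rather than being retried or reported as user errors.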


25. Repository Testing Strategy

Repository tests should prove actual persistence behavior.

Use:

  • real database via Testcontainers when possible
  • migration scripts, not auto-generated schema
  • SQL count/assertions for fetch behavior
  • transaction boundaries that match production use
  • tests for constraints and locking

@DataJpaTest

@DataJpaTest is useful for focused repository tests, but beware:

  • default rollback can hide commit-time failures
  • in-memory DB can differ from production DB
  • lazy loading may work accidentally inside test transaction
  • schema auto-generation may hide migration mismatch

Better for serious persistence:

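@DataJpaTest
@Testcontainers
// Keep the container's DataSource instead of swapping in an embedded database.
@AutoConfigureTestDatabase(replace = AutoConfigureTestDatabase.Replace.NONE)
class CaseRepositoryTest {

    @Container
    @ServiceConnection // Spring Boot 3.1+; older versions need @DynamicPropertySource instead
    static PostgreSQLContainer<?> postgres = new PostgreSQLContainer<>("postgres:16");

    @Test
    void findQueueItemsDoesNotLoadEntities() {
        // assert projection query shape or SQL count
    }
}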

Test What Matters

| Repository Feature | Test |
| --- | --- |
| projection | columns/shape and mapping |
| pagination | deterministic order and count behavior |
| entity graph | query count / no N+1 |
| locking | concurrent transaction test |
| bulk update | persistence context clear and affected rows |
| unique invariant | duplicate insert fails |
| soft delete | deleted rows excluded consistently |
| tenant filter | cross-tenant data not returned |

26. Multi-Tenant Repository Methods

Every tenant-owned query must include tenant scope.

Bad:

Optional<Case> findById(UUID id);

If case IDs are globally unique, this may be technically safe, but still often weak as a policy boundary.

Better:

Optional<Case> findByTenantIdAndId(UUID tenantId, UUID id);

For projections:

@Query("""
    select new com.example.CaseSummary(c.id, c.referenceNo, c.status)
    from Case c
    where c.tenantId = :tenantId
      and c.id = :id
""")
Optional<CaseSummary> loadSummary(
    @Param("tenantId") UUID tenantId,
    @Param("id") UUID id
);

Rule

Repository methods should make tenant scope visible unless tenant isolation is enforced below the application layer and tested thoroughly.

Part 029 will go deeper into multitenancy.


27. Security-Aware Repository Design

Repository should not usually decide user permissions. But it must support safe filtering.

Example:

@Query("""
    select c
    from Case c
    join c.assignments a
    where c.tenantId = :tenantId
      and a.userId = :viewerId
      and c.id = :caseId
""")
Optional<Case> findVisibleCase(
    @Param("tenantId") UUID tenantId,
    @Param("viewerId") UUID viewerId,
    @Param("caseId") UUID caseId
);

This is useful for query-side visibility.

But command authorization should usually be explicit in service/domain layer:

@Transactional
public void approve(UUID tenantId, UUID caseId, UUID actorId) {
    Case c = repository.findByTenantIdAndId(tenantId, caseId).orElseThrow();
    authorization.assertCanApprove(actorId, c);
    c.approve(actorId);
}

Do not hide complex authorization inside vague repository queries unless the naming makes it clear.


28. Repository Layer Diagram

The repository sits below application use cases but above raw JPA provider mechanics.

It should not become:

  • controller helper
  • business rule engine
  • generic SQL bag
  • transaction orchestrator
  • API response factory

29. Repository Anti-Patterns

Anti-Pattern 1: Entity Repository Used Directly by Controller

@GetMapping("/orders")
public List<Order> list() {
    return orderRepository.findAll();
}

Problems:

  • unbounded read
  • entity leakage
  • serialization/lazy loading risk
  • no use-case boundary

Better:

@GetMapping("/orders")
public Page<OrderSummaryResponse> list(Pageable pageable) {
    return orderQueryService.listOrders(pageable);
}

Anti-Pattern 2: findAll() in Production Path

List<Case> cases = repository.findAll();

This is almost always wrong for growing tables.

Better:

  • pagination
  • streaming/chunking
  • bounded query
  • batch processing with cursor/keyset

Anti-Pattern 3: Repository as Business Logic Container

@Query("""
    select c
    from Case c
    where c.status = 'OPEN'
      and c.slaDeadline < current_timestamp
      and c.escalationCount < 3
      and c.region.riskScore > 80
      and c.assignee.active = true
""")
List<Case> findCasesThatShouldBeEscalated();

Some filtering belongs in the query. But policy logic should be named and tested as domain/application logic.

Better:

List<Case> findEscalationCandidates(Instant cutoff);

Then:

for (Case c : candidates) {
    if (escalationPolicy.shouldEscalate(c, now)) {
        c.escalate(systemActor, now);
    }
}

Balance query efficiency with policy clarity.
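The escalationPolicy used in the loop is not defined in this part, so the following is one possible shape for it, offered as a sketch. The thresholds are copied from the anti-pattern query above; EscalationPolicySketch and CaseFacts are hypothetical names, and the policy works on a small facts record rather than the JPA entity so that it can be unit tested without a database.

```java
import java.time.Instant;

// Hypothetical policy object: names and thresholds are illustrative only.
final class EscalationPolicySketch {

    // Minimal view of the data the policy needs (not the JPA entity).
    record CaseFacts(boolean open, Instant slaDeadline, int escalationCount,
                     int regionRiskScore, boolean assigneeActive) {}

    static final int MAX_ESCALATIONS = 3;
    static final int RISK_THRESHOLD = 80;

    // Every rule that was buried in the query is now a named, testable condition.
    static boolean shouldEscalate(CaseFacts c, Instant now) {
        return c.open()
            && c.slaDeadline().isBefore(now)
            && c.escalationCount() < MAX_ESCALATIONS
            && c.regionRiskScore() > RISK_THRESHOLD
            && c.assigneeActive();
    }
}
```

The repository query can still pre-filter on cheap, indexed conditions such as status and deadline; the policy then makes the final decision in code where it is visible to review and tests.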


Anti-Pattern 4: Giant Derived Queries

findByAAndBOrCAndDAndEInAndFBetweenAndGIsNullAndHNot(...)

Hard to review, hard to modify, easy to misread.

Better:

  • explicit @Query
  • Specification
  • custom repository
  • named query object

Anti-Pattern 5: Blind save() After Every Change

entity.changeSomething();
repository.save(entity);

If entity is managed, this is unnecessary. If entity is detached, it may trigger merge semantics you did not intend.

Better:

  • understand managed state
  • use save for new aggregate or deliberate merge
  • avoid detached mutation in web/API layer

Anti-Pattern 6: Bulk Delete Without Domain/Audit Awareness

repository.deleteAllByStatus(CLOSED);

This may violate retention, audit, reporting, or FK constraints.

Better:

  • model archival/retention explicitly
  • use migration/maintenance job with audit
  • document bulk semantics

30. Production Repository Review Checklist

For each repository method, ask:

Semantics

  • Is this command-side or query-side?
  • Does the method name reveal intent?
  • Does it return entity or projection deliberately?
  • Is cardinality bounded?
  • Does it need tenant/security scope?

Query Shape

  • Is the generated SQL predictable?
  • Does it join/fetch intentionally?
  • Can it cause N+1?
  • Does pagination have stable sorting?
  • Is count query acceptable?

Transaction/Concurrency

  • Who owns transaction boundary?
  • Does this need optimistic/pessimistic locking?
  • Does it rely on exists/count for invariant?
  • Are stale reads acceptable?

Write Path

  • Does this method mutate through entity lifecycle or bulk operation?
  • Are callbacks/domain events expected?
  • Is persistence context cleared after bulk update?
  • Is L2/application cache invalidated?

Operations

  • Can this query be observed in logs/traces?
  • Is there a performance test for high-cardinality data?
  • Are DB indexes aligned with predicates/order?
  • Does it behave correctly on production database dialect?

A strong team style guide might say:

  1. Controllers must not return JPA entities.
  2. Repositories are per aggregate root by default.
  3. Command methods load aggregates by explicit intent: findForDecision, findForUpdate.
  4. Query/list screens return DTO projections, not entities.
  5. Every pageable query must have deterministic sort.
  6. findAll() is forbidden in production paths unless the table holds bounded reference data.
  7. Derived queries are limited to simple lookup.
  8. Complex queries use @Query, Specification, Querydsl, or custom repository.
  9. Bulk methods must be prefixed with bulk and document cache/persistence-context behavior.
  10. Tenant scope must be explicit or enforced by tested infrastructure.
  11. Repository tests must run against the production database family.
  12. save() after modifying a managed entity should be challenged in code review.
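
Rule 5 can be illustrated without a database. The sketch below is a plain-Java model, not Spring Data code: QueueRow and StablePaging are hypothetical names, and the stable in-memory sort stands in for a database that may return rows with equal sort keys in any order. Sorting by deadline alone lets the page content depend on the incoming row order; adding the unique id as a tie-breaker removes that dependency.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical row type; integers keep the example short.
record QueueRow(int id, int slaDeadline) {}

final class StablePaging {

    // Ambiguous: rows with the same deadline keep whatever order the input had.
    static final Comparator<QueueRow> BY_DEADLINE =
        Comparator.comparingInt(QueueRow::slaDeadline);

    // Deterministic: the unique id breaks every tie.
    static final Comparator<QueueRow> BY_DEADLINE_THEN_ID =
        BY_DEADLINE.thenComparingInt(QueueRow::id);

    // Returns one zero-based page after sorting a copy of the rows.
    static List<QueueRow> page(List<QueueRow> rows, Comparator<QueueRow> order,
                               int pageNumber, int pageSize) {
        List<QueueRow> sorted = new ArrayList<>(rows);
        sorted.sort(order);
        int from = Math.min(pageNumber * pageSize, sorted.size());
        int to = Math.min(from + pageSize, sorted.size());
        return List.copyOf(sorted.subList(from, to));
    }
}
```

The same idea is what "order by c.slaDeadline asc, c.id asc" expresses in a JPQL query: the last sort key should be unique so that consecutive pages neither repeat nor skip rows.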

32. Example: Production-Grade Repository Slice

Entity:

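import jakarta.persistence.Column;
import jakarta.persistence.Entity;
import jakarta.persistence.EnumType;
import jakarta.persistence.Enumerated;
import jakarta.persistence.FetchType;
import jakarta.persistence.Id;
import jakarta.persistence.Index;
import jakarta.persistence.JoinColumn;
import jakarta.persistence.ManyToOne;
import jakarta.persistence.Table;
import jakarta.persistence.Version;
import java.time.Instant;
import java.util.UUID;

@Entity
@Table(
    name = "cases",
    indexes = {
        @Index(name = "idx_case_queue", columnList = "tenant_id, assignee_id, status, sla_deadline, id")
    }
)
public class Case {

    @Id
    private UUID id;

    @Column(name = "tenant_id", nullable = false)
    private UUID tenantId;

    // Read by the CaseQueueItem projection; the column name is an assumption.
    @Column(name = "reference_no")
    private String referenceNo;

    // The queue query filters on c.assignee.id; User stands in for whatever entity owns assignee_id.
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "assignee_id")
    private User assignee;

    @Version
    private long version;

    @Enumerated(EnumType.STRING)
    private CaseStatus status;

    @Column(name = "sla_deadline")
    private Instant slaDeadline;

    protected Case() {
    }

    // Guarded state transition; recording reviewerId is left out of this slice.
    public void approve(UUID reviewerId) {
        if (status != CaseStatus.IN_REVIEW) {
            throw new InvalidCaseTransitionException(id, status, CaseStatus.APPROVED);
        }
        this.status = CaseStatus.APPROVED;
    }
}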

Projection:

public record CaseQueueItem(
    UUID id,
    String referenceNo,
    CaseStatus status,
    Instant slaDeadline
) {}

Repository:

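import jakarta.persistence.LockModeType;
import java.time.Instant;
import java.util.Optional;
import java.util.UUID;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.Pageable;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Lock;
import org.springframework.data.jpa.repository.Modifying;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.query.Param;

public interface CaseRepository extends JpaRepository<Case, UUID>, CaseSearchRepository {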

    @Lock(LockModeType.OPTIMISTIC)
    @Query("""
        select c
        from Case c
        where c.tenantId = :tenantId
          and c.id = :id
    """)
    Optional<Case> findForDecision(
        @Param("tenantId") UUID tenantId,
        @Param("id") UUID id
    );

    @Query("""
        select new com.example.caseapp.CaseQueueItem(
            c.id,
            c.referenceNo,
            c.status,
            c.slaDeadline
        )
        from Case c
        where c.tenantId = :tenantId
          and c.assignee.id = :assigneeId
          and c.status = com.example.caseapp.CaseStatus.OPEN
        order by c.slaDeadline asc, c.id asc
    """)
    Page<CaseQueueItem> findQueueItems(
        @Param("tenantId") UUID tenantId,
        @Param("assigneeId") UUID assigneeId,
        Pageable pageable
    );

    @Modifying(clearAutomatically = true, flushAutomatically = true)
    @Query("""
        update Case c
        set c.status = com.example.caseapp.CaseStatus.EXPIRED
        where c.tenantId = :tenantId
          and c.status = com.example.caseapp.CaseStatus.OPEN
          and c.slaDeadline < :cutoff
    """)
    int bulkExpireOpenCases(
        @Param("tenantId") UUID tenantId,
        @Param("cutoff") Instant cutoff
    );
}

Service:

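import java.util.UUID;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service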
public class CaseCommandService {

    private final CaseRepository repository;

    public CaseCommandService(CaseRepository repository) {
        this.repository = repository;
    }

    @Transactional
    public void approve(UUID tenantId, UUID caseId, UUID reviewerId) {
        Case c = repository.findForDecision(tenantId, caseId)
            .orElseThrow(CaseNotFoundException::new);

        c.approve(reviewerId);
    }
}

This design makes the following explicit:

  • tenant scope
  • command vs query path
  • projection for list screen
  • optimistic locking for decision
  • stable pagination order
  • bulk operation naming
  • service-owned transaction boundary

33. Key Takeaways

  • Spring Data JPA removes boilerplate, not persistence semantics.
  • Repository methods should express use-case intent, not just property filters.
  • save() is not immediate SQL and is often unnecessary for managed entities.
  • Derived queries are excellent for simple lookup and poor for complex business queries.
  • Return type is a contract: entity for command path, projection for read path.
  • Pagination must be deterministic and count cost must be considered.
  • Locks, entity graphs, query hints, and bulk operations should be visible in method naming or annotations.
  • Service/application layer should usually own transaction boundaries.
  • Repositories should respect aggregate, tenant, and security boundaries.
  • Repository tests must prove real database behavior, not just mock interactions.

34. Where This Leads Next

Part 025 moves from repository mechanics into transactional service boundaries:

  • where @Transactional belongs
  • command handler patterns
  • read-only transaction design
  • Open Session in View
  • domain event timing
  • side effects and after-commit behavior
  • preventing transaction leakage across layers