Learn Java Hibernate ORM and EclipseLink - Part 002

A provider-level mental model of Jakarta Persistence, Hibernate ORM, EclipseLink, persistence context, unit of work, metadata, query translation, flush, caching, JDBC, and the relational database boundary.

Part 002 — ORM Provider Mental Model: Spec, Provider, Runtime, Database

If Part 001 was the learning map, Part 002 is the machine map. We will look at ORM not as an annotation framework but as a runtime engine that holds state, translates queries, tracks changes, manages flush, uses caches, and ultimately answers to the database.

Hibernate ORM and EclipseLink are both implementations of Jakarta Persistence. But "implements the spec" does not mean identical behavior in every area. The spec defines the main contract. The provider fills in the runtime details: metadata model, lazy loading mechanics, dirty checking, flush ordering, cache architecture, query optimization, weaving/enhancement, and extension APIs.

The correct mental model:

Jakarta Persistence gives the contract.
Hibernate/EclipseLink execute the contract.
The persistence context holds runtime identity and state.
The database enforces final truth through constraints, locks, transactions, and query plans.

1. Four-Layer Model

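To analyze an ORM problem, use four layers: the spec (Jakarta Persistence, the contract), the provider (Hibernate or EclipseLink, which executes the contract), the runtime (the persistence context, which holds identity and state), and the database (constraints, locks, transactions, and query plans).

Every ORM bug can be mapped to one of these layers. Do not blame the provider right away. Often the problem comes from a blurry boundary, a mapping that does not represent an invariant, a wrong query shape, or a database constraint that differs from what the object model assumes.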


2. Jakarta Persistence as the Contract

Jakarta Persistence defines a standard for object/relational mapping and persistence management in Java. The standard matters because it provides a shared vocabulary:

  • entity;
  • persistence unit;
  • entity manager;
  • persistence context;
  • lifecycle state;
  • mapping annotation;
  • JPQL;
  • Criteria API;
  • transaction interaction;
  • lock modes;
  • cache API;
  • lifecycle callback.

However, the spec deliberately does not regulate every internal detail of a provider. For example, it does not force providers to use the same dirty checking algorithm, the same cache implementation, the same lazy loading mechanism, or an identical SQL generation strategy.

As a result, engineers need to distinguish two categories of knowledge:

| Category | Examples | Impact |
|---|---|---|
| Spec-level knowledge | entity lifecycle, EntityManager, JPQL, lock mode, standard mappings | Relatively portable |
| Provider-level knowledge | Hibernate Session, ActionQueue, bytecode enhancement, EclipseLink weaving, descriptor customizer, query hints | Powerful but needs isolation |

Rule:

When explaining a behavior, always mark whether it comes from the spec, the provider, framework integration, or the database.


3. Provider as Execution Engine

An ORM provider has several main responsibilities.

3.1 Metadata Interpretation

The provider reads entity classes and mapping metadata, then builds an internal model:

  • which classes are entities;
  • which fields/properties are persistent;
  • table dan column mapping;
  • identifier strategy;
  • association ownership;
  • inheritance strategy;
  • converter/type mapping;
  • cascade rule;
  • fetch rule;
  • lifecycle callback;
  • cache policy;
  • provider-specific annotation/hint.

Metadata is not just passive configuration. It determines how the provider generates SQL, manages lifecycle, performs dirty checking, and orders the flush.

3.2 Identity Management

Within a persistence context, the provider maintains the relationship between database identity and object instance.

Example invariant:

Within one persistence context:
Database row case(id = 100) -> one managed RegulatoryCase instance

If the application tries to introduce another object with the same ID, the provider must handle the conflict. In Hibernate, this conflict often surfaces as an exception about a duplicate representation or a non-unique object, depending on the operation. In EclipseLink, UnitOfWork and the identity map follow a clone/registration model that needs to be understood.
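The invariant can be illustrated with a toy identity map. This is a minimal sketch in plain Java, not provider code: ToyPersistenceContext, Row, the loads counter, and the attach method are hypothetical names invented for illustration, and real providers key their maps by entity type as well as ID.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy first-level identity map; a teaching model, not Hibernate or EclipseLink code.
final class ToyPersistenceContext {
    record Row(long id, String caseNumber) {}   // hypothetical database row

    static final class RegulatoryCase {
        final long id;
        String caseNumber;
        RegulatoryCase(long id, String caseNumber) { this.id = id; this.caseNumber = caseNumber; }
    }

    private final Map<Long, RegulatoryCase> managed = new HashMap<>();
    int loads = 0;   // how many times a row was turned into a new instance

    // Returns the already-managed instance if present; otherwise "loads" the row once.
    RegulatoryCase find(long id, Function<Long, Row> database) {
        RegulatoryCase existing = managed.get(id);
        if (existing != null) return existing;
        Row row = database.apply(id);
        if (row == null) return null;
        loads++;
        RegulatoryCase c = new RegulatoryCase(row.id(), row.caseNumber());
        managed.put(id, c);
        return c;
    }

    // Attaching a different instance whose id is already managed is a conflict.
    void attach(RegulatoryCase c) {
        RegulatoryCase existing = managed.putIfAbsent(c.id, c);
        if (existing != null && existing != c) {
            throw new IllegalStateException("id " + c.id + " is already managed by another instance");
        }
    }

    public static void main(String[] args) {
        ToyPersistenceContext ctx = new ToyPersistenceContext();
        Function<Long, Row> db = id -> new Row(id, "CASE-" + id);
        RegulatoryCase a = ctx.find(100L, db);
        RegulatoryCase b = ctx.find(100L, db);
        System.out.println(a == b);     // true: same managed instance
        System.out.println(ctx.loads);  // 1: the second find did not load again
    }
}
```

The exception in attach stands in for the provider-specific conflict handling described above; the actual exception type and timing differ per provider and per operation.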

3.3 Change Tracking

The provider determines whether a managed entity has changed. The strategy can be:

  • snapshot comparison;
  • field interception;
  • bytecode enhancement;
  • weaving;
  • attribute-level tracking;
  • collection wrapper tracking.

Dirty checking is the bridge between object mutation and SQL UPDATE.
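Of the strategies listed, snapshot comparison is the easiest to picture. The sketch below is a toy version under simplifying assumptions: entity state is modeled as a map of attribute names to values, and SnapshotDirtyCheck and dirtyAttributes are hypothetical names, not part of any provider API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Toy snapshot-based dirty checking; real providers track state per entity type and attribute.
final class SnapshotDirtyCheck {
    // Compares the snapshot taken at load time with the current state
    // and returns the names of attributes whose values differ.
    static List<String> dirtyAttributes(Map<String, Object> snapshot, Map<String, Object> current) {
        List<String> dirty = new ArrayList<>();
        for (Map.Entry<String, Object> e : current.entrySet()) {
            if (!Objects.equals(snapshot.get(e.getKey()), e.getValue())) {
                dirty.add(e.getKey());
            }
        }
        return dirty;
    }

    public static void main(String[] args) {
        Map<String, Object> snapshot = new LinkedHashMap<>();
        snapshot.put("status", "DRAFT");
        snapshot.put("caseNumber", "CASE-2026-0001");

        Map<String, Object> current = new LinkedHashMap<>(snapshot);
        current.put("status", "OPEN"); // in-memory mutation of a managed entity

        List<String> dirty = dirtyAttributes(snapshot, current);
        System.out.println(dirty.isEmpty() ? "no UPDATE needed" : "UPDATE needed for " + dirty);
    }
}
```

The cost of this approach grows with the number of managed entities, which is one reason interception, enhancement, and weaving exist as alternatives.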

3.4 Query Translation

JPQL/HQL/Criteria are not SQL. The provider translates them to SQL based on:

  • dialect/platform database;
  • mapping metadata;
  • join path;
  • fetch plan;
  • parameter binding;
  • pagination;
  • lock mode;
  • query hints;
  • inheritance strategy;
  • discriminator;
  • filters/additional criteria;
  • cache setting.

The generated SQL is not only a function of the query string. It is also a function of metadata and provider behavior.

3.5 Flush and Commit Coordination

The provider accumulates changes in memory. At flush, it turns those changes into SQL:

  • insert new entities;
  • update dirty entities;
  • delete removed entities;
  • update FKs;
  • insert/delete join table rows;
  • maintain collection tables;
  • execute version updates;
  • enforce ordering so that constraints fail as rarely as possible.

Commit is the responsibility of the database transaction. A flush can happen before commit.
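The idea that pending work is queued in memory and ordered at flush can be sketched with a toy queue. This is an illustration under stated assumptions, not a provider's algorithm: ToyFlushQueue, Kind, and the table names are hypothetical, and the simplified order used here (inserts, then updates, then deletes) ignores collection actions and foreign-key dependencies between entities.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy flush queue: actions are recorded in memory and ordered only when flushed.
final class ToyFlushQueue {
    enum Kind { INSERT, UPDATE, DELETE } // declaration order is the toy execution order

    record Action(Kind kind, String table) {}

    private final List<Action> pending = new ArrayList<>();
    final List<String> executedSql = new ArrayList<>();

    void enqueue(Kind kind, String table) { pending.add(new Action(kind, table)); }

    // Turns pending actions into "SQL"; the stable sort keeps call order within each kind.
    void flush() {
        pending.sort(Comparator.comparing(Action::kind));
        for (Action a : pending) executedSql.add(a.kind() + " " + a.table());
        pending.clear();
    }

    public static void main(String[] args) {
        ToyFlushQueue q = new ToyFlushQueue();
        q.enqueue(Kind.DELETE, "case_task");
        q.enqueue(Kind.UPDATE, "regulatory_case");
        q.enqueue(Kind.INSERT, "party");
        System.out.println(q.executedSql); // still empty: nothing runs before flush
        q.flush();
        System.out.println(q.executedSql); // ordered by kind, not by call order
    }
}
```

The point of the sketch is only that call order in application code and statement order at the database are two different things.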

3.6 Cache Coordination

A provider can have several cache layers:

  • first-level cache/persistence context;
  • second-level cache/shared cache;
  • query cache/result cache;
  • natural ID cache atau provider-specific cache;
  • coordination/invalidation mechanism.

Cache must be treated as a consistency mechanism, not merely as an optimization.


4. Hibernate ORM Mental Model

Hibernate has historically had its own native API (Session, SessionFactory) and also implements Jakarta Persistence (EntityManager, EntityManagerFactory). In modern practice, many applications use the JPA API on top of the Hibernate provider, but important behavior still comes from Hibernate internals.

4.1 Important Hibernate Concepts

| Concept | Mental model |
|---|---|
| SessionFactory | Immutable-ish heavyweight runtime factory built from metadata and services |
| Session | Unit of work / persistence context boundary, usually transaction-scoped |
| PersistenceContext | First-level cache and managed entity registry |
| ActionQueue | Ordered queue of pending inserts, updates, deletes, collection actions |
| Type system | Mapping between Java values and JDBC/database representation |
| Dialect | Database-specific SQL capability abstraction |
| Event system | Hooks for load, persist, flush, delete, dirty checking, etc. |
| Bytecode enhancement | Optional/required feature for advanced lazy loading and dirty tracking scenarios |
| Second-level cache | Shared cache outside a single session, configured by region/strategy |

4.2 Hibernate Biases

Hibernate is powerful and extension-rich. It often gives more knobs than the spec:

  • custom types;
  • filters;
  • formulas;
  • batch fetching;
  • subselect fetching;
  • fetch profiles;
  • interceptors and event listeners;
  • natural ID support;
  • stateless session;
  • second-level cache regions;
  • bytecode enhancement;
  • Hibernate-specific query features.

Engineering implication:

Hibernate can solve many advanced problems elegantly, but uncontrolled Hibernate-specific usage can create migration friction and hidden coupling.

4.3 Hibernate Debugging Questions

When debugging Hibernate behavior, ask:

1. Which Session owns this entity?
2. Is the entity managed, detached, removed, or proxy?
3. Is there an ActionQueue entry pending?
4. Did a query trigger auto-flush?
5. Is dirty checking snapshot-based or enhanced?
6. Did lazy loading happen through proxy or enhanced field interception?
7. Did second-level cache participate?
8. Which dialect generated this SQL?
9. Did batching actually occur?
10. Is behavior from JPA API or Hibernate extension?

5. EclipseLink Mental Model

EclipseLink is also a Jakarta Persistence provider, but its architecture vocabulary differs. It has deep concepts around sessions, projects, descriptors, weaving, indirection, identity maps, and UnitOfWork.

5.1 Important EclipseLink Concepts

| Concept | Mental model |
|---|---|
| ServerSession | Shared session representing database login, descriptors, platform, cache |
| ClientSession | Session view often associated with client/unit of work usage |
| UnitOfWork | Tracks clones/changes before commit |
| Descriptor | Runtime metadata for persistent class |
| Mapping | Attribute/relationship mapping inside descriptor |
| Weaving | Bytecode modification for lazy loading, change tracking, fetch groups, etc. |
| Indirection | Lazy reference/collection mechanism |
| Identity Map | Shared cache identity structure |
| DatabasePlatform | Database-specific SQL and capability abstraction |
| DescriptorCustomizer / SessionCustomizer | Extension points for runtime customization |

5.2 EclipseLink Biases

EclipseLink has strong provider-level features around:

  • weaving;
  • indirection;
  • shared cache and identity maps;
  • descriptor/session customization;
  • batch reading;
  • fetch groups;
  • multitenancy support;
  • additional criteria;
  • converters and transformation mappings;
  • database platform customization.

Engineering implication:

EclipseLink rewards understanding descriptors, weaving, and UnitOfWork. If you only bring a Hibernate mental model, you may misread EclipseLink behavior.

5.3 EclipseLink Debugging Questions

When debugging EclipseLink behavior, ask:

1. Is weaving active?
2. Is this object a clone in UnitOfWork or shared cached object?
3. Which descriptor mapping controls this attribute?
4. Is indirection/lazy loading active for this relationship?
5. Is shared cache returning stale data?
6. Is batch reading configured?
7. Did additional criteria affect the query?
8. Which DatabasePlatform generated SQL?
9. Did a descriptor/session customizer alter default behavior?
10. Is behavior from Jakarta Persistence or EclipseLink extension?

6. Persistence Context vs Unit of Work

JPA uses the term persistence context. Hibernate commonly maps this to Session internals. EclipseLink often exposes the UnitOfWork concept in its architecture.

The shared idea:

A runtime scope tracks objects loaded from or scheduled for persistence to the database.

But the internal implementation can differ.

| Concern | Hibernate leaning | EclipseLink leaning |
|---|---|---|
| Runtime scope | Session / persistence context | EntityManager backed by UnitOfWork/session |
| Identity tracking | PersistenceContext identity map | UnitOfWork clones + identity maps |
| Change tracking | snapshots/enhancement/collection wrappers | deferred change detection/weaving/change tracking policies |
| Lazy mechanism | proxies/enhancement/collections | indirection/weaving/fetch groups |
| Extension point | event listeners, interceptors, services, types | descriptors, customizers, session events |
| Cache vocabulary | first-level/second-level/query cache | identity map/shared cache/query results, depending on config |

Do not force one provider’s vocabulary onto the other. Translate concepts carefully.


7. Lifecycle of a Simple Write

Consider:

RegulatoryCase c = new RegulatoryCase("CASE-2026-0001");
entityManager.persist(c);
c.changeStatus(CaseStatus.OPEN);
entityManager.flush();

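Provider-level flow:

1. persist makes c managed in the persistence context and schedules an insert; depending on the ID strategy, the provider may need to obtain the identifier at this point.
2. changeStatus mutates the in-memory state of the managed instance; no SQL is tied to the method call itself.
3. flush detects pending changes, orders the pending actions, and sends SQL through JDBC.
4. The database checks constraints when the statements execute; commit happens later, at the transaction boundary.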

Important observations:

  • persist does not necessarily execute SQL immediately.
  • ID strategy can force earlier SQL in some cases.
  • Field mutation after persist but before flush can be included in insert.
  • Constraint errors may appear at flush time, not at persist call.
  • Provider determines SQL ordering.
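The first and third observations can be made concrete with a toy write-behind model. This is a sketch only: ToyWriteBehind, its sqlLog, the default "DRAFT" status, and the table and column names are all hypothetical, and a real provider would use bound parameters rather than string concatenation.

```java
import java.util.ArrayList;
import java.util.List;

// Toy write-behind model of persist/flush; it mimics the flow, not any provider's internals.
final class ToyWriteBehind {
    static final class RegulatoryCase {
        final String caseNumber;
        String status = "DRAFT"; // hypothetical initial status
        RegulatoryCase(String caseNumber) { this.caseNumber = caseNumber; }
    }

    private final List<RegulatoryCase> scheduledInserts = new ArrayList<>();
    final List<String> sqlLog = new ArrayList<>();

    // persist only registers the instance; no SQL is produced here.
    void persist(RegulatoryCase c) { scheduledInserts.add(c); }

    // flush reads the entity state as it is now, not as it was at persist time.
    void flush() {
        for (RegulatoryCase c : scheduledInserts) {
            sqlLog.add("insert into regulatory_case (case_number, status) values ('"
                    + c.caseNumber + "', '" + c.status + "')");
        }
        scheduledInserts.clear();
    }

    public static void main(String[] args) {
        ToyWriteBehind em = new ToyWriteBehind();
        RegulatoryCase c = new RegulatoryCase("CASE-2026-0001");
        em.persist(c);
        c.status = "OPEN";                    // mutation after persist, before flush
        System.out.println(em.sqlLog.size()); // 0: persist produced no SQL
        em.flush();
        System.out.println(em.sqlLog.get(0)); // the insert carries status 'OPEN'
    }
}
```

In this model a constraint error could only surface inside flush, which mirrors the fourth observation.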

8. Lifecycle of a Simple Read

Consider:

RegulatoryCase c = entityManager.find(RegulatoryCase.class, id);
String name = c.getPrimaryParty().getDisplayName();

A read operation can involve:

  • first-level cache;
  • shared/second-level cache;
  • SQL select;
  • lazy load later;
  • proxy initialization;
  • weaving/indirection;
  • hydration;
  • entity registration;
  • implicit flush before query, depending on context.

A getter may therefore become a database operation.
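A toy lazy reference shows why. The sketch below is plain Java under stated assumptions: LazyRef and loadCount are hypothetical names, the loader lambda stands in for a SQL select, and real proxies, enhancement, and indirection are considerably more involved.

```java
import java.util.function.Supplier;

// Toy lazy reference: the first get() runs the loader, later calls reuse the value.
final class LazyRef<T> implements Supplier<T> {
    private final Supplier<T> loader;
    private T value;
    private boolean initialized;
    int loadCount = 0;

    LazyRef(Supplier<T> loader) { this.loader = loader; }

    @Override
    public T get() {
        if (!initialized) {
            value = loader.get();   // stands in for a SQL select
            initialized = true;
            loadCount++;
        }
        return value;
    }

    public static void main(String[] args) {
        LazyRef<String> primaryPartyName = new LazyRef<>(() -> "Acme Ltd"); // hypothetical data
        System.out.println(primaryPartyName.loadCount); // 0: nothing loaded yet
        primaryPartyName.get();
        primaryPartyName.get();
        System.out.println(primaryPartyName.loadCount); // 1: loaded once, on first access
    }
}
```

If the loader depended on an open connection, calling get() after that connection was closed would fail, which is the toy analogue of a lazy access failure outside the persistence context.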


9. Where Assumptions Commonly Fail

9.1 “I called save, so SQL happened”

Wrong mental model. ORM often delays SQL until flush. Some ID strategies or provider operations may force earlier SQL, but you should not assume every save call hits the database immediately.

9.2 “I queried, so I read the database truth”

Maybe. Query may return managed instances already present in persistence context. Even if SQL runs, provider can reconcile rows with existing managed objects. First-level cache can preserve stale in-memory state within a transaction.

9.3 “The entity is detached, but changing it should update later”

Detached object mutation is not tracked. A later merge copies state into a managed instance, which has different semantics and risks.

9.4 “Lazy means no query unless I explicitly query”

Accessing a lazy property/association can trigger SQL implicitly.

9.5 “If it works in Hibernate, it is portable to EclipseLink”

Not necessarily. Standard mappings are portable in intent, but provider details can differ in lazy loading, enhancement/weaving, query hints, cache, extension annotations, and edge behavior.


10. Spec vs Provider vs Database: Diagnostic Classification

When an incident occurs, classify the problem.

| Symptom | Likely layer | Example diagnosis |
|---|---|---|
| LazyInitializationException / lazy access failure | Boundary/provider | Entity escaped persistence context |
| Duplicate entity identity error | Persistence context | Two detached/new instances with same ID |
| Constraint violation during commit | Flush/database | SQL order or invalid object graph |
| N+1 query | Fetch plan/provider runtime | Lazy association accessed in loop |
| Deadlock | Transaction/database | Inconsistent update order or lock strategy |
| Stale data after bulk update | Persistence context/cache | Bulk SQL bypassed managed state/cache |
| Query slow despite few SQL statements | Database/query shape | Missing index, bad join, overhydration |
| Memory spike | Provider runtime | Large persistence context or huge hydration |
| Different behavior across providers | Provider-specific | Extension, weaving, cache, query hint difference |


11. Provider Portability Model

Portability is not binary. Use a spectrum.

Fully portable          Mostly portable            Provider-bound
JPA annotations  ->     Standard + hints      ->    Native provider APIs/extensions
JPQL standard           Minor behavior diff         Custom types/events/descriptors

11.1 Portable Zone

Examples:

  • standard entity annotation;
  • standard associations;
  • standard lifecycle callbacks;
  • JPQL subset;
  • standard lock modes;
  • standard EntityGraph API;
  • standard AttributeConverter.

Even here, SQL shape may vary.

11.2 Mostly Portable Zone

Examples:

  • query hints with provider-specific interpretation;
  • fetch graph/load graph behavior in edge cases;
  • schema generation settings;
  • cache settings;
  • lazy to-one behavior, depending on enhancement/weaving;
  • Criteria API features whose SQL rendering differs.

11.3 Provider-Bound Zone

Examples:

  • Hibernate custom types, filters, formulas, event listeners, stateless session;
  • EclipseLink descriptors, customizers, additional criteria, transformation mapping;
  • provider-specific batch/fetch annotations;
  • provider-specific cache coordination;
  • provider-specific multi-tenancy configuration.

Rule:

Provider-bound features are acceptable when they buy measurable correctness, performance, or maintainability. They must be isolated and tested.


12. Database Is the Final Authority

ORM does not eliminate database reality.

The database still owns:

  • unique constraints;
  • foreign keys;
  • check constraints;
  • transaction isolation;
  • locks;
  • indexes;
  • execution plans;
  • row visibility;
  • deadlock detection;
  • storage layout;
  • network round-trip cost;
  • write-ahead logging cost;
  • replication lag;
  • trigger behavior;
  • generated columns;
  • sequence allocation.

If object model says a relationship is optional but database says NOT NULL, database wins. If ORM generates a query that cannot use an index, optimizer behavior wins. If two transactions lock rows in opposite order, database deadlock detection wins.

Architecture principle:

The ORM model should express domain intent, but the database schema must enforce non-negotiable invariants.


13. The Runtime Cost Model

Every ORM operation has costs.

| Cost type | Example | How to observe |
|---|---|---|
| SQL round trip | N+1 lazy loads | SQL count/log/statistics |
| Rows scanned | missing index | execution plan |
| Rows returned | overbroad join fetch | result size/log/plan |
| Object hydration | loading full aggregate for list page | allocation profiler/statistics |
| Dirty checking | huge persistence context | flush time/profile |
| Collection diff | large @OneToMany mutation | SQL log/flush metrics |
| Cache lookup | second-level cache overhead | cache hit/miss metrics |
| Lock wait | pessimistic lock/deadlock | DB lock view/logs |
| Batch failure | identity generation or flush ordering | JDBC batch logs/statistics |

Performance engineering starts by identifying which cost dominates.

Bad diagnosis examples:

  • Fixing N+1 by adding cache when the real issue is fetch planning.
  • Adding indexes when the real issue is overhydration.
  • Increasing connection pool when the real issue is lock wait.
  • Using native SQL when the real issue is wrong transaction boundary.
  • Enabling second-level cache for mutable workflow data without invalidation.

14. Boundary Model: Where Entities Must Not Leak

Entity objects are stateful persistence objects. They should not leak carelessly across boundaries.

Dangerous boundaries:

  • REST response serialization;
  • GraphQL resolver without fetch planning;
  • async job queue payload;
  • Kafka/event message;
  • cache outside ORM;
  • UI session;
  • audit snapshot without explicit copy;
  • equals/hashCode in collection after ID mutation;
  • logging that accesses lazy associations;
  • validation that traverses lazy graph unexpectedly.

Safer boundary pattern: explicit translation. The entity stays inside the transaction/persistence boundary; a DTO or read model is what crosses the external boundary.
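One way to picture the translation step is a mapping function that runs while the entity is still managed and returns an immutable value. The sketch below uses plain Java stand-ins, not JPA types: BoundaryMapping, toSummary, and the openTaskTitles field are hypothetical, and this CaseSummary shape is simplified for illustration.

```java
import java.util.List;

// Sketch of explicit entity-to-DTO translation at the boundary (plain Java stand-ins).
final class BoundaryMapping {
    // Stand-in for a managed entity; a real one would carry lazy associations.
    static final class RegulatoryCase {
        final String caseNumber;
        final String status;
        final List<String> openTaskTitles;
        RegulatoryCase(String caseNumber, String status, List<String> openTaskTitles) {
            this.caseNumber = caseNumber;
            this.status = status;
            this.openTaskTitles = openTaskTitles;
        }
    }

    // Immutable value with no persistence state; safe to serialize, queue, or cache.
    record CaseSummary(String caseNumber, String status, int openTaskCount) {}

    // Intended to be called while the entity is still inside its persistence boundary.
    static CaseSummary toSummary(RegulatoryCase c) {
        return new CaseSummary(c.caseNumber, c.status, c.openTaskTitles.size());
    }

    public static void main(String[] args) {
        RegulatoryCase c = new RegulatoryCase("CASE-2026-0001", "OPEN", List.of("Review filing"));
        System.out.println(toSummary(c));
    }
}
```

Because the record copies only the values it needs, nothing downstream can trigger a lazy load or hold a reference to managed state.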


15. A Practical Classification of ORM Use Cases

Not every persistence use case should be solved with entity loading.

| Use case | Preferred pattern | Reason |
|---|---|---|
| Mutate aggregate with invariants | Managed entity aggregate | Need lifecycle, dirty checking, optimistic lock |
| Simple lookup by ID | Entity or DTO, depending on boundary | Entity if mutation follows; DTO if read-only external response |
| List/search page | DTO projection/read model | Avoid overhydration and N+1 |
| Reporting | Native SQL/view/materialized read model | SQL shape matters more than entity lifecycle |
| High-volume import | Batch ORM with flush/clear or native bulk | Control memory and batching |
| Mass update | JPQL/native bulk + context cleanup | Avoid hydrating every row |
| Reference data | Entity + cache if immutable/rarely changed | Cache may be justified |
| Audit history | Append-only table/entity or Envers-like tool | Immutable/event-like semantics |
| Cross-service integration | DTO/event schema, not entity | Avoid persistence leakage |

Rule:

ORM entity loading is best when object lifecycle and invariants matter. Projection/native/read model is often better when data shape and query performance dominate.


16. Hibernate vs EclipseLink: First Decision Matrix

This is not a winner-takes-all comparison. Both are mature providers. The right question is: which provider aligns with the system constraints?

| Dimension | Hibernate ORM | EclipseLink |
|---|---|---|
| Ecosystem usage | Very common in Spring-centric stacks | Strong Jakarta EE / Eclipse ecosystem heritage |
| Extension richness | Very broad Hibernate-specific feature set | Strong descriptor/session/weaving/customizer model |
| Mental model vocabulary | Session, ActionQueue, Type system, events | Session, Descriptor, UnitOfWork, weaving, identity map |
| Lazy mechanism emphasis | proxies + enhancement | indirection + weaving |
| Change tracking | snapshots + enhancement options | deferred/change tracking policies + weaving |
| Cache model | second-level cache regions and strategies | shared cache/identity maps and coordination features |
| Query extension | HQL and Hibernate query features | EclipseLink query hints/features |
| Migration risk | Hibernate-specific annotations/APIs can lock in | EclipseLink-specific descriptors/customizers can lock in |
| Best learning approach | Understand Session/flush/type/query engine | Understand UnitOfWork/descriptors/weaving/cache |

Do not choose provider only by popularity. Choose based on stack integration, operational familiarity, required extension points, migration constraints, team expertise, and performance/correctness requirements.


17. How to Read ORM Documentation Efficiently

A common mistake is reading docs linearly. Better approach:

  1. Start with lifecycle and persistence context.
  2. Learn mapping only when tied to SQL effect.
  3. Learn query language with generated SQL inspection.
  4. Learn fetch strategies after reproducing N+1.
  5. Learn cache only after understanding transaction/write paths.
  6. Learn provider extensions after knowing portable baseline.
  7. Learn internals only where they explain observable behavior.

Reading checklist:

For every documented feature:
1. What problem does it solve?
2. Is it spec-level or provider-specific?
3. What SQL/runtime behavior changes?
4. What failure mode does it prevent?
5. What new failure mode can it introduce?
6. How do I test it?
7. How would migration to another provider be affected?

18. Practice Drill: Classify the Source of Behavior

For each statement, classify as spec, provider, database, or framework integration.

  1. EntityManager.find() returns a managed entity if found.
  2. Hibernate can use an ActionQueue to order pending SQL actions.
  3. EclipseLink can use weaving for lazy loading and change tracking.
  4. PostgreSQL enforces unique constraints.
  5. Spring @Transactional defines application transaction boundary.
  6. JPQL bulk update bypasses already-managed object state unless synchronized manually.
  7. A lazy association access may fail after the persistence context is closed.
  8. A provider-specific query hint changes SQL generation.
  9. A database deadlock occurs because two transactions lock rows in different order.
  10. A DTO projection avoids entity hydration.

Expected classification:

| # | Classification |
|---|---|
| 1 | Spec-level concept |
| 2 | Hibernate provider-level |
| 3 | EclipseLink provider-level |
| 4 | Database-level |
| 5 | Framework integration |
| 6 | Spec/provider/runtime interaction |
| 7 | Provider/runtime boundary |
| 8 | Provider-level |
| 9 | Database/transaction-level |
| 10 | Application/query-shape design |

The point is not memorizing labels. The point is debugging with the correct mental layer.


19. Practice Drill: Predict the Runtime Path

Scenario:

@Transactional
public CaseSummary openCase(UUID caseId) {
    RegulatoryCase c = em.find(RegulatoryCase.class, caseId);
    c.markViewedBy(currentOfficerId);
    return new CaseSummary(
        c.getCaseNumber(),
        c.getStatus(),
        c.getPrimaryParty().getDisplayName(),
        c.getTasks().stream().filter(CaseTask::isOpen).count()
    );
}

Before running, predict:

  1. Is c managed?
  2. Does markViewedBy cause dirty state?
  3. When will SQL update happen?
  4. Does getPrimaryParty() trigger SQL?
  5. Does getTasks() trigger SQL?
  6. Is this endpoint at risk of N+1 if used in a list?
  7. Should this be entity-based or projection-based?
  8. What happens if transaction is read-only?
  9. Would Hibernate and EclipseLink produce identical SQL?
  10. What test would prove the query count?

Likely answer:

  • c is managed if found.
  • markViewedBy probably causes dirty state unless field is transient/read-only/no-op.
  • SQL update may happen at flush/commit or before a later query depending flush mode.
  • primaryParty may lazy load.
  • tasks may lazy load and hydrate entire collection just to count open tasks.
  • In a list, this is a strong N+1 risk.
  • For read-only summary, DTO projection or read model is likely better.
  • Read-only transaction behavior depends on framework/provider hints and should not be treated as a universal guarantee without verification.
  • SQL may differ by provider.
  • Query count regression test should assert expected statements.
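The N+1 risk and the idea of a query count assertion can be sketched with a toy statement counter. Everything here is hypothetical (QueryCountDemo, the static counter, the generated IDs); a real test would read the statement count from provider statistics or a JDBC-level proxy instead of a hand-maintained field.

```java
import java.util.ArrayList;
import java.util.List;

// Toy statement counter showing why per-row lazy access scales as 1 + N statements.
final class QueryCountDemo {
    static int statements = 0;

    static List<Long> findCaseIds(int n) {
        statements++; // one select for the list itself
        List<Long> ids = new ArrayList<>();
        for (long i = 1; i <= n; i++) ids.add(i);
        return ids;
    }

    static String loadPrimaryPartyName(long caseId) {
        statements++; // one extra select per case, as lazy access in a loop would cause
        return "party-" + caseId;
    }

    // Entity-style listing: touches the lazy association for every row.
    static int listWithLazyAccess(int n) {
        statements = 0;
        for (long id : findCaseIds(n)) loadPrimaryPartyName(id);
        return statements;
    }

    // Projection-style listing: a single statement returns exactly the needed columns.
    static int listWithProjection(int n) {
        statements = 0;
        statements++;
        return statements;
    }

    public static void main(String[] args) {
        System.out.println(listWithLazyAccess(20)); // 21
        System.out.println(listWithProjection(20)); // 1
    }
}
```

A regression test built on the same idea would fail as soon as a code change turns the single-statement path back into 1 + N.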

20. Production Mental Model Summary

When looking at ORM code, always run this internal pipeline:

1. What is the use case? Read, write, batch, report, audit, integration?
2. What is the transaction boundary?
3. Which objects are managed?
4. Which associations can lazy load?
5. What changes are dirty?
6. When will flush happen?
7. What SQL should appear?
8. How many rows will be scanned/returned/hydrated?
9. Which cache layers may participate?
10. What database constraints/locks can fail?
11. Which behavior is provider-specific?
12. How do we verify with logs, metrics, and tests?

This is the operational lens used throughout the rest of the series.


21. Key Takeaways

  • Jakarta Persistence gives a standard contract, not identical provider internals.
  • Hibernate and EclipseLink must be understood through their own vocabulary.
  • Persistence context/UnitOfWork is the runtime center of ORM behavior.
  • Flush timing, dirty checking, fetch planning, and cache are stateful concerns.
  • Database constraints, locks, and execution plans remain final authority.
  • Provider-specific features are useful but must be isolated and justified.
  • Debugging ORM requires classifying symptoms into application, spec, provider, framework, and database layers.

Part 003 will go deeper into bootstrapping: persistence units, metadata construction, service/session factory concepts, EclipseLink project/descriptors, configuration surfaces, and bytecode enhancement/weaving.

