Performance and Resource Efficiency
Learn Java Jakarta RESTful Web Services / JAX-RS - Part 026
Performance engineering for Jakarta REST services: request cost model, serialization, allocation, threading, async, virtual threads, connection pools, benchmarking, and tail latency.
Part 026 — Performance Engineering: Reflection, Serialization, Allocation, Threading, Virtual Threads, and Benchmarking
1. Learning Objective
This part covers Jakarta REST performance systemically. The goal is not to memorize micro-optimization tricks but to build a cost model, so that we can answer questions such as:
- Is the request slow because of routing, serialization, the database, the network, or a downstream call?
- Does async help, or does it only move the bottleneck?
- Will virtual threads improve throughput?
- Is JSON serialization a hotspot?
- Does tail latency come from a connection pool, GC, lock contention, or a retry storm?
- How do we test a REST endpoint correctly?
End goal:
Be able to view a Jakarta REST endpoint as a resource-consumption pipeline: CPU, memory allocation, threads, IO, connections, queues, locks, downstream dependencies, and serialization boundaries.
2. Kaufman Deconstruction
We break REST performance down into the following sub-skills.
| Sub-skill | Core question | Practical outcome |
|---|---|---|
| Cost model | What does one request cost? | Can find the bottleneck without guessing |
| Measurement | Which metrics are valid? | Can distinguish latency, throughput, and saturation |
| Serialization | How much does JSON/body mapping cost? | DTOs and providers are not wasteful |
| Allocation | Does the endpoint create too many objects? | GC pressure stays under control |
| Threading | Is the request blocking or non-blocking? | Thread pool does not run out |
| Connection pools | Which pool is the bottleneck? | DB/HTTP clients do not starve |
| Virtual threads | When do they help, and when not? | No overclaiming of runtime features |
| Benchmarking | What does a correct load test look like? | Results can be trusted |
| Tail latency | Why is p99 bad even when the average is good? | System is stable in production |
3. Performance Mental Model
A single Jakarta REST request passes through several stages.
Potential cost at each stage:
| Stage | Cost type |
|---|---|
| HTTP parse | CPU, allocation |
| Resource matching | CPU, route table lookup |
| Filters | CPU, IO if badly designed |
| Parameter conversion | CPU, validation, error handling |
| Body deserialization | CPU, allocation, reflection/codegen |
| Resource/service | Business logic, locks, DB, external calls |
| DTO mapping | CPU, allocation |
| Serialization | CPU, allocation, buffering |
| Response write | network IO, client speed |
Most REST performance problems are not caused by @GET or @Path. They are caused by expensive work hidden behind a clean endpoint.
4. Measure Before Optimizing
Do not start with tuning.
Start with four questions:
- What is the endpoint SLO?
- What is the traffic shape?
- Where is time spent?
- Which resource saturates first?
Example SLO:
GET /cases/{caseId}
p50 < 50 ms
p95 < 200 ms
p99 < 500 ms
error rate < 0.1%
Traffic shape:
Peak: 300 RPS
Payload: median 4 KB, p99 80 KB
Dependencies: DB + document metadata service
Concurrency: 500 active requests
Without SLO and traffic shape, “fast” has no meaning.
5. Latency vs Throughput vs Saturation
Definitions:
| Term | Meaning |
|---|---|
| Latency | Time for one request to complete |
| Throughput | Requests per second completed |
| Concurrency | Requests in flight |
| Saturation | Resource utilization near capacity |
| Tail latency | High percentile latency, e.g. p95/p99 |
| Queueing | Waiting before actual work starts |
Little’s Law is useful:
concurrency ≈ throughput × latency
If endpoint handles 200 RPS with 500 ms average latency:
concurrency ≈ 200 × 0.5 = 100 in-flight requests
If latency jumps to 2 seconds under downstream slowness:
concurrency ≈ 200 × 2 = 400 in-flight requests
That can exhaust request threads, DB pool, or HTTP client pool.
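The arithmetic above can be turned into a quick capacity check. The following is a minimal sketch in plain Java; the class name `LittlesLaw` and the thread-count comparison are illustrative assumptions, not part of any Jakarta REST API.

```java
// Sketch only: Little's Law as a back-of-the-envelope capacity check.
// Class and method names are hypothetical.
final class LittlesLaw {
    private LittlesLaw() {}

    /** Estimated in-flight requests = throughput (req/s) x average latency (s). */
    static double inFlight(double requestsPerSecond, double avgLatencySeconds) {
        return requestsPerSecond * avgLatencySeconds;
    }

    /** True when the estimated concurrency exceeds the available request threads. */
    static boolean exceedsThreads(double requestsPerSecond,
                                  double avgLatencySeconds,
                                  int requestThreads) {
        return inFlight(requestsPerSecond, avgLatencySeconds) > requestThreads;
    }
}
```

With the numbers from this section, `inFlight(200, 0.5)` gives 100 and `inFlight(200, 2.0)` gives 400, so a pool of 200 request threads (an assumed size) would be sufficient at normal latency and exhausted once latency reaches 2 seconds.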
6. Average Latency Is Misleading
Average hides pain.
Example:
| Percentile | Latency |
|---|---|
| p50 | 40 ms |
| p90 | 120 ms |
| p95 | 300 ms |
| p99 | 2,500 ms |
Average may look acceptable, but p99 indicates some users hit severe delay. In distributed systems, tail latency compounds. If one user action calls 5 backend APIs, each with p99 risk, the user-facing p99 can become very bad.
Always capture:
- p50,
- p90,
- p95,
- p99,
- max only for debugging, not SLO,
- error rate,
- saturation indicators.
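To make the gap between average and percentiles concrete, here is a small sketch that computes nearest-rank percentiles from raw samples and estimates how tail risk compounds across fan-out calls. It assumes the backend calls are independent, which real systems only approximate; `TailLatency` and its methods are hypothetical names.

```java
import java.util.Arrays;

// Sketch only: percentile and fan-out arithmetic on raw latency samples.
final class TailLatency {
    private TailLatency() {}

    /** Nearest-rank percentile: the smallest sample with at least p% of samples at or below it. */
    static long percentile(long[] samplesMs, double p) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Arithmetic mean, shown only to contrast with the percentiles. */
    static double mean(long[] samplesMs) {
        return Arrays.stream(samplesMs).average().orElse(0.0);
    }

    /** Probability that at least one of n independent calls lands beyond its own p99. */
    static double chanceOfSlowCall(int n) {
        return 1.0 - Math.pow(0.99, n);
    }
}
```

For the ten samples 10, 20, ..., 90, 1000 ms, the mean is 145 ms while p50 is 50 ms and p99 is 1000 ms: one outlier shifts the average without describing what any user saw. For a user action that fans out to 5 backend calls, `chanceOfSlowCall(5)` is roughly 0.049, so about 1 in 20 such actions hits at least one p99-slow call.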
7. Jakarta REST Runtime Overhead
Jakarta REST runtime overhead usually includes:
- route matching,
- annotation metadata lookup,
- provider selection,
- parameter injection,
- filters/interceptors,
- exception mapper resolution,
- entity provider invocation.
For most business APIs, this overhead is smaller than DB/external IO and serialization. But it can matter for:
- extremely high RPS small payload endpoints,
- gateway-like services,
- health/metrics endpoints under heavy scraping,
- event ingestion APIs,
- native-image/build-time optimized runtimes,
- environments with strict cold start.
Do not optimize route matching if 95% of latency is database query time.
8. Serialization Cost
JSON serialization/deserialization is often a major CPU/allocation cost.
Cost drivers:
- payload size,
- object graph depth,
- reflection/introspection,
- date/time formatting,
- enum conversion,
- polymorphism,
- null handling,
- unknown field handling,
- custom serializers/adapters,
- records vs mutable classes,
- buffering strategy.
Example problematic DTO:
public class CaseDetailResponse {
    public String caseId;
    public List<EvidenceResponse> evidence;
    public List<AuditEventResponse> fullAuditTrail;
    public Map<String, Object> dynamicAttributes;
    public Object rawWorkflowContext;
}
Problems:
- large nested collections,
- dynamic map defeats type discipline,
- raw workflow context may be huge,
- audit trail may not belong in case summary,
- serialization cost unpredictable.
Better:
public record CaseDetailResponse(
        String caseId,
        String status,
        String assignedTeam,
        OffsetDateTime updatedAt,
        List<LinkResponse> links
) {}
Then expose heavy subresources separately:
GET /cases/{caseId}/evidence
GET /cases/{caseId}/audit-events
GET /cases/{caseId}/decisions
Performance and contract design are connected.
9. Avoid Entity Exposure
Returning persistence entities directly is bad for API contract and performance.
Bad:
@GET
@Path("/{id}")
public CaseEntity getCase(@PathParam("id") String id) {
    return repository.find(id);
}
Performance risks:
- lazy-loading during serialization,
- N+1 queries hidden in JSON writer,
- circular references,
- huge object graph,
- accidental fields serialized,
- transaction/session boundary leak.
Better:
@GET
@Path("/{id}")
public CaseResponse getCase(@PathParam("id") String id) {
    CaseView view = caseQueryService.getCaseView(id);
    return CaseResponse.from(view);
}
DTO is not just design purity. It is performance control.
10. Allocation and GC Pressure
Every request allocates objects:
- request context,
- parameter values,
- DTOs,
- JSON parser/writer objects,
- collections,
- log strings,
- exceptions,
- optional wrappers,
- stream buffers.
Allocation is not always bad in modern JVMs, but excessive allocation increases GC pressure and tail latency.
Symptoms:
- p99 spikes during GC,
- high allocation rate per request,
- CPU high even when DB is idle,
- memory pressure under burst traffic,
- large temporary byte arrays/strings.
Common causes:
- converting body to String unnecessarily,
- reading entire upload into memory,
- logging full request/response body,
- building large intermediate maps,
- stream().map(...).collect(...) chains over huge result sets without bounds,
- serializing large nested object graphs,
- exception-driven control flow.
11. Streaming vs Buffering
Buffering is simpler but can be expensive.
Examples:
byte[] file = service.loadFile(id);
return Response.ok(file).build();
This loads whole file into memory.
Better for large payload:
@GET
@Path("/{id}/content")
public Response download(@PathParam("id") String id) {
    StreamingOutput stream = output -> {
        documentService.copyContentTo(id, output);
    };
    return Response.ok(stream)
            .type("application/pdf")
            .header("Content-Disposition", "attachment; filename=\"document.pdf\"")
            .build();
}
But streaming has its own constraints:
- error after partial response is hard to report as JSON,
- client speed affects write duration,
- output stream errors must be logged,
- resource cleanup must be reliable,
- timeout behavior must be tested.
Use streaming for large payloads; use small DTOs for normal JSON.
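The memory benefit of streaming comes from copying through a fixed-size buffer instead of materializing the whole payload. The sketch below shows one way a method like `copyContentTo` could do that internally; `BoundedCopy` is a hypothetical helper, and the 8 KB buffer size is an assumption to tune by measurement.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Sketch only: copy a stream using a fixed buffer so heap use stays
// constant regardless of payload size. InputStream.transferTo (Java 9+)
// does the same job; the loop is spelled out to make the bound visible.
final class BoundedCopy {
    private static final int BUFFER_SIZE = 8 * 1024; // assumed size

    private BoundedCopy() {}

    /** Copies all bytes and returns how many were written. */
    static long copy(InputStream in, OutputStream out) {
        byte[] buffer = new byte[BUFFER_SIZE];
        long total = 0;
        try {
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
            }
            out.flush();
        } catch (IOException e) {
            // Wrapped so this sketch runs standalone; inside a
            // StreamingOutput the IOException could simply propagate.
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

Whatever the size of the document, the copy holds only one buffer on the heap at a time, which is the property the `byte[]` version lacks.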
12. Threading Model
Classic Jakarta REST request handling is often thread-per-request from a container-managed pool.
Simplified:
1 request = 1 container request thread until response completes
If resource method blocks on DB or external HTTP call, thread is occupied.
This model is simple and works well with bounded pools, but can suffer when:
- downstream latency increases,
- too many concurrent blocking calls,
- request timeout too high,
- retries multiply load,
- DB pool is small and request threads pile up waiting.
12.1 Thread Pool Exhaustion Scenario
Request threads: 200
DB pool: 30
External API becomes slow: 5 seconds
Incoming traffic: 100 RPS
Requests pile up, occupy threads, wait for DB/external calls, and eventually even cheap endpoints may fail because no request thread is free.
Mitigations:
- strict timeouts,
- bulkheads,
- circuit breakers,
- async/job resource pattern,
- queue limits,
- separate executor for long-running work,
- admission control,
- virtual threads where supported and appropriate.
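Of the mitigations listed, a bulkhead is the simplest to sketch with the JDK alone. The example below caps concurrent calls to one dependency with a `Semaphore` and rejects immediately when the cap is reached, so a slow dependency cannot absorb every request thread. `Bulkhead` is a hypothetical class; production systems often use a resilience library or a container feature instead.

```java
import java.util.Optional;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Sketch only: a fail-fast bulkhead around one downstream dependency.
final class Bulkhead {
    private final Semaphore permits;

    Bulkhead(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    /**
     * Runs the call if a permit is free. Returns Optional.empty() right away
     * when the bulkhead is full (or when the call itself returns null), so
     * the caller can answer with an error such as 503 instead of queueing.
     */
    <T> Optional<T> tryCall(Supplier<T> call) {
        if (!permits.tryAcquire()) {
            return Optional.empty();
        }
        try {
            return Optional.ofNullable(call.get());
        } finally {
            permits.release(); // always return the permit
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

The design choice here is rejection rather than waiting: a request that cannot get a permit fails quickly and cheaply, which keeps request threads free for endpoints that do not depend on the slow system.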
13. Async Jakarta REST Does Not Magically Make Work Faster
Async resource pattern releases request thread while work continues elsewhere.
@GET
@Path("/{id}/expensive")
public void expensive(
        @PathParam("id") String id,
        @Suspended AsyncResponse response) {
    executor.submit(() -> {
        try {
            CaseResponse result = service.compute(id);
            response.resume(result);
        } catch (Exception e) {
            response.resume(e);
        }
    });
}
This can improve request thread utilization, but total system capacity still depends on:
- executor size,
- downstream pool size,
- CPU,
- memory,
- queue length,
- timeout,
- cancellation handling.
If you move blocking work from request thread pool to unbounded executor, you may create a worse failure mode.
Correct principle:
Async is useful when it controls resource ownership and prevents request thread starvation. It is not a substitute for capacity planning.
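As an illustration of "controlling resource ownership", the sketch below builds an executor with a fixed thread count and a bounded queue, and rejects work when both are full. It uses only `java.util.concurrent`; in a Jakarta EE container a managed executor would normally take this role, and the helper names here are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Sketch only: a bounded executor for offloaded blocking work.
final class BoundedExecutors {
    private BoundedExecutors() {}

    /** Fixed number of threads, bounded queue, reject when both are full. */
    static ThreadPoolExecutor bounded(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy());
    }

    /** Returns false instead of throwing, so a resource method can map it to 503. */
    static boolean trySubmit(ThreadPoolExecutor executor, Runnable task) {
        try {
            executor.execute(task);
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }
}
```

With one thread and a queue of one, a third concurrent task is rejected rather than queued without limit. The sizes are placeholders; the point is that the overload behavior is explicit instead of being an ever-growing queue.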
14. CompletionStage Resource Methods
Jakarta REST has supported returning CompletionStage from resource methods since JAX-RS 2.1.
Example:
@GET
@Path("/{id}")
public CompletionStage<CaseResponse> getCase(@PathParam("id") String id) {
    return caseQueryService.getCaseAsync(id)
            .thenApply(CaseResponse::from);
}
This is clean when the underlying work is genuinely async/non-blocking or managed by an appropriate executor.
Bad:
return CompletableFuture.supplyAsync(() -> blockingRepository.find(id));
without a managed bounded executor.
Problems:
- uses common ForkJoinPool by default,
- blocking work can starve unrelated tasks,
- context propagation unclear,
- timeout/cancellation often missing.
Use container-managed executor facilities where possible.
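Outside a container, the same idea can be sketched with the JDK: pass an explicit executor to `supplyAsync` and attach a timeout. `AsyncCalls` is a hypothetical helper, and the executor passed in is assumed to be bounded.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.Executor;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Sketch only: run a blocking call on a caller-supplied executor with a timeout.
final class AsyncCalls {
    private AsyncCalls() {}

    static <T> CompletionStage<T> supplyBounded(
            Supplier<T> blockingCall, Executor executor, long timeoutMs) {
        return CompletableFuture
                // Explicit executor: the common ForkJoinPool is never used.
                .supplyAsync(blockingCall, executor)
                // Java 9+. Completes the stage with TimeoutException; it does
                // NOT interrupt the underlying task, which keeps its thread
                // until the blocking call returns.
                .orTimeout(timeoutMs, TimeUnit.MILLISECONDS);
    }
}
```

Because `orTimeout` only fails the stage, the blocking call itself still needs its own timeout (query timeout, HTTP client timeout); otherwise timed-out work continues to occupy executor threads.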
15. Virtual Threads
Virtual threads can improve scalability for blocking IO-heavy workloads by making blocking cheaper at the thread abstraction level.
But virtual threads are not magic.
They help when:
- code is mostly blocking IO,
- thread-per-request model is simple,
- bottleneck is platform thread scarcity,
- dependencies can support higher concurrency,
- container/runtime supports virtual thread configuration safely.
They do not help when:
- CPU is saturated,
- database pool is bottleneck,
- external API rate limit is bottleneck,
- locks serialize work,
- code blocks while pinned to its carrier thread (for example, inside a synchronized block on JDKs before 24),
- payload serialization dominates CPU,
- you allow unlimited concurrency without backpressure.
15.1 Virtual Thread Invariant
Virtual threads reduce cost of waiting threads; they do not increase capacity of downstream systems.
If DB pool has 30 connections, 10,000 virtual threads waiting for DB do not create 10,000 DB connections. They create 9,970 queued waiters unless you add bulkheads/admission control.
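The invariant can be observed directly. The sketch below (JDK 21 or later) starts many virtual threads against a `Semaphore` that stands in for a connection pool and records the peak number of simultaneous "connections". The pool, the 1 ms simulated query, and the class name are all assumptions for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch only: many virtual threads, one small simulated pool.
final class VirtualThreadPoolLimit {
    private VirtualThreadPoolLimit() {}

    /** Returns the highest number of tasks that held a "connection" at once. */
    static int peakConcurrentConnections(int tasks, int poolSize) {
        Semaphore pool = new Semaphore(poolSize); // stands in for the DB pool
        AtomicInteger inUse = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    pool.acquireUninterruptibly(); // most threads wait here
                    try {
                        peak.accumulateAndGet(inUse.incrementAndGet(), Math::max);
                        Thread.sleep(1); // simulated query
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        inUse.decrementAndGet();
                        pool.release();
                    }
                });
            }
        } // close() waits for all submitted tasks to finish
        return peak.get();
    }
}
```

However many virtual threads are started, the peak never exceeds the pool size: the extra threads are cheap waiters, not extra database capacity.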
15.2 Jakarta EE Context
Jakarta EE 11 moves the platform toward virtual-thread support on JDK 21 and later, and Jakarta Concurrency 3.1 lets managed concurrency resources, such as managed executors and thread factories, request virtual threads. In practice, implementation support and configuration vary by runtime. Treat virtual threads as a runtime feature to verify through load testing, not a theoretical switch.
16. Connection Pools Are Often the Real Bottleneck
REST services usually depend on pools:
- database pool,
- outbound HTTP client pool,
- thread pool,
- executor queue,
- cache connection pool,
- message broker connection/channel pool.
Example bottleneck:
Request threads: 200
DB pool: 20
Endpoint requires DB query taking 100 ms
Theoretical DB-limited throughput ≈ 20 / 0.1 = 200 RPS
If query latency becomes 500 ms:
Throughput ≈ 20 / 0.5 = 40 RPS
Adding more request threads will not fix it. It may make tail latency worse.
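The pool arithmetic above is worth keeping as a small helper when reviewing capacity. This is a sketch with hypothetical names; it assumes each request holds exactly one connection for the whole query time.

```java
// Sketch only: pool-limited throughput, assuming one connection per request
// held for holdMs milliseconds.
final class PoolCapacity {
    private PoolCapacity() {}

    /** Upper bound on requests per second the pool can serve. */
    static double maxThroughput(int poolSize, long holdMs) {
        return poolSize * 1000.0 / holdMs;
    }

    /** Offered load beyond capacity: this much must queue, time out, or be shed each second. */
    static double excessLoad(double offeredRps, int poolSize, long holdMs) {
        return Math.max(0.0, offeredRps - maxThroughput(poolSize, holdMs));
    }
}
```

`maxThroughput(20, 100)` is 200 RPS and `maxThroughput(20, 500)` is 40 RPS, matching the figures above. At the 300 RPS peak used earlier in this part, `excessLoad(300, 20, 500)` is 260 requests per second with nowhere to go, which is why extra request threads only lengthen the queue.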
17. Timeout Budgeting
Each request should have a time budget.
Example:
Total SLO p95: 300 ms
- REST runtime + filters: 10 ms
- DB query: 100 ms
- outbound risk service: 120 ms
- serialization: 20 ms
- buffer: 50 ms
Timeouts should respect budget:
Risk service timeout: 150 ms
DB query timeout: 120 ms
Overall request timeout: 300-400 ms
Bad:
Overall SLO: 300 ms
Outbound HTTP timeout: 30 seconds
That creates thread pile-up and failure amplification.
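One way to keep outbound timeouts inside the budget is to derive each call's timeout from the time the request has left, capped by a per-dependency limit. The sketch below uses the JDK `java.net.http` client (Java 11+); `TimeoutBudget`, the 150 ms cap, and the example URL are assumptions.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

// Sketch only: derive an outbound timeout from the remaining request budget.
final class TimeoutBudget {
    private static final Duration RISK_SERVICE_CAP = Duration.ofMillis(150); // assumed cap

    private TimeoutBudget() {}

    /** The smaller of the remaining budget and the per-call cap. */
    static Duration callTimeout(Duration remainingBudget, Duration perCallCap) {
        if (remainingBudget.isZero() || remainingBudget.isNegative()) {
            throw new IllegalStateException("request budget exhausted");
        }
        return remainingBudget.compareTo(perCallCap) < 0 ? remainingBudget : perCallCap;
    }

    /** Builds a GET request whose timeout can never exceed the remaining budget. */
    static HttpRequest riskRequest(URI uri, Duration remainingBudget) {
        return HttpRequest.newBuilder(uri)
                .timeout(callTimeout(remainingBudget, RISK_SERVICE_CAP))
                .GET()
                .build();
    }
}
```

With 300 ms remaining the risk call gets the 150 ms cap; with only 80 ms remaining it gets 80 ms. When the budget is already spent, failing fast is cheaper than starting a call whose answer would arrive too late.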
18. Retries and Performance Collapse
Retries can multiply load.
If traffic is 100 RPS, every request fails, and each one is retried 3 times:
Effective traffic = 100 × (1 + 3) = 400 attempts/sec
During downstream degradation, retry storm can destroy both caller and callee.
Retry only when:
- operation is idempotent or protected by idempotency key,
- failure is likely transient,
- timeout is short,
- backoff/jitter exists,
- retry budget is bounded,
- circuit breaker prevents storm.
Do not retry large POST mutation blindly.
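Two of the conditions above, bounded retries and backoff with jitter, can be sketched in a few lines. The example uses the "full jitter" variant (a random delay between zero and an exponentially growing ceiling); the class name and parameters are hypothetical, and a resilience library would normally provide this.

```java
import java.util.concurrent.ThreadLocalRandom;

// Sketch only: retry amplification and exponential backoff with full jitter.
final class RetryPolicy {
    private RetryPolicy() {}

    /** Worst-case attempts per second when every request fails and is retried maxRetries times. */
    static int worstCaseAttemptsPerSecond(int requestsPerSecond, int maxRetries) {
        return requestsPerSecond * (1 + maxRetries);
    }

    /**
     * Delay before retry number `attempt` (0-based): a random value in
     * [0, min(capMs, baseMs * 2^attempt)]. Randomizing the whole interval
     * spreads retries out so callers do not retry in lockstep.
     */
    static long backoffMillis(int attempt, long baseMs, long capMs) {
        int shift = Math.min(Math.max(attempt, 0), 20); // keep the shift in a safe range
        long ceiling = Math.min(capMs, baseMs << shift);
        return ThreadLocalRandom.current().nextLong(ceiling + 1);
    }
}
```

`worstCaseAttemptsPerSecond(100, 3)` reproduces the 400 attempts/sec figure. The cap on both the retry count and the delay is what makes the retry budget bounded.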
19. Caching
Caching can improve performance, but only when correctness is preserved.
Options:
- HTTP caching with ETag, Last-Modified, Cache-Control,
- application cache,
- query result cache,
- CDN/proxy cache for public/static resources,
- client-side cache.
For case-management systems, many resources are sensitive and user-specific. Use private/no-store/no-cache carefully.
ETag for read resource:
@GET
@Path("/{id}")
public Response getCase(@PathParam("id") String id, @Context Request request) {
    CaseView view = service.getCaseView(id);
    EntityTag etag = new EntityTag(view.versionHash());
    Response.ResponseBuilder precondition = request.evaluatePreconditions(etag);
    if (precondition != null) {
        return precondition.build();
    }
    return Response.ok(CaseResponse.from(view))
            .tag(etag)
            .build();
}
This can avoid serializing and transferring unchanged representation.
20. Pagination and Response Size
Large responses hurt:
- DB time,
- memory,
- serialization CPU,
- network transfer,
- client rendering,
- p99 latency.
Never ship unbounded collection endpoints.
Bad:
GET /cases
with unlimited results.
Better:
GET /cases?limit=50&cursor=eyJvZmZzZXQiOjEwMDB9
Set:
- default limit,
- max limit,
- stable sort,
- cursor/keyset pagination for large datasets,
- response metadata,
- clear filtering grammar.
Performance begins at contract design.
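The default limit, max limit, and opaque cursor can be enforced in a small helper before any query runs. This is a sketch: `Paging`, the limits 50 and 200, and the cursor payload format are assumptions. The sample cursor in the URL above happens to decode to the JSON text {"offset":1000}; a keyset cursor would carry the last seen sort key instead of an offset.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch only: clamp the page size and treat the cursor as opaque Base64URL text.
final class Paging {
    static final int DEFAULT_LIMIT = 50;  // assumed default
    static final int MAX_LIMIT = 200;     // assumed maximum

    private Paging() {}

    /** Missing or non-positive limits fall back to the default; large ones are capped. */
    static int effectiveLimit(Integer requested) {
        if (requested == null || requested < 1) {
            return DEFAULT_LIMIT;
        }
        return Math.min(requested, MAX_LIMIT);
    }

    static String encodeCursor(String state) {
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(state.getBytes(StandardCharsets.UTF_8));
    }

    /** Throws IllegalArgumentException for input that is not valid Base64URL. */
    static String decodeCursor(String cursor) {
        return new String(Base64.getUrlDecoder().decode(cursor), StandardCharsets.UTF_8);
    }
}
```

A request for `limit=5000` is served with at most 200 items, so response size stays bounded whatever the client asks for. Because clients can tamper with a cursor, its decoded content should be validated (or signed) before it reaches a query.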
21. Logging Cost
Logging can become a hidden bottleneck.
Expensive patterns:
log.info("request body={}", hugeBody);
log.info("response={}", objectMapper.writeValueAsString(response));
Risks:
- CPU cost,
- allocation,
- blocking appender,
- disk pressure,
- sensitive data leakage,
- p99 latency spikes.
Better:
log.info("case request completed caseId={} status={} durationMs={} correlationId={}",
        caseId, status, durationMs, correlationId);
Log structured metadata, not full payload by default.
22. Filters and Interceptors Performance
Filters run for many or all requests. A slow global filter damages every endpoint.
Avoid in global filters:
- blocking DB calls,
- external HTTP calls,
- full body buffering,
- expensive JSON parsing,
- synchronous audit writes,
- high-cardinality metric labels,
- complex authorization if endpoint-specific logic is needed elsewhere.
Good global filters:
- correlation id,
- cheap auth context extraction,
- security headers,
- timing metrics,
- access log metadata,
- request size guard.
If a filter must do expensive work, bind it narrowly with name binding or resource-specific registration.
23. Exception Cost
Exceptions are expensive if used as normal control flow.
Bad:
try {
    UUID id = UUID.fromString(input);
} catch (IllegalArgumentException e) {
    // expected for many invalid requests
}
This is acceptable occasionally, but not as a hot-path parser for high-volume invalid traffic. Where that matters, run a cheap format check first and only then parse.
Also avoid logging full stack traces for expected client errors:
- 400 validation error,
- 404 not found,
- 409 conflict,
- 412 precondition failed.
Stack traces are useful for server bugs, not for every bad user input.
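A cheap format check before parsing might look like the sketch below. `UuidParam` is a hypothetical helper; note that the pattern accepts only the canonical 36-character form, which is stricter than `UUID.fromString`, and whether it is actually cheaper than catching the exception on a given workload should be confirmed by measurement.

```java
import java.util.Optional;
import java.util.UUID;
import java.util.regex.Pattern;

// Sketch only: reject malformed ids without throwing on the hot path.
final class UuidParam {
    // Precompiled once; canonical 8-4-4-4-12 hexadecimal layout only.
    private static final Pattern CANONICAL = Pattern.compile(
            "[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}");

    private UuidParam() {}

    /** Returns the parsed UUID, or empty for null or malformed input. */
    static Optional<UUID> parse(String input) {
        if (input == null || input.length() != 36 || !CANONICAL.matcher(input).matches()) {
            return Optional.empty(); // caller maps this to 400 without a stack trace
        }
        return Optional.of(UUID.fromString(input));
    }
}
```

The resource method can then turn an empty result into a 400 response directly, with no exception created and nothing to log at error level.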
24. CPU-Bound vs IO-Bound Endpoints
Performance strategy depends on workload.
24.1 IO-Bound Endpoint
Example:
GET /cases/{id}
- DB query 80 ms
- JSON serialization 5 ms
Focus:
- DB query/index,
- pool sizing,
- timeout,
- caching,
- concurrency control.
Virtual threads may help if request threads are bottleneck, but DB pool remains limit.
24.2 CPU-Bound Endpoint
Example:
POST /documents/{id}/analysis
- CPU classification 800 ms
- no external IO
Focus:
- algorithm optimization,
- separate worker pool,
- async job pattern,
- limit concurrency,
- avoid blocking request thread,
- possibly offload to specialized service.
Virtual threads do not create more CPU.
24.3 Serialization-Bound Endpoint
Example:
GET /cases/{id}/audit-events?limit=5000
- DB 50 ms
- JSON serialization 900 ms
Focus:
- reduce response size,
- pagination,
- streaming JSON if appropriate,
- simpler DTO,
- faster JSON provider/config,
- compression trade-off.
25. Benchmarking REST Endpoints
Benchmarking must resemble production.
Include:
- realistic payload size,
- realistic auth headers/session,
- realistic database volume,
- realistic downstream latency,
- keep-alive behavior,
- warmup period,
- ramp-up period,
- fixed test duration,
- error-rate check,
- p95/p99 latency,
- server resource metrics.
Tools can include wrk, k6, Gatling, JMeter, Vegeta, or custom harness. The tool matters less than test design.
25.1 Bad Benchmark
Single endpoint
Single user
No auth
Tiny payload
In-memory fake DB
10 second test
Only average latency reported
This result is not production evidence.
25.2 Better Benchmark
30 minute test
5 minute warmup
real DB dataset
mixed endpoint workload
realistic payload distribution
p50/p95/p99 reported
server CPU/memory/GC/thread/pool metrics captured
failure rate included
26. Load Test Workload Model
Example workload:
| Endpoint | Weight |
|---|---|
| GET /cases/{id} | 50% |
| GET /cases?status=... | 20% |
| POST /cases/{id}/notes | 10% |
| POST /cases/{id}/transitions | 5% |
| GET /cases/{id}/audit-events | 10% |
| GET /cases/{id}/events (SSE) | 5% connection mix |
Mixed workload catches resource interactions that single-endpoint tests miss.
For SSE, model active connections separately from request/response RPS.
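In a custom harness, a weighted mix like the one above can be driven by a simple cumulative-weight picker. The sketch below covers only the request/response endpoints (weights summing to 95), leaving SSE connections to be modeled separately as suggested; `WorkloadMix` is a hypothetical class, and tools such as k6 or Gatling have their own ways to express this.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: pick an endpoint according to its weight in the workload model.
final class WorkloadMix {
    private final Map<String, Integer> weights = new LinkedHashMap<>();
    private int total;

    WorkloadMix add(String endpoint, int weight) {
        if (weight <= 0 || weights.containsKey(endpoint)) {
            throw new IllegalArgumentException("weight must be positive and endpoint unique");
        }
        weights.put(endpoint, weight);
        total += weight;
        return this;
    }

    int totalWeight() {
        return total;
    }

    /** Maps a roll in [0, totalWeight) to an endpoint; pass random.nextInt(totalWeight()). */
    String pick(int roll) {
        if (roll < 0 || roll >= total) {
            throw new IllegalArgumentException("roll out of range");
        }
        int cumulative = 0;
        for (Map.Entry<String, Integer> entry : weights.entrySet()) {
            cumulative += entry.getValue();
            if (roll < cumulative) {
                return entry.getKey();
            }
        }
        throw new IllegalStateException("unreachable: roll is below total weight");
    }

    /** The request/response part of the example workload table. */
    static WorkloadMix requestResponseMix() {
        return new WorkloadMix()
                .add("GET /cases/{id}", 50)
                .add("GET /cases?status=...", 20)
                .add("POST /cases/{id}/notes", 10)
                .add("POST /cases/{id}/transitions", 5)
                .add("GET /cases/{id}/audit-events", 10);
    }
}
```

Taking the roll as a parameter keeps the picker deterministic and easy to test; the load generator supplies the randomness.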
27. Profiling
Use profilers when measurement shows CPU or allocation bottleneck.
Look for:
- JSON serialization hotspots,
- DTO mapping overhead,
- regex validation cost,
- logging formatting,
- lock contention,
- excessive allocation,
- date/time formatter creation,
- reflection/config introspection repeated per request.
Do not guess based on code aesthetics. Measure.
28. Warmup and Cold Start
JVM and Jakarta REST runtimes may have warmup cost:
- class loading,
- annotation scanning,
- provider discovery,
- JIT compilation,
- JSON mapper initialization,
- database pool initialization,
- connection TLS warmup,
- cache warmup.
Benchmark and readiness probes should account for this.
Production implication:
- readiness should not turn green before critical providers/pools are ready,
- first user request should not pay all initialization cost,
- rolling deploy should avoid sending full traffic to cold instance immediately.
29. Native Image Considerations
Some Jakarta REST implementations support native-image-oriented deployments through frameworks/runtimes. Native images can improve startup and memory footprint, but may affect:
- reflection configuration,
- dynamic provider discovery,
- JSON serialization behavior,
- resource scanning,
- runtime proxies,
- monitoring/profiling assumptions,
- peak throughput after warmup compared with JVM JIT.
Do not assume native image is always faster. It optimizes some dimensions, especially startup and footprint, but workload-specific testing remains required.
30. Performance-Aware API Design
Design decisions that improve performance:
- Bounded collection endpoints.
- Explicit field selection only if governance exists.
- Separate heavy subresources.
- Cursor/keyset pagination for large sets.
- Conditional GET with ETag where safe.
- Async job resource for long-running commands.
- Streaming for large downloads.
- Small DTOs for hot-path endpoints.
- Avoid dynamic Map<String, Object> for stable contracts.
- Avoid embedding audit trails into primary resource by default.
API contract is your first performance control surface.
31. Performance Failure Patterns
31.1 Retry Storm
Downstream slows. Caller retries. Traffic multiplies. Everything collapses.
Fix:
- bounded retries,
- jitter,
- circuit breaker,
- timeout budget,
- idempotency key,
- load shedding.
31.2 Pool Starvation
Threads wait for DB pool. Request queue grows. Latency explodes.
Fix:
- right-size pool,
- limit concurrency,
- optimize queries,
- add timeout,
- separate pools if needed.
31.3 Large Response Explosion
One endpoint returns huge nested graph. Serialization dominates.
Fix:
- pagination,
- subresources,
- projection DTO,
- response size limit.
31.4 Slow Client Write
Client receives slowly. Server keeps response resource occupied.
Fix:
- streaming timeout,
- write timeout if runtime supports,
- bounded queues,
- CDN/object storage for large file delivery.
31.5 Global Filter Bottleneck
Every request performs expensive work in filter.
Fix:
- move logic to endpoint-specific layer,
- cache safe metadata,
- name-bind filter,
- remove body buffering.
32. Production Metrics Checklist
Collect at least:
HTTP Metrics
- request count by route/method/status,
- latency by route/method,
- response size,
- request size,
- error rate,
- active requests.
JVM Metrics
- CPU,
- heap usage,
- allocation rate,
- GC pause,
- thread count,
- blocked/waiting threads.
Pool Metrics
- DB active/idle/pending,
- HTTP client pool active/pending,
- executor queue depth,
- circuit breaker state,
- retry count,
- timeout count.
Domain Metrics
- case transition rate,
- validation failure rate,
- conflict/precondition failure rate,
- long-running job queue depth,
- SSE active connections.
Avoid high-cardinality labels such as raw user id, case id, or request id in metrics.
33. Tuning Order
Use this order:
- Define SLO and workload.
- Measure current behavior.
- Identify bottleneck.
- Fix contract/design issue first.
- Fix query/downstream issue.
- Fix serialization/payload issue.
- Fix pool/thread/timeout issue.
- Tune runtime/JVM only after application bottleneck is understood.
- Validate with load test.
- Add regression guard.
Do not start by tweaking JVM flags.
34. Example Performance Review
Endpoint:
GET /cases/{caseId}/timeline
Observed:
p50: 120 ms
p95: 1.8 s
p99: 6.0 s
response p99 size: 9 MB
DB queries/request: 301
Likely issues:
- unbounded timeline,
- N+1 query,
- huge JSON serialization,
- no pagination,
- client probably does not need all data.
Fix plan:
- Add limit and cursor.
- Replace entity graph serialization with projection query.
- Add separate detail endpoint for individual timeline items.
- Add ETag for stable pages if safe.
- Add load test for 50, 100, 500 item pages.
This is better than increasing heap or request threads.
35. Case-Management Performance Blueprint
For regulated case-management APIs:
| API type | Performance design |
|---|---|
| Case summary | Projection DTO, small payload, cache/ETag if allowed |
| Case search | Indexed filters, pagination, no arbitrary unbounded query |
| Evidence download | Streaming/object storage, authorization before stream |
| Audit trail | Append-only query, cursor pagination, no embedded full audit in case response |
| State transition | Small command DTO, idempotency/precondition, async for long work |
| Notification stream | SSE as hint, canonical state through GET |
| Reporting | Async export job, not synchronous huge REST response |
This aligns performance with correctness and defensibility.
36. Checklist
Before optimizing:
- Is there a clear SLO?
- Are p95/p99 measured?
- Is workload realistic?
- Is payload size measured?
- Are DB queries counted?
- Are downstream calls timed?
- Are pool metrics visible?
- Is allocation/GC measured?
- Are retries bounded?
- Are timeouts aligned with budget?
- Are collection endpoints bounded?
- Are large downloads streamed?
- Are global filters cheap?
- Is serialization cost known?
- Are slow clients considered?
- Is load test result repeatable?
37. Practice Tasks
- Add timing metrics to all Jakarta REST resources using a response filter.
- Measure p95/p99 for GET /cases/{id} with realistic payloads.
- Create a deliberately unbounded collection endpoint, load test it, then fix with pagination.
- Compare returning entity graph vs projection DTO.
- Implement ETag on a read endpoint and measure unchanged response behavior.
- Simulate downstream slowness and observe request thread/pool saturation.
- Add timeout and circuit breaker, then retest.
- Test virtual-thread-enabled executor/runtime if available and compare under IO-bound load.
- Profile serialization-heavy endpoint.
- Add SSE connections to load test and observe active connection/resource behavior.
38. Key Takeaways
- Jakarta REST performance is mostly about pipeline cost, not annotation syntax.
- Measure before optimizing; p95/p99 matter more than average.
- Serialization, payload size, DB access, downstream calls, and pool saturation dominate many REST workloads.
- Async and virtual threads can improve resource utilization, but they do not remove CPU, DB, or external system limits.
- Contract design is performance design: bounded collections, projection DTOs, subresources, ETags, and async job resources matter.
- Global filters/interceptors must stay cheap.
- Load tests must reflect realistic traffic, payload, auth, dependencies, and long-lived streams.
References
- Jakarta RESTful Web Services 4.0 Specification: https://jakarta.ee/specifications/restful-ws/4.0/jakarta-restful-ws-spec-4.0
- Jakarta EE Platform 11 Specification: https://jakarta.ee/specifications/platform/11/
- Jakarta EE 11 Release: https://jakarta.ee/release/11/
- Jakarta Concurrency 3.1: https://jakarta.ee/specifications/concurrency/3.1/
- Jakarta REST Client API package: https://jakarta.ee/specifications/restful-ws/4.0/apidocs/jakarta.ws.rs/jakarta/ws/rs/client/package-summary