Compression, Payload Size, and Wire Efficiency

Learn Java Microservices Communication - Part 017

Production-grade guide to compression, payload size, representation design, and wire efficiency for Java microservice HTTP communication.

[2026-07-05] 15 min read, 2811 words


Part 017 — Compression, Payload Size, and Wire Efficiency

Payload efficiency is not a cosmetic performance topic.

In microservices, payload size affects:

latency;
CPU;
memory allocation;
garbage collection;
network saturation;
proxy buffering;
timeout probability;
retry amplification;
tail latency;
observability cost;
incident blast radius.

A badly shaped payload can turn a healthy dependency into a slow dependency.

A slow dependency then triggers timeouts.

Timeouts trigger retries.

Retries multiply traffic.

Multiplied traffic saturates connection pools, worker threads, broker consumers, or gateway queues.

So payload efficiency is not about saving a few kilobytes.

It is about controlling the communication envelope.

The core rule:

Optimize representation before compression.
Compress only when the saved network cost is greater than the added CPU, latency, memory, and operational complexity.

This part teaches how to reason about that rule in production Java microservices.

1. The Wire Efficiency Mental Model

Every HTTP call pays several costs.

The payload is not only the bytes on the wire.

A production request has at least five size-related dimensions:

Dimension	Meaning	Failure mode
Logical size	Number of domain items/fields	Over-fetching, unstable API contract
Serialized size	JSON/Protobuf/XML/Avro bytes	Bandwidth waste, parse latency
Compressed size	Bytes after content coding	CPU/latency trade-off
In-memory size	Object graph after parsing	GC pressure, heap spikes
Observed size	Logs/traces/metrics cost	Expensive telemetry, cardinality explosion

A common mistake is measuring only serialized size.

For Java services, the in-memory size can be much larger than the wire size because parsed JSON becomes object graphs, strings, collections, buffers, and intermediate parser structures.

That means compression may reduce network cost while leaving the most expensive part untouched: parsing and allocation.

2. The Three Optimization Layers

Payload efficiency has three layers.

Layer 1: Do not send unnecessary data.
Layer 2: Represent necessary data efficiently.
Layer 3: Compress when it actually wins.

The order matters.

If a service returns 200 fields when the caller needs 8, compression only hides the smell.

Better:

Bad: large generic response + gzip
Good: intentionally shaped response + maybe gzip

Production teams often jump to layer 3 because it is easy to toggle.

Top-tier engineers start with layer 1.

3. Payload Size Is a Contract Property

Payload size should not be treated as an incidental runtime detail.

It is part of the communication contract.

A caller depends on the callee not only for schema shape but also for an approximate operational envelope.

For example:

GET /accounts/{accountId}/case-summary

The caller should be able to assume:

Typical response: < 30 KB
P95 response: < 80 KB
Hard limit: 256 KB
No unbounded arrays
No embedded audit history
No base64 documents
No full child entity expansion by default

Without such assumptions, timeout and capacity planning become guesswork.

A schema can be compatible while the payload becomes operationally incompatible.

Example:

{
  "caseId": "CASE-123",
  "status": "UNDER_REVIEW",
  "subjects": [
    {
      "subjectId": "SUB-1",
      "name": "...",
      "addresses": [
        {
          "line1": "...",
          "country": "ID"
        }
      ]
    }
  ]
}

Adding a new field may be schema-compatible.

But if the new field is:

"fullAuditTimeline": [ ... thousands of events ... ]

then it is not operationally harmless.

Compatibility has two forms:

Compatibility type	Question
Schema compatibility	Can the client parse it?
Operational compatibility	Can the client survive it under production load?

Both matter.

4. Why Payload Size Impacts Reliability

Payload size affects reliability through time and resource consumption.

This is why payload efficiency belongs in a communication series, not only in a performance tuning series.

A request that returns a huge payload can consume:

server CPU to serialize;
server memory to build response;
network bandwidth to transmit;
proxy memory to buffer;
client CPU to parse;
client heap to materialize;
tracing/logging budget to observe;
retry budget if it times out.

The bigger the payload, the more expensive every failed attempt becomes.

5. First Principle: Avoid Over-Fetching

The cheapest byte is the byte you do not send.

Over-fetching usually comes from API design that exposes domain aggregates instead of use-case-specific representations.

Bad internal API:

GET /cases/{caseId}

Returns:

case core
subjects
addresses
documents
attachments
audit timeline
notes
tasks
permissions
risk indicators
workflow history

The caller only needs:

caseId
status
assignee
nextActionDueAt
riskLevel

Better API:

GET /cases/{caseId}/worklist-card

or:

GET /cases/{caseId}?view=worklist-card

The point is not whether you prefer sub-resource or view parameter.

The point is that payload shape must match caller intent.

Design rule

Do not expose one giant representation and expect every caller to ignore unused fields.

Unused fields are still paid for by the server, network, proxy, client, and runtime.

6. Projection APIs

A projection API returns a representation designed for a specific read use case.

Example:

GET /enforcement-cases/{caseId}/summary
GET /enforcement-cases/{caseId}/timeline
GET /enforcement-cases/{caseId}/decision-context
GET /enforcement-cases/{caseId}/worklist-card

This is usually better than:

GET /enforcement-cases/{caseId}?include=a,b,c,d,e,f,g,h

Why?

Because named projections are stable and observable.

Approach	Pros	Risks
Named projection	Clear contract, stable metrics, easier caching	More endpoints
Arbitrary include fields	Flexible	Hard to test, hard to cache, combinatorial behavior
GraphQL-like selection	Very flexible	Requires strong governance, cost control, query limits

For internal service-to-service APIs, named projections often give the best operational control.

Example:

GET /cases/CASE-123/worklist-card
Accept: application/json

Response:

{
  "caseId": "CASE-123",
  "status": "WAITING_FOR_REVIEW",
  "assignee": "investigator-41",
  "riskLevel": "HIGH",
  "nextActionDueAt": "2026-07-05T10:00:00Z"
}

This is small, stable, and purpose-built.

7. Expansion Must Be Explicit

Implicit expansion creates hidden payload growth.

Bad:

GET /customers/C-123

Sometimes returns only customer core.

Sometimes also returns accounts, documents, historical addresses, regulatory flags, and contact attempts depending on server-side configuration.

Better:

GET /customers/C-123
GET /customers/C-123?expand=accounts
GET /customers/C-123?expand=accounts,regulatory-flags

But expansion must be constrained.

A production expansion policy should define:

Allowed expansions
Maximum nesting depth
Maximum expanded collection size
Timeout behavior
Partial result behavior
Cacheability
Authorization impact
Telemetry dimensions

Never allow unbounded expansion in internal service calls.

Bad:

GET /cases/CASE-123?expand=*

That API looks convenient until one caller accidentally turns an OLTP request into a graph export.
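The expansion policy above can be sketched as a small allow-list parser. This is a minimal illustration, not a prescribed implementation: the class name `ExpansionPolicy`, the allowed names (`contacts` in particular), and the limit of two expansions per request are assumptions made for the example.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical allow-list parser for an "expand" query parameter. */
final class ExpansionPolicy {
    // Assumed policy values, chosen only for illustration.
    private static final Set<String> ALLOWED =
            Set.of("accounts", "regulatory-flags", "contacts");
    private static final int MAX_EXPANSIONS = 2;

    private ExpansionPolicy() {
    }

    /** Returns the validated expansions, or throws if the request is outside the policy. */
    static Set<String> parse(String expandParam) {
        Set<String> result = new LinkedHashSet<>();
        if (expandParam == null || expandParam.isBlank()) {
            return result; // no expansion requested: core representation only
        }
        for (String raw : expandParam.split(",")) {
            String name = raw.trim();
            if (name.isEmpty()) {
                continue; // tolerate stray commas
            }
            if (!ALLOWED.contains(name)) {
                // Rejects wildcards such as "*" and any unknown expansion name.
                throw new IllegalArgumentException("unsupported expansion: " + name);
            }
            result.add(name);
        }
        if (result.size() > MAX_EXPANSIONS) {
            throw new IllegalArgumentException("too many expansions: " + result.size());
        }
        return result;
    }
}
```

With this shape, `expand=accounts,regulatory-flags` is accepted, while `expand=*` fails fast at the edge of the service instead of turning into a graph export. Nesting depth and expanded collection size would need their own checks.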

8. Pagination Is a Payload Control Mechanism

Pagination is not only UI convenience.

It is a communication control mechanism.

Any endpoint returning a list must answer:

What is the default limit?
What is the maximum limit?
Is ordering stable?
What cursor represents the next page?
What happens when data changes during pagination?
Can a page become too large because each item is huge?

Bad:

GET /case-events?caseId=CASE-123

Returns all events.

Better:

GET /case-events?caseId=CASE-123&limit=100&cursor=eyJvZmZzZXQiOjEwMH0

Response:

{
  "items": [
    {
      "eventId": "EVT-1",
      "occurredAt": "2026-07-05T08:15:30Z",
      "type": "CASE_ASSIGNED"
    }
  ],
  "nextCursor": "eyJvZmZzZXQiOjIwMH0",
  "hasMore": true
}

Limit count alone is not enough.

You also need maximum serialized response size.

Max items: 100
Max response body: 512 KB

Why?

Because 100 tiny records and 100 giant records are not equivalent.
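One way to enforce both limits is to build each page against an item budget and a byte budget at the same time. The sketch below assumes items are already serialized to JSON strings; `BoundedPage` and its `build` method are hypothetical names, and the byte count ignores the surrounding envelope (brackets, commas, cursor fields), so it is an approximation.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical page builder that stops at whichever limit is hit first. */
record BoundedPage(List<String> items, boolean hasMore) {

    static BoundedPage build(List<String> serializedItems, int maxItems, int maxBytes) {
        List<String> page = new ArrayList<>();
        long usedBytes = 0;
        for (String item : serializedItems) {
            int itemBytes = item.getBytes(StandardCharsets.UTF_8).length;
            boolean overCount = page.size() >= maxItems;
            // Always admit the first item so pagination keeps making progress;
            // a separate per-item size limit should catch single giant records.
            boolean overBytes = !page.isEmpty() && usedBytes + itemBytes > maxBytes;
            if (overCount || overBytes) {
                return new BoundedPage(page, true);
            }
            page.add(item);
            usedBytes += itemBytes;
        }
        return new BoundedPage(page, false);
    }
}
```

The caller would derive `nextCursor` from the last item actually included, not from the requested limit, because a page cut short by the byte budget contains fewer items than `limit`.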

9. Beware of Base64 Payloads

Base64 increases payload size by roughly one third before compression characteristics are considered.

It also encourages embedding binary data into JSON APIs.

Bad:

{
  "documentId": "DOC-123",
  "filename": "evidence.pdf",
  "contentBase64": "JVBERi0xLjQKJ..."
}

Better:

{
  "documentId": "DOC-123",
  "filename": "evidence.pdf",
  "downloadUrl": "/documents/DOC-123/content",
  "contentSha256": "...",
  "sizeBytes": 849231
}

Then stream binary content separately:

GET /documents/DOC-123/content
Accept: application/pdf

Do not mix metadata and large binary content by default.

Separate them because they have different:

caching behavior;
authorization checks;
timeout budgets;
streaming needs;
audit requirements;
retry semantics;
logging restrictions.
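To make the one-third overhead from the start of this section concrete, the following sketch compares the padded Base64 length formula with what `java.util.Base64` actually produces. The class name `Base64Overhead` is hypothetical; the document size is the `sizeBytes` value from the metadata example above.

```java
import java.util.Base64;

final class Base64Overhead {
    private Base64Overhead() {
    }

    /** Padded Base64 emits 4 output characters for every 3 input bytes, rounded up. */
    static int encodedLength(int rawBytes) {
        return 4 * ((rawBytes + 2) / 3);
    }

    public static void main(String[] args) {
        byte[] raw = new byte[849_231]; // same size as the example document
        int actual = Base64.getEncoder().encode(raw).length;
        System.out.println("raw=" + raw.length
                + " encoded=" + actual
                + " formula=" + encodedLength(raw.length));
    }
}
```

For the 849,231-byte document, the encoded form is 1,132,308 bytes, so roughly 283 KB of extra payload is added before JSON string quoting, parsing, and heap materialization are counted.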

10. JSON Payload Efficiency

JSON is human-readable and widely supported.

It is also verbose.

That does not automatically make it wrong.

For many internal HTTP APIs, JSON is a good default because it is debuggable, language-neutral, and compatible with OpenAPI tooling.

But JSON must be shaped carefully.

10.1 Avoid repeated static metadata

Bad:

{
  "items": [
    {
      "caseType": "ENFORCEMENT_CASE",
      "jurisdiction": "ID",
      "schemaVersion": "2026-07-01",
      "caseId": "CASE-1"
    },
    {
      "caseType": "ENFORCEMENT_CASE",
      "jurisdiction": "ID",
      "schemaVersion": "2026-07-01",
      "caseId": "CASE-2"
    }
  ]
}

Better:

{
  "caseType": "ENFORCEMENT_CASE",
  "jurisdiction": "ID",
  "schemaVersion": "2026-07-01",
  "items": [
    {
      "caseId": "CASE-1"
    },
    {
      "caseId": "CASE-2"
    }
  ]
}

10.2 Do not use deeply nested objects unless the relationship matters

Deep nesting increases parser work and often reflects unclear ownership.

Bad:

{
  "case": {
    "workflow": {
      "stage": {
        "assignment": {
          "user": {
            "id": "u-1"
          }
        }
      }
    }
  }
}

Better for a read model:

{
  "caseId": "CASE-123",
  "workflowStage": "LEGAL_REVIEW",
  "assigneeUserId": "u-1"
}

Flattening is not always correct for domain persistence.

But communication payloads are not persistence models.

A read response is allowed to be a projection.

10.3 Prefer stable identifiers over embedded entities

Bad:

{
  "caseId": "CASE-123",
  "assignee": {
    "userId": "u-1",
    "name": "Ayu",
    "email": "ayu@example.internal",
    "department": {
      "id": "dept-9",
      "name": "Investigation"
    }
  }
}

If the caller only needs the assignee identity:

{
  "caseId": "CASE-123",
  "assigneeUserId": "u-1"
}

If the caller needs display data:

{
  "caseId": "CASE-123",
  "assignee": {
    "userId": "u-1",
    "displayName": "Ayu"
  }
}

Be intentional.

11. Nulls, Empty Values, and Field Presence

Payload efficiency is not only about removing bytes.

It is also about preserving meaning.

Consider:

{
  "reviewedAt": null
}

This might mean:

Known to be not reviewed.

Or:

Server does not know.

Or:

Field is unavailable to caller.

Or:

Field omitted due to projection.

Do not remove nulls blindly if presence carries semantics.

For production APIs, define field presence rules.

Example:

Omitted field: not part of this projection.
null field: part of projection, but value is explicitly absent.
empty array: part of projection, known to have no elements.

This prevents clients from over-interpreting missing data.

Compression can make null-heavy payloads smaller, but it cannot fix ambiguous semantics.
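On the client side, the three presence rules can be kept distinct by checking key presence before checking the value. The sketch below assumes the response has already been parsed into a `Map` (as a generic JSON-to-map binding would produce); the enum `FieldPresence` is a hypothetical name.

```java
import java.util.Map;

/** Hypothetical three-state view of a field in a parsed JSON object. */
enum FieldPresence {
    OMITTED,  // key absent: not part of this projection
    NULL,     // key present with null: explicitly absent value
    PRESENT;  // key present with a value (an empty array counts as a value)

    static FieldPresence of(Map<String, ?> parsedObject, String field) {
        if (!parsedObject.containsKey(field)) {
            return OMITTED;
        }
        // Map.get alone cannot tell "missing" from "null", hence the containsKey check.
        return parsedObject.get(field) == null ? NULL : PRESENT;
    }
}
```

The design point is that `map.get("reviewedAt") == null` collapses two different contract states into one; a client that needs the distinction has to ask for it explicitly.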

12. Compression Basics in HTTP

HTTP compression is usually represented through content coding.

The client advertises what it accepts:

Accept-Encoding: gzip, br, zstd

The server replies with what it applied:

Content-Encoding: gzip

The representation is still the same media type:

Content-Type: application/json
Content-Encoding: gzip

Meaning:

The payload is JSON after gzip decoding.

Do not confuse:

Header	Meaning
`Content-Type`	What the representation means after decoding
`Content-Encoding`	How the representation is encoded/compressed
`Transfer-Encoding`	Message transfer framing, mainly HTTP/1.1 concern
`Accept-Encoding`	What response content codings the client can accept

For microservices, Content-Encoding is the main concept.

13. Common Content Codings

Common HTTP content codings include:

Coding	Typical use	Notes
`gzip`	Default interoperable compression	Widely supported, safe default
`br`	Excellent compression ratio, common on web	Often stronger for static/text content
`zstd`	Strong ratio and speed trade-off	Increasingly relevant; check platform support
`deflate`	Legacy	Avoid unless required
`identity`	No compression	Always conceptually available unless refused

For internal Java microservices, gzip is often the baseline because it is easy to operate across proxies, clients, and frameworks.

But the best algorithm is not universal.

You must test against your actual payloads.

Example result patterns:

Small JSON < 1 KB: compression may lose.
Medium repetitive JSON: gzip often wins.
Large text-heavy JSON: br/zstd may win more.
Already compressed binary: compression wastes CPU.
Random/encrypted data: compression does not help.
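These patterns are easy to check locally with the JDK's built-in gzip support. The sketch below is a rough probe, not a benchmark: `GzipProbe`, the synthetic payloads, and the fixed random seed are assumptions for illustration, and real measurements should use captured production payloads.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Random;
import java.util.zip.GZIPOutputStream;

final class GzipProbe {
    private GzipProbe() {
    }

    /** Returns the gzip-compressed size of the input in bytes. */
    static int gzipSize(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.size(); // read after close so the gzip trailer is included
    }

    /** Synthetic list response with heavily repeated keys and values. */
    static byte[] repetitiveJson(int items) {
        StringBuilder sb = new StringBuilder("{\"items\":[");
        for (int i = 0; i < items; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"caseId\":\"CASE-").append(i)
              .append("\",\"status\":\"UNDER_REVIEW\"}");
        }
        return sb.append("]}").toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Stand-in for encrypted or already compressed content. */
    static byte[] randomBytes(int length) {
        byte[] bytes = new byte[length];
        new Random(42).nextBytes(bytes);
        return bytes;
    }

    public static void main(String[] args) {
        byte[] tiny = "{\"id\":1}".getBytes(StandardCharsets.UTF_8);
        byte[] repetitive = repetitiveJson(1000);
        byte[] random = randomBytes(4096);
        System.out.printf("tiny:       %d -> %d bytes%n", tiny.length, gzipSize(tiny));
        System.out.printf("repetitive: %d -> %d bytes%n", repetitive.length, gzipSize(repetitive));
        System.out.printf("random:     %d -> %d bytes%n", random.length, gzipSize(random));
    }
}
```

The tiny payload grows because gzip adds a fixed header and trailer, the repetitive list shrinks substantially, and the random bytes do not shrink in any useful way.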

14. Compression Is Not Free

Compression costs CPU and memory.

Decompression also costs CPU and memory.

In Java, it can also add allocation pressure through buffers and byte arrays.

Compression should be treated as a policy decision, not a universal toggle.

Ask:

What payload sizes are common?
What is the P95/P99 body size?
Is network or CPU the bottleneck?
Does the gateway decompress/recompress?
Are we compressing already compressed content?
Does compression delay first byte?
Does the client stream or buffer compressed responses?
Can decompressed size exceed limits?

If you cannot answer these, compression may hide rather than solve the problem.

15. Compression Thresholds

Most production services should not compress every response.

Compressing tiny payloads can increase latency and CPU without meaningful network savings.

A practical policy:

Do not compress below a minimum response size.
Compress only selected content types.
Never compress already compressed binary.
Measure before and after.

Example policy:

compression:
  enabled: true
  min-response-size: 2048
  mime-types:
    - application/json
    - application/problem+json
    - text/plain

The threshold is workload-specific.

A 2 KB threshold is a reasonable starting point, not a law.

For high-throughput internal APIs, tune with actual measurements.
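The example policy can be read as a single decision function. The following sketch mirrors the 2048-byte threshold and the three media types from the policy above; `CompressionPolicy` and its methods are hypothetical, and the `Accept-Encoding` parsing is deliberately simplified (it ignores the `*` wildcard and treats a malformed `q` value as a refusal).

```java
import java.util.Locale;
import java.util.Set;

final class CompressionPolicy {
    private static final long MIN_RESPONSE_BYTES = 2048;
    private static final Set<String> COMPRESSIBLE_TYPES =
            Set.of("application/json", "application/problem+json", "text/plain");

    private CompressionPolicy() {
    }

    static boolean shouldGzip(String contentType, long bodyBytes, String acceptEncoding) {
        if (contentType == null || acceptEncoding == null) {
            return false;
        }
        if (bodyBytes < MIN_RESPONSE_BYTES) {
            return false; // too small: header overhead and CPU outweigh the savings
        }
        // Strip parameters such as "; charset=utf-8" before matching.
        String mediaType = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return COMPRESSIBLE_TYPES.contains(mediaType) && acceptsGzip(acceptEncoding);
    }

    /** Simplified check: true if gzip is listed and not refused with q=0. */
    static boolean acceptsGzip(String acceptEncoding) {
        for (String part : acceptEncoding.split(",")) {
            String[] tokens = part.trim().split(";");
            if (!tokens[0].trim().equalsIgnoreCase("gzip")) {
                continue;
            }
            for (int i = 1; i < tokens.length; i++) {
                String param = tokens[i].trim().toLowerCase(Locale.ROOT);
                if (param.startsWith("q=")) {
                    try {
                        return Double.parseDouble(param.substring(2).trim()) > 0;
                    } catch (NumberFormatException e) {
                        return false;
                    }
                }
            }
            return true;
        }
        return false;
    }
}
```

In a Spring Boot service the container applies this kind of rule for you; writing it out is mainly useful for reviewing what the configured policy actually means and for testing edge cases such as `gzip;q=0`.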

16. Compression and Tail Latency

Average latency can improve while tail latency gets worse.

Why?

Compression adds CPU work.

Under normal load, CPU is available and network savings dominate.

Under high load, CPU becomes scarce and compression work competes with request handling.

That can increase P99 latency.

Normal load:
  network saved > CPU cost
  latency improves

High load:
  CPU cost > network saved
  latency worsens

This is why compression testing must include:

normal load;
peak load;
CPU pressure;
GC pressure;
large payload routes;
mixed small and large payloads;
retries under failure.

Compression that passes a happy-path benchmark may fail during an incident.

17. Compression and Retries

Compression changes retry economics.

If a request times out after the server spent CPU compressing a large response, a retry may force the server to compress again.

This can create CPU amplification.

Mitigations:

Use realistic timeout budgets.
Avoid expensive large response endpoints in synchronous paths.
Cache stable large representations when appropriate.
Use async export for very large results.
Use pagination or streaming.
Use idempotency and request deduplication for commands.
Expose size metrics.

For large generated responses, caching compressed variants can help, but only if authorization, personalization, and freshness rules are correct.

18. Do Not Compress Secrets with Attacker-Controlled Input

Compression can create security risks when secrets and attacker-controlled input are compressed together and response size is observable.

This class of issue is known from attacks against compressed encrypted web traffic, such as CRIME and BREACH.

For internal service-to-service communication, the risk profile is different from browser traffic, but the principle still matters.

Avoid compressing payloads that combine:

secret data + attacker-controlled reflected data + observable compressed length

Practical internal rule:

Do not put credentials, tokens, session secrets, or highly sensitive values into compressible reflected responses.

Also avoid logging compressed or decompressed sensitive payloads.

Compression is not encryption.

TLS protects transport confidentiality.

Application logs, traces, metrics, and dumps can still leak data.

19. Decompression Bombs and Response Limits

Compressed payloads can expand to much larger decompressed payloads.

This matters for clients and gateways.

A client that enforces only compressed size may be vulnerable to memory pressure after decompression.

Production clients should enforce:

maximum compressed size
maximum decompressed size
maximum parse depth
maximum array length
maximum object/string length if parser supports it
request timeout
read timeout

For service-to-service clients, response size limits are part of dependency safety.

Example policy:

GET /risk-score/{subjectId}
Max compressed response: 64 KB
Max decompressed response: 256 KB
Max timeout: 300 ms

If a dependency returns more than that, the caller should fail predictably.

Unbounded reads are a production smell.
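A decompressed-size limit can be enforced while reading, so the client never materializes more than the budget. The sketch below uses the JDK's `GZIPInputStream`; `BoundedGzipReader` is a hypothetical helper, and the `gzip` method exists only to produce demo input. The compressed-size limit and timeouts would be enforced separately by the HTTP client.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class BoundedGzipReader {
    private BoundedGzipReader() {
    }

    /** Decompresses the stream, failing as soon as the output exceeds the limit. */
    static byte[] readGzip(InputStream compressed, long maxDecompressedBytes) throws IOException {
        try (GZIPInputStream gzip = new GZIPInputStream(compressed)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            long total = 0;
            int read;
            while ((read = gzip.read(buffer)) != -1) {
                total += read;
                if (total > maxDecompressedBytes) {
                    // Stop before buffering the excess chunk.
                    throw new IOException(
                            "decompressed response exceeds " + maxDecompressedBytes + " bytes");
                }
                out.write(buffer, 0, read);
            }
            return out.toByteArray();
        }
    }

    /** Demo helper: gzip-compresses a byte array in memory. */
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        }
        return out.toByteArray();
    }
}
```

A megabyte of zeros compresses to a body well under the 64 KB compressed limit from the example policy, yet a call such as `readGzip(stream, 256 * 1024)` rejects it once the decompressed output crosses 256 KB.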

20. Gateway and Proxy Behavior

Compression may be applied at:

service;
gateway;
sidecar proxy;
ingress proxy;
CDN;
client library.

Internal microservices often have more than one hop.

Questions:

Who compresses?
Who decompresses?
Can the proxy buffer the full response before forwarding?
Are Content-Length and Transfer-Encoding changed?
Are ETags generated for compressed or uncompressed representation?
Are metrics recorded before or after compression?
Are body size limits applied to compressed or decompressed bytes?

Do not assume enabling compression in the application is the only place compression happens.

A gateway may already compress responses.

A sidecar may decompress for inspection.

A proxy may disable compression for certain content types.

An observability agent may measure different size semantics than the application.

21. Compression and Caching

Caching compressed responses introduces representation variants.

A cache must respect Accept-Encoding negotiation.

The usual mechanism is:

Vary: Accept-Encoding

Without correct variant handling, a cache could serve compressed content to a client that cannot decode it, or serve the wrong representation.

For internal microservices, caching is less universal than on the public web, but the same rule applies.

If content negotiation affects response representation, caches must know.

Do not casually add caching and compression together without variant policy.

22. Content-Length and Streaming

Compression can change whether Content-Length is known in advance.

If the server compresses on the fly, it may not know the final compressed length without buffering the entire response.

That can produce chunked transfer in HTTP/1.1 or streamed data frames in HTTP/2.

Operational implication:

Compression can shift cost from the network to memory buffering if the implementation computes the compressed length before sending.

For large responses, avoid full buffering just to compute Content-Length.

Prefer streaming when possible.

But streaming compressed responses also requires careful cancellation handling.

If the client disconnects mid-stream, the server must stop producing and compressing.

Otherwise, it wastes work.

23. Payload Efficiency and Observability

You need metrics for payload size.

At minimum:

request body size
response body size
compressed response size if available
decompressed response size if available
serialization time
compression time
deserialization time
route-level distribution
status code distribution by size bucket

But avoid high-cardinality labels.

Bad metric tags:

caseId
customerId
documentId
errorMessage
fullPathWithId

Good metric tags:

route_template=/cases/{caseId}/summary
method=GET
status_code=200
content_encoding=gzip
size_bucket=64kb_256kb

Size histograms are more useful than averages.

Average payload size hides the rare but dangerous giant responses.

Track:

P50
P90
P95
P99
max
count above limit
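Low-cardinality size buckets can be sketched in a few lines. The example below is an in-memory stand-in for a real metrics library: `ResponseSizeMetrics` and every bucket label except `64kb_256kb` (taken from the tag example above) are assumptions, and the bucket boundaries are arbitrary.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

final class ResponseSizeMetrics {
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    /** Maps a body size to one of a small, fixed set of labels. */
    static String bucketFor(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("size must not be negative");
        }
        if (bytes < 1024) {
            return "lt_1kb";
        }
        if (bytes < 16 * 1024) {
            return "1kb_16kb";
        }
        if (bytes < 64 * 1024) {
            return "16kb_64kb";
        }
        if (bytes < 256 * 1024) {
            return "64kb_256kb";
        }
        return "ge_256kb";
    }

    /** Records one response; the key uses the route template, never the concrete path. */
    void record(String routeTemplate, int statusCode, long bodyBytes) {
        counters.computeIfAbsent(key(routeTemplate, statusCode, bucketFor(bodyBytes)),
                k -> new LongAdder()).increment();
    }

    long count(String routeTemplate, int statusCode, String bucket) {
        LongAdder adder = counters.get(key(routeTemplate, statusCode, bucket));
        return adder == null ? 0 : adder.sum();
    }

    private static String key(String routeTemplate, int statusCode, String bucket) {
        return routeTemplate + "|" + statusCode + "|" + bucket;
    }
}
```

Because the label set is fixed, the number of time series grows with routes and status codes, not with case IDs or payload sizes.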

24. Java Serialization Cost

In Java, serialization cost is not only CPU cycles.

It also includes allocation and object traversal.

For JSON APIs using Jackson-like libraries, the service may allocate:

DTO objects;
strings;
arrays;
maps;
byte buffers;
intermediate serializer state;
compressed output buffers.

Large payloads can create allocation bursts.

Allocation bursts can create GC pauses.

GC pauses can create timeout bursts.

Timeout bursts can create retry bursts.

So a seemingly harmless DTO expansion can become a reliability problem.

Design implication:

Treat large DTOs as production load-bearing structures.

Do not let DTOs grow indefinitely because "only internal services use them".

Internal consumers are often the most latency-sensitive consumers.

25. Java Example: Response Size Guard

A server-side response size guard is hard to enforce when responses are streamed, because bytes already sent cannot be taken back, but the concept is still important.

At the API policy level:

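public final class PayloadPolicy {
    // Hard limit for the serialized case summary body: 256 KB.
    public static final long MAX_CASE_SUMMARY_BYTES = 256 * 1024;
    // Largest page size a caller may request from case event list endpoints.
    public static final int MAX_CASE_EVENTS_PAGE_SIZE = 100;

    // Constants holder: not meant to be instantiated.
    private PayloadPolicy() {
    }
}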

For list endpoints:

public record PageRequest(int limit, String cursor) {
    public PageRequest {
        if (limit < 1) {
            throw new IllegalArgumentException("limit must be positive");
        }
        if (limit > PayloadPolicy.MAX_CASE_EVENTS_PAGE_SIZE) {
            throw new IllegalArgumentException("limit exceeds maximum page size");
        }
    }
}

Do not rely only on documentation.

Enforce payload control in code.
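For the byte limit itself, one possible guard is an output stream wrapper that refuses to write past a budget. This is a sketch under assumptions: `SizeLimitedOutputStream` is a hypothetical class, and where it would be wired in (a servlet filter, a message converter, a serializer) depends on the framework. If bytes have already been flushed to the client, the failure surfaces mid-stream, so it works best where the body is buffered up to the limit.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical guard that fails once a response body would exceed its byte budget. */
final class SizeLimitedOutputStream extends FilterOutputStream {
    private final long maxBytes;
    private long written;

    SizeLimitedOutputStream(OutputStream delegate, long maxBytes) {
        super(delegate);
        this.maxBytes = maxBytes;
    }

    @Override
    public void write(int b) throws IOException {
        ensureCapacity(1);
        out.write(b);
        written++;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        ensureCapacity(len);
        out.write(b, off, len); // delegate in bulk instead of byte by byte
        written += len;
    }

    private void ensureCapacity(int extra) throws IOException {
        if (written + extra > maxBytes) {
            throw new IOException("response body exceeds " + maxBytes + " bytes");
        }
    }

    long bytesWritten() {
        return written;
    }
}
```

Wrapping the serializer's target stream with a limit such as 256 KB turns an oversized case summary into a predictable server-side error instead of a slow, heavy response.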

26. Spring Boot Compression Configuration

A typical Spring Boot service can enable response compression through server configuration.

Example:

server:
  compression:
    enabled: true
    min-response-size: 2KB
    mime-types:
      - application/json
      - application/problem+json
      - text/plain

This is not enough by itself.

You still need route-level policy:

Which endpoints are allowed to return large payloads?
Which endpoints must never compress?
Which clients support which encodings?
Does the gateway override compression?
Are metrics collected before or after compression?

Configuration toggles are not architecture.

27. Client-Side Accept-Encoding Policy

A service client should be explicit about what it can decode.

Example intent:

Accept-Encoding: gzip

Do not advertise br or zstd unless your client, runtime, gateway, and observability tooling handle them correctly.

A mature client abstraction should define:

accepted encodings
maximum compressed size
maximum decompressed size
timeout budget
streaming vs buffering behavior
error handling for unsupported encoding
metrics for actual content encoding

The server's Content-Encoding must match something the client can decode.

If it does not, the client should fail clearly.

Bad failure:

java.io.IOException: malformed response

Better failure:

UnsupportedContentEncodingException: service=risk-service route=/risk-score/{subjectId} contentEncoding=br accepted=[gzip]

Error messages are part of operability.
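The clearer failure can be produced by validating the response's `Content-Encoding` before any decoding is attempted. In the sketch below, `ContentEncodingGuard` and `UnsupportedContentEncodingException` are hypothetical types defined only for this example (the exception name follows the message shown above), and the accepted list is assumed to contain lowercase coding names.

```java
import java.util.List;
import java.util.Locale;

/** Hypothetical exception carrying enough context to diagnose the mismatch. */
final class UnsupportedContentEncodingException extends RuntimeException {
    UnsupportedContentEncodingException(String message) {
        super(message);
    }
}

final class ContentEncodingGuard {
    private ContentEncodingGuard() {
    }

    /**
     * Returns the normalized content coding, or "identity" when the header is absent.
     * Throws if any listed coding is not in the accepted list.
     */
    static String requireSupported(String service, String route,
                                   String contentEncoding, List<String> accepted) {
        if (contentEncoding == null || contentEncoding.isBlank()) {
            return "identity";
        }
        String normalized = contentEncoding.trim().toLowerCase(Locale.ROOT);
        // Content-Encoding may list several codings; every one must be decodable.
        for (String coding : normalized.split(",")) {
            if (!accepted.contains(coding.trim())) {
                throw new UnsupportedContentEncodingException(
                        "service=" + service
                                + " route=" + route
                                + " contentEncoding=" + normalized
                                + " accepted=" + accepted);
            }
        }
        return normalized;
    }
}
```

Calling `requireSupported("risk-service", "/risk-score/{subjectId}", "br", List.of("gzip"))` fails with the message format shown above, which tells the on-call engineer which dependency, route, and coding are involved.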

28. Payload Compression and OpenAPI

OpenAPI can describe media types and response schemas, but compression is usually expressed through HTTP headers and operational policy, not through the schema alone.

Document:

responses:
  '200':
    description: Case summary
    headers:
      Content-Encoding:
        schema:
          type: string
        description: Present when response body is compressed.
    content:
      application/json:
        schema:
          $ref: '#/components/schemas/CaseSummaryResponse'

But do not stop there.

Also document:

maximum response size
pagination limits
known large fields
projection rules
compression threshold
supported content codings

Schema tells the client what fields exist.

Payload policy tells the client what operational envelope to expect.

29. Wire Efficiency Decision Model

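Use this decision model: first stop sending unnecessary data, then represent the remaining data efficiently, and only then decide whether compression actually wins.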

Compression is late in the decision tree.

Not because compression is bad.

Because representation mistakes are more fundamental.

30. When to Consider Protobuf or Binary Encoding

This part focuses on HTTP payload efficiency, not gRPC design.

But binary encoding belongs in the decision model.

Consider Protobuf or another compact format when:

payloads are frequent and high volume;
JSON parsing cost is significant;
schema is stable and strongly governed;
both sides are internal and generated clients are acceptable;
humans do not need to inspect traffic directly;
contract evolution is disciplined;
observability can decode or summarize safely.

Do not switch to binary encoding merely because JSON looks verbose.

The trade-off is not only size.

It includes:

tooling;
debugging;
compatibility;
schema governance;
generated code lifecycle;
gateway support;
tracing/logging visibility;
partial rollout.

For many service-to-service APIs, well-shaped JSON plus compression is enough.

For high-throughput internal RPC, gRPC/Protobuf may be better.

That is why this series has a dedicated gRPC phase later.

31. Efficient Payload Checklist

Use this checklist before optimizing with compression.

[ ] Is this endpoint returning only the fields the caller needs?
[ ] Are large child collections paginated?
[ ] Is response ordering stable?
[ ] Are expansion rules explicit and bounded?
[ ] Are binary payloads separated from metadata?
[ ] Is maximum response size documented and enforced?
[ ] Are null/omitted/empty semantics defined?
[ ] Are large fields opt-in rather than default?
[ ] Is serialization/deserialization time measured?
[ ] Are payload size histograms available per route?
[ ] Are compressed and decompressed limits enforced?
[ ] Is compression threshold configured?
[ ] Are supported content codings known to clients?
[ ] Does gateway/proxy compression behavior match application policy?
[ ] Are sensitive payloads protected from unsafe compression scenarios?

If the first half fails, compression is premature.

32. Production Payload Policy Template

A production API should include a payload policy like this.

## Payload Policy

Endpoint: GET /cases/{caseId}/worklist-card

Representation:
- Media type: application/json
- Default compression: gzip when response >= 2 KB and client accepts gzip
- Binary content: not embedded

Size limits:
- Typical response: < 10 KB
- P95 expected: < 25 KB
- Hard limit: 64 KB decompressed

Collections:
- No unbounded collections
- Maximum alerts embedded: 5
- Additional alerts require separate endpoint

Field presence:
- Omitted field: not part of projection
- null field: explicitly absent value
- empty array: known empty collection

Operational behavior:
- Timeout budget: 300 ms total caller-side
- Retry: safe for GET, max 1 retry, only on transient transport failure
- Metrics: response size histogram by route/status/content-encoding

This kind of policy prevents accidental payload drift.

33. Anti-Patterns

33.1 Compressing everything

All responses are compressed regardless of size or type.

This wastes CPU on tiny responses and already compressed payloads.

33.2 Large internal DTO shared by everyone

One DTO serves UI, batch, search, workflow, audit, and reporting.

This creates uncontrolled payload coupling.

33.3 Infinite include trees

GET /cases/CASE-123?include=subjects.addresses.documents.events.tasks.notes

Without depth and size limits, this is a denial-of-service interface.

33.4 Logging full payloads to debug size problems

Large payload incidents often tempt teams to log the entire body.

That creates privacy, cost, and performance problems.

Log size, schema version, route, correlation ID, and safe summaries instead.

33.5 Treating gzip as a substitute for pagination

Compression reduces bytes.

It does not fix unbounded response semantics.

34. Practical Engineering Rule

Use this rule in design reviews:

Every synchronous endpoint must have an expected size envelope.
Every collection must be bounded.
Every large representation must be intentional.
Every compression decision must be measured.

If an endpoint cannot state its expected size envelope, it is not production-ready.

35. How This Connects to the Next Part

Payload efficiency controls successful responses.

The next part controls failed responses.

Error responses are also payloads.

They must be:

machine-readable;
stable;
safe to log;
useful for retry decisions;
consistent across services;
aligned with HTTP status semantics.

That is the role of error response modeling.

References

RFC 9110 — HTTP Semantics: https://www.rfc-editor.org/rfc/rfc9110.html
IANA HTTP Parameters — HTTP Content Coding Registry: https://www.iana.org/assignments/http-parameters/http-parameters.xhtml
RFC 9457 — Problem Details for HTTP APIs: https://www.rfc-editor.org/rfc/rfc9457.html
OpenTelemetry Semantic Conventions — HTTP: https://opentelemetry.io/docs/specs/semconv/http/
Spring Boot Reference — Response Compression: https://docs.spring.io/spring-boot/reference/how-to/webserver.html
