
Designing Production-Grade Network Servers and Gateways

Learn Java Networking - Part 031

Designing production-grade Java network servers and gateways with admission control, overload behavior, graceful shutdown, connection draining, protocol negotiation, streaming proxying, and defensive degradation.


Part 031 — Designing Production-Grade Network Servers and Gateways

Goal: be able to design a Java server/gateway that does more than “accept connections”, with admission control, overload semantics, graceful shutdown, draining, deadline propagation, defensive protocol boundaries, and a clear degradation strategy.

The previous part covered the production-grade network client. This part flips the perspective: when a Java application becomes a network server or gateway, it is not just running handlers. It becomes a traffic boundary that must protect capacity, fairness, isolation, security, and correctness under both normal and bad conditions.

A good production-grade server is not measured by peak throughput in an empty benchmark. A good server is one that can still answer these questions when the system is in chaos:

  • should a new connection be accepted or rejected?
  • may an in-flight request finish, or must it be cut off?
  • when is the server considered ready?
  • when is the server considered overloaded?
  • how do we keep slow clients from exhausting memory?
  • how does a gateway forward a streaming response without buffering the entire payload?
  • how is shutdown performed without breaking transactions that are still legitimate?
  • how do we signal failure in a way the client can understand?

1. Kaufman Skill Deconstruction

To reach top-tier fluency, server/gateway networking skill has to be broken down into sub-skills that can be practiced separately.

| Sub-skill | Core question | Output of a mature engineer |
| --- | --- | --- |
| Listen lifecycle | when to bind, accept, stop accepting, close? | predictable startup/shutdown |
| Admission control | who gets in when capacity runs low? | overload does not turn into collapse |
| Connection state | what is the status of each connection? | no zombie connections, no hidden leaks |
| Request lifecycle | when is a request considered received, processed, finished, failed? | handlers have a clear state machine |
| Deadline propagation | how much time does the request have left? | no infinite work |
| Backpressure | how are slow clients/servers kept in check? | bounded memory, fairness preserved |
| Graceful shutdown | what gets stopped first? | safe rolling deploys |
| Gateway semantics | what is forwarded, changed, rejected? | explicit proxy behavior |
| Failure mapping | which network signal does an internal error become? | clients can self-correct |
| Observability | what evidence is collected? | incidents can be diagnosed quickly |

The Kaufman mental model is simple: do not practice “building a server”. Practice building a server that knows when to say no.


2. Server Is a Boundary, Not a Handler Container

A common mistake: treating the server as a loop that receives requests and runs a business function.

A more accurate model:

The server is a boundary that turns raw connections into application work. This boundary must perform:

  1. acceptance: accept or reject the connection;
  2. classification: understand the type of traffic;
  3. admission: decide whether capacity is available;
  4. parsing: build a request from the byte stream/frames;
  5. execution: run the handler with a deadline and isolation;
  6. response: write the result without blowing up buffers;
  7. termination: close the connection correctly;
  8. observation: produce evidence for debugging.

If any one of these boundary functions is missing, production bugs usually surface as latency spikes, connection leaks, memory pressure, thread exhaustion, or client retry storms.


3. Production Server Invariants

Before choosing a Java API, define the invariants.

| Invariant | Meaning | Anti-pattern |
| --- | --- | --- |
| Bounded concurrency | the number of active jobs is limited | unbounded thread per request |
| Bounded memory | buffers/requests/queues have limits | reading the full body into memory |
| Bounded wait | every stage has a timeout/deadline | infinite accept/read/write/handler |
| Explicit ownership | every socket/request has a lifecycle owner | socket closed from many places with no rules |
| Fail before collapse | overload is rejected before the system falls over | accepting all traffic until OOM |
| Drain before stop | shutdown separates stop-accept from finish-active | immediate System.exit/kill |
| Observable states | important states are visible | only logging errors when it is already too late |
| Protocol correctness | parser is defensive against malicious/corrupt input | trusting the client to always be valid |
| Isolation | slow/bad clients do not disturb all clients | shared lock / large global queue |
| Deterministic degradation | failure modes are designed in advance | random timeouts and partial responses |

These invariants matter more than the framework. A framework can help, but it does not replace boundary decisions.


4. Java Server Implementation Choices

Java provides several API levels for building a server/gateway.

| Approach | API | Suited for | Trade-off |
| --- | --- | --- | --- |
| Blocking socket | ServerSocket, Socket | small protocols, labs, internal tools, virtual-thread servers | easy to understand, requires deadline/resource discipline |
| Non-blocking NIO | ServerSocketChannel, SocketChannel, Selector | high-connection servers, custom protocols, low-level gateways | complex, explicit state machine |
| Async channel | AsynchronousServerSocketChannel | completion model, Windows IOCP-style abstraction | not always simpler than NIO |
| Embedded JDK HTTP server | com.sun.net.httpserver.HttpServer | test servers, admin endpoints, lightweight embedded servers | not a full production web framework |
| Framework | Netty/Undertow/Jetty/Tomcat/etc. | production HTTP/gateway systems | requires understanding the internal model to avoid misconfiguration |

This series does not go deep into external frameworks. The focus is the mental model that still applies whichever framework you use.


5. Server Lifecycle State Machine

A good server has an explicit state machine.

Key distinctions:

  • not started: not yet bound;
  • live but not ready: the process is alive but must not receive traffic yet;
  • ready: may receive traffic;
  • degraded: still accepts some traffic, with restrictions;
  • draining: accepts no new traffic, finishes what is active;
  • force closing: closes forcibly because the shutdown budget is exhausted;
  • stopped: all resources are closed.

Many rolling-deploy outages happen because the server has only two states: “up” and “down”. That is too coarse.
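
The states listed above can be captured in an enum with an explicit transition table. A minimal sketch, assuming the hypothetical names ServerState, next, and canTransitionTo (none of them from the original) and one plausible set of allowed transitions:

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical lifecycle enum mirroring the seven states listed above.
enum ServerState {
    NOT_STARTED, LIVE_NOT_READY, READY, DEGRADED, DRAINING, FORCE_CLOSING, STOPPED;

    // Assumed transition table: one plausible policy, not the only valid one.
    Set<ServerState> next() {
        switch (this) {
            case NOT_STARTED:    return EnumSet.of(LIVE_NOT_READY, STOPPED);
            case LIVE_NOT_READY: return EnumSet.of(READY, DRAINING, STOPPED);
            case READY:          return EnumSet.of(DEGRADED, DRAINING);
            case DEGRADED:       return EnumSet.of(READY, DRAINING);
            case DRAINING:       return EnumSet.of(FORCE_CLOSING, STOPPED);
            case FORCE_CLOSING:  return EnumSet.of(STOPPED);
            default:             return EnumSet.noneOf(ServerState.class);
        }
    }

    boolean canTransitionTo(ServerState target) {
        return next().contains(target);
    }

    // Only READY and DEGRADED may take new traffic; DRAINING must not.
    boolean acceptsNewTraffic() {
        return this == READY || this == DEGRADED;
    }
}
```

Keeping the table in one place makes an illegal move, such as going from draining back to ready, a programming error that can be rejected and logged instead of a silent flag flip.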


6. Accept Loop Design

The accept loop is the first door.

In the blocking model:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketException;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicBoolean;

public final class BlockingTcpServer implements AutoCloseable {
    private final ServerSocket serverSocket;
    private final ExecutorService workers;
    private final AtomicBoolean accepting = new AtomicBoolean(true);
    private final Semaphore admission;

    public BlockingTcpServer(int port, int maxActiveConnections) throws IOException {
        this.serverSocket = new ServerSocket();
        this.serverSocket.setReuseAddress(true);
        // 512 is the kernel listen backlog hint, not the application connection limit.
        this.serverSocket.bind(new InetSocketAddress(port), 512);
        this.workers = Executors.newVirtualThreadPerTaskExecutor(); // requires Java 21+
        this.admission = new Semaphore(maxActiveConnections);
    }

    public void serve() throws IOException {
        while (accepting.get()) {
            final Socket socket;
            try {
                socket = serverSocket.accept();
            } catch (SocketException e) {
                if (!accepting.get()) {
                    return; // close() closed the listen socket: normal shutdown
                }
                throw e;
            }

            // Admission is decided before any worker is started for the connection.
            if (!admission.tryAcquire()) {
                reject(socket, "server overloaded\n");
                continue;
            }

            workers.submit(() -> {
                try (socket) {
                    configureAcceptedSocket(socket);
                    handleConnection(socket);
                } catch (IOException e) {
                    // classify and log at connection boundary
                } finally {
                    admission.release();
                }
            });
        }
    }

    private static void configureAcceptedSocket(Socket socket) throws SocketException {
        socket.setTcpNoDelay(true);
        socket.setSoTimeout(30_000); // read timeout in milliseconds
        socket.setKeepAlive(true);
    }

    private static void reject(Socket socket, String message) {
        try (socket) {
            socket.setSoTimeout(2_000);
            socket.getOutputStream().write(message.getBytes(StandardCharsets.UTF_8));
        } catch (IOException ignored) {
            // reject path must be best-effort
        }
    }

    private void handleConnection(Socket socket) throws IOException {
        // protocol-specific state machine
    }

    @Override
    public void close() throws IOException {
        accepting.set(false);
        serverSocket.close();  // unblocks accept() in serve()
        workers.shutdown();    // stops new tasks; does not wait for active connections
    }
}

Important: the server does not hand every accepted socket straight to a worker without admission. If the worker queue is unbounded, the accept loop can keep accepting connections until memory is full.


7. Backlog Is Not Admission Control

The backlog on a listen socket is often misread as “max connections”. It is not an application admission policy.

| Layer | Queue | What it controls |
| --- | --- | --- |
| Kernel listen backlog | pending connection before accept | connections waiting to be accepted by the process |
| Application accept loop | accepted connection | connections already inside the process |
| Worker queue | submitted task | work waiting to be executed |
| Protocol queue | parsed request/message | logical application work |
| Write queue | pending response bytes | output not yet sent |

The backlog is just one queue in the kernel. Production admission control must also exist in the application.

If you rely on the backlog alone:

  • the server can accept more connections than the handlers have capacity for;
  • the worker queue can grow without bound;
  • latency rises silently;
  • clients start to retry;
  • retries create more connections;
  • the server collapses.

8. Admission Control Patterns

Admission control must be defined per scarce resource.

| Resource | Example limit | Failure response |
| --- | --- | --- |
| active connection | 10k TCP connections | close early / 503 equivalent |
| active request | 1k concurrent requests | 503 with Retry-After if HTTP |
| body memory | 256 MiB total buffered | 413 / close protocol |
| write backlog | 64 MiB pending writes | slow client disconnect |
| downstream calls | 200 active outbound calls | local fail-fast |
| CPU work | worker capacity | 429/503 depending on semantics |

Admission must happen before expensive resources are allocated.

Anti-pattern:

byte[] body = socket.getInputStream().readAllBytes(); // unbounded body first
if (!admission.tryAcquire()) {
    reject();
}

The correct pattern:

if (!admission.tryAcquire()) {
    rejectEarly(socket);
    return;
}
try {
    readBoundedRequest(socket, maxBytes, deadline);
} finally {
    admission.release();
}

9. Request State Machine

For a server/gateway, a request is not an ordinary method call. A request is a stateful lifecycle.

Why does the state machine matter?

Because each state has its own timeout, memory limit, cancellation behavior, and metrics.

| State | Timeout | Limit | Metric |
| --- | --- | --- | --- |
| Reading headers | short | max header bytes | header read latency |
| Reading body | medium | max body bytes | body bytes received |
| Dispatching | short | executor queue | queue wait |
| Downstream call | deadline remainder | outbound concurrency | downstream latency |
| Writing response | bounded | write queue bytes | bytes sent / client abort |

Without a state machine, every failure looks like a “request timeout”. That is not enough for debugging.
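
Each state in the table draws its timeout from the same overall request budget. A minimal sketch of that idea, assuming a hypothetical Deadline helper (not part of the JDK or of the original text) built on System.nanoTime():

```java
import java.time.Duration;

// Hypothetical helper: an absolute deadline on the System.nanoTime() clock.
final class Deadline {
    private final long deadlineNanos;

    private Deadline(long deadlineNanos) {
        this.deadlineNanos = deadlineNanos;
    }

    static Deadline after(Duration budget) {
        return new Deadline(System.nanoTime() + budget.toNanos());
    }

    // Time left in the request budget, never negative.
    Duration remaining() {
        long left = deadlineNanos - System.nanoTime();
        return left <= 0 ? Duration.ZERO : Duration.ofNanos(left);
    }

    boolean expired() {
        return remaining().isZero();
    }

    // A stage gets its own cap or what is left of the budget, whichever is smaller.
    Duration stageTimeout(Duration stageCap) {
        Duration left = remaining();
        return left.compareTo(stageCap) < 0 ? left : stageCap;
    }
}
```

A handler could create one Deadline when the request starts and pass it to every stage. Check expired() before converting the result into a socket timeout, because Socket.setSoTimeout(0) means wait forever rather than fail immediately.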


10. Thread-per-Connection with Virtual Threads

Virtual threads make blocking I/O realistic again for many custom Java servers. But that is not a license to drop limits.

A sound model:

ExecutorService connections = Executors.newVirtualThreadPerTaskExecutor();
Semaphore maxConnections = new Semaphore(20_000);

while (running) {
    Socket socket = serverSocket.accept();

    if (!maxConnections.tryAcquire()) {
        closeQuietly(socket);
        continue;
    }

    connections.submit(() -> {
        try (socket) {
            socket.setSoTimeout(15_000);
            serveProtocol(socket);
        } catch (IOException e) {
            // classify and log; a Runnable lambda cannot propagate checked exceptions
        } finally {
            maxConnections.release();
        }
    });
}

Virtual threads help when many threads are waiting on I/O. But other resources remain real:

  • file descriptor;
  • socket buffer;
  • heap object per connection;
  • native memory;
  • downstream capacity;
  • database pool;
  • CPU;
  • log volume;
  • metric cardinality.

Rule:

Virtual threads reduce the cost of waiting. They do not remove the cost of accepting unbounded work.


11. NIO Gateway Design

A gateway often needs NIO because it has to connect two connections: the inbound client and the outbound upstream.

Gateway state must represent both directions of data:

| Direction | Risk |
| --- | --- |
| client -> gateway | slow upload, oversized body, invalid protocol |
| gateway -> upstream | connect delay, DNS failure, TLS failure |
| upstream -> gateway | slow response, partial body, reset |
| gateway -> client | slow download, client abort |

A correct NIO gateway must have:

  • inbound read budget;
  • outbound connect timeout;
  • per-connection deadline;
  • bounded buffer per direction;
  • explicit close propagation;
  • half-close policy;
  • cancellation on either side failure;
  • correlation ID per proxied exchange.

12. Streaming Proxy Without Full Buffering

Gateway anti-pattern:

byte[] requestBody = clientInput.readAllBytes();
byte[] upstreamBody = callUpstream(requestBody);
clientOutput.write(upstreamBody);

This fails for large payloads, slow clients, and under memory pressure.

A streaming gateway must move data in bounded chunks.

static void copyBounded(InputStream in, OutputStream out, long maxBytes) throws IOException {
    byte[] buffer = new byte[64 * 1024];
    long total = 0;

    while (true) {
        int n = in.read(buffer);
        if (n == -1) {
            return;
        }
        total += n;
        if (total > maxBytes) {
            throw new IOException("payload too large");
        }
        out.write(buffer, 0, n);
        out.flush(); // choose carefully; excessive flush hurts throughput
    }
}

For an HTTP gateway, do not buffer the entire response before writing to the client unless there is a real requirement such as a small content transformation.

Decision rule:

| Requirement | Strategy |
| --- | --- |
| pass-through large object | streaming |
| inspect small JSON | bounded buffering |
| transform compressed payload | decompress with strict limits |
| retry after partial upload | usually impossible safely |
| retry before body sent | possible for idempotent request |

13. Gateway Is Not Just Forwarding

A gateway makes semantic decisions.

| Responsibility | Example |
| --- | --- |
| Normalize | remove hop-by-hop headers |
| Authenticate | validate caller identity |
| Authorize | enforce route-level policy |
| Route | choose upstream based on path/header/tenant |
| Limit | cap concurrency/rate/body size |
| Transform | rewrite path/header/body if allowed |
| Observe | trace one inbound request across outbound calls |
| Protect | block SSRF/internal destinations |
| Degrade | fallback/reject/shed load |

As the engineer, decide which gateway behaviors are explicit and which are forbidden.

Example policy table:

| Input | Gateway behavior |
| --- | --- |
| unknown host | reject |
| private IP resolved from public host | reject |
| redirect to unapproved domain | reject |
| large body without content length | stream with cap |
| upstream 503 | preserve or map to gateway 502/503 by policy |
| client disconnect | cancel upstream |
| upstream slow body | abort on deadline |
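
The "private IP resolved from public host" row can be enforced after DNS resolution and before connecting. A minimal sketch, assuming a hypothetical EgressGuard class; the address classes checked are a starting point, not a complete egress policy (for example, carrier-grade NAT ranges are not covered):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical egress check for routes that must only reach public addresses.
final class EgressGuard {
    private EgressGuard() {}

    // True when the address is loopback, wildcard, link-local, private, or multicast.
    static boolean isInternal(InetAddress addr) {
        if (addr.isLoopbackAddress() || addr.isAnyLocalAddress()
                || addr.isLinkLocalAddress() || addr.isSiteLocalAddress()
                || addr.isMulticastAddress()) {
            return true;
        }
        byte[] raw = addr.getAddress();
        // IPv6 unique local addresses (fc00::/7) are not reported by isSiteLocalAddress().
        return raw.length == 16 && (raw[0] & 0xfe) == 0xfc;
    }

    // Resolves once, rejects the host if any result is internal, returns the first address.
    static InetAddress resolveForPublicRoute(String host) throws UnknownHostException {
        InetAddress[] all = InetAddress.getAllByName(host);
        for (InetAddress a : all) {
            if (isInternal(a)) {
                throw new SecurityException("internal address not allowed for host " + host);
            }
        }
        return all[0];
    }
}
```

The gateway should then connect to the returned InetAddress instead of resolving the hostname a second time, so the address that was checked is the address that is used.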

14. Protocol Negotiation and Versioning

A server/gateway often sits on the boundary between protocol versions.

| Protocol aspect | Risk |
| --- | --- |
| HTTP/1.1 vs HTTP/2 | different connection and multiplexing semantics |
| TLS ALPN | client/server disagree on protocol |
| SNI | certificate selection fails |
| compression | decompression bomb |
| chunked transfer | body length unknown up front |
| WebSocket upgrade | state moves from HTTP request to bidirectional channel |
| custom binary version | parser mismatch |

Design principle:

Negotiate explicitly, fail loudly, and record the negotiated result.

Log/metric attributes worth capturing:

  • local address;
  • remote address;
  • protocol version;
  • TLS version;
  • cipher suite;
  • SNI hostname;
  • ALPN result;
  • request route;
  • upstream target;
  • response status;
  • request/response bytes;
  • close reason.
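
For TLS with ALPN, the JDK exposes the offered protocols through SSLParameters and the negotiated result through SSLEngine. A minimal sketch, assuming a hypothetical AlpnSetup helper and an h2-then-http/1.1 preference order chosen only for illustration:

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLParameters;

// Hypothetical helper: configure ALPN on a server-side engine and report the outcome.
final class AlpnSetup {
    private AlpnSetup() {}

    static SSLEngine serverEngine(SSLContext context) {
        SSLEngine engine = context.createSSLEngine();
        engine.setUseClientMode(false);
        SSLParameters params = engine.getSSLParameters();
        // Protocols in preference order.
        params.setApplicationProtocols(new String[] {"h2", "http/1.1"});
        engine.setSSLParameters(params);
        return engine;
    }

    // Value suitable for a log or metric attribute once the handshake has run.
    static String negotiatedLabel(SSLEngine engine) {
        String protocol = engine.getApplicationProtocol();
        if (protocol == null) {
            return "undetermined"; // handshake has not reached ALPN yet
        }
        return protocol.isEmpty() ? "none" : protocol;
    }
}
```

Recording negotiatedLabel per connection, next to the TLS version and cipher suite, makes a client/server protocol disagreement visible instead of showing up later as a parser error.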

15. Graceful Shutdown

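Graceful shutdown is not just server.close().

The correct order: mark unready, stop accepting, reject new requests, drain active work, close idle connections, then force-cancel whatever remains once the budget is exhausted.

The server needs to distinguish:

| Action | Meaning |
| --- | --- |
| mark unready | load balancer should stop routing new traffic |
| stop accept | process stops accepting new connections |
| reject new request | existing keep-alive connection gets no new work |
| drain active | let accepted work finish |
| close idle | free unused connections |
| force cancel | shutdown budget exhausted |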

Pseudo-code:

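import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.LongAdder;
import java.util.concurrent.locks.LockSupport;

public final class ServerLifecycle {
    private final AtomicBoolean ready = new AtomicBoolean(false);
    private final AtomicBoolean accepting = new AtomicBoolean(false);
    private final LongAdder activeRequests = new LongAdder();

    // Call once bind and warm-up have completed.
    public void markReady() {
        accepting.set(true);
        ready.set(true);
    }

    public boolean isReady() {
        return ready.get();
    }

    public boolean canAcceptNewWork() {
        return ready.get() && accepting.get();
    }

    // Handlers bracket each request with these two calls so the drain loop can see active work.
    public void requestStarted() { activeRequests.increment(); }
    public void requestFinished() { activeRequests.decrement(); }

    public void beginShutdown(Duration drainBudget) {
        ready.set(false);       // readiness now fails, so the LB stops routing here
        accepting.set(false);   // stop new work locally

        Instant deadline = Instant.now().plus(drainBudget);
        while (activeRequests.sum() > 0 && Instant.now().isBefore(deadline)) {
            LockSupport.parkNanos(Duration.ofMillis(50).toNanos());
        }

        // Force-close remaining connections by connection registry.
    }
}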

16. Readiness and Liveness

Health checks must have operational meaning.

| Check | Question | Should fail when |
| --- | --- | --- |
| Liveness | process should be restarted? | deadlock, unrecoverable state |
| Readiness | should receive new traffic? | dependency unavailable, draining, overload |
| Startup | has boot completed? | config/load/migration not ready |

Anti-pattern:

boolean health() {
    return true;
}

Better readiness:

boolean ready() {
    return lifecycle.isReady()
        && admission.availablePermits() > minConnectionHeadroom
        && executorQueueDepth() < maxQueueDepth
        && dependencyGate.allCriticalHealthy()
        && memoryPressureBelowThreshold();
}

Readiness must fail during drain. Otherwise load balancer keeps sending traffic while the process is trying to stop.


17. Overload Behavior

Overload policy must be deterministic.

Common overload strategies:

| Strategy | Use case | Risk |
| --- | --- | --- |
| fail-fast | preserve system | caller must retry sanely |
| queue small | absorb tiny bursts | queue can hide overload |
| shed low priority | protect critical traffic | priority model must be trusted |
| degrade response | serve cached/partial response | semantic correctness risk |
| close idle | reclaim connection resource | client reconnect storm |
| reduce keep-alive | reduce FD pressure | more handshakes later |

Good overload response for HTTP often uses:

  • 503 Service Unavailable for temporary capacity issue;
  • 429 Too Many Requests for caller-specific rate limit;
  • Retry-After only when retry is actually encouraged;
  • small response body;
  • connection close if needed.

For custom TCP protocol, define equivalent error frames.
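
A small, pre-built rejection keeps the overload path cheap. A minimal sketch for HTTP/1.1, assuming a hypothetical OverloadResponses helper and a fixed plain-text body chosen for illustration:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper that renders a minimal 503 for local overload.
final class OverloadResponses {
    private OverloadResponses() {}

    // retryAfterSeconds <= 0 omits Retry-After, for cases where retry is not encouraged.
    static byte[] serviceUnavailable(int retryAfterSeconds) {
        String body = "overloaded\n";
        int bodyLength = body.getBytes(StandardCharsets.UTF_8).length;
        StringBuilder response = new StringBuilder()
            .append("HTTP/1.1 503 Service Unavailable\r\n")
            .append("Content-Type: text/plain; charset=utf-8\r\n")
            .append("Content-Length: ").append(bodyLength).append("\r\n")
            .append("Connection: close\r\n");
        if (retryAfterSeconds > 0) {
            response.append("Retry-After: ").append(retryAfterSeconds).append("\r\n");
        }
        // Blank line ends the header section, then the small body follows.
        response.append("\r\n").append(body);
        return response.toString().getBytes(StandardCharsets.UTF_8);
    }
}
```

Because the bytes depend only on the Retry-After value, a server could compute them once at startup and write the same array on every rejection.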


18. Slow Client Protection

Slow clients can kill servers.

Failure modes:

| Slow behavior | Impact |
| --- | --- |
| slow header | connection slot occupied |
| slow body upload | worker/thread held |
| slow response download | write buffer grows |
| read but never finish | request never dispatches |
| half-open connection | resource leak |

Defenses:

  • header read timeout;
  • body read timeout or minimum data rate;
  • max header size;
  • max body size;
  • max active connection per source/tenant;
  • bounded write queue;
  • disconnect slow readers;
  • never buffer unlimited response per connection.
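
The minimum data rate defense from the list can be sketched as a small per-connection guard. The class name MinRateGuard, the grace period, and the caller-supplied clock values are assumptions made for illustration:

```java
// Hypothetical per-connection guard: flags peers that stay below a minimum byte rate.
final class MinRateGuard {
    private final long minBytesPerSecond;
    private final long graceNanos;
    private final long startNanos;
    private long bytesSeen;

    MinRateGuard(long minBytesPerSecond, long graceNanos, long startNanos) {
        this.minBytesPerSecond = minBytesPerSecond;
        this.graceNanos = graceNanos;
        this.startNanos = startNanos;
    }

    // Call after every successful read with the number of bytes received.
    void onBytes(int n) {
        bytesSeen += n;
    }

    // True once the grace period is over and the average rate is below the minimum.
    boolean tooSlow(long nowNanos) {
        long elapsed = nowNanos - startNanos;
        if (elapsed <= graceNanos) {
            return false;
        }
        double seconds = elapsed / 1_000_000_000.0;
        return bytesSeen < minBytesPerSecond * seconds;
    }
}
```

In a read loop, the server would pass System.nanoTime() for both clock arguments and close the connection when tooSlow returns true. A plain read timeout does not catch a client that sends one byte just before each timeout fires, while an average-rate check does.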

For NIO server, write queue limit is mandatory:

final class ConnectionState {
    private final Deque<ByteBuffer> pendingWrites = new ArrayDeque<>();
    private long pendingBytes;
    private final long maxPendingBytes = 1_000_000;

    boolean enqueue(ByteBuffer buffer) {
        int bytes = buffer.remaining();
        if (pendingBytes + bytes > maxPendingBytes) {
            return false;
        }
        pendingWrites.add(buffer);
        pendingBytes += bytes;
        return true;
    }

    void onWritten(int bytes) {
        pendingBytes -= bytes;
    }
}

If pending writes exceed limit, close gracefully if possible. Keeping slow clients forever is not fairness; it is resource capture.


19. Failure Mapping for Servers and Gateways

A gateway must map internal failures into external signals carefully.

| Internal failure | HTTP-ish mapping | Notes |
| --- | --- | --- |
| invalid client request | 400 | caller can fix |
| unauthorized | 401 | auth missing/invalid |
| forbidden | 403 | auth ok, policy denies |
| body too large | 413 | include limit if safe |
| caller rate exceeded | 429 | caller-specific |
| local overload | 503 | system capacity issue |
| upstream DNS failure | 502/503 | depends whether route is misconfigured or temporary |
| upstream connect timeout | 504 | gateway waited for upstream |
| upstream TLS failure | 502 | bad gateway to upstream |
| upstream response timeout | 504 | response not timely |
| upstream invalid protocol | 502 | upstream broken/mismatch |
| gateway deadline exceeded | 504 | total budget exhausted |
| client disconnected | often no response | record as client abort |

Do not map everything to 500. That destroys client behavior and debugging signal.
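
The upstream rows of the table can be turned into one mapping function at the gateway boundary. A sketch under assumptions: the class name is hypothetical, the exception types are the JDK ones that blocking sockets and java.net.http typically surface, and the DNS case is fixed to 502 although the table allows 503 by policy:

```java
import java.io.IOException;
import java.net.ProtocolException;
import java.net.SocketTimeoutException;
import java.net.UnknownHostException;
import java.net.http.HttpTimeoutException;
import javax.net.ssl.SSLException;

// Hypothetical mapper from upstream failures to gateway status codes.
final class UpstreamFailureMapper {
    private UpstreamFailureMapper() {}

    static int toStatus(Throwable failure) {
        // Connect and response timeouts: the gateway waited and gave up.
        if (failure instanceof SocketTimeoutException || failure instanceof HttpTimeoutException) {
            return 504;
        }
        if (failure instanceof UnknownHostException) {
            return 502; // assumption: treated as misconfiguration; 503 if considered temporary
        }
        if (failure instanceof SSLException || failure instanceof ProtocolException) {
            return 502; // TLS failure or invalid upstream protocol
        }
        if (failure instanceof IOException) {
            return 502; // refused, reset, or other upstream I/O failure
        }
        return 500; // genuinely unexpected local error
    }
}
```

The order of the checks matters because SSLException, UnknownHostException, and the timeout types all extend IOException, so the generic branch has to come last.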


20. Connection Draining

Connection draining matters for HTTP keep-alive, HTTP/2, WebSocket, and long polling.

During drain:

| Connection type | Drain behavior |
| --- | --- |
| idle HTTP/1.1 keep-alive | close or send connection close signal |
| active HTTP/1.1 request | let finish until drain deadline |
| HTTP/2 connection | stop new streams, drain active streams |
| WebSocket | send close frame with reason if possible |
| raw TCP protocol | send protocol close/error frame if defined |
| streaming upload/download | apply policy: finish if short, abort if long |

Draining without stopping new work is not draining.
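
Draining needs to know which connections are idle and which are mid-request. A minimal sketch of a registry that supports this, assuming a hypothetical ConnectionRegistry class; the idle/active moves are not atomic with respect to closing, which a real implementation would have to address:

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry tracking idle and active connections for drain.
final class ConnectionRegistry {
    private final Set<Closeable> idle = ConcurrentHashMap.newKeySet();
    private final Set<Closeable> active = ConcurrentHashMap.newKeySet();

    void register(Closeable c)   { idle.add(c); }
    void markActive(Closeable c) { if (idle.remove(c)) active.add(c); }
    void markIdle(Closeable c)   { if (active.remove(c)) idle.add(c); }
    void unregister(Closeable c) { idle.remove(c); active.remove(c); }

    int activeCount() { return active.size(); }

    // Drain step: idle connections can be closed immediately.
    int closeIdle() { return closeAll(idle); }

    // Last step, after the drain budget is exhausted.
    int forceCloseActive() { return closeAll(active); }

    private static int closeAll(Set<Closeable> connections) {
        int closed = 0;
        for (Closeable c : connections) {
            if (connections.remove(c)) {
                try {
                    c.close();
                } catch (IOException ignored) {
                    // best-effort close during shutdown
                }
                closed++;
            }
        }
        return closed;
    }
}
```

Socket implements Closeable, so accepted sockets can be registered directly. The return values of closeIdle and forceCloseActive give the counts needed for a drain summary log.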


21. Designing a Custom TCP Server Protocol Boundary

A custom protocol should define:

  • magic/version;
  • frame type;
  • length;
  • correlation/request ID;
  • flags;
  • payload;
  • checksum/integrity if needed;
  • error frame;
  • close frame;
  • max frame size;
  • timeout per frame;
  • compatibility rule.

Example header:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Magic (16)           |  Version (8)  |   Type (8)    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          Length (32)                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                        RequestId (64)                         +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          Payload ...                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Rule: length is not trusted until validated against maximum allowed size.

int length = in.readInt();
if (length < 0 || length > maxFrameBytes) {
    throw new ProtocolException("invalid frame length: " + length);
}
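
One way to encode and decode the example header is with java.nio.ByteBuffer, whose default byte order is big-endian. The magic value 0xCAFE and the FrameHeader class are hypothetical; the layout assumed is a 2-byte magic, 1-byte version, 1-byte type, 4-byte length, and 8-byte request ID:

```java
import java.net.ProtocolException;
import java.nio.ByteBuffer;

// Hypothetical fixed-size frame header matching the example field widths.
final class FrameHeader {
    static final int SIZE = 16;                 // 2 + 1 + 1 + 4 + 8 bytes
    static final short MAGIC = (short) 0xCAFE;  // assumed magic value

    final byte version;
    final byte type;
    final int length;
    final long requestId;

    FrameHeader(byte version, byte type, int length, long requestId) {
        this.version = version;
        this.type = type;
        this.length = length;
        this.requestId = requestId;
    }

    void writeTo(ByteBuffer out) {
        out.putShort(MAGIC).put(version).put(type).putInt(length).putLong(requestId);
    }

    // Validates magic and length before the caller allocates anything for the payload.
    static FrameHeader readFrom(ByteBuffer in, int maxFrameBytes) throws ProtocolException {
        if (in.remaining() < SIZE) {
            throw new ProtocolException("truncated header");
        }
        if (in.getShort() != MAGIC) {
            throw new ProtocolException("bad magic");
        }
        byte version = in.get();
        byte type = in.get();
        int length = in.getInt();
        if (length < 0 || length > maxFrameBytes) {
            throw new ProtocolException("invalid frame length: " + length);
        }
        long requestId = in.getLong();
        return new FrameHeader(version, type, length, requestId);
    }
}
```

The length check sits between reading the length and reading anything that depends on it, so an oversized or negative value is rejected before any payload buffer is sized from it.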

22. Server Observability Contract

A production server should emit evidence for each boundary.

| Boundary | Metric/log/trace |
| --- | --- |
| accept | accepted/rejected connections, remote address class |
| admission | active permits, rejection reason |
| parser | invalid frame/header count, parse latency |
| request execution | queue wait, handler latency, deadline remaining |
| downstream | target, connect latency, response latency |
| response write | bytes, write latency, client abort |
| shutdown | active count, drain duration, forced close count |
| overload | shed count, priority, capacity dimension |

Avoid high-cardinality labels like raw IP or full URL path in metric labels. Put them in structured logs or traces with sampling/redaction.


23. Security at Server Boundary

Networking server security is not only TLS.

| Risk | Defense |
| --- | --- |
| oversized header/body | strict limits |
| slowloris-style behavior | read timeout/min data rate |
| protocol smuggling | strict parser, no ambiguous framing |
| request queue exhaustion | admission control |
| log injection | sanitize untrusted values |
| path/host confusion | canonical route matching |
| unexpected proxy headers | trust only from known proxy boundary |
| IP spoof assumptions | do not trust source IP behind proxy unless verified |
| decompression bomb | compressed/uncompressed ratio limits |
| WebSocket abuse | message size and idle timeout |

Gateway-specific: never trust X-Forwarded-* headers unless they come from a trusted upstream proxy boundary. Public clients can set those headers too.
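
A sketch of that rule for X-Forwarded-For, assuming a hypothetical ClientIpResolver helper and a trusted-proxy set expressed as exact IP strings (a real deployment would more likely match CIDR ranges):

```java
import java.util.Set;

// Hypothetical resolver: honor X-Forwarded-For only when the TCP peer is a trusted proxy.
final class ClientIpResolver {
    private ClientIpResolver() {}

    // peerIp is the address of the direct TCP peer; xForwardedFor is the raw header or null.
    static String resolve(String peerIp, String xForwardedFor, Set<String> trustedProxies) {
        if (xForwardedFor == null || !trustedProxies.contains(peerIp)) {
            return peerIp; // header is absent or came from an untrusted peer: ignore it
        }
        String[] hops = xForwardedFor.split(",");
        // Walk right to left, skipping entries appended by proxies we trust.
        for (int i = hops.length - 1; i >= 0; i--) {
            String hop = hops[i].trim();
            if (!hop.isEmpty() && !trustedProxies.contains(hop)) {
                return hop;
            }
        }
        return peerIp;
    }
}
```

Reading from the right matters because the leftmost entries are whatever the original client chose to send, while the rightmost entries were appended by infrastructure the gateway controls.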


24. Designing Gateway Routing Policy

Gateway route should be data-driven and explicit.

routes:
  - id: account-read
    match:
      method: GET
      pathPrefix: /accounts/
    upstream:
      scheme: https
      host: account-service.internal
      port: 8443
    policy:
      maxBodyBytes: 0
      deadlineMs: 800
      retry:
        enabled: true
        maxAttempts: 2
        onlyBeforeResponseStarted: true
      safeEgress:
        allowPrivate: true
        allowedHosts:
          - account-service.internal

Do not let arbitrary user input become upstream URL.

Bad:

URI upstream = URI.create(request.queryParam("url"));

Better:

Route route = routeTable.match(request.method(), request.path())
    .orElseThrow(NotFoundException::new);
URI upstream = route.buildUpstreamUri(request.pathVariables());

25. Gateway Retry Rules

Gateway retry is dangerous because gateway may sit between a caller and a side-effecting upstream.

Retry may be reasonable when:

  • method/operation is idempotent;
  • request body has not been partially sent, or body is replayable;
  • upstream did not produce response headers/body;
  • failure is likely transient;
  • retry budget remains;
  • caller deadline remains;
  • retry will not amplify overload.

Retry should usually not happen when:

  • request performed side effect without idempotency key;
  • body streaming already started and cannot replay;
  • upstream returns deterministic 4xx;
  • local server is overloaded;
  • downstream explicitly says do not retry;
  • global retry budget exhausted.

Gateway retry pseudo-policy:

boolean canRetry(Exchange x, Failure f) {
    return x.operation().isIdempotent()
        && x.requestBody().isReplayable()
        && !x.responseStarted()
        && f.isTransientNetworkFailure()
        && x.attempts() < x.route().maxAttempts()
        && x.deadline().hasTimeForAnotherAttempt()
        && retryBudget.tryAcquire();
}
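
The retryBudget.tryAcquire() call in the policy above is left undefined. One common shape is a token bucket in which every normal request earns a fraction of a retry. A minimal sketch, assuming a hypothetical RetryBudget class and a 10% ratio used only for illustration:

```java
// Hypothetical retry budget: retries are limited to a ratio of regular request volume.
final class RetryBudget {
    private static final long RETRY_COST = 1000;  // milli-tokens consumed per retry
    private final long depositPerRequest;         // milli-tokens earned per regular request
    private final long maxTokens;                 // cap on banked retries
    private long tokens;

    RetryBudget(double retryRatio, int maxRetriesBanked) {
        this.depositPerRequest = Math.round(retryRatio * RETRY_COST);
        this.maxTokens = maxRetriesBanked * RETRY_COST;
    }

    // Call once for every first-attempt request.
    synchronized void onRequest() {
        tokens = Math.min(maxTokens, tokens + depositPerRequest);
    }

    // Call before each retry; false means the global budget is exhausted.
    synchronized boolean tryAcquire() {
        if (tokens < RETRY_COST) {
            return false;
        }
        tokens -= RETRY_COST;
        return true;
    }
}
```

With new RetryBudget(0.1, 10), ten first attempts pay for one retry, so when an upstream fails for everyone the gateway adds at most about 10% extra load instead of doubling it.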

26. Server Capacity Model

A capacity model must answer “which bottleneck runs out first?”

| Dimension | Example metric | Symptom when exhausted |
| --- | --- | --- |
| file descriptors | open sockets | accept/connect fails |
| CPU | run queue, CPU utilization | latency rises globally |
| heap | allocation rate, old gen | GC pause/OOM |
| direct memory | direct buffer usage | native OOM, allocation failure |
| worker | active tasks/queue | queue wait increases |
| downstream pool | pending calls | request timeout |
| log I/O | log backlog | CPU/disk pressure |
| accept rate | connection churn | SYN/backlog issues |

Throughput target without capacity model is theatre.


27. Practical Server/Gateway Architecture Blueprint

Minimum modules:

  1. LifecycleManager
  2. ConnectionRegistry
  3. AdmissionController
  4. DeadlineManager
  5. RouteTable
  6. ProtocolParser
  7. RequestDispatcher
  8. ResponseWriter
  9. DownstreamClient
  10. OverloadPolicy
  11. DrainManager
  12. ServerTelemetry

28. Implementation Checklist

Before calling a Java server/gateway production-ready, verify:

  • listen address explicit, not accidentally 0.0.0.0 when it should be localhost;
  • backlog configured based on traffic profile;
  • accepted socket options documented;
  • max active connections enforced;
  • max active requests enforced;
  • header/body size limits enforced;
  • read timeout and request deadline enforced;
  • write queue bounded;
  • slow client behavior tested;
  • graceful shutdown has drain budget;
  • readiness fails during drain;
  • liveness does not flap during dependency outage;
  • overload returns deterministic signal;
  • gateway route policy is explicit;
  • retry policy is idempotency-aware;
  • upstream timeout smaller than caller deadline;
  • client abort cancels downstream work;
  • structured logs include close/failure reason;
  • metrics do not contain unbounded label cardinality;
  • packet-level debugging plan exists;
  • load/chaos tests cover DNS/TCP/TLS/slow body/client abort.

29. Common Production Incidents and Root Causes

| Incident | Likely root cause | Fix direction |
| --- | --- | --- |
| rolling deploy causes 5xx spike | readiness not changed before shutdown | drain sequence |
| latency climbs before crash | unbounded worker/write queue | admission + bounded queues |
| memory grows with slow clients | response buffering per connection | write queue cap + disconnect |
| gateway retries cause outage | retry storm | retry budget + idempotency policy |
| clients see random reset | force close during active request | graceful drain + close semantics |
| only IPv6 clients fail | bind/address selection issue | dual-stack test matrix |
| TLS errors after cert rotation | missing trust/SNI/chain update | TLS diagnostic playbook |
| high CPU with low throughput | tiny writes/syscall amplification | batching + buffer tuning |
| server “healthy” but unusable | health check too shallow | readiness based on capacity |
| thread count fine but server stuck | downstream pool exhausted | bulkhead + deadline propagation |

30. Deliberate Practice Drills

Drill 1 — Graceful Shutdown

Build a blocking socket or simple HTTP server that:

  • marks not ready on shutdown;
  • stops accepting new work;
  • drains active requests for 30 seconds;
  • force closes remaining work;
  • logs drain summary.

Success criteria:

  • new requests rejected after drain begins;
  • existing short requests complete;
  • long requests cancelled after budget;
  • process exits cleanly.

Drill 2 — Slow Client Defense

Create a test client that sends one byte per second.

Server must:

  • reject slow header;
  • cap body size;
  • not grow memory;
  • not occupy all worker capacity.

Drill 3 — Gateway Streaming

Build a gateway that streams a 1 GiB file from upstream to client using bounded memory.

Success criteria:

  • heap stable;
  • client abort cancels upstream;
  • deadline abort works;
  • no full-body buffering.

Drill 4 — Overload Experiment

Run load test above capacity.

Success criteria:

  • server rejects early;
  • p99 does not grow without bound;
  • memory remains stable;
  • error ratio is explicit and explained;
  • recovery after load stops is fast.

31. What Top 1% Engineers Notice

A strong engineer does not only ask “how many RPS can this server handle?” They ask:

  • what happens when accepted connection exceeds worker capacity?
  • what is the maximum memory per connection?
  • can a single slow reader pin response buffers?
  • does shutdown stop new work before closing old work?
  • what error does a client see during overload?
  • can retry amplify a dependency outage?
  • can request body be replayed safely?
  • can the gateway stream without full buffering?
  • are route policies safe from user-controlled upstream URLs?
  • can we tell DNS failure from TCP timeout from TLS failure?
  • how do we prove the server recovered after chaos?

This is the difference between server code and server engineering.


32. Summary

Production-grade Java network server/gateway design is about controlling boundaries:

  • listen boundary controls who can connect;
  • admission boundary controls who can consume capacity;
  • protocol boundary controls byte-to-message correctness;
  • execution boundary controls concurrency and deadline;
  • write boundary controls backpressure;
  • gateway boundary controls routing, retry, and egress;
  • shutdown boundary controls deploy safety;
  • observability boundary controls diagnosability.

The server is not complete when it can answer requests. It is complete when it can survive invalid traffic, slow clients, dependency failure, overload, deployment, and partial network collapse without becoming the next failure amplifier.

