Keystores, KMS, HSM, and Secrets
Learn Java Security, Cryptography, Integrity and Platform Hardening - Part 018
Java keystores, truststores, KMS, HSM, PKCS#11, secret zero, envelope encryption, key rotation, and production-grade secrets lifecycle design.
1. Why This Part Matters
Crypto failures are rarely caused by AES being mathematically broken. They are usually caused by bad key and secret lifecycle design:
- Keys are stored next to encrypted data.
- Long-lived credentials are copied into source code, CI logs, container images, or environment dumps.
- One key protects too many domains.
- Rotation is theoretically supported but operationally untested.
- Applications cannot distinguish current, previous, disabled, compromised, and retired keys.
- Secrets are readable by too many humans and workloads.
- Disaster recovery depends on undocumented key material.
- HSM/KMS is added as a product checkbox without a threat model.
This part builds the mental model for Java key material and secrets handling.
Cryptography protects data only if key material has a stronger lifecycle than the data it protects.
2. Kaufman Skill Decomposition
Break the skill into practiceable subskills:
| Subskill | You can self-correct when you can answer |
|---|---|
| Secret taxonomy | Is this value a password, token, private key, symmetric key, API key, certificate, or trust anchor? |
| Java KeyStore model | Which KeyStore.Entry type holds this material? |
| Keystore vs truststore | Is this file proving local identity or deciding remote trust? |
| KMS/HSM boundary | Does the key leave the boundary, or is crypto performed inside it? |
| Envelope encryption | Which key encrypts data, and which key wraps the data key? |
| Secret zero | How does the app authenticate to retrieve its first secret? |
| Rotation | Can the system encrypt with the new key while decrypting old data? |
| Compromise response | What is the blast radius if this secret leaks? |
| Auditability | Can we prove who accessed, rotated, or used key material? |
| Recovery | Can we restore service without weakening key custody? |
3. Secret and Key Taxonomy
Do not treat all secrets as the same thing.
| Type | Example | Primary risk | Typical control |
|---|---|---|---|
| Password | Database password | Replay by anyone who knows it | Secrets manager, rotation, least privilege |
| API key | Third-party integration key | Unauthorized API use | Scoped key, rate limit, rotation |
| Bearer token | OAuth access token | Immediate impersonation | Short TTL, audience binding, refresh discipline |
| Refresh token | Long-lived session continuation | Persistent account takeover | Rotation, revocation, device binding |
| Private key | TLS/server signing key | Identity impersonation | Keystore, HSM/KMS, file permission, rotation |
| Symmetric data key | AES-GCM key | Data decryption | Envelope encryption, key hierarchy |
| HMAC key | Request signing key | Forged messages | Key ID, domain separation, rotation |
| Trust anchor | Root/intermediate CA cert | Unauthorized trust expansion | Change control, minimal truststore |
| Certificate | Public identity assertion | Expiry/misissuance | PKI lifecycle, linting, monitoring |
| Seed/pepper | Password hashing pepper | Large-scale credential compromise if leaked | KMS/HSM, split access, rotation plan |
A value becomes a secret if possession changes the security state.
4. Java KeyStore Mental Model
java.security.KeyStore represents a storage facility for cryptographic keys and certificates. It manages entries such as:
- PrivateKeyEntry: private key plus certificate chain.
- SecretKeyEntry: symmetric secret key.
- TrustedCertificateEntry: trusted public certificate.
Every Java implementation must support PKCS12 as a standard keystore type. KeyStore.getDefaultType() returns the configured default keystore type, or pkcs12 if no security property overrides it.
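To make the entry model concrete, here is a minimal sketch that builds a throwaway in-memory PKCS12 store, adds one AES key, and classifies aliases with KeyStore.entryInstanceOf. The class name EntryKinds, the helper demoStore, and the alias and password used with it are hypothetical and exist only for illustration.

```java
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.security.KeyStoreException;
import javax.crypto.KeyGenerator;

final class EntryKinds {
    private EntryKinds() {}

    /** Classifies an alias by the KeyStore.Entry type stored under it. */
    static String kindOf(KeyStore store, String alias) {
        try {
            if (store.entryInstanceOf(alias, KeyStore.PrivateKeyEntry.class)) return "PRIVATE_KEY";
            if (store.entryInstanceOf(alias, KeyStore.SecretKeyEntry.class)) return "SECRET_KEY";
            if (store.entryInstanceOf(alias, KeyStore.TrustedCertificateEntry.class)) return "TRUSTED_CERT";
            return "MISSING"; // entryInstanceOf returns false for unknown aliases
        } catch (KeyStoreException e) {
            throw new IllegalStateException("Keystore was not loaded", e);
        }
    }

    /** Demo helper (hypothetical): in-memory PKCS12 store holding one fresh AES-256 key. */
    static KeyStore demoStore(String alias, char[] entryPassword) {
        try {
            KeyStore store = KeyStore.getInstance("PKCS12");
            store.load(null, null); // a null stream initialises an empty store
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            store.setEntry(alias,
                    new KeyStore.SecretKeyEntry(generator.generateKey()),
                    new KeyStore.PasswordProtection(entryPassword));
            return store;
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling kindOf for every alias at startup is one way to reject a store whose contents do not match what the service expects.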
5. Keystore vs Truststore
The difference is usage, not necessarily file format.
| Store | Contains | Used for | Example |
|---|---|---|---|
| Keystore | Local private key + certificate chain, or secret keys | Proving local identity / holding key material | Server TLS keypair, client mTLS keypair, signing key |
| Truststore | Trusted certificates / CA roots | Deciding whether remote identity is acceptable | Client trust of server CA, server trust of client CA |
The same KeyStore API may load both. The security meaning comes from how the material is used.
Bad naming pattern:
security.p12 contains server key, client key, public CA roots, partner CA roots, and unrelated test certificates.
Better pattern:
server-identity.p12
client-payment-gateway-identity.p12
trust-payment-gateway-server-ca.p12
trust-internal-workload-client-ca.p12
Naming should expose the trust boundary.
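Naming alone does not stop key material from drifting into a truststore. As a sketch of one possible guard, the check below fails when a store that is supposed to hold only trusted certificates contains any key entry. StoreRoles, requireTrustOnly, and demoStore are hypothetical names, and the all-zero AES key is a placeholder for the demo only.

```java
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.security.KeyStoreException;
import java.util.Collections;
import javax.crypto.spec.SecretKeySpec;

final class StoreRoles {
    private StoreRoles() {}

    /** Throws if a store meant to be a truststore carries private or secret keys. */
    static void requireTrustOnly(KeyStore store) {
        try {
            for (String alias : Collections.list(store.aliases())) {
                if (store.isKeyEntry(alias)) {
                    throw new IllegalStateException("Truststore holds key material under alias: " + alias);
                }
            }
        } catch (KeyStoreException e) {
            throw new IllegalStateException("Keystore was not loaded", e);
        }
    }

    /** Demo helper (hypothetical): empty in-memory store, optionally polluted with a key entry. */
    static KeyStore demoStore(boolean withKeyEntry) {
        try {
            KeyStore store = KeyStore.getInstance("PKCS12");
            store.load(null, null);
            if (withKeyEntry) {
                store.setEntry("stray-key",
                        new KeyStore.SecretKeyEntry(new SecretKeySpec(new byte[32], "AES")),
                        new KeyStore.PasswordProtection("demo".toCharArray()));
            }
            return store;
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```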
6. Key Lifecycle State Machine
Treat keys as stateful operational entities.
Key states should be represented in metadata, not tribal knowledge.
Minimum key metadata:
| Field | Purpose |
|---|---|
| keyId | Identifies the key used for encryption/signing/verifying. |
| algorithm | Prevents ambiguous interpretation. |
| purpose | Encryption, HMAC, signing, TLS, wrapping, password pepper, etc. |
| scope | Tenant, service, environment, data class, or boundary. |
| state | Active, decrypt-only, retired, suspended, revoked. |
| createdAt | Lifecycle audit. |
| activatedAt | Rotation audit. |
| expiresAt | Planned rotation/retirement. |
| owner | Operational accountability. |
| protection | File, KMS, HSM, Vault transit, PKCS#11 token. |
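The state field is most useful when transitions are enforced in code. The enum below is a sketch of one possible transition policy over the five states listed in the table. The allowed transitions and the rule that only active and decrypt-only keys may decrypt are assumptions, not something the lifecycle model mandates, and LifecycleState is a hypothetical name.

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

enum LifecycleState {
    ACTIVE, DECRYPT_ONLY, SUSPENDED, RETIRED, REVOKED;

    // Assumed policy: RETIRED and REVOKED are terminal, SUSPENDED can be resumed.
    private static final Map<LifecycleState, Set<LifecycleState>> ALLOWED = Map.of(
            ACTIVE, EnumSet.of(DECRYPT_ONLY, SUSPENDED, REVOKED),
            DECRYPT_ONLY, EnumSet.of(RETIRED, SUSPENDED, REVOKED),
            SUSPENDED, EnumSet.of(ACTIVE, DECRYPT_ONLY, REVOKED),
            RETIRED, EnumSet.noneOf(LifecycleState.class),
            REVOKED, EnumSet.noneOf(LifecycleState.class));

    /** True if the assumed policy permits moving from this state to next. */
    boolean canTransitionTo(LifecycleState next) {
        return ALLOWED.get(this).contains(next);
    }

    /** Only the active key may produce new ciphertext or signatures. */
    boolean mayEncrypt() {
        return this == ACTIVE;
    }

    /** Assumption: suspended, retired, and revoked keys all refuse decryption. */
    boolean mayDecrypt() {
        return this == ACTIVE || this == DECRYPT_ONLY;
    }
}
```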
7. Envelope Encryption
Envelope encryption separates data encryption from key encryption.
Why this pattern is useful:
- Large data does not need to pass through KMS/HSM.
- Each object can use a unique DEK.
- The KEK can stay inside KMS/HSM.
- Rotation can rewrap DEKs without re-encrypting all data.
- Metadata can support algorithm agility.
Envelope example:
```json
{
  "version": 1,
  "algorithm": "AES-256-GCM",
  "keyEncryptionKeyId": "kms://prod/case-data/kek-2026-06",
  "encryptedDataKey": "base64...",
  "nonce": "base64...",
  "aad": "case-data:v1:tenant-42",
  "ciphertext": "base64..."
}
```
Invariant:
The envelope must carry enough metadata to decrypt safely without guessing algorithm, key, nonce, or domain context.
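The pattern can be sketched with plain JCA calls. In this minimal example the KEK is a local AES key standing in for a KMS/HSM key, which is an assumption made so the code runs anywhere; in production the wrap and unwrap steps would happen inside the KMS/HSM boundary. EnvelopeSketch and its Envelope record are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

final class EnvelopeSketch {
    record Envelope(String kekId, byte[] wrappedDek, byte[] nonce, String aad, byte[] ciphertext) {}

    private static final SecureRandom RANDOM = new SecureRandom();

    private EnvelopeSketch() {}

    static SecretKey newAesKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Encrypts plaintext under a fresh DEK, then wraps the DEK with the KEK. */
    static Envelope seal(String kekId, SecretKey kek, String aad, byte[] plaintext) {
        try {
            SecretKey dek = newAesKey(); // unique DEK per object
            byte[] nonce = new byte[12];
            RANDOM.nextBytes(nonce);
            Cipher gcm = Cipher.getInstance("AES/GCM/NoPadding");
            gcm.init(Cipher.ENCRYPT_MODE, dek, new GCMParameterSpec(128, nonce));
            gcm.updateAAD(aad.getBytes(StandardCharsets.UTF_8));
            byte[] ciphertext = gcm.doFinal(plaintext); // GCM tag is appended
            Cipher wrapper = Cipher.getInstance("AESWrap");
            wrapper.init(Cipher.WRAP_MODE, kek);
            return new Envelope(kekId, wrapper.wrap(dek), nonce, aad, ciphertext);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Unwraps the DEK and decrypts; any tampering or wrong AAD fails closed. */
    static byte[] open(Envelope envelope, SecretKey kek) {
        try {
            Cipher unwrapper = Cipher.getInstance("AESWrap");
            unwrapper.init(Cipher.UNWRAP_MODE, kek);
            SecretKey dek = (SecretKey) unwrapper.unwrap(envelope.wrappedDek(), "AES", Cipher.SECRET_KEY);
            Cipher gcm = Cipher.getInstance("AES/GCM/NoPadding");
            gcm.init(Cipher.DECRYPT_MODE, dek, new GCMParameterSpec(128, envelope.nonce()));
            gcm.updateAAD(envelope.aad().getBytes(StandardCharsets.UTF_8));
            return gcm.doFinal(envelope.ciphertext());
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Envelope could not be opened", e);
        }
    }
}
```

Rotating the KEK in this model only requires unwrapping and rewrapping wrappedDek; the bulk ciphertext stays untouched.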
8. KMS, HSM, Vault, and PKCS#11
These tools solve related but different problems.
| Option | Main idea | Key exposure model | Common Java integration |
|---|---|---|---|
| File keystore | Key material stored in local encrypted file | Key loaded into JVM memory | KeyStore, KeyManagerFactory, TrustManagerFactory |
| Cloud KMS | Managed key service with policy/audit | KEK usually does not leave service | Cloud SDK, envelope encryption client |
| HSM | Dedicated hardware boundary for key operations | Private key often non-exportable | PKCS#11 provider, vendor SDK |
| Vault transit | Crypto-as-a-service; Vault holds keys | App sends data/DEK to Vault API | HTTP client, Vault SDK, sidecar |
| PKCS#11 token | Standard interface to native crypto token | Token-dependent, often non-exportable | SunPKCS11 provider via JCA/JCE |
Java's SunPKCS11 provider lets applications use standard JCA/JCE APIs to access native PKCS#11 libraries when configured. The provider itself is a conduit to the underlying native PKCS#11 implementation; availability of algorithms depends on that implementation.
8.1 KMS Is Not Magic
A KMS improves key custody, auditability, access control, and rotation primitives. It does not automatically solve:
- Bad access policy.
- Overbroad IAM role assignment.
- Missing encryption context/AAD.
- Application-level authorization bugs.
- Data classification errors.
- Logging decrypted plaintext.
- Caching plaintext keys forever.
8.2 HSM Is Not Always Better
HSMs are valuable when you need strong non-exportability, compliance boundaries, signing key protection, root CA protection, or high-assurance key custody.
But HSMs add:
- Operational complexity.
- Latency.
- Throughput limits.
- Vendor-specific behavior.
- Disaster recovery complexity.
- Harder local testing.
Use HSM where the threat model justifies it.
9. The Secret Zero Problem
Secret zero is the first credential an application uses to obtain all other secrets.
Examples:
- A Kubernetes service account token used to authenticate to a secrets manager.
- A cloud workload identity used to call KMS.
- A VM instance identity document.
- A sidecar-issued token.
- A bootstrap certificate.
Bad secret zero:
APP_SECRET_MANAGER_PASSWORD=super-secret in container image
Better secret zero:
Workload identity issued by platform -> short-lived token -> secrets manager policy -> scoped secret access
Secret zero invariant:
Bootstrap identity must be short-lived, non-human, environment-scoped, auditable, and harder to exfiltrate than the secrets it unlocks.
10. Loading Keystores in Java
A safe keystore loader should make format and boundary explicit.
```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.util.Arrays;

public final class KeyStores {
    private KeyStores() {}

    /**
     * Loads a PKCS12 keystore from disk. The caller's password array is
     * zeroed once the load attempt finishes, successful or not.
     */
    public static KeyStore loadPkcs12(Path path, char[] password)
            throws IOException, GeneralSecurityException {
        KeyStore keyStore = KeyStore.getInstance("PKCS12");
        try (InputStream input = Files.newInputStream(path)) {
            keyStore.load(input, password);
            return keyStore;
        } finally {
            // Reduces lifetime of password copy. It does not erase copies made internally.
            if (password != null) {
                Arrays.fill(password, '\0');
            }
        }
    }
}
```
Important caveats:
- char[] is preferable to String for passwords because it can be overwritten, but this is not perfect memory safety.
- The JVM, libraries, heap dumps, logs, and crash reports can still expose secrets.
- Never log keystore passwords, aliases that reveal sensitive topology, or raw key material.
- Validate file permissions before loading local key material.
- Avoid storing production keystores in application source repositories.
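The file-permission caveat can be checked in code before a keystore is opened. The sketch below treats a file as acceptable only when nothing beyond owner read/write is set. It assumes a POSIX file system (Files.getPosixFilePermissions throws UnsupportedOperationException elsewhere), and KeyFilePermissions, isOwnerOnly, and tempFileWith are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.EnumSet;
import java.util.Set;

final class KeyFilePermissions {
    private static final Set<PosixFilePermission> OWNER_READ_WRITE = EnumSet.of(
            PosixFilePermission.OWNER_READ, PosixFilePermission.OWNER_WRITE);

    private KeyFilePermissions() {}

    /** True when the file grants nothing beyond owner read/write. POSIX file systems only. */
    static boolean isOwnerOnly(Path path) {
        try {
            return OWNER_READ_WRITE.containsAll(Files.getPosixFilePermissions(path));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo helper (hypothetical): temp file with the given mode string, e.g. "rw-------". */
    static Path tempFileWith(String mode) {
        try {
            Path file = Files.createTempFile("demo-keystore", ".p12");
            Files.setPosixFilePermissions(file, PosixFilePermissions.fromString(mode));
            file.toFile().deleteOnExit();
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A loader could call isOwnerOnly first and refuse to start when it returns false, so a misconfigured deployment fails loudly instead of running with a world-readable key file.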
11. Extracting a SecretKey
```java
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import javax.crypto.SecretKey;

public final class SecretKeys {
    private SecretKeys() {}

    /** Returns the secret key under alias, or fails if the alias is missing or holds another entry type. */
    public static SecretKey getSecretKey(
            KeyStore keyStore,
            String alias,
            char[] entryPassword
    ) throws GeneralSecurityException {
        KeyStore.ProtectionParameter protection =
                new KeyStore.PasswordProtection(entryPassword);
        KeyStore.Entry entry = keyStore.getEntry(alias, protection);
        // getEntry returns null for an unknown alias, which also fails this check.
        if (!(entry instanceof KeyStore.SecretKeyEntry secretEntry)) {
            throw new IllegalStateException("Alias does not contain a SecretKeyEntry: " + alias);
        }
        return secretEntry.getSecretKey();
    }
}
```
Review questions:
- Who can read the keystore file?
- Who can read the keystore password?
- Is this key exportable from the JVM?
- Is this alias environment-specific?
- Is the key purpose encoded in metadata?
- Does the ciphertext store the keyId/alias used?
12. Rotation Pattern: Encrypt New, Decrypt Old
For data encryption, rotation should avoid breaking old data.
Java-style interface:
```java
public interface KeyRegistry {
    CryptoKey activeEncryptionKey(String purpose, String scope);
    CryptoKey decryptionKey(String keyId);
}

public record CryptoKey(
        String keyId,
        String algorithm,
        String purpose,
        String scope,
        KeyState state,
        Object handle
) {}

public enum KeyState {
    ACTIVE_ENCRYPT_DECRYPT,
    ACTIVE_DECRYPT_ONLY,
    RETIRED,
    SUSPENDED,
    REVOKED
}
```
Encryption invariant:
New ciphertext MUST use exactly one active encrypt key for its purpose/scope.
Decryption invariant:
Old ciphertext MAY use decrypt-only keys until retention/migration completes.
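Both invariants can be enforced by the registry itself. The in-memory class below is a sketch rather than a production registry. It declares its own trimmed-down State and Key types so the snippet compiles by itself, and the choice to refuse decryption for retired, suspended, and revoked keys is an assumption. All names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class InMemoryKeyRegistry {
    enum State { ACTIVE_ENCRYPT_DECRYPT, ACTIVE_DECRYPT_ONLY, RETIRED, SUSPENDED, REVOKED }

    record Key(String keyId, String purpose, String scope, State state) {}

    private final Map<String, Key> keysById = new LinkedHashMap<>();

    void put(Key key) {
        keysById.put(key.keyId(), key);
    }

    /** Encryption invariant: exactly one active key per purpose/scope, else fail closed. */
    Key activeEncryptionKey(String purpose, String scope) {
        List<Key> active = keysById.values().stream()
                .filter(k -> k.state() == State.ACTIVE_ENCRYPT_DECRYPT)
                .filter(k -> k.purpose().equals(purpose) && k.scope().equals(scope))
                .toList();
        if (active.size() != 1) {
            throw new IllegalStateException("Expected exactly one active key, found " + active.size());
        }
        return active.get(0);
    }

    /** Decryption invariant: active and decrypt-only keys resolve, everything else is refused. */
    Key decryptionKey(String keyId) {
        Key key = keysById.get(keyId);
        if (key == null) {
            throw new IllegalStateException("Unknown key ID");
        }
        if (key.state() != State.ACTIVE_ENCRYPT_DECRYPT && key.state() != State.ACTIVE_DECRYPT_ONLY) {
            throw new IllegalStateException("Key is not usable for decryption: " + key.state());
        }
        return key;
    }
}
```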
13. Versioned Envelope Design
A robust ciphertext record should include key metadata.
```java
public record EncryptedValue(
        int version,
        String algorithm,
        String keyId,
        byte[] nonce,
        byte[] encryptedDataKey,
        byte[] ciphertext,
        byte[] tag,
        String aad
) {}
```
Do not design ciphertext as just byte[] with hidden assumptions. Hidden assumptions become migration blockers.
Minimum envelope fields:
- Version.
- Algorithm.
- Key ID.
- Nonce/IV.
- Ciphertext.
- Authentication tag, if not appended.
- AAD/domain context.
- Optional encrypted DEK.
- Creation timestamp.
- Compression flag, if compression is used before encryption.
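Field presence is easiest to guarantee at construction time. The record below is a reduced sketch that validates a subset of the fields in its compact constructor, so a malformed envelope never reaches the decryption path. The supported version, the single allowed algorithm name, the 12-byte nonce, and the 16-byte minimum ciphertext (room for an appended GCM tag) are assumptions for illustration, and ValidatedEnvelope is a hypothetical name.

```java
import java.util.Objects;
import java.util.Set;

record ValidatedEnvelope(int version, String algorithm, String keyId,
                         byte[] nonce, byte[] ciphertext, String aad) {

    // Assumed allow-list; extend deliberately when adding algorithms.
    private static final Set<String> SUPPORTED_ALGORITHMS = Set.of("AES-256-GCM");

    ValidatedEnvelope {
        if (version != 1) {
            throw new IllegalArgumentException("Unsupported envelope version: " + version);
        }
        if (algorithm == null || !SUPPORTED_ALGORITHMS.contains(algorithm)) {
            throw new IllegalArgumentException("Unsupported algorithm: " + algorithm);
        }
        if (keyId == null || keyId.isBlank()) {
            throw new IllegalArgumentException("Missing key ID");
        }
        if (nonce == null || nonce.length != 12) {
            throw new IllegalArgumentException("Nonce must be 12 bytes");
        }
        if (ciphertext == null || ciphertext.length < 16) {
            throw new IllegalArgumentException("Ciphertext is shorter than a GCM tag");
        }
        Objects.requireNonNull(aad, "aad");
        // Defensive copies so later changes to the caller's arrays cannot alter the record.
        nonce = nonce.clone();
        ciphertext = ciphertext.clone();
    }
}
```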
14. Secrets in Containers and CI/CD
Secrets often leak before the application starts.
Danger zones:
| Zone | Failure mode |
|---|---|
| Source repository | Hardcoded credentials. |
| Build logs | Secret printed by test/config output. |
| Dockerfile | ENV SECRET=... baked into image layer. |
| Container image | Keystore copied into immutable artifact. |
| Kubernetes Secret | Base64 mistaken for encryption. |
| Environment variables | Visible in process metadata, dumps, debug tooling. |
| CI artifacts | Test reports include config. |
| Crash dumps | Heap contains plaintext credentials. |
| Metrics/tags | Secret accidentally used as label. |
| Logs | Tokens included in request/response dump. |
Better controls:
- Use workload identity instead of static cloud credentials.
- Inject secrets at runtime, not build time.
- Keep secret access scoped per service and environment.
- Use short-lived credentials where possible.
- Apply secret scanning in source and CI.
- Redact known secret formats in logs.
- Disable or protect heap dumps for sensitive services.
- Rotate immediately after suspected exposure.
OWASP's Secrets Management guidance emphasizes lifecycle concerns such as creation, storage, access, rotation, detection, and incident response. Rotation should be a safe multi-step transition, not a single destructive overwrite.
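As an illustration of the "redact known secret formats in logs" control, the sketch below masks bearer tokens and simple key=value credentials before a message is written. The two patterns are illustrative assumptions and will miss many real formats, so treat this as a last line of defence, not a substitute for keeping secrets out of log calls. LogRedactor is a hypothetical name.

```java
import java.util.regex.Pattern;

final class LogRedactor {
    // Illustrative patterns only; tune them to the secret formats your systems actually use.
    private static final Pattern BEARER =
            Pattern.compile("(?i)(bearer\\s+)[A-Za-z0-9._~+/=-]+");
    private static final Pattern KEY_VALUE =
            Pattern.compile("(?i)\\b(password|secret|token|api[_-]?key)(\\s*[=:]\\s*)[^\\s,;&]+");

    private LogRedactor() {}

    /** Replaces the secret part of each match and keeps the surrounding text. */
    static String redact(String message) {
        String withoutBearer = BEARER.matcher(message).replaceAll("$1[REDACTED]");
        return KEY_VALUE.matcher(withoutBearer).replaceAll("$1$2[REDACTED]");
    }
}
```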
15. Access Control for Secrets
Secret policy should be least privilege and purpose-specific.
Bad policy:
service-* can read secret/* in prod
Better policy:
case-service-prod can read:
/prod/case-service/db/readonly
/prod/case-service/jwt-verification-public-set
/prod/case-service/payment-client-cert
case-service-prod cannot read:
/prod/payment-service/db/admin
/prod/root-ca/private-key
/dev/*
Policy dimensions:
- Service/workload identity.
- Environment.
- Tenant or business domain.
- Secret purpose.
- Read vs write vs rotate vs destroy.
- Human vs workload access.
- Break-glass conditions.
- Audit and alerting.
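A deny-by-default check is the simplest way to express the better policy in code. The sketch below models only the read dimension, with exact paths and no wildcards; real secrets managers have their own policy languages, so SecretPolicy and canRead are hypothetical names used to show the shape of the decision.

```java
import java.util.List;
import java.util.Map;

final class SecretPolicy {
    // Workload identity -> exact secret paths it may read. No wildcards by design.
    private final Map<String, List<String>> readablePaths;

    SecretPolicy(Map<String, List<String>> readablePaths) {
        this.readablePaths = Map.copyOf(readablePaths);
    }

    /** Deny by default: unknown workloads and unlisted paths are refused. */
    boolean canRead(String workload, String secretPath) {
        return readablePaths.getOrDefault(workload, List.of()).contains(secretPath);
    }
}
```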
16. Break-Glass Design
Break-glass access is emergency access outside normal paths.
It must be designed before incidents.
Minimum design:
- Requires strong human identity and MFA.
- Requires ticket/incident reference.
- Is time-limited.
- Is heavily audited.
- Alerts security/on-call.
- Exposes only needed secrets/actions.
- Triggers post-incident review.
- Has replayable evidence for regulators/auditors.
Break-glass should not mean "someone has a copy of prod root keys in a password manager forever."
17. HSM / PKCS#11 in Java
When using PKCS#11, Java code may still use JCA/JCE APIs, but operations are routed to the token/provider.
Conceptual flow: application code -> JCA/JCE API -> SunPKCS11 provider -> native PKCS#11 library -> token/HSM.
Configuration is provider/token-specific, but the review questions are stable:
- Are private keys non-exportable?
- Which operations are allowed by the key policy?
- What is the authentication mechanism to the token?
- How are PINs or token credentials protected?
- What is the throughput/latency limit?
- Is high availability configured?
- How are backups and disaster recovery handled?
- Is audit logging tamper-resistant?
- How is key destruction verified?
- How are test/staging HSMs separated from production?
18. KMS Envelope Encryption Facade
Keep KMS-specific logic behind a small boundary.
```java
public interface KeyWrappingService {
    WrappedDataKey generateDataKey(String keyEncryptionKeyId, byte[] encryptionContext);
    byte[] decryptDataKey(WrappedDataKey wrappedDataKey, byte[] encryptionContext);
}

public record WrappedDataKey(
        String keyEncryptionKeyId,
        String wrappingAlgorithm,
        byte[] plaintextDataKey,
        byte[] encryptedDataKey
) {}
```
But be careful with plaintextDataKey lifetime:
- Keep it in memory only as long as needed.
- Avoid logging or serializing it.
- Prefer byte arrays that can be overwritten when feasible.
- Be aware that Java memory management does not guarantee perfect zeroization.
- Consider whether high-sensitivity keys should never enter JVM memory.
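One way to keep the plaintext key's lifetime short is to scope its use to a callback and overwrite the byte array afterwards. The helper below is a sketch under that assumption; DataKeyScope and withDataKey are hypothetical names, and as the caveats above note, the wipe covers only the array passed in.

```java
import java.util.Arrays;
import java.util.function.Function;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

final class DataKeyScope {
    private DataKeyScope() {}

    /** Runs action with an AES key built from rawKey, then zeroes rawKey. */
    static <T> T withDataKey(byte[] rawKey, Function<SecretKey, T> action) {
        try {
            // SecretKeySpec keeps its own copy of the bytes, which this helper cannot wipe.
            return action.apply(new SecretKeySpec(rawKey, "AES"));
        } finally {
            Arrays.fill(rawKey, (byte) 0);
        }
    }
}
```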
19. Secret Caching
Secret caching improves reliability and latency, but it changes the security model.
| Cache choice | Benefit | Risk |
|---|---|---|
| No cache | Fresh policy/secret every time | Latency and dependency on secrets service. |
| Short TTL cache | Lower latency, tolerates brief outage | Revocation delay. |
| Long TTL cache | High availability | Large compromise window. |
| Persistent local cache | Survives restart | Secret-at-rest problem on app host. |
Cache invariant:
Cache TTL must be shorter than the organization's acceptable revocation delay for that secret class.
For high-risk secrets, prefer short TTL, explicit refresh, and audit logs.
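A short-TTL cache can be sketched in a few lines. The version below reloads the secret once the TTL has elapsed, wipes the stale copy, and hands callers a clone. The time source is injected as a Supplier of Instant so expiry can be tested without sleeping. TtlSecretCache and its constructor parameters are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Arrays;
import java.util.function.Supplier;

final class TtlSecretCache {
    private final Supplier<char[]> loader;   // fetches the secret from its source
    private final Duration ttl;
    private final Supplier<Instant> now;     // injectable time source
    private char[] value;
    private Instant loadedAt;

    TtlSecretCache(Supplier<char[]> loader, Duration ttl, Supplier<Instant> now) {
        this.loader = loader;
        this.ttl = ttl;
        this.now = now;
    }

    /** Returns a copy of the cached secret, reloading once the TTL has elapsed. */
    synchronized char[] get() {
        Instant current = now.get();
        if (value == null || !current.isBefore(loadedAt.plus(ttl))) {
            if (value != null) {
                Arrays.fill(value, '\0'); // wipe the stale copy before replacing it
            }
            value = loader.get();
            loadedAt = current;
        }
        return value.clone(); // callers own, and should wipe, their copy
    }
}
```

In production the time source would typically be Instant::now, and the TTL would be chosen per secret class according to the cache invariant.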
20. Compromise Response
When a secret leaks, the response depends on what the secret can do.
Response questions:
- Was it a read credential, write credential, signing key, or decrypt key?
- Can it be revoked immediately without outage?
- What data or actions could be affected?
- Are there logs proving use/non-use?
- Do downstream systems need notification?
- Are derived tokens or sessions still valid?
- Is re-encryption required?
- Is forensic retention required?
21. Testing Strategy
21.1 Unit Tests
Test metadata and state logic:
- Cannot encrypt with retired key.
- Cannot decrypt with revoked key.
- New ciphertext includes key ID and algorithm.
- Wrong AAD fails decryption.
- Unknown key ID fails closed.
- Missing envelope field fails closed.
21.2 Integration Tests
Test actual keystore/KMS behavior:
- Load PKCS12 keystore with expected alias.
- Fail on wrong password.
- Fail on missing alias.
- Validate certificate chain from keystore.
- Generate/decrypt data key with KMS test key.
- Verify IAM/workload policy denies unrelated secret.
- Rotate key and verify old ciphertext still decrypts.
21.3 Chaos / Drill Tests
- KMS unavailable.
- Secrets manager latency spike.
- Revoked key used for decryption.
- Expired client certificate.
- Missing truststore.
- Corrupted encrypted DEK.
- Wrong encryption context/AAD.
- Emergency break-glass access.
22. Production Readiness Checklist
Key Material
- Every key has owner, purpose, scope, state, and expiry.
- Production keys are separate from non-production keys.
- Key IDs are stored with ciphertext/signatures.
- Rotation path is tested.
- Compromise response is documented.
- Key access is audited.
Keystores and Truststores
- Keystore/truststore files are not in source control.
- File permissions are minimal.
- Stores are boundary-specific.
- Passwords are runtime-injected.
- Alias naming is consistent and non-ambiguous.
- Expiry is monitored for certificate entries.
Secrets
- No secrets are baked into images.
- No secrets appear in logs, metrics, traces, or crash dumps.
- Secret scanning runs in source and CI.
- Workload identity solves secret zero.
- Secret access policy is least privilege.
- Rotation is multi-step and reversible.
KMS/HSM
- KMS/HSM key policy is scoped.
- Encryption context/AAD is used where supported.
- Rate limits and latency are understood.
- DR/backup process is tested.
- Break-glass access is controlled and audited.
- Non-exportability requirements are explicit.
23. Anti-Patterns
| Anti-pattern | Consequence |
|---|---|
| Store encryption key beside encrypted data | Attacker gets both lock and key. |
| Use one global key for all tenants/data | Huge blast radius. |
| Rotate by overwriting old key | Old data becomes undecryptable. |
| No key ID in ciphertext | Decryption requires guessing. |
| Use environment variables for all high-value secrets | Easy exposure through process/debug tooling. |
| Copy production keystore into container image | Secret becomes part of artifact supply chain. |
| Treat KMS as authorization | Decryption allowed does not mean business action allowed. |
| Use HSM without HA/DR testing | Security control becomes availability risk. |
| Keep break-glass credentials permanently shared | Emergency access becomes shadow admin access. |
| Log decrypted payload for debugging | Crypto boundary is bypassed by observability. |
24. Practice Lab
Lab 1 — PKCS12 Keystore Exploration
- Create a PKCS12 keystore.
- Add a private key and certificate chain.
- Add a trusted certificate entry.
- Load it from Java.
- List aliases and entry types.
- Reject unexpected alias/type.
Lab 2 — Envelope Encryption
- Generate random DEK.
- Encrypt payload with AES-GCM.
- Simulate wrapping DEK with a fake KMS KEK.
- Store envelope metadata.
- Decrypt using key ID lookup.
- Change AAD and verify decryption fails.
Lab 3 — Rotation Drill
- Create key K1 and encrypt records.
- Create key K2 and mark as active.
- Encrypt new records with K2.
- Decrypt old records with K1.
- Lazy re-encrypt old records with K2.
- Mark K1 decrypt-only, then retired.
- Verify encrypt with K1 fails.
Lab 4 — Secret Leak Incident Tabletop
- Pick a leaked secret type.
- Determine blast radius.
- Revoke/rotate safely.
- Identify logs needed for misuse analysis.
- Document permanent fixes.
25. Key Takeaways
- Key and secret lifecycle design is usually more important than crypto algorithm selection after safe algorithms are chosen.
- Java KeyStore can hold private keys, secret keys, and trusted certificates; usage determines whether we call it a keystore or truststore.
- PKCS12 is the portable standard keystore type every Java implementation must support.
- KMS/HSM improves custody and auditability, but it does not replace application authorization or data classification.
- Envelope encryption is the default mental model for scalable data encryption.
- Rotation requires key states, metadata, and decrypt-old/encrypt-new behavior.
- Secret zero must be solved with platform identity, not static credentials baked into artifacts.
- Production readiness requires tests, drills, auditability, and compromise response.
References
- Oracle, Java SE 25, KeyStore API: https://docs.oracle.com/en/java/javase/25/docs/api/java.base/java/security/KeyStore.html
- Oracle, Java SE 25, JDK Providers Documentation: https://docs.oracle.com/en/java/javase/25/security/oracle-providers.html
- Oracle, Java SE 25, Java Cryptography Architecture Reference Guide: https://docs.oracle.com/en/java/javase/25/security/java-cryptography-architecture-jca-reference-guide.html
- OWASP Secrets Management Cheat Sheet: https://cheatsheetseries.owasp.org/cheatsheets/Secrets_Management_Cheat_Sheet.html
- AWS Encryption SDK for Java examples: https://docs.aws.amazon.com/encryption-sdk/latest/developer-guide/java-example-code.html
- HashiCorp Vault Transit Secrets Engine: https://developer.hashicorp.com/vault/docs/secrets/transit