Build CoreOrdered learning track

Learn Java Dsa Part 016 Binary Search Trees And Ordered Indexes

[]11 min read2196 words

In This Lesson

1. Kaufman Skill Target 2. The Mental Model: Ordered Partitioning 3. BST Invariant with Comparator

Lesson 1635 lesson track07–19 Build Core

title: Learn Java Data Structures and Algorithms - Part 016 description: Binary search tree invariants, ordered index operations, deletion, floor and ceiling, range scans, comparator contracts, TreeMap and TreeSet usage, validation, and Java production design. series: learn-java-dsa seriesTitle: Learn Java Data Structures and Algorithms order: 16 partTitle: Binary Search Trees and Ordered Indexes tags:

java
data-structures
algorithms
dsa
binary-search-tree
bst
treemap
treeset
ordered-index
comparator
series date: 2026-06-27

Part 016 — Binary Search Trees and Ordered Indexes

Part 015 covered binary tree traversal invariants.

Now we add one more invariant:

keys in the left subtree are smaller
keys in the right subtree are larger

That single invariant turns a shape into an index.

A binary search tree is not primarily a tree trick.

It is an ordered dictionary whose cost depends on height.

1. Kaufman Skill Target

After this part, you should be able to:

State the BST invariant precisely using a comparator.
Implement search, insert, delete, minimum, maximum, predecessor, successor, floor, ceiling, and range scan.
Explain why operation cost is O(h) and why h can degrade to n.
Validate a BST using range constraints instead of local child checks.
Design duplicate-key policies explicitly.
Use Java TreeMap, TreeSet, NavigableMap, and NavigableSet as production ordered indexes.
Avoid comparator and mutable-key bugs.
Decide when a hash index, sorted array, heap, or ordered tree is the better choice.

The practice objective:

Treat a BST as an ordered index with invariants, not as a recursive coding exercise.

2. The Mental Model: Ordered Partitioning

At every node, the key partitions the search space.

left subtree  < node.key < right subtree

The invariant is global, not local.

It is not enough that:

node.left.key < node.key
node.right.key > node.key

Every key in the entire left subtree must be smaller.

Every key in the entire right subtree must be larger.

3. BST Invariant with Comparator

For Java, ordering should be expressed with Comparator<? super K> or natural ordering.

Strict unique-key BST invariant:

For every node x:
all keys in x.left  compare < x.key
all keys in x.right compare > x.key

With comparator cmp:

cmp.compare(a, b) < 0
cmp.compare(a, b) == 0
cmp.compare(a, b) > 0

The tree's notion of key equality is usually:

cmp.compare(a, b) == 0

Not necessarily:

a.equals(b)

This distinction is critical for TreeMap and TreeSet.

4. Ordered Index Operations

An ordered index supports operations that a hash table does not naturally support:

Operation	Meaning
`get(k)`	Find exact key
`put(k, v)`	Insert or replace by key order
`remove(k)`	Delete exact key
`min()`	Smallest key
`max()`	Largest key
`floor(k)`	Greatest key `<= k`
`ceiling(k)`	Smallest key `>= k`
`lower(k)`	Greatest key `< k`
`higher(k)`	Smallest key `> k`
`range(lo, hi)`	Iterate keys in sorted interval
`predecessor(k)`	Previous key in sorted order
`successor(k)`	Next key in sorted order

If you need only exact lookup, a hash table is usually better.

If you need nearest-neighbor or sorted range queries, an ordered index becomes valuable.

5. Minimal BST Map Implementation

This implementation is intentionally unbalanced.

Part 017 covers balanced trees.

The purpose here is to understand the invariant and operations.

import java.util.Comparator;
import java.util.Objects;
import java.util.Optional;

public final class SimpleBstMap<K, V> {
    private final Comparator<? super K> comparator;
    private Node<K, V> root;
    private int size;

    private static final class Node<K, V> {
        K key;
        V value;
        Node<K, V> left;
        Node<K, V> right;

        Node(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    public SimpleBstMap(Comparator<? super K> comparator) {
        this.comparator = Objects.requireNonNull(comparator);
    }

    public int size() {
        return size;
    }

    private int compare(K a, K b) {
        return comparator.compare(a, b);
    }
}

Why not use K extends Comparable<K> only?

Because production indexes often need domain-specific ordering:

sort by deadline, then priority, then id
sort by timestamp, then sequence
sort by status rank, then creation time

A comparator makes the ordering explicit.

6. Search

Search follows the partition decision at each node.

public Optional<V> get(K key) {
    Objects.requireNonNull(key);

    Node<K, V> current = root;
    while (current != null) {
        int c = compare(key, current.key);
        if (c == 0) {
            return Optional.ofNullable(current.value);
        }
        current = c < 0 ? current.left : current.right;
    }

    return Optional.empty();
}

Loop invariant:

If key exists, it is inside the subtree rooted at current.

Each comparison discards one side of the current node.

Cost:

O(h)

where h is tree height.

7. Insertion

Insertion searches for the missing null slot where the key belongs.

If the key already exists under comparator equality, replace the value.

public Optional<V> put(K key, V value) {
    Objects.requireNonNull(key);

    if (root == null) {
        root = new Node<>(key, value);
        size++;
        return Optional.empty();
    }

    Node<K, V> current = root;
    while (true) {
        int c = compare(key, current.key);

        if (c == 0) {
            V old = current.value;
            current.key = key;
            current.value = value;
            return Optional.ofNullable(old);
        }

        if (c < 0) {
            if (current.left == null) {
                current.left = new Node<>(key, value);
                size++;
                return Optional.empty();
            }
            current = current.left;
        } else {
            if (current.right == null) {
                current.right = new Node<>(key, value);
                size++;
                return Optional.empty();
            }
            current = current.right;
        }
    }
}

Important design choice:

current.key = key;

When comparator equality says two keys are equal, replacing the stored key can be reasonable if the new key carries metadata not included in ordering.

But this can also surprise callers.

Production maps usually define clear semantics:

Comparator equality defines key identity.
Stored key may be the latest equal key or the first equal key depending on implementation.
Mutable key fields used by comparator are forbidden.

8. Minimum and Maximum

Minimum is found by walking left until null.

public Optional<K> minKey() {
    Node<K, V> node = minNode(root);
    return node == null ? Optional.empty() : Optional.of(node.key);
}

private static <K, V> Node<K, V> minNode(Node<K, V> node) {
    if (node == null) {
        return null;
    }
    while (node.left != null) {
        node = node.left;
    }
    return node;
}

Maximum is symmetric:

public Optional<K> maxKey() {
    Node<K, V> node = root;
    if (node == null) {
        return Optional.empty();
    }
    while (node.right != null) {
        node = node.right;
    }
    return Optional.of(node.key);
}

Invariant:

At each left step, all keys to the right are larger, so minimum remains in the left subtree if it exists.

9. Floor and Ceiling

9.1 Floor

floor(k) is the greatest key <= k.

Search rule:

If current key is equal, current is the floor.
If current key is greater than target, floor must be in the left subtree.
If current key is less than target, current is a candidate; try to find a better one on the right.

public Optional<K> floorKey(K key) {
    Objects.requireNonNull(key);

    Node<K, V> current = root;
    K candidate = null;

    while (current != null) {
        int c = compare(key, current.key);
        if (c == 0) {
            return Optional.of(current.key);
        }
        if (c < 0) {
            current = current.left;
        } else {
            candidate = current.key;
            current = current.right;
        }
    }

    return Optional.ofNullable(candidate);
}

Loop invariant:

candidate is the greatest key seen so far that is <= target.

9.2 Ceiling

ceiling(k) is the smallest key >= k.

public Optional<K> ceilingKey(K key) {
    Objects.requireNonNull(key);

    Node<K, V> current = root;
    K candidate = null;

    while (current != null) {
        int c = compare(key, current.key);
        if (c == 0) {
            return Optional.of(current.key);
        }
        if (c < 0) {
            candidate = current.key;
            current = current.left;
        } else {
            current = current.right;
        }
    }

    return Optional.ofNullable(candidate);
}

Loop invariant:

candidate is the smallest key seen so far that is >= target.

10. Inorder Traversal Produces Sorted Keys

A BST adds semantic meaning to inorder traversal.

public java.util.List<K> keysInOrder() {
    java.util.ArrayList<K> out = new java.util.ArrayList<>(size);
    inorder(root, out);
    return out;
}

private void inorder(Node<K, V> node, java.util.List<K> out) {
    if (node == null) {
        return;
    }
    inorder(node.left, out);
    out.add(node.key);
    inorder(node.right, out);
}

Correctness by induction:

Left subtree keys are sorted and all < node.key.
Right subtree keys are sorted and all > node.key.
Therefore left + node + right is sorted.

11. Range Scan

Range scan is the ordered-index operation that often justifies trees.

For half-open range [lo, hi):

public java.util.List<K> keysInRange(K loInclusive, K hiExclusive) {
    Objects.requireNonNull(loInclusive);
    Objects.requireNonNull(hiExclusive);

    java.util.ArrayList<K> out = new java.util.ArrayList<>();
    collectRange(root, loInclusive, hiExclusive, out);
    return out;
}

private void collectRange(Node<K, V> node, K lo, K hi, java.util.List<K> out) {
    if (node == null) {
        return;
    }

    int cmpLo = compare(node.key, lo);
    int cmpHi = compare(node.key, hi);

    if (cmpLo >= 0) {
        collectRange(node.left, lo, hi, out);
    }

    if (cmpLo >= 0 && cmpHi < 0) {
        out.add(node.key);
    }

    if (cmpHi < 0) {
        collectRange(node.right, lo, hi, out);
    }
}

Pruning invariants:

If node.key < lo, the entire left subtree is also < lo and can be skipped.
If node.key >= hi, the entire right subtree is also >= hi and can be skipped.

Cost:

O(h + m)

where m is the number of returned keys.

12. Deletion

Deletion is the most error-prone basic BST operation.

Cases:

Node has no children: remove it.
Node has one child: replace node with child.
Node has two children: replace node's key/value with successor, then delete successor.

Implementation:

public Optional<V> remove(K key) {
    Objects.requireNonNull(key);
    Removal<K, V> removal = remove(root, key);
    root = removal.node;
    if (removal.removed) {
        size--;
    }
    return removal.oldValue;
}

private record Removal<K, V>(Node<K, V> node, boolean removed, Optional<V> oldValue) {}

private Removal<K, V> remove(Node<K, V> node, K key) {
    if (node == null) {
        return new Removal<>(null, false, Optional.empty());
    }

    int c = compare(key, node.key);

    if (c < 0) {
        Removal<K, V> r = remove(node.left, key);
        node.left = r.node;
        return new Removal<>(node, r.removed, r.oldValue);
    }

    if (c > 0) {
        Removal<K, V> r = remove(node.right, key);
        node.right = r.node;
        return new Removal<>(node, r.removed, r.oldValue);
    }

    Optional<V> oldValue = Optional.ofNullable(node.value);

    if (node.left == null) {
        return new Removal<>(node.right, true, oldValue);
    }

    if (node.right == null) {
        return new Removal<>(node.left, true, oldValue);
    }

    Node<K, V> successor = minNode(node.right);
    node.key = successor.key;
    node.value = successor.value;

    Removal<K, V> r = remove(node.right, successor.key);
    node.right = r.node;

    return new Removal<>(node, true, oldValue);
}

Invariant:

remove(node, key) returns the new root of this subtree after deletion.

That return value is essential.

Without it, deleting the root of a subtree does not update the parent link correctly.

13. Validation: Do Not Only Check Children

Wrong validation:

boolean wrong(Node<Integer, ?> node) {
    if (node == null) {
        return true;
    }
    if (node.left != null && node.left.key >= node.key) {
        return false;
    }
    if (node.right != null && node.right.key <= node.key) {
        return false;
    }
    return wrong(node.left) && wrong(node.right);
}

Counterexample:

Local checks pass at node 15 because 6 < 15.

But 6 is in the right subtree of 10, so it must be > 10.

Correct validation carries bounds:

public boolean isValidBst() {
    return isValid(root, null, null);
}

private boolean isValid(Node<K, V> node, K lowerExclusive, K upperExclusive) {
    if (node == null) {
        return true;
    }

    if (lowerExclusive != null && compare(node.key, lowerExclusive) <= 0) {
        return false;
    }

    if (upperExclusive != null && compare(node.key, upperExclusive) >= 0) {
        return false;
    }

    return isValid(node.left, lowerExclusive, node.key)
        && isValid(node.right, node.key, upperExclusive);
}

Invariant:

Every node in this recursive call must be within (lowerExclusive, upperExclusive).

14. Duplicate Key Policies

BSTs need a duplicate policy.

Common options:

Policy	Rule	Use Case
Replace	`compare == 0` replaces value	Map semantics
Count	Store frequency at node	Multiset
Bucket	Store list of values at equal key	Secondary records
Tie-break	Include unique id in comparator	Stable total ordering
Side insertion	Equal keys always go left/right	Usually avoid

Avoid side insertion as the default.

It makes search and deletion semantics ambiguous and can create long chains for equal keys.

For production ordered indexes, prefer a total ordering:

Comparator<Event> byTimeThenId = Comparator
        .comparing(Event::time)
        .thenComparing(Event::id);

15. Comparator Contracts

A comparator used by an ordered structure must be stable and logically consistent.

Required properties:

Property	Meaning
Antisymmetry	sign of `compare(a,b)` is opposite of `compare(b,a)`
Transitivity	if `a < b` and `b < c`, then `a < c`
Totality for inserted keys	any two inserted keys can be compared
Stability over time	comparison result does not change while key is in tree

Bad comparator:

Comparator<Integer> bad = (a, b) -> a - b;

Problem: integer overflow.

Use:

Comparator<Integer> good = Integer::compare;

Bad mutable-key example:

final class Task {
    String id;
    int priority;
}

Comparator<Task> byPriority = Comparator.comparingInt(t -> t.priority);

If a Task is inserted into a TreeSet and then priority changes, the tree is corrupted logically because the node remains in the old structural position.

The tree will not automatically re-index it.

16. Java `TreeMap`, `TreeSet`, and `NavigableMap`

For production Java, you rarely implement a raw BST.

Use library structures unless you need custom augmentation.

TreeMap is a NavigableMap implementation ordered by natural ordering or a comparator.

TreeSet is a NavigableSet implementation backed by a TreeMap.

Basic operations:

java.util.NavigableMap<Long, String> byTimestamp = new java.util.TreeMap<>();

byTimestamp.put(1700000000000L, "created");
byTimestamp.put(1700000005000L, "assigned");
byTimestamp.put(1700000010000L, "escalated");

Long floor = byTimestamp.floorKey(1700000007000L);
Long ceiling = byTimestamp.ceilingKey(1700000007000L);

java.util.NavigableMap<Long, String> window =
        byTimestamp.subMap(1700000000000L, true, 1700000010000L, false);

NavigableMap provides:

lowerEntry / lowerKey
floorEntry / floorKey
ceilingEntry / ceilingKey
higherEntry / higherKey
firstEntry / lastEntry
subMap / headMap / tailMap
descendingMap

These methods are exactly the ordered-index operations discussed above.

17. `TreeSet` Equality Trap

A TreeSet uses ordering to determine uniqueness.

If comparator says two elements compare as 0, the set treats them as duplicates.

Example:

record User(String id, String email) {}

Comparator<User> byEmailDomain = Comparator.comparing(
        user -> user.email().substring(user.email().indexOf('@') + 1)
);

java.util.Set<User> users = new java.util.TreeSet<>(byEmailDomain);
users.add(new User("u1", "a@example.com"));
users.add(new User("u2", "b@example.com"));

System.out.println(users.size()); // 1

Both users compare as equal because both have domain example.com.

This may be correct for a domain index.

It is wrong for a user set.

Fix with a tie-breaker:

Comparator<User> byDomainThenId = Comparator
        .comparing((User user) -> user.email().substring(user.email().indexOf('@') + 1))
        .thenComparing(User::id);

Rule:

If the structure represents unique entities, the comparator must include a unique identity tie-breaker or be consistent with entity equality.

18. Ordered Index Design Patterns

18.1 Time Window Index

public final class TimeWindowIndex<V> {
    private final java.util.NavigableMap<Long, java.util.List<V>> byTime = new java.util.TreeMap<>();

    public void add(long timestampMillis, V value) {
        byTime.computeIfAbsent(timestampMillis, ignored -> new java.util.ArrayList<>()).add(value);
    }

    public java.util.List<V> between(long fromInclusive, long toExclusive) {
        java.util.ArrayList<V> result = new java.util.ArrayList<>();
        for (java.util.List<V> bucket : byTime.subMap(fromInclusive, true, toExclusive, false).values()) {
            result.addAll(bucket);
        }
        return result;
    }
}

This handles duplicate timestamps with buckets.

18.2 Composite Ordered Key

record CaseOrderKey(int priorityRank, long dueAtMillis, String caseId)
        implements Comparable<CaseOrderKey> {

    @Override
    public int compareTo(CaseOrderKey other) {
        int c = Integer.compare(priorityRank, other.priorityRank);
        if (c != 0) return c;
        c = Long.compare(dueAtMillis, other.dueAtMillis);
        if (c != 0) return c;
        return caseId.compareTo(other.caseId);
    }
}

This produces a stable total order.

18.3 Secondary Index with Remove/Reinsert

If indexed fields change, remove then reinsert.

public final class PriorityIndex<T> {
    private final java.util.TreeSet<T> index;

    public PriorityIndex(java.util.Comparator<? super T> comparator) {
        this.index = new java.util.TreeSet<>(comparator);
    }

    public void add(T item) {
        index.add(item);
    }

    public void updateIndexedFields(T oldSnapshot, T newSnapshot) {
        index.remove(oldSnapshot);
        index.add(newSnapshot);
    }
}

For mutable objects, a safer design stores immutable key snapshots in the index.

19. Choosing Between HashMap, TreeMap, Sorted Array, and Heap

Need	Better Choice
Exact lookup only	`HashMap`
Exact membership only	`HashSet`
Sorted iteration	`TreeMap` / `TreeSet`
Floor/ceiling	`NavigableMap` / `NavigableSet`
Range scan	`NavigableMap.subMap`
Frequent min/max only	Heap / `PriorityQueue`
Static sorted data + binary search	Sorted array/list
Small data	Simpler structure usually wins
Mutable priority with arbitrary delete	Tree + identity map or custom heap handles

A PriorityQueue gives quick min/max access but not efficient ordered range scan or arbitrary lookup.

A HashMap gives exact lookup but not sorted neighbors.

A sorted array gives excellent locality but expensive insertion.

A TreeMap gives dynamic ordered operations with predictable logarithmic behavior.

20. Complexity Budget

For an unbalanced BST:

Operation	Time	Space
Search	`O(h)`	`O(1)` iterative
Insert	`O(h)`	`O(1)` iterative
Delete	`O(h)`	`O(h)` recursive or `O(1)` iterative with parent tracking
Min/max	`O(h)`	`O(1)`
Floor/ceiling	`O(h)`	`O(1)`
Inorder traversal	`O(n)`	`O(h)`
Range scan	`O(h + m)`	`O(h)` recursion or iterator state
Validation	`O(n)`	`O(h)`

For balanced trees:

h = O(log n)

For skewed trees:

h = O(n)

That difference is why production ordered maps use balanced trees.

21. Failure Modes

Failure Mode	Symptom	Root Cause
Local-only validation	Invalid tree accepted	Missing ancestor bounds
Mutable key	Lookup/remove fails after mutation	Ordering changed while node stayed in place
Comparator overflow	Wrong order for large ints	Using subtraction instead of compare method
Comparator not transitive	Sort/tree weirdness	Invalid ordering relation
Duplicate policy absent	Lost records or ambiguous search	`compare == 0` not defined semantically
Delete not returning subtree root	Parent still points to deleted node	Link update lost
Range scan visits all nodes	No pruning	Ignored ordered invariant
Using TreeSet as entity set with partial comparator	Missing elements	Comparator equality collapses distinct entities
Assuming `TreeMap` is thread-safe	Race/corruption/visibility bugs	External synchronization missing

22. Production Review Checklist

Before using or implementing an ordered index, ask:

What is the exact key?
Is the ordering total for all inserted keys?
Is comparator equality the same as domain identity?
Are duplicates possible?
What is the duplicate policy?
Can indexed fields mutate?
If fields mutate, how is re-indexing enforced?
Are range boundaries inclusive or exclusive?
Is the workload exact lookup, nearest lookup, sorted scan, or min/max?
Is the expected data size large enough for constant factors to matter?
Is concurrency required?
Should a library structure be used instead of custom tree code?

23. Practice Loop

Exercises:

Implement containsKey, put, remove, floorKey, ceilingKey, and keysInRange for SimpleBstMap.
Write a validator using lower/upper bounds.
Generate all permutations of small key sets and verify inorder traversal is sorted after insertion.
Build a skewed BST by inserting sorted keys and measure height.
Build a balanced-ish BST by inserting median first recursively and compare height.
Implement a multiset BST that stores counts.
Implement a TreeMap-based time-window index.
Create a deliberately broken comparator and write a test showing unexpected TreeSet behavior.
Write a remove/reinsert workflow for mutable indexed fields.
Compare HashMap, TreeMap, sorted array, and heap for a workload with exact lookup, range query, and min extraction.

24. Self-Correction Checklist

A BST solution is probably wrong if:

It checks only immediate children during validation.
It never defines duplicate behavior.
It uses a - b in comparator logic.
It assumes equals controls TreeSet uniqueness.
It mutates indexed fields after insertion.
It uses a BST when the workload only needs exact lookup.
It uses a heap when the workload needs range scan.
It ignores skewed input.
It recursively processes deep untrusted input without considering stack depth.
It does not specify inclusive/exclusive range semantics.

25. Summary

A binary search tree is a binary tree plus a global ordering invariant.

That invariant enables search-space pruning:

search      O(h)
insert      O(h)
delete      O(h)
floor       O(h)
ceiling     O(h)
range scan  O(h + m)

The height h is the cost driver.

Unbalanced trees teach the invariant.

Balanced trees make the invariant reliable under real workloads.

In production Java, TreeMap, TreeSet, NavigableMap, and NavigableSet are the default tools for dynamic ordered indexes.

Custom BSTs are justified when you need specialized augmentation, custom memory layout, or educational control.

The next part explains how balanced trees maintain height guarantees through rotations and structural invariants.

Lesson Recap

You just completed lesson 16 in build core. Use the series map if you want to review the broader track, or continue directly into the next lesson while the context is still warm.

Back To Series Next Lesson

Continue The Track

Keep the momentum while the lesson is still fresh. Move backward for review or continue forward into the next concept.

Previous Lesson

Lesson 15

Learn Java Dsa Part 015 Binary Trees And Tree Traversal Invariants

Next Lesson

Lesson 17

Learn Java Dsa Part 017 Balanced Trees Red Black Avl And Btrees