Skip to main content

Java Collections Framework: Deep Dive

A comprehensive guide to Java's Collections Framework — data structures, common implementations, internal mechanics, and best practices.


1. Collections Hierarchy Overview

Iterable
└── Collection
├── List (ordered, allows duplicates)
│ ├── ArrayList
│ ├── LinkedList
│ ├── Vector (legacy, synchronized)
│ └── CopyOnWriteArrayList (concurrent)
├── Set (no duplicates)
│ ├── HashSet
│ ├── LinkedHashSet
│ └── TreeSet (sorted)
└── Queue / Deque
├── PriorityQueue
├── ArrayDeque
├── LinkedList
└── BlockingQueue (concurrent)
├── ArrayBlockingQueue
├── LinkedBlockingQueue
└── DelayQueue

Map (key-value pairs, not part of Collection)
├── HashMap
├── LinkedHashMap
├── TreeMap (sorted)
├── Hashtable (legacy, synchronized)
└── ConcurrentHashMap (concurrent)

2. List Implementations

ArrayList

The most commonly used List implementation, backed by a resizable array.

Key characteristics:

  • O(1) random access by index (implements RandomAccess)
  • O(1) amortized append (add() at end)
  • O(n) insert/remove in the middle (requires shifting elements)
  • Not thread-safe

Expansion mechanism: When the internal array is full, ArrayList creates a new array 1.5× the current size and copies elements over:

// Simplified expansion logic
int newCapacity = oldCapacity + (oldCapacity >> 1); // 1.5x
elementData = Arrays.copyOf(elementData, newCapacity);

Best practices:

  • Specify initial capacity if you know the size: new ArrayList<>(1000)
  • Use ensureCapacity() before bulk inserts to avoid repeated resizing

LinkedList

Backed by a doubly-linked list. Also implements Deque.

Key characteristics:

  • O(1) insert/remove at head or tail
  • O(n) random access (must traverse the list)
  • O(n) insert at arbitrary index (traversal + O(1) link update)
  • Higher memory overhead per element (each node stores two pointers)

ArrayList vs LinkedList

OperationArrayListLinkedList
Random access get(i)O(1)O(n)
Append add(e)O(1) amortizedO(1)
Insert at index add(i, e)O(n)O(n)*
Remove by indexO(n)O(n)*
Memory per elementLowerHigher (node overhead)
Cache localityExcellent (contiguous)Poor (scattered)

* LinkedList's insert/remove is O(1) for the link operation itself, but O(n) to find the position first.

In practice: ArrayList is almost always the better choice due to CPU cache friendliness and lower overhead.


3. Map Implementations

HashMap Internals

HashMap uses an array of buckets where each bucket is a linked list (or red-black tree for long chains).

Core mechanics:

  1. Hash function: Spreads keys across buckets.

    static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    The XOR with the upper 16 bits reduces collisions when the array size is a power of 2.

  2. Bucket index: index = hash & (capacity - 1) (bitwise AND, equivalent to modulo for power-of-2 sizes).

  3. Collision resolution:

    • Entries with the same bucket index form a linked list.
    • When a linked list exceeds 8 nodes (and capacity ≥ 64), it converts to a red-black tree for O(log n) lookup.
    • When a tree shrinks below 6 nodes, it converts back to a linked list.
  4. Load factor & expansion:

    • Default load factor: 0.75
    • Threshold = capacity × load factor
    • When size > threshold, the array doubles in size and all entries are rehashed.

Important: HashMap is not thread-safe. Use ConcurrentHashMap for concurrent access.

LinkedHashMap

Extends HashMap with a doubly-linked list connecting all entries, preserving either:

  • Insertion order (default)
  • Access order (most-recently-accessed last) — set via constructor parameter

LRU Cache: LinkedHashMap can function as an LRU cache by overriding removeEldestEntry():

public class LRUCache<K, V> extends LinkedHashMap<K, V> {
private final int maxSize;

public LRUCache(int maxSize) {
super(maxSize, 0.75f, true); // accessOrder = true
this.maxSize = maxSize;
}

@Override
protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
return size() > maxSize;
}
}

TreeMap

Backed by a red-black tree. Entries are sorted by key (natural ordering or a provided Comparator).

  • All operations are O(log n)
  • Supports NavigableMap operations: firstKey(), lastKey(), subMap(), headMap(), tailMap()

HashMap vs Hashtable vs ConcurrentHashMap

FeatureHashMapHashtableConcurrentHashMap
Thread-safe✅ (synchronized)✅ (CAS + synchronized)
Null keys✅ (one null key)
PerformanceBest (single-thread)Poor (coarse lock)Best (concurrent)
RecommendedSingle-threaded useNever (legacy)Multi-threaded use

4. Set Implementations

HashSet

Backed by a HashMap internally — elements are stored as keys, with a dummy constant as the value.

// Internally
private transient HashMap<E, Object> map;
private static final Object PRESENT = new Object();

public boolean add(E e) {
return map.put(e, PRESENT) == null;
}

LinkedHashSet

Extends HashSet using LinkedHashMap — maintains insertion order.

TreeSet

Backed by TreeMap — maintains elements in sorted order.


5. Queue Implementations

PriorityQueue

A min-heap implementation that always dequeues the smallest element (by natural order or Comparator).

PriorityQueue<Integer> minHeap = new PriorityQueue<>();
minHeap.offer(3);
minHeap.offer(1);
minHeap.offer(2);
minHeap.poll(); // returns 1 (smallest)

// Max-heap
PriorityQueue<Integer> maxHeap = new PriorityQueue<>(Comparator.reverseOrder());
  • offer() / poll(): O(log n)
  • peek(): O(1)
  • Not thread-safe — use PriorityBlockingQueue for concurrent access

ArrayDeque

A resizable circular array implementing Deque. Faster than LinkedList as a stack or queue.

Deque<String> stack = new ArrayDeque<>();
stack.push("first");
stack.push("second");
stack.pop(); // "second"

Deque<String> queue = new ArrayDeque<>();
queue.offer("first");
queue.offer("second");
queue.poll(); // "first"

ArrayBlockingQueue

A bounded blocking queue backed by a fixed-size array. Uses ReentrantLock with two Condition objects (notEmpty, notFull) for producer-consumer synchronization.

BlockingQueue<Task> queue = new ArrayBlockingQueue<>(100);

// Producer thread
queue.put(task); // blocks if queue is full

// Consumer thread
Task task = queue.take(); // blocks if queue is empty

DelayQueue

An unbounded blocking queue where elements must implement Delayed. Elements can only be dequeued after their delay has expired.

public class DelayedTask implements Delayed {
private final long executeAt;

@Override
public long getDelay(TimeUnit unit) {
return unit.convert(executeAt - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
}

@Override
public int compareTo(Delayed other) {
return Long.compare(this.getDelay(TimeUnit.MILLISECONDS),
other.getDelay(TimeUnit.MILLISECONDS));
}
}

Use cases: Scheduled task execution, order timeout cancellation, cache expiration.


6. ConcurrentHashMap Deep Dive

JDK 7: Segment-Based Locking

ConcurrentHashMap in JDK 7 used Segment arrays, each containing its own HashEntry[] array. Locking was per-segment, allowing concurrent access to different segments.

ConcurrentHashMap
├── Segment[0] (lock) → HashEntry[] → linked list
├── Segment[1] (lock) → HashEntry[] → linked list
├── ...
└── Segment[15] (lock) → HashEntry[] → linked list

Default concurrency level: 16 segments → up to 16 concurrent writers.

JDK 8: CAS + Synchronized (Node-Based)

JDK 8 replaced segments with a flat Node[] array, using CAS for empty buckets and synchronized on the first node of each bucket:

// Simplified put logic
if (bucket is empty) {
CAS to insert new Node // lock-free for empty buckets
} else {
synchronized (first node of bucket) {
// insert into linked list or red-black tree
}
}

Size counting: Uses baseCount + CounterCell[] to reduce contention when multiple threads update the size concurrently (similar to LongAdder).


7. Collection Usage Best Practices

Common Pitfalls

  1. Arrays.asList() returns a fixed-size list:

    List<String> list = Arrays.asList("a", "b", "c");
    list.add("d"); // throws UnsupportedOperationException!

    // Use this instead:
    List<String> mutableList = new ArrayList<>(Arrays.asList("a", "b", "c"));
  2. subList() creates a view, not a copy:

    List<Integer> list = new ArrayList<>(List.of(1, 2, 3, 4, 5));
    List<Integer> sub = list.subList(1, 3); // [2, 3]
    list.add(6); // modifying original...
    sub.get(0); // ConcurrentModificationException!
  3. Use isEmpty() instead of size() == 0: Some collections (e.g., ConcurrentLinkedQueue) compute size() in O(n).

  4. Collectors.toMap() throws on null values:

    // This throws NullPointerException if any value is null
    map.entrySet().stream().collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));

    // Use HashMap::merge or handle nulls explicitly
  5. Don't modify a collection while iterating:

    // WRONG — ConcurrentModificationException
    for (String s : list) {
    if (s.equals("remove")) list.remove(s);
    }

    // CORRECT — use Iterator.remove()
    Iterator<String> it = list.iterator();
    while (it.hasNext()) {
    if (it.next().equals("remove")) it.remove();
    }

    // Or use removeIf (Java 8+)
    list.removeIf(s -> s.equals("remove"));

Choosing the Right Collection

NeedUse
Ordered list with fast random accessArrayList
Fast insert/remove at both endsArrayDeque
No duplicates, unorderedHashSet
No duplicates, sortedTreeSet
Key-value lookupHashMap
Key-value, sorted keysTreeMap
Key-value, insertion orderLinkedHashMap
Thread-safe mapConcurrentHashMap
Thread-safe list (read-heavy)CopyOnWriteArrayList
Producer-consumer queueArrayBlockingQueue / LinkedBlockingQueue
Priority processingPriorityQueue

8. Sorting & Comparison

Comparable vs Comparator

AspectComparable<T>Comparator<T>
Packagejava.langjava.util
MethodcompareTo(T o)compare(T o1, T o2)
Modifies classYes (implements interface)No (external)
Natural orderingDefines itProvides alternative orderings
// Comparable: natural ordering
public class Employee implements Comparable<Employee> {
private String name;
private int salary;

@Override
public int compareTo(Employee other) {
return Integer.compare(this.salary, other.salary);
}
}

// Comparator: custom ordering
Comparator<Employee> byName = Comparator.comparing(Employee::getName);
Comparator<Employee> bySalaryDesc = Comparator.comparingInt(Employee::getSalary).reversed();

Collections.sort() vs Stream.sorted()

AspectCollections.sort()Stream.sorted()
MutabilityMutates the original listReturns a new sorted stream
StyleImperativeFunctional/declarative
Null handlingThrows NullPointerException on null elementsSame behavior
AlgorithmTimSort (stable, O(n log n))TimSort internally

Sorting Null-safe

To sort lists that may contain nulls, use Comparator.nullsFirst() or nullsLast():

List<String> names = Arrays.asList("Alice", null, "Bob", null, "Charlie");
names.sort(Comparator.nullsLast(Comparator.naturalOrder()));
// [Alice, Bob, Charlie, null, null]

9. Common Pitfalls in Interviews

Mutable Keys in HashMap

If a key object's state changes after insertion, the entry becomes unreachable:

List<String> key = new ArrayList<>(List.of("a"));
Map<List<String>, String> map = new HashMap<>();
map.put(key, "value");

key.add("b"); // Mutates the key → hashCode changes
map.get(key); // null! Entry is lost

Rule: Always use immutable objects as map keys.

equals() without hashCode()

Overriding only equals() breaks the hash-based collection contract:

// BAD: equals overridden but not hashCode
Set<User> users = new HashSet<>();
users.add(new User(1, "Alice"));
users.contains(new User(1, "Alice")); // May return false!

ConcurrentModificationException

// ❌ Throws ConcurrentModificationException
for (String item : list) {
if (item.isEmpty()) list.remove(item);
}

// ✅ Safe alternatives
list.removeIf(String::isEmpty);
// or use Iterator.remove()
// or use CopyOnWriteArrayList for concurrent access