Week 8: Advanced Graph Concepts
1. Overviewβ
Welcome to Week 8! This week concludes Phase 2 of our DSA roadmap. Having built a strong foundation in grid traversals and undirected graphs last week, we are now adding Direction and Dependencies.
You will focus heavily on Directed Acyclic Graphs (DAGs). These structures are the backbone of task schedulers, build systems (like Maven or npm), CI/CD pipelines, and spreadsheet calculation engines. You will master how to resolve dependencies in the correct order and how to detect if a system is deadlocked due to a circular dependency.
Goals for this week:
- Understand In-degree and Out-degree in Directed Graphs.
- Master cycle detection in Directed Graphs using a 3-state DFS.
- Master Topological Sorting using Kahn's Algorithm (BFS).
- Understand how to schedule tasks that can run in parallel vs. sequentially.
Knowledge You Need Before Startingβ
- Week 7 graph representation and traversal fluency.
- Strong BFS/DFS implementation and visited-state discipline.
- Basic dependency modeling (prerequisite -> dependent direction).
- Familiarity with queues and counting arrays (
inDegree[]).
2. Theory & Fundamentalsβ
2.1 Mental Model: Why Directed Graphs?β
Last week's undirected graphs modeled symmetric relationships: if city A connects to city B, then B also connects to A. But many real-world relationships are one-way:
- "Course A requires Course B" does not mean "Course B requires Course A"
- "Service A depends on Service B" does not mean B depends on A
- "Task A must finish before Task B starts" is directional
An undirected edge is like a two-way street. A directed edge (arc) is a one-way street with an arrow.
Undirected (symmetric): Directed (one-way):
A βββ B βββ C A βββΆ B βββΆ C
| | β² |
βββ D βββ D ββββββββββ
"A and B are connected" "A must come before B"
Real-world DAG examples you've worked with:
| System | Nodes | Directed Edge |
|---|---|---|
| Maven/Gradle build | JAR modules | "A depends on B" (B must compile first) |
| npm/yarn | packages | "A requires B" (B must install first) |
| GitHub Actions CI | workflow steps | "step A triggers step B" |
| Spreadsheet engine | cells | "cell A's formula uses cell B's value" |
| University curriculum | courses | "must take Math 101 before Math 201" |
2.2 Graph Terminology: In-degree & Out-degreeβ
Graph: A βββΆ B βββΆ D
β β
βΌ βΌ
C βββΆ E
| Node | In-degree (arrows coming IN) | Out-degree (arrows going OUT) |
|---|---|---|
| A | 0 | 2 (βB, βC) |
| B | 1 (Aβ) | 2 (βD, βE) |
| C | 1 (Aβ) | 1 (βE) |
| D | 1 (Bβ) | 0 |
| E | 2 (Bβ, Cβ) | 0 |
Key insight for Topological Sort:
- In-degree 0 = no prerequisites = can be processed immediately (these are your starting points)
- In-degree > 0 = has prerequisites = must wait
2.3 Directed Acyclic Graphs (DAG) β The Golden Propertyβ
A DAG is a directed graph with no cycles. This seemingly simple constraint has massive implications:
Valid DAG: NOT a DAG (has cycle):
A βββΆ B βββΆ D A βββΆ B
β β² β
βΌ β βΌ
C βββΆ E D βββ C
(Can never return to A) (AβBβCβDβA is a cycle!)
Why does "no cycle" matter so much?
Think about a build system where A depends on B, and B depends on A. Neither can ever be built β this is a deadlock. The cycle makes it logically impossible to find a valid build order. If and only if the graph is a DAG can you always find a valid topological ordering.
Proving a directed graph is a DAG = proving topological sort is possible.
These two are equivalent:
- "This graph has a valid topological order" β "This graph is a DAG (no cycles)"
2.4 Topological Sorting: What It Meansβ
A topological sort is a linear ordering of all nodes such that for every directed edge U β V, node U appears before node V in the ordering.
DAG: A βββΆ B βββΆ D
β β
βΌ βΌ
C βββΆ E
Valid topological orderings:
A, B, C, D, E β
(A before B, A before C, B before D, B before E, C before E)
A, C, B, E, D β
(also valid)
A, B, C, E, D β
(also valid)
B, A, C, D, E β (B before A violates A β B)
Key insight: Topological sort is NOT unique. A DAG can have many valid orderings. Problems that ask "find A valid order" or "can you find an order" are topological sort problems.
2.5 Kahn's Algorithm (BFS) β Step by Stepβ
Kahn's Algorithm is the most intuitive approach. Think of it like this:
The Metaphor: You are building a house. You can only start work that has all its prerequisites complete. You start with everything that has no prerequisites (foundation), complete it, then see what's newly unblocked, and repeat.
Graph: A βββΆ C βββΆ E
β β²
βΌ β
B βββΆ D βββββ
In-degrees: A=0, B=1(Aβ), C=1(Aβ), D=1(Bβ), E=2(Cβ, Dβ)
Step-by-step execution:
Initial: queue = [A] (only A has in-degree 0)
order = []
Step 1: poll A β order = [A]
Process A's neighbors: B (1β0 β
add to queue), C (1β0 β
add to queue)
queue = [B, C]
Step 2: poll B β order = [A, B]
Process B's neighbors: D (1β0 β
add to queue)
queue = [C, D]
Step 3: poll C β order = [A, B, C]
Process C's neighbors: E (2β1, not yet 0)
queue = [D]
Step 4: poll D β order = [A, B, C, D]
Process D's neighbors: E (1β0 β
add to queue)
queue = [E]
Step 5: poll E β order = [A, B, C, D, E]
queue = []
Result: [A, B, C, D, E] β every edge UβV has U before V β
Cycle check: processed 5 nodes = 5 total nodes β no cycle β
Cycle detection is built in: If a cycle exists, some nodes will never reach in-degree 0 and will never enter the queue. After BFS finishes, if the processed count is less than total nodes, a cycle blocked the sort.
2.6 DFS Topological Sort β The Post-Order Approachβ
An alternative to Kahn's: use DFS and push each node to a Stack after ALL its descendants have been fully explored. Pop the stack to get topological order.
DFS traversal:
Visit A β go to B β go to D (leaf, push D) β backtrack to B β go to E (leaf, push E) β push B
Back to A β go to C β E already visited β push C β push A
Stack (bottom to top): D, E, B, C, A
Pop order (topological): A, C, B, E, D β
When to use which approach:
| Kahn's Algorithm (BFS) | DFS + Stack | |
|---|---|---|
| Cycle detection | β Natural (count check) | Needs 3-state |
| Parallel scheduling | β Direct (BFS levels) | β Extra work |
| Finding all orderings | β Hard to modify | β Easier with backtracking |
| Intuition | β More intuitive | β Trickier |
| Interview preference | β Recommended | Mention as alternative |
2.7 The 3-State DFS: Why 2 States Aren't Enoughβ
In an undirected graph, you only need 2 states (visited/unvisited). If you reach a visited node that isn't your parent, it's a cycle.
In a directed graph, 2 states fail silently:
Graph: A βββΆ B
β β
βΌ βΌ
C βββΆ D
β²
β
E βββββ
Is this a cycle?
DFS from A: visit A β visit B β visit D β mark D visited
backtrack to B β backtrack to A β visit C β visit D (already visited!)
β 2-state says "CYCLE!" but this is actually a valid DAG (D is reachable from two paths)
The issue: D is finished (fully explored) β it's a cross-edge, not a back-edge. A true cycle only exists if you hit a node that's currently being processed (still in the call stack).
The 3-state solution:
| State | Value | Meaning |
|---|---|---|
| UNVISITED | 0 | Never seen |
| VISITING | 1 | In the current DFS path (on the call stack) |
| VISITED | 2 | Fully explored, all descendants processed |
If you encounter state 1 (VISITING) β CYCLE (back-edge to ancestor in current path)
If you encounter state 2 (VISITED) β SAFE (cross-edge, already fully processed)
Visual proof:
Actual cycle: A βββΆ B βββΆ C βββΆ A (cycle!)
DFS from A:
A β state[A]=1 (VISITING)
A β B β state[B]=1
B β C β state[C]=1
C β A β state[A]==1 !! β CYCLE DETECTED β
Cross-edge (no cycle): A βββΆ B, A βββΆ C, B βββΆ D, C βββΆ D
DFS from A:
A β state[A]=1
A β B β state[B]=1
B β D β state[D]=1
D has no neighbors β state[D]=2 (VISITED)
backtrack to B β state[B]=2
A β C β state[C]=1
C β D β state[D]==2 (already VISITED, safe!) β no cycle β
state[C]=2 β state[A]=2
2.8 Parallel Scheduling: BFS Levels = Parallel Batchesβ
When Kahn's runs, nodes at the same BFS level have all their prerequisites satisfied simultaneously. With infinite resources, all nodes at the same level can execute in parallel.
Graph: A B (both have in-degree 0)
β β
βΌ βΌ
C D (available after A and B finish)
\ /
βΌ
E (available after C and D finish)
BFS levels:
Level 0: {A, B} β run in parallel β 1 hour
Level 1: {C, D} β run in parallel β 1 hour
Level 2: {E} β 1 hour
Total time = 3 hours (number of BFS levels = critical path length)
Critical path: The longest chain of dependencies determines the minimum total time, regardless of how many servers you have. This is a fundamental concept in project management (CPM).
3. Code Templates (Java)β
Template 1: Build Adjacency List (Directed Graph)β
// Standard setup for directed graph problems
int n = numNodes;
List<List<Integer>> adj = new ArrayList<>();
int[] inDegree = new int[n];
for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
// Edge: u β v (u must come before v)
for (int[] edge : edges) {
int u = edge[0], v = edge[1];
adj.get(u).add(v);
inDegree[v]++; // v has one more prerequisite
}
Template 2: Kahn's Algorithm (Topological Sort + Cycle Detection)β
public int[] topologicalSort(int numNodes, int[][] prerequisites) {
List<List<Integer>> adj = new ArrayList<>();
int[] inDegree = new int[numNodes];
for (int i = 0; i < numNodes; i++) adj.add(new ArrayList<>());
// prerequisites[i] = [course, preReq] means preReq β course
for (int[] pre : prerequisites) {
adj.get(pre[1]).add(pre[0]);
inDegree[pre[0]]++;
}
// Start with all nodes that have no prerequisites
Queue<Integer> queue = new ArrayDeque<>();
for (int i = 0; i < numNodes; i++) {
if (inDegree[i] == 0) queue.offer(i);
}
int[] order = new int[numNodes];
int index = 0;
while (!queue.isEmpty()) {
int curr = queue.poll();
order[index++] = curr;
for (int neighbor : adj.get(curr)) {
inDegree[neighbor]--;
if (inDegree[neighbor] == 0) {
queue.offer(neighbor);
}
}
}
// If index != numNodes, a cycle prevented some nodes from being processed
return index == numNodes ? order : new int[0];
}
Template 3: Kahn's Algorithm with Parallel Level Trackingβ
// Returns the minimum time (number of BFS levels) to complete all tasks
public int minTimeToComplete(int numNodes, int[][] prerequisites) {
List<List<Integer>> adj = new ArrayList<>();
int[] inDegree = new int[numNodes];
for (int i = 0; i < numNodes; i++) adj.add(new ArrayList<>());
for (int[] pre : prerequisites) {
adj.get(pre[1]).add(pre[0]);
inDegree[pre[0]]++;
}
Queue<Integer> queue = new ArrayDeque<>();
for (int i = 0; i < numNodes; i++) {
if (inDegree[i] == 0) queue.offer(i);
}
int totalTime = 0;
int processed = 0;
while (!queue.isEmpty()) {
int levelSize = queue.size(); // all tasks runnable in parallel this round
totalTime++; // one time unit per level
for (int i = 0; i < levelSize; i++) {
int curr = queue.poll();
processed++;
for (int neighbor : adj.get(curr)) {
inDegree[neighbor]--;
if (inDegree[neighbor] == 0) {
queue.offer(neighbor);
}
}
}
}
return processed == numNodes ? totalTime : -1; // -1 means cycle detected
}
Template 4: 3-State DFS for Cycle Detectionβ
// 0 = unvisited, 1 = visiting (in call stack), 2 = visited (safe)
public boolean hasCycle(int numNodes, List<List<Integer>> adj) {
int[] state = new int[numNodes];
for (int i = 0; i < numNodes; i++) {
if (state[i] == 0) {
if (dfsHasCycle(i, adj, state)) return true;
}
}
return false;
}
private boolean dfsHasCycle(int curr, List<List<Integer>> adj, int[] state) {
if (state[curr] == 1) return true; // back-edge β CYCLE
if (state[curr] == 2) return false; // cross-edge β already safe
state[curr] = 1; // mark as currently visiting
for (int neighbor : adj.get(curr)) {
if (dfsHasCycle(neighbor, adj, state)) return true;
}
state[curr] = 2; // all descendants safe, mark as done
return false;
}
Template 5: DFS Topological Sort (Post-order Stack)β
public int[] topoSortDFS(int n, List<List<Integer>> adj) {
int[] state = new int[n];
Deque<Integer> stack = new ArrayDeque<>();
for (int i = 0; i < n; i++) {
if (state[i] == 0) {
if (dfsTopoSort(i, adj, state, stack)) {
return new int[0]; // cycle detected
}
}
}
int[] result = new int[n];
int idx = 0;
while (!stack.isEmpty()) result[idx++] = stack.pop();
return result;
}
private boolean dfsTopoSort(int curr, List<List<Integer>> adj, int[] state, Deque<Integer> stack) {
if (state[curr] == 1) return true; // cycle
if (state[curr] == 2) return false; // already processed
state[curr] = 1;
for (int neighbor : adj.get(curr)) {
if (dfsTopoSort(neighbor, adj, state, stack)) return true;
}
state[curr] = 2;
stack.push(curr); // push AFTER all descendants are done
return false;
}
Template 6: String-Keyed Graph (when nodes are strings, not integers)β
// When node IDs are strings β use HashMap instead of arrays
public List<String> topologicalSortStrings(String[] nodes, String[][] edges) {
Map<String, List<String>> adj = new HashMap<>();
Map<String, Integer> inDegree = new HashMap<>();
for (String node : nodes) {
adj.put(node, new ArrayList<>());
inDegree.put(node, 0);
}
for (String[] edge : edges) {
String from = edge[0], to = edge[1]; // from β to
adj.get(from).add(to);
inDegree.merge(to, 1, Integer::sum); // inDegree[to]++
}
Queue<String> queue = new ArrayDeque<>();
for (String node : nodes) {
if (inDegree.get(node) == 0) queue.offer(node);
}
List<String> result = new ArrayList<>();
while (!queue.isEmpty()) {
String curr = queue.poll();
result.add(curr);
for (String neighbor : adj.get(curr)) {
int newDegree = inDegree.get(neighbor) - 1;
inDegree.put(neighbor, newDegree);
if (newDegree == 0) queue.offer(neighbor);
}
}
if (result.size() != nodes.length)
throw new IllegalArgumentException("Circular dependency detected!");
return result;
}
4. Problem-Solving Frameworkβ
Step 1 β Recognize the Graph (2 minutes)β
Ask yourself:
- Is the relationship directional? ("A depends on B" vs. "A connects to B") β Directed graph
- Are there cycles possible? (or must I detect/prevent them?) β Cycle detection
- Do I need an ordering? ("What order to process?") β Topological sort
- Can tasks run in parallel? ("Minimum time?") β BFS level tracking
Step 2 β Clarify Edge Direction (Critical!)β
The most common bug in graph problems is building edges in the wrong direction.
Rule of thumb: If "A depends on B", then B must come before A. So the edge in your adjacency list goes B β A (B points to A), and inDegree[A]++.
"Course A requires Course B":
- Edge: B β A (B must be taken before A)
- adj.get(B).add(A)
- inDegree[A]++
"Task A must finish before Task B":
- Edge: A β B (A points to B)
- adj.get(A).add(B)
- inDegree[B]++
Draw the graph and verify: For a small example, draw the directed edges and check: can you reach a valid ordering by following the arrows?
Step 3 β Choose Kahn's or 3-State DFSβ
| Need to... | Use |
|---|---|
| Find valid ordering AND detect cycles | Kahn's (BFS) |
| Find minimum parallel execution time | Kahn's with levelSize |
| Just detect if cycle exists | 3-State DFS or Kahn's count check |
| Check if all nodes are "safe" (no cycle reachable) | 3-State DFS |
| Find ordering with backtracking (all valid orderings) | DFS |
Step 4 β Always Verify the Cycle Checkβ
Kahn's: if (processed != totalNodes) β cycle exists
3-State DFS: if (state[curr] == 1) β cycle exists
Never skip this. Problems like "can you finish all courses?" ARE asking you to detect cycles β returning the order alone is not enough.
Step 5 β Handle the String-to-Integer Mappingβ
When node names are strings (service names, course names), either:
- Map strings to integers first, then use array-based adjacency list (faster)
- Use
HashMap<String, List<String>>directly (cleaner, acceptable in interviews)
5. Pattern Recognition Guideβ
Signal β Pattern Mappingβ
| Problem signal | Pattern | Key technique |
|---|---|---|
| "Prerequisites", "Dependencies", "Must do A before B" | Topological sort | Kahn's BFS or DFS post-order |
| "Can all tasks be completed?" | Cycle detection | Kahn's count check or 3-state DFS |
| "Find a valid execution order" | Topological sort | Kahn's (returns the order directly) |
| "Minimum time with parallel execution" | BFS level tracking | Kahn's + levelSize loop |
| "Find safe nodes" (no cycle reachable from node) | 3-state DFS | Mark safe after all successors are safe |
| "Detect deadlock" | Cycle detection | 3-state DFS (state 1 = deadlock) |
| "Alien dictionary / unknown character order" | Topological sort | Build graph from adjacent word pairs |
| "Longest path in a directed graph" | Topo sort + DP | Sort first, then DP on sorted order |
| "Reconstruct sequence" | Topological sort uniqueness | Check if exactly one node in queue at each step |
| "All paths from source to target in DAG" | DFS with backtracking | No cycle = no infinite loop |
Detailed Recognition Walkthroughsβ
Recognizing Parallel Scheduling (LC 1136 β Parallel Courses)β
When you see: "each task takes 1 unit, tasks can run simultaneously if dependencies are met, find minimum time"
- "Minimum time = number of BFS levels in topological sort."
- "All nodes entering the queue simultaneously = tasks that can run in parallel."
- "Add
levelSizeloop inside Kahn's to count levels." - "If processed < total at end, return -1 (cycle/impossible)."
Recognizing Alien Dictionary (LC 269 β Hard)β
When you see: "given a sorted list of words in alien language, determine the character order"
- "This is topological sort on characters."
- "Compare adjacent words character by character to find the first difference β that gives a directed edge."
- "Example:
['wrt', 'wrf']βtcomes beforefβ edget β f." - "Build graph from these edges, then topological sort."
- "If impossible (cycle found), return empty string."
Recognizing Safe Nodes (LC 802)β
When you see: "find all nodes from which every path leads to a terminal"
- "A node is 'unsafe' if it's part of a cycle or can reach a cycle."
- "Use 3-state DFS: state 1 = currently visiting (potentially unsafe), state 2 = verified safe."
- "A node is safe if all its neighbors are safe."
- "This is exactly 3-state DFS β state 2 means 'verified safe, all descendants are terminal or safe'."
Recognizing Longest Path in DAG (LC 329 β Matrix)β
When you see: "longest increasing path in a matrix"
- "This is a DAG problem: each cell points to its larger neighbors."
- "There are no cycles because each edge goes strictly to a larger value."
- "Use DFS + memoization: longest path from cell (i,j) = 1 + max(longest path of valid neighbors)."
- "This is equivalent to: topologically sort by value, then DP on the sorted order."
6. Common Mistakes & How to Avoid Themβ
Mistake 1: Building Edges in the Wrong Directionβ
// Problem: "Course A requires Course B" means take B first, then A
// Edge should be: B β A (B enables A)
// β WRONG β reversed edge
for (int[] pre : prerequisites) {
adj.get(pre[0]).add(pre[1]); // pre[0]=A, pre[1]=B β adds AβB β
inDegree[pre[1]]++; // increments B's in-degree β
}
// β
CORRECT β BβA, inDegree[A]++
for (int[] pre : prerequisites) {
adj.get(pre[1]).add(pre[0]); // BβA
inDegree[pre[0]]++; // A gets one more prerequisite
}
Debug trick: Take a tiny example, draw the edges on paper, and verify that the topological order follows the arrows correctly.
Mistake 2: Forgetting the Cycle Check at the End of Kahn'sβ
// β WRONG β missing cycle detection, always returns the partial order
while (!queue.isEmpty()) { ... }
return order; // might only have 3 elements for a 5-node graph!
// β
CORRECT β verify all nodes were processed
return index == numNodes ? order : new int[0]; // or throw exception
Mistake 3: Using 2-State Instead of 3-State DFS for Directed Graphsβ
// β WRONG for directed graphs β false positives on cross-edges
boolean[] visited = new boolean[n];
if (visited[neighbor]) return true; // falsely reports cycle
// β
CORRECT β 3 states distinguish back-edge from cross-edge
int[] state = new int[n]; // 0=unvisited, 1=visiting, 2=visited
if (state[neighbor] == 1) return true; // true back-edge = true cycle
if (state[neighbor] == 2) return false; // cross-edge = safe
Mistake 4: Not Initializing In-degrees for All Nodesβ
// β WRONG β nodes with no incoming edges won't be in inDegree map
for (int[] edge : edges) {
inDegree.put(edge[1], inDegree.getOrDefault(edge[1], 0) + 1);
}
// Problem: source nodes (in-degree 0) are never added to the map!
// queue.offer() check will skip them.
// β
CORRECT β initialize ALL nodes first
for (String node : allNodes) {
inDegree.put(node, 0); // every node starts at 0
}
for (int[] edge : edges) {
inDegree.put(edge[1], inDegree.get(edge[1]) + 1);
}
Mistake 5: Using LinkedList as Queueβ
// β Slower β LinkedList has higher constant factor
Queue<Integer> queue = new LinkedList<>();
// β
Preferred β ArrayDeque is faster and more memory-efficient
Queue<Integer> queue = new ArrayDeque<>();
Mistake 6: Forgetting to Start DFS from All Unvisited Nodesβ
// β WRONG β only starts DFS from node 0, misses disconnected components
if (dfsCycle(0, adj, state)) return true;
// β
CORRECT β start from every unvisited node to cover the full graph
for (int i = 0; i < numNodes; i++) {
if (state[i] == 0) {
if (dfsCycle(i, adj, state)) return true;
}
}
7. Worked Examplesβ
Example 1: LeetCode 207 β Course Scheduleβ
Problem: Given numCourses and prerequisites[i] = [a, b] (b must be taken before a), return true if you can finish all courses.
Thinking process:
- "Can you finish all courses?" = "Is there a valid topological order?" = "Is there no cycle?"
- Use Kahn's Algorithm: if all nodes are processed, no cycle exists.
- Edge direction:
prerequisites[i] = [a, b]β b must come before a β edgeb β a,inDegree[a]++.
class Solution {
public boolean canFinish(int numCourses, int[][] prerequisites) {
List<List<Integer>> adj = new ArrayList<>();
int[] inDegree = new int[numCourses];
for (int i = 0; i < numCourses; i++) adj.add(new ArrayList<>());
for (int[] pre : prerequisites) {
adj.get(pre[1]).add(pre[0]); // pre[1] β pre[0]
inDegree[pre[0]]++;
}
Queue<Integer> queue = new ArrayDeque<>();
for (int i = 0; i < numCourses; i++) {
if (inDegree[i] == 0) queue.offer(i);
}
int completed = 0;
while (!queue.isEmpty()) {
int curr = queue.poll();
completed++;
for (int next : adj.get(curr)) {
if (--inDegree[next] == 0) queue.offer(next);
}
}
return completed == numCourses;
}
}
Dry run for numCourses=4, prerequisites=[[1,0],[2,0],[3,1],[3,2]]:
Edges: 0β1, 0β2, 1β3, 2β3
inDegree: [0, 1, 1, 2]
Initial queue: [0]
Poll 0 β completed=1 β decrement 1 (0), 2 (0) β queue=[1,2]
Poll 1 β completed=2 β decrement 3 (1) β queue=[2]
Poll 2 β completed=3 β decrement 3 (0) β queue=[3]
Poll 3 β completed=4 β queue=[]
completed(4) == numCourses(4) β true β
Time: O(V + E). Space: O(V + E).
Example 2: LeetCode 210 β Course Schedule IIβ
Problem: Same as 207 but return the actual ordering. Return empty array if impossible.
Thinking process:
- Same as 207 but instead of just counting, store the order.
- Kahn's algorithm naturally builds the order as it processes nodes.
class Solution {
public int[] findOrder(int numCourses, int[][] prerequisites) {
List<List<Integer>> adj = new ArrayList<>();
int[] inDegree = new int[numCourses];
for (int i = 0; i < numCourses; i++) adj.add(new ArrayList<>());
for (int[] pre : prerequisites) {
adj.get(pre[1]).add(pre[0]);
inDegree[pre[0]]++;
}
Queue<Integer> queue = new ArrayDeque<>();
for (int i = 0; i < numCourses; i++) {
if (inDegree[i] == 0) queue.offer(i);
}
int[] order = new int[numCourses];
int index = 0;
while (!queue.isEmpty()) {
int curr = queue.poll();
order[index++] = curr;
for (int next : adj.get(curr)) {
if (--inDegree[next] == 0) queue.offer(next);
}
}
return index == numCourses ? order : new int[0];
}
}
Time: O(V + E). Space: O(V + E).
Example 3: LeetCode 802 β Find Eventual Safe Statesβ
Problem: Return all nodes from which every path leads to a terminal node (no cycles reachable).
Thinking process:
- "Safe" = no cycle is reachable from this node.
- A node is unsafe if it's in a cycle OR leads to a cycle.
- 3-state DFS: after returning
true(all descendants safe), mark current node as state 2 (safe). - State 2 = verified safe β reachable without cycles.
class Solution {
public List<Integer> eventualSafeNodes(int[][] graph) {
int n = graph.length;
int[] state = new int[n]; // 0=unvisited, 1=visiting, 2=safe
List<Integer> result = new ArrayList<>();
for (int i = 0; i < n; i++) {
if (isSafe(i, graph, state)) result.add(i);
}
return result;
}
private boolean isSafe(int curr, int[][] graph, int[] state) {
if (state[curr] == 1) return false; // currently visiting = cycle = unsafe
if (state[curr] == 2) return true; // already verified safe
state[curr] = 1; // mark as visiting
for (int next : graph[curr]) {
if (!isSafe(next, graph, state)) return false; // unsafe neighbor
}
state[curr] = 2; // all paths lead to terminal, mark safe
return true;
}
}
Dry run for graph = [[1,2],[2,3],[5],[0],[5],[],[]]:
Nodes 0,1,2,3 form a cycle (0β1β2β3β0? No, 3β0, 0β1, 1β2... let's trace:
isSafe(0): state[0]=1 β check 1,2
isSafe(1): state[1]=1 β check 2,3
isSafe(2): state[2]=1 β check 5
isSafe(5): no neighbors β state[5]=2 β return true
state[2]=2, return true
isSafe(3): state[3]=1 β check 0
isSafe(0): state[0]==1 β return false β CYCLE via 3β0
return false
return false
return false
Node 0,1,3 are NOT safe
isSafe(4): state[4]=1 β check 5
isSafe(5): state[5]==2 β return true (already safe)
state[4]=2 β return true β Node 4 is safe β
Node 5 already marked safe β
Node 6: no neighbors β state[6]=2 β safe β
Safe nodes: [4, 5, 6] β
Time: O(V + E). Space: O(V).
Example 4: LeetCode 329 β Longest Increasing Path in a Matrixβ
Problem: Find the length of the longest increasing path in a matrix (move up/down/left/right).
Thinking process:
- "This is a DAG!" β each cell only points to strictly larger neighbors, so no cycles are possible.
- DFS from every cell, memoize the longest path from each cell.
- No need for explicit visited tracking (DAG = no cycles = no infinite recursion).
class Solution {
private static final int[][] DIRS = {{0,1},{0,-1},{1,0},{-1,0}};
public int longestIncreasingPath(int[][] matrix) {
int m = matrix.length, n = matrix[0].length;
int[][] memo = new int[m][n]; // memo[i][j] = longest path starting at (i,j)
int maxLen = 0;
for (int i = 0; i < m; i++) {
for (int j = 0; j < n; j++) {
maxLen = Math.max(maxLen, dfs(matrix, i, j, m, n, memo));
}
}
return maxLen;
}
private int dfs(int[][] matrix, int i, int j, int m, int n, int[][] memo) {
if (memo[i][j] != 0) return memo[i][j]; // already computed
int maxPath = 1; // at minimum, path length is 1 (just this cell)
for (int[] dir : DIRS) {
int ni = i + dir[0], nj = j + dir[1];
if (ni >= 0 && ni < m && nj >= 0 && nj < n
&& matrix[ni][nj] > matrix[i][j]) { // strictly larger
maxPath = Math.max(maxPath, 1 + dfs(matrix, ni, nj, m, n, memo));
}
}
memo[i][j] = maxPath;
return maxPath;
}
}
Time: O(M \times N) β each cell computed once and memoized. Space: O(M \times N).
8. Decision Tree: Which Graph Algorithm?β
Is the graph DIRECTED?
βββ NO β Use undirected graph algorithms (BFS/DFS for components, Union-Find)
βββ YES β Does it have a cycle?
βββ "Is there a cycle?" β 3-State DFS OR Kahn's count check
βββ "Find ordering (topo sort)"?
βββ YES β Is it Kahn's (BFS) or DFS post-order?
β βββ Need parallel levels? β Kahn's + levelSize
β βββ Need the actual order? β Kahn's (simpler)
β βββ Need all valid orderings? β DFS + backtracking
βββ NO β Is it path/reachability?
βββ Shortest path (unweighted) β BFS
βββ Shortest path (weighted) β Dijkstra or Bellman-Ford
βββ Longest path in DAG β Topo sort + DP
9. Complexity Cheat Sheetβ
| Algorithm | Time | Space | Use case |
|---|---|---|---|
| Kahn's Topological Sort | O(V + E) | O(V + E) | Ordering + cycle detection |
| 3-State DFS Cycle Detection | O(V + E) | O(V) call stack | Cycle detection only |
| DFS Topo Sort (post-order) | O(V + E) | O(V) | Ordering via DFS |
| Parallel scheduling (BFS levels) | O(V + E) | O(V) | Min time with infinite servers |
| Longest path in DAG (topo + DP) | O(V + E) | O(V + E) | Critical path |
| All paths in DAG (DFS) | O(2^V \cdot V) | O(V) | Exponential β only small graphs |
10. 7-Day Practice Plan (21 Problems)β
For each problem, follow this ritual:
- Identify: directed or undirected? weighted or unweighted?
- Draw the graph for a small example, verify edge directions.
- State which algorithm (Kahn's / 3-State DFS / BFS) and why.
- Code and trace manually, verify the cycle check.
Day 1: Kahn's Algorithm & Topological Sort Basics
- Course Schedule (LC 207) β Cycle detection via Kahn's count
- Course Schedule II (LC 210) β Return actual topological order
- Find Eventual Safe States (LC 802) β 3-state DFS for safety
Day 2: Advanced Dependency Resolution 4. Alien Dictionary (LC 269) β β Build graph from word pairs, then topo sort 5. Parallel Courses (LC 1136) β Kahn's + level counting 6. Minimum Height Trees (LC 310) β Topo sort from leaves inward
Day 3: Pathfinding in Directed Graphs 7. All Paths From Source to Target (LC 797) β DFS all paths in DAG 8. Reorder Routes to Make All Paths Lead to Zero (LC 1466) β Edge direction analysis 9. Evaluate Division (LC 399) β Graph with weighted edges
Day 4: Network Connectivity & Degree Analysis 10. Find the Town Judge (LC 997) β In-degree/out-degree counts 11. Maximal Network Rank (LC 1615) β Degree counting 12. Shortest Path with Alternating Colors (LC 1129) β BFS with state
Day 5: Advanced Traversal Mechanics 13. Minimum Number of Vertices to Reach All Nodes (LC 1557) β Find nodes with in-degree 0 14. Time Needed to Inform All Employees (LC 1376) β Tree DFS with weights 15. Loud and Rich (LC 851) β DFS with memoization
Day 6: Matrix Dependencies 16. Longest Increasing Path in a Matrix (LC 329) β β DFS + memoization on implicit DAG 17. Maximum Number of Fish in a Grid (LC 2658) β BFS/DFS component max 18. As Far from Land as Possible (LC 1162) β Multi-source BFS
Day 7: Complex Graph Synthesis 19. Sequence Reconstruction (LC 444) β Unique topo sort check 20. Check if There is a Valid Path in a Grid (LC 1391) β Direction-constrained BFS 21. Cheapest Flights Within K Stops (LC 787) β β Preview of Bellman-Ford / Dijkstra
11. Mock Interview Moduleβ
Problem: The Distributed Build Schedulerβ
Context: You are building the backend for a CI platform. Customers upload a list of microservices to compile along with dependencies. Building service A may require B and C to be built first. A single server can build one service at a time, each taking exactly 1 hour.
Step 1: Clarifying Questions & Expected Answersβ
| Candidate asks | Interviewer answers | Why it matters |
|---|---|---|
| "Can a service depend on itself?" | No | No self-loops |
| "Can there be circular dependencies?" | Yes, must detect them | Throw exception / return error |
| "Are service names unique?" | Yes | Safe to use as map keys |
| "Can services have zero dependencies?" | Yes | These are starting points (in-degree 0) |
| "What if multiple valid orders exist?" | Any valid order is acceptable | Kahn's naturally picks one |
Part 1: Sequential Build (Single Server)β
// Time: O(V + E), Space: O(V + E)
public List<String> getBuildSequence(String[] services, String[][] dependencies) {
Map<String, List<String>> adj = new HashMap<>();
Map<String, Integer> inDegree = new HashMap<>();
for (String s : services) {
adj.put(s, new ArrayList<>());
inDegree.put(s, 0);
}
// "A depends on B" β edge BβA, inDegree[A]++
for (String[] dep : dependencies) {
String target = dep[0], prereq = dep[1];
adj.get(prereq).add(target);
inDegree.merge(target, 1, Integer::sum);
}
Queue<String> queue = new ArrayDeque<>();
for (String s : services) {
if (inDegree.get(s) == 0) queue.offer(s);
}
List<String> buildOrder = new ArrayList<>();
while (!queue.isEmpty()) {
String curr = queue.poll();
buildOrder.add(curr);
for (String dependent : adj.get(curr)) {
int newDegree = inDegree.get(dependent) - 1;
inDegree.put(dependent, newDegree);
if (newDegree == 0) queue.offer(dependent);
}
}
if (buildOrder.size() != services.length)
throw new IllegalArgumentException("Circular dependency detected!");
return buildOrder;
}
Part 2: Parallel Build (Infinite Servers β Enterprise Tier)β
Interviewer: "What if the Enterprise tier provisions infinite servers? Services can compile in parallel when their dependencies are met. How many hours does the full build take?"
Key insight: "Minimum time = number of BFS levels = length of the critical path."
// Time: O(V + E), Space: O(V + E)
public int getParallelBuildTime(String[] services, String[][] dependencies) {
Map<String, List<String>> adj = new HashMap<>();
Map<String, Integer> inDegree = new HashMap<>();
for (String s : services) {
adj.put(s, new ArrayList<>());
inDegree.put(s, 0);
}
for (String[] dep : dependencies) {
adj.get(dep[1]).add(dep[0]);
inDegree.merge(dep[0], 1, Integer::sum);
}
Queue<String> queue = new ArrayDeque<>();
for (String s : services) {
if (inDegree.get(s) == 0) queue.offer(s);
}
int totalHours = 0;
int processed = 0;
while (!queue.isEmpty()) {
int levelSize = queue.size(); // all services that can run NOW
totalHours++; // one hour per level (parallel batch)
for (int i = 0; i < levelSize; i++) {
String curr = queue.poll();
processed++;
for (String dependent : adj.get(curr)) {
int newDegree = inDegree.get(dependent) - 1;
inDegree.put(dependent, newDegree);
if (newDegree == 0) queue.offer(dependent);
}
}
}
if (processed != services.length)
throw new IllegalArgumentException("Circular dependency detected!");
return totalHours;
}
Dry run for services = [A,B,C,D,E], dependencies: AβC, AβD, BβD, CβE, DβE`:
Edges (prereqβtarget): AβC, AβD, BβD, CβE, DβE
inDegree: A=0, B=0, C=1, D=2, E=2
Hour 1: queue=[A,B], levelSize=2 β process A,B β C(0βqueue), D(1)
Hour 2: queue=[C], levelSize=1 β process C β D(0βqueue), E(1)? No, CβE only
Wait β D was not yet 0 from A. AβD decrements D to 1. BβD decrements to 0.
Let me redo: poll A β C(0β
queue), D(1). Poll B β D(0β
queue).
After hour 1 processing: queue=[C, D]
Hour 2: queue=[C,D], levelSize=2 β process C (Eβ1), D (Eβ0β
queue)
After: queue=[E]
Hour 3: queue=[E], process E
Total: 3 hours β
(critical path: AβCβE or AβDβE)
Step 3: Follow-up Questionsβ
Q1: "What if each service has a different build time (not all 1 hour)? How do you find the minimum total time?"
Expected answer: "This becomes the critical path problem. I'd use topo sort + DP:
dp[node]= earliest time this service can FINISH.dp[node] = buildTime[node] + max(dp[prerequisite])for all prerequisites.- Process in topological order, take max of all
dpvalues as the answer."
// After topological sort, process in order:
int[] earliestFinish = new int[n];
for (int node : topoOrder) {
earliestFinish[node] = buildTime[node]; // start with own time
for (int prereq : prerequisites_of[node]) {
earliestFinish[node] = Math.max(earliestFinish[node],
buildTime[node] + earliestFinish[prereq]); // can't start until prereq finishes
}
}
return Arrays.stream(earliestFinish).max().getAsInt();
Q2: "What if the dependency graph is enormous (millions of services) and doesn't fit in memory?"
Expected answer: "Partition the graph by domain or team ownership, process subgraphs independently. Use a distributed message queue (like Kafka) where each service publishes a 'build complete' event that decrements the in-degree counter of dependents β this is exactly how real CI systems like Bazel and Pants implement incremental builds. The in-degree array becomes a distributed counter."
Q3: "How would you handle dynamic dependencies β a service that discovers new dependencies at build time?"
Expected answer: "This makes true topological sort impossible upfront. Real build systems handle this with: (1) conservative over-approximation of dependencies (declare all possible deps), or (2) restart the build plan when new dependencies are discovered mid-run, or (3) use file-level dependency tracking with a cache β if the output of B hasn't changed, skip rebuilding A even if B was 'rebuilt'. This is how Make and Bazel work."
12. Quick Reference: Directed Graph Idioms in Javaβ
// ββ GRAPH SETUP βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
// Integer nodes (0 to n-1) β preferred for performance
int n = numNodes;
List<List<Integer>> adj = new ArrayList<>();
int[] inDegree = new int[n];
for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
// String nodes β use HashMap
Map<String, List<String>> adj = new HashMap<>();
Map<String, Integer> inDegree = new HashMap<>();
for (String node : nodes) { adj.put(node, new ArrayList<>()); inDegree.put(node, 0); }
// ββ ADD DIRECTED EDGE u β v βββββββββββββββββββββββββββββββββββββββββββββββ
adj.get(u).add(v);
inDegree[v]++; // v gets one more prerequisite
// ββ KAHN'S SETUP βββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Queue<Integer> queue = new ArrayDeque<>();
for (int i = 0; i < n; i++) {
if (inDegree[i] == 0) queue.offer(i);
}
// ββ KAHN'S PROCESS ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
int processed = 0;
while (!queue.isEmpty()) {
int curr = queue.poll();
processed++;
for (int next : adj.get(curr)) {
if (--inDegree[next] == 0) queue.offer(next);
}
}
boolean hasCycle = (processed != n);
// ββ KAHN'S PARALLEL LEVELS ββββββββββββββββββββββββββββββββββββββββββββββββ
int levels = 0;
while (!queue.isEmpty()) {
int levelSize = queue.size();
levels++;
for (int i = 0; i < levelSize; i++) {
int curr = queue.poll();
for (int next : adj.get(curr)) {
if (--inDegree[next] == 0) queue.offer(next);
}
}
}
// ββ 3-STATE DFS βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
int[] state = new int[n]; // 0=unvisited, 1=visiting, 2=done
// Always loop over ALL nodes (disconnected components!)
for (int i = 0; i < n; i++) {
if (state[i] == 0) dfsCycle(i, adj, state);
}
// ββ IN-DEGREE SHORTCUT (Nodes with no incoming edges = sources) βββββββββββ
List<Integer> sources = new ArrayList<>();
for (int i = 0; i < n; i++) {
if (inDegree[i] == 0) sources.add(i);
}
// ββ COMMON ONE-LINER: decrement + conditional offer βββββββββββββββββββββββ
if (--inDegree[neighbor] == 0) queue.offer(neighbor);
// equivalent to:
// inDegree[neighbor]--;
// if (inDegree[neighbor] == 0) queue.offer(neighbor);