Union Find (Disjoint Set)
Union‑find, or Disjoint Set Union (DSU), is a data structure for maintaining a collection of disjoint sets under two operations:
find(x)– identify which set (component)xbelongs to.union(x, y)– merge the sets containingxandy.
It is extremely useful for dynamic connectivity problems in graphs and grids.
Internal Representation
Union‑find represents each set as a tree:
- Each element
xhas aparent[x]. - A root node satisfies
parent[root] == rootand is the representative of its component.
find(x) walks up parent pointers until it reaches a root.
Optimizations
Two classic optimizations make union‑find very fast in practice:
- Path compression in
find:- After finding the root for
x, setparent[x]directly to that root. - Often done recursively:
parent[x] = find(parent[x]).
- After finding the root for
- Union by rank/size in
union:- Always attach the smaller tree under the larger tree’s root.
- Keeps trees shallow, improving future
findperformance.
With both optimizations, the amortized time per operation is effectively constant for realistic input sizes.
Typical API
Javaclass UnionFind { private int[] parent; private int[] rank; // or size UnionFind(int n) { parent = new int[n]; rank = new int[n]; for (int i = 0; i < n; i++) { parent[i] = i; rank[i] = 0; } } int find(int x) { if (parent[x] != x) { parent[x] = find(parent[x]); // path compression } return parent[x]; } void union(int x, int y) { int rx = find(x), ry = find(y); if (rx == ry) return; if (rank[rx] < rank[ry]) { parent[rx] = ry; } else if (rank[rx] > rank[ry]) { parent[ry] = rx; } else { parent[ry] = rx; rank[rx]++; } } }
Use Cases
Union‑find is ideal for:
- Counting connected components in an undirected graph.
- Detecting cycles when incrementally adding edges.
- Grid connectivity problems (e.g. number of islands with online operations).
- Building Minimum Spanning Trees with Kruskal’s algorithm.
Whenever you need to merge groups and ask “are these two elements in the same group?”, union‑find should be top of mind.