Union Find (Disjoint Set)

Union‑find, or Disjoint Set Union (DSU), is a data structure for maintaining a collection of disjoint sets under two operations:

  • find(x) – identify which set (component) x belongs to.
  • union(x, y) – merge the sets containing x and y.

It is extremely useful for dynamic connectivity problems in graphs and grids.

Internal Representation

Union‑find represents each set as a tree:

  • Each element x has a parent[x].
  • A root node satisfies parent[root] == root and is the representative of its component.

find(x) walks up parent pointers until it reaches a root.

Optimizations

Two classic optimizations make union‑find very fast in practice:

  • Path compression in find:
    • After finding the root for x, set parent[x] directly to that root.
    • Often done recursively: parent[x] = find(parent[x]).
  • Union by rank/size in union:
    • Always attach the smaller tree under the larger tree’s root.
    • Keeps trees shallow, improving future find performance.

With both optimizations, the amortized time per operation is effectively constant for realistic input sizes.

Typical API

Java
class UnionFind {
    private int[] parent;
    private int[] rank; // or size

    UnionFind(int n) {
        parent = new int[n];
        rank = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i;
            rank[i] = 0;
        }
    }

    int find(int x) {
        if (parent[x] != x) {
            parent[x] = find(parent[x]); // path compression
        }
        return parent[x];
    }

    void union(int x, int y) {
        int rx = find(x), ry = find(y);
        if (rx == ry) return;
        if (rank[rx] < rank[ry]) {
            parent[rx] = ry;
        } else if (rank[rx] > rank[ry]) {
            parent[ry] = rx;
        } else {
            parent[ry] = rx;
            rank[rx]++;
        }
    }
}

Use Cases

Union‑find is ideal for:

  • Counting connected components in an undirected graph.
  • Detecting cycles when incrementally adding edges.
  • Grid connectivity problems (e.g. number of islands with online operations).
  • Building Minimum Spanning Trees with Kruskal’s algorithm.

Whenever you need to merge groups and ask “are these two elements in the same group?”, union‑find should be top of mind.