Mark As Completed Discussion

Union-Find, Explained Simply

Union-Find (also called Disjoint Set Union, DSU) is a tiny toolkit for tracking which items belong together. Think of it as a label maker for groups: it can quickly tell you whether two people are in the same “friend circle,” and it can glue two circles together.


What We’re Managing: Disjoint Sets

Imagine n people split into separate circles. Two rules:

  1. No overlap. Each person is in exactly one circle. If Sarah is in S1, she isn’t in S2.
  2. Two questions we care about.

    • Find: “Which circle is Sarah in?”
    • Union: “Can we merge S1 and S2 into one circle?”

Example circles:

  • S1 = {Alice, Bob, Carol}
  • S2 = {Dave, Eve}
  • S3 = {Grace, Helen, Frank}

They don’t overlap, so they’re perfect for Union-Find.


The Two Moves

  1. Find(x) Returns the representative (the label) of x’s circle. Example: Find(Bob) → S1.

  2. Union(a, b) Merges the circles that contain a and b. Example: Union(S1, S2){Alice, Bob, Carol, Dave, Eve} becomes one circle.

That’s the whole API.


How It Works Under the Hood

Internally, each circle is a tiny tree:

  • Every person points to a “parent.”
  • The root of the tree is the circle’s representative.

So:

  • Find(x) walks parent pointers until it reaches the root (the label of x’s circle).
  • Union(a, b) connects one root under the other, turning two trees into one.

Speed Tricks (Why It’s So Fast)

  1. Path Compression (during Find) After we climb to the root, we flatten the path: everyone we touched points directly to the root. Future Finds get almost instant.

  2. Union by Rank/Size Always attach the smaller tree under the larger tree’s root. This keeps trees shallow, which keeps Finds quick.

With both tricks, operations are effectively near O(1) on average—even for huge datasets.


Quick Mental Model

  • Label maker: Find reads the label; Union combines labels.
  • Family tree: the root is the “head of the family.” Find walks up to the head; Union marries two families by making one head point to the other.
  • Autotune: path compression and union-by-size keep the trees short so everything stays fast.