Introduction

Algorithms, Part 1 is a solid course for IT or software engineers to learn algorithms which are provided by Princeton. Of course, we still have to take time to clarify the concepts after completing the class.

Union-Find is a data structure for recording whether points are in the same set. In this article, we will focus on Union-find and its improvements algorithms.

About this Series

This series aims to wrap up contents of Algorithms, Part 1.

Agenda

It includes the following topics in this article.

1. Analysis of algorithms

Example — 2 sum

Approximately how many array accesses as a function of input size N?

Bottom line: use cost model and tilde notation to simplify counts

Simplification 1: cost model

Use some basic operation as a proxy for running time

Simplification 2: tilde notation

  • Estimate running time (or memory) as a function of input size N
  • Ignore lower order terms - When N is large, terms are negligible - When N is small, we don't care

Common math formulas

  • 1 + 2 + ... + N = 1/2 * N * (1 + N)
  • 1^k + 2^k + ... + N^k = 1/(k + 1) * N ^ (k + 1)
  • 1 + 1/2 + ... + 1/N= Log N
  • Triple loops = 1/6 * N ^ 3

2. Algorithm design approach

Steps to develop a usable algorithm

None

Optimal algorithm

Lower bound equals to upper bound (to within a constant factor)

  • e.g 1., Brute-force algorithm for -sum is optimal: it's running time is N
  • e.g 2., Merge sort is an optimal algorithm - upper bound: ~N log N - lower bound: ~N log N

Approach

  1. Develop an algorithm
  2. Prove a lower bound
  3. If this a gap between lower bound and the upper bound, lower the upper bound (discover a new algorithm) or raise the lower bound (more difficult)

3. Union-Find

Goal

Design efficient data structure for union-find

  • Number of objects N can be huge
  • Number of operations M can be huge
  • Find queries and union commands may be intermixed

Quick find (eager approach)

  • Integer array id of N
  • Interpretation - p and q are connected iff(if and only if) they have the same id
None

Java implementation

  • union too expensive (N array accesses)
  • trees are flat, but too expensive to keep them flat

Quick union (lazy approach)

  • Integer array id of N
  • Interpretation - id[i] is the parent of i - Root of i is id[id[id[...id[i]...]]]
None

Java implementation

  • tree can get tall
  • find too expensive (could be N array accesses)

Quick find vs Quick union

4. Improvement for quick union 1: weighting

Purpose

  • Avoid tall trees
  • Keep track of size of each tree (number of objects)
  • Balance by linking root of smaller tree to root of larger tree
None

Proposition

Depth of any node x is at most log N

Why?

  • The depth of x increase 1/the size of the tree at least doubles when tree T1 containing x is merged into another tree T2
  • The size of tree containing x can double at most log N times

Java implementation

  • link root of smaller tree to root of larger tree
  • update the sz[] array

Proposition of Quick find, Quick union, and weighted quick-union

5. Improvement for quick union 2: path compression

Purpose

  • Flatten the tree
  • Just after computing the root of p, set the id of each examined node to point to that root
None

Java implementation

Proposition: Bottom line

WQUPC reduces from 30 years to 6 seconds.

Related Leetcode questions

Week 1 Programming Assignment: Percolation

It's called percolation if the opened sited were connected from the top to the bottom. So what's the probability? (The numbers of opened sites/The numbers of total sites)

Requirements in detail

Solutions

  • Percolation.java
  • PercolationStats.java

Grades

None

References

Summary

Thanks for your patient. I am Sean. I work as a software engineer.

This article is my note. Please feel free to give me advice if any mistakes. I am looking forward to your feedback.

  • Subscribe me
  • The Facebook page for articles
  • The latest side project: Daily Learning

Related topics

How to use the two-way binding in Knout.js and ReactJS?

Learn how to use SignalR to build a chatroom application

My reflection of <Effective SQL>:

IT & Network:

Database: