Algorithms, Part 1 Course from Princeton — Union-find and Week 1 assignment Percolation

Algorithms, Part 1 is a solid course for IT or software engineers to learn algorithms. Union-Find is a data structure for recording whether

Jen-Hsuan Hsieh (Sean)

A Layman

· ~6 min read · February 11, 2023 (Updated: April 5, 2023) · Free: No

Introduction

Algorithms, Part 1, offered by Princeton, is a solid course for IT or software engineers who want to learn algorithms. Of course, we still have to take time to clarify the concepts after completing the class.

Union-Find is a data structure for recording whether points are in the same set. In this article, we will focus on Union-Find and the algorithms that improve on it.

About this Series

This series aims to summarize the contents of Algorithms, Part 1.

week 1: this article
week 2: Stack, Queue, and Week 2 Assignment Deques and Randomized Queues
week 3: Sorting and Week 3 assignment Collinear Points
week 4: Binary Tree, Binary Heaps, and Week 4 Assignment 8 puzzle
week 5: Hash table and Week 5 Assignment KD-Tree

Agenda

This article covers the following topics.

Analysis of algorithms
Algorithm design approach
Union-Find
Improvement for quick union 1: weighting
Improvement for quick union 2: path compression
Related Leetcode questions
Week 1 Programming Assignment: Percolation

1. Analysis of algorithms

Example — 2 sum

Approximately how many array accesses as a function of input size N?

Bottom line: use a cost model and tilde notation to simplify counts

Simplification 1: cost model

Use some basic operation as a proxy for running time

Simplification 2: tilde notation

Estimate running time (or memory) as a function of input size N
Ignore lower-order terms:
- When N is large, they are negligible
- When N is small, we don't care
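
To make the two simplifications concrete, here is a minimal sketch of brute-force 2-sum that uses array accesses as the cost model. The class name `TwoSumCount` and the `accesses` counter are made up for illustration. The inner comparison runs N(N - 1)/2 times and reads two array entries each time, so the count is N(N - 1), which is ~N^2 in tilde notation.

```java
public class TwoSumCount {
    // Number of array accesses made by the most recent call to count().
    static long accesses;

    // Brute-force 2-sum: count pairs (i, j) with i < j and a[i] + a[j] == 0.
    static int count(int[] a) {
        accesses = 0;
        int n = a.length;
        int pairs = 0;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                accesses += 2;              // the comparison reads a[i] and a[j]
                if (a[i] + a[j] == 0) {
                    pairs++;
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        int[] a = {30, -40, -20, -10, 40, 0, 10, 5};
        int pairs = count(a);
        // N = 8, so the cost model predicts N * (N - 1) = 56 array accesses.
        System.out.println(pairs + " pairs, " + accesses + " array accesses");
    }
}
```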

Common math formulas

1 + 2 + ... + N = 1/2 * N * (N + 1) ~ 1/2 * N ^ 2
1^k + 2^k + ... + N^k ~ 1/(k + 1) * N ^ (k + 1)
1 + 1/2 + ... + 1/N ~ ln N
Triple loops (i <= j <= k <= N) ~ 1/6 * N ^ 3
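
One way to see where these estimates come from is to replace each sum with the corresponding integral. The block below is a sketch of that reasoning, not a rigorous proof; the index names i, j, k and the integration variables x, y, z are chosen here for illustration.

```latex
\begin{align*}
\sum_{i=1}^{N} i^{k} &\sim \int_{1}^{N} x^{k}\,dx \sim \frac{N^{k+1}}{k+1} \\
\sum_{i=1}^{N} \frac{1}{i} &\sim \int_{1}^{N} \frac{1}{x}\,dx = \ln N \\
\sum_{i=1}^{N}\sum_{j=i}^{N}\sum_{k=j}^{N} 1
  &\sim \int_{1}^{N}\!\int_{x}^{N}\!\int_{y}^{N} dz\,dy\,dx \sim \frac{N^{3}}{6}
\end{align*}
```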

2. Algorithm design approach

Steps to develop a usable algorithm

Optimal algorithm

Lower bound equals upper bound (to within a constant factor)

e.g. 1: The brute-force algorithm for 1-sum is optimal: its running time is proportional to N
e.g. 2: Merge sort is an optimal algorithm (counting compares) - upper bound: ~N log N - lower bound: ~N log N

Approach

Develop an algorithm
Prove a lower bound
If there is a gap between the lower bound and the upper bound, lower the upper bound (discover a new algorithm) or raise the lower bound (more difficult)

3. Union-Find

Goal

Design an efficient data structure for union-find

Number of objects N can be huge
Number of operations M can be huge
Find queries and union commands may be intermixed

Quick find (eager approach)

Integer array id[] of length N
Interpretation - p and q are connected iff (if and only if) they have the same id

Java implementation

union too expensive (N array accesses)
trees are flat, but too expensive to keep them flat
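
As a minimal sketch of the quick-find idea described above (the class name `QuickFindSketch` is made up for illustration), `connected` is a single comparison, while `union` has to scan the whole array to relabel one component:

```java
public class QuickFindSketch {
    private final int[] id;   // id[i] = component identifier of object i

    public QuickFindSketch(int n) {
        id = new int[n];
        for (int i = 0; i < n; i++) {
            id[i] = i;        // every object starts in its own component
        }
    }

    // p and q are connected iff they have the same id.
    public boolean connected(int p, int q) {
        return id[p] == id[q];
    }

    // Relabel every entry equal to id[p] with id[q]: a full pass over the array.
    public void union(int p, int q) {
        int pid = id[p];
        int qid = id[q];
        if (pid == qid) {
            return;
        }
        for (int i = 0; i < id.length; i++) {
            if (id[i] == pid) {
                id[i] = qid;
            }
        }
    }

    public static void main(String[] args) {
        QuickFindSketch uf = new QuickFindSketch(10);
        uf.union(4, 3);
        uf.union(3, 8);
        System.out.println(uf.connected(4, 8));   // true
        System.out.println(uf.connected(0, 1));   // false
    }
}
```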

Quick union (lazy approach)

Integer array id[] of length N
Interpretation - id[i] is the parent of i - Root of i is id[id[id[...id[i]...]]] (keep going until it doesn't change)

Java implementation

tree can get tall
find too expensive (could be N array accesses)
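
A minimal sketch of quick-union along the same lines (the class name `QuickUnionSketch` is made up for illustration). Here `union` changes a single array entry, but both operations must first chase parent pointers up to the root, which is where a tall tree hurts:

```java
public class QuickUnionSketch {
    private final int[] id;   // id[i] = parent of i; a root is its own parent

    public QuickUnionSketch(int n) {
        id = new int[n];
        for (int i = 0; i < n; i++) {
            id[i] = i;
        }
    }

    // Follow parent pointers until reaching a node that is its own parent.
    private int root(int i) {
        while (i != id[i]) {
            i = id[i];
        }
        return i;
    }

    public boolean connected(int p, int q) {
        return root(p) == root(q);
    }

    // Make the root of p point to the root of q.
    public void union(int p, int q) {
        int i = root(p);
        int j = root(q);
        id[i] = j;
    }

    public static void main(String[] args) {
        QuickUnionSketch uf = new QuickUnionSketch(10);
        uf.union(4, 3);
        uf.union(3, 8);
        System.out.println(uf.connected(4, 8));   // true
    }
}
```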

Quick find vs Quick union

4. Improvement for quick union 1: weighting

Purpose

Avoid tall trees
Keep track of size of each tree (number of objects)
Balance by linking root of smaller tree to root of larger tree

Proposition

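Depth of any node x is at most lg N (base-2 logarithm)

Why?

The depth of x increases by 1 only when the tree T1 containing x is merged into another tree T2
Because T1 is the smaller tree, the size of the tree containing x at least doubles in that merge
The size of the tree containing x can double at most lg N times, since no tree has more than N nodes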

Java implementation

link root of smaller tree to root of larger tree
update the sz[] array
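
The two steps above can be sketched as follows. The class name `WeightedQuickUnionSketch` and the `depth` helper are made up for illustration; `depth` exists only so the lg N bound can be checked, and is not part of the union-find API.

```java
public class WeightedQuickUnionSketch {
    private final int[] id;   // id[i] = parent of i
    private final int[] sz;   // sz[r] = number of objects in the tree rooted at r

    public WeightedQuickUnionSketch(int n) {
        id = new int[n];
        sz = new int[n];
        for (int i = 0; i < n; i++) {
            id[i] = i;
            sz[i] = 1;
        }
    }

    private int root(int i) {
        while (i != id[i]) {
            i = id[i];
        }
        return i;
    }

    public boolean connected(int p, int q) {
        return root(p) == root(q);
    }

    // Link the root of the smaller tree to the root of the larger tree,
    // then update the size stored at the surviving root.
    public void union(int p, int q) {
        int i = root(p);
        int j = root(q);
        if (i == j) {
            return;
        }
        if (sz[i] < sz[j]) {
            id[i] = j;
            sz[j] += sz[i];
        } else {
            id[j] = i;
            sz[i] += sz[j];
        }
    }

    // Hypothetical helper: number of links from i up to its root.
    public int depth(int i) {
        int d = 0;
        while (i != id[i]) {
            i = id[i];
            d++;
        }
        return d;
    }
}
```

With N = 8 objects, no sequence of unions should push any `depth(i)` above lg 8 = 3.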

Proposition of Quick find, Quick union, and weighted quick-union

5. Improvement for quick union 2: path compression

Purpose

Flatten the tree
Just after computing the root of p, set the id of each examined node to point to that root

Java implementation
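
A minimal sketch of weighted quick-union with path compression, assuming the two-pass variant described under Purpose: the first loop finds the root, the second repoints every examined node at it. The class name `WqupcSketch` is made up for illustration.

```java
public class WqupcSketch {
    private final int[] id;   // id[i] = parent of i
    private final int[] sz;   // sz[r] = number of objects in the tree rooted at r

    public WqupcSketch(int n) {
        id = new int[n];
        sz = new int[n];
        for (int i = 0; i < n; i++) {
            id[i] = i;
            sz[i] = 1;
        }
    }

    private int root(int i) {
        // Pass 1: find the root.
        int r = i;
        while (r != id[r]) {
            r = id[r];
        }
        // Pass 2: point every node on the path directly at the root.
        while (i != r) {
            int next = id[i];
            id[i] = r;
            i = next;
        }
        return r;
    }

    public boolean connected(int p, int q) {
        return root(p) == root(q);
    }

    public void union(int p, int q) {
        int i = root(p);
        int j = root(q);
        if (i == j) {
            return;
        }
        if (sz[i] < sz[j]) {
            id[i] = j;
            sz[j] += sz[i];
        } else {
            id[j] = i;
            sz[i] += sz[j];
        }
    }
}
```

A simpler one-pass alternative is to add `id[i] = id[id[i]];` inside the root-finding loop, which makes every other node on the path point to its grandparent and roughly halves the path length.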

Proposition: Bottom line

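For 10^9 unions and finds on 10^9 objects, weighted quick-union with path compression (WQUPC) reduces the running time from 30 years to 6 seconds.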

Week 1 Programming Assignment: Percolation

A system percolates if open sites connect the top row to the bottom row. So what is the percolation threshold? It is estimated as the fraction of open sites (number of open sites / total number of sites) at the moment the system first percolates.
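
To show how union-find fits this problem, here is a rough, self-contained sketch of one Monte Carlo trial. It is not the graded solution: the class name `PercolationSketch`, the 0-based row and column indices, the two virtual sites `top` and `bottom`, and the inlined weighted union-find are all assumptions made for illustration.

```java
import java.util.Random;

public class PercolationSketch {
    private final int n;
    private final boolean[] isOpen;   // isOpen[row * n + col]
    private final int[] id;           // union-find parents, plus two virtual sites
    private final int[] sz;
    private final int top;            // virtual site linked to every open top-row site
    private final int bottom;         // virtual site linked to every open bottom-row site
    private int openCount;

    public PercolationSketch(int n) {
        this.n = n;
        isOpen = new boolean[n * n];
        id = new int[n * n + 2];
        sz = new int[n * n + 2];
        for (int i = 0; i < id.length; i++) {
            id[i] = i;
            sz[i] = 1;
        }
        top = n * n;
        bottom = n * n + 1;
    }

    private int root(int i) {
        while (i != id[i]) {
            id[i] = id[id[i]];        // one-pass path compression
            i = id[i];
        }
        return i;
    }

    private void union(int p, int q) {
        int i = root(p);
        int j = root(q);
        if (i == j) return;
        if (sz[i] < sz[j]) { id[i] = j; sz[j] += sz[i]; }
        else               { id[j] = i; sz[i] += sz[j]; }
    }

    // Open the site at (row, col), 0-based, and connect it to open neighbours.
    public void open(int row, int col) {
        int s = row * n + col;
        if (isOpen[s]) return;
        isOpen[s] = true;
        openCount++;
        if (row == 0) union(s, top);
        if (row == n - 1) union(s, bottom);
        if (row > 0 && isOpen[s - n]) union(s, s - n);
        if (row < n - 1 && isOpen[s + n]) union(s, s + n);
        if (col > 0 && isOpen[s - 1]) union(s, s - 1);
        if (col < n - 1 && isOpen[s + 1]) union(s, s + 1);
    }

    public boolean percolates() {
        return root(top) == root(bottom);
    }

    public int numberOfOpenSites() {
        return openCount;
    }

    // One trial: open random sites until the grid percolates,
    // then return open sites / total sites.
    public static double trial(int n, Random rnd) {
        PercolationSketch p = new PercolationSketch(n);
        while (!p.percolates()) {
            p.open(rnd.nextInt(n), rnd.nextInt(n));
        }
        return (double) p.numberOfOpenSites() / (n * n);
    }

    public static void main(String[] args) {
        Random rnd = new Random();
        int trials = 30;
        double sum = 0;
        for (int t = 0; t < trials; t++) {
            sum += trial(50, rnd);
        }
        System.out.println("estimated threshold: " + sum / trials);
    }
}
```

Averaging `trial` over many runs gives the estimate; the two virtual sites reduce the percolation check to a single `connected`-style query instead of comparing every top site with every bottom site.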

Requirements in detail

Programming Assignment 1: Percolation

Write a program to estimate the value of the percolation threshold via Monte Carlo simulation. Install our Java…

princeton.edu

Solutions

Percolation.java

PercolationStats.java

Grades

References

Summary

Thanks for your patience. I am Sean. I work as a software engineer.

This article is my note. Please feel free to point out any mistakes. I am looking forward to your feedback.

