CSE 331: Review
August 1, 2013

Main Steps in Algorithm Design
    Problem Statement: real-world problem
    Problem Definition: precise mathematical definition
    Algorithm
    Implementation: data structures
    Analysis: correctness / run time

Stable Matching Problem
Gale-Shapley Algorithm

Stable Marriage problem
    Set of men M and women W
    Preferences (ranking of potential spouses)
    Matching (no polygamy in M x W)
    Perfect matching (everyone gets married)
    Instability
    [Figure: matched pairs of men and women illustrating an instability]
    Stable matching = perfect matching + no instability

Gale-Shapley Algorithm

Initially all men and women are free
While there exists a free woman who can propose        (at most $n^2$ iterations)
    Let w be such a woman and m be the best man she has not proposed to
    w proposes to m
    If m is free
        (m, w) get engaged
    Else (m, w') are engaged
        If m prefers w' to w
            w remains free
        Else
            (m, w) get engaged and w' is free
Output the engaged pairs as the final output

Each iteration has an O(1) time implementation.

GS algorithm: Firefly Edition
[Figure: example run with Mal, Wash, Simon, Inara, Zoe, and Kaylee]
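The Gale-Shapley loop above can be sketched in Python as follows. This is a minimal sketch, not the course's reference implementation: the function name `gale_shapley` and the dictionary inputs are hypothetical, and it assumes n women and n men with complete preference lists, so a free woman always has a man left to propose to.

```python
from collections import deque

def gale_shapley(women_prefs, men_prefs):
    """women_prefs[w] / men_prefs[m]: complete preference lists, best first (assumed input format)."""
    # rank[m][w] = position of w in m's list, so "m prefers w' to w" is an O(1) lookup
    rank = {m: {w: i for i, w in enumerate(prefs)} for m, prefs in men_prefs.items()}
    next_choice = {w: 0 for w in women_prefs}   # index of the next man w will propose to
    fiancee = {}                                # m -> woman currently engaged to m
    free_women = deque(women_prefs)
    while free_women:
        w = free_women.popleft()
        m = women_prefs[w][next_choice[w]]      # best man w has not proposed to yet
        next_choice[w] += 1
        if m not in fiancee:                    # m is free
            fiancee[m] = w
        else:
            w_prime = fiancee[m]
            if rank[m][w_prime] < rank[m][w]:   # m prefers w' to w: w remains free
                free_women.append(w)
            else:                               # (m, w) get engaged and w' is free
                fiancee[m] = w
                free_women.append(w_prime)
    return {(m, w) for m, w in fiancee.items()}
```

Precomputing `rank` is one way to get the O(1) time per iteration; other data structures would work as well.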

GS algo outputs a stable matching
    Lemma 1: GS outputs a perfect matching S
    Lemma 2: S has no instability

Proof technique du jour
    Proof by contradiction
    Assume the negation of what you want to prove
    After some reasoning, arrive at a contradiction
    [Image source: 4simpsons.wordpress.com]

Two observations
    Obs 1: Once m is engaged he keeps getting engaged to "better" women
    Obs 2: If w proposes to m' first and then to m (or never proposes to m) then she prefers m' to m

Proof of Lemma 2
    By contradiction: assume there is an instability (m, w'), where S matches m to w and m' to w'
        m prefers w' to w
        w' prefers m to m'
    w' last proposed to m'

Contradiction by Case Analysis
Depending on whether w' had proposed to m or not
    Case 1: w' never proposed to m
        w' prefers m' to m (by Obs 2)
        This contradicts the assumption that w' prefers m to m'
    Case 2: w' had proposed to m
        Case 2.1: m had accepted the proposal from w'
            m is finally engaged to w
            Thus, m prefers w to w' (by Obs 1), a contradiction
        Case 2.2: m had rejected the proposal from w'
            m was engaged to w'' (prefers w'' to w')
            m is finally engaged to w (prefers w to w'' unless w = w'', by Obs 1)
            Thus, m prefers w to w', a contradiction

Overall structure of case analysis
    Did w' propose to m?
        Did m accept the proposal from w'?

Graph Searching
BFS/DFS

O(m+n) BFS Implementation
Input graph as adjacency list; CC is an array; each layer $L_i$ is a linked list

BFS(s)
    CC[s] = T and CC[w] = F for every $w \ne s$
    Set i = 0
    Set $L_0$ = {s}
    While $L_i$ is not empty
        $L_{i+1} = \emptyset$
        For every u in $L_i$
            For every edge (u,w)
                If CC[w] = F then
                    CC[w] = T
                    Add w to $L_{i+1}$
        i++

Version in KT also computes a BFS tree

An illustration
[Figure: BFS run on a graph with nodes 1 to 8]
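The layer-by-layer BFS above can be sketched in Python as follows. This is a minimal sketch under assumptions: the name `bfs_layers` is hypothetical, the graph is a dict from each vertex to its neighbor list (every neighbor must also be a key), and Python lists stand in for the linked lists.

```python
def bfs_layers(adj, s):
    """adj: dict vertex -> list of neighbors (assumed format). Returns the BFS layers from s."""
    discovered = {v: False for v in adj}    # plays the role of the CC array
    discovered[s] = True
    layers = [[s]]                          # layers[i] is L_i
    while layers[-1]:                       # while L_i is not empty
        next_layer = []
        for u in layers[-1]:
            for w in adj[u]:                # every edge (u, w)
                if not discovered[w]:
                    discovered[w] = True
                    next_layer.append(w)
        layers.append(next_layer)
    layers.pop()                            # drop the trailing empty layer
    return layers
```

Each vertex enters exactly one layer and each adjacency list is scanned once, which matches the O(m+n) bound.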

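O(m+n) DFS implementation

BFS(s)
    CC[s] = T and CC[w] = F for every $w \ne s$        O(n)
    Initialize Q = {s}                                  O(1)
    While Q is not empty                                repeated at most once for each vertex u
        Delete the front element u in Q                 O(1)
        For every edge (u,w)                            repeated $n_u$ times
            If CC[w] = F then                           O(1)
                CC[w] = T
                Add w to the back of Q

Here $n_u$ is the number of edges incident to u. One iteration of the while loop for vertex u takes $O(n_u)$ time, so the loop takes $\sum_u O(n_u) = O(\sum_u n_u) = O(m)$ time overall.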
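Replacing the queue with an explicit stack gives a depth-first order with the same O(m+n) accounting. The sketch below is an illustration under assumptions: the name `dfs_order` is hypothetical, and it marks a vertex when it is popped rather than when it is pushed (a choice made here so that the visit order matches recursive DFS).

```python
def dfs_order(adj, s):
    """adj: dict vertex -> list of neighbors (assumed format). Returns vertices in DFS visit order."""
    visited = set()
    order = []
    stack = [s]
    while stack:
        u = stack.pop()
        if u in visited:
            continue                # u was pushed more than once; skip the stale copy
        visited.add(u)
        order.append(u)
        # push neighbors in reverse so the first-listed neighbor is explored first
        for w in reversed(adj[u]):
            if w not in visited:
                stack.append(w)
    return order
```

A vertex can sit on the stack several times, but at most once per incident edge, so the total work is still O(m+n).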

A DFS run using an explicit stack
[Figure: DFS run on a graph with nodes 1 to 8]

Topological Ordering
Run of TopOrd algorithm

Greedy Algorithms

Interval Scheduling: Maximum Number of Intervals
Schedule by Finish Time

End of Semester blues
Can only do one thing on any day: what is the maximum number of tasks that you can do?
[Figure: tasks (Write up a term paper, Party!, Exam study, 331 HW, Project) shown as intervals over Monday to Friday]

Schedule by Finish Time
    Set A to be the empty set
    While R is not empty
        Choose i in R with the earliest finish time
        Add i to A
        Remove all requests that conflict with i from R
    Return A* = A

    O(n log n) time: sort intervals such that $f(i) \le f(i+1)$
    O(n) time: build array s[1..n] s.t. s[i] = start time for i
    Do the removal on the fly

The final algorithm
Order tasks by their END time
[Figure: the same tasks over Monday to Friday]
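The finish-time greedy with on-the-fly removal can be sketched in Python as follows. This is a minimal sketch under assumptions: the name `schedule_by_finish_time` is hypothetical, intervals are (start, finish) pairs, and two intervals that only touch at an endpoint are treated as compatible.

```python
def schedule_by_finish_time(intervals):
    """intervals: list of (start, finish) pairs (assumed format). Returns the chosen intervals."""
    chosen = []
    last_finish = float("-inf")
    # O(n log n): consider intervals in order of finish time
    for start, finish in sorted(intervals, key=lambda iv: iv[1]):
        # "removal on the fly": skip anything that conflicts with the last chosen interval
        if start >= last_finish:
            chosen.append((start, finish))
            last_finish = finish
    return chosen
```

After the sort, a single O(n) pass suffices because a conflicting interval can simply be skipped when it is reached.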

Proof of correctness uses "greedy stays ahead"

Scheduling to minimize lateness
All the tasks have to be scheduled
GOAL: minimize maximum lateness
[Figure: tasks (Write up a term paper, Exam study, Party!, 331 HW, Project) over Monday to Friday]

The Greedy Algorithm
(Assume jobs sorted by deadline: $d_1 \le d_2 \le \dots \le d_n$)
    f = s
    For every i in 1..n do
        Schedule job i from $s(i) = f$ to $f(i) = f + t_i$
        $f = f + t_i$

Proof of Correctness uses Exchange argument
Proved the following
    Any two schedules with 0 idle time and 0 inversions have the same max lateness
    Greedy schedule has 0 idle time and 0 inversions

    There is an optimal schedule with 0 idle time and 0 inversions

Shortest Path in a Graph: nonnegative edge weights
Dijkstra's Algorithm

Shortest Path problem
    Input: Directed graph G=(V,E)
           Edge lengths, $l_e$ for e in E
           "start" vertex s in V
    Output: All shortest paths from s to all nodes in V
    [Figure: example graph on vertices s, u, w with edge lengths 100, 5, 15]

Dijkstra's shortest path algorithm
    $d'(w) = \min_{e=(u,w) \in E,\ u \in R} \left( d(u) + l_e \right)$
    [Figure: example run on a graph with nodes s, u, w, x, y, z]
    d(s) = 0, d(u) = 1, d(w) = 2, d(x) = 2, d(y) = 3, d(z) = 4

    Input: Directed G=(V,E), $l_e \ge 0$, s in V
    R = {s}, d(s) = 0
    While there is an x not in R with (u,x) in E, u in R
        Pick w that minimizes d'(w)
        Add w to R
        d(w) = d'(w)
    [Figure: the shortest paths found in the example]
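The loop above can be sketched in Python with a binary heap. This is an illustration under assumptions: the name `dijkstra` and the adjacency format are hypothetical, and the heap (not part of the pseudocode above) stands in for "pick w that minimizes d'(w)".

```python
import heapq

def dijkstra(adj, s):
    """adj[u] = list of (w, length) with length >= 0 (assumed format). Returns d for reachable vertices."""
    d = {}                          # d[v] is set once, when v is added to R
    heap = [(0, s)]                 # candidate (d'(v), v) pairs
    while heap:
        dist, w = heapq.heappop(heap)
        if w in d:
            continue                # stale candidate: w is already in R
        d[w] = dist                 # d(w) = d'(w)
        for x, length in adj.get(w, []):
            if x not in d:
                heapq.heappush(heap, (dist + length, x))
    return d
```

Each edge pushes at most one heap entry, so this sketch runs in O(m log n) time.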

Dijkstra's shortest path algorithm (formal)
    Input: Directed G=(V,E), $l_e \ge 0$, s in V
    S = {s}, d(s) = 0
    While there is a v not in S with (u,v) in E, u in S        (at most n iterations)
        Pick w that minimizes d'(w)                             (O(m) time)
        Add w to S
        d(w) = d'(w)

    O(mn) time bound is trivial
    O(m log n) time implementation is possible
    Proved that d'(v) is best when v is added

Minimum Spanning Tree
Kruskal/Prim

Minimum Spanning Tree (MST)
    Input: A connected graph G=(V,E), $c_e > 0$ for every e in E
    Output: A tree containing all V that minimizes the sum of edge weights

Kruskal's Algorithm
[Photo: Joseph B. Kruskal]
    Input: G=(V,E), $c_e > 0$ for every e in E
    $T = \emptyset$
    Sort edges in increasing order of their cost
    Consider edges in sorted order
        If an edge can be added to T without adding a cycle then add it to T
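Kruskal's algorithm can be sketched in Python as follows. This is a minimal sketch under assumptions: the name `kruskal` and the edge format are hypothetical, vertices are labeled 0..n-1, and a simple union-find structure (not described above) is used for the cycle check.

```python
def kruskal(n, edges):
    """n vertices labeled 0..n-1; edges: list of (cost, u, v) (assumed format). Returns the tree edges."""
    parent = list(range(n))         # union-find forest: parent[x] == x marks a root

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving keeps the trees shallow
            x = parent[x]
        return x

    tree = []
    for cost, u, v in sorted(edges):        # increasing order of cost
        root_u, root_v = find(u), find(v)
        if root_u != root_v:                # (u, v) joins two components: no cycle
            parent[root_u] = root_v
            tree.append((cost, u, v))
    return tree
```

An edge closes a cycle exactly when both endpoints already lie in the same component, which is what the two `find` calls test.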

Prim's algorithm
[Photo: Robert Prim]
Similar to Dijkstra's algorithm
[Figure: example graph with edge costs 0.5, 1, 2, 3, 50, 51]
    Input: G=(V,E), $c_e > 0$ for every e in E
    S = {s}, $T = \emptyset$
    While S is not the same as V
        Among edges e = (u,w) with u in S and w not in S, pick one with minimum cost
        Add w to S, e to T
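Prim's algorithm can be sketched in Python with a heap of crossing edges. This is an illustration under assumptions: the name `prim` and the adjacency format are hypothetical, the graph is undirected and connected, and vertex labels are mutually comparable (the heap compares them on cost ties).

```python
import heapq

def prim(adj, s):
    """adj[u] = list of (cost, w), with each undirected edge listed at both endpoints (assumed format)."""
    in_tree = {s}                                   # the set S
    tree = []                                       # the set T, as (cost, u, w) triples
    heap = [(cost, s, w) for cost, w in adj[s]]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(adj):
        cost, u, w = heapq.heappop(heap)
        if w in in_tree:
            continue                                # edge no longer crosses from S to V\S
        in_tree.add(w)
        tree.append((cost, u, w))
        for c, x in adj[w]:
            if x not in in_tree:
                heapq.heappush(heap, (c, w, x))
    return tree
```

The heap always holds the edges leaving S, so the popped edge that still crosses the cut is the cheapest one.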

Cut Property Lemma for MSTs
    Assumption: All edge costs are distinct
    Condition: S and V\S are non-empty
    Cheapest crossing edge is in all MSTs
    [Figure: the cut between S and V\S]

Divide & Conquer

Sorting
Merge-Sort

Sorting
    Given n numbers order them from smallest to largest
    Works for any set of elements on which there is a total order

Mergesort algorithm
    Input: $a_1, a_2, \dots, a_n$
    Output: Numbers in sorted order

    MergeSort( a, n )
        If n = 2 return the order min($a_1$,$a_2$); max($a_1$,$a_2$)
        $a_L = a_1, \dots, a_{n/2}$
        $a_R = a_{n/2+1}, \dots, a_n$
        return MERGE ( MergeSort($a_L$, n/2), MergeSort($a_R$, n/2) )

An example run
[Figure: MergeSort run on an input beginning 51, 1, 100, 19, 2, 8, with sorted halves merged level by level]

Correctness
    MergeSort( a, n )
        If n = 1 return the order $a_1$
        If n = 2 return the order min($a_1$,$a_2$); max($a_1$,$a_2$)
        $a_L = a_1, \dots, a_{n/2}$
        $a_R = a_{n/2+1}, \dots, a_n$
        return MERGE ( MergeSort($a_L$, n/2), MergeSort($a_R$, n/2) )

    By induction on n
    Inductive step follows from correctness of MERGE
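MergeSort and MERGE can be sketched in Python as follows. This is a minimal sketch under assumptions: the names `merge` and `merge_sort` are hypothetical, and list slicing replaces the explicit length argument n.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list in O(len(left) + len(right)) time."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])    # at most one of these two tails is non-empty
    out.extend(right[j:])
    return out

def merge_sort(a):
    """Return a sorted copy of a."""
    if len(a) <= 1:         # base case; also covers the empty list
        return list(a)
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))
```

With the base case at length 1, the sketch also handles inputs whose length is not a power of two.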

Counting Inversions
Merge-Count

Mergesort-Count algorithm
    Input: $a_1, a_2, \dots, a_n$
    Output: Numbers in sorted order + #inversions

    MergeSortCount( a, n )
        If n = 1 return ( 0, $a_1$ )
        If n = 2 return ( $a_1 > a_2$, min($a_1$,$a_2$); max($a_1$,$a_2$) )
        $a_L = a_1, \dots, a_{n/2}$
        $a_R = a_{n/2+1}, \dots, a_n$
        ($c_L$, $a_L$) = MergeSortCount($a_L$, n/2)
        ($c_R$, $a_R$) = MergeSortCount($a_R$, n/2)
        (c, a) = MERGE-COUNT($a_L$, $a_R$)        O(n)
        return ( $c + c_L + c_R$, a )

    T(2) = c
    T(n) = 2T(n/2) + cn
    O(n log n) time
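MergeSortCount can be sketched in Python as follows. This is an illustration under assumptions: the name `sort_and_count` is hypothetical, and the MERGE-COUNT step is written inline rather than as a separate function.

```python
def sort_and_count(a):
    """Return (number of inversions in a, sorted copy of a)."""
    if len(a) <= 1:
        return 0, list(a)
    mid = len(a) // 2
    c_left, left = sort_and_count(a[:mid])
    c_right, right = sort_and_count(a[mid:])
    merged = []
    crossing = 0                    # inversions with one element in each half
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] is smaller than every remaining element of left
            crossing += len(left) - i
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return crossing + c_left + c_right, merged
```

For example, `sort_and_count([2, 4, 1, 3, 5])` finds the three inversions (2,1), (4,1), and (4,3).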

MERGE-COUNT counts #crossing-inversions and performs MERGE

Closest Pair of Points
Closest Pair of Points Algorithm

Closest pairs of points
    Input: n 2-D points $P = \{p_1, \dots, p_n\}$; $p_i = (x_i, y_i)$
    $d(p_i, p_j) = \left( (x_i - x_j)^2 + (y_i - y_j)^2 \right)^{1/2}$
    Output: Points p and q that are closest

The algorithm
    Input: n 2-D points $P = \{p_1, \dots, p_n\}$; $p_i = (x_i, y_i)$
    Sort P to get $P_x$ and $P_y$                                        O(n log n)
    Closest-Pair ($P_x$, $P_y$)                                          T(n)
        If n < 4 then find closest pair by brute-force                   T(< 4) = c
        Q is first half of $P_x$ and R is the rest                        O(n)
        Compute $Q_x$, $Q_y$, $R_x$ and $R_y$                             O(n)
        ($q_0$, $q_1$) = Closest-Pair ($Q_x$, $Q_y$)
        ($r_0$, $r_1$) = Closest-Pair ($R_x$, $R_y$)
        $\delta = \min( d(q_0, q_1), d(r_0, r_1) )$
        S = points (x,y) in P s.t. $|x - x^*| < \delta$                   O(n)
        return Closest-in-box (S, ($q_0$,$q_1$), ($r_0$,$r_1$))          assume can be done in O(n)

    T(n) = 2T(n/2) + cn
    Total: O(n log n) + T(n), which is O(n log n) overall

Dynamic Programming

Weighted Interval Scheduling
Scheduling Algorithm

Weighted Interval Scheduling
    Input: n jobs $(s_i, t_i, v_i)$
    Output: A schedule S s.t. no two jobs in S have a conflict
    Goal: max $\sum_{i \in S} v_i$
    Assume: jobs are sorted by their finish time

A recursive algorithm
    Compute-Opt(j)
        If j = 0 then return 0
        return max { $v_j$ + Compute-Opt( p(j) ), Compute-Opt( j-1 ) }

    Proof of correctness by induction on j
        Correct for j = 0
        By induction, Compute-Opt( p(j) ) = OPT( p(j) ) and Compute-Opt( j-1 ) = OPT( j-1 )
        OPT(j) = max { $v_j$ + OPT( p(j) ), OPT(j-1) }

Exponential Running Time
    [Figure: jobs 1 to 5 with p(j) = j-2, and the recursion tree of calls below OPT(5)]
    Only 5 OPT values!
    Formal proof: Ex.

Bounding # recursions
    M-Compute-Opt(j)
        If j = 0 then return 0
        If M[j] is not null then return M[j]
        M[j] = max { $v_j$ + M-Compute-Opt( p(j) ), M-Compute-Opt( j-1 ) }
        return M[j]

    Whenever a recursive call is made an M value is assigned
    At most n values of M can be assigned
    O(n) overall

Property of OPT
    OPT(j) = max { $v_j$ + OPT( p(j) ), OPT(j-1) }
    Given OPT(1), ..., OPT(j-1), one can compute OPT(j)

Recursion + memory = Iteration
    Iteratively compute the OPT(j) values

    Iterative-Compute-Opt
        M[0] = 0
        For j = 1, ..., n
            M[j] = max { $v_j$ + M[p(j)], M[j-1] }
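Iterative-Compute-Opt can be sketched in Python as follows. This is a minimal sketch under assumptions: the name `max_weight_schedule` is hypothetical, jobs are (start, finish, value) triples, p(j) is taken to be the number of earlier jobs that finish no later than job j starts, and jobs that only touch at an endpoint count as compatible.

```python
from bisect import bisect_right

def max_weight_schedule(jobs):
    """jobs: list of (start, finish, value) (assumed format). Returns OPT(n), the best total value."""
    jobs = sorted(jobs, key=lambda job: job[1])     # sort by finish time
    finishes = [finish for _, finish, _ in jobs]
    n = len(jobs)
    # p[j] for 1-indexed job j: count of jobs among the first j-1 with finish <= start of j
    p = [0] * (n + 1)
    for j in range(1, n + 1):
        start = jobs[j - 1][0]
        p[j] = bisect_right(finishes, start, 0, j - 1)
    M = [0] * (n + 1)                               # M[0] = 0
    for j in range(1, n + 1):
        value = jobs[j - 1][2]
        M[j] = max(value + M[p[j]], M[j - 1])
    return M[n]
```

The sort and the binary searches for p(j) cost O(n log n); the loop that fills M is the O(n) part.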

    M[j] = OPT(j)
    O(n) run time

Shortest Path in a Graph
Bellman-Ford

Shortest Path Problem
    Input: (Directed) Graph G=(V,E) where every edge e has a cost $c_e$ (can be < 0)
           t in V
    Output: Shortest path from every s to t
    [Figure: graph containing s and t with edge costs 1, 1, 899, 100, and -1000; the shortest path has cost negative infinity]
    Assume that G has no negative cycle

Recurrence Relation
    OPT(i,u) = cost of shortest path from u to t with at most i edges
    $OPT(i,u) = \min \{ OPT(i-1,u),\ \min_{(u,w) \in E} \{ c_{u,w} + OPT(i-1,w) \} \}$
        First term: path uses at most i-1 edges
        Second term: best path through all neighbors
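The recurrence can be evaluated bottom-up as in the Python sketch below. This is an illustration under assumptions: the name `shortest_paths_to` and the edge format are hypothetical, G has no negative cycle, and i runs up to n-1 because a shortest path then uses at most n-1 edges.

```python
def shortest_paths_to(vertices, edges, t):
    """edges: list of (u, w, cost) (assumed format). Returns dict vertex -> shortest-path cost to t."""
    INF = float("inf")
    opt = {v: INF for v in vertices}        # OPT(0, v): only t reaches t with 0 edges
    opt[t] = 0
    for _ in range(len(vertices) - 1):      # i = 1, ..., n-1
        new_opt = dict(opt)                 # OPT(i, u) starts from OPT(i-1, u)
        for u, w, cost in edges:
            if opt[w] + cost < new_opt[u]:  # c_{u,w} + OPT(i-1, w)
                new_opt[u] = opt[w] + cost
        opt = new_opt
    return opt
```

Each round scans every edge once, so the sketch takes O(mn) time; a vertex that cannot reach t keeps the value infinity.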

P vs NP

P vs NP question
    P: problems that can be solved by poly time algorithms
    NP: problems that have polynomial time verifiable witness to optimal solution
    Alternate NP definition: Guess witness and verify!
    Is P = NP?