Adversarial Search - Manhattan College

Adversarial Search - Manhattan College

Adversarial Search CMPT 420 / CMPG 720 Outline Game playing Game trees o Minimax o Alpha-beta pruning Games vs.

search problems competitive environments: agents goals are in conflict adversarial search problems (games) Types of Games deterministic perfect information

chess, checkers, go, othello chance backgammon, monopoly bridge, poker imperfect information

Games deterministic, fully-observable, turntaking, twoplayer, zero-sum games o Utility values at the end are equal and opposite o Tic-tac-toe Formulation Two players MAX and MIN take turns (with MAX playing first)

S0: Player(s): Action(s): Result(s,a): Terminal-test(s):

Utility(s,p): Formulation S0: initial state

Player(s): Action(s): Result(s,a): Terminal-test(s): Utility(s,p): Formulation

S0: initial state Player(s): which player has the move in a state Action(s): Result(s,a): Terminal-test(s): Utility(s,p): Formulation

S0: initial state Player(s): which player has the move in a state Action(s): set of legal moves in a state Result(s,a):

Terminal-test(s): Utility(s,p): Formulation

S0: initial state Player(s): which player has the move in a state Action(s): set of legal moves in a state Result(s,a): transition model Terminal-test(s): Utility(s,p): Formulation

S0: initial state Player(s): which player has the move in a state Action(s): set of legal moves in a state Result(s,a): transition model Terminal-test(s): true/false (terminal states) Utility(s,p):

Formulation S0: initial state Player(s): which player has the move in a state Actions(s): set of legal moves in a state

Result(s,a): transition model Terminal-test(s): true/false (terminal states) Utility(s,p): utility function defines the final value of a game that ends in terminal state s for a player p o zero-sum games: same total payoff Game tree (1-player) Partial Game Tree for Tic-Tac-Toe

Optimal strategies MAX uses search tree to determine next move. Assumption: Both players play optimally!! Given a game tree, the optimal strategy can be determined by using the minimax value of each node Minimax The minimax value of a node is the utility

(for Max) of being in the corresponding state, assuming that both players play optimally. Minimax(s) = o o o if Terminal-test(s) if Player(s) = Max if Player(s) = Min

Minimax The minimax value of a node is the utility (for Max) of being in the corresponding state, assuming that both players play optimally. Minimax(s) = o Utility (s) if Terminal-test(s) o max of Minimax(Result(s,a)) if Player(s) = Max o min of Minimax(Result(s,a)) if Player(s) = Min

Optimal Play 2 2 7 1 8

2 2 7 1 8

1 2 7 1 8 2

This is the optimal play 2 1 MAX 2 7 1

8 MIN 2 7 1 8

Two-Ply Game Tree Two-Ply Game Tree Two-Ply Game Tree Two-Ply Game Tree The minimax decision Minimax maximizes the worst-case outcome for max.

optimally? Definition of optimal play for MAX assumes MIN plays optimally: maximizes worst-case outcome for MAX. But if MIN does not play optimally, MAX can do even better. Minimax Algorithm function MINIMAX-DECISION(state) returns an action inputs: state, current state in game

vMAX-VALUE(state) return the action in SUCCESSORS(state) with value v function MAX-VALUE(state) returns a utility value if TERMINAL-TEST(state) then return UTILITY(state) v - for a,s in SUCCESSORS(state) do v MAX(v,MIN-VALUE(s)) return v function MIN-VALUE(state) returns a utility value if TERMINAL-TEST(state) then return UTILITY(state) v

for a,s in SUCCESSORS(state) do v MIN(v,MAX-VALUE(s)) return v Properties of minimax Complete? o Yes (if tree is finite) Optimal? o Yes (against an optimal opponent)

Time complexity? o O(bm) Space complexity? o O(bm) (depth-first exploration) o For chess, b 35, m 100 for "reasonable" games exact solution is infeasible Alpha-Beta Pruning Problem with minimax search: exponential in the depth of the tree

Can we cut it in half? It is possible to compute the minimax decision without looking at every node. o pruning: eliminate some parts of the tree Alpha-beta pruning We can improve on the performance of the minimax algorithm through alpha-beta pruning MAX MIN

MAX 2 7 1 ? Alpha-beta pruning We can improve on the performance of the

minimax algorithm through alpha-beta pruning MAX We dont need to compute the value at this node. MIN No matter what it is, it cant affect the value of the root node.

MAX 2 7 1 ? Alpha-Beta Example

Do DFS until the first leaf Range of possible values [-,+] [-, +] Alpha-Beta Example Do DFS until first leaf Range of possible values

[-,+] [-, +] (continued) [-,+] [-,3] (continued)

[-,+] [-,3] (continued) [-,+] [3,3] (continued) [3,+]

[3,3] (continued) [3,+] [3,3] [-, ] (continued)

[3,+] [3,3] [-,2] (continued) [3,+] [3,3]

[-,2] This node is worse for MAX (continued) [3,14] [3,3] [-,2]

, [-, ] (continued) [3,14] [3,3] [-,2]

, [-,14] (continued) [3,5] [3,3] [,2]

, [-,5] (continued) [3,3] [,2]

[2,2] (continued) [3,3] [3,3] [-,2] [2,2]

- pruning example Minimax(root) = max(min(3,12,8),min(2,x,y),min(14,5,2)) = max(3,min(2,x,y),2) =3

- pruning We made the same minimax decision without ever evaluating two of the leaf nodes! o They are independent. It is possible to prune entire subtrees. Why is it called -? = value of the best choice found so far at

any choice point along the path for max If v is worse than , max will avoid it prune that branch Define similarly for

min Alpha-Beta Algorithm function ALPHA-BETA-SEARCH(state) returns an action inputs: state, current state in game vMAX-VALUE(state, - , +) return the action in SUCCESSORS(state) with value v function MAX-VALUE(state, , ) returns a utility value if TERMINAL-TEST(state) then return UTILITY(state) v- for a,s in SUCCESSORS(state) do

v MAX(v,MIN-VALUE(s, , )) if v then return v MAX( ,v) return v Alpha-Beta Algorithm function MIN-VALUE(state, , ) returns a utility value if TERMINAL-TEST(state) then return UTILITY(state) v+ for a,s in SUCCESSORS(state) do v MIN(v,MAX-VALUE(s, , ))

if v then return v MIN( ,v) return v Comments: Alpha-Beta Pruning Pruning does not affect the final results. Entire subtrees can be pruned. Good move ordering improves effectiveness of pruning. With perfect ordering, time

complexity is O(bm/2) o Alpha-beta pruning can look twice as far as minimax in the same amount of time Deterministic games in practice Checkers: Chinook ended 40-year-reign of human world champion Marion Tinsley in 1994. Used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion

positions. Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searches 200 million positions per second, uses very sophisticated evaluation, and undisclosed methods for extending some lines of search up to 40 ply. Othello: Logistello defeated the human world champion. It is generally acknowledged that human are no match for computers at Othello.

Recently Viewed Presentations

  • TOURISM PETER ROBINSON MICHAEL LCK STEPHEN L. J.

    TOURISM PETER ROBINSON MICHAEL LCK STEPHEN L. J.

    Matches output to demand levels over time. Capacity is adjusted by manpower planning policies such as changing the number of staff, more part time, seasonal staff, overtime working, subcontracting etc. e.g. fast food outlets. Chase Demand Plan
  • Interventions for Challenging Populations: Children Exposed ...

    Interventions for Challenging Populations: Children Exposed ...

    High doses of adversity not only affect brain structure and function, they affect the developing immune system, developing hormonal systems, and even the way our DNA is read and transcribed." ... Neglect. Acts of Omission "The failure to provide for...
  • Data Converter Overview: D/A and A/D converters

    Data Converter Overview: D/A and A/D converters

    Data Converter Overview: D/A and A/D converters Dr. Paul Hasler and Dr. Philip Allen The need for Data Converters D/A Block Diagram Where the A/D is in the System Where to divide Analog and Digital? Where the A/D is in...
  • Conductance of Single Alkanedithiol Molecular Junctions

    Conductance of Single Alkanedithiol Molecular Junctions

    Interlayer exchange coupling in magnetic trilayers One puzzle Our calculations DFT Chang, Dou, Chen, Hong, & Kaun, Scientific Reports 5, 16844 (2015) The Fe3/Ag6/Fe3 trilayer Chang, Dou, Chen, Hong, & Kaun, Scientific Reports 5, 16844 (2015) Magnetic anisotropy in ferromagnetic...
  • River Valleys - isjgorj.ro

    River Valleys - isjgorj.ro

    Dinamica placilor tectonice şi deriva continentelor folosind imagini Ce se intâmplă după? (What happens next?) What happens next? Ce se întâmplă după....
  • Genetics Powerpoint - Mrs. O'Brien's Science

    Genetics Powerpoint - Mrs. O'Brien's Science

    Color blindness is the inability to distinguish the differences between certain colors. The most common type is red-green color blindness, where red and green are seen as the same color. You should see . 58 (upper left), 18 (upper right),...
  • Stereotypes, prejudice and discrimination Part II

    Stereotypes, prejudice and discrimination Part II

    Racism - Blatant and subtle . Traditional racism. Blatant, explicit, and unmistakable acts of aggression against minorities. Modern racism. A form of prejudice that surfaces in subtle ways when it is safe, socially acceptable, and easy to rationalize; likely present...
  • The Ohio High School Athletic Association

    The Ohio High School Athletic Association

    You may not be eligible if you are competing under a false name or have provided your school with an incorrect home address. ... Semester and yearly grades have no effect on OHSAA eligibility unless your school provides grades at...