CSCE 212 Computer Architecture - Computer Science & E

CSCE 212 Computer Architecture - Computer Science & E

CSCE 531 Compiler Construction Lecture 4 (new improved) Subset construction NFA DFA Topics FLEX Subset Construction Equivalence relations, partitions Readings:

CSCE 531 Spring 2018 1 January 30, 2018 Overview Last Time Thompson Construction: Regular Expression NFA Todays Lecture

Rubular regular expressions online Subset Construction: NFA DFA Minimization of DFA well: distinguished states Lex/Flex References 2 Chapter 2

http://www.gnu.org/software/flex/manual/html_mono/flex.html C introduction Examples CSCE 531 Spring 2018 Rubular.com 3 CSCE 531 Spring 2018 RegEx Quick Reference http://rubular.com/r/11Z1trB1JI

4 CSCE 531 Spring 2018 Python Regular Expressions https://docs.python.org/3.4/library/re.html Pythex: a Python regular expression editor https://pythex.org/ POSIX:

5 https://www.regular-expressions.info/posix.html CSCE 531 Spring 2018 Python Regular Expressions import re import time filename="../../513/513F16.txt" f = open(filename, "r") regexp=re.compile(r'Degree:\s*([A-Z].*)|[1-9][0-9]?.*([A-Z][a-zAZ-]*)') #,\s*([A-Z][a-z]*)') degreeRE=re.compile(r'Degree:\s*([A-Za-z].*)')

nameRE=re.compile(r'[1-9][0-9]?\s*([A-Z][a-zA-Z-]*\,\s*[A-Za-z-]+)') CSCE 531 Spring 2018 6 https://docs.python.org/3.4/library/re.html name="unassigned" for line in f: Python Example Continued Continued m=regexp.match(line)

if m != None: m2 = nameRE.match(line) if m2 != None: name=m2.group(1) m3 = degreeRE.match(line) if m3 != None: print (name, "\t", m3.group(1)) f.close() 7 CSCE 531 Spring 2018 Example 3.14 (a | b)*abb

8 CSCE 531 Spring 2018 Transition Table 9 CSCE 531 Spring 2018 Deterministic Finite Automata (DFA) A deterministic finite automata (DFA) is an NFA such that: 1. There are no moves on input , and 2. For each state s and input symbol a , there is exactly

one edge out of s labeled a . 10 CSCE 531 Spring 2018 NFA transitions versus DFA 1. Not every state an input has a transition (s, a) might be emptys, a) might be empty) might be empty (s, a) might be empty1, a) might be empty) =

2. For a given state and input there can be more than one transition (s, a) might be empty0, a) might be empty) = { 0, 1} 3. Epsilon transitions allowed in NFAs 11 None in this diagram CSCE 531 Spring 2018 Every DFA is an NFA

A DFA is just an NFA that satisfies: 1. No epsilon moves; formally (s, a) might be emptys, ) = 2. For each state s and input symbol a, (s, a) might be emptys, a) might be empty) is a single state; formally | (s, a) might be emptys, a) might be empty) | = 1 So, every language accepted by an DFA is also accepted by an NFA. Earlier we showed every language denoted by a regular expression has an NFA that accepts it. Today we show every language recognized by an NFA has a DFA that recognizes it also. 12 CSCE 531 Spring 2018 So why use an NFA and why use a DFA?

NFA more expressive power DFA more efficient recognizer 13 CSCE 531 Spring 2018 Algorithm 3.18 : Simulating a DFA. INPUT: An input string x terminated by an end-of-file character eof. A DFA D with start state s0 , accepting states F , and transition function move. OUTPUT: Answer yes if D accepts x ; no otherwise. METHOD: Apply the algorithm in Fig. 3.27 to the input string x . The function move( s; c) gives the state to

which there is an edge from state s on input c . The function nextChar returns the next character of the input string x . 14 CSCE 531 Spring 2018 Simulating DFA = recognizing L(DFA) 15 CSCE 531 Spring 2018 Algorithm 3.20 : The subset construction

Algorithm 3.20 : The subset construction of a DFA from an NFA. INPUT: An NFA N . OUTPUT: A DFA D accepting the same language as N . METHOD: 16 Our algorithm constructs a transition table Dtran for D . Each state of D is a set of NFA states, and we construct Dtran so D will simulate in parallel all possible moves N can make on a given input string.

CSCE 531 Spring 2018 Fig 3.31 Operations on NFA states 17 CSCE 531 Spring 2018 Fig. 3.32 the Subset Contruction initially, - closure ( s 0 ) is the only state in Dstates, and it is unmarked; while ( there is an unmarked state T in Dstates ) { mark T ; for ( each input symbol a ) { U = - closure (move( T;a) );

if ( U is not in Dstates ) add U as an unmarked state to Dstates; Dtran[T;a] = U ; } } 18 CSCE 531 Spring 2018 Fig 3.33 Computing -closure of a set 19 CSCE 531 Spring 2018

NFA DFA via Subset Construction d0 = -closure(0) = {0, 1, 2, 4, 7} 20 CSCE 531 Spring 2018 State of DFA a d0 = -closure(0) closure(0) = {0, 1, 2, 4, 7}

d1 = -closure(0) closure(move(d0, a)) =-closure(0) closure( 3, 8} ={3, 8, 6, 7, 1, 2, 4} d1={3, 8, 6, 7, 1, 2, 4} -closure(0) closure(move(d1, a)) =-closure(0) closure( 3, 8} ={3, 8, 6, 7, 1, 2, 4} b d2 = -closure(0) closure(move(d0, b)) =-closure(0) closure( 5 } ={5, 6, 1, 7, 2, 4}

d2={5, 6, 1, 2, 4} 21 CSCE 531 Spring 2018 22 CSCE 531 Spring 2018 23 CSCE 531 Spring 2018 Algorithm 3.22 : Simulating an NFA

INPUT: An input string x terminated by an end-of- le character eof. An NFA N with start state s0 , accepting states F , and transition function move. OUTPUT: Answer yes if N accepts x ; no otherwise. METHOD: The algorithm keeps a set of current states S, those that are reached from s0 following a path labeled by the inputs read so far. If c is the next input character, read by the function nextChar(), then we first compute move( S; c) and then close that set using - closure (). 24 CSCE 531 Spring 2018

Simulating an NFA 25 CSCE 531 Spring 2018 26 CSCE 531 Spring 2018 DFA Minimization Algorithm (Authors Slide) Discover sets of equivalent states Represent each such set with just one state Two states are equivalent (indistinguishable)if and only if: The set of paths leading to them are equivalent

, transitions on lead to equivalent states (DFA) -transitions to distinct sets states must be in distinct sets A partition P of S Each s S is in exactly one set pi P The algorithm iteratively partitions the DFAs states 27 CSCE 531 Spring 2018 Details of the algorithm (Authors Slide) Details of the algorithm Group states into maximal size sets, optimistically

Iteratively subdivide those sets, as needed States that remain grouped together are equivalent Initial partition, P0 , has two sets: {F} & {Q-F} (D =(Q,,,q0,F)) Splitting a set (partitioning a set by a) Assume qa, & qb s, and (qa,a) = qx, & (qb,a) = qy If qx & qy are not in the same set, then s must be split qa has transition on a, qb does not a splits s One state in the final DFA cannot have two transitions on a 28

CSCE 531 Spring 2018 Minimization Algorithm P { F, {Q-F}} while ( P is still changing) T{} The partitioning step is: If (S) contains states in 2 equivalent sets of states. for each set S P for each a partition S by a

into S1, and S2 Add S1 and S2 to T if T P then S PT 29 CSCE 531 Spring 2018 Building Faster Scanners from the DFA (Authors slide) Table-driven recognizers waste a lot of effort

Read (& classify) the next character Find the next state Assign to the state variable Trip through case logic in action() Branch back to the top We can do better Encode state & actions in the code Do transition tests locally Generate ugly, spaghetti-like code Takes (many) fewer operations per input character 30 CSCE 531 Spring 2018

Symbol Tables #define ENDSTR 0 #define MAXSTR 100 #include struct nlist { /* basic table entry */ char *name; struct nlist *next; /*next entry in chain */ int val; }; #define HASHSIZE 100

static struct nlist *hashtab[HASHSIZE]; /* pointer table */ 31 CSCE 531 Spring 2018 The Hash Function /* PURPOSE: Hash determines hash value based on the sum of the character values in the string. USAGE: n = hash(s); DESCRIPTION OF PARAMETERS: s(array of char) string to be hashed AUTHOR: Kernighan and Ritchie LAST REVISION: 12/11/83 */

hash(char *s) /* form hash value for string s */ { int hashval; for (hashval = 0; *s != '\0'; ) hashval += *s++; return (hashval % HASHSIZE); } 32 CSCE 531 Spring 2018 The lookup Function /*PURPOSE: Lookup searches for entry in symbol table and

returns a pointer USAGE: np= lookup(s); DESCRIPTION OF PARAMETERS: s(array of char) string searched for AUTHOR: Kernighan and Ritchie LAST REVISION: 12/11/83*/ struct nlist *lookup(char *s) /* look for s in hashtab */ { struct nlist *np; for (np = hashtab[hash(s)]; np != NULL; np = np->next) if (strcmp(s, np->name) == 0) return(np); /* found it */ return(NULL); /* not found */

} 33 CSCE 531 Spring 2018 The install Function PURPOSE: Install checks hash table using lookup and if entry not found, it "installs" the entry. USAGE: np = install(name); DESCRIPTION OF PARAMETERS: name(array of char) name to install in symbol table AUTHOR: Kernighan and Ritchie, modified by Ron Sobczak LAST REVISION: 12/11/83

*/ 34 CSCE 531 Spring 2018 struct nlist *install(char *name) /* put (name) in hashtab */ { struct nlist *np, *lookup(); char *strdup(), *malloc(); int hashval; if ((np = lookup(name)) == NULL) { /* not found */ np = (struct nlist *) malloc(sizeof(*np)); if (np == NULL)

return(NULL); if ((np->name = strdup(name)) == NULL) return(NULL); hashval = hash(np->name); np->next = hashtab[hashval]; hashtab[hashval] = np; } return(np); } 35 CSCE 531 Spring 2018

Recently Viewed Presentations

  • Engine Diagnosis and Service: Block, Crankshaft, Bearings ...

    Engine Diagnosis and Service: Block, Crankshaft, Bearings ...

    Engine Diagnosis and Service: Block, Crankshaft, Bearings, and Lubrication System Chapter 53 Objectives Analyze wear and damage to the cylinder block Select and perform the most appropriate repairs to the block, crankshaft, and bearings Analyze wear and damage to the...
  • Closing Kosovo's Energy Deficit

    Closing Kosovo's Energy Deficit

    Politicisation of data-generating institutions (especially statistical institutes in charge of nominal GDP compilation); ... In 2003, Germany and France were the first euro area countries breaking the 3-per cent-of-GDP fiscal deficit criterion, the principal anchor of the Maastricht criteria ...
  • Department of Army Permitting Process: Dredging Felicity Dodson

    Department of Army Permitting Process: Dredging Felicity Dodson

    Section 10 Rivers and Harbors Act of 1899 (33 U.S.C. 401) Structures and/or work in, or affecting, navigable waters of the United States. Structures and/or work outside the limits of navigable waters, IF these structures or work could affect the...
  • Agenda & Presenters Welcome & Opening Remarks Presidents

    Agenda & Presenters Welcome & Opening Remarks Presidents

    Executive Director Advisory Council Chris Brown, EDAC Chair. Medical Advisory Council Ralph Atkinson, MD, MAC Chair. Kidney Patient Advisory Council Derek Forfang, KPAC Chair. Quality Conference Webinars Mr. Forfang & Dr. Atkinson. Financial Overview Stephanie Hutchinson, Treasurer. 2017 Forum Election...
  • FM-1 Droplet Size Increase with Flowrate

    FM-1 Droplet Size Increase with Flowrate

    DOE/NETL Test Sites Coal-Fired Boiler with Sorbent Injection and Spray Cooling Sampling Time Required Comparison of OH and S-CEM*, Long Term Tests (10 lbs/MMacf) Capture of Vapor Phase Hg by Solid Sorbents Mass Transfer Limits (getting the Hg to the...
  • Center for Teaching and Learning (CTL) - Baruch College

    Center for Teaching and Learning (CTL) - Baruch College

    Fall 2017/Spring 2018 Activity. Pedagogy. Technology. Research. Policy. Active Learning (EAM grant for MTH 1030). Baruch Online Learning Week (October 9-15) "Civic Engagement" CTL Conversations - November CTL Workshops / Tech Café / Consultations Collaborative Online International Learning (COIL)
  • Le théâtre baroque et ses caractéristiques

    Le théâtre baroque et ses caractéristiques

    Il touche tous les domaines artistiques, sculpture, peinture, littérature, architecture, théâtre et musique et se caractérise par l'exagération du mouvement, la surcharge décorative, les effets dramatiques, la tension, l'exubérance et la grandeur parfois pompeuse. Introduction . Le baroque est un...
  • The Silk Road - Magda Martinez

    The Silk Road - Magda Martinez

    The Silk Road Global History I: Spiconardi & Roher Geography of the Silk Road Silk Road stretched from Xi'an, China to Rome It covers a vast area of different climates and geographies Taklimakan Desert Occupies much of the routes Temperatures...