r"""
 regex_ab.py

 Two examples of python regex.

 Both patterns are made up of A and B characters.

 The first has exactly two As, and any Bs anywhere.
 This is a formal "regular expression language" in the
 mathematical sense, and so there is a finite state machine
 that could recognize this language.

    start -> Bs -> first A -> Bs -> second A -> Bs -> accept

 The second is bAbAb where "b" is some number of B's, the same
 number each time. Because this requires remembering an earlier
 match, this is *not* a formal math "regular expression";
 it requires more computational power than that. Since the number
 of B's in the string can be arbitrarily large, it cannot be
 recognized by any machine with only a fixed, finite number of states.
 (The backreference syntax "\1" in python's regex engine, which
 repeats whatever an earlier group matched, is not formally part
 of the definition of mathematical regular expressions.)

    $ python regex_ab.py
    regex As and Bs
    -- playing with regex patterns --
    Candidate pattern 1? AA
    Yes, 'AA' is a '^(B*)A(B*)A(B*)$'.
    Candidate pattern 2? BABBBABBBB
    No, 'BABBBABBBB' is not a '^(B+)A(\1)A(\1)$'.

    -- playing with regex patterns --
    Candidate pattern 1? BABBBABBBB
    Yes, 'BABBBABBBB' is a '^(B*)A(B*)A(B*)$'.
    Candidate pattern 2? BBABBABB
    Yes, 'BBABBABB' is a '^(B+)A(\1)A(\1)$'.

    -- playing with regex patterns --
    Candidate pattern 1? C
    No, 'C' is not a '^(B*)A(B*)A(B*)$'.
    Candidate pattern 2? C
    No, 'C' is not a '^(B+)A(\1)A(\1)$'.

 Jim Mahoney | cs.bennington.college | May 2021 | MIT License
"""
import re

print("regex As and Bs")

# Notes :
#   * The syntax r"" is a "raw" string,
#     without any special meaning for the \ character.
#   * "B*" means "any number of B including 0"
#   * "B+" means "any number of B but at least 1"
#   * () are special grouping symbols; not literal "("
#   * "\1" means "the same text that the first () group matched"
#   * "^" means "match the start of the string"
#   * "$" means "match the end of the string"

pattern1 = r"^(B*)A(B*)A(B*)$"  # exactly two As, any Bs anywhere.
pattern2 = r"^(B+)A(\1)A(\1)$"  # bAbAb; same number of B's (min 1) in each b

while True:
    print('-- playing with regex patterns --')

    one = input("Candidate pattern 1? ")
    if re.match(pattern1, one):
        print(f"Yes, '{one}' is a '{pattern1}'.")
    else:
        print(f"No, '{one}' is not a '{pattern1}'.")

    two = input("Candidate pattern 2? ")
    if re.match(pattern2, two):
        print(f"Yes, '{two}' is a '{pattern2}'.")
    else:
        print(f"No, '{two}' is not a '{pattern2}'.")

    print()