I think I’m missing something obvious, and I’m hoping that a second pair of eyes will see it.
I want to check that a particular value is NOT a string that looks like a time on a 12-hour clock, so I’m trying to check for the presence of the character strings “AM” or “PM” in the string. Here’s a simplification of the regex I’m working with:
print(re.match('.*AM|PM.*', '1:00 PM'))
When I run it I get the returned value None. That’s definitely not what I’m trying to get.
I’ve tried re.match('[AM|PM]', 'PM') and re.match('AM|PM$', 'PM'). Those return None too.
I can get a match with re.match('AM|PM', 'PM'). If I use re.match('.*AM|PM.*', '11:00 AM') then it returns a match and everything is fine. Similarly, re.match('.*PM|AM.*', '2:00 PM') returns a match.
What do I have to do to get the “OR” section of my regex to work so that the first thing in this question will match?
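In regular expressions, the | (pipe) operator has lower precedence than concatenation, so the pattern .*AM|PM.* is interpreted as (.*AM)|(PM.*) rather than .*(AM|PM).* as you intend. Because re.match only matches at the start of the string, the first alternative .*AM fails on '1:00 PM' (there is no “AM” in it), and the second alternative PM.* fails because the string does not begin with “PM”.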
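The precedence rule can be checked directly. The sketch below compares the original pattern with a version whose implicit grouping is written out using non-capturing groups, and with the intended pattern. The helper name `matches` and the list of sample strings are illustrative assumptions, not part of the original question.

```python
import re

# The original pattern, the same pattern with its implicit grouping made
# explicit, and the pattern that was actually intended.
original = r'.*AM|PM.*'
explicit = r'(?:.*AM)|(?:PM.*)'
intended = r'.*(AM|PM).*'

def matches(pattern, text):
    """Hypothetical helper: True if re.match finds a match at the start of text."""
    return re.match(pattern, text) is not None

samples = ['1:00 PM', '11:00 AM', '2:00 PM', 'PM']

for text in samples:
    print(
        f'{text!r}: original={matches(original, text)} '
        f'explicit={matches(explicit, text)} '
        f'intended={matches(intended, text)}'
    )
```

If the reading above is right, the original and explicit columns should agree for every sample, while the intended pattern also matches the strings that end in “PM”.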
To fix this, you should use parentheses to explicitly indicate the scope of the alternation. Here’s the corrected regex:
import re
result = re.match('.*(AM|PM).*', '1:00 PM')
print(result.group(1) if result else "No match")
In this regex, .*(AM|PM).*, the parentheses (AM|PM) form a capturing group that matches either “AM” or “PM”. This limits the alternation to the choice between “AM” and “PM”, so the leading and trailing .* apply in both cases.
Now, when you run the code with '1:00 PM', it should match correctly, and result.group(1) will give you the matched “AM” or “PM”. If there’s no match, result will be None.
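Since the goal in the question is only to detect whether “AM” or “PM” appears somewhere in the value, an alternative worth considering is re.search, which scans the whole string instead of anchoring at the start, so no surrounding .* is needed. This is a minimal sketch under that assumption; the function name `has_am_or_pm` is hypothetical, and the check is a plain substring test, so it would also accept a word such as “SPAM”.

```python
import re

def has_am_or_pm(text):
    """Hypothetical helper: True if "AM" or "PM" occurs anywhere in text."""
    # re.search looks for the pattern at any position, unlike re.match.
    return re.search(r'AM|PM', text) is not None

for value in ['1:00 PM', '11:00 AM', '13:00']:
    print(value, has_am_or_pm(value))
```

To check that a value is NOT a 12-hour time, as the question asks, the result can simply be negated with `not has_am_or_pm(value)`.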