I think I’m missing something obvious, and I’m hoping that a second pair of eyes will see it.
I want to check that a particular value is NOT a string that looks like a time on a 12-hour clock, so I’m trying to check for the presence of the character strings “AM” or “PM” in the string. Here’s a simplification of the regex with which I’m working: print(re.match('.*AM|PM.*', '1:00 PM')). When I run it I get the returned value None. That’s definitely not what I’m trying to get.
I’ve tried re.match('[AM|PM]', 'PM') and re.match('AM|PM$', 'PM'). Those return None too.
I can get a match with re.match('AM|PM', 'PM'). If I use re.match('.*AM|PM.*', '11:00 AM') then it returns a match and everything is fine. Similarly, re.match('.*PM|AM.*', '2:00 PM') returns a match.
What do I have to do to get the “OR” section of my regex to work so that the first thing in this question will match?
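In regular expressions, the | (pipe) operator has lower precedence than concatenation, so the pattern .*AM|PM.* is interpreted as (.*AM)|(PM.*) rather than .*(AM|PM).* as you intend. Because re.match only matches at the start of the string, the PM.* branch can succeed only when the string begins with “PM”. That is why '1:00 PM' returns None, while '11:00 AM' matches through the .*AM branch.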
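To see the precedence at work, here is a small sketch that tries each side of the pipe on its own. The helper name branch_results is made up for this illustration and the sample strings are taken from the question.

```python
import re

def branch_results(text):
    """Return (whole, left, right): whether each pattern matches at the start of text."""
    whole = re.match('.*AM|PM.*', text) is not None  # the pattern from the question
    left = re.match('.*AM', text) is not None        # everything left of the pipe
    right = re.match('PM.*', text) is not None       # everything right of the pipe
    return whole, left, right

for text in ['1:00 PM', '11:00 AM', 'PM']:
    print(repr(text), branch_results(text))
```

In each case the whole pattern matches exactly when one of the two branches matches, which is what you would expect if the pipe splits the entire pattern in two.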
To fix this, you should use parentheses to explicitly indicate the scope of the alternation. Here’s the corrected regex:
import re

# Parentheses limit the alternation to just "AM" or "PM".
result = re.match(r'.*(AM|PM).*', '1:00 PM')
print(result.group(1) if result else "No match")  # prints PM
In this regex, .*(AM|PM).*, the parentheses (AM|PM) form a capturing group that matches either “AM” or “PM”. This way, the alternation applies only to “AM” and “PM”, not to the .* on either side.
Now, when you run the code with '1:00 PM', it matches, and result.group(1) gives you the matched “AM” or “PM”. If there’s no match, result is None.
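If all you need is to know whether “AM” or “PM” appears anywhere in the value, re.search may be a simpler fit, since it scans the whole string instead of anchoring at the start. The following is only a sketch: the function name contains_am_pm is hypothetical, and it assumes the value is already a string.

```python
import re

def contains_am_pm(value):
    """Return True if the string contains "AM" or "PM" anywhere."""
    # re.search scans the whole string, so no leading or trailing .* is needed.
    return re.search(r'AM|PM', value) is not None

for value in ['1:00 PM', '11:00 AM', '13:00', 'SPAM']:
    print(value, contains_am_pm(value))
```

Note that this check is deliberately loose: a value such as 'SPAM' also contains “PM”, so you may want a stricter pattern if such values can occur. As an aside, [AM|PM] is a character class that matches a single character out of A, M, | and P, so it is not a substitute for the group (AM|PM).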