I’m new to this site so please forgive any mistakes or misconceptions I may have. I’m new to Python and am currently working through a tutorial. I have come across the following input (apparently very simple) but I cannot understand the sequence in the answer.
Code is as follows
pattern = r"\W"  # Matches any non-word character
text = "Hello, world!"
matches = re.findall(pattern, text)
print("Matches:", matches)
Output is
Matches: [',', ' ', '!']
The tutorial then gives this statement to aid understanding:
The regular expression pattern is defined as r"\W", which uses the \W special sequence to match any character that is not a word character (a-z, A-Z, 0-9, or _). The string we’re searching for matches in is "Hello, world!".
N.B. “import re” had been previously input.
My problem is that I don’t understand how the sequence of quotation marks and commas marry up with “Hello, world” - that is I don’t know what they all mean in the order they are given.
If anyone can offer any advice I would be very grateful.
Many thanks in advance.
I have read through the sequence and comments many times but still cannot understand the order of the sequence.
I’ll break down the code and the regular expression pattern for you:
import re
pattern = r"\W"  # Matches any non-word character
text = "Hello, world!"
matches = re.findall(pattern, text)
print("Matches:", matches)
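import re: loads Python's regular expression module so its functions can be used.
pattern = r"\W": stores the pattern to search for. \W is the special sequence for "any non-word character", and the r prefix makes this a raw string, so Python leaves the backslash alone and passes it to the regex engine as written.
text = "Hello, world!": the string that will be searched.
matches = re.findall(pattern, text): findall is a function in the re module. It scans text from left to right and returns a list containing every match of pattern.
print("Matches:", matches): prints the label "Matches:" followed by that list.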
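To make the raw string and the return value of findall more concrete, here is a small sketch. It reuses the names from the tutorial code; the extra checks are only there for illustration.

```python
import re

# The r prefix keeps the backslash literal, so the pattern is two characters: \ and W
pattern = r"\W"
print(len(pattern))        # 2
print(pattern == "\\W")    # True: the same string written without the r prefix

# re.findall scans the text left to right and collects every match into a list
matches = re.findall(pattern, "Hello, world!")
print(type(matches).__name__)  # list
print(len(matches))            # 3
```

The point is that matches is an ordinary Python list, so anything you can do with a list (len, indexing, looping) works on it.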
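Now, let’s break down the regular expression pattern \W:
\W matches any single character that is not a word character, that is, anything other than a letter, a digit, or the underscore (_).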
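One way to see what \W leaves out is to run its opposite, \w, on the same text. This is just an illustrative sketch; the second sample string is made up for the example.

```python
import re

text = "Hello, world!"

# \w matches word characters; \W matches everything else
word_chars = re.findall(r"\w", text)
non_word_chars = re.findall(r"\W", text)

print(word_chars)      # ['H', 'e', 'l', 'l', 'o', 'w', 'o', 'r', 'l', 'd']
print(non_word_chars)  # [',', ' ', '!']

# The underscore counts as a word character, so \W skips it but catches the hyphen
print(re.findall(r"\W", "snake_case-name"))  # ['-']
```

Every character of the text ends up in exactly one of the two lists.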
In the given example, the text is “Hello, world!” and the non-word characters are ',', ' ', and '!'. The re.findall function finds all occurrences of these non-word characters in the text and returns them as a list.
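So, the output Matches: [',', ' ', '!'] is a Python list of three one-character strings. Each string is shown inside single quotes, and the items are separated by commas: ',' is the comma after "Hello", ' ' is the space before "world", and '!' is the exclamation mark at the end. They appear in the same order in which they occur in the text.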
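If the quotes and commas in the printed list are still hard to read, it may help to print the items one at a time. The sketch below also uses re.finditer, which is not in the tutorial code, to show where in the text each match was found.

```python
import re

text = "Hello, world!"
matches = re.findall(r"\W", text)

# Each list item is a separate one-character string
for item in matches:
    print(repr(item), "has length", len(item))

# re.finditer yields match objects, which know their position in the text
positions = [m.start() for m in re.finditer(r"\W", text)]
print(positions)  # [5, 6, 12]
```

Counting from 0, position 5 is the comma, position 6 is the space, and position 12 is the exclamation mark.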