小能豆

How do I use a regular expression in string.replace?

python

I need some help declaring a regular expression. My input looks like this:

this is a paragraph with<[1> in between</[1> and then there are cases ... where the<[99> number ranges from 1-100</[99>. 
and there are many other lines in the txt files
with<[3> such tags </[3>

The desired output is:

this is a paragraph with in between and then there are cases ... where the number ranges from 1-100. 
and there are many other lines in the txt files
with such tags

I have tried this:

#!/usr/bin/python
import os, sys, re, glob
for infile in glob.glob(os.path.join(os.getcwd(), '*.txt')):
    with open(infile) as reader:  # open each .txt file for reading
        for line in reader:
            # str.replace treats its first argument as a literal string
            line2 = line.replace('<[1> ', '')
            line = line2.replace('</[1> ', '')
            line2 = line.replace('<[1>', '')
            line = line2.replace('</[1>', '')

            print(line)

I have also tried this (but it seems I am using the wrong regex syntax):

        line2 = line.replace('<[*> ', '')
        line = line2.replace('</[*> ', '')
        line2 = line.replace('<[*>', '')
        line = line2.replace('</[*>', '')

I don't want to hard-code a replace call for every number from 1 to 99.


2024-07-03

1 answer

小能豆

This tested snippet should do it:

import re
line = re.sub(r"</?\[\d+>", "", line)
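Dropped into the file loop from the question, the one-liner might look like the sketch below. It assumes the `.txt` files are plain text readable with the default encoding; the names `TAG`, `strip_tags_in_file` and `clean_txt_files` are illustrative and not part of the original answer.

```python
import glob
import os
import re

# Matches an opening or closing tag such as <[1> or </[99>.
TAG = re.compile(r"</?\[\d+>")

def strip_tags_in_file(path):
    """Yield each line of the file at `path` with the tags removed."""
    with open(path) as reader:
        for line in reader:
            yield TAG.sub("", line)

def clean_txt_files(directory):
    """Yield the cleaned lines of every .txt file in `directory`."""
    for infile in sorted(glob.glob(os.path.join(directory, "*.txt"))):
        for line in strip_tags_in_file(infile):
            yield line
```

Calling `clean_txt_files(os.getcwd())` and printing each yielded line reproduces the loop from the question without hard-coding any tag number.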

Edit: here is a commented version that explains how it works:

line = re.sub(r"""(?x) # Use free-spacing mode.
  <    # Match a literal '<'
  /?   # Optionally match a '/'
  \[   # Match a literal '['
  \d+  # Match one or more digits
  >    # Match a literal '>'
  """, "", line)

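Regular expressions are fun! But I strongly recommend spending an hour or two learning the basics. For starters, you need to learn which characters are special: the "metacharacters" that need to be escaped (i.e. preceded by a backslash). Note that the rules differ inside and outside character classes. There is an excellent online tutorial at www.regular-expressions.info. The time you spend there will pay for itself many times over. Happy regexing!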
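To make the point about metacharacters concrete, here is a short sketch (variable names are illustrative) showing an unescaped `[` failing to compile, the escaped form working, the different rules inside a character class, and `re.escape` as a way to escape a literal string automatically.

```python
import re

# Outside a character class, an unescaped '[' opens a class.
# "<[1>" never closes it, so the pattern fails to compile.
try:
    re.compile("<[1>")
    unescaped_compiles = True
except re.error:
    unescaped_compiles = False

# Escaping the '[' with a backslash makes it a literal character.
escaped_result = re.sub(r"<\[1>", "", "a<[1>b")

# Inside a character class, '.', '*' and '+' are matched literally.
in_class = re.findall(r"[.*+]", "1.5 * 2 + 3")

# re.escape() backslash-escapes a literal string for use in a pattern.
auto_escaped_result = re.sub(re.escape("<[1>"), "", "a<[1>b")

print(unescaped_compiles)   # False
print(escaped_result)       # ab
print(in_class)             # ['.', '*', '+']
print(auto_escaped_result)  # ab
```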

2024-07-03