一尘不染

ElementTree: Element.remove() skips elements during iteration

python

I have this XML input file:

<?xml version="1.0"?>
<zero>
  <First>
    <second>
      <third-num>1</third-num>
      <third-def>object001</third-def>
      <third-len>458</third-len>
    </second>
    <second>
      <third-num>2</third-num>
      <third-def>object002</third-def>
      <third-len>426</third-len>
    </second>
    <second>
      <third-num>3</third-num>
      <third-def>object003</third-def>
      <third-len>998</third-len>
    </second>
  </First>
</zero>

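My goal is to remove every <second> element whose <third-def> does not have a particular value (object001 in this example). To do that, I wrote the following code: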

try:
    import xml.etree.cElementTree as ET
except ImportError:
    import xml.etree.ElementTree as ET
inputfile='inputfile.xml'
tree = ET.parse(inputfile)
root = tree.getroot()

elem = tree.find('First')
for elem2 in tree.iter(tag='second'):
    if elem2.find('third-def').text == 'object001':
        pass
    else:
        elem.remove(elem2)
        #elem2.clear()

My problem is with elem.remove(elem2): it skips every other <second> element. This is the output of the code:

<?xml version="1.0" ?>
<zero>
  <First>
    <second>
      <third-num>1</third-num>
      <third-def>object001</third-def>
      <third-len>458</third-len>
    </second>
    <second>
      <third-num>3</third-num>
      <third-def>object003</third-def>
      <third-len>998</third-len>
    </second>
  </First>
</zero>

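Now, if I uncomment the elem2.clear() line, the script works as expected, but the output is not as nice, because every removed <second> element is kept as an empty tag: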

<?xml version="1.0" ?>
<zero>
  <First>
    <second>
      <third-num>1</third-num>
      <third-def>object001</third-def>
      <third-len>458</third-len>
    </second>
    <second/>
    <second/>
  </First>
</zero>

Does anyone know why my element.remove() statement goes wrong?


2021-01-20

1 answer

一尘不染

You are iterating over a live tree:

for elem2 in tree.iter(tag='second'):

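and then you change that tree while iterating. The iteration's "counter" is not told that the number of elements has changed. So when the iterator is looking at element 1 and you remove that element, it then moves on to element 2, but what used to be element 2 is now element 1 and is never visited. That is why object003 survives in your output.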
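To make the skip visible, here is a minimal sketch that records which <second> elements the live iterator actually visits. The inline XML string and the names XML_DOC, visited and remaining are illustrative assumptions, not part of the original question, which parses inputfile.xml instead.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's input, inlined so the sketch is self-contained.
XML_DOC = """<zero>
  <First>
    <second><third-num>1</third-num><third-def>object001</third-def></second>
    <second><third-num>2</third-num><third-def>object002</third-def></second>
    <second><third-num>3</third-num><third-def>object003</third-def></second>
  </First>
</zero>"""

root = ET.fromstring(XML_DOC)
first = root.find('First')

visited = []
for second in root.iter('second'):           # live iterator over the tree
    visited.append(second.findtext('third-num'))
    if second.findtext('third-def') != 'object001':
        first.remove(second)                 # mutates the tree being iterated

remaining = [s.findtext('third-num') for s in first.findall('second')]
print(visited)     # ['1', '2']
print(remaining)   # ['1', '3']
```

The third element never shows up in visited: after the second one is removed it slides into the slot the iterator has already passed.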

Capture a list of all the elements first, then loop over that list:

for elem2 in tree.findall('.//second'):

.findall() returns a list of results, and that list is not updated when you alter the tree.

Now the iteration no longer skips the last element:

>>> print(ET.tostring(root, encoding='unicode'))
<zero>
  <First>
    <second>
      <third-num>1</third-num>
      <third-def>object001</third-def>
      <third-len>458</third-len>
    </second>
    </First>
</zero>
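Putting the fix together, a self-contained sketch might look like the following. The helper name remove_unwanted, the inline SAMPLE string and the keep_value parameter are assumptions made for illustration; the question reads the document from inputfile.xml.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's input, inlined so the sketch runs on its own.
SAMPLE = """<zero>
  <First>
    <second><third-num>1</third-num><third-def>object001</third-def></second>
    <second><third-num>2</third-num><third-def>object002</third-def></second>
    <second><third-num>3</third-num><third-def>object003</third-def></second>
  </First>
</zero>"""

def remove_unwanted(root, keep_value):
    """Remove every <second> under <First> whose <third-def> is not keep_value."""
    first = root.find('First')
    # findall() returns a plain list, so removing from the tree
    # inside the loop does not disturb the iteration.
    for second in first.findall('second'):
        if second.findtext('third-def') != keep_value:
            first.remove(second)
    return root

fixed_root = remove_unwanted(ET.fromstring(SAMPLE), 'object001')
kept = [s.findtext('third-def') for s in fixed_root.iter('second')]
print(kept)   # ['object001']
```

Wrapping the live iterator in a list, as in list(root.iter('second')), should work equally well, since it also takes a snapshot before any element is removed.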

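This behaviour is not limited to ElementTree trees; the same thing happens when you remove items from a plain Python list while looping over it.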
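As a rough illustration with no XML involved, the sketch below removes items from a plain Python list while looping over it, then repeats the loop over a copy. The variable names are made up for this example.

```python
# Removing from a list while iterating over it skips elements.
numbers = [1, 2, 3, 4]
for n in numbers:
    if n != 1:
        numbers.remove(n)    # shifts later items one slot to the left
print(numbers)               # [1, 3]

# Iterating over a copy leaves the loop unaffected by the removals.
numbers_fixed = [1, 2, 3, 4]
for n in list(numbers_fixed):
    if n != 1:
        numbers_fixed.remove(n)
print(numbers_fixed)         # [1]
```

The first loop shows the same pattern as the question: every other unwanted item survives because it moved into a position the loop had already passed.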

2021-01-20