一尘不染

How do I parse a tag nested inside other tags with ElementTree?

py

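I'm new to coding, and I'm trying to parse the following fields from this entry:

name, category, risk, member

I seem to have written code that gets 3 of the 4 fields, but for some reason I get an error when I try to read the text of the member field. Please tell me what I'm doing wrong. I'm new to this, so if there is a simpler approach, I'm open to suggestions.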

<application>
  <entry id="120" name="100bao" ori_country="USA" ori_language="English">
   <category>general-internet</category>
   <subcategory>file-sharing</subcategory>
   <technology>peer-to-peer</technology>
   <evasive-behavior>yes</evasive-behavior>
   <consume-big-bandwidth>yes</consume-big-bandwidth>
   <used-by-malware>yes</used-by-malware>
   <able-to-transfer-file>yes</able-to-transfer-file>
   <has-known-vulnerability>yes</has-known-vulnerability>
   <tunnel-other-application>no</tunnel-other-application>
   <prone-to-misuse>yes</prone-to-misuse>
   <pervasive-use>yes</pervasive-use>
   <risk>5</risk>
   <references>
     <entry name="www.100bao.com">
      <link>http://www.100bao.com/</link>
     </entry>
   </references>
   <per-direction-regex>no</per-direction-regex>
   <appident>yes</appident>
   <default>
     <port>
       <member>tcp/3468,6346,11300</member>
     </port>
   </default>
 </entry>

import xml.etree.ElementTree as ET

mytree = ET.parse('C:/Documents/Parse Folder/apps.xml')
root = mytree.getroot()

for entry in root.findall('entry'):
    category = entry.find('category').text
    risk = entry.find('risk').text
    member = entry.find('member').text

print(entry.attrib, category, risk, member)

member = entry.find('member').text
AttributeError: 'NoneType' object has no attribute 'text'

2023-01-28

1 answer

一尘不染

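That's because member is not a direct child of entry; find() with a bare tag name only searches direct children, so it returns None. You need to supply a path (XPath) to the nested element: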

member = entry.find('./default/port/member').text

(Untested, because the code in your question can't be run as-is.)
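
To see why the original call returned None, here is a minimal sketch. It assumes a trimmed-down copy of the entry from the question, inlined as a string (the XML constant is only for illustration) so it runs without a file. It compares a bare tag name, the explicit path, and the `.//` descendant search that ElementTree's limited XPath support also accepts.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the entry from the question, inlined for illustration.
XML = """
<application>
  <entry id="120" name="100bao">
    <category>general-internet</category>
    <risk>5</risk>
    <default>
      <port>
        <member>tcp/3468,6346,11300</member>
      </port>
    </default>
  </entry>
</application>
"""

root = ET.fromstring(XML)
entry = root.find('entry')

# A bare tag name only matches direct children of entry, so this is None.
direct = entry.find('member')

# An explicit path walks entry -> default -> port -> member.
by_path = entry.find('default/port/member')

# './/member' matches the first member element at any depth below entry.
anywhere = entry.find('.//member')

print(direct)         # None
print(by_path.text)   # tcp/3468,6346,11300
print(anywhere.text)  # tcp/3468,6346,11300
```

The explicit path is stricter: it only matches member in that exact position, whereas `.//member` would also pick up a member element nested anywhere else under the entry.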

Update: tested code

apps.xml (updated to be well-formed)

<application>
    <entry id="120" name="100bao" ori_country="USA" ori_language="English">
        <category>general-internet</category>
        <subcategory>file-sharing</subcategory>
        <technology>peer-to-peer</technology>
        <evasive-behavior>yes</evasive-behavior>
        <consume-big-bandwidth>yes</consume-big-bandwidth>
        <used-by-malware>yes</used-by-malware>
        <able-to-transfer-file>yes</able-to-transfer-file>
        <has-known-vulnerability>yes</has-known-vulnerability>
        <tunnel-other-application>no</tunnel-other-application>
        <prone-to-misuse>yes</prone-to-misuse>
        <pervasive-use>yes</pervasive-use>
        <risk>5</risk>
        <references>
            <entry name="www.100bao.com">
                <link>http://www.100bao.com/</link>
            </entry>
        </references>
        <per-direction-regex>no</per-direction-regex>
        <appident>yes</appident>
        <default>
            <port>
                <member>tcp/3468,6346,11300</member>
            </port>
        </default>
    </entry>
</application>

Python

import xml.etree.ElementTree as ET

mytree = ET.parse('apps.xml')
root = mytree.getroot()

for entry in root.findall('entry'):
    name = entry.get('name')
    category = entry.find('category').text
    risk = entry.find('risk').text
    member = entry.find('default/port/member').text

    print(f'Name: "{name}"\nCategory: "{category}"\nRisk: "{risk}"\nMember: "{member}"')

Output

Name: "100bao"
Category: "general-internet"
Risk: "5"
Member: "tcp/3468,6346,11300"
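
If some entries in the real file lack a category, risk, or member element, calling `.text` on the result of find() raises the same AttributeError. One possible way around that, sketched below, is Element.findtext() with a default value. The parse_entries helper and the second entry named "example-app" are hypothetical and exist only to show the missing-element case; they are not from the original file.

```python
import xml.etree.ElementTree as ET

# Inline sample for illustration. The second entry is hypothetical and
# deliberately has no <default>/<port>/<member> element.
XML = """
<application>
  <entry id="120" name="100bao">
    <category>general-internet</category>
    <risk>5</risk>
    <default>
      <port>
        <member>tcp/3468,6346,11300</member>
      </port>
    </default>
  </entry>
  <entry id="999" name="example-app">
    <category>general-internet</category>
    <risk>1</risk>
  </entry>
</application>
"""


def parse_entries(root):
    """Collect name, category, risk and member for each top-level entry."""
    rows = []
    for entry in root.findall('entry'):
        rows.append({
            'name': entry.get('name'),
            # findtext() returns the default instead of failing on a missing element
            'category': entry.findtext('category', default=''),
            'risk': entry.findtext('risk', default=''),
            'member': entry.findtext('default/port/member', default=''),
        })
    return rows


rows = parse_entries(ET.fromstring(XML))
for row in rows:
    print(row)
```

With this approach a missing member shows up as an empty string rather than stopping the loop; whether an empty string or None is the better placeholder depends on what you do with the values afterwards.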
2023-01-28