Answer a question

Python code:

import xml.etree.ElementTree as ET
root = ET.parse("h.xml")
print root.findall('saybye')

h.xml code:

<hello>
  <saybye>
   <saybye>
   </saybye>
  </saybye>
  <saybye>
  </saybye>
</hello>

Code outputs,

[<Element 'saybye' at 0x7fdbcbbec690>, <Element 'saybye' at 0x7fdbcbbec790>]

saybye which is a child of another saybye is not selected here. So, how to instruct findall to recursively walk down the DOM tree and collect all three saybye elements?

Answers

Quoting findall,

Element.findall() finds only elements with a tag which are direct children of the current element.

Since it finds only the direct children, we need to recursively find other children, like this

>>> import xml.etree.ElementTree as ET
>>> 
>>> def find_rec(node, element, result):
...     for item in node.findall(element):
...         result.append(item)
...         find_rec(item, element, result)
...     return result
... 
>>> find_rec(ET.parse("h.xml"), 'saybye', [])
[<Element 'saybye' at 0x7f4fce206710>, <Element 'saybye' at 0x7f4fce206750>, <Element 'saybye' at 0x7f4fce2067d0>]

Even better, make it a generator function, like this

>>> def find_rec(node, element):
...     for item in node.findall(element):
...         yield item
...         for child in find_rec(item, element):
...             yield child
... 
>>> list(find_rec(ET.parse("h.xml"), 'saybye'))
[<Element 'saybye' at 0x7f4fce206a50>, <Element 'saybye' at 0x7f4fce206ad0>, <Element 'saybye' at 0x7f4fce206b10>]
Logo

Python社区为您提供最前沿的新闻资讯和知识内容

更多推荐