ElementTree - findall to recursively select all child elements
·
Answer a question
Python code:
import xml.etree.ElementTree as ET
root = ET.parse("h.xml")
print root.findall('saybye')
h.xml code:
<hello>
<saybye>
<saybye>
</saybye>
</saybye>
<saybye>
</saybye>
</hello>
Code outputs,
[<Element 'saybye' at 0x7fdbcbbec690>, <Element 'saybye' at 0x7fdbcbbec790>]
saybye which is a child of another saybye is not selected here. So, how to instruct findall to recursively walk down the DOM tree and collect all three saybye elements?
Answers
Quoting findall,
Element.findall()finds only elements with a tag which are direct children of the current element.
Since it finds only the direct children, we need to recursively find other children, like this
>>> import xml.etree.ElementTree as ET
>>>
>>> def find_rec(node, element, result):
... for item in node.findall(element):
... result.append(item)
... find_rec(item, element, result)
... return result
...
>>> find_rec(ET.parse("h.xml"), 'saybye', [])
[<Element 'saybye' at 0x7f4fce206710>, <Element 'saybye' at 0x7f4fce206750>, <Element 'saybye' at 0x7f4fce2067d0>]
Even better, make it a generator function, like this
>>> def find_rec(node, element):
... for item in node.findall(element):
... yield item
... for child in find_rec(item, element):
... yield child
...
>>> list(find_rec(ET.parse("h.xml"), 'saybye'))
[<Element 'saybye' at 0x7f4fce206a50>, <Element 'saybye' at 0x7f4fce206ad0>, <Element 'saybye' at 0x7f4fce206b10>]
更多推荐

所有评论(0)