Answer a question

I am working with Beautiful Soup in Python on a project that retrieves my school schedule. The website is badly written: it is an HTML table in which every cell contains another table, and the data sits inside that inner table. There are no ids or classes.

I've managed to get a list of all the tables I need, but there is a single value (rowspan) on the parent of each table that I can't access and still need.

Is it possible to look up the parents of the soup when I do have the complete source lying around?

page:

<td colspan="12" rowspan="4" align="center">
    <table>
        <tr><td>*data is here*</td></tr>

(my soup object is made of the HTML, starting at the table)

Answers

You can find a tag's parent through its .parent attribute.

...
print(soup.find('table').parent)
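
As a minimal sketch of the .parent approach, assuming the soup is built from the complete page source rather than from the fragment that starts at the inner table. The `full_html` string below is a made-up stand-in for the real schedule page, which is not shown in the question:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the complete page source (assumption:
# the real page wraps each inner table in a <td> like this one).
full_html = """
<table>
  <tr>
    <td colspan="12" rowspan="4" align="center">
      <table>
        <tr><td>*data is here*</td></tr>
      </table>
    </td>
  </tr>
</table>
"""

soup = BeautifulSoup(full_html, "html.parser")

outer = soup.find("table")    # the outermost table
inner = outer.find("table")   # first table nested inside a cell

cell = inner.parent           # the <td> that directly wraps the inner table
rowspan = cell.get("rowspan") # attribute values are strings; None if absent

print(cell.name, rowspan)
```

Note that .parent can only return what was parsed. If the soup was created from HTML that starts at the inner table, the table's parent is the BeautifulSoup object itself, not the original td, so the full source has to be parsed for this to work.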

Edit: Try using the find_previous() method:

>>> from bs4 import BeautifulSoup
>>> html = """
... <td colspan="12" rowspan="4" align="center">
...     <table>
...         <tr><td>*data is here*</td></tr>
... """
>>> soup = BeautifulSoup(html, "html.parser")
>>>
>>> for tag in soup.find_all("table"):
...     print(tag.find_previous("td")["rowspan"])
...
4
>>>
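
find_previous() walks backwards in document order, so it relies on the wrapping td being the nearest td before each table. As an alternative sketch, find_parent() walks up the ancestors instead and skips the outermost table, which has no enclosing cell. The two-cell `html` string and the `rowspans` list are illustrative assumptions, not the real page:

```python
from bs4 import BeautifulSoup

# Hypothetical two-cell page (assumption: every inner table sits
# directly inside a <td> that carries the rowspan attribute).
html = """
<table>
  <tr>
    <td rowspan="4"><table><tr><td>first</td></tr></table></td>
    <td rowspan="2"><table><tr><td>second</td></tr></table></td>
  </tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

rowspans = []
for table in soup.find_all("table"):
    cell = table.find_parent("td")  # nearest enclosing <td>, or None
    if cell is None:
        continue                    # outermost table: not inside any cell
    # Pair each inner table's text with the rowspan of its wrapping cell.
    rowspans.append((table.get_text(strip=True), cell.get("rowspan")))

print(rowspans)
```

Using cell.get("rowspan") rather than cell["rowspan"] avoids a KeyError for cells that have no rowspan attribute.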