Answer a question

I have a chunk of HTML extracted with bs4, as follows:

<div class="a-section a-spacing-small" id="productDescription">
<!-- show up to 2 reviews by default -->
<p>Satin Smooth Universal Protective Wax Pot Collars by Satin Smooth</p>
</div>

To extract the text I was using text.strip()

output.text()

It raised the error "TypeError: 'str' object is not callable"

When I used output.get_text() or output.getText() instead, I got the desired text.

What are the differences between these three? And why do get_text() and getText() give the same output?

Answers

They are very similar:

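  • .get_text is a method that returns the text of a tag as a string
  • .text is a property that calls get_text (so it's identical, except you don't use parentheses; writing .text() tries to call the returned string, which is why you got the TypeError)
  • .getText is an alias of get_text, which is why both give the same output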

I would use .text whenever possible, and .get_text(...) when you need to pass custom arguments (e.g. foo.get_text(strip=True, separator='\n')).
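
To make the comparison concrete, here is a minimal sketch that runs all three on the snippet from the question. It assumes bs4 is installed and uses the built-in "html.parser"; the variable names (soup, output, and so on) are only illustrative, not something from the original post.

```python
from bs4 import BeautifulSoup

# The HTML from the question
html = """<div class="a-section a-spacing-small" id="productDescription">
<!-- show up to 2 reviews by default -->
<p>Satin Smooth Universal Protective Wax Pot Collars by Satin Smooth</p>
</div>"""

soup = BeautifulSoup(html, "html.parser")
output = soup.find("div", id="productDescription")

# Property access: no parentheses
text_via_property = output.text

# Method call and its camelCase alias
text_via_method = output.get_text()
text_via_alias = output.getText()

# All three return the same string (surrounding newlines included)
print(text_via_property == text_via_method == text_via_alias)

# .text is already a str, so adding parentheses tries to call that str
try:
    output.text()
    call_error = None
except TypeError as exc:
    call_error = str(exc)
print(call_error)

# get_text() accepts arguments; strip=True trims each piece of text
# and drops the whitespace-only ones
clean_text = output.get_text(strip=True, separator="\n")
print(clean_text)
```

Note that the plain .text result still carries the newlines around the <p> tag, so either call .strip() on it or pass strip=True to get_text().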
