Answer a question

I'm trying to extract the value of a JavaScript variable from HTML source code using BeautifulSoup.

For example, I have:

<script>
[other code]
var my = 'hello';
var name = 'hi';
var is = 'halo';
[other code]
</script>

I want something in Python that returns the value of the variable "my".

How can I achieve that?

Answers

One simple approach is to use a single regular expression both to locate the script element via BeautifulSoup and to extract the desired substring:

import re

from bs4 import BeautifulSoup

data = """
<script>
[other code]
var my = 'hello';
var name = 'hi';
var is = 'halo';
[other code]
</script>
"""

soup = BeautifulSoup(data, "html.parser")

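# Match the assignment to `my` and capture the quoted value.
pattern = re.compile(r"var my = '(.*?)';$", re.MULTILINE | re.DOTALL)
# `string=` is the current name of the older `text=` argument (Beautiful Soup 4.4.0+).
script = soup.find("script", string=pattern)

print(pattern.search(script.text).group(1))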

Prints hello.
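
If you need more than one variable, or the page may not contain the variable at all, the same idea can be wrapped in a small helper. This is only a sketch under some assumptions: `get_js_var` is a hypothetical name (not part of BeautifulSoup), the value is assumed to be a simple single- or double-quoted string literal without escaped quotes, and the script is assumed to be inline rather than loaded via `src`. It returns None instead of raising when nothing matches.

```python
import re

from bs4 import BeautifulSoup


def get_js_var(html, name):
    """Return the string literal assigned to `var <name>` in a <script> tag, or None."""
    # Matches: var <name> = '<value>';  or  var <name> = "<value>";
    # Group 1 is the opening quote, group 2 is the value, \1 is the matching close quote.
    pattern = re.compile(
        r"""\bvar\s+%s\s*=\s*(['"])(.*?)\1\s*;""" % re.escape(name)
    )
    soup = BeautifulSoup(html, "html.parser")
    for script in soup.find_all("script"):
        # script.string is None for empty tags (e.g. <script src=...>), so fall back to "".
        match = pattern.search(script.string or "")
        if match:
            return match.group(2)
    return None


html = """
<script>
var my = 'hello';
var name = "hi";
</script>
"""

print(get_js_var(html, "my"))
print(get_js_var(html, "missing"))
```

A regular expression like this only handles simple assignments. For values that are objects, arrays, or strings with escaped quotes, a real JavaScript parser would likely be more reliable.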
