Get JS var value in HTML source using BeautifulSoup in Python
I'm trying to get the value of a JavaScript variable from HTML source code using BeautifulSoup.
For example, I have:
<script>
[other code]
var my = 'hello';
var name = 'hi';
var is = 'halo';
[other code]
</script>
I want something that returns the value of the variable "my" in Python.
How can I achieve that?
Answers
The simplest approach is to use one regular expression for both steps: pass it to BeautifulSoup to locate the script element whose contents contain the assignment, then apply it again to extract the value:
import re
from bs4 import BeautifulSoup
data = """
<script>
[other code]
var my = 'hello';
var name = 'hi';
var is = 'halo';
[other code]
</script>
"""
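soup = BeautifulSoup(data, "html.parser")
# Capture whatever sits between the quotes in the "var my = '...';" line.
pattern = re.compile(r"var my = '(.*?)';$", re.MULTILINE | re.DOTALL)
# string= replaces the older text= argument, which newer bs4 releases deprecate.
script = soup.find("script", string=pattern)
print(pattern.search(script.string).group(1))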
Prints hello.
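If the variable name is not fixed, or the page may not contain a matching script at all, the same idea can be wrapped in a small helper. The following is only a sketch: the function name get_js_var is made up for illustration, and it assumes the value is a plain single- or double-quoted string literal on one line, not an arbitrary JavaScript expression.

```python
import re
from bs4 import BeautifulSoup


def get_js_var(html, name):
    """Return the string literal assigned to `var <name>` in any <script>, or None.

    Hypothetical helper, not part of BeautifulSoup.
    """
    # Match: var <name> = '<value>' or "<value>", tolerating flexible whitespace.
    # The backreference \1 requires the closing quote to match the opening one.
    pattern = re.compile(
        r"\bvar\s+" + re.escape(name) + r"\s*=\s*(['\"])(.*?)\1"
    )
    soup = BeautifulSoup(html, "html.parser")
    for script in soup.find_all("script"):
        # script.string is None for empty or external (src=...) scripts.
        match = pattern.search(script.string or "")
        if match:
            return match.group(2)
    return None


html = """
<script>
var my = 'hello';
var name = "hi";
</script>
"""
print(get_js_var(html, "my"))
print(get_js_var(html, "missing"))
```

This prints hello and then None. Returning None instead of raising avoids the AttributeError the one-liner above would hit when no script matches.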