How to scrape data from a webpage that uses react.js with Selenium in Python?
I am having trouble scraping a website that uses react.js, and I am not sure why.
This is the html of the website:
What I want to do is click the button with the class play-pause-button btn btn-naked. However, when I load the page with the Mozilla geckodriver WebDriver, an exception is thrown saying:
Message: Unable to locate element: .play-pause-button btn btn-naked
which makes me think that maybe I should do something else to get this element? This is my code so far:
driver.get("https://drawittoknowit.com/course/neurological-system/anatomy/peripheral-nervous-system/1332/brachial-plexus---essentials")
# execute script to scroll down the page
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);var lenOfPage=document.body.scrollHeight;return lenOfPage;")
time.sleep(10)
soup = BeautifulSoup(driver.page_source, 'lxml')
print(driver.page_source)
play_button = driver.find_element_by_class_name("play-pause-button btn btn-naked").click()
print(play_button)
Does anyone have an idea as to how I could go about solving this? Any help is much appreciated
Answers
You were close. find_element_by_class_name() accepts only a single class name, so you can't pass several classes at once. You can pass only one of the following:
play-pause-button
btn
btn-naked
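Passing multiple classes to find_element_by_class_name() fails. Depending on the driver, you will see either Message: invalid selector: Compound class names not permitted or, as with geckodriver in the question, Message: Unable to locate element, because the string is turned into the CSS selector .play-pause-button btn btn-naked, whose spaces are read as descendant combinators rather than as additional classes.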
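If you start from the full class attribute as it appears in the HTML, one way around this is to turn it into a single CSS selector that requires all of the classes. A minimal sketch; the helper name class_attr_to_css is made up for illustration and is not part of Selenium:

```python
def class_attr_to_css(class_attr, tag=""):
    """Build a CSS selector that requires every class in class_attr.

    Hypothetical helper for illustration; not part of Selenium.
    """
    classes = class_attr.split()
    if not classes:
        raise ValueError("class_attr must contain at least one class name")
    # Chain the classes with dots and no spaces: a space in a CSS selector
    # means "descendant", not "also has this class".
    return tag + "".join("." + name for name in classes)


selector = class_attr_to_css("play-pause-button btn btn-naked", tag="button")
print(selector)  # button.play-pause-button.btn.btn-naked
```

The resulting string is meant to be used with a CSS selector lookup (By.CSS_SELECTOR), not with find_element_by_class_name().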
Solution
Because the element is rendered by React after the initial page load, wait for it with WebDriverWait and element_to_be_clickable() before calling click(). You can use either of the following locator strategies:
- Using CSS_SELECTOR:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button.play-pause-button.btn.btn-naked"))).click()
- Using XPATH:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[@class='play-pause-button btn btn-naked']"))).click()
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
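To see why the wait matters: the button does not exist until the page's JavaScript has rendered it, so a lookup made too early fails. WebDriverWait retries its condition until it succeeds or the timeout expires. The sketch below imitates that retry loop in plain Python. wait_until and the fake lookup find_button are illustrative stand-ins, not Selenium's implementation:

```python
import time


def wait_until(condition, timeout=5.0, poll_interval=0.1):
    """Call condition() repeatedly until it returns a truthy value.

    Simplified stand-in for what WebDriverWait(...).until(...) does;
    raises TimeoutError if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll_interval)


# Fake "page" whose button only appears on the third lookup,
# mimicking an element that is rendered after the initial load.
attempts = {"count": 0}


def find_button():
    attempts["count"] += 1
    return "button" if attempts["count"] >= 3 else None


element = wait_until(find_button, timeout=2.0, poll_interval=0.05)
print(element, "found after", attempts["count"], "lookups")
```

A fixed time.sleep(10) either wastes time or is still too short on a slow load, whereas a polling wait returns as soon as the element is ready.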