Scraping a web page after accepting cookies in Python
I'm trying to scrape a web page, but before the page can be accessed there is a banner for accepting cookies. I am using Selenium to click the "Accept all cookies" button, but even after clicking it I can't get the right HTML page.
This is my code:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup
url = 'https://www.wikiparfum.fr/explore/by-name?query=dior'
driver = webdriver.Chrome(executable_path=DRIVER_PATH)
driver.get(url)
driver.find_element_by_id('onetrust-accept-btn-handler').click()
html = driver.page_source
soup = BeautifulSoup(html, 'lxml')
print(soup)
And this is the beginning of the HTML page that is printed:
If anyone can help me with this one, thank you!
Answers
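You should wait for the "Accept all cookies" button to appear before clicking it. The banner is most likely injected by JavaScript after the initial page load, so the button may not exist yet when find_element_by_id runs right after driver.get(url):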
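from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup

url = 'https://www.wikiparfum.fr/explore/by-name?query=dior'
# DRIVER_PATH is the path to your chromedriver executable, as in the question.
# Note: Selenium 4.10+ no longer accepts executable_path here; pass a
# Service(executable_path=DRIVER_PATH) object via service= instead.
driver = webdriver.Chrome(executable_path=DRIVER_PATH)
# Explicit wait: poll for up to 20 seconds before giving up
wait = WebDriverWait(driver, 20)
driver.get(url)
# Block until the cookie button is visible, then click it
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "#onetrust-accept-btn-handler"))).click()
# Read and parse the page source once the banner has been accepted
html = driver.page_source
soup = BeautifulSoup(html, 'lxml')
print(soup)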
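For intuition, the explicit wait used above behaves roughly like a polling loop: it calls a condition repeatedly until the condition returns something truthy or the timeout expires. The sketch below illustrates that idea in plain Python with no browser involved. The names wait_until, WaitTimeout and find_cookie_button are hypothetical and are not part of Selenium. Selenium's own WebDriverWait works along similar lines (it polls every 0.5 seconds by default and ignores NoSuchElementException between polls), but this is only a simplified illustration, not its implementation.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy (hypothetical name)."""


def wait_until(condition, timeout=20.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises WaitTimeout once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Stand-in for a page whose cookie button shows up a moment after loading.
page_loaded_at = time.monotonic()


def find_cookie_button():
    # Returns None until 0.2 s have passed, then a fake "button" value.
    if time.monotonic() - page_loaded_at >= 0.2:
        return "accept-button"
    return None


# An immediate lookup would get None here; waiting returns the button.
button = wait_until(find_cookie_button, timeout=2.0, poll_interval=0.05)
print(button)
```

The same reasoning applies after the click: if the page content is also rendered by JavaScript, reading driver.page_source immediately may still capture an incomplete page, so a second wait on an element of the real content may be needed.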