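While studying the code the instructor demonstrated in the course "Python超强爬虫8天速成(完整版)爬取各种网站数据实战案例", Day 7 - 06 (headless browser + evading detection), I ran into some problems. I am sharing the problems and how I worked through them, and I welcome feedback.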

from selenium import webdriver
from time import sleep
from selenium.webdriver.chrome.options import Options
from selenium.webdriver import ChromeOptions

# non visual interface
chrome_options = Options()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--disable-gpu')

# avoid detection risks
option = ChromeOptions()
option.add_experimental_option('excludeSwitches', ['enable-automation'])


driver = webdriver.Chrome(executable_path='./chromedriver.exe', chrome_options=chrome_options, options=option)

driver.get('https://www.baidu.com')
# get page source
print(driver.page_source)
sleep(2)
driver.quit()
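
Whether the script above runs cleanly depends on the installed Selenium release, because the keyword arguments accepted by webdriver.Chrome differ between versions. A minimal sketch for checking the installed version before debugging further; the helper names installed_version and major_version are made up for illustration and are not part of Selenium:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="selenium"):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def major_version(version_string):
    """Extract the leading major number from a version string such as '4.1.0'."""
    return int(version_string.split(".")[0])


# Prints the Selenium version, or None when Selenium is not installed
print(installed_version())
```

Running pip show selenium on the command line gives the same information.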

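I was initially using Selenium 3.7, and the script failed with TypeError: __init__() got an unexpected keyword argument 'options'. As a beginner I was puzzled and could not find a suitable fix online, so I tried upgrading Selenium to version 4.1.0. After the upgrade, two warnings appear: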

01: DeprecationWarning: executable_path has been deprecated, please pass in a Service object. This is raised by driver = webdriver.Chrome(executable_path='./chromedriver.exe')

Solution:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service

# Create a Service object that points to the ChromeDriver executable
service = Service('./chromedriver.exe')

# Initialize the Chrome WebDriver through the Service object
driver = webdriver.Chrome(service=service)

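02: DeprecationWarning: use options instead of chrome_options. This is raised by driver = webdriver.Chrome(service=service, chrome_options=chrome_options, options=option)

Both chrome_options and option have to be passed through the single options parameter, and at first I did not know how to resolve this. In the end I tried putting both the headless settings and the detection-avoidance settings into one Options object, as follows: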

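from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

# Put the headless and detection-avoidance settings into one Options object
chrome_options = Options()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--disable-gpu')
chrome_options.add_experimental_option('excludeSwitches', ['enable-automation'])

service = Service('./chromedriver.exe')
driver = webdriver.Chrome(service=service, options=chrome_options)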

In my testing, both the headless (background) mode and the detection-avoidance setting took effect.
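
One way to probe the detection side is to ask the page what navigator.webdriver reports, since that property is a common signal sites inspect. The sketch below is only an illustration: reports_webdriver and FakeDriver are hypothetical names, and FakeDriver is a stand-in so the snippet runs without a browser. Whether excludeSwitches alone changes this value may depend on the Chrome version, so treat the result as one signal rather than proof.

```python
def reports_webdriver(driver):
    """Return True if the page's JavaScript sees navigator.webdriver as truthy."""
    # execute_script runs JavaScript in the current page and returns its result
    return bool(driver.execute_script("return navigator.webdriver"))


class FakeDriver:
    """Hypothetical stand-in for a Selenium driver, used only for this demo."""

    def __init__(self, flag):
        self.flag = flag

    def execute_script(self, script):
        # A real driver would evaluate `script` in the browser
        return self.flag


print(reports_webdriver(FakeDriver(True)))
print(reports_webdriver(FakeDriver(None)))
```

With a real session, call reports_webdriver(driver) after driver.get(...) and before driver.quit().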

Final code:

from selenium import webdriver
from time import sleep
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

chrome_options = Options()
# non visual interface
chrome_options.add_argument('--headless')
chrome_options.add_argument('--disable-gpu')
# avoid detection risks
chrome_options.add_experimental_option('excludeSwitches', ['enable-automation'])


# Create a Service object that points to the ChromeDriver executable
service = Service('./chromedriver.exe')
# Initialize the Chrome WebDriver through the Service object
driver = webdriver.Chrome(service=service, options=chrome_options)

driver.get('https://www.baidu.com')
print(driver.page_source)
sleep(2)
driver.quit()

Comments and corrections are welcome.
