
I am trying to get the URL of each live match from https://www.348365365.com/#/IP/B1. Here is a Python script in which I use Selenium to load the main page, which lists all live matches, and then parse it with BeautifulSoup.

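    import time

    from bs4 import BeautifulSoup
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    # Run Chrome without a visible window
    options = Options()
    options.add_argument('--headless')
    options.add_argument('--disable-gpu')

    driver = webdriver.Chrome(options=options)
    driver.get('https://www.348365365.com/#/IP/B1')

    # Give the JavaScript-rendered page time to load
    time.sleep(10)

    page = driver.page_source

    driver.quit()

    soup = BeautifulSoup(page, 'html.parser')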

The problem is that I cannot find the event ID. As an example, a match URL should look like this: https://www.348365365.com/#/IP/EV15569134772C1. I need IDs such as EV15569134772C1 to build the URL for each match, but they are not present in the page source.

Answers

The site seems to be inaccessible with Selenium: the page loads indefinitely.

-> https://www.tutorialfor.com/questions-316541.htm

If you do manage to connect with Selenium, simulate a click on each match div, read the current URL, go back to the list, and repeat for the next match.
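The click-and-return loop described above could be sketched roughly as follows. This is only an illustration under assumptions: `collect_event_urls`, `event_id_from_url` and the `row_selector` argument are hypothetical names, the CSS selector for the match rows has to be found by inspecting the live page, and the URL pattern is inferred from the single example in the question.

```python
import re
import time

# Pattern inferred from the one example URL in the question
# (https://www.348365365.com/#/IP/EV15569134772C1); an assumption.
EVENT_ID_RE = re.compile(r"#/IP/(EV\w+)")


def event_id_from_url(url):
    """Return the event id found in the URL fragment, or None."""
    match = EVENT_ID_RE.search(url)
    return match.group(1) if match else None


def collect_event_urls(driver, row_selector, pause=2.0):
    """Click each match row, record the URL it leads to, then go back.

    `driver` is an already-connected Selenium WebDriver and `row_selector`
    is a CSS selector for the clickable match rows (hypothetical; it must
    be determined from the real page).
    """
    from selenium.webdriver.common.by import By  # imported lazily

    urls = []
    count = len(driver.find_elements(By.CSS_SELECTOR, row_selector))
    for index in range(count):
        # Re-query on every pass: element references go stale after navigation.
        rows = driver.find_elements(By.CSS_SELECTOR, row_selector)
        if index >= len(rows):
            break
        rows[index].click()
        time.sleep(pause)  # crude wait for the URL fragment to change
        if event_id_from_url(driver.current_url):
            urls.append(driver.current_url)
        driver.back()
        time.sleep(pause)
    return urls
```

The rows are looked up again on each iteration because the list is re-rendered after going back, so element handles obtained before the click would likely no longer be valid.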

Moreover, bet365 has been protecting itself against web scraping for a long time. From what I've seen, once the page is loaded, nothing more goes through the network, so the IDs must be somewhere in the JS, HTML, or XHR responses that have already been loaded. Good luck with the reverse engineering :)
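If you save those JS, HTML, or XHR bodies as text, a first reverse-engineering step might be to scan them for tokens shaped like the event ID. The sketch below assumes the IDs follow the form `EV<digits>C<digits>`, which is guessed from the single example in the question; `find_event_ids` and `build_match_urls` are hypothetical helpers, and the IDs may well be encoded differently in the real payloads.

```python
import re

# Assumed ID shape, guessed from the example EV15569134772C1.
EVENT_TOKEN_RE = re.compile(r"\bEV\d+C\d+\b")

BASE_URL = "https://www.348365365.com"


def find_event_ids(text):
    """Return the distinct event-id-like tokens in text, in first-seen order."""
    return list(dict.fromkeys(EVENT_TOKEN_RE.findall(text)))


def build_match_urls(event_ids, base_url=BASE_URL):
    """Build match URLs in the format shown in the question."""
    return [f"{base_url}/#/IP/{event_id}" for event_id in event_ids]
```

For example, `build_match_urls(find_event_ids(body))` would turn any matching tokens in a captured response `body` into candidate match URLs.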
