Answer a question

I would like to use asyncio to fetch the HTML of some web pages.

I run the following code in a Jupyter notebook:

import asyncio
import aiofiles
import aiohttp
from aiohttp import ClientSession

async def get_info(url, session):
    resp = await session.request(method="GET", url=url)
    resp.raise_for_status()
    html = await resp.text(encoding='GB18030')
    with open('test_asyncio.html', 'w', encoding='utf-8-sig') as f:
        f.write(html)
    return html
    
async def main(urls):
    async with ClientSession() as session:
        tasks = [get_info(url, session) for url in urls]
        return await asyncio.gather(*tasks)

if __name__ == "__main__":
    url = ['http://huanyuntianxiazh.fang.com/house/1010123799/housedetail.htm', 'http://zhaoshangyonghefu010.fang.com/house/1010126863/housedetail.htm']
    result = asyncio.run(main(url))

However, it raises RuntimeError: asyncio.run() cannot be called from a running event loop

What is the problem?

How can I solve it?

Answers

The asyncio.run() documentation says:

This function cannot be called when another asyncio event loop is running in the same thread.

In your case, Jupyter (IPython ≥ 7.0) is already running an event loop:

You can now use async/await at the top level in the IPython terminal and in the notebook, it should — in most of the cases — “just work”. Update IPython to version 7+, IPykernel to version 5+, and you’re off to the races.

Therefore you don't need to start the event loop yourself and can instead call await main(url) directly, even if your code lies outside any asynchronous function.
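The restriction quoted from the documentation can be reproduced outside Jupyter as well. The following is only a minimal sketch: the names inner and outer are made up for illustration, and outer simply plays the role of the notebook's already-running loop.

```python
import asyncio


async def inner():
    return 1


async def outer():
    # We are inside a coroutine, so an event loop is already running in
    # this thread (the same situation as a Jupyter cell on IPython >= 7).
    coro = inner()
    try:
        asyncio.run(coro)  # nested call is rejected
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return str(exc)
    return None


# Run as a plain script: this outer asyncio.run() is allowed because no
# loop is running yet; only the nested call inside outer() fails.
message = asyncio.run(outer())
print(message)
```

Replacing the nested asyncio.run(coro) with await coro is the fix, which is exactly what await main(url) does in a notebook cell.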

Jupyter / IPython

async def main():
    print(1)
    
await main()

Python (≥ 3.7) or older versions of IPython

import asyncio

async def main():
    print(1)
    
asyncio.run(main())

In your code that would give:

url = ['url1', 'url2']
result = await main(url)

for text in result:
    pass # text contains your html (text) response
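If the same code has to work both as a plain script and from a context where a loop is already running, one possible workaround is to check for a running loop first. This is a sketch under assumptions, not part of asyncio itself: run_coroutine, double and from_running_loop are hypothetical names, and running the coroutine in a worker thread is a substitute technique for situations where a top-level await is not available.

```python
import asyncio
import concurrent.futures


def run_coroutine(coro):
    """Run coro to completion whether or not a loop is already running."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop in this thread (plain script): asyncio.run() is fine.
        return asyncio.run(coro)
    # A loop is already running here: give the coroutine its own loop in a
    # worker thread. Note that this blocks the calling loop until it is done.
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()


async def double(x):
    await asyncio.sleep(0)
    return x * 2


# Called from ordinary synchronous code.
plain_result = run_coroutine(double(21))


async def from_running_loop():
    # Called while a loop is running, as it would be inside a notebook.
    return run_coroutine(double(4))


nested_result = asyncio.run(from_running_loop())
```

In a notebook, await main(url) remains the simpler choice; the helper is mainly useful when the call has to be made from a synchronous function.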

Caution

There is a slight difference in how Jupyter uses the loop compared to IPython.