Question

I know that an app server can be configured to:

  1. Launch new process per request

  2. Launch new thread per request

This question is about using Python multiprocessing (or multithreading) code inside a Flask endpoint: for example, multiprocessing for CPU-intensive work, or multithreading for I/O-intensive work.

I have a Flask endpoint whose code takes 40 seconds to run (CPU-intensive work). I used a Python multiprocessing pool inside the endpoint code, so that certain CPU-intensive parts run in parallel across multiple processes, and the endpoint now takes 4 seconds.
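A minimal sketch of the setup described above, assuming a hypothetical route and workload (the route name, chunk sizes, and pool size are illustrative, not from the original post):

```python
from multiprocessing import Pool

from flask import Flask, jsonify

app = Flask(__name__)

def crunch(n):
    # Stand-in for one CPU-intensive chunk of the long-running job.
    return sum(i * i for i in range(n))

@app.route("/heavy")
def heavy():
    chunks = [200_000] * 10
    # Each chunk runs in its own process, so the chunks execute in
    # parallel instead of serially inside the request thread.
    with Pool(processes=4) as pool:
        results = pool.map(crunch, chunks)
    return jsonify(total=sum(results))
```

Note that the pool is created and torn down inside the request handler; the request still blocks until all chunks finish, it just finishes sooner.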

Is it OK to use Python multiprocessing (or multithreading) inside an endpoint under either of the two app server configurations above (that is, when the app server serves each request in a new thread, or each request in a new process)? Thread-per-request is the default for the Flask development server, whereas with gunicorn I can choose either. Is there anything I need to consider when using multiprocessing (or multithreading) inside a Flask endpoint so that I don't interfere with Flask's own processes or threads?

I know that a better solution is to use a task queue, but this question is specifically about using multithreading/multiprocessing.

Answers

In short, don't. It's tempting to try to negotiate some way to do a lot of work directly within a request handler, but that path leads to pain.

Consider instead one of the frameworks that allows a request handler (e.g., a Flask route) to queue up a task to be run asynchronously. The handler queues the work and gets back a task id, saving it in some way that allows the UI to poll for task completion. Meanwhile, an entirely separate process outside of Flask picks up the work, performs it, and returns a response through the framework (or separately via a shared data store).

Celery and RQ are two such frameworks. (The Flask Mega-Tutorial has a chapter on RQ that's worth a read.) They do require some additional setup. At a minimum, you'll need a shared Redis instance.
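The enqueue-and-poll pattern might look like the following sketch using RQ, assuming a Redis instance on localhost and a hypothetical `long_task` function (route names and payloads are illustrative; newer RQ versions also offer `job.return_value()` in place of `job.result`):

```python
from redis import Redis
from rq import Queue

from flask import Flask, jsonify

app = Flask(__name__)
queue = Queue(connection=Redis())  # assumes a local Redis server

def long_task(n):
    # Runs in a separate "rq worker" process, never inside Flask.
    return sum(i * i for i in range(n))

@app.route("/start")
def start():
    # Enqueue the work and return immediately with a task id.
    job = queue.enqueue(long_task, 50_000_000)
    return jsonify(task_id=job.get_id()), 202

@app.route("/status/<task_id>")
def status(task_id):
    # The UI polls this endpoint until the job is finished.
    job = queue.fetch_job(task_id)
    if job is None:
        return jsonify(error="unknown task"), 404
    return jsonify(finished=job.is_finished,
                   result=job.result if job.is_finished else None)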

This approach has several benefits. First, it allows your web app to remain responsive: if your 40-second task grows into an 80-, 160-, or several-thousand-second task, you won't tie up a Flask thread. Second, it protects Flask against memory growth and fragmentation; the task is performed in an entirely separate process, which releases its memory when it exits.

What you do in those tasks is isolated from Flask. Want to use multiple processes or thread pools within a task? Fine. There's very little* risk of interfering with Flask. (* You could exhaust memory if you're running task workers on the same server as Flask.)
