10.多线程，多进程

系统差异进程池进程间通信多线程线程锁ThreadLocal1.系统差异在Linux和Unix系统上，os模块有fork调用方式来创建子进程，在windows上可用multiprocessing模块来操作进程。fork方式：import osprint('Process (%s) start...' % os.getpid())#...

Aislli

3749人浏览 · 2018-07-23 16:22:37

Aislli · 2018-07-23 16:22:37 发布

系统差异
进程池
进程间通信
多线程
线程锁
ThreadLocal

1.系统差异
在Linux和Unix系统上，os模块有fork调用方式来创建子进程，在windows上可用multiprocessing模块来操作进程。
fork方式：

import os

print('Process (%s) start...' % os.getpid())
# Only works on Unix/Linux/Mac:
pid = os.fork()
if pid == 0:
    print('I am child process (%s) and my parent is %s.' % (os.getpid(), os.getppid()))
else:
    print('I (%s) just created a child process (%s).' % (os.getpid(), pid))

windows上的方式：

from multiprocessing import Process

import os


# 子进程要执行的代码
def run_proc(name):
    print('Run child process %s (%s)...' % (name, os.getpid()))


if __name__ == '__main__':
    print('Parent process %s.' % os.getpid())
    p = Process(target=run_proc, args=('test',))
    print('Child process will start.')
    p.start()
    p.join()
    print('Child process end.')
---
Parent process 7504.
Child process will start.
Run child process test (2532)...
Child process end.

创建一个子进程可通过创建一个Process实例，传入一个执行函数和参数，通过start启动，join方法可让进程执行完了再继续往下执行，通常用于进程间同步。

2.进程池pool

用进程池的方式创建子进程：

from multiprocessing import Pool
import os, time, random

def long_time_task(name):
    print('Run task %s (%s)...' % (name, os.getpid()))
    start = time.time()
    time.sleep(random.random() * 3)
    end = time.time()
    print('Task %s runs %0.2f seconds.' % (name, (end - start)))

if __name__=='__main__':
    print('Parent process %s.' % os.getpid())
    p = Pool(4)
    for i in range(5):
        p.apply_async(long_time_task, args=(i,))
    print('Waiting for all subprocesses done...')
    p.close()
    p.join()
    print('All subprocesses done.')
---
Parent process 2836.
Waiting for all subprocesses done...
Run task 0 (5516)...
Run task 1 (1112)...
Run task 2 (6464)...
Run task 3 (7860)...
Task 1 runs 1.87 seconds.
Run task 4 (1112)...
Task 2 runs 2.38 seconds.
Task 0 runs 2.65 seconds.
Task 3 runs 2.57 seconds.
Task 4 runs 2.39 seconds.
All subprocesses done.

pool调用close()后就不能再继续添加新的Process了，调用join方法等待所有子进程执行完，在调用join方法前必须先调用close()方法。

还有个需要注意的地方是task0，1，2，3是立即执行的，task4是等到有一个进程执行完了它才执行，这是因为我的电脑是4核的，所以Pool的默认大小是4，因此最多执行4个进程，这是pool有意设计的限制，而不是操作系统的限制，如果非要把pool(4)改成pool(5)，也是可以的，就会同时执行5个进程。

3.进程间通信

from multiprocessing import Process, Queue
import os, time, random


# 写数据进程执行的代码:
def write(q):
    print('Process to write: %s' % os.getpid())
    for value in ['A', 'B', 'C']:
        print('Put %s to queue...' % value)
        q.put(value)
        time.sleep(random.random()*5)


# 读数据进程执行的代码:
def read(q):
    print('Process to read: %s' % os.getpid())
    while True:
        value = q.get(True)
        print('Get %s from queue.' % value)


if __name__ == '__main__':
    # 父进程创建Queue，并传给各个子进程：
    q = Queue()
    pw = Process(target=write, args=(q,))
    pr = Process(target=read, args=(q,))
    # 启动子进程pw，写入:
    pw.start()
    # 启动子进程pr，读取:
    pr.start()
    # 等待pw结束:
    pw.join()
    # pr进程里是死循环，无法等待其结束，只能强行终止:
    pr.terminate()
---
Process to write: 5784
Put A to queue...
Process to read: 8012
Get A from queue.
Put B to queue...
Get B from queue.
Put C to queue...
Get C from queue.

5.多线程
Python使用多线程有两个模块_thread和threading，前者是低级模块，后者是高级模块，对前者做了封装，大多数情况下只需要使用后者；

import threading


def methodRun():
    print('thread %s is running...' % threading.current_thread().name)


print('thread %s is running...' % threading.current_thread().name)
t = threading.Thread(target=methodRun,name='ChildThread')
t.start()
t.join()
print('thread %s ended.' % threading.current_thread().name)
---
thread MainThread is running...
thread ChildThread is running...
thread MainThread ended.

线程锁
当多个线程同时执行同一段代码操作同一个变量时，有可能因线程不同步造成结果错误，此时就需要同步锁保持线程同步，相当于JAVA里的synchronized:

import threading

lock = threading.Lock()


# lock = threading.Lock()

def methodRun():
    lock.acquire()#获取锁
    try:
        print('thread %s is running...' % threading.current_thread().name)
    finally:
        lock.release()#释放锁


print('thread %s is running...' % threading.current_thread().name)
t = threading.Thread(target=methodRun, name='ChildThread')
t.start()
t.join()
print('thread %s ended.' % threading.current_thread().name)

这段代码实际上不需要加同步锁，只是记录下加锁的方式，在获取锁到释放锁之间的代码只会同时被某一个线程执行，不会多个线程同时执行，当一个线程获取到锁后，其它线程只能等到这个线程释放锁后才能获取锁。

ThreadLocal
在多线程中，每个线程都可能有自己处理的变量，如果这些局部变量在线程里处理时又要传递到别的方法里处理，就得一次次传递，比较麻烦：

import threading


class Student:
    def __init__(self, name):
        self.name = name


def pri(std):
    print(std.name)


def stdthread(str):
    std = Student(str)
    pri(std)


def thread_test():
    t1 = threading.Thread(target=stdthread, args=('Tom',), name='Thread-1')
    t2 = threading.Thread(target=stdthread, args=('Mary',), name='Thread-2')
    t1.start()
    t2.start()
    t1.join()
    t2.join()


thread_test()
---
Tom
Mary

那么就希望把这个变量变成全局的，在pri方法里可以直接调用本线程的那个std，不通过参数传进来：

import threading

dstd = {}  # 全局的Students


class Student:
    def __init__(self, name):
        self.name = name


def pri():
    print(dstd[threading.current_thread()].name)  # 取出对应线程的student对象


def stdthread(str):
    std = Student(str)
    dstd[threading.current_thread()] = std
    pri()


def thread_test():
    t1 = threading.Thread(target=stdthread, args=('Tom',), name='Thread-1')
    t2 = threading.Thread(target=stdthread, args=('Mary',), name='Thread-2')
    t1.start()
    t2.start()
    t1.join()
    t2.join()


thread_test()
---
Tom
Mary

这样虽然不用传递参数了，但是获取变量的方式还是有些奇怪，可以用Python提供的ThreadLocal实现：

import threading

std_local = threading.local()

class Student:
    def __init__(self, name):
        self.name = name


def pri():
    print(std_local.std.name)  # 取出对应线程的student对象


def stdthread(str):
    std = Student(str)
    std_local.std = std
    pri()


def thread_test():
    t1 = threading.Thread(target=stdthread, args=('Tom',), name='Thread-1')
    t2 = threading.Thread(target=stdthread, args=('Mary',), name='Thread-2')
    t1.start()
    t2.start()
    t1.join()
    t2.join()


thread_test()
---
Tom
Mary