This article compares the performance of multiprocessing, threading, and gevent under Python 3, and likewise compares process pools, thread pools, and coroutine pools. Many newcomers find these trade-offs unclear, so the comparison below is worked through in detail; hopefully anyone who needs it will come away with something useful.
Programs today generally run into two kinds of I/O: disk I/O and network I/O. This article looks at the network I/O case and compares the efficiency of processes, threads, and coroutines under Python 3. Processes use a multiprocessing.Pool process pool, threads use a thread pool wrapped by hand, and coroutines use the gevent library. Python 3's built-in urllib.request is also compared against the open-source requests library. The code is as follows:
import urllib.request
import requests
import time
import multiprocessing
import threading
import queue
def startTimer():
    return time.time()

def ticT(startTime):
    useTime = time.time() - startTime
    return round(useTime, 3)

#def tic(startTime, name):
#    useTime = time.time() - startTime
#    print('[%s] use time: %1.3f' % (name, useTime))

def download_urllib(url):
    req = urllib.request.Request(url,
                                 headers={'user-agent': 'Mozilla/5.0'})
    res = urllib.request.urlopen(req)
    data = res.read()
    try:
        data = data.decode('gbk')
    except UnicodeDecodeError:
        data = data.decode('utf8', 'ignore')
    return res.status, data

def download_requests(url):
    req = requests.get(url,
                       headers={'user-agent': 'Mozilla/5.0'})
    return req.status_code, req.text

class threadPoolManager:
    def __init__(self, urls, workNum=10000, threadNum=20):
        self.workQueue = queue.Queue()
        self.threadPool = []
        self.__initWorkQueue(urls)
        self.__initThreadPool(threadNum)

    def __initWorkQueue(self, urls):
        for i in urls:
            self.workQueue.put((download_requests, i))

    def __initThreadPool(self, threadNum):
        for i in range(threadNum):
            self.threadPool.append(work(self.workQueue))

    def waitAllComplete(self):
        for i in self.threadPool:
            if i.is_alive():  # is_alive(); the old isAlive() was removed in Python 3.9
                i.join()

class work(threading.Thread):
    def __init__(self, workQueue):
        threading.Thread.__init__(self)
        self.workQueue = workQueue
        self.start()

    def run(self):
        while True:
            try:
                # non-blocking get: an empty queue means all work is done
                do, args = self.workQueue.get(block=False)
            except queue.Empty:
                break
            do(args)
            self.workQueue.task_done()
urls = ['http://www.ustchacker.com'] * 10
urllibL = []
requestsL = []
multiPool = []
threadPool = []
N = 20
PoolNum = 100
for i in range(N):
    print('start %d try' % i)

    urllibT = startTimer()
    jobs = [download_urllib(url) for url in urls]
    #for status, data in jobs:
    #    print(status, data[:10])
    #tic(urllibT, 'urllib.request')
    urllibL.append(ticT(urllibT))
    print('1')

    requestsT = startTimer()
    jobs = [download_requests(url) for url in urls]
    #for status, data in jobs:
    #    print(status, data[:10])
    #tic(requestsT, 'requests')
    requestsL.append(ticT(requestsT))
    print('2')

    requestsT = startTimer()
    pool = multiprocessing.Pool(PoolNum)
    data = pool.map(download_requests, urls)
    pool.close()
    pool.join()
    multiPool.append(ticT(requestsT))
    print('3')

    requestsT = startTimer()
    pool = threadPoolManager(urls, threadNum=PoolNum)
    pool.waitAllComplete()
    threadPool.append(ticT(requestsT))
    print('4')
import matplotlib.pyplot as plt
x = list(range(1, N+1))
plt.plot(x, urllibL, label='urllib')
plt.plot(x, requestsL, label='requests')
plt.plot(x, multiPool, label='requests MultiPool')
plt.plot(x, threadPool, label='requests threadPool')
plt.xlabel('test number')
plt.ylabel('time(s)')
plt.legend()
plt.show()

The results are as follows:
As the resulting plot shows, Python 3's built-in urllib.request is still less efficient than the open-source requests library. The multiprocessing process pool improves performance noticeably, but it remains slower than the hand-rolled thread pool; part of the reason is that creating and scheduling processes costs more than creating threads (the test deliberately includes that creation cost inside each timed run).
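To see how much of that gap is pure start-up cost, the following minimal sketch (my own illustration, not part of the original benchmark; it repeats the same download_requests logic so it runs on its own) creates the Pool before the timer starts, so only pool.map() is measured:

import multiprocessing
import time
import requests

def download_requests(url):
    req = requests.get(url, headers={'user-agent': 'Mozilla/5.0'})
    return req.status_code, req.text

if __name__ == '__main__':
    urls = ['http://www.ustchacker.com'] * 10
    pool = multiprocessing.Pool(10)    # pool created outside the timed region
    t0 = time.time()
    pool.map(download_requests, urls)  # only the downloads are timed
    print('map only: %.3f s' % (time.time() - t0))
    pool.close()
    pool.join()

Comparing this number with the timings in the plot gives a rough idea of how much of the multiprocessing result is process start-up rather than download time.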
The gevent test code follows:
import gevent.monkey
gevent.monkey.patch_all()  # patch the standard library before importing modules that use sockets

import urllib.request
import requests
import time
import gevent.pool
def startTimer():
    return time.time()

def ticT(startTime):
    useTime = time.time() - startTime
    return round(useTime, 3)

#def tic(startTime, name):
#    useTime = time.time() - startTime
#    print('[%s] use time: %1.3f' % (name, useTime))

def download_urllib(url):
    req = urllib.request.Request(url,
                                 headers={'user-agent': 'Mozilla/5.0'})
    res = urllib.request.urlopen(req)
    data = res.read()
    try:
        data = data.decode('gbk')
    except UnicodeDecodeError:
        data = data.decode('utf8', 'ignore')
    return res.status, data

def download_requests(url):
    req = requests.get(url,
                       headers={'user-agent': 'Mozilla/5.0'})
    return req.status_code, req.text
urls = ['http://www.ustchacker.com'] * 10
urllibL = []
requestsL = []
reqPool = []
reqSpawn = []
N = 20
PoolNum = 100
for i in range(N):
    print('start %d try' % i)

    urllibT = startTimer()
    jobs = [download_urllib(url) for url in urls]
    #for status, data in jobs:
    #    print(status, data[:10])
    #tic(urllibT, 'urllib.request')
    urllibL.append(ticT(urllibT))
    print('1')

    requestsT = startTimer()
    jobs = [download_requests(url) for url in urls]
    #for status, data in jobs:
    #    print(status, data[:10])
    #tic(requestsT, 'requests')
    requestsL.append(ticT(requestsT))
    print('2')

    requestsT = startTimer()
    pool = gevent.pool.Pool(PoolNum)
    data = pool.map(download_requests, urls)
    #for status, text in data:
    #    print(status, text[:10])
    #tic(requestsT, 'requests with gevent.pool')
    reqPool.append(ticT(requestsT))
    print('3')

    requestsT = startTimer()
    jobs = [gevent.spawn(download_requests, url) for url in urls]
    gevent.joinall(jobs)
    #for i in jobs:
    #    print(i.value[0], i.value[1][:10])
    #tic(requestsT, 'requests with gevent.spawn')
    reqSpawn.append(ticT(requestsT))
    print('4')
import matplotlib.pyplot as plt
x = list(range(1, N+1))
plt.plot(x, urllibL, label='urllib')
plt.plot(x, requestsL, label='requests')
plt.plot(x, reqPool, label='requests geventPool')
plt.plot(x, reqSpawn, label='requests Spawn')
plt.xlabel('test number')
plt.ylabel('time(s)')
plt.legend()
plt.show()

The results are as follows:
As the resulting plot shows, gevent gives a large performance boost on I/O-bound tasks. Because coroutines (greenlets) are much cheaper to create and schedule than threads, there is little difference between gevent's spawn mode and its pool mode.
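To make the creation-cost claim concrete, a micro-benchmark along these lines (my own sketch, not from the article; absolute numbers depend on the machine) starts 1000 OS threads and 1000 greenlets that each run a no-op:

import time
import threading
import gevent

def noop():
    pass

t0 = time.time()
threads = [threading.Thread(target=noop) for _ in range(1000)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print('1000 threads  : %.3f s' % (time.time() - t0))

t0 = time.time()
greenlets = [gevent.spawn(noop) for _ in range(1000)]
gevent.joinall(greenlets)
print('1000 greenlets: %.3f s' % (time.time() - t0))

Greenlets are plain Python objects scheduled inside a single OS thread, which is why spawning them in bulk stays cheap.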
gevent depends on its monkey patch to reach this performance, but the patch interferes with how multiprocessing runs. If both are used in one program, the patch has to be restricted like this:
gevent.monkey.patch_all(thread=False, socket=False, select=False)
Restricted that way, however, gevent cannot play to its strengths, so the multiprocessing Pool, threading Pool, and gevent Pool cannot be compared fairly inside a single program. Still, putting the two plots side by side supports a conclusion: the thread pool and gevent perform best, followed by the process pool. A side conclusion is that the requests library also performs somewhat better than urllib.request. :-)
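One possible workaround, sketched below under the assumption that each benchmark lives in its own script (the names bench_pools.py and bench_gevent.py are hypothetical placeholders), is to run the tests in separate interpreters so the monkey patch never touches the multiprocessing test:

import subprocess
import sys

for script in ('bench_pools.py', 'bench_gevent.py'):
    # each child gets a clean interpreter, so patch_all() inside
    # bench_gevent.py cannot affect multiprocessing in bench_pools.py
    subprocess.run([sys.executable, script], check=True)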