我是一个从未使用过任何并行处理方法的新手。我希望从SQL Server读取大量数据(即至少200万行),并希望使用并行处理来加快读取速度。下面是我使用并发未来进程池进行并行处理的尝试。在class DatabaseWorker(object):
def __init__(self, connection_string, n, result_queue = []):
self.connection_string = connection_string
stmt = "select distinct top %s * from dbo.KrishAnalyticsAllCalls" %(n)
self.query = stmt
self.result_queue = result_queue
def reading(self,x):
return(x)
def pooling(self):
t1 = time.time()
con = pyodbc.connect(self.connection_string)
curs = con.cursor()
curs.execute(self.query)
with concurrent.futures.ProcessPoolExecutor(max_workers=8) as executor:
print("Test1")
future_to_read = {executor.submit(self.reading, row): row for row in curs.fetchall()}
print("Test2")
for future in concurrent.futures.as_completed(future_to_read):
print("Test3")
read = future_to_read[future]
try:
print("Test4")
self.result_queue.append(future.result())
except:
print("Not working")
print("\nTime take to grab this data is %s" %(time.time() - t1))
df = DatabaseWorker(r'driver={SQL Server}; server=SPROD_RPT01; database=Reporting;', 2*10**7)
df.pooling()
我当前的实现没有得到任何输出。"Test1"打印,就这样。没有其他事情发生。我理解并发未来文档提供的各种示例,但我无法在这里实现它。我将非常感谢你的帮助。谢谢您。在