# [python] What is the most efficient way to loop through dataframes with pandas?

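``````
# close: 1-D NumPy array, oldest first (a pandas Series would align on index labels instead)
pct_change = close[1:] / close[:-1] - 1
``````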

``````
pct_change = []
for row in close:
    pct_change.append(...)
``````
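To make the contrast between the two snippets above concrete, here is a minimal sketch that computes the day-over-day fractional change both ways. It uses the four closing prices from the sample data, reordered oldest first (an assumption, since the file lists the newest row first). The names `loop_change`, `vec_change`, and `pd_change` are illustrative only.

```python
import numpy as np
import pandas as pd

# Closing prices from the sample rows, reordered oldest first (assumption).
close = np.array([27.27, 26.98, 27.31, 27.13])

# Loop version: one Python-level division per element.
loop_change = []
for prev, curr in zip(close[:-1], close[1:]):
    loop_change.append(curr / prev - 1)

# Vectorized version: a single NumPy expression over the whole array.
vec_change = close[1:] / close[:-1] - 1

# pandas built-in; the first entry is NaN because it has no predecessor.
pd_change = pd.Series(close).pct_change()
```

All three produce the same numbers; the vectorized forms simply move the loop out of Python.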

Question

``````
Date,Open,High,Low,Close,Volume,Adj Close
2011-10-19,27.37,27.47,27.01,27.13,42880000,27.13
2011-10-18,26.94,27.40,26.80,27.31,52487900,27.31
2011-10-17,27.11,27.42,26.85,26.98,39433400,26.98
2011-10-14,27.31,27.50,27.02,27.27,50947700,27.27

....
``````

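``````
#!/usr/bin/env python
import pandas as pd

# df is assumed to be already loaded from the CSV above, with Date as its index
for i, row in enumerate(df.values):
    date = df.index[i]
    # six columns per row; open_ avoids shadowing the built-in open()
    open_, high, low, close, volume, adjclose = row
    # now perform analysis on open/close based on date, etc.
``````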
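For reference, the loop in the question could be written with `itertuples`, which hands back the date and the six columns in one tuple. The sketch below inlines the four sample rows so it runs on its own; the "close above open" check is a hypothetical stand-in for whatever analysis is actually needed, and `up_days_loop` / `up_days_vec` are made-up names.

```python
import io

import pandas as pd

# The four sample rows from the question, inlined so the sketch is self-contained.
csv_text = """Date,Open,High,Low,Close,Volume,Adj Close
2011-10-19,27.37,27.47,27.01,27.13,42880000,27.13
2011-10-18,26.94,27.40,26.80,27.31,52487900,27.31
2011-10-17,27.11,27.42,26.85,26.98,39433400,26.98
2011-10-14,27.31,27.50,27.02,27.27,50947700,27.27
"""

df = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)

# Loop version: each tuple carries the index (the date) first, then the columns.
up_days_loop = []
for date, open_, high, low, close, volume, adjclose in df.itertuples():
    if close > open_:
        up_days_loop.append(date)

# Vectorized version of the same (hypothetical) analysis: a boolean mask.
up_days_vec = df.index[df["Close"] > df["Open"]]
```

If the per-row logic can be phrased as a column expression like the last line, the explicit loop is usually unnecessary.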

``````
import time

import pandas as pd

t = pd.DataFrame({'a': range(0, 10000), 'b': range(10000, 20000)})
B = []  # elapsed seconds for each approach, in order

# 1. iterrows
C = []
A = time.time()
for i, r in t.iterrows():
    C.append((r['a'], r['b']))
B.append(time.time() - A)

# 2. itertuples
C = []
A = time.time()
for ir in t.itertuples():
    C.append((ir[1], ir[2]))
B.append(time.time() - A)

# 3. zip over the columns
C = []
A = time.time()
for r in zip(t['a'], t['b']):
    C.append((r[0], r[1]))
B.append(time.time() - A)

print(B)
``````

``````
[0.5639059543609619, 0.017839908599853516, 0.005645036697387695]
``````

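• `.iterrows()`: returns the index and the row as separate variables, but is significantly slower
• `.itertuples()`: faster than `.iterrows()`, but returns the index together with the row items; `ir[0]` is the index
• `zip`: fastest, but gives no access to the row's index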
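To see what each of the three approaches actually yields, here is a small sketch on a two-row frame with string index labels (the frame and the `rows_*` names are made up for illustration). It also shows one possible workaround for the `zip` limitation: zipping the index in alongside the columns.

```python
import pandas as pd

t = pd.DataFrame({"a": [1, 2], "b": [10, 20]}, index=["x", "y"])

# iterrows: (index, Series) pairs; building a Series per row is the slow part.
rows_iterrows = [(idx, row["a"], row["b"]) for idx, row in t.iterrows()]

# itertuples: namedtuples with the index in position 0, then the columns.
rows_itertuples = [(ir[0], ir.a, ir.b) for ir in t.itertuples()]

# zip over columns: plain tuples of values only, with no index.
rows_zip = list(zip(t["a"], t["b"]))

# Zipping t.index in as well keeps the label available.
rows_zip_with_index = list(zip(t.index, t["a"], t["b"]))
```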

``````
# the lambda is called once per value in column a; replace its body with the real computation
df[b] = df[a].apply(lambda value: value)
``````
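As a concrete instance of the `apply` pattern above, the sketch below derives column `b` from column `a` with a made-up formula (`value * 2 + 1` is purely illustrative). For arithmetic this simple, a column-wide expression gives the same result without a Python call per value.

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, 2.0, 3.0]})

# apply calls the lambda once per value of column "a".
df["b"] = df["a"].apply(lambda value: value * 2 + 1)

# Same result from a single column-wide expression,
# which avoids the per-value Python call.
df["b_vectorized"] = df["a"] * 2 + 1
```

`apply` remains useful when the per-value logic cannot easily be written as array arithmetic.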

``````
index = df.index.values
column_of_interest1 = df.column_name1.values
...
column_of_interestk = df.column_namek.values

for i in range(df.shape[0]):
    index_value = index[i]
    ...
    column_value_k = column_of_interestk[i]
``````
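A runnable version of the pattern above might look like the following sketch. The frame, the column names `price` and `qty`, and the per-row product are all hypothetical; `.to_numpy()` is used here in place of `.values`, and both return a NumPy array.

```python
import pandas as pd

# Hypothetical frame; the column names are placeholders for this sketch.
df = pd.DataFrame(
    {"price": [10.0, 11.0, 12.0], "qty": [3, 2, 1]},
    index=["r0", "r1", "r2"],
)

# Pull the index and the columns out once as NumPy arrays.
index = df.index.to_numpy()
price = df["price"].to_numpy()
qty = df["qty"].to_numpy()

# Index into the arrays by position; no per-row pandas object is built.
totals = {}
for i in range(df.shape[0]):
    totals[index[i]] = price[i] * qty[i]
```

The idea is that the pandas overhead is paid once when the arrays are extracted, not on every iteration.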