[关闭]
@Channelchan 2017-11-21T21:01:19.000000Z 字数 4453 阅读 85420

Panel

pandas的panel是由items, major_xs, minor_xs构成的三维数据,方便管理三维数据。

字典转Panel

  1. from datetime import datetime
  2. import pandas as pd
  3. symbol=['600036.XSHG','600050.XSHG','601318.XSHG']
  4. data_dict = {}
  5. for s in symbol:
  6. data = pd.read_excel('sz50.xlsx',sheetname=s, index_col='datetime')
  7. data_dict[s] = data.loc['2017-03-21':'2017-05-10']
  8. PN = pd.Panel(data_dict)
  9. print(PN)
<class 'pandas.core.panel.Panel'>
Dimensions: 3 (items) x 34 (major_axis) x 5 (minor_axis)
Items axis: 600036.XSHG to 601318.XSHG
Major_axis axis: 2017-03-21 15:00:00 to 2017-05-10 15:00:00
Minor_axis axis: close to volume
  1. MI = PN.to_frame()
  2. print(round(MI.head(),1))
                            600036.XSHG  600050.XSHG  601318.XSHG
datetime            minor                                        
2017-03-21 15:00:00 close          72.4          9.3         81.3
                    high           72.7          9.4         81.3
                    low            72.0          9.2         80.3
                    open           72.4          9.3         80.7
                    volume   40184305.0  147347370.0   96990107.0

修改顺序

  1. MI_tp = PN.transpose(2,1,0).to_frame()
  2. print(MI_tp.head())
                                 close   high    low   open       volume
datetime            minor                                               
2017-03-21 15:00:00 600036.XSHG  72.36  72.67  71.97  72.36   40184305.0
                    600050.XSHG   9.27   9.36   9.21   9.32  147347370.0
                    601318.XSHG  81.28  81.28  80.34  80.70   96990107.0
2017-03-22 15:00:00 600036.XSHG  71.51  72.36  71.31  72.32   46855076.0
                    600050.XSHG   9.26   9.36   8.97   9.21  231158696.0

修改名称

  1. PN_rename = PN.rename(items={'600036.XSHG':'ZSYH','600050.XSHG':'ZGLT','601318.XSHG':'ZGPA'})
  1. print(PN_rename)
<class 'pandas.core.panel.Panel'>
Dimensions: 3 (items) x 34 (major_axis) x 5 (minor_axis)
Items axis: ZSYH to ZGPA
Major_axis axis: 2017-03-21 15:00:00 to 2017-05-10 15:00:00
Minor_axis axis: close to volume

resample抽样取周线

  1. PN_RE = PN.transpose(2,1,0).resample('W-MON',axis=1).last()
  2. print(PN_RE)
<class 'pandas.core.panel.Panel'>
Dimensions: 5 (items) x 8 (major_axis) x 3 (minor_axis)
Items axis: close to volume
Major_axis axis: 2017-03-27 00:00:00 to 2017-05-15 00:00:00
Minor_axis axis: 600036.XSHG to 601318.XSHG
  1. print(PN_RE.to_frame().head())
                        close   high    low   open       volume
datetime   minor                                               
2017-03-27 600036.XSHG  73.67  74.56  72.51  72.51   80019192.0
           600050.XSHG   9.11   9.32   9.08   9.25  175576529.0
           601318.XSHG  82.31  83.50  81.75  81.77  122855895.0
2017-04-03 600036.XSHG  74.09  74.29  73.40  73.40   33962146.0
           600050.XSHG   9.03   9.10   8.85   8.85  125608109.0

访问数据

访问items的数据

  1. print(PN_RE['close'].head())
            600036.XSHG  600050.XSHG  601318.XSHG
datetime                                         
2017-03-27        73.67         9.11        82.31
2017-04-03        74.09         9.03        82.85
2017-04-10        73.01          NaN        81.61
2017-04-17        72.90          NaN        80.67
2017-04-24        73.05          NaN        81.01

访问major的数据

  1. print(PN_RE.major_xs('2017-04-10'))
             close   high    low   open      volume
600036.XSHG  73.01  73.21  72.16  73.05  32313532.0
600050.XSHG    NaN    NaN    NaN    NaN         NaN
601318.XSHG  81.61  82.22  81.48  81.88  48440496.0

访问minor的数据

  1. print(PN_RE.minor_xs('600036.XSHG').head())
            close   high    low   open      volume
datetime                                          
2017-03-27  73.67  74.56  72.51  72.51  80019192.0
2017-04-03  74.09  74.29  73.40  73.40  33962146.0
2017-04-10  73.01  73.21  72.16  73.05  32313532.0
2017-04-17  72.90  72.94  71.78  72.43  40747304.0
2017-04-24  73.05  73.36  72.55  72.90  37143301.0

访问loc使用名称索引

  1. print(PN_RE.loc[:,'2017-03-21':'2017-04-10',:].to_frame())
                        close   high    low   open       volume
datetime   minor                                               
2017-03-27 600036.XSHG  73.67  74.56  72.51  72.51   80019192.0
           600050.XSHG   9.11   9.32   9.08   9.25  175576529.0
           601318.XSHG  82.31  83.50  81.75  81.77  122855895.0
2017-04-03 600036.XSHG  74.09  74.29  73.40  73.40   33962146.0
           600050.XSHG   9.03   9.10   8.85   8.85  125608109.0
           601318.XSHG  82.85  83.18  81.93  82.15   63797819.0
2017-04-10 600036.XSHG  73.01  73.21  72.16  73.05   32313532.0
           601318.XSHG  81.61  82.22  81.48  81.88   48440496.0

访问iloc使用位置索引

  1. print(PN_RE.iloc[2:,2:,2:])
<class 'pandas.core.panel.Panel'>
Dimensions: 3 (items) x 6 (major_axis) x 1 (minor_axis)
Items axis: low to volume
Major_axis axis: 2017-04-10 00:00:00 to 2017-05-15 00:00:00
Minor_axis axis: 601318.XSHG to 601318.XSHG

访问ix使用索引或位置

  1. print(PN_RE.ix[0:3,-1,'601318.XSHG'])
close    89.36
high     89.63
low      85.18
Name: 601318.XSHG, dtype: float64

处理缺失值

  1. print(PN.isnull().values.any())
True
  1. if PN.isnull().values.any():
  2. PN.fillna(method='ffill',inplace=True)
  1. print(PN.isnull().values.any())
False

多维数据计算与合并

计算每只股票的5日均线,并且合并成DataFrame

  1. import talib as ta
  2. df_ma = pd.DataFrame({name: ta.abstract.MA(value, 5) for name, value in PN.iteritems()})
  3. print(df_ma.tail())
                     600036.XSHG  600050.XSHG  601318.XSHG
datetime                                                  
2017-05-04 15:00:00       73.158         9.03       84.886
2017-05-05 15:00:00       73.012         9.03       84.626
2017-05-08 15:00:00       72.758         9.03       84.544
2017-05-09 15:00:00       72.580         9.03       84.580
2017-05-10 15:00:00       72.650         9.03       85.430

计算每只股票的macd, 然后合并成MultiIndex

  1. pn_macd = pd.Panel({name: ta.abstract.MACD(value) for name, value in PN.iteritems()})
  2. df_macd = pn_macd.transpose(2,1,0).to_frame().head()
  3. print(df_macd)
                                     macd  macdsignal  macdhist
datetime            minor                                      
2017-05-10 15:00:00 600036.XSHG -0.067689   -0.105095  0.037406
                    600050.XSHG -0.019533   -0.027102  0.007569
                    601318.XSHG  1.090898    0.361969  0.728928

用stack()将DataFrame转换multiIndex,再将两个multiIndex合并

  1. df_macd['ma'] = df_ma.stack()
  2. print(df_macd)
                                     macd  macdsignal  macdhist     ma
datetime            minor                                             
2017-05-10 15:00:00 600036.XSHG -0.067689   -0.105095  0.037406  72.65
                    600050.XSHG -0.019533   -0.027102  0.007569   9.03
                    601318.XSHG  1.090898    0.361969  0.728928  85.43
添加新批注
在作者公开此批注前,只有你和作者可见。
回复批注