@Channelchan 2018-10-10T23:57:47.000000Z

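Factor Analysis

Contents

  1. What is SignalDigger?
  2. SignalDigger vs alphalens
  3. Data preparation
  4. How to test and analyze stock-selection performance with SignalDigger
  5. Visualizing stock-selection performance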

What is SignalDigger?

Installation: pip install git+https://github.com/xingetouzi/JAQS.git@fxdayu

GitHub repository: https://github.com/xingetouzi/JAQS/tree/fxdayu

Official website: https://www.quantos.org/ (register on the site to obtain your own data account)

SignalDigger vs alphalens

[Image: image.png, SignalDigger vs alphalens comparison]

Data preparation

The example below uses the CSI 300 constituents to prepare the stock-selection factor (signal_data).

    from jaqs_fxdayu.data import DataView  # a lightweight, pandas-based data store that makes data easy to query and process
    from jaqs_fxdayu.data import RemoteDataService  # data service used to download data
    import os
    import warnings

    warnings.filterwarnings("ignore")
    dataview_folder = '../Factor'

    if not os.path.isdir(dataview_folder):
        os.makedirs(dataview_folder)

    # Download the data
    def save_dataview():
        data_config = {
            "remote.data.address": "tcp://data.quantOS.org:8910",
            "remote.data.username": "YOUR_USERNAME",  # the account registered on quantos.org
            "remote.data.password": "YOUR_TOKEN"  # the API token issued for that account
        }
        ds = RemoteDataService()
        ds.init_from_config(data_config)
        dv = DataView()
        props = {'start_date': 20140101, 'end_date': 20180101, 'universe': '000300.SH',
                 'fields': "volume,pb,pe,ps,roe,float_mv,sw1",
                 'freq': 1,
                 'timeout': 180}
        dv.init_from_config(props, ds)
        dv.prepare_data()
        dv.save_dataview(dataview_folder)  # save the data to the given path so it can be loaded directly next time

    save_dataview()
Begin: DataApi login YOUR_USERNAME@tcp://data.quantOS.org:8910
    Already login as YOUR_USERNAME, skip init_from_config
Initialize config success.
Query data...
Query data - query...
NOTE: price adjust method is [post adjust]
当前请求daily...
{'adjust_mode': None, 'fields': 'symbol,trade_date,open,volume,high_adj,vwap,close_adj,high,low,vwap_adj,close,trade_status,low_adj,open_adj'}
当前请求daily...
{'adjust_mode': 'post', 'fields': 'open,vwap,low,high,close,symbol,trade_date'}
当前请求query_lb_dailyindicator...
{'fields': 'ps,float_mv,symbol,pe,pb,trade_date'}
WARNING: some data is unavailable: 
    At fields 
Query data - daily fields prepared.
Query data - quarterly fields prepared.
Query instrument info...
Query adj_factor...
Query benchmark...
Query benchmar member info...
Query groups (industry)...
Field [trade_status] is overwritten.
Data has been successfully prepared.

Store data...
Dataview has been successfully saved to:
E:\2018_Course\HighSchool\Final\Factor

You can load it with load_dataview('E:\2018_Course\HighSchool\Final\Factor')
    # Load the saved data
    dv = DataView()
    dv.load_dataview(dataview_folder)
Dataview loaded successfully.
    print(dv.get_ts("pb").head())
symbol      000001.SZ  000002.SZ  000008.SZ  000009.SZ  000012.SZ  000024.SZ  \
trade_date                                                                     
20140102       1.0563     1.2891     4.8981     3.5794     2.3725     1.3202   
20140103       1.0304     1.2649     4.8709     3.4842     2.3346     1.2977   
20140106       1.0079     1.2068     4.6314     3.4537     2.2036     1.2283   
20140107       1.0044     1.1987     4.5661     3.4461     2.1920     1.2013   
20140108       1.0157     1.1971     4.4790     3.3852     2.1862     1.1685   

symbol      000027.SZ  000039.SZ  000046.SZ  000059.SZ    ...      601998.SH  \
trade_date                                                ...                  
20140102       0.9077     2.0483     2.4159     0.8806    ...         0.8216   
20140103       0.8861     2.0801     2.3726     0.8488    ...         0.8088   
20140106       0.8662     2.0113     2.3348     0.8081    ...         0.7960   
20140107       0.8629     2.0721     2.2970     0.7940    ...         0.7939   
20140108       0.8728     2.0629     2.3294     0.7904    ...         0.7960   

symbol      603000.SH  603160.SH  603288.SH  603699.SH  603799.SH  603833.SH  \
trade_date                                                                     
20140102      10.0487        NaN        NaN        NaN        NaN        NaN   
20140103       9.8886        NaN        NaN        NaN        NaN        NaN   
20140106       9.8515        NaN        NaN        NaN        NaN        NaN   
20140107      10.1024        NaN        NaN        NaN        NaN        NaN   
20140108      10.3713        NaN        NaN        NaN        NaN        NaN   

symbol      603858.SH  603885.SH  603993.SH  
trade_date                                   
20140102          NaN        NaN     2.7133  
20140103          NaN        NaN     2.6706  
20140106          NaN        NaN     2.5682  
20140107          NaN        NaN     2.5682  
20140108          NaN        NaN     2.5298  


[5 rows x 488 columns]
    import numpy as np

    # Signal filter: mask out stocks that are not index constituents
    def mask_index_member():
        df_index_member = dv.get_ts('index_member')
        mask_index_member = df_index_member == 0
        return mask_index_member

    # Tradability conditions: not suspended and not at the daily price limit
    def limit_up_down():
        trade_status = dv.get_ts('trade_status')
        mask_sus = trade_status == 0
        # Limit up
        dv.add_formula('up_limit', '(close - Delay(close, 1)) / Delay(close, 1) > 0.095', is_quarterly=False, add_data=True)
        # Limit down
        dv.add_formula('down_limit', '(close - Delay(close, 1)) / Delay(close, 1) < -0.095', is_quarterly=False, add_data=True)
        can_enter = np.logical_and(dv.get_ts('up_limit') < 1, ~mask_sus)  # not limit-up and not suspended
        can_exit = np.logical_and(dv.get_ts('down_limit') < 1, ~mask_sus)  # not limit-down and not suspended
        return can_enter, can_exit

    mask = mask_index_member()
    can_enter, can_exit = limit_up_down()
    print(mask.head())
    print(can_enter.head())
symbol    000001.SZ  000002.SZ  000008.SZ  000009.SZ   ...
20140102      False      False       True      False      ...
20140103      False      False       True      False      ...
20140106      False      False       True      False      ...
20140107      False      False       True      False      ... 
20140108      False      False       True      False      ... 
[5 rows x 488 columns]

symbol    000001.SZ  000002.SZ  000008.SZ  000009.SZ   ...
20140102       True       True       True       True      ...
20140103       True       True       True       True      ...
20140106       True       True       True       True      ...
20140107       True       True       True       True      ...
20140108       True       True       True       True      ...
[5 rows x 488 columns]
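To make the two formulas passed to dv.add_formula concrete, here is a minimal standard-library sketch of the same rule: a day counts as limit-up when the close-to-close return exceeds 9.5%, and as limit-down when it falls below -9.5%. The function name limit_flags, the sample prices, and the suspended list are made up for illustration and are not part of jaqs_fxdayu.

```python
def limit_flags(closes, threshold=0.095):
    """Return (up_limit, down_limit) flag lists for a list of closing prices.

    Mirrors (close - Delay(close, 1)) / Delay(close, 1) compared against
    +/- threshold. The first day has no previous close, so this sketch
    never flags it (an assumption, not necessarily what DataView does).
    """
    up, down = [False], [False]
    for prev, cur in zip(closes, closes[1:]):
        ret = (cur - prev) / prev
        up.append(ret > threshold)
        down.append(ret < -threshold)
    return up, down


closes = [10.0, 11.0, 11.1, 9.9, 10.0]            # made-up prices
suspended = [False, False, False, False, True]    # made-up suspension flags

up, down = limit_flags(closes)
# Entry requires "not limit-up and not suspended";
# exit requires "not limit-down and not suspended".
can_enter_demo = [not u and not s for u, s in zip(up, suspended)]
can_exit_demo = [not d and not s for d, s in zip(down, suspended)]

print(can_enter_demo)  # [True, False, True, True, False]
print(can_exit_demo)   # [True, True, True, False, False]
```

The second day (+10%) blocks entry, the fourth day (about -10.8%) blocks exit, and the suspended last day blocks both.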
    from jaqs_fxdayu.research import SignalDigger

    obj = SignalDigger(output_folder='./output',
                       output_format='pdf')

    # Process the factor: compute the holding-period return of every stock in the
    # target universe and the quantile bucket of its factor value
    obj.process_signal_before_analysis(signal=dv.get_ts("pb"),
                                       price=dv.get_ts("close_adj"),
                                       high=dv.get_ts("high_adj"),  # optional
                                       low=dv.get_ts("low_adj"),  # optional
                                       group=dv.get_ts("sw1"),  # optional
                                       n_quantiles=5,  # number of quantile buckets
                                       mask=mask,  # filter condition
                                       can_enter=can_enter,  # whether a position can be entered
                                       can_exit=can_exit,  # whether a position can be exited
                                       period=15,  # holding period
                                       benchmark_price=dv.data_benchmark,  # benchmark price; optional, if omitted the holding-period return is an absolute return
                                       commission=0.0008,
                                       )
    signal_data = obj.signal_data
    signal_data.head()
Nan Data Count (should be zero) : 0;  Percentage of effective data: 57%
                        signal    return  upside_ret  downside_ret     group  quantile
trade_date symbol
20140103   000001.SZ    1.0563 -0.003744    0.005068     -0.057799      银行         1
           000002.SZ    1.2891  0.012511    0.010680     -0.102841    房地产         2
           000009.SZ    3.5794  0.029817    0.025430     -0.069652      综合         4
           000012.SZ    2.3725  0.021382    0.014163     -0.116760  建筑材料         4
           000024.SZ    1.3202 -0.031632   -0.002781     -0.161771    房地产         2
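As a rough illustration of what the return, upside_ret and downside_ret columns represent, the sketch below computes a holding-period return and the best and worst price excursion inside the holding window. It rests on assumptions: the helper holding_period_stats and the prices are invented, and the entry-price, benchmark and commission conventions of jaqs_fxdayu are not reproduced, so the numbers will not match the library output. (In the output above, the signal for 000001.SZ on 20140103, 1.0563, equals the pb printed for 20140102, which suggests the library also lags the signal by one day; the sketch ignores that.)

```python
def holding_period_stats(close, high, low, period):
    """For each entry day t, compute the return over the next `period` days
    and the maximum favourable / adverse excursion within that window.
    Days without a full forward window are skipped."""
    rows = []
    for t in range(len(close) - period):
        entry = close[t]
        window_high = max(high[t + 1 : t + period + 1])
        window_low = min(low[t + 1 : t + period + 1])
        rows.append({
            "return": close[t + period] / entry - 1,
            "upside_ret": window_high / entry - 1,
            "downside_ret": window_low / entry - 1,
        })
    return rows


# Made-up daily prices for one stock
close = [10.0, 10.5, 10.2, 11.0, 10.8]
high = [10.2, 10.8, 10.6, 11.2, 11.0]
low = [9.8, 10.1, 9.9, 10.4, 10.5]

stats = holding_period_stats(close, high, low, period=2)
for row in stats:
    print({k: round(v, 4) for k, v in row.items()})
```

With period=2, the first row is a 2% return with an 8% upside and a -1% downside over the two days that follow the entry day.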

Factor analysis

    from jaqs_fxdayu.research.signaldigger.analysis import analysis
    result = analysis(signal_data, is_event=False, period=15)
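The IC (information coefficient) part of the result is conventionally a rank correlation, computed per trading day, between the cross-section of signal values and the returns that follow. The sketch below spells out that definition with the standard library only. It assumes the Spearman form, and the helper names and the five data points are made up, so treat it as an illustration of the concept and not as the library's code.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_ic(signal, forward_return):
    """Spearman IC: Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(signal), average_ranks(forward_return))


# One made-up cross-section: five stocks on a single day
signal = [1.06, 1.29, 3.58, 2.37, 1.32]            # e.g. P/B values
forward_return = [0.03, 0.02, -0.01, 0.00, 0.01]   # subsequent returns

ic = spearman_ic(signal, forward_return)
print(ic)  # -1.0
```

Here the cheapest stock has the best return and the most expensive the worst, so the ranks are exactly reversed and the IC is -1. A negative IC for P/B means low-P/B stocks tended to outperform.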

Documentation for the factor analysis metrics

Columns (IC type / portfolio type):

Index rows (specific IC or return metrics):

    print("--- IC analysis ---")
    print(result["ic"])
    print("--- Stock-selection return analysis ---")
    print(result["ret"])
    print("--- Maximum potential gain/loss analysis ---")
    print(result["space"])
--- IC analysis ---
            return_ic  upside_ret_ic  downside_ret_ic
IC Mean     -6.945807e-02   5.937467e-02    -2.219792e-01
IC Std.      2.594710e-01   2.464900e-01     2.104796e-01
t-stat(IC)  -8.298423e+00   7.467299e+00    -3.269370e+01
p-value(IC)  3.549725e-16   1.835765e-13    3.604265e-158
IC Skew      5.251623e-02  -4.326744e-01     5.580971e-01
IC Kurtosis -7.602440e-01  -4.722475e-01     1.068205e-01
Ann. IR     -2.676911e-01   2.408806e-01    -1.054635e+00
--- Stock-selection return analysis ---
             long_ret  long_short_ret  top_quantile_ret  bottom_quantile_ret  \
t-stat      -4.354131       -6.429742        -18.372327            19.273406   
p-value      0.000010        0.000000          0.000000             0.000000   
skewness    -0.056936        0.043832          0.738623             2.296000   
kurtosis     2.476179        1.086328          4.264752            14.086266   
Ann. Ret    -0.074512       -0.093044         -0.122735             0.095147   
Ann. Vol     0.132007        0.111627          0.384917             0.286435   
Ann. IR     -0.564453       -0.833528         -0.318861             0.332176   
occurance  961.000000      961.000000      53562.000000         54314.000000   

              tmb_ret  all_sample_ret  
t-stat      -6.597274       -4.942966  
p-value      0.000000        0.000000  
skewness    -0.240436        1.263993  
kurtosis     1.677179        7.805175  
Ann. Ret    -0.216154       -0.012879  
Ann. Vol     0.252739        0.336874  
Ann. IR     -0.855246       -0.038232  
occurance  961.000000   269680.000000  
--- Maximum potential gain/loss analysis ---
               long_space  top_quantile_space  bottom_quantile_space  \
Up_sp Mean       0.086507            0.085930               0.078885   
Up_sp Std        0.052492            0.090577               0.091406   
Up_sp IR         1.647994            0.948698               0.863017   
Up_sp Pct5       0.031881            0.002194               0.001494   
Up_sp Pct25      0.053368            0.025353               0.020845   
Up_sp Pct50      0.070421            0.060478               0.050651   
Up_sp Pct75      0.102281            0.115654               0.104006   
Up_sp Pct95      0.198886            0.261074               0.251017   
Up_sp Occur    961.000000        53562.000000           54314.000000   
Down_sp Mean    -0.119896           -0.121331              -0.069269   
Down_sp Std      0.107698            0.214995               0.139876   
Down_sp IR      -1.113264           -0.564343              -0.495217   
Down_sp Pct5    -0.292471           -0.648977              -0.203668   
Down_sp Pct25   -0.128481           -0.109396              -0.068355   
Down_sp Pct50   -0.092179           -0.057195              -0.034265   
Down_sp Pct75   -0.067415           -0.026451              -0.015115   
Down_sp Pct95   -0.047044           -0.005004              -0.003160   
Down_sp Occur  961.000000        53562.000000           54314.000000   

                tmb_space  all_sample_space  
Up_sp Mean       0.156780          0.084070  
Up_sp Std        0.076756          0.092138  
Up_sp IR         2.042564          0.912434  
Up_sp Pct5       0.073360          0.001867  
Up_sp Pct25      0.102274          0.023622  
Up_sp Pct50      0.135111          0.057282  
Up_sp Pct75      0.189303          0.113234  
Up_sp Pct95      0.334003          0.257430  
Up_sp Occur    961.000000     269680.000000  
Down_sp Mean    -0.201954         -0.099619  
Down_sp Std      0.118146          0.190084  
Down_sp IR      -1.709366         -0.524077  
Down_sp Pct5    -0.401497         -0.346045  
Down_sp Pct25   -0.238337         -0.091120  
Down_sp Pct50   -0.174540         -0.045931  
Down_sp Pct75   -0.127514         -0.020540  
Down_sp Pct95   -0.087007         -0.003896  
Down_sp Occur  961.000000     269680.000000    
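The t-stat and Ann. IR rows of the IC table can be checked by hand from IC Mean and IC Std. The short calculation below does this for the return_ic column, assuming the number of IC observations is 961 (the occurrence count shown for the period-level portfolios such as long_ret). That sample size is an inference from the tables above, not something the output states explicitly.

```python
import math

ic_mean = -6.945807e-02   # "IC Mean" of return_ic
ic_std = 2.594710e-01     # "IC Std." of return_ic
n_obs = 961               # assumed number of IC observations (sqrt = 31)

# One-sample t statistic of the mean IC against zero
t_stat = ic_mean / (ic_std / math.sqrt(n_obs))
# Information ratio as mean over standard deviation
ir = ic_mean / ic_std

print(round(t_stat, 3))  # -8.298
print(round(ir, 4))      # -0.2677
```

Both values agree with the printed t-stat(IC) of -8.298423 and Ann. IR of -0.2676911. For this output, the row labelled "Ann. IR" therefore coincides with the plain mean-to-standard-deviation ratio of the IC series.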

Factor analysis visualization

    import matplotlib.pyplot as plt
    obj.create_full_report()
    plt.show()
    Value of signals of Different Quantiles Statistics
             min        max      mean        std  count    count %
quantile                                                          
1         0.4286     2.7217  1.159793   0.309939  54314  20.140166
2         0.9723     4.1892  1.794878   0.471906  53937  20.000371
3         1.4054     5.7229  2.578376   0.661841  53930  19.997775
4         1.6454     9.4488  3.854879   0.986411  53937  20.000371
5         2.8008  5750.5164  8.834096  59.155352  53562  19.861317
Figure saved: E:\2018_Course\HighSchool\Final\5_因子研发工具实操Richard\output\returns_report.pdf
Information Analysis
                ic
IC Mean     -0.069
IC Std.      0.259
t-stat(IC)  -8.298
p-value(IC)  0.000
IC Skew      0.053
IC Kurtosis -0.760
Ann. IR     -0.268
Figure saved: E:\2018_Course\HighSchool\Final\5_因子研发工具实操Richard\output\information_report.pdf
<matplotlib.figure.Figure at 0x21498899f60>

[Figure: output_17_2.png]
[Figure: output_17_3.png]
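The quantile statistics printed by the report show roughly 20% of the observations in each of the five buckets, with value ranges that overlap between neighbouring quantiles. That pattern is consistent with buckets being assigned within each trading day and not over the pooled sample. A minimal rank-based sketch of one day's bucketing follows. The function assign_quantiles is hypothetical, the values are the first ten P/B figures of the 20140102 row rounded to two decimals, and the real analysis runs over the whole masked universe, so the buckets here will not match signal_data.

```python
def assign_quantiles(values, n_quantiles=5):
    """Bucket values into 1..n_quantiles by rank; bucket 1 holds the smallest."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    quantiles = [0] * len(values)
    for rank, idx in enumerate(order):
        # Integer arithmetic spreads the ranks as evenly as possible
        quantiles[idx] = rank * n_quantiles // len(values) + 1
    return quantiles


pb_one_day = [1.06, 1.29, 4.90, 3.58, 2.37, 1.32, 0.91, 2.05, 2.42, 0.88]
quantiles = assign_quantiles(pb_one_day)
print(quantiles)  # [2, 2, 5, 5, 4, 3, 1, 3, 4, 1]
```

With ten stocks and five buckets, each bucket receives exactly two stocks, the two lowest P/B values landing in quantile 1.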

    # Analysis by group
    from jaqs_fxdayu.research.signaldigger import performance as pfm
    ic = pfm.calc_signal_ic(signal_data, by_group=True)
    mean_ic_by_group = pfm.mean_information_coefficient(ic, by_group=True)

    from jaqs_fxdayu.research.signaldigger import plotting
    plotting.plot_ic_by_group(mean_ic_by_group)
    plt.show()

[Figure: output_19_0.png, IC by group plot from plot_ic_by_group]
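Conceptually, the group analysis computes one IC per trading day inside each industry group and then averages those ICs per group. The sketch below shows only the averaging step on invented numbers. The record layout, the group labels and the function name are assumptions for illustration, not the data structures used by pfm.mean_information_coefficient.

```python
from collections import defaultdict
from statistics import mean

# (trade_date, group, ic) records with made-up IC values
ic_records = [
    (20140103, "bank", -0.10),
    (20140106, "bank", -0.20),
    (20140103, "real_estate", 0.05),
    (20140106, "real_estate", 0.15),
]


def mean_ic_per_group(records):
    """Average the per-date ICs separately for every group."""
    grouped = defaultdict(list)
    for _, group, ic in records:
        grouped[group].append(ic)
    return {group: mean(values) for group, values in grouped.items()}


demo_mean_ic = mean_ic_per_group(ic_records)
for group, value in demo_mean_ic.items():
    print(group, round(value, 4))
```

A bar chart of such per-group means is what a by-group IC plot summarises: groups with a strongly negative mean IC are the ones where low factor values went with higher subsequent returns.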

Saving the Quantile 1 stock selection to Excel

    excel_data = signal_data[signal_data['quantile']==1]["quantile"].unstack().replace(np.nan, 0)
    print(excel_data.head())
    excel_data.to_excel('./pb_quantile_1.xlsx')
symbol      000001.SZ  000002.SZ  000024.SZ  000027.SZ   ... 
trade_date            
20140103          1.0        0.0        0.0        0.0     ...   
20140106          1.0        0.0        0.0        0.0     ...  
20140107          1.0        0.0        0.0        0.0     ...  
20140108          1.0        0.0        0.0        0.0     ... 
20140109          1.0        0.0        0.0        0.0     ...  
[5 rows x 157 columns]
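The export filters the long table down to quantile 1 rows first and only then pivots it into a date-by-symbol matrix. This is why the result has 157 columns and not 488: a symbol gets a column only if it was in quantile 1 on at least one date. The standard-library sketch below reproduces that filter-then-pivot logic on a few invented records; membership_matrix is a hypothetical helper, not a pandas or jaqs_fxdayu function.

```python
def membership_matrix(records, target_quantile=1):
    """Pivot (trade_date, symbol, quantile) records into a 0/1 matrix.

    Rows are dates and columns are symbols; only dates and symbols that
    appear at least once in the target quantile are kept.
    """
    selected = {(d, s) for d, s, q in records if q == target_quantile}
    dates = sorted({d for d, _ in selected})
    symbols = sorted({s for _, s in selected})
    matrix = [[1.0 if (d, s) in selected else 0.0 for s in symbols] for d in dates]
    return dates, symbols, matrix


# Made-up records in the (trade_date, symbol, quantile) layout
records = [
    (20140103, "000001.SZ", 1),
    (20140103, "000002.SZ", 2),
    (20140106, "000001.SZ", 1),
    (20140106, "000027.SZ", 1),
]

dates, symbols, matrix = membership_matrix(records)
print(symbols)  # ['000001.SZ', '000027.SZ']
print(matrix)   # [[1.0, 0.0], [1.0, 1.0]]
```

000002.SZ never enters quantile 1 in this toy data, so it has no column at all, while 000027.SZ gets a 0.0 on the date it was not selected.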