@Channelchan 2018-04-19T09:36:56.000000Z, 9767 characters, 52343 views

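Factor Analysis

Contents

  1. What is SignalDigger?
  2. SignalDigger vs alphalens
  3. Data preparation
  4. How to test and analyze stock-selection performance with SignalDigger
  5. Visualizing stock-selection performance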

What is SignalDigger?

Installation: pip install git+https://github.com/xingetouzi/JAQS.git@fxdayu

GitHub: https://github.com/xingetouzi/JAQS/tree/fxdayu

Official website: https://www.quantos.org/ (you can register your own data account there)

SignalDigger vs alphalens

[Figure: SignalDigger vs alphalens comparison (signaldiggervsalphalens___.png)]

Data preparation

The steps below use the CSI 300 constituents as an example and prepare the stock-selection factor data (signal_data).

    from jaqs_fxdayu.data import DataView  # a lightweight, pandas-based data store that makes data easy to query and process
    from jaqs_fxdayu.data import RemoteDataService  # data service used to download data
    import os
    import warnings

    warnings.filterwarnings("ignore")

    dataview_folder = './Factor'
    if not os.path.isdir(dataview_folder):
        os.makedirs(dataview_folder)

    # Download the data
    def save_dataview():
        # Replace the username and password with those of your own data account
        data_config = {
            "remote.data.address": "tcp://data.tushare.org:8910",
            "remote.data.username": "18566262672",
            "remote.data.password": "eyJhbGciOiJIUzI1NiJ9.eyJjcmVhdGVfdGltZSI6IjE1MTI3MDI3NTAyMTIiLCJpc3MiOiJhdXRoMCIsImlkIjoiMTg1NjYyNjI2NzIifQ.O_-yR0zYagrLRvPbggnru1Rapk4kiyAzcwYt2a3vlpM"
        }
        ds = RemoteDataService()
        ds.init_from_config(data_config)
        dv = DataView()
        props = {'start_date': 20140101, 'end_date': 20180101, 'universe': '000300.SH',
                 'fields': "pb,pe,ps,float_mv,sw1",
                 'freq': 1}
        dv.init_from_config(props, ds)
        dv.prepare_data()
        dv.save_dataview(dataview_folder)  # save the data to the given path so it can be loaded directly next time

    save_dataview()
Begin: DataApi login 18566262672@tcp://data.tushare.org:8910
    login success 

Initialize config success.
Query data...
Query data - query...
NOTE: price adjust method is [post adjust]
当前请求daily...
{'adjust_mode': None, 'fields': 'open_adj,high,close_adj,symbol,vwap,trade_date,open,trade_status,high_adj,low_adj,close,vwap_adj,low'}
当前请求daily...
{'adjust_mode': 'post', 'fields': 'open_adj,high,close_adj,symbol,vwap,trade_date,open,trade_status,high_adj,low_adj,close,vwap_adj,low'}
当前请求query_lb_dailyindicator...
{'fields': 'symbol,pb,ps,float_mv,trade_date,pe'}
WARNING: some data is unavailable: 
    At fields 
Query data - daily fields prepared.
Query instrument info...
Query adj_factor...
Query benchmark...
Query benchmar member info...
Query groups (industry)...
Field [sw1] is overwritten.
Data has been successfully prepared.

Store data...
Dataview has been successfully saved to:
E:\QTC\PythonQTC\course\4_Selection\Richard\Factor

You can load it with load_dataview('E:\QTC\PythonQTC\course\4_Selection\Richard\Factor')
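The config dict above carries the account name and token inline. As a rough sketch only, the same three keys could be assembled from environment variables instead. The variable names JAQS_DATA_ADDRESS, JAQS_DATA_USERNAME and JAQS_DATA_PASSWORD and the helper build_data_config are made up for this illustration and are not part of JAQS.

```python
import os


def build_data_config(env=os.environ):
    """Assemble the data-service config dict from environment variables.

    The variable names are hypothetical, chosen only for this sketch.
    """
    required = ("JAQS_DATA_USERNAME", "JAQS_DATA_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        # Fall back to the address used in this tutorial when none is set.
        "remote.data.address": env.get("JAQS_DATA_ADDRESS", "tcp://data.tushare.org:8910"),
        "remote.data.username": env["JAQS_DATA_USERNAME"],
        "remote.data.password": env["JAQS_DATA_PASSWORD"],
    }


# Demonstration with a plain dict standing in for os.environ.
example_config = build_data_config({
    "JAQS_DATA_USERNAME": "my_account",
    "JAQS_DATA_PASSWORD": "my_token",
})
print(sorted(example_config))
```

The returned dict has the same keys as data_config, so it could presumably be passed to ds.init_from_config in its place.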
    # Load the data
    dv = DataView()
    dv.load_dataview(dataview_folder)
Dataview loaded successfully.
    print(dv.get_ts("pb").head())
symbol   000001.SZ 000002.SZ 000008.SZ 000009.SZ  ...   
20140102    1.0563    1.2891    4.8981    3.579   ...   
20140103    1.0304    1.2649    4.8709    3.4843  ...   
20140106    1.0079    1.2068    4.6314    3.4537  ...   
20140107    1.0044    1.1987    4.5661    3.4461  ...   
20140108    1.0157    1.1971     4.479    3.3852  ...   

[5 rows x 488 columns]
    import numpy as np

    # Signal filter: mask out stocks that are not index constituents
    def mask_index_member():
        df_index_member = dv.get_ts('index_member')
        mask_index_member = df_index_member == 0
        return mask_index_member

    # Tradability conditions: not suspended and not at the daily price limit
    def limit_up_down():
        trade_status = dv.get_ts('trade_status')
        mask_sus = trade_status == u'停牌'  # '停牌' is the status string for a suspended stock
        # Limit up
        dv.add_formula('up_limit', '(close - Delay(close, 1)) / Delay(close, 1) > 0.095', is_quarterly=False, add_data=True)
        # Limit down
        dv.add_formula('down_limit', '(close - Delay(close, 1)) / Delay(close, 1) < -0.095', is_quarterly=False, add_data=True)
        can_enter = np.logical_and(dv.get_ts('up_limit') < 1, ~mask_sus)  # not limit-up and not suspended
        can_exit = np.logical_and(dv.get_ts('down_limit') < 1, ~mask_sus)  # not limit-down and not suspended
        return can_enter, can_exit

    mask = mask_index_member()
    can_enter, can_exit = limit_up_down()
    print(mask.head())
    print(can_enter.head())
symbol    000001.SZ  000002.SZ  000008.SZ  000009.SZ   ...
20140102      False      False       True      False      ...
20140103      False      False       True      False      ...
20140106      False      False       True      False      ...
20140107      False      False       True      False      ... 
20140108      False      False       True      False      ... 
[5 rows x 488 columns]

symbol    000001.SZ  000002.SZ  000008.SZ  000009.SZ   ...
20140102       True       True       True       True      ...
20140103       True       True       True       True      ...
20140106       True       True       True       True      ...
20140107       True       True       True       True      ...
20140108       True       True       True       True      ...
[5 rows x 488 columns]
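To make the two formulas concrete, here is a minimal pandas sketch of the same tradability logic on toy data. It does not use DataView or add_formula. The symbols AAA and BBB, the prices, and the non-suspended status string are invented for illustration. The first row has no previous close, so this sketch treats it as tradable, which may differ from what the library does.

```python
import pandas as pd

# Toy closing prices for two hypothetical symbols over four days.
close = pd.DataFrame(
    {"AAA": [10.0, 11.0, 11.0, 9.9], "BBB": [20.0, 20.2, 20.0, 20.1]},
    index=[20140102, 20140103, 20140106, 20140107],
)

# Toy trade status; "停牌" marks a suspended stock, as in the code above.
trade_status = pd.DataFrame("交易", index=close.index, columns=close.columns)
trade_status.loc[20140106, "BBB"] = "停牌"

# Same quantity as (close - Delay(close, 1)) / Delay(close, 1).
daily_ret = close / close.shift(1) - 1.0
up_limit = daily_ret > 0.095      # treated as limit-up
down_limit = daily_ret < -0.095   # treated as limit-down
suspended = trade_status == "停牌"

can_enter = ~up_limit & ~suspended   # cannot buy at limit-up or when suspended
can_exit = ~down_limit & ~suspended  # cannot sell at limit-down or when suspended
print(can_enter)
print(can_exit)
```

AAA rises 10% on the second day and falls 10% on the fourth, so it is blocked from entry on the former and from exit on the latter; BBB is blocked both ways on the day it is suspended.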
    from jaqs_fxdayu.research import SignalDigger

    obj = SignalDigger(output_folder='./output',
                       output_format='pdf')

    # Process the factor: compute each stock's holding-period return in the target universe
    # and the quantile bucket of its factor value
    obj.process_signal_before_analysis(signal=dv.get_ts("pb"),
                                       price=dv.get_ts("close_adj"),
                                       high=dv.get_ts("high_adj"),  # optional
                                       low=dv.get_ts("low_adj"),  # optional
                                       group=dv.get_ts("sw1"),  # optional
                                       n_quantiles=5,  # number of quantile buckets
                                       mask=mask,  # filter condition
                                       can_enter=can_enter,  # whether a position can be opened
                                       can_exit=can_exit,  # whether a position can be closed
                                       period=15,  # holding period
                                       benchmark_price=dv.data_benchmark,  # benchmark price; optional, if omitted the holding-period return is an absolute return
                                       commission=0.0008,
                                       )
    signal_data = obj.signal_data
    signal_data.head()
Nan Data Count (should be zero) : 0;  Percentage of effective data: 58%
                        signal    return  upside_ret  downside_ret   group  quantile
trade_date symbol
20140103   000001.SZ    1.0563 -0.003744    0.005068     -0.057799  480000         1
           000002.SZ    1.2891  0.012511    0.010680     -0.102841  430000         2
           000009.SZ    3.5794  0.029817    0.025430     -0.069652  510000         4
           000012.SZ    2.3725  0.021382    0.014163     -0.116760  610000         4
           000024.SZ    1.3202 -0.031632   -0.002781     -0.161771  430000         2
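Note that the signal shown for 20140103 (1.0563 for 000001.SZ) equals the pb value printed earlier for 20140102, so the factor appears to be lagged by one day. The sketch below shows one plausible way to build the return and quantile columns with plain pandas under that assumption. It is not SignalDigger's implementation: the helper names, toy symbols and numbers are invented, and masks, commission and benchmark adjustment are ignored.

```python
import pandas as pd


def forward_returns(price, period):
    """Return from buying at today's price and selling `period` rows later."""
    return price.shift(-period) / price - 1.0


def quantile_labels(signal, n_quantiles):
    """Label each stock 1..n_quantiles by its signal rank within each date."""
    def label_one_day(row):
        valid = row.dropna()
        if len(valid) < n_quantiles:
            # Not enough stocks to fill every bucket on this date.
            return pd.Series(float("nan"), index=row.index)
        # Ranking first breaks ties so the buckets get similar counts.
        buckets = pd.qcut(valid.rank(method="first"), n_quantiles, labels=False) + 1
        return buckets.reindex(row.index)
    return signal.apply(label_one_day, axis=1)


dates = [20140102, 20140103, 20140106]
symbols = ["S1", "S2", "S3", "S4", "S5"]
pb = pd.DataFrame([[1.0, 1.3, 4.9, 3.6, 2.4],
                   [1.1, 1.2, 4.8, 3.5, 2.3],
                   [3.0, 1.2, 4.6, 0.9, 2.2]], index=dates, columns=symbols)
price = pd.DataFrame([[10.0, 20.0, 5.0, 8.0, 4.0],
                      [11.0, 19.0, 5.0, 8.4, 4.0],
                      [12.1, 19.0, 5.5, 8.4, 4.2]], index=dates, columns=symbols)

# Lag the factor by one row so that yesterday's value drives today's bucket.
signal = pb.shift(1)
quantile = quantile_labels(signal, n_quantiles=5)
ret = forward_returns(price, period=1)
print(quantile)
print(ret)
```

The first date has no lagged signal and the last date has no forward price, so both come out as NaN.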

Factor analysis

    from jaqs_fxdayu.research.signaldigger.analysis import analysis
    result = analysis(signal_data, is_event=False, period=15)

Documentation of the factor-analysis metrics

Columns (IC type / portfolio type):

Index rows (the specific IC or return metrics):

    print("——ic分析——")  # IC analysis
    print(result["ic"])
    print("——选股收益分析——")  # stock-selection return analysis
    print(result["ret"])
    print("——最大潜在盈利/亏损分析——")  # maximum potential gain/loss analysis
    print(result["space"])
——ic分析——
                return_ic  upside_ret_ic  downside_ret_ic
IC Mean     -6.951035e-02   5.606238e-02    -2.230749e-01
IC Std.      2.589841e-01   2.442219e-01     2.096405e-01
t-stat(IC)  -8.320284e+00   7.116209e+00    -3.298656e+01
p-value(IC)  2.989256e-16   2.169488e-12    3.871870e-160
IC Skew      5.081720e-02  -4.449374e-01     5.723058e-01
IC Kurtosis -7.709979e-01  -4.734605e-01     1.615940e-01
Ann. IR     -2.683963e-01   2.295551e-01    -1.064083e+00
——选股收益分析——
             long_ret  long_short_ret  top_quantile_ret  bottom_quantile_ret  \
t-stat      -4.381030       -6.490318        -18.694714            19.120962   
p-value      0.000010        0.000000          0.000000             0.000000   
skewness    -0.049526        0.029738          0.744796             2.317705   
kurtosis     2.514092        1.086382          4.428683            14.380815   
Ann. Ret    -0.073421       -0.092002         -0.122516             0.092692   
Ann. Vol     0.130281        0.110197          0.381137             0.283953   
Ann. IR     -0.563555       -0.834883         -0.321449             0.326435   
occurance  976.000000      976.000000      54569.000000         55355.000000   

              tmb_ret  all_sample_ret  
t-stat      -6.687941       -5.011899  
p-value      0.000000        0.000000  
skewness    -0.265104        1.287218  
kurtosis     1.650265        8.079226  
Ann. Ret    -0.214525       -0.012834  
Ann. Vol     0.249360        0.334219  
Ann. IR     -0.860305       -0.038401  
occurance  976.000000   274820.000000  
——最大潜在盈利/亏损分析——
               long_space  top_quantile_space  bottom_quantile_space  \
Up_sp Mean       0.082828            0.081309               0.077566   
Up_sp Std        0.054042            0.107235               0.091162   
Up_sp IR         1.532662            0.758226               0.850856   
Up_sp Pct5       0.027272            0.000000               0.000000   
Up_sp Pct25      0.048482            0.023366               0.019653   
Up_sp Pct50      0.066416            0.058783               0.049451   
Up_sp Pct75      0.101446            0.113807               0.102595   
Up_sp Pct95      0.198889            0.258658               0.249200   
Up_sp Occur    976.000000        54569.000000           55355.000000   
Down_sp Mean    -0.120504           -0.122398              -0.068287   
Down_sp Std      0.107125            0.219643               0.139466   
Down_sp IR      -1.124897           -0.557260              -0.489629   
Down_sp Pct5    -0.287460           -1.000800              -0.200800   
Down_sp Pct25   -0.129827           -0.108807              -0.067467   
Down_sp Pct50   -0.094101           -0.056131              -0.033480   
Down_sp Pct75   -0.070250           -0.025190              -0.014265   
Down_sp Pct95   -0.045912           -0.002715              -0.001762   
Down_sp Occur  976.000000        54569.000000           55355.000000   

                tmb_space  all_sample_space  
Up_sp Mean       0.151314          0.080787  
Up_sp Std        0.079456          0.102709  
Up_sp IR         1.904376          0.786562  
Up_sp Pct5       0.066108          0.000000  
Up_sp Pct25      0.097333          0.021979  
Up_sp Pct50      0.130869          0.055737  
Up_sp Pct75      0.187622          0.111749  
Up_sp Pct95      0.330373          0.255610  
Up_sp Occur    976.000000     274820.000000  
Down_sp Mean    -0.201872         -0.100036  
Down_sp Std      0.118838          0.193569  
Down_sp IR      -1.698714         -0.516796  
Down_sp Pct5    -0.404359         -0.356856  
Down_sp Pct25   -0.239143         -0.090412  
Down_sp Pct50   -0.174785         -0.045032  
Down_sp Pct75   -0.130423         -0.019492  
Down_sp Pct95   -0.088023         -0.002146  
Down_sp Occur  976.000000     274820.000000  
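As a quick arithmetic check on the IC table, the rows of the return_ic column can be related to each other. The snippet below uses only the printed numbers. The formulas are the standard one-sample t-statistic and the mean/std ratio, and are an assumption about how the library derives these rows.

```python
import math

# Values copied from the return_ic column of the IC table above.
ic_mean = -6.951035e-02
ic_std = 2.589841e-01
t_stat = -8.320284e+00

# The reported "Ann. IR" matches the plain ratio of IC mean to IC std.
ir = ic_mean / ic_std

# A one-sample t-statistic is mean / (std / sqrt(n)) = ir * sqrt(n),
# so the number of IC observations can be backed out of the table.
n_obs = (t_stat / ir) ** 2

print(round(ir, 6))
print(round(n_obs))
print(math.isclose(ir * math.sqrt(round(n_obs)), t_stat, rel_tol=1e-4))
```

The implied sample size comes out at about 961. That would match the 976 dates in the return table minus the 15-day holding period, though this reading is inferred from the numbers, not taken from the library's documentation.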

Visualizing the factor analysis

    import matplotlib.pyplot as plt
    obj.create_full_report()
    plt.show()
Value of signals of Different Quantiles Statistics
             min        max      mean        std  count    count %
quantile                                                          
1         0.4286     2.7217  1.160327   0.309087  55355  20.142275
2         0.9723     4.1892  1.794837   0.470010  54967  20.001092
3         1.4054     5.7229  2.580409   0.658445  54962  19.999272
4         1.6454     9.4488  3.861061   0.982767  54967  20.001092
5         2.8008  5750.5164  8.830355  58.609644  54569  19.856270
Figure saved: E:\QTC\PythonQTC\course\4_Selection\Richard\output\returns_report.pdf
Information Analysis
                ic
IC Mean     -0.070
IC Std.      0.259
t-stat(IC)  -8.320
p-value(IC)  0.000
IC Skew      0.051
IC Kurtosis -0.771
Ann. IR     -0.268
Figure saved: E:\QTC\PythonQTC\course\4_Selection\Richard\output\information_report.pdf



<matplotlib.figure.Figure at 0x1790c4ff710>

[Figure: output_17_2.png]

[Figure: output_17_3.png]
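The first table printed by the report summarizes the signal values inside each quantile. A comparable table can be produced from any long-format frame with signal and quantile columns; the sketch below uses invented toy values and plain pandas, and is not the library's own code.

```python
import pandas as pd

# Toy long-format data with the same two columns as signal_data.
toy = pd.DataFrame({
    "signal":   [0.5, 0.9, 1.1, 1.6, 2.0, 2.4, 3.0, 9.0],
    "quantile": [1,   1,   2,   2,   3,   3,   4,   4],
})

# Per-quantile summary of the signal, plus each bucket's share of all rows.
stats = toy.groupby("quantile")["signal"].agg(["min", "max", "mean", "std", "count"])
stats["count %"] = stats["count"] / stats["count"].sum() * 100.0
print(stats)
```

In the real output above, the top quantile's max of 5750.5164 and std of 58.609644 stand out against the other buckets, which is the kind of outlier this table helps to spot.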

    # Analysis by group
    from jaqs_fxdayu.research.signaldigger import performance as pfm
    ic = pfm.calc_signal_ic(signal_data, by_group=True)
    mean_ic_by_group = pfm.mean_information_coefficient(ic, by_group=True)

    from jaqs_fxdayu.research.signaldigger import plotting
    plotting.plot_ic_by_group(mean_ic_by_group)
    plt.show()

[Figure: mean IC by group (output_19_0.png)]
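For intuition about what a by-group IC measures, here is a small pandas sketch. It assumes the IC is a rank (Spearman-style) correlation between signal and return, computed per date within each group and then averaged over time. That is a common definition but an assumption here, and the toy groups and numbers are invented.

```python
import pandas as pd


def rank_ic(frame):
    """Spearman-style IC: Pearson correlation of the ranks of signal and return."""
    return frame["signal"].rank().corr(frame["return"].rank())


index = pd.MultiIndex.from_product(
    [[20140103, 20140106], ["S1", "S2", "S3", "S4", "S5", "S6"]],
    names=["trade_date", "symbol"],
)
toy = pd.DataFrame({
    "signal": [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3],
    "return": [0.01, 0.02, 0.03, 0.03, 0.02, 0.01,
               0.02, 0.01, 0.03, 0.03, 0.02, 0.01],
    "group": ["bank", "bank", "bank", "tech", "tech", "tech"] * 2,
}, index=index)

# One IC per (date, group), then the time-series mean per group.
ic_by_group = toy.groupby(["trade_date", "group"])[["signal", "return"]].apply(rank_ic)
mean_ic = ic_by_group.groupby(level="group").mean()
print(mean_ic)
```

A group whose mean IC is strongly negative, like the toy "tech" group here, is one where higher signal values tended to precede lower returns.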

Save the Quantile 1 selection to Excel

    excel_data = signal_data[signal_data['quantile']==1]["quantile"].unstack().replace(np.nan, 0)
    print(excel_data.head())
    excel_data.to_excel('./pb_quantile_1.xlsx')
symbol      000001.SZ  000002.SZ  000024.SZ  000027.SZ   ... 
trade_date            
20140103          1.0        0.0        0.0        0.0     ...   
20140106          1.0        0.0        0.0        0.0     ...  
20140107          1.0        0.0        0.0        0.0     ...  
20140108          1.0        0.0        0.0        0.0     ... 
20140109          1.0        0.0        0.0        0.0     ...  
[5 rows x 157 columns]
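The one-liner above packs three steps together: filter, pivot, fill. The toy example below, with invented symbols, walks through the same chain and shows why the result has fewer columns than the full universe (157 here versus 488 earlier): a symbol that never lands in quantile 1 gets no column at all.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [(20140103, "S1"), (20140103, "S2"),
     (20140106, "S1"), (20140106, "S2"), (20140106, "S3")],
    names=["trade_date", "symbol"],
)
toy = pd.DataFrame({"quantile": [1, 2, 1, 1, 3]}, index=index)

# Keep only quantile-1 rows, pivot symbols into columns, fill the gaps with 0.
selected = toy[toy["quantile"] == 1]["quantile"].unstack().replace(np.nan, 0)
print(selected)
```

Each cell is 1 when the stock is in quantile 1 on that date and 0 otherwise, so the sheet can be read as a daily holdings matrix.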