@Channelchan
2018-10-10T23:57:47.000000Z
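SignalDigger is a third-party Python library for analyzing the performance of stock-selection factors (alpha, α).
It is an integrated, simplified version of alphalens, with details tuned to the trading rules of the A-share market (such as daily price limits), which makes it easy for beginners to pick up quickly.
Installation: pip install git+https://github.com/xingetouzi/JAQS.git@fxdayu
GitHub: https://github.com/xingetouzi/JAQS/tree/fxdayu
Official website: https://www.quantos.org/ (you can register your own data account there)
The example below uses the CSI 300 constituents to process a stock-selection factor (signal_data).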
from jaqs_fxdayu.data import DataView  # a lightweight, pandas-based data store that makes data easy to query and process
from jaqs_fxdayu.data import RemoteDataService  # data service used to download data
import os
import warnings

warnings.filterwarnings("ignore")

dataview_folder = '../Factor'
if not (os.path.isdir(dataview_folder)):
    os.makedirs(dataview_folder)

# Download the data
def save_dataview():
    data_config = {
        "remote.data.address": "tcp://data.quantOS.org:8910",
        "remote.data.username": "18566262672",
        "remote.data.password": "eyJhbGciOiJIUzI1NiJ9.eyJjcmVhdGVfdGltZSI6IjE1MTI3MDI3NTAyMTIiLCJpc3MiOiJhdXRoMCIsImlkIjoiMTg1NjYyNjI2NzIifQ.O_-yR0zYagrLRvPbggnru1Rapk4kiyAzcwYt2a3vlpM"
    }
    ds = RemoteDataService()
    ds.init_from_config(data_config)

    dv = DataView()
    props = {'start_date': 20140101, 'end_date': 20180101, 'universe': '000300.SH',
             'fields': "volume,pb,pe,ps,roe,float_mv,sw1",
             'freq': 1,
             'timeout': 180}
    dv.init_from_config(props, ds)
    dv.prepare_data()
    dv.save_dataview(dataview_folder)  # save the data to the given folder so it can be loaded directly next time

save_dataview()
Begin: DataApi login 18566262672@tcp://data.quantOS.org:8910
Already login as 18566262672, skip init_from_config
Initialize config success.
Query data...
Query data - query...
NOTE: price adjust method is [post adjust]
当前请求daily...
{'adjust_mode': None, 'fields': 'symbol,trade_date,open,volume,high_adj,vwap,close_adj,high,low,vwap_adj,close,trade_status,low_adj,open_adj'}
当前请求daily...
{'adjust_mode': 'post', 'fields': 'open,vwap,low,high,close,symbol,trade_date'}
当前请求query_lb_dailyindicator...
{'fields': 'ps,float_mv,symbol,pe,pb,trade_date'}
WARNING: some data is unavailable:
At fields
Query data - daily fields prepared.
Query data - quarterly fields prepared.
Query instrument info...
Query adj_factor...
Query benchmark...
Query benchmar member info...
Query groups (industry)...
Field [trade_status] is overwritten.
Data has been successfully prepared.
Store data...
Dataview has been successfully saved to:
E:\2018_Course\HighSchool\Final\Factor
You can load it with load_dataview('E:\2018_Course\HighSchool\Final\Factor')
# Load the data
dv = DataView()
dv.load_dataview(dataview_folder)
Dataview loaded successfully.
print(dv.get_ts("pb").head())
symbol 000001.SZ 000002.SZ 000008.SZ 000009.SZ 000012.SZ 000024.SZ \
trade_date
20140102 1.0563 1.2891 4.8981 3.5794 2.3725 1.3202
20140103 1.0304 1.2649 4.8709 3.4842 2.3346 1.2977
20140106 1.0079 1.2068 4.6314 3.4537 2.2036 1.2283
20140107 1.0044 1.1987 4.5661 3.4461 2.1920 1.2013
20140108 1.0157 1.1971 4.4790 3.3852 2.1862 1.1685
symbol 000027.SZ 000039.SZ 000046.SZ 000059.SZ ... 601998.SH \
trade_date ...
20140102 0.9077 2.0483 2.4159 0.8806 ... 0.8216
20140103 0.8861 2.0801 2.3726 0.8488 ... 0.8088
20140106 0.8662 2.0113 2.3348 0.8081 ... 0.7960
20140107 0.8629 2.0721 2.2970 0.7940 ... 0.7939
20140108 0.8728 2.0629 2.3294 0.7904 ... 0.7960
symbol 603000.SH 603160.SH 603288.SH 603699.SH 603799.SH 603833.SH \
trade_date
20140102 10.0487 NaN NaN NaN NaN NaN
20140103 9.8886 NaN NaN NaN NaN NaN
20140106 9.8515 NaN NaN NaN NaN NaN
20140107 10.1024 NaN NaN NaN NaN NaN
20140108 10.3713 NaN NaN NaN NaN NaN
symbol 603858.SH 603885.SH 603993.SH
trade_date
20140102 NaN NaN 2.7133
20140103 NaN NaN 2.6706
20140106 NaN NaN 2.5682
20140107 NaN NaN 2.5682
20140108 NaN NaN 2.5298
[5 rows x 488 columns]
import numpy as np

# Signal filter: True where a stock is NOT an index constituent (these rows are excluded)
def mask_index_member():
    df_index_member = dv.get_ts('index_member')
    mask_index_member = df_index_member == 0
    return mask_index_member

# Tradability conditions: not suspended and not at the daily price limit
def limit_up_down():
    trade_status = dv.get_ts('trade_status')
    mask_sus = trade_status == 0
    # Limit up
    dv.add_formula('up_limit', '(close - Delay(close, 1)) / Delay(close, 1) > 0.095', is_quarterly=False, add_data=True)
    # Limit down
    dv.add_formula('down_limit', '(close - Delay(close, 1)) / Delay(close, 1) < -0.095', is_quarterly=False, add_data=True)
    can_enter = np.logical_and(dv.get_ts('up_limit') < 1, ~mask_sus)  # not limit-up and not suspended
    can_exit = np.logical_and(dv.get_ts('down_limit') < 1, ~mask_sus)  # not limit-down and not suspended
    return can_enter, can_exit

mask = mask_index_member()
can_enter, can_exit = limit_up_down()
print(mask.head())
print(can_enter.head())
symbol 000001.SZ 000002.SZ 000008.SZ 000009.SZ ...
20140102 False False True False ...
20140103 False False True False ...
20140106 False False True False ...
20140107 False False True False ...
20140108 False False True False ...
[5 rows x 488 columns]
symbol 000001.SZ 000002.SZ 000008.SZ 000009.SZ ...
20140102 True True True True ...
20140103 True True True True ...
20140106 True True True True ...
20140107 True True True True ...
20140108 True True True True ...
[5 rows x 488 columns]
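To make the limit-up / limit-down rule concrete, here is a minimal plain-Python sketch of the same comparison on a single price series. It is not how DataView evaluates `add_formula` internally; the function name `limit_flags`, the sample prices, and the `suspended` flags are all hypothetical and exist only for illustration. The 0.095 threshold is the one used in the formulas above.

```python
def limit_flags(closes, threshold=0.095):
    """Return (up_limit, down_limit) boolean lists for one daily close series.

    The first day has no previous close, so both flags are False there.
    """
    if not closes:
        return [], []
    up, down = [False], [False]
    for prev, cur in zip(closes, closes[1:]):
        change = (cur - prev) / prev  # one-day simple return
        up.append(change > threshold)
        down.append(change < -threshold)
    return up, down


# Hypothetical prices: day 2 rises 10%, day 4 falls 10%.
closes = [10.0, 11.0, 11.0, 9.9]
suspended = [False, False, True, False]  # hypothetical suspension flags

up, down = limit_flags(closes)
# A position can be opened only if the stock is neither limit-up nor suspended,
# and closed only if it is neither limit-down nor suspended.
can_enter_demo = [not u and not s for u, s in zip(up, suspended)]
can_exit_demo = [not d and not s for d, s in zip(down, suspended)]
print(can_enter_demo)
print(can_exit_demo)
```

The point of keeping separate entry and exit masks is that a limit-up day blocks buying but not selling, while a limit-down day blocks selling but not buying.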
from jaqs_fxdayu.research import SignalDigger

obj = SignalDigger(output_folder='./output',
                   output_format='pdf')

# Process the factor: for every stock in the target universe, compute the
# holding-period return and the quantile bucket of its factor value
obj.process_signal_before_analysis(
    signal=dv.get_ts("pb"),
    price=dv.get_ts("close_adj"),
    high=dv.get_ts("high_adj"),  # optional
    low=dv.get_ts("low_adj"),  # optional
    group=dv.get_ts("sw1"),  # optional
    n_quantiles=5,  # number of quantile buckets
    mask=mask,  # filter condition
    can_enter=can_enter,  # whether a position can be opened
    can_exit=can_exit,  # whether a position can be closed
    period=15,  # holding period
    benchmark_price=dv.data_benchmark,  # benchmark price; optional. If omitted, the holding-period return is an absolute return
    commission=0.0008,
)
signal_data = obj.signal_data
signal_data.head()
Nan Data Count (should be zero) : 0; Percentage of effective data: 57%
trade_date | symbol | signal | return | upside_ret | downside_ret | group | quantile
---|---|---|---|---|---|---|---
20140103 | 000001.SZ | 1.0563 | -0.003744 | 0.005068 | -0.057799 | 银行 | 1
20140103 | 000002.SZ | 1.2891 | 0.012511 | 0.010680 | -0.102841 | 房地产 | 2
20140103 | 000009.SZ | 3.5794 | 0.029817 | 0.025430 | -0.069652 | 综合 | 4
20140103 | 000012.SZ | 2.3725 | 0.021382 | 0.014163 | -0.116760 | 建筑材料 | 4
20140103 | 000024.SZ | 1.3202 | -0.031632 | -0.002781 | -0.161771 | 房地产 | 2
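The two derived columns, the holding-period return and the quantile bucket, can be illustrated with a small plain-Python sketch. This is only an approximation of the idea: it ignores the benchmark, commission, masks, and entry/exit constraints that `process_signal_before_analysis` handles, and SignalDigger's exact binning rule may differ. The helper names `forward_returns` and `assign_quantiles` and all sample numbers are hypothetical.

```python
def forward_returns(prices, period):
    """Simple return from holding for `period` days, aligned to the entry day."""
    return [prices[i + period] / prices[i] - 1.0
            for i in range(len(prices) - period)]


def assign_quantiles(values, n_quantiles):
    """Map each value to a bucket 1..n_quantiles by rank (1 = smallest values)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    buckets = [0] * len(values)
    for rank, idx in enumerate(order):
        buckets[idx] = rank * n_quantiles // len(values) + 1
    return buckets


# Hypothetical adjusted closes for one stock and a 2-day holding period.
rets = forward_returns([10.0, 10.5, 11.0, 12.0], period=2)

# Hypothetical cross-section of factor values (e.g. pb) on one trade date.
buckets = assign_quantiles([1.0, 3.5, 2.2, 0.8, 5.1, 1.6], n_quantiles=3)
print(rets)
print(buckets)
```

With `n_quantiles=5`, as in the call above, bucket 1 would hold the lowest fifth of factor values on each date and bucket 5 the highest fifth.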
from jaqs_fxdayu.research.signaldigger.analysis import analysis

result = analysis(signal_data, is_event=False, period=15)
print("——ic分析——")  # IC analysis
print(result["ic"])
print("——选股收益分析——")  # stock-selection return analysis
print(result["ret"])
print("——最大潜在盈利/亏损分析——")  # maximum potential profit/loss analysis
print(result["space"])
——ic分析——
return_ic upside_ret_ic downside_ret_ic
IC Mean -6.945807e-02 5.937467e-02 -2.219792e-01
IC Std. 2.594710e-01 2.464900e-01 2.104796e-01
t-stat(IC) -8.298423e+00 7.467299e+00 -3.269370e+01
p-value(IC) 3.549725e-16 1.835765e-13 3.604265e-158
IC Skew 5.251623e-02 -4.326744e-01 5.580971e-01
IC Kurtosis -7.602440e-01 -4.722475e-01 1.068205e-01
Ann. IR -2.676911e-01 2.408806e-01 -1.054635e+00
——选股收益分析——
long_ret long_short_ret top_quantile_ret bottom_quantile_ret \
t-stat -4.354131 -6.429742 -18.372327 19.273406
p-value 0.000010 0.000000 0.000000 0.000000
skewness -0.056936 0.043832 0.738623 2.296000
kurtosis 2.476179 1.086328 4.264752 14.086266
Ann. Ret -0.074512 -0.093044 -0.122735 0.095147
Ann. Vol 0.132007 0.111627 0.384917 0.286435
Ann. IR -0.564453 -0.833528 -0.318861 0.332176
occurance 961.000000 961.000000 53562.000000 54314.000000
tmb_ret all_sample_ret
t-stat -6.597274 -4.942966
p-value 0.000000 0.000000
skewness -0.240436 1.263993
kurtosis 1.677179 7.805175
Ann. Ret -0.216154 -0.012879
Ann. Vol 0.252739 0.336874
Ann. IR -0.855246 -0.038232
occurance 961.000000 269680.000000
——最大潜在盈利/亏损分析——
long_space top_quantile_space bottom_quantile_space \
Up_sp Mean 0.086507 0.085930 0.078885
Up_sp Std 0.052492 0.090577 0.091406
Up_sp IR 1.647994 0.948698 0.863017
Up_sp Pct5 0.031881 0.002194 0.001494
Up_sp Pct25 0.053368 0.025353 0.020845
Up_sp Pct50 0.070421 0.060478 0.050651
Up_sp Pct75 0.102281 0.115654 0.104006
Up_sp Pct95 0.198886 0.261074 0.251017
Up_sp Occur 961.000000 53562.000000 54314.000000
Down_sp Mean -0.119896 -0.121331 -0.069269
Down_sp Std 0.107698 0.214995 0.139876
Down_sp IR -1.113264 -0.564343 -0.495217
Down_sp Pct5 -0.292471 -0.648977 -0.203668
Down_sp Pct25 -0.128481 -0.109396 -0.068355
Down_sp Pct50 -0.092179 -0.057195 -0.034265
Down_sp Pct75 -0.067415 -0.026451 -0.015115
Down_sp Pct95 -0.047044 -0.005004 -0.003160
Down_sp Occur 961.000000 53562.000000 54314.000000
tmb_space all_sample_space
Up_sp Mean 0.156780 0.084070
Up_sp Std 0.076756 0.092138
Up_sp IR 2.042564 0.912434
Up_sp Pct5 0.073360 0.001867
Up_sp Pct25 0.102274 0.023622
Up_sp Pct50 0.135111 0.057282
Up_sp Pct75 0.189303 0.113234
Up_sp Pct95 0.334003 0.257430
Up_sp Occur 961.000000 269680.000000
Down_sp Mean -0.201954 -0.099619
Down_sp Std 0.118146 0.190084
Down_sp IR -1.709366 -0.524077
Down_sp Pct5 -0.401497 -0.346045
Down_sp Pct25 -0.238337 -0.091120
Down_sp Pct50 -0.174540 -0.045931
Down_sp Pct75 -0.127514 -0.020540
Down_sp Pct95 -0.087007 -0.003896
Down_sp Occur 961.000000 269680.000000
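As a sanity check on how to read the IC table, the printed figures for `return_ic` are consistent with the usual relations IR = mean / std and t = IR × sqrt(n), taking n = 961 (the `occurance` count of the daily series in the return table). This is an observation about the printed output, not a statement about the library's internals; the variable names below are only for illustration.

```python
import math

ic_mean = -6.945807e-02  # "IC Mean" of return_ic, copied from the output above
ic_std = 2.594710e-01    # "IC Std." of return_ic
n_obs = 961              # "occurance" of the daily series (e.g. long_ret)

ir = ic_mean / ic_std            # compare with "Ann. IR" (-2.676911e-01)
t_stat = ir * math.sqrt(n_obs)   # compare with "t-stat(IC)" (-8.298423e+00)
print(ir, t_stat)
```

A mean IC of about -0.07 with a strongly negative t-statistic suggests that, over this sample, higher pb tended to be followed by lower 15-day returns.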
import matplotlib.pyplot as plt
obj.create_full_report()
plt.show()
Value of signals of Different Quantiles Statistics
min max mean std count count %
quantile
1 0.4286 2.7217 1.159793 0.309939 54314 20.140166
2 0.9723 4.1892 1.794878 0.471906 53937 20.000371
3 1.4054 5.7229 2.578376 0.661841 53930 19.997775
4 1.6454 9.4488 3.854879 0.986411 53937 20.000371
5 2.8008 5750.5164 8.834096 59.155352 53562 19.861317
Figure saved: E:\2018_Course\HighSchool\Final\5_因子研发工具实操Richard\output\returns_report.pdf
Information Analysis
ic
IC Mean -0.069
IC Std. 0.259
t-stat(IC) -8.298
p-value(IC) 0.000
IC Skew 0.053
IC Kurtosis -0.760
Ann. IR -0.268
Figure saved: E:\2018_Course\HighSchool\Final\5_因子研发工具实操Richard\output\information_report.pdf
<matplotlib.figure.Figure at 0x21498899f60>
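For readers new to the information coefficient, the following plain-Python sketch computes a rank IC for one cross-section, assuming the IC is a Spearman rank correlation between factor values and subsequent returns (the convention used by alphalens). The helper names `ranks`, `pearson`, and `rank_ic` and the sample data are hypothetical; in practice the library computes this per trade date and then summarizes the resulting series.

```python
import math


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def rank_ic(signal, fwd_ret):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(signal), ranks(fwd_ret))


# Hypothetical cross-section: five stocks on one trade date.
signal = [1.0, 2.0, 3.0, 4.0, 5.0]
fwd_ret = [0.05, 0.03, 0.04, 0.01, -0.02]
ic_value = rank_ic(signal, fwd_ret)
print(ic_value)
```

A value near -1 means the factor ranks stocks almost exactly opposite to their later returns, near +1 means the same order, and near 0 means no rank relationship.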
# Group analysis
from jaqs_fxdayu.research.signaldigger import performance as pfm

ic = pfm.calc_signal_ic(signal_data, by_group=True)
mean_ic_by_group = pfm.mean_information_coefficient(ic, by_group=True)

from jaqs_fxdayu.research.signaldigger import plotting

plotting.plot_ic_by_group(mean_ic_by_group)
plt.show()

# Export a 0/1 membership table for quantile 1 (the lowest-pb bucket) to Excel
excel_data = signal_data[signal_data['quantile']==1]["quantile"].unstack().replace(np.nan, 0)
print(excel_data.head())
excel_data.to_excel('./pb_quantile_1.xlsx')
symbol 000001.SZ 000002.SZ 000024.SZ 000027.SZ ...
trade_date
20140103 1.0 0.0 0.0 0.0 ...
20140106 1.0 0.0 0.0 0.0 ...
20140107 1.0 0.0 0.0 0.0 ...
20140108 1.0 0.0 0.0 0.0 ...
20140109 1.0 0.0 0.0 0.0 ...
[5 rows x 157 columns]
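The export step turns the long (trade_date, symbol) table into a wide date-by-symbol matrix that holds 1 where a stock is in quantile 1 and 0 otherwise. A rough plain-Python equivalent of that filter-then-pivot logic is sketched below; the function name `membership_matrix` and the sample records are hypothetical, and the real code relies on pandas `unstack` instead.

```python
def membership_matrix(records, quantile):
    """Pivot (trade_date, symbol, quantile) records into a 0/1 matrix.

    Only symbols that appear in `quantile` at least once become columns,
    mirroring what filtering before unstacking does.
    """
    selected = [(d, s) for d, s, q in records if q == quantile]
    dates = sorted({d for d, _ in selected})
    symbols = sorted({s for _, s in selected})
    hits = set(selected)
    rows = [[1.0 if (d, s) in hits else 0.0 for s in symbols] for d in dates]
    return dates, symbols, rows


# Hypothetical records in the same long format as signal_data.
records = [
    (20140103, "000001.SZ", 1),
    (20140103, "000002.SZ", 2),
    (20140106, "000001.SZ", 1),
    (20140106, "000027.SZ", 1),
]
dates, symbols, rows = membership_matrix(records, quantile=1)
print(symbols)
for d, row in zip(dates, rows):
    print(d, row)
```

This also explains why the exported table has fewer columns than the full universe: a stock that never falls into quantile 1 never becomes a column.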