[关闭]
@Channelchan 2018-10-11T01:02:33.000000Z 字数 20579 阅读 74778

自定义因子、事件分析

目录

  1. 两种自定义因子的方法——add_formula&append_df
  2. 什么是事件?(0/1因子)
  3. 事件选股效果的可视化
  1. from jaqs_fxdayu.data import DataView
  2. import warnings
  3. warnings.filterwarnings("ignore")
  4. dataview_folder = './Factor'
  5. dv = DataView()
  6. dv.load_dataview(dataview_folder)
Dataview loaded successfully.

1、两种自定义因子的方法——add_formula&append_df

dataview可以通过两种方式计算/添加自定义因子进行分析

方法1:add_formula 基于dataview里已有的字段,通过表达式定义因子

  1. # 直接返回
  2. dv.add_formula("momentum", "Return(close_adj, 20)", is_quarterly=False).head()
symbol 000001.SZ 000002.SZ 000008.SZ 000009.SZ 000012.SZ 000024.SZ 000027.SZ 000039.SZ 000046.SZ 000059.SZ ... 601992.SH 601997.SH 601998.SH 603000.SH 603160.SH 603288.SH 603699.SH 603858.SH 603885.SH 603993.SH
trade_date
20140102 -0.100735 -0.085812 -0.057592 -0.006342 -0.100442 -0.051708 -0.068143 0.012426 -0.074534 -0.089580 ... -0.140442 NaN -0.065375 0.104574 NaN NaN NaN NaN NaN -0.084892
20140103 -0.111690 -0.102975 -0.052910 -0.040881 -0.116740 -0.078923 -0.082474 0.048699 -0.091097 -0.111111 ... -0.167112 NaN -0.075426 0.105497 NaN NaN NaN NaN NaN -0.091437
20140106 -0.121896 -0.137255 -0.095643 -0.059129 -0.165380 -0.111576 -0.106164 0.011311 -0.098121 -0.134470 ... -0.214003 NaN -0.085575 0.132137 NaN NaN NaN NaN NaN -0.123726
20140107 -0.118271 -0.138051 -0.109342 -0.060228 -0.174342 -0.122535 -0.104991 0.039841 -0.095745 -0.139847 ... -0.200000 NaN -0.088020 0.076545 NaN NaN NaN NaN NaN -0.118594
20140108 -0.115124 -0.144175 -0.159346 -0.063224 -0.179235 -0.160665 -0.093103 0.066347 -0.081023 -0.156604 ... -0.216033 NaN -0.085575 0.118630 NaN NaN NaN NaN NaN -0.127941

5 rows × 472 columns

  1. # 添加到数据集dv里,则计算结果之后可以反复调用
  2. dv.add_formula("momentum", "Return(close_adj, 20)", is_quarterly=False, add_data=True)
  3. dv.get_ts("momentum").head()
symbol 000001.SZ 000002.SZ 000008.SZ 000009.SZ 000012.SZ 000024.SZ 000027.SZ 000039.SZ 000046.SZ 000059.SZ ... 601992.SH 601997.SH 601998.SH 603000.SH 603160.SH 603288.SH 603699.SH 603858.SH 603885.SH 603993.SH
trade_date
20140102 -0.100735 -0.085812 -0.057592 -0.006342 -0.100442 -0.051708 -0.068143 0.012426 -0.074534 -0.089580 ... -0.140442 NaN -0.065375 0.104574 NaN NaN NaN NaN NaN -0.084892
20140103 -0.111690 -0.102975 -0.052910 -0.040881 -0.116740 -0.078923 -0.082474 0.048699 -0.091097 -0.111111 ... -0.167112 NaN -0.075426 0.105497 NaN NaN NaN NaN NaN -0.091437
20140106 -0.121896 -0.137255 -0.095643 -0.059129 -0.165380 -0.111576 -0.106164 0.011311 -0.098121 -0.134470 ... -0.214003 NaN -0.085575 0.132137 NaN NaN NaN NaN NaN -0.123726
20140107 -0.118271 -0.138051 -0.109342 -0.060228 -0.174342 -0.122535 -0.104991 0.039841 -0.095745 -0.139847 ... -0.200000 NaN -0.088020 0.076545 NaN NaN NaN NaN NaN -0.118594
20140108 -0.115124 -0.144175 -0.159346 -0.063224 -0.179235 -0.160665 -0.093103 0.066347 -0.081023 -0.156604 ... -0.216033 NaN -0.085575 0.118630 NaN NaN NaN NaN NaN -0.127941

5 rows × 472 columns

附:add_formula 支持的内置公式查询方法

  1. # 完整文档
  2. dv.func_doc.doc
- 分类 说明 公式 示例
0 四则运算 加法运算 + close + open
1 四则运算 减法运算 - close - open
2 四则运算 乘法运算 * vwap * volume
3 四则运算 除法运算 / close / open
4 基本数学函数 符号函数,返回值为{-1, 0, 1} Sign(x) Sign(close-open)
5 基本数学函数 绝对值函数 Abs(x) Abs(close-open)
6 基本数学函数 自然对数 Log(x) Log(close/open)
7 基本数学函数 对x取负 -x -close
8 基本数学函数 幂函数 ^ close ^ 2
9 基本数学函数 幂函数x^y Pow(x,y) Pow(close,2)
10 基本数学函数 保持符号的幂函数,等价于Sign(x) * (Abs(x)^e) SignedPower(x,e) SignedPower(close-open, 0.5)
11 基本数学函数 取余函数 % oi % 10
12 逻辑运算 判断是否相等 == close == open
13 逻辑运算 判断是否不等 != close != open
14 逻辑运算 大于 > close > open
15 逻辑运算 小于 < close < open
16 逻辑运算 大于等于 >= close >= open
17 逻辑运算 小于等于 <= close <= open
18 逻辑运算 逻辑与 && (close > open) && (close > vwap)
19 逻辑运算 逻辑或
20 逻辑运算 逻辑非 ! !(close>open)
21 逻辑运算 判断值是否为NaN IsNan(x) IsNan(net_profit)
22 三角函数 正弦函数 Sin(x) Sin(close/open)
23 三角函数 余弦函数 Cos(x) Cos(close/open)
24 三角函数 正切函数 Tan(x) Tan(close/open)
25 三角函数 开平方函数 Sqrt(x) Sqrt(close^2 + open^2)
26 取整函数 向上取整 Ceil(x) Ceil(high)
27 取整函数 向下取整 Floor(x) Floor(low)
28 取整函数 四舍五入 Round(x) Round(close)
29 选择函数 取 x 和 y 同位置上的较大值组成新的DataFrame返回 Max(x,y) Max(close, open)
... ... ... ... ...
38 时间序列函数 - 统计 指标在过去n天的标准差 StdDev(x,n) StdDev(close/Delay(close,1)-1, 10)
39 时间序列函数 - 统计 两个指标在过去n天的协方差 Covariance(x,y,n) Covariance(close, open, 10)
40 时间序列函数 - 统计 两个指标在过去n天的相关系数 Correlation(x,y,n) Correlation(close,open, 10)
41 时间序列函数 - 统计 计算指标在过去n天的最小值 Ts_Min(x,n) Ts_Min(close,5)
42 时间序列函数 - 统计 计算指标在过去n天的最大值 Ts_Max(x,n) Ts_Max(close,5)
43 时间序列函数 - 统计 计算指标在过去n天的偏度 Ts_Skewness(x,n) Ts_Skewness(close,20)
44 时间序列函数 - 统计 计算指标在过去n天的峰度 Ts_Kurtosis(x,n) Ts_Kurtosis(close,20)
45 时间序列函数 - 排名 计算指标在过去n天的排名,返回值为名次 Ts_Rank(x, n) Ts_Rank(close, 5)
46 时间序列函数 - 排名 计算指标在过去n天的百分比,返回值为[0.0, 1.0] Ts_Percentile(x, n) Ts_Percentile(close, 5)
47 时间序列函数 - 排名 计算指标在过去n天所属的quantile,返回值为表示quantile的整数 Ts_Quantile(x, n) Ts_Quantile(close, 5)
48 时间序列函数 - 排名 指数移动平均,以halflife的衰减对x进行指数移动平均 Ewma(x, halflife) Ewma(x, 3)
49 横截面函数 - 排名 将指标值在横截面方向排名,返回值为名次 Rank(x) Rank( close/Delay(close,1)-1 ) 表示按日收益率进行排名
50 横截面函数 - 排名 按分组数据g在每组内将指标值在横截面方向排名,返回值为名次 GroupRank(x,g) GroupRank(close/Delay(close,1)-1, g) 表示按分组g根据日...
51 横截面函数 - 排名 将指标值在横截面方向排名,返回值为排名百分比 Percentile(x) Percentile(close)
52 横截面函数 - 排名 按分组数据g在每组内将指标值在横截面方向排名,返回值为排名百分比 GroupPercentile(x, g, n) GroupPercentile(close, sw1) 按申万1级行业
53 横截面函数 - 排名 和Rank函数相同,但只有 cond 中值为True的标的参与排名 ConditionRank(x, cond) GroupRank(close/Delay(close,1)-1, cond) 表示按条件c...
54 横截面函数 - 排名 根据指标值在横截面方向将标的分成n个quantile,返回值为所属quantile Quantile(x, n) Quantile( close/Delay(close,1)-1,5)表示按日收益率分为5档
55 横截面函数 - 排名 按分组数据g在每组内根据指标值在横截面方向将标的分成n个quantile,返回值为所属qua... GroupQuantile(x, g, n) GroupQuantile(close/Delay(close,1)-1,g,5) 表示按日...
56 横截面函数 - 数据处理 将指标标准化,即在横截面上减去平均值后再除以标准差 Standardize(x) Standardize(close/Delay(close,1)-1) 表示日收益率的标准化
57 横截面函数 - 数据处理 将指标横截面上去极值,用MAD (Maximum Absolute Deviation)方法... Cutoff(x, z_score) Cutoff(close,3) 表示去掉z_score大于3的极值
58 财报函数 将累计财务数据转换为单季财务数据 CumToSingle(x) CumToSingle(net_profit)
59 财报函数 从累计财务数据计算TTM的财务数据 TTM(x) TTM(net_profit)
60 其他 过去 n 天的指数衰减函数,其中 f 是平滑因子。这里 f 是平滑因子,可以赋一个小于 1 ... Decay_exp(x,f,n) Decay_exp(close,0.9,10)
61 其他 过去n天的线性衰减函数。Decay_linear(x, n) = (x[date] * n ... Decay_linear(x,n) Decay_linear(close,15)
62 其他 如果 x 的值介于 lower 和 upper,则将其设定为 newval Tail(x, lower, upper, newval) Tail(close/open, 0.99, 1.01, 1.0)
63 其他 Step(n) 为每个标的创建一个向量,向量中 n 代表最新日期,n-1 代表前一天,以此类推。 Step(n) Step(30)
64 其他 时间序列函数,计算 x 中的值在过去 n 天中为 nan (非数字)的次数 CountNans(x,n) CountNans((close-open)^0.5, 10) 表示过去10天内有几天clo...
65 时间序列函数 - 统计 计算指标在过去n天最大值的坐标 Ts_Argmax(x,n) Ts_Argmax(high,10)
66 时间序列函数 - 统计 计算指标在过去n天最小值的坐标 Ts_Argmin(x,n) Ts_Argmin(low,10)
67 技术指标 根据talib技术指标库计算x中每只股票的技术指标 Ta(ta_method,ta_column,open,high,low,close,vol... Ta('MACD','macdsignal',open,high,low,close,vol...

68 rows × 4 columns

  1. # 函数一览
  2. dv.func_doc.funcs
array(['+', '-', '*', '/', 'Sign(x)', 'Abs(x)', 'Log(x)', '-x', '^',
       'Pow(x,y)', 'SignedPower(x,e)', '%', '==', '!=', '>', '<', '>=',
       '<=', '&&', '||', '!', 'IsNan(x)', 'Sin(x)', 'Cos(x)', 'Tan(x)',
       'Sqrt(x)', 'Ceil(x)', 'Floor(x)', 'Round(x)', 'Max(x,y)',
       'Min(x,y)', 'If(cond,x,y)', 'Delay(x,n)', 'Ts_Sum(x,n)',
       'Ts_Product(x,n)', 'Delta(x,n)', 'Return(x,n,log)', 'Ts_Mean(x,n)',
       'StdDev(x,n)', 'Covariance(x,y,n)', 'Correlation(x,y,n)',
       'Ts_Min(x,n)', 'Ts_Max(x,n)', 'Ts_Skewness(x,n)',
       'Ts_Kurtosis(x,n)', 'Ts_Rank(x, n)', 'Ts_Percentile(x, n)',
       'Ts_Quantile(x, n)', 'Ewma(x, halflife)', 'Rank(x)',
       'GroupRank(x,g)', 'Percentile(x)', 'GroupPercentile(x, g, n)',
       'ConditionRank(x, cond)', 'Quantile(x, n)',
       'GroupQuantile(x, g, n)', 'Standardize(x)', 'Cutoff(x, z_score)',
       'CumToSingle(x)', 'TTM(x)', 'Decay_exp(x,f,n)', 'Decay_linear(x,n)',
       'Tail(x, lower, upper, newval)', 'Step(n)', 'CountNans(x,n)',
       'Ts_Argmax(x,n)', 'Ts_Argmin(x,n)',
       'Ta(ta_method,ta_column,open,high,low,close,volume,*args)'], dtype=object)
  1. # 函数类型
  2. dv.func_doc.types
array(['四则运算', '基本数学函数', '逻辑运算', '三角函数', '取整函数', '选择函数', '时间序列函数 - 基本数学运算',
       '时间序列函数 - 统计', '时间序列函数 - 排名', '横截面函数 - 排名', '横截面函数 - 数据处理', '财报函数',
       '其他', '技术指标'], dtype=object)
  1. # 函数描述
  2. dv.func_doc.descriptions
array(['加法运算', '减法运算', '乘法运算', '除法运算', '符号函数,返回值为{-1, 0, 1}', '绝对值函数',
       '自然对数', '对x取负', '幂函数', '幂函数x^y', '保持符号的幂函数,等价于Sign(x) * (Abs(x)^e)',
       '取余函数', '判断是否相等', '判断是否不等', '大于', '小于', '大于等于', '小于等于', '逻辑与',
       '逻辑或', '逻辑非', '判断值是否为NaN', '正弦函数', '余弦函数', '正切函数', '开平方函数', '向上取整',
       '向下取整', '四舍五入', '取 x 和 y 同位置上的较大值组成新的DataFrame返回',
       '取 x 和 y 同位置上的较小值组成新的DataFrame返回', 'cond为True取x的值,反之取y的值',
       '指标n个周期前的值', '指标在过去n天的和', '指标在过去 n 天的积', '指标当前值与n天前的值的差',
       '计算指标相比n天前的变化率,默认计算百分比变化率;当log为1时,计算对数变化率;为0时计算普通变化率',
       '计算指标在过去n天的平均值', '指标在过去n天的标准差', '两个指标在过去n天的协方差', '两个指标在过去n天的相关系数',
       '计算指标在过去n天的最小值', '计算指标在过去n天的最大值', '计算指标在过去n天的偏度', '计算指标在过去n天的峰度',
       '计算指标在过去n天的排名,返回值为名次', '计算指标在过去n天的百分比,返回值为[0.0, 1.0]',
       '计算指标在过去n天所属的quantile,返回值为表示quantile的整数',
       '指数移动平均,以halflife的衰减对x进行指数移动平均', '将指标值在横截面方向排名,返回值为名次',
       '按分组数据g在每组内将指标值在横截面方向排名,返回值为名次', '将指标值在横截面方向排名,返回值为排名百分比',
       '按分组数据g在每组内将指标值在横截面方向排名,返回值为排名百分比',
       '和Rank函数相同,但只有 cond 中值为True的标的参与排名',
       '根据指标值在横截面方向将标的分成n个quantile,返回值为所属quantile',
       '按分组数据g在每组内根据指标值在横截面方向将标的分成n个quantile,返回值为所属quantile',
       '将指标标准化,即在横截面上减去平均值后再除以标准差',
       '将指标横截面上去极值,用MAD (Maximum Absolute Deviation)方法, z_score为极值判断标准',
       '将累计财务数据转换为单季财务数据', '从累计财务数据计算TTM的财务数据',
       '过去 n 天的指数衰减函数,其中 f 是平滑因子。这里 f 是平滑因子,可以赋一个小于 1 的值。Decay_exp(x, f, n) = (x[date] + x[date - 1] * f + … +x[date – n - 1] * (f (n – 1))) / (1 + f + … + f ^ (n - 1))',
       '过去n天的线性衰减函数。Decay_linear(x, n) = (x[date] * n + x[date - 1] * (n - 1) + … + x[date – n - 1]) / (n + (n - 1) + … + 1)',
       '如果 x 的值介于 lower 和 upper,则将其设定为 newval',
       'Step(n) 为每个标的创建一个向量,向量中 n 代表最新日期,n-1 代表前一天,以此类推。',
       '时间序列函数,计算 x 中的值在过去 n 天中为 nan (非数字)的次数', '计算指标在过去n天最大值的坐标',
       '计算指标在过去n天最小值的坐标', '根据talib技术指标库计算x中每只股票的技术指标'], dtype=object)
  1. # 根据函数类型查询该类型下所有的函数
  2. dv.func_doc.search_by_type("数学函数")
- 分类 说明 公式 示例
4 基本数学函数 符号函数,返回值为{-1, 0, 1} Sign(x) Sign(close-open)
5 基本数学函数 绝对值函数 Abs(x) Abs(close-open)
6 基本数学函数 自然对数 Log(x) Log(close/open)
7 基本数学函数 对x取负 -x -close
8 基本数学函数 幂函数 ^ close ^ 2
9 基本数学函数 幂函数x^y Pow(x,y) Pow(close,2)
10 基本数学函数 保持符号的幂函数,等价于Sign(x) * (Abs(x)^e) SignedPower(x,e) SignedPower(close-open, 0.5)
11 基本数学函数 取余函数 % oi % 10
  1. # 根据函数描述查询可能符合该描述的所有的函数
  2. dv.func_doc.search_by_description("绝对值")
- 分类 说明 公式 示例
5 基本数学函数 绝对值函数 Abs(x) Abs(close-open)
  1. # 根据函数名查询该函数
  2. dv.func_doc.search_by_func("Tan",precise=True)
- 分类 说明 公式 示例
24 三角函数 正切函数 Tan(x) Tan(close/open)
  1. # 根据函数名查询该函数 -模糊查询
  2. dv.func_doc.search_by_func("Tan",precise=False)
- 分类 说明 公式 示例
24 三角函数 正切函数 Tan(x) Tan(close/open)
56 横截面函数 - 数据处理 将指标标准化,即在横截面上减去平均值后再除以标准差 Standardize(x) Standardize(close/Delay(close,1)-1) 表示日收益率的标准化

更高自由度下的自定义因子

内置的函数终究有限,add_formula可以通过事先定义并注册一些因子计算中需要的函数方法,完成更高自由度的因子计算

  1. # 定义指数平均计算函数-传入一个时间为索引,股票为columns的Dataframe,计算其指数平均序列
  2. # SMAtoday=m/n * Pricetoday + ( n-m )/n * SMAyesterday;
  3. def sma(df, n, m):
  4. a = n / m - 1
  5. r = df.ewm(com=a, axis=0, adjust=False)
  6. return r.mean()
  1. dv.add_formula("double_SMA","SMA(SMA(close_adj,3,1),3,1)",
  2. is_quarterly=False,
  3. add_data=True,
  4. register_funcs={"SMA":sma}).head()
symbol 000001.SZ 000002.SZ 000008.SZ 000009.SZ 000012.SZ 000024.SZ 000027.SZ 000039.SZ 000046.SZ 000059.SZ ... 601998.SH 603000.SH 603160.SH 603288.SH 603699.SH 603799.SH 603833.SH 603858.SH 603885.SH 603993.SH
trade_date
20140102 670.001224 866.321144 55.978992 56.156775 155.501418 96.905673 45.642678 333.108222 131.600373 7.665619 ... 4.559056 75.925974 NaN NaN NaN NaN NaN NaN NaN 6.630090
20140103 670.754419 864.895402 55.622882 56.221643 155.191833 96.852825 45.563959 336.905794 131.232323 7.624630 ... 4.561700 76.535824 NaN NaN NaN NaN NaN NaN NaN 6.604232
20140106 669.232846 858.583814 55.029013 56.140789 153.899560 96.151771 45.326884 339.232586 130.538272 7.537741 ... 4.553090 76.943164 NaN NaN NaN NaN NaN NaN NaN 6.546882
20140107 666.620239 850.197604 54.314453 55.990621 152.229109 95.018897 45.027148 341.799375 129.548772 7.426478 ... 4.539106 77.435346 NaN NaN NaN NaN NaN NaN NaN 6.481907
20140108 664.622559 841.700257 53.517639 55.718069 150.533678 93.550863 44.788736 344.012293 128.733057 7.313338 ... 4.525617 78.146377 NaN NaN NaN NaN NaN NaN NaN 6.410478

5 rows × 472 columns

方法2: append_df 构造一个因子表格(pandas.Dataframe),直接添加到dataview当中

更高自由度,只需要任意事先计算出一个因子表格,均可以添加到数据集里

  1. import pandas as pd
  2. import talib as ta
  3. close = dv.get_ts("close_adj").dropna(how='all', axis=1)
  4. slope_df = pd.DataFrame({sec_symbol: -ta.LINEARREG_SLOPE(value.values, 10) for sec_symbol, value in close.iteritems()}, index=close.index)
  5. dv.append_df(slope_df,'slope')
  6. dv.get_ts("slope").tail()
symbol 000001.SZ 000002.SZ 000008.SZ 000009.SZ 000012.SZ 000024.SZ 000027.SZ 000039.SZ 000046.SZ 000059.SZ ... 601998.SH 603000.SH 603160.SH 603288.SH 603699.SH 603799.SH 603833.SH 603858.SH 603885.SH 603993.SH
trade_date
20171225 -5.964987 -2.935803 -0.667269 0.272287 2.049269 NaN -0.072746 -4.823338 0.141595 0.071706 ... 0.007734 0.495512 1.065298 -1.027654 0.013481 1.026721 -0.610970 0.146825 -0.167882 0.164181
20171226 -7.955376 -10.674239 -1.047420 0.209359 1.611634 NaN -0.038910 -4.209824 0.223873 0.054990 ... -0.000740 0.522930 1.105218 -0.566065 -0.006355 0.692044 -0.359273 0.103882 -0.093462 0.140671
20171227 -7.844112 -22.146334 -1.124412 0.103469 1.035090 NaN 0.005921 -3.235674 0.327199 0.029917 ... -0.008968 0.560068 1.066272 -0.342089 -0.027669 0.425376 -0.055576 0.093022 -0.084378 0.096784
20171228 -6.206059 -33.122673 -0.997696 0.082896 0.316651 NaN 0.056674 -2.627907 0.543418 0.006268 ... -0.009462 0.543119 0.976696 -0.213343 -0.036400 0.043794 0.179273 0.109964 -0.121064 -0.049372
20171229 -2.905227 -37.173927 -0.941555 0.005446 -0.280804 NaN 0.025376 -2.701184 0.562552 -0.012252 ... -0.006006 0.489031 0.747401 -0.088527 -0.049817 -0.114670 0.361152 0.099849 -0.204568 -0.122450

5 rows × 472 columns

2、什么是事件?

事件是因子的一种特殊形式,用1/0/-1表示。
比方说,我们可以定义长短期均线金叉为一个事件,股价走势发生金叉记为1,死叉记为-1,其他记位0。
通过事件分析,可以测试某个事件和股票未来收益的关系。

运用上面提供的自定义因子方法,构造一个5日/10日均线的金叉事件

  1. from jaqs_fxdayu.research.signaldigger import process
  2. Open = dv.get_ts("open_adj")
  3. High = dv.get_ts("high_adj")
  4. Low = dv.get_ts("low_adj")
  5. Close = dv.get_ts("close_adj")
  6. trade_status = dv.get_ts('trade_status')
  7. mask_sus = trade_status == 0
  8. # 剔除掉停牌期的数据 再计算指标
  9. open_masked = process._mask_df(Open,mask=mask_sus)
  10. high_masked = process._mask_df(High,mask=mask_sus)
  11. low_masked = process._mask_df(Low,mask=mask_sus)
  12. close_masked = process._mask_df(Close,mask=mask_sus)
  1. from jaqs_fxdayu.data import signal_function_mod as sfm
  2. MA5 = sfm.ta(ta_method='MA',
  3. ta_column=0,
  4. Open=open_masked,
  5. High=high_masked,
  6. Low=low_masked,
  7. Close=close_masked,
  8. Volume=None,
  9. timeperiod=5)
  10. MA10 = sfm.ta('MA',Close=close_masked, timeperiod=10)
  11. dv.append_df(MA5,'MA5')
  12. dv.append_df(MA10,'MA10')
  1. # 定义金叉事件
  2. dv.add_formula("Cross","(MA5>=MA10)&&(Delay(MA5<MA10, 1))",is_quarterly=False, add_data=True)
symbol 000001.SZ 000002.SZ 000008.SZ 000009.SZ 000012.SZ 000024.SZ 000027.SZ 000039.SZ 000046.SZ 000059.SZ ... 601998.SH 603000.SH 603160.SH 603288.SH 603699.SH 603799.SH 603833.SH 603858.SH 603885.SH 603993.SH
trade_date
20140102 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
20140103 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
20140106 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
20140107 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
20140108 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
20140109 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
20140110 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
20140113 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
20140114 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
20140115 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
20140116 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 ... 0.0 0.0 NaN NaN NaN NaN NaN NaN NaN 0.0
20140117 0.0 0.0 0.0 0.0 1.0 0.0 1.0 0.0 0.0 0.0 ... 0.0 0.0 NaN NaN NaN NaN NaN NaN NaN 1.0
20140120 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 NaN NaN NaN NaN NaN NaN NaN 0.0
20140121 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 NaN NaN NaN NaN NaN NaN NaN 0.0
20140122 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 NaN NaN NaN NaN NaN NaN NaN 1.0
20140123 0.0 1.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 ... 0.0 0.0 NaN NaN NaN NaN NaN NaN NaN 0.0
20140124 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 NaN NaN NaN NaN NaN NaN NaN 0.0
20140127 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 1.0 0.0 NaN NaN NaN NaN NaN NaN NaN 1.0
20140128 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 ... 0.0 0.0 NaN NaN NaN NaN NaN NaN NaN 0.0
20140129 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 NaN NaN NaN NaN NaN NaN NaN 0.0
20140130 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 NaN NaN NaN NaN NaN NaN NaN 0.0
20140207 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 NaN NaN 0.0 NaN NaN NaN NaN 0.0
20140210 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 NaN NaN 0.0 NaN NaN NaN NaN 0.0
20140211 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 NaN NaN 0.0 NaN NaN NaN NaN 0.0
20140212 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 NaN NaN 0.0 NaN NaN NaN NaN 0.0
20140213 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 NaN NaN 0.0 NaN NaN NaN NaN 0.0
20140214 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 1.0 ... 0.0 0.0 NaN NaN 1.0 NaN NaN NaN NaN 0.0
20140217 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 NaN NaN 0.0 NaN NaN NaN NaN 0.0
20140218 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 NaN NaN 0.0 NaN NaN NaN NaN 0.0
20140219 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 NaN NaN 0.0 NaN NaN NaN NaN 0.0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
20171120 0.0 0.0 0.0 0.0 0.0 NaN 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
20171121 0.0 0.0 0.0 0.0 0.0 NaN 0.0 0.0 0.0 0.0 ... 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
20171122 0.0 0.0 0.0 0.0 0.0 NaN 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0
20171123 0.0 0.0 0.0 0.0 0.0 NaN 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
20171124 0.0 0.0 0.0 0.0 0.0 NaN 0.0 0.0 1.0 0.0 ... 0.0 0.0 1.0 1.0 0.0 1.0 0.0 0.0 0.0 0.0
20171127 0.0 0.0 0.0 0.0 0.0 NaN 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0
20171128 0.0 0.0 0.0 0.0 0.0 NaN 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
20171129 0.0 0.0 0.0 0.0 0.0 NaN 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
20171130 0.0 0.0 0.0 1.0 0.0 NaN 0.0 0.0 0.0 1.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
20171201 0.0 0.0 0.0 0.0 0.0 NaN 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
20171204 0.0 0.0 0.0 0.0 0.0 NaN 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0
20171205 0.0 0.0 0.0 0.0 0.0 NaN 0.0 1.0 0.0 0.0 ... 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
20171206 0.0 0.0 1.0 0.0 0.0 NaN 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
20171207 0.0 0.0 0.0 0.0 0.0 NaN 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0
20171208 0.0 0.0 0.0 0.0 0.0 NaN 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
20171211 0.0 0.0 0.0 0.0 0.0 NaN 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0
20171212 0.0 0.0 0.0 0.0 0.0 NaN 0.0 1.0 0.0 0.0 ... 0.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0
20171213 0.0 0.0 0.0 0.0 0.0 NaN 0.0 0.0 1.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0
20171214 1.0 0.0 0.0 1.0 0.0 NaN 1.0 0.0 0.0 0.0 ... 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0
20171215 0.0 0.0 0.0 0.0 0.0 NaN 0.0 0.0 0.0 0.0 ... 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
20171218 0.0 0.0 0.0 0.0 0.0 NaN 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
20171219 0.0 0.0 0.0 0.0 0.0 NaN 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
20171220 0.0 0.0 1.0 0.0 0.0 NaN 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
20171221 0.0 0.0 0.0 0.0 0.0 NaN 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
20171222 1.0 0.0 0.0 0.0 0.0 NaN 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
20171225 0.0 1.0 0.0 0.0 0.0 NaN 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
20171226 0.0 0.0 0.0 0.0 0.0 NaN 0.0 0.0 0.0 0.0 ... 1.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0
20171227 0.0 0.0 0.0 0.0 0.0 NaN 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
20171228 0.0 0.0 0.0 0.0 0.0 NaN 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 1.0
20171229 0.0 0.0 0.0 1.0 1.0 NaN 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

917 rows × 472 columns

事件分析

  1. import numpy as np
  2. #定义信号过滤条件-非指数成分
  3. def mask_index_member():
  4. df_index_member = dv.get_ts('index_member')
  5. mask_index_member = df_index_member == 0
  6. return mask_index_member
  7. # 定义可买卖条件——未停牌、未涨跌停
  8. def limit_up_down():
  9. trade_status = dv.get_ts('trade_status')
  10. mask_sus = trade_status == 0
  11. # 涨停
  12. dv.add_formula('up_limit', '(close - Delay(close, 1)) / Delay(close, 1) > 0.095', is_quarterly=False, add_data=True)
  13. # 跌停
  14. dv.add_formula('down_limit', '(close - Delay(close, 1)) / Delay(close, 1) < -0.095', is_quarterly=False, add_data=True)
  15. can_enter = np.logical_and(dv.get_ts('up_limit') < 1, ~mask_sus) # 未涨停未停牌
  16. can_exit = np.logical_and(dv.get_ts('down_limit') < 1, ~mask_sus) # 未跌停未停牌
  17. return can_enter,can_exit
  18. mask = mask_index_member()
  19. can_enter,can_exit = limit_up_down()
  1. from jaqs_fxdayu.research import SignalDigger
  2. obj = SignalDigger(output_folder='./output',
  3. output_format='pdf')
  4. # 和处理因子的步骤一样 n_quantiles=1
  5. # 不传入benchmark 可以分析绝对收益
  6. obj.process_signal_before_analysis(signal=dv.get_ts("Cross"),
  7. price=dv.get_ts("close_adj"),
  8. high=dv.get_ts("high_adj"), # 可为空
  9. low=dv.get_ts("low_adj"),# 可为空
  10. n_quantiles=1,# quantile分类数
  11. mask=mask,# 过滤条件
  12. can_enter=can_enter,# 是否能进场
  13. can_exit=can_exit,# 是否能出场
  14. period=15,# 持有期
  15. # benchmark_price=dv.data_benchmark, # 基准价格 可不传入,持有期收益(return)计算为绝对收益
  16. commission = 0.0008,
  17. )
  18. signal_data = obj.signal_data
  19. signal_data.head()
Nan Data Count (should be zero) : 0;  Percentage of effective data: 58%
signal return upside_ret downside_ret quantile
trade_date symbol
20140117 000001.SZ 0.0 0.051465 0.088050 -0.033901 1
000002.SZ 0.0 0.040160 0.119256 -0.006450 1
000009.SZ 0.0 0.264483 0.322153 -0.012334 1
000012.SZ 0.0 0.097084 0.130152 -0.008737 1
000024.SZ 0.0 0.064774 0.155523 -0.008997 1
  1. from jaqs_fxdayu.research.signaldigger.analysis import analysis
  2. result = analysis(signal_data, is_event=True, period=15)
  3. print("——选股收益分析——")
  4. print(result["ret"])
  5. print("——最大潜在盈利/亏损分析——")
  6. print(result["space"])
    ——选股收益分析——
               long_ret  long_short_ret  all_sample_ret
t-stat        12.598541       -2.532330       53.562672
p-value        0.000000        0.011500        0.000000
skewness       0.687618        0.384795        0.541517
kurtosis       5.757444        6.439517        6.210530
Ann. Ret       0.172894       -0.029187        0.181235
Ann. Vol       0.422764        0.085846        0.434142
Ann. IR        0.408960       -0.339994        0.417456
occurance  15312.000000      896.000000   265600.000000
——最大潜在盈利/亏损分析——
                 long_space  all_sample_space
Up_sp Mean         0.082264          0.084310
Up_sp Std          0.089366          0.092245
Up_sp IR           0.920522          0.913982
Up_sp Pct5         0.001019          0.001947
Up_sp Pct25        0.022871          0.023772
Up_sp Pct50        0.056777          0.057538
Up_sp Pct75        0.111500          0.113509
Up_sp Pct95        0.250879          0.257973
Up_sp Occur    15312.000000     265600.000000
Down_sp Mean      -0.098562         -0.099719
Down_sp Std        0.187441          0.190518
Down_sp IR        -0.525831         -0.523410
Down_sp Pct5      -0.330871         -0.347779
Down_sp Pct25     -0.092257         -0.091053
Down_sp Pct50     -0.046429         -0.045785
Down_sp Pct75     -0.021270         -0.020408
Down_sp Pct95     -0.004611         -0.003867
Down_sp Occur  15312.000000     265600.000000
  1. # 可视化
  2. import matplotlib.pyplot as plt
  3. obj.create_full_report()
  4. plt.show()
    Value of signals of Different Quantiles Statistics
          min  max      mean       std   count  count %
quantile                                               
1         0.0  1.0  0.057651  0.233082  265600    100.0
Figure saved: E:\2018_Course\HighSchool\Final\5_因子研发工具实操Richard\output\returns_report.pdf
Information Analysis
                ic
IC Mean     -0.005
IC Std.      0.084
t-stat(IC)  -1.615
p-value(IC)  0.107
IC Skew     -0.217
IC Kurtosis  1.232
Ann. IR     -0.054
Figure saved: E:\2018_Course\HighSchool\Final\5_因子研发工具实操Richard\output\information_report.pdf
<matplotlib.figure.Figure at 0x1d0b7982860>

output_28_2.png-251.8kB

output_28_3.png-160.2kB

添加新批注
在作者公开此批注前,只有你和作者可见。
回复批注