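@Channelchan 2018-10-23T18:07:28.000000Z Word count 21034 Views 74193

Technical Factors, Category 4: Return and Risk

To store a factor in the data view, change add_data=False to True. The factor name can be changed freely. Where close_adj is not available for part of the data, use close instead; the same applies to high and low. N is a parameter that you can set yourself.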

Variance20 H010001A

Factor description: 20-day variance of returns.

Calculation

StdDev(Return(close,1),N)^2*250

N = 20, 60, 120, etc.
Note: the factor value is annualized, i.e. the daily variance multiplied by 250.

    Variance20_J = dv.add_formula('Variance20_J', 'StdDev(Return(close,1),20)^2*250',
                                  is_quarterly=False, add_data=False)
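
For readers without the `dv` data view at hand, the same quantity can be sketched in plain pandas. This is only an illustration: the function name `rolling_annualized_variance` is hypothetical, and pandas uses the sample variance (ddof=1), which may or may not match the convention behind `StdDev`.

```python
import pandas as pd


def rolling_annualized_variance(close: pd.Series, n: int = 20, periods: int = 250) -> pd.Series:
    """Rolling variance of simple daily returns, scaled by `periods` to an annual figure."""
    daily_ret = close.pct_change()          # (close_t - close_{t-1}) / close_{t-1}
    return daily_ret.rolling(n).var() * periods


# Toy price series: +10%, -10%, +10%
close = pd.Series([100.0, 110.0, 99.0, 108.9])
variance = rolling_annualized_variance(close, n=3)
```

The first `n` entries are NaN because a full window of `n` returns is not yet available.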

Kurtosis20 H010004A

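Factor description: 20-day kurtosis of individual stock returns.

Calculation

Ts_Kurtosis(Return(close_adj,1),N)

or

Ts_Kurtosis(((close_adj-Delay(close_adj,1))/Delay(close_adj,1)),N)

Either form works. N = 20, 60, 120, etc.
In the formula:
r is the daily return
σ is the standard deviation of returns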

    Kurtosis20_j = dv.add_formula('Kurtosis20_j', 'Ts_Kurtosis(Return(close_adj,1),20)',
                                  is_quarterly=False, add_data=False)

Alpha20 H010007A

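Factor description: 20-day Jensen's alpha.

Calculation

alpha = (E(r) - rf) - beta * E(rm - rf), where r is the daily return, rm is the index return, rf is the risk-free return, and beta is the 20-day beta of returns: beta = cov(r, rm) / var(rm).

betaN_J=Covariance(r_J,nr,N)/(StdDev(nr,N)^2)

AlphaN=(Ts_Mean(r_J-0.01,N) - betaN_J*(Ts_Mean((nr-0.01),N)))*250

N can be changed in the two expressions above.

N = 20, 60, 120, etc.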

    # Fetch the CSI 300 (000300.SH) close as the market benchmark.
    hs300_close = dv.data_api.daily('000300.SH', dv.extended_start_date_d, dv.end_date, fields="close",
                                    adjust_mode=None)
    hs300_benchmark = hs300_close[0][['trade_date', 'close']].set_index('trade_date')
    dv.add_field("close")
    # Copy the index close into every stock column so it can be used inside formulas.
    hs300 = 0 * dv.get_ts('close')
    for i in range(hs300.shape[1]):
        hs300.iloc[:, i] = hs300_benchmark['close']
    dv.append_df(hs300, 'hs300')
    nr = dv.add_formula('nr', '(hs300-Delay(hs300,1))/Delay(hs300,1)', is_quarterly=False, add_data=True)
    r_J = dv.add_formula('r_J', '(close-Delay(close,1))/Delay(close,1)', is_quarterly=False, add_data=True)
    beta20_J = dv.add_formula('beta20_J', 'Covariance(r_J,nr,20)/(StdDev(nr,20)^2)', is_quarterly=False, add_data=True)
    Alpha20_J = dv.add_formula('Alpha20_J', "(Ts_Mean(r_J-0.01,20) - beta20_J*(Ts_Mean((nr-0.01),20)))*250", is_quarterly=False, add_data=False)

Beta20 H010010A

Factor description: 20-day beta.

Calculation
r is the daily return, rm is the index return, and beta is the 20-day beta of returns: beta = cov(r, rm) / var(rm).

betaN_J=Covariance(r_J,nr,N)/(StdDev(nr,N)^2)

N = 20, 60, 120, 250, etc.

    # Fetch the CSI 300 (000300.SH) close as the market benchmark.
    hs300, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date, fields='trade_date,close,open')
    hs300_benchmark = hs300[['trade_date', 'close']].set_index('trade_date')
    dv.add_field("close")
    # Copy the index close into every stock column so it can be used inside formulas.
    hs300 = 0 * dv.get_ts('close')
    for i in range(hs300.shape[1]):
        hs300.iloc[:, i] = hs300_benchmark['close']
    dv.append_df(hs300, 'hs300')
    nr = dv.add_formula('nr', '(hs300-Delay(hs300,1))/Delay(hs300,1)', is_quarterly=False, add_data=True)
    r_J = dv.add_formula('r_J', '(close_adj-Delay(close_adj,1))/Delay(close_adj,1)', is_quarterly=False, add_data=True)
    beta20_J = dv.add_formula('beta20_J', 'Covariance(r_J,nr,20)/(StdDev(nr,20)^2)', is_quarterly=False, add_data=True)
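
The definition beta = cov(r, rm) / var(rm) can also be checked outside the data view. The sketch below is a minimal pandas version under the assumption that stock and market returns are two aligned Series; the helper name `rolling_beta` is made up for this example.

```python
import pandas as pd


def rolling_beta(stock_ret: pd.Series, market_ret: pd.Series, n: int = 20) -> pd.Series:
    """beta = cov(r, rm) / var(rm) over a trailing window of n observations."""
    cov = stock_ret.rolling(n).cov(market_ret)   # rolling sample covariance
    var = market_ret.rolling(n).var()            # rolling sample variance
    return cov / var


# A stock that always moves twice as much as the market has beta 2.
market = pd.Series([0.01, -0.02, 0.03, 0.00, 0.015])
stock = 2 * market + 0.001
beta = rolling_beta(stock, market, n=3)
```

Both the covariance and the variance use the same ddof, so the choice of convention cancels in the ratio.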

SharpeRatio20 H010014A

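Factor description: 20-day Sharpe ratio. It shows how much excess return is earned per unit of total risk, so that a strategy's return and risk can be assessed together.

Calculation
Sharpe = (E(r) - r_f) / σ

(Ts_Mean(close_ret,N)*250-0.03)/StdDev(close_ret,N)/Sqrt(250)

N = 20, 60, 120, etc.

Where:
E(r) is the expected return, equal to the mean daily return * 250
r_f is the risk-free rate, taken as of the calculation date (same below)
σ is the standard deviation of returns, equal to the daily standard deviation * sqrt(250)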

    dv.add_formula("close_ret", "Return(close_adj,1)", is_quarterly=False, add_data=True)
    SharpeRatio20 = dv.add_formula('SharpeRatio20_J', "(Ts_Mean(close_ret,20)*250-0.03)/StdDev(close_ret,20)/Sqrt(250)", is_quarterly=False, add_data=True)
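
As a rough cross-check of the annualization, here is a small pandas sketch of the same ratio. The function name `rolling_sharpe` is hypothetical, and the 3% annual risk-free rate is simply the constant used in the formula above.

```python
import numpy as np
import pandas as pd


def rolling_sharpe(daily_ret: pd.Series, n: int = 20,
                   rf_annual: float = 0.03, periods: int = 250) -> pd.Series:
    """(annualized mean return - risk-free rate) / annualized volatility."""
    mean_ann = daily_ret.rolling(n).mean() * periods          # E(r)
    vol_ann = daily_ret.rolling(n).std() * np.sqrt(periods)   # sigma
    return (mean_ann - rf_annual) / vol_ann


daily_ret = pd.Series([0.01, 0.03, 0.02])
sharpe = rolling_sharpe(daily_ret, n=3)
```

Dividing by `StdDev` and then by `Sqrt(250)`, as the formula string does, is the same as dividing by their product.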

TreynorRatio20 H010017A

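Factor description: 20-day Treynor ratio, which measures the excess return earned per unit of systematic risk (beta).

Calculation

TR = (E(r)-Rf)/β

r is the daily return, E(r) is the expected return, Rf is the risk-free return, and beta is the risk measure of the returns.

The factor value is annualized, i.e. the daily value multiplied by 250.

betaN_J=Covariance(r_J,nr,N)/(StdDev(nr,N)^2)

TRN_J=(250*(Ts_Mean(r_J,N))-0.03)/betaN_J

N = 20, 60, 120, etc.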

    # Fetch the CSI 300 (000300.SH) close as the market benchmark.
    hs300, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date, fields='trade_date,close,open')
    hs300_benchmark = hs300[['trade_date', 'close']].set_index('trade_date')
    # Copy the index close into every stock column so it can be used inside formulas.
    hs300 = 0 * dv.get_ts('close')
    for i in range(hs300.shape[1]):
        hs300.iloc[:, i] = hs300_benchmark['close']
    dv.append_df(hs300, 'hs300')
    dv.add_formula('nr', '(hs300-Delay(hs300,1))/Delay(hs300,1)',
                   is_quarterly=False, add_data=True)
    dv.add_formula('r_J', '(close_adj-Delay(close_adj,1))/Delay(close_adj,1)',
                   is_quarterly=False, add_data=True)
    beta20_J = dv.add_formula('beta20_J', 'Covariance(r_J,nr,20)/(StdDev(nr,20)^2)', is_quarterly=False,
                              add_data=True)
    TR20_J = dv.add_formula('TR20_J', '(250*(Ts_Mean(r_J,20))-0.03)/beta20_J',
                            is_quarterly=False, add_data=True)

InformationRatio20 H010020A

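Factor description: 20-day information ratio.

Calculation
IR = E(r - r_M) / σ(r - r_M)

Ts_Mean(r_J - nr,N)/StdDev(r_J - nr,N)

N = 20, 60, 120, etc.

Where:
r is the daily return
r_M is the index return, using the CSI 300 index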

    # Fetch the CSI 300 (000300.SH) close as the market benchmark.
    hs300, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date, fields='trade_date,close,open')
    hs300_benchmark = hs300[['trade_date', 'close']].set_index('trade_date')
    # Copy the index close into every stock column so it can be used inside formulas.
    hs300 = 0 * dv.get_ts('close')
    for i in range(hs300.shape[1]):
        hs300.iloc[:, i] = hs300_benchmark['close']
    dv.append_df(hs300, 'hs300')
    nr = dv.add_formula('nr', '(hs300-Delay(hs300,1))/Delay(hs300,1)', is_quarterly=False, add_data=True)
    r_J = dv.add_formula('r_J', '(close_adj-Delay(close_adj,1))/Delay(close_adj,1)', is_quarterly=False, add_data=True)
    IR20_J = dv.add_formula('IR20_J', 'Ts_Mean(r_J - nr,20)/StdDev(r_J - nr,20)', is_quarterly=False,
                            add_data=True)

GainVariance20 H010023A

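Factor description: 20-day gain variance. It is similar to the ordinary variance but measures only the behaviour of gains (positive returns).

Calculation
[formula image]

GainVariance20_J = pd.DataFrame({name: value.dropna().rolling(N).std() ** 2 for name, value in pct_return.items()},index=pct_return.index).ffill()

N = 20, 60, 120, etc.

Where:
r is the daily return
Note: the factor value is annualized, i.e. the daily value * 250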

    import pandas as pd

    def cal_positive(df):
        # Keep only positive returns; all other entries become NaN.
        return df[df > 0]

    dv.add_field("close_adj")
    pct_return = cal_positive(dv.get_ts('close_adj').pct_change())
    # Rolling variance over each stock's last 20 positive returns, forward-filled onto every date.
    GainVariance20_J = pd.DataFrame(
        {name: value.dropna().rolling(20).std() ** 2 for name, value in pct_return.items()},
        index=pct_return.index).ffill()
    dv.append_df(GainVariance20_J, 'GainVariance20_J')

LossVariance20 H010026A

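Factor description: 20-day loss variance. It is similar to the ordinary variance but measures only the behaviour of losses (negative returns).

Calculation
[formula image]

LossVariance20_A = pd.DataFrame({name: value.dropna().rolling(N).std() ** 2 for name, value in pct_return.items()},index=pct_return.index).ffill()

N = 20, 60, 120, etc.

Where:
r is the daily return
Note: the factor value is annualized, i.e. the daily value * 250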

    import pandas as pd

    # Keep only negative returns; all other entries become NaN.
    cal_negative = lambda df: df[df < 0]
    dv.add_field("close_adj")
    pct_return = cal_negative(dv.get_ts('close_adj').pct_change())
    # Rolling variance over each stock's last 20 negative returns, forward-filled onto every date.
    LossVariance20_A = pd.DataFrame(
        {name: value.dropna().rolling(20).std() ** 2 for name, value in pct_return.items()},
        index=pct_return.index).ffill()

GainLossVarianceRatio20 H010029A

Factor description: 20-day ratio of gain variance to loss variance.

Calculation
[formula image]

GainVariance N_J/LossVariance N_J

N = 20, 60, 120, etc.

Where:
r is the daily return

    import pandas as pd

    def cal_negative(df):
        return df[df < 0]

    def cal_positive(df):
        return df[df > 0]

    dv.add_field("close_adj")
    # Loss variance: rolling variance of the last 20 negative returns.
    pct_return = cal_negative(dv.get_ts('close_adj').pct_change())
    LossVariance20_J = pd.DataFrame(
        {name: value.dropna().rolling(20).std() ** 2 for name, value in pct_return.items()},
        index=pct_return.index).ffill()
    dv.append_df(LossVariance20_J, 'LossVariance20_J')
    # Gain variance: rolling variance of the last 20 positive returns.
    pct_return = cal_positive(dv.get_ts('close_adj').pct_change())
    GainVariance20_J = pd.DataFrame(
        {name: value.dropna().rolling(20).std() ** 2 for name, value in pct_return.items()},
        index=pct_return.index).ffill()
    dv.append_df(GainVariance20_J, 'GainVariance20_J')
    # add_data=True already stores the result, so no extra append_df call is needed.
    GainlossVarianceratio20 = dv.add_formula('GainlossVarianceratio20_J', "GainVariance20_J/LossVariance20_J", is_quarterly=False, add_data=True)
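
The gain/loss split above can be tried on a single return series without the data view. The following is a minimal sketch that mirrors the listing's logic (filter by sign, roll over the surviving observations, forward-fill back onto all dates); `one_sided_variance` is a hypothetical helper name, and no annualization is applied here.

```python
import pandas as pd


def one_sided_variance(ret: pd.Series, n: int = 20, side: str = "gain") -> pd.Series:
    """Rolling variance of the last n positive (or negative) returns, forward-filled."""
    picked = ret[ret > 0] if side == "gain" else ret[ret < 0]
    var = picked.rolling(n).var()              # window counts only same-sign days
    return var.reindex(ret.index).ffill()      # carry the last value to other days


ret = pd.Series([0.01, -0.02, 0.03, -0.04, 0.05, -0.06])
gain_var = one_sided_variance(ret, n=2, side="gain")
loss_var = one_sided_variance(ret, n=2, side="loss")
ratio = gain_var / loss_var
```

Note that the window is counted in same-sign observations, not calendar days, so a 20-observation gain window usually spans more than 20 trading days.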

RealizedVolatility H010032A

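Factor description: realized volatility, the standard deviation of intraday 5-minute returns.

Calculation: use the close of the 5-minute bars to compute the return of each 5-minute interval, then take the standard deviation of those returns within the day.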

    import pandas as pd

    def get_daily_value(date):
        print(date)
        data, msg = dv.data_api.bar(",".join(dv.symbol),
                                    trade_date=date, freq="5M")
        try:
            data = data.dropna().pivot(index="time", columns="symbol", values="close")
            data = data.groupby(data.index // 500).first()
        except ValueError:
            print(date)
            raise
        # Standard deviation of the intraday 5-minute returns, as stated in the description.
        return data.pct_change().std().rename(date)

    # Runtime depends heavily on how quickly the data requests are served.
    dv.add_field("close")
    dates = list(dv.get_ts("close").index)
    result = pd.concat(map(get_daily_value, dates), axis=1).T
    dv.append_df(df=result, field_name="RealizedVolatility", is_quarterly=False)

DASTD H010033A

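Factor description: 252-day standard deviation of excess returns.

Calculation
DASTD=std(r-rf)

r is the daily return and rf is the risk-free return; the half-life is 42 trading days.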

    r_J = dv.add_formula('r_J', '(close_adj-Delay(close_adj,1))/Delay(close_adj,1)', is_quarterly=False, add_data=True)
    # Exponentially weighted standard deviation with a 42-day half-life (rf treated as 0).
    # If a fixing rate is used as rf, remember to convert between annual and daily rates.
    dastd = r_J.ewm(halflife=42).std()
    dv.append_df(dastd, 'DASTD')
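
To illustrate the half-life weighting in isolation, here is a small pandas sketch. It assumes a constant daily risk-free rate (`rf_daily`, an illustrative parameter) and, like the listing above, weights the whole history rather than truncating it at 252 days.

```python
import pandas as pd


def dastd(daily_ret: pd.Series, halflife: int = 42, rf_daily: float = 0.0) -> pd.Series:
    """Exponentially weighted std of daily excess returns; weights halve every `halflife` days."""
    excess = daily_ret - rf_daily
    return excess.ewm(halflife=halflife).std()


r = pd.Series([0.01, -0.02, 0.015, 0.0, -0.005])
base = dastd(r)
shifted = dastd(r, rf_daily=0.001)   # a constant rf shifts the mean, not the dispersion
doubled = dastd(2 * r)
```

A constant risk-free rate leaves the result unchanged, which is why treating rf as 0 is harmless as long as rf does not vary over the window.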

HsigmaCNE5 H010034A

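Factor description: 252-day residual return volatility.

Calculation
HsigmaCNE5=std(ei)

ei is the residual return; 252 trading days are used in total, with a half-life of 63 trading days.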

    from numpy.lib.stride_tricks import as_strided as strided
    import numpy as np
    import pandas as pd

    start_date_delta = 300
    hs300, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date, fields='trade_date,close')
    hs300_benchmark = hs300[['trade_date', 'close']].set_index('trade_date').pct_change()

    def get_sliding_window(df, W, return2D=0):
        # Build all overlapping windows of W rows as a strided view (no data copy).
        a = df.values
        s0, s1 = a.strides
        m, n = a.shape
        out = strided(a, shape=(m - W + 1, W, n), strides=(s0, s0, s1))
        if return2D == 1:
            return out.reshape(m - W + 1, -1)
        else:
            return out

    def exponential_weight(trailing_days=252, half_life=63):
        # The most recent observation gets weight 1; weights halve every half_life days.
        _s = np.flip(np.arange(0, trailing_days), axis=0) / half_life
        return np.power(0.5, _s)

    def wls_by_numpy(x, y, w):
        # Weighted least squares of y on x with an intercept.
        A = np.vstack([x, np.ones(len(x))]).T * w.reshape(-1, 1)
        _y = y * w
        m, c = np.linalg.lstsq(A, _y, rcond=None)[0]
        resid = y - (m * x + c)
        return m, c, resid

    def _beta(stock_return, universe_return, trailing_days=252, half_life=63):
        weights = np.sqrt(exponential_weight(trailing_days, half_life))
        coef, _, resid = wls_by_numpy(universe_return, stock_return, weights)
        return coef, np.std(resid)  # alternative: np.sqrt(np.sum(resid**2)/250)

    def calc_hsigmacne5(s):
        idx = s.index
        s = s.dropna()
        if s.size == 0 or len(s) < 252:
            return np.nan
        concated = pd.concat([s, hs300_benchmark.reindex(index=s.index)], axis=1)
        strides = get_sliding_window(concated, 252)
        hsigma = pd.Series(list(map(lambda x: _beta(x[:, 0], x[:, 1])[1], strides)),
                           index=s.index[252 - 1:]).reindex(index=idx)
        return hsigma

    ret_ = dv.add_formula("ret_", "Return(close_adj)", is_quarterly=False)
    hsigmacne5 = ret_.apply(calc_hsigmacne5)
    dv.append_df(hsigmacne5, 'hsigmacne5')

CmraCNE5 H010035A

Factor description: 12-month cumulative return range (Monthly cumulative return range over the past 12 months).

Calculation

[formula image]
Where: rf is the risk-free return

    def run_formula(dv):
        ret_ = dv.add_formula("ret", "Return(close_adj)", is_quarterly=False)
        month_days = 21

        # Rf is treated as 0 for now.
        def get_sliding_window(s, W):
            """
            input: np.arange(20), W=4
            output:
            array([[ 0,  1,  2,  3],
                   [ 4,  5,  6,  7],
                   [ 8,  9, 10, 11],
                   [12, 13, 14, 15],
                   [16, 17, 18, 19]])
            """
            from numpy.lib.stride_tricks import as_strided as strided
            assert len(s) % W == 0
            strides = s.strides
            assert len(strides) == 1
            strides = strides[0]
            return strided(s, shape=(int(len(s) / W), W), strides=(W * strides, strides))

        def calc_range(s, days=month_days, allrange=month_days * 12):
            """
            Missing values still need to be handled.
            """
            def get_max_and_min(s):
                from numpy import log
                s = s + 1
                out = get_sliding_window(s, days)
                out = out.prod(axis=1)  # gross return of each 21-day "month"
                # return out.max()-out.min()  # a similar, simpler form; it suggests the uqer definition is unreliable
                Z_T = log(out).cumsum()  # the risk-free return is not subtracted here
                return log((1 + Z_T.max()) / (1 + Z_T.min()))
            # raw=True passes a NumPy array, which the strided helper requires.
            return s.rolling(allrange).apply(get_max_and_min, raw=True)

        CmraCNE5 = ret_.apply(calc_range)
        dv.append_df(CmraCNE5, 'CmraCNE5')
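
The core of the calculation for a single window can be written more compactly. The sketch below is an illustration of the same steps as the listing (21-day "months", cumulative log return Z, then the log of the range), with the risk-free return again treated as 0; the function name `cmra` and its parameters are assumptions for this example.

```python
import numpy as np


def cmra(daily_ret, month_days: int = 21, months: int = 12) -> float:
    """ln(1 + max Z) - ln(1 + min Z), where Z is the cumulative monthly log return."""
    r = np.asarray(daily_ret, dtype=float)
    if r.size < month_days * months:
        raise ValueError("not enough observations for the requested window")
    r = r[-month_days * months:]                                   # most recent window
    monthly_gross = (1.0 + r).reshape(months, month_days).prod(axis=1)
    z = np.log(monthly_gross).cumsum()                             # Z(1), ..., Z(months)
    return float(np.log((1.0 + z.max()) / (1.0 + z.min())))


flat = cmra(np.zeros(21 * 12))                       # no movement, no range
two_month = cmra([0.10, -0.05], month_days=1, months=2)
```

The result is undefined when 1 + min Z is not positive, which a production version would need to guard against.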

Cmra H010036A

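Factor description: 24-month cumulative return range (Monthly cumulative return range over the past 24 months).

Calculation

[formula image]
Days with zero trading volume are excluded from the calculation.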

    CMRA = dv.add_formula('CMRA_J', "Log(Ts_Max(close_adj,475)/Ts_Min(close_adj,475))",
                          is_quarterly=False, add_data=False)

Hbeta H010037A

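Factor description: historical beta (Historical daily beta). Over the past 12 months, the stock's daily return is regressed on the market portfolio's daily return together with three autoregressive lags of the stock's own return; the factor is the coefficient on the market return.

The residual variance of the regression divided by the degrees of freedom.

Calculation

[formula image]
The market portfolio's daily return r_m.t is computed from CSI 300 data:
r_m.t=(CloseIndex-PrevCloseIndex)/PrevCloseIndex
The β_h in the regression result is the historical beta HBETA.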

    def Multi_Regression(index, n, Y, *X):
        """
        index: which coefficient to return, counted from 0 (0 is the intercept, 1 the first regressor)
        n: rolling window length in days
        Y: matrix (DataFrame) of the dependent variable
        *X: the regressors, each element being one matrix (DataFrame) or a Series shared by all stocks
        """
        from numpy.linalg import inv, LinAlgError
        from pandas import DataFrame, Series
        import numpy as np
        import pandas as pd
        DF = dict()
        le_th = len(Y)
        columns = Y.columns
        indexes = Y.index
        for column in columns:
            betas = []

            def _func_x(x):
                if isinstance(x, DataFrame):
                    return x[column].values
                elif isinstance(x, Series):
                    return x.values
                else:
                    raise TypeError("each regressor must be a DataFrame or a Series")

            X_column = list(map(_func_x, X))
            X_column.insert(0, np.ones(le_th))  # intercept column
            X_column = np.array(X_column).T
            Y_column = np.array(Y[column].values).T
            print(column)
            for length in range(n, le_th):
                X_temp = X_column[length - n:length]
                try:
                    # Ordinary least squares: (X'X)^-1 X'y
                    beta = (inv((X_temp.T).dot(X_temp))).dot(X_temp.T).dot(Y_column[length - n:length])
                    betas.append(beta[index])
                except LinAlgError:
                    betas.append(np.nan)
            DF[column] = betas
        # Pad with NaN so every column has the same length as the index.
        for key, value in DF.items():
            if len(value) != le_th - n:
                DF[key] += [np.nan] * (le_th - n - len(DF[key]))
            if len(value) != le_th:
                DF[key] = [np.nan] * n + DF[key]
        df = pd.DataFrame(DF, index=indexes)
        return df

    dv.add_field("close_adj")
    # Note: the variable is named zz800, but the data requested is the CSI 300 (000300.SH).
    zz800_benchmark, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date,
                                             fields='trade_date,close,open')
    dv.data_benchmark = zz800_benchmark[['trade_date', 'close', 'open']].set_index('trade_date')
    BenchmarkIndexClose = dv.data_benchmark["close"].loc[dv.get_ts("close_adj").index]
    BenchmarkIndexOpen = dv.data_benchmark["open"].loc[dv.get_ts("close_adj").index]
    # risk_free_rate=-0.005
    M_Return = BenchmarkIndexClose.pct_change(1)
    Return = dv.get_ts("close_adj").pct_change(1)
    Return_1 = Return.shift(1)
    Return_2 = Return.shift(2)
    Return_3 = Return.shift(3)
    # 252 is the standard window parameter, but it only reaches 0.89.
    df = Multi_Regression(1, 252, Return, M_Return, Return_1, Return_2, Return_3)
    dv.append_df(df, "HBETA_J", overwrite=True, is_quarterly=False)
    HBETA = dv.get_ts("HBETA_J")

Hsigma H010038A

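Factor description: historical volatility (Historical daily sigma). Over the past 12 months, the stock's daily return is regressed on the market portfolio's daily return together with three autoregressive lags of the stock's own return; the factor is the residual standard deviation of that regression.

Calculation

[formula image]
The market portfolio's daily return r_m.t is computed from CSI 300 data:
r_m.t=(CloseIndex-PrevCloseIndex)/PrevCloseIndex
[formula image]
With p = 4, the resulting σ is the historical volatility HSIGMA.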

    # start_date_delta = 500  # at least 2 years of data are needed for the calculation
    def Multi_Regression(n, Y, *X):
        from numpy.linalg import inv, LinAlgError
        from pandas import DataFrame, Series
        import numpy as np
        import pandas as pd
        DF = dict()
        le_th = len(Y)
        columns = Y.columns
        indexes = Y.index
        p = 4  # number of regressors
        for column in columns:
            residuals = []

            def _func_x(x):
                if isinstance(x, DataFrame):
                    return x[column].values
                elif isinstance(x, Series):
                    return x.values
                else:
                    raise TypeError("each regressor must be a DataFrame or a Series")

            X_column = list(map(_func_x, X))
            X_column.insert(0, np.ones(le_th))  # intercept column
            X_column = np.array(X_column).T
            Y_column = np.array(Y[column].values).T
            for length in range(n, le_th):
                X_temp = X_column[length - n:length]
                Y_temp = Y_column[length - n:length]
                try:
                    beta = (inv((X_temp.T).dot(X_temp))).dot(X_temp.T).dot(Y_temp)
                    residual = (Y_temp - (beta).dot(X_temp.T))
                    # Residual sum of squares divided by the degrees of freedom (n - p - 1).
                    residual = (residual ** 2).sum() / (n - p - 1)
                    residuals.append(residual)
                except LinAlgError:
                    residuals.append(np.nan)
            DF[column] = residuals
        # Pad with NaN so every column has the same length as the index.
        for key, value in DF.items():
            if len(value) != le_th - n:
                DF[key] += [np.nan] * (le_th - n - len(DF[key]))
            if len(value) != le_th:
                DF[key] = [np.nan] * n + DF[key]
        df = pd.DataFrame(DF, index=indexes)
        return df

    dv.add_field("close_adj")
    Return = dv.get_ts("close_adj").pct_change(1)
    Return_1 = Return.shift(1)
    Return_2 = Return.shift(2)
    Return_3 = Return.shift(3)
    # Note: the variable is named zz800, but the data requested is the CSI 300 (000300.SH).
    zz800_benchmark, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date,
                                             fields='trade_date,close,open')
    dv.data_benchmark = zz800_benchmark[['trade_date', 'close', 'open']].set_index('trade_date')
    BenchmarkIndexClose = dv.data_benchmark["close"].loc[dv.get_ts("close_adj").index]
    BenchmarkIndexOpen = dv.data_benchmark["open"].loc[dv.get_ts("close_adj").index]
    M_Return = BenchmarkIndexClose.pct_change(1)
    df = Multi_Regression(250, Return, Return_1, Return_2, Return_3, M_Return)
    dv.append_df(df, "HSIGMA_J", overwrite=True, is_quarterly=False)
    HSIGMA = dv.get_ts("HSIGMA_J")

DDNSR H010039A

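Factor description: downside volatility ratio (Downside standard deviations ratio). Over the past 12 months, on days when the market portfolio's daily return is negative, the ratio of the standard deviation of the stock's daily return to the standard deviation of the market portfolio's daily return.

Calculation

[formula image]

The market portfolio's daily return r_m.t is computed from CSI 300 data, and only days with a negative market return are used:
r_m.t=(CloseIndex-PrevCloseIndex)/PrevCloseIndex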

    import pandas as pd
    import numpy as np
    from numpy.lib.stride_tricks import as_strided as strided

    T = 250
    benchmark, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date,
                                       fields='trade_date,close,open')
    benchmark = benchmark[['trade_date', 'close']].set_index('trade_date').pct_change()
    benchmark = benchmark[benchmark < 0]  # non-negative market returns become NaN

    def drop_and_corr(arr):
        # Column 0 is the stock return, column 1 the market return.
        arr = arr[arr[:, 1] < 0]
        arr = arr[~np.isnan(arr).any(axis=1)]  # optional
        return arr[:, 0].std(ddof=1) / arr[:, 1].std(ddof=1)

    def get_sliding_window(df, W, return2D=0):
        # Build all overlapping windows of W rows as a strided view (no data copy).
        a = df.values
        s0, s1 = a.strides
        m, n = a.shape
        out = strided(a, shape=(m - W + 1, W, n), strides=(s0, s0, s1))
        if return2D == 1:
            return out.reshape(m - W + 1, -1)
        else:
            return out

    def calc_corr(s):
        benchmark_ = benchmark.reindex(index=s.index)
        df = pd.concat([s, benchmark_], axis=1)
        idx = df.index[T - 1:]
        strides = get_sliding_window(df, T)
        result = list(map(lambda x: drop_and_corr(x), strides))
        return pd.Series(result, index=idx)

    ret = dv.add_formula("_ret", "Return(close_adj)", is_quarterly=False)
    DDNSR = ret.apply(calc_corr)
    dv.append_df(DDNSR, 'DDNSR')

DDNCR H010040A

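Factor description: downside correlation (Downside correlation). Over the past 12 months, on days when the market portfolio's daily return is negative, the correlation coefficient between the stock's daily return and the market portfolio's daily return.

Calculation
[formula image]

The market portfolio's daily return r_m.t is computed from CSI 300 data, and only days with a negative market return are used:
r_m.t=(CloseIndex-PrevCloseIndex)/PrevCloseIndex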

    import pandas as pd
    import numpy as np
    from numpy.lib.stride_tricks import as_strided as strided
    from scipy.stats import pearsonr

    T = 250
    benchmark, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date,
                                       fields='trade_date,close,open')
    benchmark = benchmark[['trade_date', 'close']].set_index('trade_date').pct_change()
    benchmark = benchmark[benchmark < 0]  # non-negative market returns become NaN

    def drop_and_corr(arr):
        # Column 0 is the stock return, column 1 the market return.
        arr = arr[arr[:, 1] < 0]
        arr = arr[~np.isnan(arr).any(axis=1)]  # optional
        return pearsonr(arr[:, 0], arr[:, 1])[0]

    def get_sliding_window(df, W, return2D=0):
        # Build all overlapping windows of W rows as a strided view (no data copy).
        a = df.values
        s0, s1 = a.strides
        m, n = a.shape
        out = strided(a, shape=(m - W + 1, W, n), strides=(s0, s0, s1))
        if return2D == 1:
            return out.reshape(m - W + 1, -1)
        else:
            return out

    def calc_corr(s):
        benchmark_ = benchmark.reindex(index=s.index)
        df = pd.concat([s, benchmark_], axis=1)
        idx = df.index[T - 1:]
        strides = get_sliding_window(df, T)
        result = list(map(lambda x: drop_and_corr(x), strides))
        return pd.Series(result, index=idx)

    ret = dv.add_formula("_ret", "Return(close_adj)", is_quarterly=False)
    DDNCR = ret.apply(calc_corr)
    dv.append_df(DDNCR, 'DDNCR')
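
Both downside statistics share the same filtering step, so for one window they can be sketched together. This is a simplified NumPy illustration, not the rolling implementation; `downside_stats` is a hypothetical name, and `np.corrcoef` is used here in place of `scipy.stats.pearsonr` to keep the example dependency-free.

```python
import numpy as np


def downside_stats(stock_ret, market_ret):
    """Return (std ratio, correlation) using only days with a negative market return."""
    s = np.asarray(stock_ret, dtype=float)
    m = np.asarray(market_ret, dtype=float)
    keep = (m < 0) & ~np.isnan(s)          # NaN market returns compare False and drop out
    s, m = s[keep], m[keep]
    if s.size < 2:
        return np.nan, np.nan
    ratio = s.std(ddof=1) / m.std(ddof=1)  # DDNSR-style ratio
    corr = np.corrcoef(s, m)[0, 1]         # DDNCR-style correlation
    return ratio, corr


market = [-0.01, 0.02, -0.03, -0.02, 0.01]
stock = [2 * x for x in market]            # moves exactly twice as much as the market
ddnsr, ddncr = downside_stats(stock, market)
```

For a stock that is an exact multiple of the market, the ratio equals that multiple and the correlation is 1.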

Dvrat H010042A

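Factor description: daily returns variance ratio (Daily returns variance ratio - serial dependence in daily returns).

Calculation
Let r_i be the daily return of stock i and r_f the daily risk-free return; the stock's daily excess return is then r_i - r_f. The variance ratio can be expressed as:

[formula image]
Where:
[formula image]
m=q(T-q+1)(1-q/T)
T is the number of trading days in the past 24 months, q=10
[formula image]
In the code, the daily risk-free return r_f is treated as 0. Results based on fewer than 180 trading days are discarded from the final output.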

    dv.add_formula("ret_J", "Return(close_adj)", is_quarterly=False, add_data=True)
    T = 500  # number of trading days in the past 24 months
    q = 10
    dv.add_formula("sigma_squared_J", "Ts_Sum(Pow(ret_J, 2), %s)/(%s -1)" % (T, T),
                   is_quarterly=False, add_data=False)
    m = q*(T-q+1)*(1-q/T)
    dv.add_formula("sigma_q_tmp", "Pow(Ts_Sum(ret_J, %s), 2)" % q, is_quarterly=False, add_data=True)
    sigma_q = dv.add_formula("sigma_q", "Ts_Sum(sigma_q_tmp, %s)/%s" % (T-q, m), is_quarterly=False)
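
The listing stops after the two variance estimates. A single-window sketch of how they are usually combined is given below; it assumes the lost formula images follow the common variance-ratio form DVRAT = σ_q² / σ² - 1, which the source does not confirm, and the function name `dvrat` is illustrative.

```python
import numpy as np


def dvrat(daily_ret, q: int = 10) -> float:
    """Variance ratio sigma_q^2 / sigma^2 - 1 over one window (risk-free return taken as 0)."""
    r = np.asarray(daily_ret, dtype=float)
    T = r.size
    sigma2 = (r ** 2).sum() / (T - 1)                  # one-day variance estimate
    m = q * (T - q + 1) * (1 - q / T)                  # normalization from the text
    csum = np.concatenate(([0.0], np.cumsum(r)))
    q_sums = csum[q:] - csum[:-q]                      # all T - q + 1 overlapping q-day sums
    sigma2_q = (q_sums ** 2).sum() / m                 # q-day variance estimate
    return float(sigma2_q / sigma2 - 1)


no_horizon = dvrat([0.01, -0.02, 0.03, 0.005], q=1)   # q = 1 reduces to sigma^2 / sigma^2 - 1
trending = dvrat([0.01] * 4, q=2)                      # perfectly persistent returns
```

Positive values indicate positively autocorrelated (trending) returns, negative values indicate mean reversion.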

Ddnbt H010043A

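Factor description: downside beta (Downside beta). Over the past 12 months, on days when the market portfolio's daily return is negative, the regression coefficient of the stock's daily return on the market portfolio's daily return.

Calculation
[formula image]
The market portfolio's daily return r_m.t is computed from CSI 300 data, and only days with a negative market return are used:
r_m.t=(CloseIndex-PrevCloseIndex)/PrevCloseIndex
The β_d in the regression result is the downside beta DDNBT.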

    import pandas as pd
    import numpy as np
    from numpy.lib.stride_tricks import as_strided as strided

    benchmark, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date,
                                       fields='trade_date,close,open')
    benchmark = benchmark[['trade_date', 'close']].set_index('trade_date').pct_change()

    def np_regr(A, y, residuals=False):
        # Least-squares regression of y on A with an intercept.
        if A.ndim == 1:
            A = A[:, None]
        A = np.hstack((np.ones(len(A))[:, None], A))
        betas = np.linalg.lstsq(A, y, rcond=None)[0]
        if residuals:
            return y - A @ betas
        return betas

    def drop_and_regr(arr, residuals=False, drop_first=True):
        arr = arr[~np.isnan(arr).any(axis=1)]
        if arr.size == 0:
            return np.nan
        # A further condition could skip windows with too few valid trading days in the past 12 months.
        # Downside-beta condition: keep only days with a negative market return.
        arr = arr[arr[:, 1] < 0]
        if arr.size == 0:
            return np.nan
        return np_regr(arr[:, 1:], arr[:, :1], residuals=residuals)[1].item()

    def get_sliding_window(df, W, return2D=0):
        # Build all overlapping windows of W rows as a strided view (no data copy).
        a = df.values
        s0, s1 = a.strides
        m, n = a.shape
        out = strided(a, shape=(m - W + 1, W, n), strides=(s0, s0, s1))
        if return2D == 1:
            return out.reshape(m - W + 1, -1)
        else:
            return out

    ret = dv.add_formula("ret", "Return(close_adj)", is_quarterly=False, add_data=True)

    def calc_regr(stock_ret):
        """
        The rolling is done inside this function.
        """
        X = pd.concat([stock_ret, benchmark], axis=1)
        idx = X.index[252 - 1:]
        strides = get_sliding_window(X, 252)
        result = list(map(lambda x: drop_and_regr(x), strides))
        return pd.Series(result, index=idx)

    Ddnbt = ret.apply(calc_regr)
    dv.append_df(Ddnbt, 'Ddnbt')

Tobt H010044A

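Factor description: liquidity-turnover beta (Liquidity-turnover beta).

Calculation

(a) Let r_i be the daily return of stock i, r_m the daily return of the market portfolio, and r_f the daily risk-free return; the daily excess returns of the stock and of the market are then given by [formula image].

The market portfolio's daily return r_m is computed from CSI 300 data:

r_m=(CloseIndex-PrevCloseIndex)/PrevCloseIndex

In the code, the daily risk-free return r_f is treated as 0.

(b) The daily turnover rate can be looked up directly, or computed with the following formula:

[formula image]
Where a value is missing, the total share capital is used instead.

(c) The regression of the absolute daily excess return on the turnover rate, five lags of the absolute market portfolio daily return, and five lags of itself is written as:
[formula image]
The coefficient βi on the daily turnover rate in the regression result is the liquidity-turnover beta TOBT.
Results based on fewer than 180 trading days are discarded from the final output.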

    import pandas as pd
    import numpy as np
    from numpy.linalg import inv, LinAlgError

    # Note: the variable is named zz800, but the data requested is the CSI 300 (000300.SH).
    zz800_benchmark, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date,
                                             fields='trade_date,close,open')
    dv.data_benchmark = zz800_benchmark[['trade_date', 'close', 'open']].set_index('trade_date')
    BenchmarkIndexClose = dv.data_benchmark["close"].loc[dv.get_ts("close_adj").index]
    BenchmarkIndexOpen = dv.data_benchmark["open"].loc[dv.get_ts("close_adj").index]
    # risk_free_rate=-0.005
    M_Return = BenchmarkIndexClose.pct_change(1)
    # Five lags of the market return.
    M_Return_1 = M_Return.shift(1)
    M_Return_2 = M_Return.shift(2)
    M_Return_3 = M_Return.shift(3)
    M_Return_4 = M_Return.shift(4)
    M_Return_5 = M_Return.shift(5)
    dv.add_field("turnover_ratio")
    TORate = dv.add_formula("TORate", "turnover_ratio/100", add_data=True, is_quarterly=False)
    Return = dv.add_formula("Close_Return", "Return(close_adj)", add_data=True, is_quarterly=False)
    # Five lags of the stock return.
    Return_1 = Return.shift(1)
    Return_2 = Return.shift(2)
    Return_3 = Return.shift(3)
    Return_4 = Return.shift(4)
    Return_5 = Return.shift(5)

    DF = dict()
    window = 498
    le_th = len(Return)
    indexes = Return.index
    for column in Return.columns:
        # Regressors: intercept, turnover, five own lags, five market lags (all in absolute value).
        X = np.array(
            [np.ones(le_th), TORate[column].abs().values, Return_1[column].abs().values, Return_2[column].abs().values,
             Return_3[column].abs().values, Return_4[column].abs().values, Return_5[column].abs().values,
             M_Return_1.abs().values, M_Return_2.abs().values, M_Return_3.abs().values, M_Return_4.abs().values,
             M_Return_5.abs().values]).T
        Y = np.array(Return[column].abs().values).T
        betas = []
        print(column)
        for length in range(window, len(Y)):
            X_temp = X[length - window:length]
            try:
                beta = (inv((X_temp.T).dot(X_temp))).dot(X_temp.T).dot(Y[length - window:length])
                betas.append(beta[1])  # coefficient on the turnover rate
            except LinAlgError:
                betas.append(np.nan)
        DF[column] = betas
    # Pad with NaN so every column has the same length as the index.
    for key, value in DF.items():
        if len(value) != le_th - window:
            DF[key] += [np.nan] * (le_th - window - len(DF[key]))
        if len(value) != le_th:
            DF[key] = [np.nan] * window + DF[key]
    df = pd.DataFrame(DF, index=indexes)
    dv.append_df(df, "TOBT", overwrite=True, is_quarterly=False)
    TOBT = dv.get_ts("TOBT")

Skewness H010045A

Factor description: price skewness (Skewness of price during the last 20 days), i.e. the skewness of the stock price over the past 20 trading days.

Calculation
[formula image]

    SKEWNESS_J = dv.add_formula('SKEWNESS_J', "Ts_Skewness(close_adj,{})".format(20), is_quarterly=False, add_data=False)