@Channelchan
2018-10-23T10:07:28.000000Z
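To use a factor, change add_data=False to add_data=True. The factor name can be changed freely. Where a dataset has no close_adj field, use close instead; the same applies to high and low. N is a parameter you can set yourself.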
Variance20 H010001A
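Factor description: variance of daily returns over 20 days.
Calculation:

StdDev(Return(close,1),N)^2*250
N = 20, 60, 120, etc.
Note: the factor value is annualized, i.e. the daily variance * 250.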
Variance20_J = dv.add_formula('Variance20_J', 'StdDev(Return(close,1),20)^2*250' ,is_quarterly=False, add_data=False)
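As a worked example of the arithmetic, here is a minimal plain-Python sketch that needs no dataview. It assumes StdDev is the sample (n-1) standard deviation, which the source does not state; the function name and the prices are hypothetical.

```python
import statistics

def annualized_variance(closes, n=20, periods_per_year=250):
    """Sample variance of the last n daily simple returns, scaled to a year."""
    returns = [closes[i] / closes[i - 1] - 1 for i in range(1, len(closes))]
    window = returns[-n:]
    return statistics.variance(window) * periods_per_year

# Hypothetical prices giving daily returns of +1%, -1% and +3%.
closes = [100.0, 101.0, 99.99, 102.9897]
variance_3d = annualized_variance(closes, n=3)
```

With these inputs the daily variance is 0.0004, so the annualized value is about 0.1.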
Kurtosis20 H010004A
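Factor description: 20-day kurtosis of individual stock returns.
Calculation:

Ts_Kurtosis(Return(close_adj,1),N)
or
Ts_Kurtosis(((close_adj-Delay(close_adj,1))/Delay(close_adj,1)),N)
Either form works. N = 20, 60, 120, etc.
where:
r is the daily return
σ is the standard deviation of returns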
Kurtosis20_j = dv.add_formula('Kurtosis20_j', 'Ts_Kurtosis(Return(close_adj,1),20)' ,is_quarterly=False, add_data=False)
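For orientation, kurtosis is the fourth central moment of r divided by σ^4. The sketch below uses population moments and reports excess kurtosis by default; whether Ts_Kurtosis applies a bias correction or subtracts 3 is not stated in the source, so both choices are assumptions.

```python
def kurtosis(returns, excess=True):
    """Fourth central moment over squared variance (population moments)."""
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    k = m4 / m2 ** 2
    return k - 3.0 if excess else k

# A two-point distribution has the smallest possible kurtosis.
two_point = kurtosis([-1.0, 1.0, -1.0, 1.0])
```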
Alpha20 H010007A
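Factor description: 20-day Jensen's alpha.
Calculation:
alpha=(E(r)-rf)-beta*E(rm-rf), where r is the daily return, rm is the index return, rf is the risk-free return, and beta is the 20-day beta of the returns, beta=cov(r,rm)/var(rm)
betaN_J=Covariance(r_J,nr,N)/(StdDev(nr,N)^2)
AlphaN=(Ts_Mean(r_J-0.01,N) - betaN_J*(Ts_Mean((nr-0.01),N)))*250
N can be changed in the two places above.
N = 20, 60, 120, etc.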
# Fetch the CSI 300 index close; daily() returns a (DataFrame, message) pair.
hs300_close = dv.data_api.daily('000300.SH', dv.extended_start_date_d, dv.end_date, fields="close", adjust_mode=None)
hs300_benchmark = hs300_close[0][['trade_date', 'close']].set_index('trade_date')
dv.add_field("close")
# Build a frame shaped like the stock panel and copy the index close into every column.
hs300 = 0 * dv.get_ts('close')
for i in range(hs300.shape[1]):
    hs300.iloc[:, i] = hs300_benchmark
dv.append_df(hs300, 'hs300')
# nr: index daily return; r_J: stock daily return.
nr = dv.add_formula('nr', '(hs300-Delay(hs300,1))/Delay(hs300,1)', is_quarterly=False, add_data=True)
r_J = dv.add_formula('r_J', '(close-Delay(close,1))/Delay(close,1)', is_quarterly=False, add_data=True)
beta20_J = dv.add_formula('beta20_J', 'Covariance(r_J,nr,20)/(StdDev(nr,20)^2)', is_quarterly=False, add_data=True)
Alpha20_J = dv.add_formula('Alpha20_A', "(Ts_Mean(r_J-0.01,20) - beta20_J*(Ts_Mean((nr-0.01),20)))*250", is_quarterly=False, add_data=False)
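The alpha arithmetic can be checked on two short return series without any data API. This is a sketch under assumptions: sample covariance is used, and the risk-free rate is a parameter (rf_daily, a hypothetical name) rather than the constant 0.01 that the formula above subtracts from daily returns.

```python
def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    """Sample covariance of two equally long series."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def jensen_alpha(stock, market, rf_daily=0.0, periods_per_year=250):
    """Return (annualized alpha, beta) for daily stock and market returns."""
    beta = covariance(stock, market) / covariance(market, market)
    alpha_daily = (mean(stock) - rf_daily) - beta * (mean(market) - rf_daily)
    return alpha_daily * periods_per_year, beta

# Hypothetical data: the stock moves twice as much as the market plus 0.1% a day.
market = [0.01, -0.02, 0.03, 0.0]
stock = [2.0 * m + 0.001 for m in market]
alpha, beta = jensen_alpha(stock, market)
```

Because the stock is an exact linear function of the market here, beta comes out as 2 and the annualized alpha as 0.001 * 250 = 0.25.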
Beta20 H010010A
Factor description: 20-day beta.
Calculation:
r is the daily return, rm is the index return, and beta is the 20-day beta of the returns, beta=cov(r,rm)/var(rm)
betaN_J=Covariance(r_J,nr,N)/(StdDev(nr,N)^2)
N = 20, 60, 120, 250, etc.
# Fetch the CSI 300 index and keep its close, indexed by trade date.
hs300, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date, fields='trade_date,close,open')
hs300_benchmark = hs300[['trade_date', 'close']].set_index('trade_date')
dv.add_field("close")
# Copy the index close into every column of a frame shaped like the stock panel.
hs300 = 0 * dv.get_ts('close')
for i in range(hs300.shape[1]):
    hs300.iloc[:, i] = hs300_benchmark
dv.append_df(hs300, 'hs300')
nr = dv.add_formula('nr', '(hs300-Delay(hs300,1))/Delay(hs300,1)', is_quarterly=False, add_data=True)
r_J = dv.add_formula('r_J', '(close_adj-Delay(close_adj,1))/Delay(close_adj,1)', is_quarterly=False, add_data=True)
beta20_J = dv.add_formula('beta20_J', 'Covariance(r_J,nr,20)/(StdDev(nr,20)^2)', is_quarterly=False, add_data=True)
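Written out, the rolling beta is a covariance divided by a variance, which is why the formula string squares StdDev. The subscript N for the window length is a notational assumption, not the source's own notation.

```latex
\beta_N \;=\; \frac{\operatorname{Cov}_N(r,\, r_m)}{\operatorname{Var}_N(r_m)}
        \;=\; \frac{\operatorname{Cov}_N(r,\, r_m)}{\sigma_N(r_m)^{2}}
```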
SharpeRatio20 H010014A
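Factor description: 20-day Sharpe ratio, the excess return earned per unit of total risk borne; it lets a strategy's return and risk be assessed together.
Calculation:

(Ts_Mean(close_ret,N)*250-0.03)/StdDev(close_ret,N)/Sqrt(250)
N = 20, 60, 120, etc.
where:
E(r) is the expected return, equal to the mean daily return * 250
r_f is the risk-free rate, taken as of the calculation date (likewise below)
σ is the standard deviation of returns, equal to the daily standard deviation * sqrt(250)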
# Daily return, stored so the Sharpe formula can reference it by name.
dv.add_formula("close_ret", "Return(close_adj,1)", is_quarterly=False, add_data=True)
SharpeRatio20 = dv.add_formula('SharpeRatio20_J', "(Ts_Mean(close_ret,20)*250-0.03)/StdDev(close_ret,20)/Sqrt(250)", is_quarterly=False, add_data=True)
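The annualization can be verified by hand with a few returns. The sketch below assumes a sample standard deviation and the 3% annual risk-free rate that appears as 0.03 in the formula; the function name and data are hypothetical.

```python
import math
import statistics

def sharpe_ratio(daily_returns, rf_annual=0.03, periods_per_year=250):
    """(annualized mean return - risk-free rate) / annualized volatility."""
    expected = statistics.mean(daily_returns) * periods_per_year
    volatility = statistics.stdev(daily_returns) * math.sqrt(periods_per_year)
    return (expected - rf_annual) / volatility

daily = [0.01, -0.01, 0.03]  # hypothetical daily returns
sharpe = sharpe_ratio(daily)

# The same number written in the order used by the formula string.
formula_order = (statistics.mean(daily) * 250 - 0.03) / statistics.stdev(daily) / math.sqrt(250)
```

Dividing by StdDev and then by Sqrt(250) is the same as dividing by the annualized standard deviation, so the two expressions agree.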
TreynorRatio20 H010017A
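Factor description: 20-day Treynor ratio, a measure of investment return per unit of beta risk.
Calculation:
TR = (E(r)-Rf)/β
r is the daily return, E(r) the expected return, Rf the risk-free return, and beta the risk measure of the returns
The factor value is annualized: the daily value multiplied by 250
betaN_J=Covariance(r_J,nr,N)/(StdDev(nr,N)^2)
TRN_J=(250*(Ts_Mean(r_J,N))-0.03)/betaN_J
N = 20, 60, 120, etc.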
import numpy as np

hs300, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date, fields='trade_date,close,open')
hs300_benchmark = hs300[['trade_date', 'close']].set_index('trade_date')
# Copy the index close into every column of a frame shaped like the stock panel.
hs300 = 0 * dv.get_ts('close')
for i in range(hs300.shape[1]):
    hs300.iloc[:, i] = hs300_benchmark
dv.append_df(hs300, 'hs300')
dv.add_formula('nr', '(hs300-Delay(hs300,1))/Delay(hs300,1)', is_quarterly=False, add_data=True)
dv.add_formula('r_J', '(close_adj-Delay(close_adj,1))/Delay(close_adj,1)', is_quarterly=False, add_data=True)
beta20_J = dv.add_formula('beta20_J', 'Covariance(r_J,nr,20)/(StdDev(nr,20)^2)', is_quarterly=False, add_data=True)
# Annualized mean return minus the 3% risk-free rate, per unit of beta.
TR20_J = dv.add_formula('TR20_J', '(250*(Ts_Mean(r_J,20))-0.03)/beta20_J', is_quarterly=False, add_data=True)
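A minimal sketch of the Treynor arithmetic on plain lists, assuming sample covariance and the 0.03 annual risk-free rate from the formula; the function name and data are hypothetical.

```python
def treynor_ratio(stock, market, rf_annual=0.03, periods_per_year=250):
    """Annualized excess return divided by beta = cov(r, rm) / var(rm)."""
    n = len(stock)
    mean_s = sum(stock) / n
    mean_m = sum(market) / n
    cov = sum((s - mean_s) * (m - mean_m) for s, m in zip(stock, market)) / (n - 1)
    var = sum((m - mean_m) ** 2 for m in market) / (n - 1)
    beta = cov / var
    return (periods_per_year * mean_s - rf_annual) / beta

market = [0.01, -0.02, 0.03, 0.0]     # hypothetical index returns
stock = [2.0 * m for m in market]     # beta of exactly 2
treynor = treynor_ratio(stock, market)
```

Here the mean daily return is 0.01, so the result is (2.5 - 0.03) / 2 = 1.235.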
InformationRatio20 H010020A
Factor description: 20-day information ratio.
Calculation:

Ts_Mean(r_J - nr,N)/StdDev(r_J - nr,N)
N = 20, 60, 120, etc.
where:
r is the daily return
r_M is the index return, using the CSI 300 index
import numpy as np

hs300, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date, fields='trade_date,close,open')
hs300_benchmark = hs300[['trade_date', 'close']].set_index('trade_date')
# Copy the index close into every column of a frame shaped like the stock panel.
hs300 = 0 * dv.get_ts('close')
for i in range(hs300.shape[1]):
    hs300.iloc[:, i] = hs300_benchmark
dv.append_df(hs300, 'hs300')
nr = dv.add_formula('nr', '(hs300-Delay(hs300,1))/Delay(hs300,1)', is_quarterly=False, add_data=True)
r_J = dv.add_formula('r_J', '(close_adj-Delay(close_adj,1))/Delay(close_adj,1)', is_quarterly=False, add_data=True)
# Mean active return over its standard deviation (tracking error).
IR20_J = dv.add_formula('IR20_J', 'Ts_Mean(r_J - nr,20)/StdDev(r_J - nr,20)', is_quarterly=False, add_data=True)
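The ratio itself is a one-liner once the active return (stock minus index) is formed. The sketch assumes a sample standard deviation; names and numbers are hypothetical.

```python
import statistics

def information_ratio(stock, benchmark):
    """Mean active return divided by the standard deviation of active return."""
    active = [s - b for s, b in zip(stock, benchmark)]
    return statistics.mean(active) / statistics.stdev(active)

stock = [0.02, 0.00, 0.04]
benchmark = [0.01, 0.01, 0.01]
ir = information_ratio(stock, benchmark)
```

The active returns are 0.01, -0.01 and 0.03, giving a mean of 0.01 over a standard deviation of 0.02.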
GainVariance20 H010023A
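Factor description: 20-day gain variance. It is similar to variance, but measures only the behavior of gains (positive returns).
Calculation:

GainVariance20_J = pd.DataFrame({name: value.dropna().rolling(N).std() ** 2 for name, value in pct_return.items()},index=pct_return.index).ffill()
N = 20, 60, 120, etc.
where:
r is the daily return
Note: the factor value is annualized, i.e. the daily value * 250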
import pandas as pd

def cal_positive(df):
    # Keep positive returns only; every other cell becomes NaN.
    return df[df > 0]

dv.add_field("close_adj")
pct_return = cal_positive(dv.get_ts('close_adj').pct_change())
# Per stock: variance over its last 20 positive-return days, forward-filled onto the remaining dates.
GainVariance20_J = pd.DataFrame({name: value.dropna().rolling(20).std() ** 2 for name, value in pct_return.items()}, index=pct_return.index).ffill()
dv.append_df(GainVariance20_J, 'GainVariance20_J')
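One subtlety of the construction above is that the window counts positive-return days rather than calendar days, and the latest value is carried forward on the other days. The plain-Python sketch below mimics that behavior for a single stock; the function name and the data are hypothetical, and no annualization is applied.

```python
import statistics

def gain_variance(returns, n):
    """Per day: sample variance of the latest n positive returns seen so far.

    Yields None until n positive returns exist; n must be at least 2.
    """
    gains, out = [], []
    for r in returns:
        if r > 0:
            gains.append(r)
        out.append(statistics.variance(gains[-n:]) if len(gains) >= n else None)
    return out

series = gain_variance([0.01, -0.02, 0.03, -0.01, 0.05], n=2)
```

On the fourth day (a loss) the value from the third day is simply repeated, which corresponds to the forward fill.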
LossVariance20 H010026A
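Factor description: 20-day loss variance. It is similar to variance, but measures only the behavior of losses (negative returns).
Calculation:

LossVariance20_A = pd.DataFrame({name: value.dropna().rolling(N).std() ** 2 for name, value in pct_return.items()},index=pct_return.index).ffill()
N = 20, 60, 120, etc.
where:
r is the daily return
Note: the factor value is annualized, i.e. the daily value * 250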
import pandas as pd

# Keep negative returns only; every other cell becomes NaN.
cal_negative = lambda df: df[df < 0]
dv.add_field("close_adj")
pct_return = cal_negative(dv.get_ts('close_adj').pct_change())
# Per stock: variance over its last 20 negative-return days, forward-filled onto the remaining dates.
LossVariance20_A = pd.DataFrame({name: value.dropna().rolling(20).std() ** 2 for name, value in pct_return.items()}, index=pct_return.index).ffill()
GainLossVarianceRatio20 H010029A
Factor description: 20-day ratio of gain variance to loss variance.
Calculation:

GainVarianceN_J/LossVarianceN_J
N = 20, 60, 120, etc.
where:
r is the daily return
import pandas as pd

def cal_negative(df):
    return df[df < 0]

def cal_positive(df):
    return df[df > 0]

dv.add_field("close_adj")
# Loss variance: rolling variance over negative-return days only.
pct_return = cal_negative(dv.get_ts('close_adj').pct_change())
LossVariance20_J = pd.DataFrame({name: value.dropna().rolling(20).std() ** 2 for name, value in pct_return.items()}, index=pct_return.index).ffill()
dv.append_df(LossVariance20_J, 'LossVariance20_J')
# Gain variance: rolling variance over positive-return days only.
pct_return = cal_positive(dv.get_ts('close_adj').pct_change())
GainVariance20_J = pd.DataFrame({name: value.dropna().rolling(20).std() ** 2 for name, value in pct_return.items()}, index=pct_return.index).ffill()
dv.append_df(GainVariance20_J, 'GainVariance20_J')
GainlossVarianceratio20 = dv.add_formula('GainlossVarianceratio20_J', "GainVariance20_J/LossVariance20_J", is_quarterly=False, add_data=True)
dv.append_df(GainlossVarianceratio20, 'GainlossVarianceratio20_J')
RealizedVolatility H010032A
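Factor description: realized volatility, the standard deviation of intraday returns on 5-minute bars.
Calculation: compute the return of each 5-minute bar from its close, then take the standard deviation of those 5-minute returns within the day.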
import pandas as pd

def get_daily_value(date):
    print(date)
    data, msg = dv.data_api.bar(",".join(dv.symbol), trade_date=date, freq="5M")
    try:
        data = data.dropna().pivot(index="time", columns="symbol", values="close")
        # Bucket the bar times into 5-minute slots and keep the first close in each.
        data = data.groupby(data.index // 500).first()
    except ValueError:
        print(date)
        raise
    # Standard deviation of the 5-minute returns within the day.
    return data.pct_change().std().rename(date)

# Runtime depends heavily on request throughput...
dv.add_field("close")
dates = list(dv.get_ts("close").index)
result = pd.concat(map(get_daily_value, dates), axis=1).T
dv.append_df(df=result, field_name="RealizedVolatility", is_quarterly=False)
DASTD H010033A
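Factor description: standard deviation of excess returns over 252 days.
Calculation:
DASTD=std(r-rf)
r is the daily return and rf is the risk-free return; the half-life is 42 trading days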
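r_J = dv.add_formula('r_J', '(close_adj-Delay(close_adj,1))/Delay(close_adj,1)', is_quarterly=False, add_data=True)
# dv.append_df(r_J, 'r_J')
# Exponentially weighted standard deviation with a 42-day half-life; rf is not subtracted here.
dastd = r_J.ewm(halflife=42).std(bias=False)
# If a fixing rate is used for rf, daily versus annual rates must be reconciled first.
dv.append_df(dastd, 'dastd')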
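To make the half-life concrete: an observation that is half_life days old gets half the weight of today's. The sketch below computes a weighted standard deviation with such weights. It uses the simple (biased) weighted estimator, so it will not match pandas' bias-corrected ewm output digit for digit; the function name is hypothetical.

```python
import math

def ew_std(returns, half_life=42):
    """Exponentially weighted standard deviation (biased weighted estimator)."""
    n = len(returns)
    # Newest observation has weight 1; weights halve every half_life steps back.
    weights = [0.5 ** ((n - 1 - i) / half_life) for i in range(n)]
    total = sum(weights)
    mean = sum(w * r for w, r in zip(weights, returns)) / total
    var = sum(w * (r - mean) ** 2 for w, r in zip(weights, returns)) / total
    return math.sqrt(var)

# With half_life=1 the older of two points carries weight 0.5.
two_day = ew_std([0.01, 0.03], half_life=1)
```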
HsigmaCNE5 H010034A
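Factor description: volatility of residual returns over 252 days.
Calculation:
HsigmaCNE5=std(ei)
ei is the residual return; 252 trading days are used in total, with a half-life of 63 trading days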
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import as_strided as strided

start_date_delta = 300
hs300, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date, fields='trade_date,close')
hs300_benchmark = hs300[['trade_date', 'close']].set_index('trade_date').pct_change()

def get_sliding_window(df, W, return2D=0):
    # View the frame as overlapping windows of W rows without copying.
    a = df.values
    s0, s1 = a.strides
    m, n = a.shape
    out = strided(a, shape=(m - W + 1, W, n), strides=(s0, s0, s1))
    if return2D == 1:
        return out.reshape(m - W + 1, -1)
    else:
        return out

def _beta(stock_return, universe_return, trailing_days=252, half_life=63):
    # universe_return = hs300_benchmark.reindex(index=)
    weights = np.sqrt(exponential_weight(trailing_days, half_life))
    coef, _, resid = wls_by_numpy(universe_return, stock_return, weights)
    return coef, np.std(resid)  # np.sqrt(np.sum(resid**2)/250)

def wls_by_numpy(x, y, w):
    # Weighted least squares of y on x with an intercept.
    A = np.vstack([x, np.ones(len(x))]).T * w.reshape(-1, 1)
    _y = y * w
    m, c = np.linalg.lstsq(A, _y)[0]
    resid = y - (m * x + c)
    return m, c, resid

def exponential_weight(trailing_days=252, half_life=63):
    # Weight 1 for the latest day, halving every half_life days into the past.
    _s = np.flip(np.arange(0, trailing_days), axis=0) / half_life
    return np.power(0.5, _s)

def calc_hsigmacne5(s):
    idx = s.index
    s = s.dropna()
    if s.size == 0 or len(s) < 252:
        return np.nan
    concated = pd.concat([s, hs300_benchmark.reindex(index=s.index)], axis=1)
    strides = get_sliding_window(concated, 252)
    hsigma = pd.Series(list(map(lambda x: _beta(x[:, 0], x[:, 1])[1], strides)), index=s.index[252 - 1:]).reindex(index=idx)
    return hsigma

ret_ = dv.add_formula("ret_", "Return(close_adj)", is_quarterly=False)
hsigmacne5 = ret_.apply(calc_hsigmacne5)
dv.append_df(hsigmacne5, 'hsigmacne5')
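For a single window, the weighted regression and its residual standard deviation can be reproduced with closed-form formulas. This is a sketch under assumptions: one regressor plus an intercept, exponential weights applied to the squared residuals, and an unweighted population standard deviation of the residuals (as np.std gives). The names and data are hypothetical.

```python
import math

def exp_weights(n, half_life):
    # Newest observation gets weight 1; weights halve every half_life steps back.
    return [0.5 ** ((n - 1 - i) / half_life) for i in range(n)]

def wls_residual_std(market, stock, half_life=63):
    """Weighted regression of stock on market; returns (beta, residual std)."""
    n = len(market)
    w = exp_weights(n, half_life)
    sw = sum(w)
    xm = sum(wi * x for wi, x in zip(w, market)) / sw
    ym = sum(wi * y for wi, y in zip(w, stock)) / sw
    sxy = sum(wi * (x - xm) * (y - ym) for wi, x, y in zip(w, market, stock))
    sxx = sum(wi * (x - xm) ** 2 for wi, x in zip(w, market))
    beta = sxy / sxx
    alpha = ym - beta * xm
    resid = [y - (alpha + beta * x) for x, y in zip(market, stock)]
    rmean = sum(resid) / n
    return beta, math.sqrt(sum((e - rmean) ** 2 for e in resid) / n)

market = [0.01, -0.02, 0.03, 0.0, 0.015]      # hypothetical index returns
stock = [0.001 + 2.0 * x for x in market]     # exactly linear, so residuals vanish
beta, hsigma = wls_residual_std(market, stock)
```

A stock that is an exact linear function of the index has no residual risk, so hsigma is zero up to rounding while beta is 2.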
CmraCNE5 H010035A
Factor description: 12-month cumulative return range (Monthly cumulative return range over the past 12 months).
Calculation:
where rf is the risk-free return
def run_formula(dv):
    ret_ = dv.add_formula("ret", "Return(close_adj)", is_quarterly=False)
    month_days = 21

    # Rf is treated as 0 for now
    def get_sliding_window(s, W):
        """
        input: np.arange(20), W=4
        output:
        array([[ 0, 1, 2, 3],
               [ 4, 5, 6, 7],
               [ 8, 9, 10, 11],
               [12, 13, 14, 15],
               [16, 17, 18, 19]])
        """
        from numpy.lib.stride_tricks import as_strided as strided
        assert len(s) % W == 0
        strides = s.strides
        assert len(strides) == 1
        strides = strides[0]
        return strided(s, shape=(int(len(s) / W), W), strides=(W * strides, strides))

    def calc_range(s, days=month_days, allrange=month_days * 12):
        """Missing values still need to be handled."""

        def get_max_and_min(s):
            from numpy import log
            s = s + 1
            # Compound the daily returns into 12 monthly gross returns.
            out = get_sliding_window(s, days)
            out = out.prod(axis=1)
            # return out.max()-out.min()  # resembles this; if so, the uqer description is unreliable
            Z_T = log(out).cumsum()  # the risk-free return is not subtracted here
            return log((1 + Z_T.max()) / (1 + Z_T.min()))

        return s.rolling(allrange).apply(get_max_and_min)

    CmraCNE5 = ret_.apply(calc_range)
    dv.append_df(CmraCNE5, 'CmraCNE5')

run_formula(dv)
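Reading the code, the quantity is built from cumulative monthly log excess returns Z(T), and the factor is ln(1 + max Z) - ln(1 + min Z). The sketch below restates that on a list of monthly returns; the function name is hypothetical, and the risk-free series defaults to zero as in the code.

```python
import math

def cmra(monthly_returns, monthly_rf=None):
    """Cumulative range: ln(1 + max Z) - ln(1 + min Z), with Z the running
    sum of monthly log excess returns."""
    rf = monthly_rf or [0.0] * len(monthly_returns)
    z, cumulative = [], 0.0
    for r, f in zip(monthly_returns, rf):
        cumulative += math.log(1 + r) - math.log(1 + f)
        z.append(cumulative)
    return math.log(1 + max(z)) - math.log(1 + min(z))

# Hypothetical two-month path whose cumulative log returns are 0.10 and 0.05.
returns = [math.exp(0.1) - 1, math.exp(-0.05) - 1]
value = cmra(returns)
```

For this path the result is ln(1.10 / 1.05), roughly 0.0465.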
Cmra H010036A
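Factor description: 24-month cumulative return range (Monthly cumulative return range over the past 24 months).
Calculation:
Days with zero trading volume are excluded from the calculation.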
CMRA = dv.add_formula('CMRA_J',"Log(Ts_Max(close_adj,475)/Ts_Min(close_adj,475))", is_quarterly=False, add_data=False)
Hbeta H010037A
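Factor description: historical beta (Historical daily beta). Over the past 12 months, the stock's daily return is regressed on the market portfolio's daily return together with a third-order autoregression; the factor is the coefficient on the market portfolio's daily return.
The variance of the regression residuals divided by the degrees of freedom.
Calculation:
The market portfolio daily return r_m.t is computed from CSI 300 data:
r_m.t=(CloseIndex-PrevCloseIndex)/PrevCloseIndex
The β_h in the regression result is the historical beta HBETA.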
def Multi_Regression(index, n, Y, *X):
    """
    index: which coefficient to return, counting from 0 (0 is the constant term, 1 the first regressor)
    n: rolling window length in days
    Y: matrix of the dependent variable
    *X: the regressors, passed as a list, set or tuple whose elements are the individual matrices
    """
    from numpy.linalg import inv, LinAlgError
    from pandas import DataFrame, Series
    import numpy as np
    import pandas as pd
    DF = dict()
    le_th = len(Y)
    columns = Y.columns
    indexes = Y.index
    for column in columns:
        betas = []

        def _func_x(x):
            if isinstance(x, DataFrame):
                return x[column].values
            elif isinstance(x, Series):
                return x.values
            else:
                raise TypeError("each regressor must be a DataFrame or a Series")

        X_column = list(map(_func_x, X))
        X_column.insert(0, np.ones(le_th))
        X_column = np.array(X_column).T
        Y_column = np.array(Y[column].values).T
        print(column)
        for length in range(n, le_th):
            X_temp = X_column[length - n:length]
            try:
                # OLS via the normal equations: (X'X)^-1 X'y
                beta = (inv((X_temp.T).dot(X_temp))).dot(X_temp.T).dot(Y_column[length - n:length])
                betas.append(beta[index])
            except LinAlgError:
                betas.append(np.nan)
        DF[column] = betas
    # Pad each column with NaN so it lines up with the original index.
    for key, value in DF.items():
        if len(value) != le_th - n:
            DF[key] += [np.nan] * (le_th - n - len(DF[key]))
        if len(value) != le_th:
            DF[key] = [np.nan] * n + DF[key]
    df = pd.DataFrame(DF, index=indexes)
    return df

dv.add_field("close_adj")
zz800_benchmark, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date, fields='trade_date,close,open')
dv.data_benchmark = zz800_benchmark[['trade_date', 'close', 'open']].set_index('trade_date')
BenchmarkIndexClose = dv.data_benchmark["close"].loc[dv.get_ts("close_adj").index]
BenchmarkIndexOpen = dv.data_benchmark["open"].loc[dv.get_ts("close_adj").index]
# risk_free_rate=-0.005
M_Return = BenchmarkIndexClose.pct_change(1)
Return = dv.get_ts("close_adj").pct_change(1)
Return_1 = Return.shift(1)
Return_2 = Return.shift(2)
Return_3 = Return.shift(3)
# Setting the window to 252 is the standard, but then it is only 0.89
df = Multi_Regression(1, 252, Return, M_Return, Return_1, Return_2, Return_3)
dv.append_df(df, "HBETA_J", overwrite=True, is_quarterly=False)
HBETA = dv.get_ts("HBETA_J")
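Read off the call Multi_Regression(1, 252, Return, M_Return, Return_1, Return_2, Return_3), the regression fitted in each rolling window appears to be the one below, solved with the normal equations. The symbols α, γ_k and ε are labels assumed here for illustration.

```latex
r_{i,t} = \alpha + \beta_h\, r_{m,t} + \sum_{k=1}^{3} \gamma_k\, r_{i,t-k} + \varepsilon_{i,t},
\qquad
\hat{b} = (X^{\top} X)^{-1} X^{\top} y
```

Since the constant is inserted as column 0 and M_Return is the first regressor passed, index=1 selects β_h.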
Hsigma H010038A
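Factor description: historical sigma (Historical daily sigma). Over the past 12 months, the stock's daily return is regressed on the market portfolio's daily return together with a third-order autoregression; the factor is the residual standard deviation of that regression.
Calculation:
The market portfolio daily return r_m.t is computed from CSI 300 data:
r_m.t=(CloseIndex-PrevCloseIndex)/PrevCloseIndex
p is 4,
and the result is the historical sigma HSIGMA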
# start_date_delta = 500  # at least 2 years of data are needed
def Multi_Regression(n, Y, *X):
    from numpy.linalg import inv, LinAlgError
    from pandas import DataFrame, Series
    import numpy as np
    import pandas as pd
    DF = dict()
    le_th = len(Y)
    columns = Y.columns
    indexes = Y.index
    p = 4
    for column in columns:
        residuals = []

        def _func_x(x):
            if isinstance(x, DataFrame):
                return x[column].values
            elif isinstance(x, Series):
                return x.values
            else:
                raise TypeError("each regressor must be a DataFrame or a Series")

        X_column = list(map(_func_x, X))
        X_column.insert(0, np.ones(le_th))
        X_column = np.array(X_column).T
        Y_column = np.array(Y[column].values).T
        for length in range(n, le_th):
            X_temp = X_column[length - n:length]
            Y_temp = Y_column[length - n:length]
            try:
                beta = (inv((X_temp.T).dot(X_temp))).dot(X_temp.T).dot(Y_temp)
                residual = (Y_temp - (beta).dot(X_temp.T))
                # Residual variance with n - p - 1 degrees of freedom (no square root is taken).
                residual = (residual ** 2).sum() / (n - p - 1)
                residuals.append(residual)
            except LinAlgError:
                residuals.append(np.nan)
        DF[column] = residuals
    # Pad each column with NaN so it lines up with the original index.
    for key, value in DF.items():
        if len(value) != le_th - n:
            DF[key] += [np.nan] * (le_th - n - len(DF[key]))
        if len(value) != le_th:
            DF[key] = [np.nan] * n + DF[key]
    df = pd.DataFrame(DF, index=indexes)
    return df

dv.add_field("close_adj")
Return = dv.get_ts("close_adj").pct_change(1)
Return_1 = Return.shift(1)
Return_2 = Return.shift(2)
Return_3 = Return.shift(3)
zz800_benchmark, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date, fields='trade_date,close,open')
dv.data_benchmark = zz800_benchmark[['trade_date', 'close', 'open']].set_index('trade_date')
BenchmarkIndexClose = dv.data_benchmark["close"].loc[dv.get_ts("close_adj").index]
BenchmarkIndexOpen = dv.data_benchmark["open"].loc[dv.get_ts("close_adj").index]
M_Return = BenchmarkIndexClose.pct_change(1)
df = Multi_Regression(250, Return, Return_1, Return_2, Return_3, M_Return)
dv.append_df(df, "HSIGMA_J", overwrite=True, is_quarterly=False)
HSIGMA = dv.get_ts("HSIGMA_J")
DDNSR H010039A
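Factor description: downside volatility (Downside standard deviations ratio). Over the past 12 months, on days when the market portfolio's daily return is negative, the ratio of the standard deviation of the stock's daily return to that of the market portfolio's daily return.
Calculation:

The market portfolio daily return r_m.t is computed from CSI 300 data, using only days on which the market return is negative:
r_m.t=(CloseIndex-PrevCloseIndex)/PrevCloseIndex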
import pandas as pd
import numpy as np
from numpy.lib.stride_tricks import as_strided as strided

T = 250
benchmark, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date, fields='trade_date,close,open')
benchmark = benchmark[['trade_date', 'close']].set_index('trade_date').pct_change()
benchmark = benchmark[benchmark < 0]

def drop_and_corr(arr):
    # Column 0 is the stock return, column 1 the market return; keep down-market days only.
    arr = arr[arr[:, 1] < 0]
    arr = arr[~np.isnan(arr).any(axis=1)]  # optional
    return arr[:, 0].std(ddof=1)/arr[:, 1].std(ddof=1)

def get_sliding_window(df, W, return2D=0):
    # View the frame as overlapping windows of W rows without copying.
    a = df.values
    s0, s1 = a.strides
    m, n = a.shape
    out = strided(a, shape=(m - W + 1, W, n), strides=(s0, s0, s1))
    if return2D == 1:
        return out.reshape(m - W + 1, -1)
    else:
        return out

def calc_corr(s):
    benchmark_ = benchmark.reindex(index=s.index)
    df = pd.concat([s, benchmark_], axis=1)
    idx = df.index[T - 1:]
    strides = get_sliding_window(df, T)
    result = list(map(lambda x: drop_and_corr(x), strides))
    return pd.Series(result, index=idx)

ret = dv.add_formula("_ret", "Return(close_adj)", is_quarterly=False)
DDNSR = ret.apply(calc_corr)
dv.append_df(DDNSR, 'DDNSR')
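For one window, the ratio reduces to filtering on the sign of the market return and dividing two sample standard deviations. A minimal sketch with hypothetical names and data:

```python
import statistics

def downside_std_ratio(stock, market):
    """std(stock) / std(market), using only days with a negative market return."""
    down_days = [(s, m) for s, m in zip(stock, market) if m < 0]
    stock_std = statistics.stdev([s for s, _ in down_days])
    market_std = statistics.stdev([m for _, m in down_days])
    return stock_std / market_std

market = [-0.01, 0.02, -0.03, 0.01]
stock = [2.0 * m for m in market]     # moves twice as much as the market
ratio = downside_std_ratio(stock, market)
```

Only the first and third days enter the calculation, and the ratio comes out as 2.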
DDNCR H010040A
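Factor description: downside correlation (Downside correlation). Over the past 12 months, on days when the market portfolio's daily return is negative, the correlation coefficient between the stock's daily return and the market portfolio's daily return.
Calculation:

The market portfolio daily return r_m.t is computed from CSI 300 data, using only days on which the market return is negative:
r_m.t=(CloseIndex-PrevCloseIndex)/PrevCloseIndex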
import pandas as pd
import numpy as np
from numpy.lib.stride_tricks import as_strided as strided

T = 250
benchmark, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date, fields='trade_date,close,open')
benchmark = benchmark[['trade_date', 'close']].set_index('trade_date').pct_change()
benchmark = benchmark[benchmark < 0]

def drop_and_corr(arr):
    from scipy.stats import pearsonr
    # Column 0 is the stock return, column 1 the market return; keep down-market days only.
    arr = arr[arr[:, 1] < 0]
    arr = arr[~np.isnan(arr).any(axis=1)]  # optional
    return pearsonr(arr[:, 0], arr[:, 1])[0]

def get_sliding_window(df, W, return2D=0):
    # View the frame as overlapping windows of W rows without copying.
    a = df.values
    s0, s1 = a.strides
    m, n = a.shape
    out = strided(a, shape=(m - W + 1, W, n), strides=(s0, s0, s1))
    if return2D == 1:
        return out.reshape(m - W + 1, -1)
    else:
        return out

def calc_corr(s):
    benchmark_ = benchmark.reindex(index=s.index)
    df = pd.concat([s, benchmark_], axis=1)
    idx = df.index[T - 1:]
    strides = get_sliding_window(df, T)
    result = list(map(lambda x: drop_and_corr(x), strides))
    return pd.Series(result, index=idx)

ret = dv.add_formula("_ret", "Return(close_adj)", is_quarterly=False)
DDNCR = ret.apply(calc_corr)
dv.append_df(DDNCR, 'DDNCR')
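The same down-day filter followed by a Pearson correlation can be written without scipy. This is a hand-rolled sketch for one window; names and data are hypothetical.

```python
import math

def downside_correlation(stock, market):
    """Pearson correlation of stock and market returns on down-market days."""
    down_days = [(s, m) for s, m in zip(stock, market) if m < 0]
    n = len(down_days)
    mean_s = sum(s for s, _ in down_days) / n
    mean_m = sum(m for _, m in down_days) / n
    cov = sum((s - mean_s) * (m - mean_m) for s, m in down_days)
    var_s = sum((s - mean_s) ** 2 for s, _ in down_days)
    var_m = sum((m - mean_m) ** 2 for _, m in down_days)
    return cov / math.sqrt(var_s * var_m)

market = [-0.01, 0.02, -0.03, 0.01, -0.02]
hedge = [-2.0 * m for m in market]    # hypothetical stock that moves against the market
corr = downside_correlation(hedge, market)
```

A series that is an exact negative multiple of the market has a downside correlation of -1.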
Dvrat H010042A
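Factor description: relative return volatility (Daily returns variance ratio-serial dependence in daily returns).
Calculation:
Let r_i.t be the daily return of stock i and r_f.t the daily risk-free return; the stock's daily excess return is then r_i.t - r_f.t.
The relative return volatility can be expressed as:
where:
m=q(T-q+1)(1-q/T)
T is the number of trading days in the past 24 months, q=10
In the code, the daily risk-free return r_f.t is treated as 0. Results with fewer than 180 trading days are discarded.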
dv.add_formula("ret_J", "Return(close_adj)", is_quarterly=False, add_data=True)
T = 500  # number of trading days in the past 24 months
q = 10
# Daily variance term: sum of squared returns over T days divided by T - 1.
dv.add_formula("sigma_squared_J", "Ts_Sum(Pow(ret_J, 2), %s)/(%s -1)" % (T, T), is_quarterly=False, add_data=False)
m = q*(T-q+1)*(1-q/T)
# Squared q-day return sums, then their sum over T - q days divided by m.
dv.add_formula("sigma_q_tmp", "Pow(Ts_Sum(ret_J, %s), 2)" % q, is_quarterly=False, add_data=True)
sigma_q = dv.add_formula("sigma_q", "Ts_Sum(sigma_q_tmp, %s)/%s" % (T-q, m), is_quarterly=False)
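The two variance terms can be combined into a variance ratio on a plain list of returns. The final step used below, sigma_q^2 / sigma^2 - 1, is the usual Barra-style definition of this descriptor and is an assumption here, since the source text does not show the combined formula; the function name is hypothetical.

```python
def dvrat(returns, q=10):
    """Variance ratio: sigma_q^2 / sigma^2 - 1, with zero risk-free return."""
    T = len(returns)
    sigma2 = sum(r * r for r in returns) / (T - 1)
    m = q * (T - q + 1) * (1 - q / T)
    # Sum of squared q-day return sums over every full window.
    sigma2_q = sum(sum(returns[t - q + 1:t + 1]) ** 2 for t in range(q - 1, T)) / m
    return sigma2_q / sigma2 - 1

# Perfectly mean-reverting returns: every 2-day sum is zero.
alternating = [0.01, -0.01] * 10
ratio = dvrat(alternating, q=2)
```

Strong negative serial dependence drives the ratio toward -1, while positively autocorrelated returns push it above 0.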
Ddnbt H010043A
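Factor description: downside beta (Downside beta). Over the past 12 months, on days when the market portfolio's daily return is negative, the regression coefficient of the stock's daily return on the market portfolio's daily return.
Calculation:
The market portfolio daily return r_m.t is computed from CSI 300 data, using only days on which the market return is negative:
r_m.t=(CloseIndex-PrevCloseIndex)/PrevCloseIndex
The β_d in the regression result is the downside beta DDNBT.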
import pandas as pd
import numpy as np
from numpy.lib.stride_tricks import as_strided as strided

benchmark, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date, fields='trade_date,close,open')
benchmark = benchmark[['trade_date', 'close']].set_index('trade_date').pct_change()

def np_regr(A, y, residuals=False):
    # Least squares of y on A with an intercept prepended as the first column.
    if A.ndim == 1:
        A = A[:, None]
    A = np.hstack((np.ones(len(A))[:, None], A))
    betas = np.linalg.lstsq(A, y)[0]
    if residuals:
        return y - A @ betas
    return betas

def drop_and_regr(arr, residuals=False, drop_first=True):
    arr = arr[~np.isnan(arr).any(axis=1)]
    if arr.size == 0:
        return np.nan
    # A further condition could skip windows with too few valid trading days in the past 12 months.
    # Downside-beta condition: keep only days with a negative market return.
    arr = arr[arr[:, 1] < 0]
    if arr.size == 0:
        return np.nan
    return np_regr(arr[:, 1:], arr[:, :1], residuals=residuals)[1].item()

def get_sliding_window(df, W, return2D=0):
    # View the frame as overlapping windows of W rows without copying.
    a = df.values
    s0, s1 = a.strides
    m, n = a.shape
    out = strided(a, shape=(m - W + 1, W, n), strides=(s0, s0, s1))
    if return2D == 1:
        return out.reshape(m - W + 1, -1)
    else:
        return out

ret = dv.add_formula("ret", "Return(close_adj)", is_quarterly=False, add_data=True)

def calc_regr(stock_ret):
    """The rolling is done inside this function."""
    X = pd.concat([stock_ret, benchmark], axis=1)
    idx = X.index[252 - 1:]
    strides = get_sliding_window(X, 252)
    result = list(map(lambda x: drop_and_regr(x), strides))
    return pd.Series(result, index=idx)

Ddnbt = ret.apply(calc_regr)
dv.append_df(Ddnbt, 'Ddnbt')
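With a single regressor, the slope of the down-day regression has a closed form: the covariance of stock and market returns divided by the market variance, both taken over down-market days only. A minimal sketch for one window, with hypothetical names and data:

```python
def downside_beta(stock, market):
    """OLS slope of stock on market returns, restricted to down-market days."""
    down_days = [(s, m) for s, m in zip(stock, market) if m < 0]
    n = len(down_days)
    mean_s = sum(s for s, _ in down_days) / n
    mean_m = sum(m for _, m in down_days) / n
    cov = sum((s - mean_s) * (m - mean_m) for s, m in down_days)
    var = sum((m - mean_m) ** 2 for _, m in down_days)
    return cov / var

market = [-0.01, 0.02, -0.03, 0.01, -0.02]
stock = [0.002 + 1.5 * m for m in market]
beta_d = downside_beta(stock, market)
```

The intercept 0.002 drops out of the slope, leaving a downside beta of 1.5.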
Tobt H010044A
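Factor description: excess liquidity (Liquidity-turnover beta).
Calculation:
(a) Let r_i.t be the daily return of stock i, r_m.t the daily return of the market portfolio, and r_f.t the daily risk-free return; the daily excess returns of the stock and of the market are then r_i.t - r_f.t and r_m.t - r_f.t.
The market portfolio daily return r_m.t is computed from CSI 300 data:
r_m.t=(CloseIndex-PrevCloseIndex)/PrevCloseIndex
In the code, the daily risk-free return r_f.t is treated as 0.
(b) The daily turnover rate can be looked up directly, or computed by formula.
Where a value is missing, the total share capital is used instead.
(c) The absolute daily excess return is regressed on the turnover rate, on five lags of the absolute market portfolio daily return, and on five lags of itself.
The coefficient βi on the daily turnover rate in the regression result is the excess liquidity TOBT.
Results with fewer than 180 trading days are discarded.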
import pandas as pd
import numpy as np
from numpy.linalg import inv, LinAlgError

zz800_benchmark, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date, fields='trade_date,close,open')
dv.data_benchmark = zz800_benchmark[['trade_date', 'close', 'open']].set_index('trade_date')
BenchmarkIndexClose = dv.data_benchmark["close"].loc[dv.get_ts("close_adj").index]
BenchmarkIndexOpen = dv.data_benchmark["open"].loc[dv.get_ts("close_adj").index]
# risk_free_rate=-0.005
# Market return and its five lags.
M_Return = BenchmarkIndexClose.pct_change(1)
M_Return_1 = M_Return.shift(1)
M_Return_2 = M_Return.shift(2)
M_Return_3 = M_Return.shift(3)
M_Return_4 = M_Return.shift(4)
M_Return_5 = M_Return.shift(5)
dv.add_field("turnover_ratio")
TORate = dv.add_formula("TORate", "turnover_ratio/100", add_data=True, is_quarterly=False)
# Stock return and its five lags.
Return = dv.add_formula("Close_Return", "Return(close_adj)", add_data=True, is_quarterly=False)
Return_1 = Return.shift(1)
Return_2 = Return.shift(2)
Return_3 = Return.shift(3)
Return_4 = Return.shift(4)
Return_5 = Return.shift(5)
DF = dict()
window = 498
le_th = len(Return)
indexes = Return.index
betas = []
for column in Return.columns:
    # Design matrix: constant, turnover, five own lags, five market lags (all in absolute value).
    X = np.array([np.ones(le_th), TORate[column].abs().values, Return_1[column].abs().values, Return_2[column].abs().values,
                  Return_3[column].abs().values, Return_4[column].abs().values, Return_5[column].abs().values,
                  M_Return_1.abs().values, M_Return_2.abs().values, M_Return_3.abs().values, M_Return_4.abs().values,
                  M_Return_5.abs().values]).T
    Y = np.array(Return[column].abs().values).T
    betas = []
    print(column)
    for length in range(window, len(Y)):
        X_temp = X[length - window:length]
        try:
            beta = (inv((X_temp.T).dot(X_temp))).dot(X_temp.T).dot(Y[length - window:length])
            # Coefficient 1 is the turnover-rate coefficient.
            betas.append(beta[1])
        except LinAlgError:
            betas.append(np.nan)
    DF[column] = betas
# Pad each column with NaN so it lines up with the original index.
for key, value in DF.items():
    if len(value) != le_th - window:
        DF[key] += [np.nan] * (le_th - window - len(DF[key]))
    if len(value) != le_th:
        DF[key] = [np.nan] * window + DF[key]
df = pd.DataFrame(DF, index=indexes)
dv.append_df(df, "TOBT", overwrite=True, is_quarterly=False)
TOBT = dv.get_ts("TOBT")
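Judging by the columns of the design matrix in the code, the regression fitted in each 498-day window appears to be the following. The symbols TO for the turnover rate and α, γ_k, δ_k, ε for the remaining terms are labels assumed here, and the excess returns equal the raw returns because the risk-free return is taken as 0.

```latex
\lvert r_{i,t} \rvert = \alpha + \beta_i\, \mathrm{TO}_{i,t}
  + \sum_{k=1}^{5} \gamma_k\, \lvert r_{i,t-k} \rvert
  + \sum_{k=1}^{5} \delta_k\, \lvert r_{m,t-k} \rvert
  + \varepsilon_{i,t}
```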
Skewness H010045A
Factor description: price skewness (Skewness of price during the last 20 days), the skewness of the stock price over the past 20 trading days.
Calculation:

SKEWNESS_J = dv.add_formula('SKEWNESS_J', "Ts_Skewness(close_adj,{})".format(20), is_quarterly=False, add_data=False)
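Skewness is the third central moment divided by the 1.5th power of the variance. The sketch below uses population moments; whether Ts_Skewness applies a bias correction is not stated in the source, so that choice is an assumption. Note that this factor is applied to prices, not returns.

```python
def skewness(values):
    """Third central moment over variance ** 1.5 (population moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

symmetric = skewness([1.0, 2.0, 3.0])      # evenly spaced prices
right_tailed = skewness([1.0, 1.0, 4.0])   # one high outlier
```

A symmetric window gives zero, and a single high price produces positive skew.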