@Channelchan
2018-10-23T18:07:28.000000Z
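To register a factor in the data view, change add_data=False to add_data=True. The factor name can be changed freely. Where close_adj is not available in the data, use close instead; the same applies to high and low. N is a parameter you can set yourself.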
Variance20 H010001A
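Factor description: variance of daily returns over 20 days.
Calculation:
StdDev(Return(close,1),N)^2*250
N = 20, 60, 120, etc.
Note: the factor value is annualized, i.e. the daily variance multiplied by 250.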
Variance20_J = dv.add_formula('Variance20_J', 'StdDev(Return(close,1),20)^2*250' ,
is_quarterly=False, add_data=False)
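As a library-independent check of the formula, the sketch below computes the same quantity with NumPy for a single price series. The function name `annualized_variance` and the sample prices are illustrative assumptions, and it uses the sample standard deviation (ddof=1); which convention StdDev uses inside the formula engine is not stated in this post.

```python
import numpy as np

def annualized_variance(close, n=20, periods_per_year=250):
    """Variance of the last n daily simple returns, scaled to a year."""
    close = np.asarray(close, dtype=float)
    returns = close[1:] / close[:-1] - 1.0      # same idea as Return(close, 1)
    window = returns[-n:]                        # most recent n daily returns
    return float(np.var(window, ddof=1) * periods_per_year)

prices = [10.0, 10.1, 10.0, 10.2, 10.3, 10.1]   # made-up closing prices
print(annualized_variance(prices, n=5))
```

Changing `n` to 60 or 120 gives the longer-window variants.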
Kurtosis20 H010004A
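Factor description: 20-day kurtosis of the stock's returns.
Calculation:
Ts_Kurtosis(Return(close_adj,1),N)
or
Ts_Kurtosis(((close_adj-Delay(close_adj,1))/Delay(close_adj,1)),N)
Either form works. N = 20, 60, 120, etc.
where:
r is the daily return
σ is the standard deviation of returns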
Kurtosis20_j = dv.add_formula('Kurtosis20_j', 'Ts_Kurtosis(Return(close_adj,1),20)' ,
is_quarterly=False, add_data=False)
Alpha20 H010007A
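Factor description: 20-day Jensen's alpha.
Calculation:
alpha=(E(r)-rf)-beta*E(rm-rf), where r is the daily return, rf the risk-free return, rm the index return, and beta the 20-day beta of returns, beta=cov(r,rm)/var(rm)
betaN_J=Covariance(r_J,nr,N)/(StdDev(nr,N)^2)
AlphaN=(Ts_Mean(r_J-0.01,N) - betaN_J*(Ts_Mean((nr-0.01),N)))*250
N can be changed in the two formulas above.
N = 20, 60, 120, etc.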
hs300_close = dv.data_api.daily('000300.SH', dv.extended_start_date_d, dv.end_date, fields="close",
adjust_mode=None)
hs300_benchmark = hs300_close[0][['trade_date', 'close']].set_index('trade_date')
dv.add_field("close")
hs300 = 0 * dv.get_ts('close')
for i in range(hs300.shape[1]):
    hs300.iloc[:, i] = hs300_benchmark
dv.append_df(hs300, 'hs300')
nr = dv.add_formula('nr', '(hs300-Delay(hs300,1))/Delay(hs300,1)', is_quarterly=False, add_data=True)
r_J = dv.add_formula('r_J', '(close-Delay(close,1))/Delay(close,1)', is_quarterly=False, add_data=True)
beta20_J = dv.add_formula('beta20_J', 'Covariance(r_J,nr,20)/(StdDev(nr,20)^2)' , is_quarterly=False,add_data=True)
Alpha20_J = dv.add_formula('Alpha20_A',"(Ts_Mean(r_J-0.01,20) - beta20_J*(Ts_Mean((nr-0.01),20)))*250", is_quarterly=False, add_data=False)
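The beta and alpha formulas can be checked on plain arrays. This is a minimal sketch, assuming two aligned arrays of daily returns; `beta_and_alpha` and the `rf_daily` parameter are hypothetical names, and the risk-free rate defaults to 0 here rather than the 0.01 constant used in the formula strings.

```python
import numpy as np

def beta_and_alpha(stock_ret, market_ret, rf_daily=0.0, periods_per_year=250):
    """Return (beta, annualized Jensen's alpha) for one window of daily returns."""
    r = np.asarray(stock_ret, dtype=float)
    rm = np.asarray(market_ret, dtype=float)
    # beta = cov(r, rm) / var(rm), both with the same ddof
    beta = np.cov(r, rm, ddof=1)[0, 1] / np.var(rm, ddof=1)
    # alpha = (E(r) - rf) - beta * E(rm - rf), scaled to a year
    alpha = ((r.mean() - rf_daily) - beta * (rm.mean() - rf_daily)) * periods_per_year
    return float(beta), float(alpha)

market = [0.01, -0.02, 0.03, 0.0]
stock = [0.021, -0.039, 0.061, 0.001]    # 2 * market + 0.001 per day
print(beta_and_alpha(stock, market))
```

For a 20-day factor, pass the most recent 20 observations of each series.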
Beta20 H010010A
Factor description: 20-day beta.
Calculation:
r is the daily return, rm the index return, and beta the 20-day beta of returns, beta=cov(r,rm)/var(rm)
betaN_J=Covariance(r_J,nr,N)/(StdDev(nr,N)^2)
N = 20, 60, 120, 250, etc.
hs300, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date, fields='trade_date,close,open')
hs300_benchmark = hs300[['trade_date', 'close']].set_index('trade_date')
dv.add_field("close")
hs300 = 0 * dv.get_ts('close')
for i in range(hs300.shape[1]):
    hs300.iloc[:, i] = hs300_benchmark
dv.append_df(hs300, 'hs300')
nr = dv.add_formula('nr', '(hs300-Delay(hs300,1))/Delay(hs300,1)', is_quarterly=False, add_data=True)
r_J = dv.add_formula('r_J', '(close_adj-Delay(close_adj,1))/Delay(close_adj,1)', is_quarterly=False, add_data=True)
beta20_J =dv.add_formula('beta20_J' , 'Covariance(r_J,nr,20)/(StdDev(nr,20)^2)', is_quarterly=False,add_data=True)
SharpeRatio20 H010014A
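Factor description: 20-day Sharpe ratio, the excess return earned per unit of total risk. It lets you weigh a strategy's return and risk together.
Calculation:
(Ts_Mean(close_ret,N)*250-0.03)/StdDev(close_ret,N)/Sqrt(250)
N = 20, 60, 120, etc.
where:
E(r) is the expected return, equal to the mean daily return * 250
r_f is the risk-free rate, taken as of the calculation date (same below)
σ is the standard deviation of returns, equal to the daily standard deviation * sqrt(250)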
dv.add_formula("close_ret", "Return(close_adj,1)", is_quarterly=False, add_data=True)
SharpeRatio20 = dv.add_formula('SharpeRatio20_J', "(Ts_Mean(close_ret,20)*250-0.03)/StdDev(close_ret,20)/Sqrt(250)",is_quarterly=False,add_data=True)
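The annualization in the formula string can be verified on a single return series. The sketch below is an illustration only: `sharpe_ratio` is a hypothetical helper, the 3% annual risk-free rate mirrors the 0.03 constant above, and ddof=1 is an assumption about StdDev.

```python
import numpy as np

def sharpe_ratio(daily_returns, rf_annual=0.03, periods_per_year=250):
    """Annualized Sharpe ratio of one window of daily returns."""
    r = np.asarray(daily_returns, dtype=float)
    annual_return = r.mean() * periods_per_year                 # E(r)
    annual_vol = r.std(ddof=1) * np.sqrt(periods_per_year)      # sigma
    return float((annual_return - rf_annual) / annual_vol)

print(sharpe_ratio([0.01, -0.01, 0.02, 0.0]))
```

Dividing by StdDev and then by Sqrt(250), as the formula string does, is the same as dividing by the annualized standard deviation.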
TreynorRatio20 H010017A
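Factor description: 20-day Treynor ratio, a measure of return per unit of systematic risk.
Calculation:
TR = (E(r)-Rf)/β
r is the daily return, E(r) the expected return, Rf the risk-free return, and β the beta of the returns
The factor value is annualized, i.e. the daily value multiplied by 250.
betaN_J=Covariance(r_J,nr,N)/(StdDev(nr,N)^2)
TRN_J=(250*(Ts_Mean(r_J,N))-0.03)/betaN_J
N = 20, 60, 120, etc.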
import numpy as np
hs300, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date, fields='trade_date,close,open')
hs300_benchmark = hs300[['trade_date', 'close']].set_index('trade_date')
hs300 = 0 * dv.get_ts('close')
for i in range(hs300.shape[1]):
    hs300.iloc[:, i] = hs300_benchmark
dv.append_df(hs300, 'hs300')
dv.add_formula('nr', '(hs300-Delay(hs300,1))/Delay(hs300,1)'
, is_quarterly=False, add_data=True)
dv.add_formula('r_J','(close_adj-Delay(close_adj,1))/Delay(close_adj,1)'
, is_quarterly=False, add_data=True)
beta20_J =dv.add_formula('beta20_J' , 'Covariance(r_J,nr,20)/(StdDev(nr,20)^2)', is_quarterly=False,
add_data=True)
TR20_J = dv.add_formula('TR20_J','(250*(Ts_Mean(r_J,20))-0.03)/beta20_J' ,
is_quarterly=False, add_data=True)
InformationRatio20 H010020A
Factor description: 20-day information ratio.
Calculation:
Ts_Mean(r_J - nr,N)/StdDev(r_J - nr,N)
N = 20, 60, 120, etc.
where:
r is the daily return
r_M is the index return, using the CSI 300 index
import numpy as np
hs300, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date, fields='trade_date,close,open')
hs300_benchmark = hs300[['trade_date', 'close']].set_index('trade_date')
hs300 = 0 * dv.get_ts('close')
for i in range(hs300.shape[1]):
    hs300.iloc[:, i] = hs300_benchmark
dv.append_df(hs300, 'hs300')
nr = dv.add_formula('nr', '(hs300-Delay(hs300,1))/Delay(hs300,1)', is_quarterly=False, add_data=True)
r_J = dv.add_formula('r_J', '(close_adj-Delay(close_adj,1))/Delay(close_adj,1)', is_quarterly=False, add_data=True)
IR20_J=dv.add_formula('IR20_J' , 'Ts_Mean(r_J - nr,20)/StdDev(r_J - nr,20)' , is_quarterly=False,
add_data=True)
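For reference, the information ratio formula reduces to a few lines on plain arrays. This is a sketch under the assumption that both inputs are aligned daily returns over the same window; `information_ratio` is an illustrative name, and like the formula string it is not annualized.

```python
import numpy as np

def information_ratio(stock_ret, bench_ret):
    """Mean active return divided by its standard deviation (tracking error)."""
    active = np.asarray(stock_ret, dtype=float) - np.asarray(bench_ret, dtype=float)
    return float(active.mean() / active.std(ddof=1))

print(information_ratio([0.02, 0.01, 0.03], [0.01, 0.01, 0.01]))
```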
GainVariance20 H010023A
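Factor description: 20-day gain variance. It is similar to variance, but measures only the days with positive returns.
Calculation:
GainVariance20_J = pd.DataFrame({name: value.dropna().rolling(N).std() ** 2 for name, value in pct_return.items()},index=pct_return.index).ffill()
N = 20, 60, 120, etc.
where:
r is the daily return
Note: the factor is defined as an annualized value, i.e. the daily value * 250. The code below computes the daily value, so multiply the result by 250 to annualize it.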
import pandas as pd
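def cal_positive(df):
    # keep positive returns only; all other entries become NaN
    return df[df > 0]
dv.add_field("close_adj")
pct_return = cal_positive(dv.get_ts('close_adj').pct_change())
# items() replaces iteritems(), and ffill() replaces fillna(method='ffill'), which newer pandas no longer accepts
GainVariance20_J = pd.DataFrame(
    {name: value.dropna().rolling(20).std() ** 2 for name, value in pct_return.items()},
    index=pct_return.index).ffill()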
dv.append_df(GainVariance20_J, 'GainVariance20_J')
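One detail of the pandas code is easy to miss: because non-positive days are dropped before rolling, the window covers the last N positive-return days rather than the last N calendar days. The NumPy sketch below makes that explicit for one series; `gain_variance` is a hypothetical helper, and the annualization factor is included as described in the note above.

```python
import numpy as np

def gain_variance(daily_returns, n=20, periods_per_year=250):
    """Annualized variance of the last n positive daily returns."""
    r = np.asarray(daily_returns, dtype=float)
    gains = r[r > 0]                 # NaN and non-positive days are dropped
    if gains.size < n:
        return float("nan")          # not enough positive days yet
    return float(np.var(gains[-n:], ddof=1) * periods_per_year)

print(gain_variance([0.01, -0.02, 0.03, -0.04, 0.0], n=2))
```

Flipping the comparison to `r < 0` gives the matching loss variance.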
LossVariance20 H010026A
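Factor description: 20-day loss variance. It is similar to variance, but measures only the days with negative returns.
Calculation:
LossVariance20_A = pd.DataFrame({name: value.dropna().rolling(N).std() ** 2 for name, value in pct_return.items()},index=pct_return.index).ffill()
N = 20, 60, 120, etc.
where:
r is the daily return
Note: the factor is defined as an annualized value, i.e. the daily value * 250. The code below computes the daily value, so multiply the result by 250 to annualize it.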
import pandas as pd
cal_negative = lambda df: df[df < 0]
dv.add_field("close_adj")
pct_return = cal_negative(dv.get_ts('close_adj').pct_change())
LossVariance20_A = pd.DataFrame(
    {name: value.dropna().rolling(20).std() ** 2 for name, value in pct_return.items()},
    index=pct_return.index).ffill()
GainLossVarianceRatio20 H010029A
Factor description: 20-day ratio of gain variance to loss variance.
Calculation:
GainVarianceN_J/LossVarianceN_J
N = 20, 60, 120, etc.
where:
r is the daily return
import pandas as pd
def cal_negative(df):
    # keep negative returns only; all other entries become NaN
    return df[df < 0]
dv.add_field("close_adj")
pct_return = cal_negative(dv.get_ts('close_adj').pct_change())
LossVariance20_J = pd.DataFrame(
    {name: value.dropna().rolling(20).std() ** 2 for name, value in pct_return.items()},
    index=pct_return.index).ffill()
dv.append_df(LossVariance20_J, 'LossVariance20_J')
def cal_positive(df):
    # keep positive returns only; all other entries become NaN
    return df[df > 0]
pct_return = cal_positive(dv.get_ts('close_adj').pct_change())
GainVariance20_J = pd.DataFrame(
    {name: value.dropna().rolling(20).std() ** 2 for name, value in pct_return.items()},
    index=pct_return.index).ffill()
dv.append_df(GainVariance20_J, 'GainVariance20_J')
GainlossVarianceratio20 = dv.add_formula('GainlossVarianceratio20_J', "GainVariance20_J/LossVariance20_J",is_quarterly=False, add_data=True)
dv.append_df(GainlossVarianceratio20, 'GainlossVarianceratio20_J')
RealizedVolatility H010032A
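Factor description: realized volatility, the standard deviation of intraday 5-minute returns.
Calculation: use the close of each 5-minute bar to compute the 5-minute returns, then take the standard deviation of those returns within the day.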
import pandas as pd
def get_daily_value(date):
    print(date)
    data, msg = dv.data_api.bar(",".join(dv.symbol),
                                trade_date=date, freq="5M")
    try:
        data = data.dropna().pivot(index="time", columns="symbol", values="close")
        data = data.groupby(data.index // 500).first()
    except ValueError:
        print(date)
        raise
    # standard deviation of the 5-minute returns (not of the prices), as described above
    return data.pct_change().std().rename(date)
# Run time depends heavily on how fast the bar requests are served.
dv.add_field("close")
dates = list(dv.get_ts("close").index)
result = pd.concat(map(get_daily_value, dates), axis=1).T
dv.append_df(df=result, field_name="RealizedVolatility", is_quarterly=False)
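The per-day calculation itself is small once the bars are in hand. The sketch below assumes a plain sequence of 5-minute closing prices for one stock and one day; `realized_volatility` is an illustrative name, and ddof=1 matches the default of pandas `.std()`.

```python
import numpy as np

def realized_volatility(bar_close):
    """Standard deviation of bar-to-bar simple returns within one day."""
    close = np.asarray(bar_close, dtype=float)
    bar_returns = close[1:] / close[:-1] - 1.0   # one return per 5-minute step
    return float(bar_returns.std(ddof=1))

bars = [10.00, 10.02, 10.01, 10.05, 10.03]       # made-up 5-minute closes
print(realized_volatility(bars))
```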
DASTD H010033A
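Factor description: 252-day standard deviation of excess returns.
Calculation:
DASTD=std(r-rf)
r is the daily return and rf the risk-free return; the half-life is 42 trading days.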
r_J = dv.add_formula('r_J', '(close_adj-Delay(close_adj,1))/Delay(close_adj,1)', is_quarterly=False, add_data=True)
# dv.append_df(r_J, 'r_J')
# If a fixing rate is used as the risk-free rate, convert between daily and annual rates first.
dastd = (r_J).ewm(halflife=42).std(bias=False)
dv.append_df(dastd, 'dastd')
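To make the half-life weighting concrete, here is a sketch that builds the weights by hand over a fixed 252-day window. It is an assumption-laden illustration rather than a replica of pandas `ewm`: the helper names are hypothetical, the risk-free rate is taken as 0, and no bias correction is applied, so the values will differ slightly from `ewm(...).std()`.

```python
import numpy as np

def half_life_weights(window=252, half_life=42):
    """Weights that halve every `half_life` days, oldest observation first."""
    ages = np.arange(window - 1, -1, -1)     # age in days; 0 is the latest day
    w = 0.5 ** (ages / half_life)
    return w / w.sum()                       # normalize so weights sum to 1

def weighted_std(returns, window=252, half_life=42):
    """Exponentially weighted standard deviation of the last `window` returns."""
    r = np.asarray(returns, dtype=float)[-window:]
    w = half_life_weights(len(r), half_life)
    mean = np.sum(w * r)
    return float(np.sqrt(np.sum(w * (r - mean) ** 2)))

rng = np.random.default_rng(0)
print(weighted_std(rng.normal(0.0, 0.02, size=300)))
```

An observation 42 days old gets half the weight of the latest one, which is what the half-life in the factor definition means.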
HsigmaCNE5 H010034A
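Factor description: 252-day volatility of residual returns.
Calculation:
HsigmaCNE5=std(ei)
ei is the residual return; 252 trading days are used in total, with a half-life of 63 trading days.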
start_date_delta = 300
hs300, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date, fields='trade_date,close')
hs300_benchmark = hs300[['trade_date', 'close']].set_index('trade_date').pct_change()
from numpy.lib.stride_tricks import as_strided as strided
import numpy as np
import pandas as pd
def get_sliding_window(df, W, return2D=0):
    # build overlapping windows of W rows without copying the data
    a = df.values
    s0, s1 = a.strides
    m, n = a.shape
    out = strided(a, shape=(m - W + 1, W, n), strides=(s0, s0, s1))
    if return2D == 1:
        return out.reshape(m - W + 1, -1)
    else:
        return out
def _beta(stock_return, universe_return, trailing_days=252, half_life=63):
    # universe_return = hs300_benchmark.reindex(index=)
    weights = np.sqrt(exponential_weight(trailing_days, half_life))
    coef, _, resid = wls_by_numpy(universe_return, stock_return, weights)
    return coef, np.std(resid)  # np.sqrt(np.sum(resid**2)/250)
def wls_by_numpy(x, y, w):
    # weighted least squares: scale both sides by w, then solve ordinary least squares
    A = np.vstack([x, np.ones(len(x))]).T * w.reshape(-1, 1)
    _y = y * w
    m, c = np.linalg.lstsq(A, _y)[0]
    resid = y - (m * x + c)
    return m, c, resid
def exponential_weight(trailing_days=252, half_life=63):
    _s = np.flip(np.arange(0, trailing_days), axis=0) / half_life
    return np.power(0.5, _s)
def calc_hsigmacne5(s):
    idx = s.index
    s = s.dropna()
    if s.size == 0 or len(s) < 252:
        # not enough history: return an all-NaN column aligned to the original index
        return pd.Series(np.nan, index=idx)
    concated = pd.concat([s, hs300_benchmark.reindex(index=s.index)], axis=1)
    strides = get_sliding_window(concated, 252)
    hsigma = pd.Series(list(map(lambda x: _beta(x[:, 0], x[:, 1])[1], strides)),
                       index=s.index[252 - 1:]).reindex(index=idx)
    return hsigma
ret_ = dv.add_formula("ret_", "Return(close_adj)", is_quarterly=False)
hsigmacne5 = ret_.apply(calc_hsigmacne5)
dv.append_df(hsigmacne5, 'hsigmacne5')
CmraCNE5 H010035A
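Factor description: 12-month cumulative return range (Monthly cumulative return range over the past 12 months).
Calculation:
The code below builds Z(T), the cumulative sum of monthly log returns over the first T months (T = 1..12), and takes CMRA = ln((1+max Z(T))/(1+min Z(T))).
where: rf is the risk-free return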
def run_formula(dv):
    ret_ = dv.add_formula("ret", "Return(close_adj)", is_quarterly=False)
    month_days = 21
    # The risk-free rate is treated as 0 for now.
    def get_sliding_window(s, W):
        """
        input: np.arange(20), W=4
        output:
        array([[ 0, 1, 2, 3],
        [ 4, 5, 6, 7],
        [ 8, 9, 10, 11],
        [12, 13, 14, 15],
        [16, 17, 18, 19]])
        """
        from numpy.lib.stride_tricks import as_strided as strided
        assert len(s) % W == 0
        strides = s.strides
        assert len(strides) == 1
        strides = strides[0]
        return strided(s, shape=(int(len(s) / W), W), strides=(W * strides, strides))
    def calc_range(s, days=month_days, allrange=month_days * 12):
        """
        Missing values still need to be handled.
        """
        def get_max_and_min(s):
            # print(s)
            from numpy import log
            s = s + 1
            out = get_sliding_window(s, days)
            out = out.prod(axis=1)
            # return out.max()-out.min()  # close to this simpler range, so the uqer definition looks arbitrary
            Z_T = log(out).cumsum()  # the risk-free return is not subtracted here
            return log((1 + Z_T.max()) / (1 + Z_T.min()))
        # raw=True passes a NumPy array, which get_sliding_window needs for .strides
        return s.rolling(allrange).apply(get_max_and_min, raw=True)
    CmraCNE5 = ret_.apply(calc_range)
    dv.append_df(CmraCNE5, 'CmraCNE5')
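The same calculation can be written without stride tricks for a single 252-day window, which makes the steps easier to follow. This is a sketch under the same simplification as the code above (risk-free rate treated as 0); the function name `cmra` and the simulated returns are assumptions for illustration.

```python
import numpy as np

def cmra(daily_returns, month_days=21, months=12):
    """Cumulative range of monthly log returns over months * month_days days."""
    r = np.asarray(daily_returns, dtype=float)[-month_days * months:]
    if r.size != month_days * months:
        raise ValueError("need at least month_days * months daily returns")
    # log return of each 21-day 'month'
    monthly_log = np.log1p(r).reshape(months, month_days).sum(axis=1)
    z = np.cumsum(monthly_log)               # Z(T) for T = 1..months
    return float(np.log((1 + z.max()) / (1 + z.min())))

rng = np.random.default_rng(1)
print(cmra(rng.normal(0.0005, 0.02, size=252)))
```

A stock whose cumulative return swings widely within the year gets a large value; a flat series gives 0.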
Cmra H010036A
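Factor description: 24-month cumulative return range (Monthly cumulative return range over the past 24 months).
Calculation:
Days with zero trading volume are excluded from the calculation.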
CMRA = dv.add_formula('CMRA_J',"Log(Ts_Max(close_adj,475)/Ts_Min(close_adj,475))"
, is_quarterly=False, add_data=False)
Hbeta H010037A
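Factor description: historical beta (Historical daily beta). Over the past 12 months, the stock's daily return is regressed on the market portfolio's daily return together with three of its own lags (a third-order autoregression); the factor is the coefficient on the market return.
The residual variance of the regression is divided by the degrees of freedom.
Calculation:
The market portfolio's daily return r_m.t is computed from CSI 300 data:
r_m.t=(CloseIndex-PrevCloseIndex)/PrevCloseIndex
The β_h from the regression is the historical beta HBETA.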
def Multi_Regression(index, n, Y, *X):
    """
    index: position of the coefficient to return, starting at 0 (0 is the intercept, 1 is the first regressor)
    n: rolling window length in days
    Y: DataFrame of the dependent variable
    *X: the regressors, passed as a list, set or tuple; each element is one matrix
    """
    from numpy.linalg import inv, LinAlgError
    from pandas import DataFrame, Series
    import numpy as np
    import pandas as pd
    DF = dict()
    le_th = len(Y)
    columns = Y.columns
    indexes = Y.index
    for column in columns:
        betas = []
        def _func_x(x):
            if isinstance(x, DataFrame):
                return x[column].values
            elif isinstance(x, Series):
                return x.values
            else:
                raise TypeError("each regressor must be a DataFrame or a Series")
        X_column = list(map(_func_x, X))
        X_column.insert(0, np.ones(le_th))
        X_column = np.array(X_column).T
        Y_column = np.array(Y[column].values).T
        print(column)
        for length in range(n, le_th):
            X_temp = X_column[length - n:length]
            try:
                # ordinary least squares via the normal equations
                beta = (inv((X_temp.T).dot(X_temp))).dot(X_temp.T).dot(Y_column[length - n:length])
                betas.append(beta[index])
            except LinAlgError:
                betas.append(np.nan)
        DF[column] = betas
    # pad each column with NaN so it lines up with the original index
    for key, value in DF.items():
        if len(value) != le_th - n:
            DF[key] += [np.nan] * (le_th - n - len(DF[key]))
        if len(value) != le_th:
            DF[key] = [np.nan] * n + DF[key]
    df = pd.DataFrame(DF, index=indexes)
    return df
dv.add_field("close_adj")
zz800_benchmark, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date,
fields='trade_date,close,open')
dv.data_benchmark = zz800_benchmark[['trade_date', 'close', 'open']].set_index('trade_date')
BenchmarkIndexClose = dv.data_benchmark["close"].loc[dv.get_ts("close_adj").index]
BenchmarkIndexOpen = dv.data_benchmark["open"].loc[dv.get_ts("close_adj").index]
# risk_free_rate=-0.005
M_Return = BenchmarkIndexClose.pct_change(1)
Return = dv.get_ts("close_adj").pct_change(1)
Return_1 = Return.shift(1)
Return_2 = Return.shift(2)
Return_3 = Return.shift(3)
# A window of 252 is the standard setting, but it only gives 0.89.
df = Multi_Regression(1, 252, Return, M_Return, Return_1, Return_2, Return_3)
dv.append_df(df, "HBETA_J", overwrite=True, is_quarterly=False)
HBETA = dv.get_ts("HBETA_J")
Hsigma H010038A
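Factor description: historical sigma (Historical daily sigma). Over the past 12 months, the stock's daily return is regressed on the market portfolio's daily return together with three of its own lags (a third-order autoregression); the factor is the standard deviation of the regression residuals.
Calculation:
The market portfolio's daily return r_m.t is computed from CSI 300 data:
r_m.t=(CloseIndex-PrevCloseIndex)/PrevCloseIndex
With p = 4 regressors, the result is the historical sigma HSIGMA.
# start_date_delta = 500 # at least 2 years of data are needed for the calculation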
def Multi_Regression(n, Y, *X):
    from numpy.linalg import inv, LinAlgError
    from pandas import DataFrame, Series
    import numpy as np
    import pandas as pd
    DF = dict()
    le_th = len(Y)
    columns = Y.columns
    indexes = Y.index
    p = 4
    for column in columns:
        residuals = []
        def _func_x(x):
            if isinstance(x, DataFrame):
                return x[column].values
            elif isinstance(x, Series):
                return x.values
            else:
                raise TypeError("each regressor must be a DataFrame or a Series")
        X_column = list(map(_func_x, X))
        # print(X_column)
        X_column.insert(0, np.ones(le_th))
        X_column = np.array(X_column).T
        Y_column = np.array(Y[column].values).T
        for length in range(n, le_th):
            X_temp = X_column[length - n:length]
            Y_temp = Y_column[length - n:length]
            try:
                beta = (inv((X_temp.T).dot(X_temp))).dot(X_temp.T).dot(Y_temp)
                residual = (Y_temp - (beta).dot(X_temp.T))
                # print(residual[0])
                # residual variance with n - p - 1 degrees of freedom; take the square root for a standard deviation
                residual = (residual ** 2).sum() / (n - p - 1)
                residuals.append(residual)
            except LinAlgError:
                residuals.append(np.nan)
        DF[column] = residuals
    # pad each column with NaN so it lines up with the original index
    for key, value in DF.items():
        if len(value) != le_th - n:
            DF[key] += [np.nan] * (le_th - n - len(DF[key]))
        if len(value) != le_th:
            DF[key] = [np.nan] * n + DF[key]
    df = pd.DataFrame(DF, index=indexes)
    return df
dv.add_field("close_adj")
Return = dv.get_ts("close_adj").pct_change(1)
Return_1 = Return.shift(1)
Return_2 = Return.shift(2)
Return_3 = Return.shift(3)
zz800_benchmark, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date,
fields='trade_date,close,open')
dv.data_benchmark = zz800_benchmark[['trade_date', 'close', 'open']].set_index('trade_date')
BenchmarkIndexClose = dv.data_benchmark["close"].loc[dv.get_ts("close_adj").index]
BenchmarkIndexOpen = dv.data_benchmark["open"].loc[dv.get_ts("close_adj").index]
M_Return = BenchmarkIndexClose.pct_change(1)
df = Multi_Regression(250, Return, Return_1, Return_2, Return_3, M_Return)
dv.append_df(df, "HSIGMA_J", overwrite=True, is_quarterly=False)
HSIGMA = dv.get_ts("HSIGMA_J")
DDNSR H010039A
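Factor description: downside standard deviation ratio (Downside standard deviations ratio). Over the past 12 months, on days when the market portfolio's daily return is negative, the ratio of the standard deviation of the stock's daily returns to that of the market portfolio's daily returns.
Calculation:
The market portfolio's daily return is computed from CSI 300 data, and only days with a negative market return are used:
r_m.t=(CloseIndex-PrevCloseIndex)/PrevCloseIndex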
import pandas as pd
import numpy as np
T = 250
benchmark, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date,
fields='trade_date,close,open')
benchmark = benchmark[['trade_date', 'close']].set_index('trade_date').pct_change()
benchmark = benchmark[benchmark < 0]
def drop_and_corr(arr):
    # keep only the days on which the market return (column 1) is negative
    arr = arr[arr[:, 1] < 0]
    arr = arr[~np.isnan(arr).any(axis=1)]  # optional
    return arr[:, 0].std(ddof=1)/arr[:, 1].std(ddof=1)
from numpy.lib.stride_tricks import as_strided as strided
def get_sliding_window(df, W, return2D=0):
    a = df.values
    s0, s1 = a.strides
    m, n = a.shape
    out = strided(a, shape=(m - W + 1, W, n), strides=(s0, s0, s1))
    if return2D == 1:
        return out.reshape(m - W + 1, -1)
    else:
        return out
def calc_corr(s):
    benchmark_ = benchmark.reindex(index=s.index)
    df = pd.concat([s, benchmark_], axis=1)
    idx = df.index[T - 1:]
    strides = get_sliding_window(df, T)
    result = list(map(lambda x: drop_and_corr(x), strides))
    return pd.Series(result, index=idx)
ret = dv.add_formula("_ret", "Return(close_adj)", is_quarterly=False)
DDNSR = ret.apply(calc_corr)
dv.append_df(DDNSR, 'DDNSR')
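Stripped of the rolling machinery, the factor for one window is a masked ratio of two standard deviations. The sketch below assumes two aligned arrays covering one window; `downside_std_ratio` is a hypothetical name and the sample values are made up.

```python
import numpy as np

def downside_std_ratio(stock_ret, market_ret):
    """std(stock) / std(market), using only days with a negative market return."""
    r = np.asarray(stock_ret, dtype=float)
    rm = np.asarray(market_ret, dtype=float)
    mask = (rm < 0) & ~np.isnan(r)           # down-market days with a valid stock return
    if mask.sum() < 2:
        return float("nan")                  # a sample std needs two observations
    return float(r[mask].std(ddof=1) / rm[mask].std(ddof=1))

market = [-0.01, 0.02, -0.03, 0.01]
stock = [-0.02, 0.05, -0.06, 0.00]
print(downside_std_ratio(stock, market))
```

Replacing the final ratio with `np.corrcoef(r[mask], rm[mask])[0, 1]` gives the downside correlation used by the next factor.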
DDNCR H010040A
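Factor description: downside correlation (Downside correlation). Over the past 12 months, on days when the market portfolio's daily return is negative, the correlation between the stock's daily returns and the market portfolio's daily returns.
Calculation:
The market portfolio's daily return is computed from CSI 300 data, and only days with a negative market return are used:
r_m.t=(CloseIndex-PrevCloseIndex)/PrevCloseIndex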
import pandas as pd
import numpy as np
T = 250
benchmark, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date,
fields='trade_date,close,open')
benchmark = benchmark[['trade_date', 'close']].set_index('trade_date').pct_change()
benchmark = benchmark[benchmark < 0]
def drop_and_corr(arr):
    from scipy.stats import pearsonr
    # keep only the days on which the market return (column 1) is negative
    arr = arr[arr[:, 1] < 0]
    arr = arr[~np.isnan(arr).any(axis=1)]  # optional
    return pearsonr(arr[:, 0], arr[:, 1])[0]
from numpy.lib.stride_tricks import as_strided as strided
def get_sliding_window(df, W, return2D=0):
    a = df.values
    s0, s1 = a.strides
    m, n = a.shape
    out = strided(a, shape=(m - W + 1, W, n), strides=(s0, s0, s1))
    if return2D == 1:
        return out.reshape(m - W + 1, -1)
    else:
        return out
def calc_corr(s):
    benchmark_ = benchmark.reindex(index=s.index)
    df = pd.concat([s, benchmark_], axis=1)
    idx = df.index[T - 1:]
    strides = get_sliding_window(df, T)
    result = list(map(lambda x: drop_and_corr(x), strides))
    return pd.Series(result, index=idx)
ret = dv.add_formula("_ret", "Return(close_adj)", is_quarterly=False)
DDNCR = ret.apply(calc_corr)
dv.append_df(DDNCR, 'DDNCR')
Dvrat H010042A
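Factor description: daily returns variance ratio (Daily returns variance ratio-serial dependence in daily returns).
Calculation:
Let r_i,t be the daily return of stock i and r_f,t the daily risk-free return; the stock's excess daily return on day t is then r_i,t - r_f,t.
The variance ratio compares the variance of q-day cumulative excess returns, sigma_q^2 = Sum((r_t + ... + r_(t-q+1))^2)/m, with the daily variance sigma^2 = Sum(r_t^2)/(T-1),
where:
m=q(T-q+1)(1-q/T)
T is the number of trading days in the past 24 months, q=10
The code treats the daily risk-free return as 0. Results with fewer than 180 trading days are dropped from the final output.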
dv.add_formula("ret_J", "Return(close_adj)", is_quarterly=False, add_data=True)
T = 500 # number of trading days in the past 24 months
q = 10
dv.add_formula("sigma_squared_J", "Ts_Sum(Pow(ret_J, 2), %s)/(%s -1)" % (T, T),
is_quarterly=False, add_data=False)
m = q*(T-q+1)*(1-q/T)
dv.add_formula("sigma_q_tmp", "Pow(Ts_Sum(ret_J, %s), 2)" % q, is_quarterly=False, add_data=True)
sigma_q = dv.add_formula("sigma_q", "Ts_Sum(sigma_q_tmp, %s)/%s" % (T-q, m), is_quarterly=False)
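The formula strings above stop at the two variance terms. The sketch below carries one window through to a single number, assuming the final step is DVRAT = sigma_q^2 / sigma^2 - 1; that last step is not shown in this post and is taken here as an assumption from the commonly cited Barra definition. The function name `dvrat` is illustrative, and the risk-free rate is again treated as 0.

```python
import numpy as np

def dvrat(returns, q=10):
    """Variance ratio of q-day to 1-day returns, minus 1, over one window."""
    r = np.asarray(returns, dtype=float)
    T = r.size
    sigma2 = np.sum(r ** 2) / (T - 1)                        # daily variance
    # sum of every run of q consecutive returns (T - q + 1 overlapping runs)
    q_sums = np.convolve(r, np.ones(q), mode="valid")
    m = q * (T - q + 1) * (1 - q / T)
    sigma2_q = np.sum(q_sums ** 2) / m                       # q-day variance per day
    return float(sigma2_q / sigma2 - 1)

rng = np.random.default_rng(2)
print(dvrat(rng.normal(0.0, 0.02, size=500), q=10))
```

Values near 0 suggest little serial dependence; negative values point to mean reversion in daily returns and positive values to trending.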
Ddnbt H010043A
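Factor description: downside beta (Downside beta). Over the past 12 months, on days when the market portfolio's daily return is negative, the regression coefficient of the stock's daily return on the market portfolio's daily return.
Calculation:
The market portfolio's daily return is computed from CSI 300 data, and only days with a negative market return are used:
r_m.t=(CloseIndex-PrevCloseIndex)/PrevCloseIndex
The β_d from the regression is the downside beta DDNBT.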
import pandas as pd
import numpy as np
benchmark, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date,
fields='trade_date,close,open')
benchmark = benchmark[['trade_date', 'close']].set_index('trade_date').pct_change()
def np_regr(A, y, residuals=False):
    # least-squares fit with an intercept column prepended to A
    if A.ndim == 1:
        A = A[:, None]
    A = np.hstack((np.ones(len(A))[:, None], A))
    betas = np.linalg.lstsq(A, y)[0]
    if residuals:
        return y - A @ betas
    return betas
def drop_and_regr(arr, residuals=False, drop_first=True):
    arr = arr[~np.isnan(arr).any(axis=1)]
    # print(arr.shape)
    if arr.size == 0:
        return np.nan
    # A further condition could skip windows with too few valid trading days in the past 12 months.
    # Downside-beta condition: keep only days with a negative market return.
    arr = arr[arr[:, 1] < 0]
    if arr.size == 0:
        return np.nan
    return np_regr(arr[:, 1:], arr[:, :1], residuals=residuals)[1].item()
from numpy.lib.stride_tricks import as_strided as strided
def get_sliding_window(df, W, return2D=0):
    a = df.values
    s0, s1 = a.strides
    m, n = a.shape
    out = strided(a, shape=(m - W + 1, W, n), strides=(s0, s0, s1))
    if return2D == 1:
        return out.reshape(m - W + 1, -1)
    else:
        return out
ret = dv.add_formula("ret", "Return(close_adj)", is_quarterly=False, add_data=True)
def calc_regr(stock_ret):
    """
    The rolling is done inside this function.
    """
    X = pd.concat([stock_ret, benchmark], axis=1)
    idx = X.index[252 - 1:]
    strides = get_sliding_window(X, 252)
    result = list(map(lambda x: drop_and_regr(x), strides))
    return pd.Series(result, index=idx)
Ddnbt = ret.apply(calc_regr)
dv.append_df(Ddnbt, 'Ddnbt')
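For a single window, the downside beta is an ordinary regression restricted to down-market days. This is a minimal sketch with assumed inputs (two aligned return arrays); `downside_beta` is a hypothetical helper, and it passes `rcond=None` to `np.linalg.lstsq`, which recent NumPy versions expect.

```python
import numpy as np

def downside_beta(stock_ret, market_ret):
    """Slope of stock on market returns, using only days with market return < 0."""
    r = np.asarray(stock_ret, dtype=float)
    rm = np.asarray(market_ret, dtype=float)
    mask = (rm < 0) & ~np.isnan(r) & ~np.isnan(rm)
    if mask.sum() < 2:
        return float("nan")                  # cannot fit a line through fewer than two points
    A = np.column_stack([np.ones(mask.sum()), rm[mask]])     # intercept + market return
    coef, *_ = np.linalg.lstsq(A, r[mask], rcond=None)
    return float(coef[1])

market = [-0.01, 0.02, -0.03, -0.02]
stock = [-0.019, 0.5, -0.059, -0.039]        # 2 * market + 0.001 on the down days
print(downside_beta(stock, market))
```

The up-market day (0.02, 0.5) is ignored, so the outlier on that day has no effect on the slope.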
Tobt H010044A
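Factor description: liquidity-turnover beta (Liquidity-turnover beta).
Calculation:
(a) Let r_i,t be the daily return of stock i, r_m,t the daily return of the market portfolio,
and r_f,t the daily risk-free return; the excess daily returns on day t are then r_i,t - r_f,t and r_m,t - r_f,t.
The market portfolio's daily return is computed from CSI 300 data:
r_m,t=(CloseIndex-PrevCloseIndex)/PrevCloseIndex
The code treats the daily risk-free return as 0.
(b) The daily turnover rate can be looked up directly or computed from traded volume and share count.
Where a value is missing, total share capital is used instead.
(c) The absolute daily excess return is regressed on the turnover rate, five lags of the absolute market portfolio return, and five lags of itself.
The coefficient on the daily turnover rate, βi, is the liquidity-turnover beta TOBT.
Results with fewer than 180 trading days are dropped from the final output.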
import pandas as pd
import numpy as np
zz800_benchmark, msg = dv.data_api.daily("000300.SH", dv.extended_start_date_d, dv.end_date,
fields='trade_date,close,open')
dv.data_benchmark = zz800_benchmark[['trade_date', 'close', 'open']].set_index('trade_date')
BenchmarkIndexClose = dv.data_benchmark["close"].loc[dv.get_ts("close_adj").index]
BenchmarkIndexOpen = dv.data_benchmark["open"].loc[dv.get_ts("close_adj").index]
# risk_free_rate=-0.005
M_Return = BenchmarkIndexClose.pct_change(1)
# M_return=M_Return.apply(lambda x:round(x*100,4))
M_Return_1 = M_Return.shift(1)
M_Return_2 = M_Return.shift(2)
M_Return_3 = M_Return.shift(3)
M_Return_4 = M_Return.shift(4)
M_Return_5 = M_Return.shift(5)
dv.add_field("turnover_ratio")
TORate = dv.add_formula("TORate", "turnover_ratio/100", add_data=True, is_quarterly=False)
# TORate=TORate.applymap(lambda x:round(x*100,4))
Return = dv.add_formula("Close_Return", "Return(close_adj)", add_data=True, is_quarterly=False)
# Return=Return.applymap(lambda x:round(x*100,4))
Return_1 = Return.shift(1)
Return_2 = Return.shift(2)
Return_3 = Return.shift(3)
Return_4 = Return.shift(4)
Return_5 = Return.shift(5)
from numpy.linalg import inv, LinAlgError
DF = dict()
window = 498
le_th = len(Return)
indexes = Return.index
betas = []
for column in Return.columns:
    X = np.array(
        [np.ones(le_th), TORate[column].abs().values, Return_1[column].abs().values, Return_2[column].abs().values,
         Return_3[column].abs().values, Return_4[column].abs().values, Return_5[column].abs().values,
         M_Return_1.abs().values, M_Return_2.abs().values, M_Return_3.abs().values, M_Return_4.abs().values,
         M_Return_5.abs().values]).T
    # print(X)
    Y = np.array(Return[column].abs().values).T
    betas = []
    # print(X.shape)
    # print(Y.shape)
    print(column)
    for length in range(window, len(Y)):
        X_temp = X[length - window:length]
        # print(X_temp.shape)
        # print(Y[length-window:length].shape)
        try:
            beta = (inv((X_temp.T).dot(X_temp))).dot(X_temp.T).dot(Y[length - window:length])
            betas.append(beta[1])
        except LinAlgError:
            betas.append(np.nan)
    DF[column] = betas
for key, value in DF.items():
    if len(value) != le_th - window:
        DF[key] += [np.nan] * (le_th - window - len(DF[key]))
    if len(value) != le_th:
        # print(key)
        DF[key] = [np.nan] * window + DF[key]
df = pd.DataFrame(DF, index=indexes)
dv.append_df(df, "TOBT", overwrite=True, is_quarterly=False)
TOBT = dv.get_ts("TOBT")
Skewness H010045A
Factor description: price skewness (Skewness of price during the last 20 days), the skewness of the stock price over the past 20 trading days.
Calculation:
SKEWNESS_J = dv.add_formula('SKEWNESS_J', "Ts_Skewness(close_adj,{})".format(20), is_quarterly=False, add_data=False)