@Channelchan
2017-12-03T19:54:59.000000Z
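Alphalens is a Python package for performance analysis of alpha factors. An alpha factor expresses a predictive relationship between some given piece of information and future returns. Applying that relationship to many stocks produces an alpha signal, which can then be traded on. Developing a good alpha signal is challenging, and Alphalens makes the job easier, since a common set of tools for analyzing alpha factors has a large impact on quantitative trading. By analyzing your factors with Alphalens during research, you spend less time writing and running backtests. That allows faster iteration on ideas and leads to a final algorithm you can have confidence in. Alphalens builds in a rigorous workflow that makes your strategies more robust and less prone to overfitting.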
Installation: pip install alphalens
Official site: http://quantopian.github.io/alphalens/index.html
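import talib as ta
import numpy as np
import pandas as pd
from datetime import datetime

# Note: pd.Panel only exists in older pandas releases (it was removed in pandas 0.25),
# and newer releases spell the read_excel argument sheet_name.

def momentum(PN, period=40):
    # Rate-of-change ratio (ROCR) over `period` bars, one column per stock
    return pd.DataFrame(
        {name: pd.Series(ta.abstract.ROCR(price, period), index=price.index)
         for name, price in PN.iteritems()})

# sheetname=None loads every sheet of the workbook into a dict of DataFrames
data = pd.Panel(pd.read_excel('sz50.xlsx', sheetname=None, index_col='datetime'))
PN = data.dropna(how='all', axis=0)
alpha_mom = momentum(PN)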
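For readers without TA-Lib at hand, the momentum factor can be approximated in plain pandas. The sketch below assumes ROCR is the ratio of the current price to the price `period` rows earlier, which is how TA-Lib documents the indicator. The helper name `rocr`, the tickers and the prices are made up for illustration.

```python
import pandas as pd

def rocr(close, period):
    """Rate-of-change ratio: price now divided by the price `period` rows earlier."""
    return close / close.shift(period)

# Hypothetical closing prices for two made-up tickers
close = pd.DataFrame(
    {"AAA": [10.0, 11.0, 12.1, 12.1], "BBB": [20.0, 19.0, 18.0, 21.0]},
    index=pd.date_range("2017-01-02", periods=4, freq="D"),
)

# The first `period` rows are NaN because there is no earlier price to compare with
factor_wide = rocr(close, period=2)
print(factor_wide)
```

A value above 1 means the price rose over the lookback window, and a value below 1 means it fell.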
factor: a Series with a MultiIndex (use the stack() method to convert)
prices: a DataFrame
# Convert to a MultiIndex Series
factor = alpha_mom.stack()
print(factor.tail())
datetime
2017-11-20 15:00:00  601857.XSHG    1.022616
                     601881.XSHG    0.744411
                     601901.XSHG    0.893478
                     601985.XSHG    0.993412
                     601988.XSHG    0.971698
dtype: float64
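The reshaping done by stack() is easier to see on a tiny table. This is a minimal sketch with made-up tickers and values: a wide DataFrame (dates as rows, assets as columns) becomes a long Series indexed by (date, asset), and unstack() reverses it.

```python
import pandas as pd

# Hypothetical wide factor table: one row per date, one column per asset
wide = pd.DataFrame(
    {"AAA": [1.02, 0.98], "BBB": [0.95, 1.10]},
    index=pd.to_datetime(["2017-11-17", "2017-11-20"]),
)
wide.index.name = "datetime"

long = wide.stack()      # Series indexed by (datetime, asset)
print(long)
print(long.index.nlevels)

back = long.unstack()    # back to the wide layout
```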
# DataFrame of prices for the stock universe
prices = PN.minor_xs('close')
print(prices.tail())
                     600000.XSHG  600016.XSHG  600028.XSHG  600029.XSHG  \
datetime
2017-11-14 15:00:00       118.12       125.93        12.06        16.00
2017-11-15 15:00:00       118.12       124.74        11.82        16.04
2017-11-16 15:00:00       116.16       123.54        11.76        16.29
2017-11-17 15:00:00       119.81       127.42        11.92        16.97
2017-11-20 15:00:00       120.47       128.17        11.92        17.05

                     600030.XSHG  600036.XSHG  600048.XSHG  600050.XSHG  \
datetime
2017-11-14 15:00:00        69.27       111.81       199.75         9.49
2017-11-15 15:00:00        69.04       111.25       204.52         9.68
2017-11-16 15:00:00        68.05       112.13       218.27         9.61
2017-11-17 15:00:00        69.88       117.24       224.00         9.63
2017-11-20 15:00:00        67.71       121.82       224.19         9.80

                     600100.XSHG  600104.XSHG     ...       601766.XSHG  \
datetime                                          ...
2017-11-14 15:00:00       178.62       204.03     ...             12.10
2017-11-15 15:00:00       176.35       202.78     ...             12.07
2017-11-16 15:00:00       174.24       200.97     ...             11.77
2017-11-17 15:00:00       165.92       207.21     ...             12.11
2017-11-20 15:00:00       170.61       206.46     ...             12.14

                     601788.XSHG  601800.XSHG  601818.XSHG  601857.XSHG  \
datetime
2017-11-14 15:00:00        17.28        17.39         5.13        10.63
2017-11-15 15:00:00        17.25        17.34         5.12        10.37
2017-11-16 15:00:00        17.04        16.91         5.11        10.28
2017-11-17 15:00:00        17.30        17.04         5.21        10.33
2017-11-20 15:00:00        17.18        16.79         5.24        10.40

                     601881.XSHG  601901.XSHG  601985.XSHG  601988.XSHG  \
datetime
2017-11-14 15:00:00        13.15         8.63         7.80         6.08
2017-11-15 15:00:00        13.03         8.49         7.79         6.07
2017-11-16 15:00:00        12.76         8.28         7.54         6.02
2017-11-17 15:00:00        12.30         8.11         7.63         6.14
2017-11-20 15:00:00        12.32         8.22         7.54         6.18

                     601989.XSHG
datetime
2017-11-14 15:00:00        10.64
2017-11-15 15:00:00        10.51
2017-11-16 15:00:00        10.49
2017-11-17 15:00:00        10.14
2017-11-20 15:00:00        10.25

[5 rows x 49 columns]
# Build the data format that Alphalens requires
import alphalens
factor_data = alphalens.utils.get_clean_factor_and_forward_returns(factor, prices, quantiles=5)
print(factor_data.head())
                                        1         5        10    factor  \
date                asset
2017-03-07 15:00:00 600000.XSHG -0.001197 -0.010349 -0.024974  1.008018
                    600016.XSHG -0.005597 -0.015598 -0.034555  0.985728
                    600028.XSHG  0.003578 -0.016100  0.007156  1.021938
                    600029.XSHG -0.003912  0.010172  0.000782  1.097938
                    600030.XSHG -0.006045 -0.006045 -0.013999  1.016659

                                 factor_quantile
date                asset
2017-03-07 15:00:00 600000.XSHG                2
                    600016.XSHG                1
                    600028.XSHG                3
                    600029.XSHG                5
                    600030.XSHG                3
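To make the columns of factor_data less of a black box, here is a rough pandas sketch of what they contain: N-period forward returns for each (date, asset) pair and a per-date quantile bucket of the factor. It is a simplification under my own assumptions; the random prices, the factor values and the helper name `forward_returns` are made up, and the real get_clean_factor_and_forward_returns does more checking and cleaning than this.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2017-03-01", periods=6, freq="D")
tickers = ["A", "B", "C", "D"]
rng = np.random.RandomState(0)

# Hypothetical price paths and factor values
prices = pd.DataFrame(
    100 * np.cumprod(1 + rng.normal(0, 0.01, size=(6, 4)), axis=0),
    index=dates, columns=tickers,
)
factor_wide = pd.DataFrame(
    rng.normal(1, 0.05, size=(6, 4)), index=dates, columns=tickers
)

def forward_returns(prices, periods=(1, 2)):
    """Return earned by holding from each date until `p` rows later."""
    out = {p: (prices.shift(-p) / prices - 1).stack() for p in periods}
    return pd.DataFrame(out)

toy = forward_returns(prices)
toy["factor"] = factor_wide.stack()
toy.index.names = ["date", "asset"]
toy = toy.dropna()  # drop dates that lack a full set of forward returns

# Split the factor into 2 equal-sized buckets within each date (1 = lowest values)
toy["factor_quantile"] = (
    toy.groupby(level="date")["factor"]
       .transform(lambda x: pd.qcut(x, 2, labels=False) + 1)
)
print(toy.head())
```

The last rows of the price history drop out because their forward returns would need prices that do not exist yet.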
mean_return_by_q, std_err_by_q = alphalens.performance.mean_return_by_quantile(factor_data, by_date=True)
print(mean_return_by_q.head())
print(std_err_by_q.head())
                                            1         5        10
factor_quantile date
1               2017-03-07 15:00:00  0.006782  0.003821  0.006060
                2017-03-08 15:00:00  0.002207  0.000536 -0.005845
                2017-03-09 15:00:00  0.000176  0.001881  0.012697
                2017-03-10 15:00:00  0.001894  0.004035  0.006478
                2017-03-13 15:00:00  0.000316  0.009381  0.011278
                                            1         5        10
factor_quantile date
1               2017-03-07 15:00:00  0.008181  0.005817  0.011047
                2017-03-08 15:00:00  0.001643  0.005422  0.012947
                2017-03-09 15:00:00  0.002841  0.004721  0.012215
                2017-03-10 15:00:00  0.002748  0.003273  0.013972
                2017-03-13 15:00:00  0.001233  0.006354  0.011653
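Conceptually, the by_date=True table above is a group-by: average the forward returns of all stocks that sit in the same quantile on the same date. A minimal pandas sketch on made-up numbers follows. It leaves out refinements such as demeaning and standard errors, so treat it as an illustration of the idea, not a reimplementation of mean_return_by_quantile.

```python
import pandas as pd

idx = pd.MultiIndex.from_product(
    [pd.to_datetime(["2017-03-07", "2017-03-08"]), ["A", "B", "C", "D"]],
    names=["date", "asset"],
)
# Hypothetical 1-period forward returns and quantile labels
toy = pd.DataFrame(
    {
        1: [0.01, 0.03, -0.02, 0.00, 0.02, 0.02, -0.01, 0.01],
        "factor_quantile": [1, 1, 2, 2, 2, 1, 1, 2],
    },
    index=idx,
)

# Average return of each quantile on each date
mean_by_q = toy.groupby(["factor_quantile", "date"])[1].mean()
print(mean_by_q)
```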
import matplotlib.pyplot as plt
alphalens.plotting.plot_cumulative_returns_by_quantile(mean_return_by_q, 10)
plt.show()
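The information coefficient (IC) is a correlation measure of the relationship between a variable's predicted values and its realized values. It is one way of evaluating the forecasting skill of a financial analyst.
The coefficient lies between -1 and 1, and a larger value means a stronger positive correlation. A common threshold is mean(IC) > 0.02.
The IC here is the Spearman rank correlation, IC = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference between the two ranks of each observation and n is the number of observations.
The IC value therefore represents the correlation between the factor ranking and the return ranking.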
A = [1,3,5,7,9]
B = [3,2,4,5,1]
The ranks of A are 1,2,3,4,5
The ranks of B are 3,2,4,5,1
d is the element-wise difference between the two sets of ranks
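The A and B example can be worked through by hand. The sketch below uses the standard Spearman formula for data without ties, 1 - 6 * sum(d^2) / (n * (n^2 - 1)); the helper `ranks` is a made-up name and only handles the no-ties case.

```python
def ranks(values):
    """Rank from 1 (smallest) to n (largest); assumes there are no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

A = [1, 3, 5, 7, 9]
B = [3, 2, 4, 5, 1]

rank_a = ranks(A)                              # [1, 2, 3, 4, 5]
rank_b = ranks(B)                              # [3, 2, 4, 5, 1]
d = [a - b for a, b in zip(rank_a, rank_b)]    # [-2, 0, -1, -1, 4]

n = len(A)
ic = 1 - 6 * sum(x * x for x in d) / (n * (n ** 2 - 1))
print(round(ic, 4))                            # -0.1
```

The squared differences sum to 22, so the result is 1 - 132/120 = -0.1: the two rankings are slightly negatively related.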
# IC example
ic = alphalens.performance.factor_information_coefficient(factor_data)
# print(ic)
alphalens.plotting.plot_ic_hist(ic)
mean_monthly_ic = alphalens.performance.mean_information_coefficient(factor_data, by_time='M')
# print(mean_monthly_ic.mean())
alphalens.plotting.plot_monthly_ic_heatmap(mean_monthly_ic)
plt.show()
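The per-date IC can be sketched in plain pandas as the rank correlation between the factor and the forward return, computed separately for each date. The data and the names `toy`, `fwd_ret` and `rank_ic` are made up for illustration. This shows the idea behind factor_information_coefficient and is not its actual implementation.

```python
import pandas as pd

idx = pd.MultiIndex.from_product(
    [pd.to_datetime(["2017-03-07", "2017-03-08"]), ["A", "B", "C", "D"]],
    names=["date", "asset"],
)
# Hypothetical factor values and forward returns
toy = pd.DataFrame(
    {
        "factor": [1.00, 1.02, 1.05, 1.10, 1.10, 1.05, 1.02, 1.00],
        "fwd_ret": [-0.01, 0.00, 0.01, 0.02, 0.03, -0.02, 0.01, 0.00],
    },
    index=idx,
)

def rank_ic(group):
    # Spearman correlation equals the Pearson correlation of the ranks
    return group["factor"].rank().corr(group["fwd_ret"].rank())

daily_ic = toy.groupby(level="date").apply(rank_ic)
print(daily_ic)
print(daily_ic.mean())
```

On the first date the factor orders the stocks exactly as their returns do, so the IC is 1; on the second date the ordering only partly agrees and the IC drops to 0.4.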
factor_returns = alphalens.performance.factor_returns(factor_data)
alphalens.plotting.plot_cumulative_returns(factor_returns[10])
plt.show()
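As I understand it, factor_returns by default measures the return of a portfolio whose weights are proportional to the demeaned factor values, so that long and short sides offset each other. The sketch below illustrates that idea on made-up data; the helper `to_weights` and the column `fwd_ret` are hypothetical names, and the weighting scheme is my assumption about the default behaviour.

```python
import pandas as pd

idx = pd.MultiIndex.from_product(
    [pd.to_datetime(["2017-03-07", "2017-03-08"]), ["A", "B", "C", "D"]],
    names=["date", "asset"],
)
# Hypothetical factor values and 1-period forward returns
toy = pd.DataFrame(
    {
        "factor": [1.0, 1.1, 1.2, 1.3, 1.0, 1.1, 1.2, 1.3],
        "fwd_ret": [-0.02, 0.00, 0.01, 0.02, 0.01, 0.00, 0.00, -0.01],
    },
    index=idx,
)

def to_weights(f):
    # Subtract the cross-sectional mean, then scale so absolute weights sum to 1
    demeaned = f - f.mean()
    return demeaned / demeaned.abs().sum()

weights = toy.groupby(level="date")["factor"].transform(to_weights)

# Portfolio return per date = sum of weight * forward return
portfolio_ret = (weights * toy["fwd_ret"]).groupby(level="date").sum()
cumulative = (1 + portfolio_ret).cumprod()
print(portfolio_ret)
print(cumulative)
```

A factor with predictive power should give a cumulative curve that trends upward; here the toy portfolio gains on the first date and loses on the second.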
Try out the other features of Alphalens.