@Channelchan
2017-05-22T12:29:32.000000Z
Log in to the Uqer (优矿) Notebook and fetch data with the data API
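import pandas as pd

# Get the secID of every stock in the HS300 (CSI 300) universe
HS300_list = set_universe('HS300')
# File name of the Excel workbook to write
writer = pd.ExcelWriter("HS300_factors.xlsx")
for ticker in HS300_list:
    data = DataAPI.MktStockFactorsDateRangeGet(secID=ticker, beginDate='20160101', field=['tradeDate', 'secID', 'close', 'ROE', 'PE', 'PB', 'PS'])
    # One sheet per stock, named after the first six characters of its secID
    data.to_excel(writer, ticker[0:6])
# Write the workbook to disk (pandas 2.0 and later removed save(); use writer.close() there)
writer.save()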
Download the file and store it in a local MongoDB
Just copy the path of your downloaded file into the path variable.
from fxdayu_data.data.handler.mongo_handler import MongoHandler
import pandas as pd
from datetime import timedelta

path = "C:\\Users\\small\\Desktop\\YouQuant_Data\\HS300_factors.xlsx"
# Offset added to each trade date so that rows are stamped 15:00
DAY_END = timedelta(hours=15)

# Store in the local MongoDB
mh = MongoHandler(db='YouQuant_Fundamental')
# Store on a remote server
# mh = MongoHandler("192.168.0.103", 30000, db='Basic')


def coder(code):
    # Prefix the six-digit code with its exchange: "sh" for codes
    # starting with 6, "sz" for codes starting with 0 or 3
    if code.startswith("6"):
        return "sh" + code
    elif code.startswith("0") or code.startswith("3"):
        return "sz" + code
    else:
        return code


def standard(frame):
    # Replace tradeDate with a datetime column and drop secID
    frame['datetime'] = pd.to_datetime(frame.pop("tradeDate")) + DAY_END
    frame.pop("secID")
    return frame


def read_sheets(excel_file):
    # Yield (sheet name, DataFrame) for every sheet in the workbook
    sheets = excel_file.sheet_names
    print(sheets)
    for sheet in sheets:
        yield sheet, excel_file.parse(sheet)


def save_basic(file_path):
    ef = pd.ExcelFile(file_path)
    for code, frame in read_sheets(ef):
        print(mh.inplace(
            standard(frame),
            coder(code)
        ))


if __name__ == '__main__':
    save_basic(path)
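To make the two transformations in the script easier to follow, here is a small standard-library sketch of what happens to one stock on its way into MongoDB. The sample secID values and the YYYY-MM-DD trade date format are assumptions about what Uqer returns, and to_close_time is a hypothetical helper that mirrors the DAY_END shift in standard() without needing pandas.

```python
from datetime import datetime, timedelta

DAY_END = timedelta(hours=15)


def coder(code):
    # Same rule as in the script: add the exchange prefix to a six-digit code
    if code.startswith("6"):
        return "sh" + code
    elif code.startswith("0") or code.startswith("3"):
        return "sz" + code
    else:
        return code


def to_close_time(trade_date):
    # Hypothetical helper: parse an assumed YYYY-MM-DD date and add 15 hours
    return datetime.strptime(trade_date, "%Y-%m-%d") + DAY_END


# Assumed secID format; the download step keeps only the first six characters
sample_sec_ids = ["600000.XSHG", "000001.XSHE", "300001.XSHE"]
sheet_names = [sec_id[0:6] for sec_id in sample_sec_ids]
names = [coder(name) for name in sheet_names]

print(names)                        # ['sh600000', 'sz000001', 'sz300001']
print(to_close_time("2016-01-04"))  # 2016-01-04 15:00:00
```

The prefixed name is what save_basic passes to mh.inplace together with each cleaned frame.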
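Tasks
1. Download the CSI 300 (HS300), SME board, and ChiNext stocks into MongoDB, grouped by category.
2. Study the 130 factors that Uqer (优矿) publishes and save the 20 most commonly used fundamental factors.
3. Because Uqer limits how much data can be processed at once, fetch and store the data in yearly batches; data from 2010 to the present is required.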
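For task 3, one possible way to split the download into yearly batches is sketched below. The yearly_ranges helper is hypothetical, the 2010 start year is this sketch's reading of the task, and the YYYYMMDD strings follow the beginDate format used in the download code above.

```python
from datetime import date


def yearly_ranges(start_year, end_date):
    """Yield one (begin, end) pair of YYYYMMDD strings per calendar year."""
    for year in range(start_year, end_date.year + 1):
        begin = date(year, 1, 1)
        # The last batch stops at end_date instead of 31 December
        end = min(date(year, 12, 31), end_date)
        yield begin.strftime("%Y%m%d"), end.strftime("%Y%m%d")


ranges = list(yearly_ranges(2010, date(2017, 5, 22)))
print(len(ranges))   # 8
print(ranges[0])     # ('20100101', '20101231')
print(ranges[-1])    # ('20170101', '20170522')
```

Each pair could then be passed to DataAPI.MktStockFactorsDateRangeGet in place of the single beginDate, assuming the call also accepts an endDate argument; in practice date.today() would replace the fixed end date.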