@songying 2018-11-03T12:30:14.000000Z

Word2vec TensorFlow Source Code

`word-embedding`

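save_path: Path where the model is saved.
train_data: Path to the training data file.
eval_data: Path to the test (evaluation) data file.
embedding_size: Dimensionality of the word vectors. Default: 200
epochs_to_train: Number of training epochs. Default: 15
learning_rate: Learning rate. Default: 0.2
batch_size: Batch size used for gradient descent.
concurrent_steps: The number of concurrent training steps. Default: 12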
window_size: The number of words to predict to the left and right of the target word. Default: 5
min_count: The minimum number of occurrences a word needs to be included in the vocabulary. Default: 5
subsample: Subsample threshold for word occurrence. Words that appear with higher frequency will be randomly down-sampled. Set to 0 to disable. Default: 1e-3
interactive: true or false. If true, enters an IPython interactive session to play with the trained model, e.g. try `model.analogy(b'france', b'paris', b'russia')` and `model.nearby([b'proton', b'elephant', b'maxwell'])`.
statistics_interval: Print statistics every n seconds. Default: 5
summary_interval: Save the training summary to file every n seconds (rounded up to the statistics interval). Default: 5
checkpoint_interval: Checkpoint the model (i.e. save the parameters) every n seconds (rounded up to the statistics interval). Default: 600
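
To make the list above easier to try out, here is a minimal sketch that declares the same options with Python's standard `argparse` module. This is only an illustration: `argparse` is a stand-in for whatever flag mechanism the original script uses, and `build_parser` and `effective_interval` are hypothetical helper names. `batch_size` is left as `None` because no value is listed for it above, and `effective_interval` reflects one assumed reading of "rounded up to the statistics interval" (the next multiple of `statistics_interval`).

```python
import argparse


def build_parser():
    """Hypothetical stand-in for the script's option definitions."""
    p = argparse.ArgumentParser(description="word2vec options (illustrative sketch)")
    p.add_argument("--save_path", type=str, default=None, help="Where to save the model.")
    p.add_argument("--train_data", type=str, default=None, help="Training data file.")
    p.add_argument("--eval_data", type=str, default=None, help="Test (evaluation) data file.")
    p.add_argument("--embedding_size", type=int, default=200)
    p.add_argument("--epochs_to_train", type=int, default=15)
    p.add_argument("--learning_rate", type=float, default=0.2)
    p.add_argument("--batch_size", type=int, default=None)  # no value listed in the notes
    p.add_argument("--concurrent_steps", type=int, default=12)
    p.add_argument("--window_size", type=int, default=5)
    p.add_argument("--min_count", type=int, default=5)
    p.add_argument("--subsample", type=float, default=1e-3)
    p.add_argument("--interactive", action="store_true")
    p.add_argument("--statistics_interval", type=int, default=5)
    p.add_argument("--summary_interval", type=int, default=5)
    p.add_argument("--checkpoint_interval", type=int, default=600)
    return p


def effective_interval(interval, statistics_interval):
    """Round interval up to a multiple of statistics_interval (assumed interpretation)."""
    steps = -(-interval // statistics_interval)  # ceiling division on integers
    return steps * statistics_interval


if __name__ == "__main__":
    opts = build_parser().parse_args([])
    print(opts.embedding_size, opts.window_size, opts.checkpoint_interval)
```

Passing an empty argument list, as in the last lines, yields the listed values; individual options can then be overridden, for example `build_parser().parse_args(["--window_size", "3"])`.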