@songying
2018-11-03T20:30:14.000000Z
Word2vec TensorFlow Source Code
word-embedding
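Parameters
- save_path: Directory where the model is saved.
- train_data: Path to the training data file.
- eval_data: Path to the evaluation (test) file.
- embedding_size: Dimension of the word vectors. Default: 200.
- epochs_to_train: Number of training epochs. Default: 15.
- learning_rate: Learning rate. Default: 0.2.
- batch_size: Batch size used for each gradient descent step.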
- concurrent_steps: The number of concurrent training steps. Default: 12.
- window_size: The number of words to predict to the left and right of the target word. Default: 5.
- min_count: The minimum number of occurrences a word needs to be included in the vocabulary. Default: 5.
- subsample: Subsample threshold for word occurrence. Words that appear with higher frequency are randomly down-sampled. Set to 0 to disable. Default: 1e-3.
- interactive: true or false. If true, enters an IPython interactive session to play with the trained model. E.g., try model.analogy(b'france', b'paris', b'russia') and model.nearby([b'proton', b'elephant', b'maxwell']).
- statistics_interval: Print statistics every n seconds. Default: 5.
- summary_interval: Save the training summary to file every n seconds (rounded up to the statistics interval). Default: 5.
- checkpoint_interval: Checkpoint the model (i.e. save the parameters) every n seconds (rounded up to the statistics interval). Default: 600.
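
To see the options above in one place, here is a minimal sketch that mirrors them with Python's standard argparse module. It is only a stand-in for illustration, not the flag definitions from the TensorFlow script itself. The function name build_parser is hypothetical, and the defaults chosen for the path options, batch_size (None, since no default is listed above) and interactive (False) are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the options listed in these notes.

    Illustrative only: argparse is used here as a stand-in for the
    script's own flag mechanism.
    """
    p = argparse.ArgumentParser(description="word2vec options (illustrative sketch)")
    # Paths: no defaults are given in the notes, so None is an assumption.
    p.add_argument("--save_path", default=None, help="Directory where the model is saved")
    p.add_argument("--train_data", default=None, help="Training data file")
    p.add_argument("--eval_data", default=None, help="Evaluation (test) file")
    # Model and optimisation settings, with the defaults from the notes.
    p.add_argument("--embedding_size", type=int, default=200)
    p.add_argument("--epochs_to_train", type=int, default=15)
    p.add_argument("--learning_rate", type=float, default=0.2)
    # No default for batch_size is listed in the notes; None is an assumption.
    p.add_argument("--batch_size", type=int, default=None)
    p.add_argument("--concurrent_steps", type=int, default=12)
    p.add_argument("--window_size", type=int, default=5)
    p.add_argument("--min_count", type=int, default=5)
    p.add_argument("--subsample", type=float, default=1e-3)
    # Assumed to be off unless requested.
    p.add_argument("--interactive", action="store_true")
    # Intervals, in seconds.
    p.add_argument("--statistics_interval", type=int, default=5)
    p.add_argument("--summary_interval", type=int, default=5)
    p.add_argument("--checkpoint_interval", type=int, default=600)
    return p


# Parse an empty command line to inspect the defaults.
args = build_parser().parse_args([])
print(vars(args))
```

Overriding a value works as with any argparse parser, for example build_parser().parse_args(["--window_size", "3"]).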