@songying 2018-07-22T03:51:44.000000Z

MS MARCO: A Human Generated MAchine Reading COmprehension Dataset

Dataset

Abstract

In MS MARCO, all questions are sampled from real anonymized user queries.

The context passages are extracted from real web documents.
The answers to the queries are human generated.

Introduction

Compared to previous publicly available datasets, this dataset is unique in the sense that

all questions are real user queries,
the context passages, from which the answers are derived, are extracted from real web documents,
all the answers to the queries are human generated,
a subset of these queries has multiple answers,
all queries are tagged with segment information.
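
To make the five properties above concrete, here is a minimal sketch of how a single example could be modeled in Python. It is not the dataset's actual schema: the class names `Passage` and `MarcoRecord`, every field name, and the sample values are hypothetical placeholders chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Passage:
    # Hypothetical container for one context passage taken from a web document.
    text: str
    source_url: str = ""  # assumed field; the notes do not say what metadata is kept


@dataclass
class MarcoRecord:
    # Hypothetical record layout mirroring the properties listed above.
    query: str                                        # real, anonymized user query
    passages: List[Passage]                           # context passages from web documents
    answers: List[str] = field(default_factory=list)  # human-generated; some queries have several
    segment: str = ""                                 # segment information tagged on the query

    def has_multiple_answers(self) -> bool:
        # True for the subset of queries that carry more than one answer.
        return len(self.answers) > 1


# Placeholder values, not real dataset content.
record = MarcoRecord(
    query="example query",
    passages=[Passage(text="example passage text")],
    answers=["first answer", "second answer"],
    segment="example-segment",
)
print(record.has_multiple_answers())
```

Keeping `answers` as a list rather than a single string is one way to accommodate the queries that have multiple answers without needing a separate record type.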
