@songying 2018-09-04

Multi-Granularity Hierarchical Attention Fusion Networks for Reading Comprehension and Question Answering

squad2


As of 2018-09-04, this model ranks first on the SQuAD leaderboard.
Datasets: SQuAD, TriviaQA, AddSent, AddOneSent

Introduction

Our approach is inspired by the way humans read:
1. First, people scan through the whole passage to catch a glimpse of its main body.
2. Then, with the question in mind, people make connections between the passage and the question, and understand the main intent of the question as it relates to the passage's theme. A rough answer span is then located in the passage, and attention can be focused on that context.
3. Finally, to avoid forgetting the question, people come back to the question and select the best answer according to the previously located answer span.

Inspired by this reading pattern, we propose a hierarchical attention network that can progressively focus attention on the correct answer span.

As shown in the figure above, our model consists of three main parts:

  1. an encoder layer, where pretrained language models and recurrent neural networks are used to build representations for questions and passages separately;
  2. an attention layer, in which hierarchical attention networks are designed to capture the relation between question and passage at different levels of granularity;
  3. a match layer, where the refined question and passage are matched by a pointer-network answer boundary predictor.
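As a rough illustration of the match layer in part 3, below is a minimal PyTorch-style sketch of a bilinear answer-boundary predictor that scores every passage position as a start or end of the span. The class name, shapes, and the use of a single question summary vector are my own assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BoundaryPointer(nn.Module):
    """Sketch of a pointer-style answer-boundary predictor (illustrative only)."""

    def __init__(self, hidden_size):
        super().__init__()
        # Bilinear match between a question summary vector and each passage position.
        self.w_start = nn.Linear(hidden_size, hidden_size, bias=False)
        self.w_end = nn.Linear(hidden_size, hidden_size, bias=False)

    def forward(self, passage, question_summary):
        # passage:          (batch, passage_len, hidden)
        # question_summary: (batch, hidden), e.g. a weighted sum of question states
        q = question_summary.unsqueeze(2)                          # (batch, hidden, 1)
        start_logits = torch.bmm(self.w_start(passage), q).squeeze(2)
        end_logits = torch.bmm(self.w_end(passage), q).squeeze(2)
        # Log-probabilities over passage positions for the answer start / end.
        return F.log_softmax(start_logits, dim=1), F.log_softmax(end_logits, dim=1)


# Example usage with random tensors:
pointer = BoundaryPointer(hidden_size=128)
start_logp, end_logp = pointer(torch.randn(2, 40, 128), torch.randn(2, 128))
```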

3. Model

3.1 Task Description

We are given a passage $P = \{p_1, p_2, \dots, p_n\}$ and a question $Q = \{q_1, q_2, \dots, q_m\}$, where $n$ is the number of words in the passage and $m$ is the number of words in the question. In SQuAD, the answer $A$ is a contiguous span of the passage. The objective of the reading comprehension task is to learn a function $f$ with $A = f(P, Q)$, and the training data can be written as triples $\{(P_i, Q_i, A_i)\}_{i=1}^{N}$.
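For reference, a standard training objective for such span-extraction models (my own addition for illustration; the paper's exact formulation may differ) maximizes the log-likelihood of the gold start and end positions:

$$\mathcal{L}(\theta) = -\frac{1}{N}\sum_{i=1}^{N}\Big[\log P_\theta\big(a_i^{s}\mid P_i, Q_i\big) + \log P_\theta\big(a_i^{e}\mid P_i, Q_i\big)\Big]$$

where $a_i^{s}$ and $a_i^{e}$ are the start and end positions of answer $A_i$ in passage $P_i$.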

3.2 Encode-Interaction-Pointer Framework

The main contribution lies in the attention layer.

3.3 Hierarchical Attention Fusion Network

Our key idea: a fine-grained attention mechanism requires first roughly seeing the potential answer domain and then progressively locating the most discriminative parts of that domain.
As shown in the figure above, our model is composed of:
1. a basic co-attention layer with shallow semantic fusion
2. a self-attention layer with deep semantic fusion
3. a memory-wise bilinear alignment function
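To make the components above concrete, a generic word-level formulation of bilinear co-attention looks as follows (my own sketch of the standard form; the paper's exact alignment and its memory-wise variant may differ):

$$S_{ij} = u_i^{P\top} W\, u_j^{Q}, \qquad \alpha_{ij} = \frac{\exp(S_{ij})}{\sum_{j'}\exp(S_{ij'})}, \qquad \tilde{u}_i^{P} = \sum_{j}\alpha_{ij}\, u_j^{Q}$$

where $u^{P}$ and $u^{Q}$ are passage and question word representations, $W$ is a trainable bilinear matrix, and the attended vectors $\tilde{u}^{P}$ are then fused with the original representations.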

The model has two distinctive features:
1. A fine-grained fusion approach to blend attention vectors for a better understanding of the relationship between question and passage;
2. A multi-granularity attention mechanism applied at the word and sentence-level, enabling it to properly attend to the most important content when constructing the question and passage representation.

3.4 Language Model & Encoder Layer

We then use a Bi-LSTM to obtain the final word representations:
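The equation referenced here did not survive extraction; below is a minimal PyTorch-style sketch of an encoder that concatenates word embeddings with precomputed pretrained language-model embeddings (e.g. ELMo-style) and runs a Bi-LSTM over them. The module name, dimensions, and the frozen-LM assumption are mine, not the paper's.

```python
import torch
import torch.nn as nn

class EncoderLayer(nn.Module):
    """Sketch: word embedding + precomputed LM embedding -> Bi-LSTM encoder."""

    def __init__(self, vocab_size, word_dim=300, lm_dim=1024, hidden=128):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        self.bilstm = nn.LSTM(word_dim + lm_dim, hidden,
                              batch_first=True, bidirectional=True)

    def forward(self, token_ids, lm_embeddings):
        # token_ids:     (batch, seq_len) word indices
        # lm_embeddings: (batch, seq_len, lm_dim) precomputed pretrained-LM vectors
        x = torch.cat([self.word_emb(token_ids), lm_embeddings], dim=-1)
        outputs, _ = self.bilstm(x)          # (batch, seq_len, 2 * hidden)
        return outputs


# Example usage with random inputs:
enc = EncoderLayer(vocab_size=1000)
reps = enc(torch.randint(0, 1000, (2, 20)), torch.randn(2, 20, 1024))
```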

3.5 Hierarchical Attention & Fusion Layer

The attention layer connects and fuses information from the question and the passage. It aims to align the question and the passage so that we can better locate the passage content most relevant to the question.

We propose a hierarchical attention structure that combines the co-attention mechanism with the self-attention mechanism.
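A minimal sketch of how co-attention, fusion, and self-attention might be chained in this layer (a generic reconstruction under my own assumptions; the paper's exact fusion function and gating may differ):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fuse(x, y, linear):
    """Shallow fusion of a representation x with its attended counterpart y."""
    return torch.tanh(linear(torch.cat([x, y, x * y, x - y], dim=-1)))

class HierarchicalAttention(nn.Module):
    """Sketch: bilinear co-attention + fusion, then self-attention + fusion."""

    def __init__(self, hidden):
        super().__init__()
        self.w_co = nn.Linear(hidden, hidden, bias=False)     # passage-question alignment
        self.w_self = nn.Linear(hidden, hidden, bias=False)   # passage self-alignment
        self.fuse_co = nn.Linear(4 * hidden, hidden)
        self.fuse_self = nn.Linear(4 * hidden, hidden)

    def forward(self, passage, question):
        # passage: (batch, n, hidden), question: (batch, m, hidden)
        scores = torch.bmm(self.w_co(passage), question.transpose(1, 2))   # (batch, n, m)
        q_aligned = torch.bmm(F.softmax(scores, dim=2), question)          # question-aware passage
        p = fuse(passage, q_aligned, self.fuse_co)

        self_scores = torch.bmm(self.w_self(p), p.transpose(1, 2))         # (batch, n, n)
        p_aligned = torch.bmm(F.softmax(self_scores, dim=2), p)
        return fuse(p, p_aligned, self.fuse_self)


# Example usage with random tensors:
attn = HierarchicalAttention(hidden=128)
fused_passage = attn(torch.randn(2, 40, 128), torch.randn(2, 15, 128))
```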

I did not fully understand this part.

3.6 Model & Output Layer
