@songying 2018-09-04

Multi-Granularity Hierarchical Attention Fusion Networks for Reading Comprehension and Question Answering

squad2


As of 2018-09-04, this model ranks first on the SQuAD leaderboard.
Datasets: SQuAD, TriviaQA, AddSent, AddOneSent

Introduction

Our approach is inspired by the way humans read:
1. First, people scan through the whole passage to catch a glimpse of its main body.
2. Then, with the question in mind, people make connections between the passage and the question, and understand the main intent of the question as it relates to the passage's theme. A rough answer span is then located in the passage, and attention can be focused on that context.
3. Finally, to avoid forgetting the question, people come back to the question and select the best answer according to the previously located answer span.

Inspired by this reading pattern, we propose a hierarchical attention network that can progressively focus attention on the correct answer span.

As shown in the figure above, our model consists of three main parts:

  1. an encoder layer, where pretrained language models and recurrent neural networks are used to build representations for questions and passages separately;
  2. an attention layer, in which hierarchical attention networks are designed to capture the relation between question and passage at different levels of granularity;
  3. a match layer, where the refined question and passage are matched by a pointer-network answer boundary predictor.
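As a rough illustration of the match layer in part 3, below is a minimal PyTorch-style sketch of a bilinear answer-boundary predictor that scores every passage position as a start or end of the span. The class name, shapes, and the use of a single question summary vector are my own assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BoundaryPointer(nn.Module):
    """Sketch of a pointer-style answer-boundary predictor (illustrative only)."""

    def __init__(self, hidden_size):
        super().__init__()
        # Bilinear match between a question summary vector and each passage position.
        self.w_start = nn.Linear(hidden_size, hidden_size, bias=False)
        self.w_end = nn.Linear(hidden_size, hidden_size, bias=False)

    def forward(self, passage, question_summary):
        # passage:          (batch, passage_len, hidden)
        # question_summary: (batch, hidden), e.g. a weighted sum of question states
        q = question_summary.unsqueeze(2)                          # (batch, hidden, 1)
        start_logits = torch.bmm(self.w_start(passage), q).squeeze(2)
        end_logits = torch.bmm(self.w_end(passage), q).squeeze(2)
        # Log-probabilities over passage positions for the answer start / end.
        return F.log_softmax(start_logits, dim=1), F.log_softmax(end_logits, dim=1)


# Example usage with random tensors:
pointer = BoundaryPointer(hidden_size=128)
start_logp, end_logp = pointer(torch.randn(2, 40, 128), torch.randn(2, 128))
```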

3. Model

3.1 Task Description

We are given a passage $P = \{p_1, p_2, \dots, p_n\}$ and a question $Q = \{q_1, q_2, \dots, q_m\}$, where $n$ is the number of words in the passage and $m$ is the number of words in the question. In SQuAD, the answer $A$ is a contiguous span of the passage. The objective of the reading comprehension task is to learn a function $f$ with $A = f(P, Q)$, and the training data can be written as triples $\{(P_i, Q_i, A_i)\}_{i=1}^{N}$.
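For reference, a standard training objective for such span-extraction models (my own addition for illustration; the paper's exact formulation may differ) maximizes the log-likelihood of the gold start and end positions:

$$\mathcal{L}(\theta) = -\frac{1}{N}\sum_{i=1}^{N}\Big[\log P_\theta\big(a_i^{s}\mid P_i, Q_i\big) + \log P_\theta\big(a_i^{e}\mid P_i, Q_i\big)\Big]$$

where $a_i^{s}$ and $a_i^{e}$ are the start and end positions of answer $A_i$ in passage $P_i$.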

3.2 Encode-Interaction-Pointer Framework

The main contribution lies in the attention layer.

3.3 Hierarchical Attention Fusion Network

Our key idea: a fine-grained attention mechanism requires first roughly seeing the potential answer domain and then progressively locating the most discriminative parts of that domain.
As shown in the figure above, our model is composed of:
1. a basic co-attention layer with shallow semantic fusion
2. a self-attention layer with deep semantic fusion
3. a memory-wise bilinear alignment function
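To make the components above concrete, a generic word-level formulation of bilinear co-attention looks as follows (my own sketch of the standard form; the paper's exact alignment and its memory-wise variant may differ):

$$S_{ij} = u_i^{P\top} W\, u_j^{Q}, \qquad \alpha_{ij} = \frac{\exp(S_{ij})}{\sum_{j'}\exp(S_{ij'})}, \qquad \tilde{u}_i^{P} = \sum_{j}\alpha_{ij}\, u_j^{Q}$$

where $u^{P}$ and $u^{Q}$ are passage and question word representations, $W$ is a trainable bilinear matrix, and the attended vectors $\tilde{u}^{P}$ are then fused with the original representations.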

The model has two distinctive features:
1. A fine-grained fusion approach to blend attention vectors for a better understanding of the relationship between question and passage;
2. A multi-granularity attention mechanism applied at the word and sentence-level, enabling it to properly attend to the most important content when constructing the question and passage representation.

3.4 Language Model & Encoder Layer

We then use a Bi-LSTM to obtain the final word representations:
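The equation referenced here did not survive extraction; below is a minimal PyTorch-style sketch of an encoder that concatenates word embeddings with precomputed pretrained language-model embeddings (e.g. ELMo-style) and runs a Bi-LSTM over them. The module name, dimensions, and the frozen-LM assumption are mine, not the paper's.

```python
import torch
import torch.nn as nn

class EncoderLayer(nn.Module):
    """Sketch: word embedding + precomputed LM embedding -> Bi-LSTM encoder."""

    def __init__(self, vocab_size, word_dim=300, lm_dim=1024, hidden=128):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        self.bilstm = nn.LSTM(word_dim + lm_dim, hidden,
                              batch_first=True, bidirectional=True)

    def forward(self, token_ids, lm_embeddings):
        # token_ids:     (batch, seq_len) word indices
        # lm_embeddings: (batch, seq_len, lm_dim) precomputed pretrained-LM vectors
        x = torch.cat([self.word_emb(token_ids), lm_embeddings], dim=-1)
        outputs, _ = self.bilstm(x)          # (batch, seq_len, 2 * hidden)
        return outputs


# Example usage with random inputs:
enc = EncoderLayer(vocab_size=1000)
reps = enc(torch.randint(0, 1000, (2, 20)), torch.randn(2, 20, 1024))
```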

3.5 Hierarchical Attention & Fusion Layer

The attention layer connects and fuses information from the question and the passage. It aims to align the question and the passage so that we can better locate the passage content most relevant to the question.

We propose a hierarchical attention structure that combines the co-attention mechanism with the self-attention mechanism.
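A minimal sketch of how co-attention, fusion, and self-attention might be chained in this layer (a generic reconstruction under my own assumptions; the paper's exact fusion function and gating may differ):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fuse(x, y, linear):
    """Shallow fusion of a representation x with its attended counterpart y."""
    return torch.tanh(linear(torch.cat([x, y, x * y, x - y], dim=-1)))

class HierarchicalAttention(nn.Module):
    """Sketch: bilinear co-attention + fusion, then self-attention + fusion."""

    def __init__(self, hidden):
        super().__init__()
        self.w_co = nn.Linear(hidden, hidden, bias=False)     # passage-question alignment
        self.w_self = nn.Linear(hidden, hidden, bias=False)   # passage self-alignment
        self.fuse_co = nn.Linear(4 * hidden, hidden)
        self.fuse_self = nn.Linear(4 * hidden, hidden)

    def forward(self, passage, question):
        # passage: (batch, n, hidden), question: (batch, m, hidden)
        scores = torch.bmm(self.w_co(passage), question.transpose(1, 2))   # (batch, n, m)
        q_aligned = torch.bmm(F.softmax(scores, dim=2), question)          # question-aware passage
        p = fuse(passage, q_aligned, self.fuse_co)

        self_scores = torch.bmm(self.w_self(p), p.transpose(1, 2))         # (batch, n, n)
        p_aligned = torch.bmm(F.softmax(self_scores, dim=2), p)
        return fuse(p, p_aligned, self.fuse_self)


# Example usage with random tensors:
attn = HierarchicalAttention(hidden=128)
fused_passage = attn(torch.randn(2, 40, 128), torch.randn(2, 15, 128))
```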

I did not fully understand this part.

3.6 Model & Output Layer
