Aspect-level classification, paper 3: Interactive Attention Networks for Aspect-Level Sentiment Classification (IAN) (IJCAI 2017)

Jasminexjf · published 2019-06-24 21:05:07

Paper: Interactive Attention Networks for Aspect-Level Sentiment Classification

Source: IJCAI 2017

Task: aspect-level sentiment classification

Author: Jasminexjf

Time: 2019-06-24

Reference: https://www.jianshu.com/p/0c9d987141b6

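I. Introduction

The paper "Interactive Attention Networks for Aspect-Level Sentiment Classification" uses an attention mechanism to connect the target with its context for aspect-level sentiment classification.

Aspect-level sentiment classification is easiest to explain with an example: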

a group of friendly staff, the pizza is not bad, but the beef cubes are not worth the money!

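In this sentence the sentiment toward staff, pizza, and beef cubes differs, so a single sentence carries sentiment at several levels. These three nouns are the targets, and the surrounding text is the context.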
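
To make the task format concrete, here is a minimal sketch of how one sentence expands into one classification instance per target. The function name `build_instances` and the labels are assumptions for illustration (one plausible reading of the sentence, not annotations from a dataset).

```python
# One sentence yields one training instance per target (aspect).
sentence = ("a group of friendly staff, the pizza is not bad, "
            "but the beef cubes are not worth the money!")

# Illustrative labels only: an assumed reading of the sentence,
# not taken from any annotated dataset.
targets = {
    "staff": "positive",
    "pizza": "neutral",       # "not bad" read here as lukewarm
    "beef cubes": "negative",
}


def build_instances(sentence, targets):
    """Return one {context, target, label} record per target."""
    instances = []
    for target, label in targets.items():
        if target not in sentence:
            raise ValueError(f"target {target!r} not found in sentence")
        instances.append(
            {"context": sentence, "target": target, "label": label}
        )
    return instances


instances = build_instances(sentence, targets)
for inst in instances:
    print(f"{inst['target']:>10} -> {inst['label']}")
```

Every instance shares the same context string, so the classifier has to rely on the target to tell the three cases apart.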

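The paper therefore argues that in aspect-level sentiment classification the target and the context should interact: the context representation should be target-specific, and the target representation should be context-specific. Earlier models either model the two separately or model only one of them; this paper uses attention to make them interact.

The Introduction mentions some traditional approaches, such as bag-of-words features, sentiment lexicons, and SVMs. These methods hit a performance bottleneck fairly easily, and current work generally uses deep learning for aspect-level sentiment analysis.

A 2011 paper explained why the traditional methods hit such a bottleneck (40% of errors): they did not make effective use of target information. Many later papers then focused on the role of the target, but they overlooked the mutual relationship between target and context.

The authors' view is: "In our opinion, only the coordination of targets and their contexts can really enhance the performance of sentiment classification."

Examples: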

“The picture quality is clear-cut but the battery life is too short”

“Short fat noodle spoon, relatively deep some curva”

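Clearly, in the first sentence "short" is negative for the target "battery life", while in the second sentence "short" is neutral for the target "spoon".

So the sentiment can only be classified correctly by relating "short" to the specific target it describes and to the surrounding context.

So how should this be modeled?

The target and the context that describes it can infer each other (interactive inference). So although the two can be modeled separately, the model learns them through the interaction between them.
Both the target and the context can contain more than one word, for example the target "picture quality" and the context "clear-cut". Here "picture" matters more than "quality", so the words need different weights, which naturally brings in the attention mechanism. The paper is also presented as the first to propose computing attention weights separately for the target and the context. In addition, the same adjective can carry different sentiment polarity for different targets, so the link between target and context matters as well.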

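II. Model

Following the framework diagram of the IAN model: the target and the context are each mapped to word embeddings, and the embeddings are fed into LSTM networks to obtain hidden states. The average of the target hidden states and the average of the context hidden states are then combined with the attention mechanism to generate attention weights: the pooled target vector guides attention over the context, and the pooled context vector guides attention over the target. Finally, the attention-weighted target representation and context representation are concatenated and used as input to the softmax function, which gives the classification result.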
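
The data flow described above can be sketched in plain Python. This is a rough illustration under stated assumptions, not the authors' implementation: the LSTM is skipped and its hidden states are replaced by random toy vectors, all weights are random placeholders, the score function is assumed to be the bilinear form tanh(h · W · q + b), and the sketch concatenates the attention-weighted representations (weighted sums of hidden states) rather than the raw weights. All function and parameter names are hypothetical.

```python
import math
import random


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(W, v):
    return [dot(row, v) for row in W]


def mean_pool(states):
    """Average the hidden states over the time axis."""
    return [sum(col) / len(states) for col in zip(*states)]


def attend(states, query, W, b):
    """Attention over `states`, guided by the pooled `query` vector.

    Assumed score: tanh(states[i] . (W @ query) + b).
    Returns the attention weights and the weighted sum of the states.
    """
    Wq = matvec(W, query)
    weights = softmax([math.tanh(dot(h, Wq) + b) for h in states])
    rep = [sum(w * h[k] for w, h in zip(weights, states))
           for k in range(len(states[0]))]
    return weights, rep


def ian_forward(context_states, target_states, p):
    c_avg = mean_pool(context_states)
    t_avg = mean_pool(target_states)
    # The interaction: each side is attended using the other side's average.
    c_weights, c_rep = attend(context_states, t_avg, p["W_c"], p["b_c"])
    t_weights, t_rep = attend(target_states, c_avg, p["W_t"], p["b_t"])
    d = t_rep + c_rep  # list concatenation, length 2 * hidden size
    logits = [math.tanh(dot(row, d) + bias)
              for row, bias in zip(p["W_out"], p["b_out"])]
    return softmax(logits), c_weights, t_weights


# Toy setup: hidden size 4, 5 context words, 2 target words, 3 classes.
rng = random.Random(0)
H, N_CLASSES = 4, 3


def rand_vec(n, scale):
    return [rng.uniform(-scale, scale) for _ in range(n)]


def rand_mat(rows, cols, scale):
    return [rand_vec(cols, scale) for _ in range(rows)]


params = {
    "W_c": rand_mat(H, H, 0.1), "b_c": 0.0,
    "W_t": rand_mat(H, H, 0.1), "b_t": 0.0,
    "W_out": rand_mat(N_CLASSES, 2 * H, 0.1), "b_out": [0.0] * N_CLASSES,
}
context_states = [rand_vec(H, 1.0) for _ in range(5)]  # stand-in LSTM outputs
target_states = [rand_vec(H, 1.0) for _ in range(2)]   # stand-in LSTM outputs

probs, c_weights, t_weights = ian_forward(context_states, target_states, params)
print("class probabilities:", probs)
print("context attention:  ", c_weights)
print("target attention:   ", t_weights)
```

With untrained random weights the attention is close to uniform; the point of the sketch is only the wiring, in which neither side's attention can be computed without the other side's pooled vector.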

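1. Word embedding

The word embeddings in this paper are pre-trained word vectors. Typically, vectors trained on a large corpus serve as the initial embeddings, and they are then fine-tuned during subsequent training.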
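
As a small illustration of "initialize from pre-trained vectors, then fine-tune", the sketch below builds an embedding table and applies one gradient step to a row. The toy 3-dimensional vectors, the helper names, and the choice to draw out-of-vocabulary words from a uniform range of ±0.1 are all assumptions for the example, not details stated in this post.

```python
import random


def build_embedding_table(vocab, pretrained, dim, seed=0):
    """Row i holds the vector for vocab[i].

    Known words copy their pre-trained vector; unknown words get a small
    random vector (assumed range: uniform in [-0.1, 0.1]).
    """
    rng = random.Random(seed)
    table = []
    for word in vocab:
        if word in pretrained:
            # Copy, so fine-tuning never mutates the pre-trained source.
            table.append(list(pretrained[word]))
        else:
            table.append([rng.uniform(-0.1, 0.1) for _ in range(dim)])
    return table


def sgd_update(table, word_index, grad, lr=0.1):
    """Fine-tune one row: row <- row - lr * grad."""
    table[word_index] = [w - lr * g
                         for w, g in zip(table[word_index], grad)]


# Hypothetical pre-trained vectors (real ones would come from a large corpus).
pretrained = {"pizza": [0.2, -0.1, 0.4], "short": [-0.3, 0.5, 0.1]}
vocab = ["pizza", "short", "curva"]  # "curva" has no pre-trained vector

table = build_embedding_table(vocab, pretrained, dim=3)
sgd_update(table, vocab.index("pizza"), grad=[1.0, 0.0, -1.0], lr=0.1)

print("fine-tuned 'pizza':", table[0])
print("original   'pizza':", pretrained["pizza"])
```

In a real model the gradient would come from backpropagation through the LSTM and attention layers; here it is hard-coded to show that the table row moves while the pre-trained source stays unchanged.
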
2. LSTM