Comments (3)
@hosseinfani
Hi, I will finish it by tonight.
from lady.
@farinamhz
where is the summary? you were supposed to finish it by now!
Main problem
This paper addresses aspect detection in review text, working specifically on the restaurant domain. In sentiment analysis, an aspect is a dimension on which an entity is evaluated.
Aspects can take two forms: 1. Concrete, like "a laptop battery"; 2. Subjective, like "the loudness of a motorcycle."
Existing work
Existing methods to detect aspects in reviews can be divided into two categories based on the level of human supervision of the method:
- Supervised: Most existing systems are supervised. Because aspects are domain-specific, these methods do not transfer well between domains. In addition, training data is scarce in many domains and languages, so supervised methods are unsuitable in many cases.
- Unsupervised: Existing unsupervised work includes topic models and neural models with complex architectures and a large number of parameters, although a simpler model can reach comparable accuracy in aspect extraction.
Inputs
- A collection of reviews with no human supervision
Outputs
- An aspect label for each review sentence, chosen from three labels: Food, Service, and Ambience.
Only these three labels are used because the authors want to compare their work with previous work, which reported that the other labels, Anecdotes and Price, were not reliably annotated.
Example
- "The sushi was great": The aspect label in this case is "Food".
Proposed Method
Unlike unsupervised deep neural networks, which are fit on a specific corpus and require users to manually link the discovered aspects to labels, this model links aspects to labels automatically and is not fit on a specific corpus.
The figure below is the schematic view of the Contrastive attention model:
Given a sentence S and a set of aspect terms A, both represented by word embeddings, we want to find the aspect label of the sentence:
First, the model computes an attention-weighted average of the words in the sentence, where the attention comes from the similarity between words and aspect terms. This similarity is defined by an RBF kernel. The output is an attention vector that weights the words of the sentence, and the weighted average serves as the sentence summary.
Then, it links the sentence summary to one of the three labels by assigning the sentence the label closest to the summary.
The attention for a word is the sum of its RBF responses to all aspect terms, divided by the sum of the RBF responses of all words in the sentence.
Formulas of these two are attached below:
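Written out from the description above, the two formulas take roughly the following form. The notation is an assumption made here for illustration: w is the embedding of a word in S, a is the embedding of an aspect term in A, and γ is the width parameter of the RBF kernel.

$$
\mathrm{rbf}(x, y, \gamma) = \exp\left(-\gamma \, \lVert x - y \rVert_2^2\right)
$$

$$
\mathrm{att}(w) = \frac{\sum_{a \in A} \mathrm{rbf}(w, a, \gamma)}{\sum_{w' \in S} \sum_{a \in A} \mathrm{rbf}(w', a, \gamma)}
$$

The numerator is the summed RBF response of one word to all aspect terms, and the denominator is the same quantity summed over every word in the sentence, so the attention values over S add up to 1.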
They compute the RBF kernel between every word in the sentence and every aspect term, and for each word divide its summed response by the sum of all RBF responses. This yields an attention distribution over the sentence (like a probability distribution). Multiplying this attention distribution by the tokens in the sentence then works like a standard attention mechanism.
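The steps above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the official implementation: the 3-dimensional toy embeddings, the value of gamma, the function names, and the use of cosine similarity to pick the closest label are all hypothetical choices made for this example.

```python
import numpy as np

def rbf_responses(words, aspects, gamma):
    """RBF kernel between every word and every aspect term.

    words: (n_words, dim), aspects: (n_aspects, dim).
    Returns an (n_words, n_aspects) matrix of responses.
    """
    diff = words[:, None, :] - aspects[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    return np.exp(-gamma * sq_dist)

def rbf_attention(words, aspects, gamma):
    """Attention per word: its summed RBF response over all summed responses."""
    per_word = rbf_responses(words, aspects, gamma).sum(axis=1)
    return per_word / per_word.sum()

def assign_label(words, aspects, label_vectors, label_names, gamma):
    """Summarize the sentence with attention, then pick the closest label.

    Closeness is measured with cosine similarity (an assumption here).
    """
    att = rbf_attention(words, aspects, gamma)
    summary = att @ words  # attention-weighted average of the word vectors
    sims = label_vectors @ summary / (
        np.linalg.norm(label_vectors, axis=1) * np.linalg.norm(summary)
    )
    return label_names[int(np.argmax(sims))], att

# Toy embeddings (hypothetical): "the sushi was great"
sentence = np.array([
    [0.3, 0.3, 0.3],  # the
    [1.0, 0.0, 0.0],  # sushi
    [0.3, 0.3, 0.3],  # was
    [0.4, 0.4, 0.3],  # great
])
# Toy aspect terms (hypothetical): pizza, waiter, decor
aspect_terms = np.array([
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.1],
    [0.1, 0.0, 0.9],
])
label_names = ["Food", "Service", "Ambience"]
label_vectors = np.eye(3)  # toy label embeddings (hypothetical)

label, att = assign_label(sentence, aspect_terms, label_vectors, label_names, gamma=10.0)
print(label, att.round(3))
```

In this toy setup the word "sushi" lies close to an aspect term and receives most of the attention mass, so the sentence summary ends up nearest to the "Food" label. Words such as "the" are far from every aspect term and contribute very little.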
Experimental Setup
Dataset
In all datasets, each sentence has been annotated with one or more aspect labels.
- Test set (and Evaluation): Citysearch dataset
- Development sets: Restaurant subsets of the SemEval 2014 and SemEval 2015.
Metrics
- Precision
- Recall
- F-scores
In addition, they compared weighted macro averages.
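As a rough illustration of these metrics, the sketch below computes per-label precision, recall, and F-score, plus an average weighted by the number of gold sentences per label. Two assumptions are made here: each sentence carries a single label, and "weighted macro average" means weighting each label's score by its support. The function names and the toy labels are hypothetical.

```python
from collections import Counter

def per_label_scores(gold, pred):
    """Return {label: (precision, recall, f1)} for single-label predictions."""
    scores = {}
    for label in sorted(set(gold) | set(pred)):
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[label] = (precision, recall, f1)
    return scores

def weighted_average(gold, scores):
    """Average precision, recall, and F1 weighted by each label's gold support."""
    support = Counter(gold)
    total = sum(support.values())
    return tuple(
        sum(scores[label][i] * count for label, count in support.items()) / total
        for i in range(3)
    )

# Toy gold and predicted labels (hypothetical)
gold = ["Food", "Food", "Service", "Ambience"]
pred = ["Food", "Service", "Service", "Ambience"]

scores = per_label_scores(gold, pred)
avg_precision, avg_recall, avg_f1 = weighted_average(gold, scores)
print(scores)
print(avg_precision, avg_recall, avg_f1)
```

Weighting by support means that frequent labels such as Food influence the average more than rare ones, which matters when the label distribution is skewed.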
Baselines
- W2VLDA: Topic modeling method that computes the similarity from a word to a set of aspects. (Unsupervised)
- SERBM: This model learns topic distributions and assigns words to the distributions. Therefore, it assigns words to aspects. (Unsupervised)
- ABAE: An attention-based system that learns an attention distribution over words in the sentence by considering the global context and aspect vectors. In fact, it is an autoencoder using an attention mechanism. (Unsupervised)
- AE-CSA: An attention-based model that is similar to ABAE but also considers sense and sememe vectors. (Unsupervised)
What is a sememe?: A sememe is a semantic unit of meaning that corresponds to a morpheme. For example, the meaning carried by -er in singer, namely someone performing the action of singing, would be a sememe. The meaning carried by sing- is another instance of a sememe in this sense.
What is a sense?: In linguistics, a word sense is one of the meanings of a word. For example, a dictionary may list over 50 different senses of the word "play," each with a different meaning depending on the context in which the word is used.
- Mean: A baseline based on the mean of word embeddings.
Results
The main contribution of this paper is an unsupervised model for detecting the aspects of reviews. The results show that this unsupervised method, which combines frequency with an attention mechanism based on RBF kernels and an automated aspect assignment method, detects aspects more accurately than baselines with more complex architectures, such as neural models and topic models.
Results are reported for the baselines and for three models from this paper. The first model is Mean, which uses the mean of the word vectors. The second is Attention, which uses dot-product attention. The third, CAt, is the complete model containing both.
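To make the Mean model concrete, here is a minimal sketch of a mean-of-word-vectors classifier. It assumes that the sentence summary is the unweighted mean of the word embeddings and that the label is chosen by cosine similarity; the toy vectors and function names are hypothetical and only serve the illustration.

```python
import numpy as np

def mean_summary(words):
    """Unweighted mean of the word embeddings in a sentence."""
    return words.mean(axis=0)

def nearest_label(summary, label_vectors, label_names):
    """Pick the label whose vector has the highest cosine similarity."""
    sims = label_vectors @ summary / (
        np.linalg.norm(label_vectors, axis=1) * np.linalg.norm(summary)
    )
    return label_names[int(np.argmax(sims))]

# Toy embeddings (hypothetical): "the waiter was rude"
sentence = np.array([
    [0.3, 0.3, 0.3],  # the
    [0.0, 1.0, 0.1],  # waiter
    [0.3, 0.3, 0.3],  # was
    [0.2, 0.6, 0.1],  # rude
])
label_names = ["Food", "Service", "Ambience"]
label_vectors = np.eye(3)  # toy label embeddings (hypothetical)

summary = mean_summary(sentence)
mean_label = nearest_label(summary, label_vectors, label_names)
print(mean_label, summary.round(3))
```

Because every word counts equally here, function words pull the summary toward the middle of the space. The attention-based variants differ in that they down-weight such words before the label is chosen.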
Code
There is a repository on GitHub, which is the official implementation of this paper.
Presentation
There is a presentation in this link.