
Text sentiment analysis

author:Informed Dali Little Fish w

Psychological Terminology








Basic Information

Text sentiment analysis, also known as opinion mining or tendency analysis, is, simply put, the process of analyzing, processing, summarizing, and reasoning about subjective text that carries emotional coloring. The Internet (e.g., blogs, forums, and social service networks such as Dianping) generates a large amount of valuable review information, contributed by users, about events, products, and so on. These reviews express people's various emotional colors and tendencies, such as joy, anger, sadness, and happiness, as well as criticism and praise. Based on this, potential users can get a sense of public opinion about an event or product by browsing these subjective reviews.

Brief introduction

Sentiment analysis, or opinion mining, is the process of evaluating people's opinions, emotions, and attitudes toward entities such as products, services, and organizations. The field's beginning and rapid growth benefited from the rapid development of social media on the Internet, such as product reviews and forum discussions, because this is the first time in human history that such a huge volume of opinions has been recorded in digital form. Since the early 2000s, sentiment analysis has grown into one of the most active research areas in natural language processing (NLP). It is also widely studied in data mining, web mining, text mining, and information retrieval. In fact, it has spread from computer science to management science and the social sciences, such as marketing, finance, political science, communications, medical science, and even history, because its commercial importance has attracted broad attention.

This spread is due to the fact that opinions are central to almost all human activities, and people care a great deal about what others think. For this reason, whenever we need to make a decision, we often look for the opinions of others. This is true not only for businesses but also for individuals.

Nowadays, a person who wants to buy a consumer product is no longer limited to asking friends and family for their opinions, because many user reviews and product discussions are available in public forums on the web. We can find what we want to know in the comments, sometimes with unexpected gains. An organization may no longer need to conduct surveys, polls, and focus groups to gather public opinion, because such information is abundantly available in public.

In recent years, we have seen social media posts reshape corporate images, discuss celebrities' lives, and sway public sentiment and emotions, which has profoundly affected our social and political systems; such posts have also mobilized mass political change. While we marvel at the power of people's words, we also have to admit that the rapid development of social networks has brought serious moral problems. Sentiment analysis arose from this: sentiment analysis or public opinion systems can help a government monitor changes in the public mood and trends in public opinion, so as to avoid vicious events or false incidents.

Generally speaking, the purpose of sentiment analysis is to find out the attitude of the speaker or author toward certain topics, or the overall polarity of a text.

Text level


Text-level sentiment classification, i.e., classification of a whole article, assigns an overall sentiment direction or polarity: it determines whether the article (e.g., a full online review) conveys an overall positive or negative opinion. In this setting it is a binary classification task. It can also be a regression task, for example inferring an overall rating of 1 to 5 stars from a review. It can also be treated as a five-class classification task.
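The three framings above differ only in how the target is encoded. A minimal sketch in Python, assuming (this is not stated in the text) that 4 and 5 stars count as positive and 1 to 3 stars as negative; the function names are illustrative:

```python
def _check(stars: int) -> None:
    if not 1 <= stars <= 5:
        raise ValueError("stars must be between 1 and 5")


def binary_label(stars: int) -> str:
    """Binary view: collapse a 1-5 star rating into positive/negative."""
    _check(stars)
    # Assumption: 4-5 stars are positive, 1-3 stars are negative.
    return "positive" if stars >= 4 else "negative"


def five_class_label(stars: int) -> int:
    """Five-class view: each star value is its own class, numbered 0..4."""
    _check(stars)
    return stars - 1


def regression_target(stars: int) -> float:
    """Regression view: the rating is a number to be predicted."""
    _check(stars)
    return float(stars)


# Made-up reviews with star ratings, encoded under all three views.
reviews = [("Great phone, love it", 5), ("Stopped working in a week", 1)]
encoded = [
    (text, binary_label(s), five_class_label(s), regression_target(s))
    for text, s in reviews
]
```

The same labeled reviews can therefore feed a binary classifier, a five-class classifier, or a regressor, depending on which encoding is chosen.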

We can combine natural language processing with fuzzy logic to perform sentiment analysis on news stories and movie reviews based on a manually created fuzzy emotion dictionary. The emotion types are defined first, and the emotion categories and their intensities are marked in the fuzzy emotion dictionary. Each word can belong to more than one emotion category. In experiments, the results obtained with different features can be compared, such as word frequency, length-related features, semantic orientation, sentiment PMI-IR, emphatic words, and special symbols. Finally, the article is judged as active or passive and as positive or negative.
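A fuzzy emotion dictionary of this kind can be sketched as a mapping from each word to several emotion categories with intensities. The entries and intensity values below are invented for illustration, and summing intensities is only one possible way to aggregate them:

```python
from collections import defaultdict

# Hypothetical fuzzy emotion dictionary:
# word -> {emotion category: intensity in [0, 1]}.
# A word may belong to more than one category.
FUZZY_EMOTIONS = {
    "thrilling": {"joy": 0.8, "fear": 0.4},
    "tragic": {"sadness": 0.9},
    "shocking": {"fear": 0.6, "anger": 0.5},
    "delightful": {"joy": 0.9},
}


def emotion_profile(text):
    """Sum the intensity of every emotion category over the words of a text."""
    totals = defaultdict(float)
    for token in text.lower().split():
        word = token.strip(".,!?;:\"'")
        for emotion, intensity in FUZZY_EMOTIONS.get(word, {}).items():
            totals[emotion] += intensity
    return dict(totals)


def dominant_emotion(text):
    """Return the category with the highest total intensity, or None."""
    profile = emotion_profile(text)
    if not profile:
        return None
    return max(profile, key=profile.get)


profile = emotion_profile("A thrilling and delightful film with a shocking ending.")
```

Because "thrilling" contributes to both joy and fear, the profile keeps the mixed signal instead of forcing each word into a single category.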

Sentence level

Since the sentiment of a sentence is inseparable from the sentiment of the words that make it up, sentence-level methods fall into three categories: (1) knowledge-base-based methods; (2) network-based methods; (3) corpus-based methods.

When identifying the sentiment of sentences in a text, we usually build a sentiment database that contains emoticons, abbreviations, sentiment words, modifiers, and so on. In a specific experiment, we define several emotions (anger, hatred, fear, guilt, interest, happiness, sadness, etc.) and label each sentence with one of the emotion categories and its intensity value, thereby classifying the sentiment of the sentence.
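As a rough sketch of this labeling step, the sentiment database can be modeled as three small lookup tables. All entries, the multiplier values, and the rule that a modifier scales only the next word are assumptions made for this example:

```python
# Hypothetical sentiment database.
EMOTION_WORDS = {
    "furious": ("anger", 0.9),
    "happy": ("happiness", 0.7),
    "sad": ("sadness", 0.6),
    "afraid": ("fear", 0.7),
}
MODIFIERS = {"very": 1.5, "slightly": 0.5}  # scale the following emotion word
EMOTICONS = {":)": ("happiness", 0.5), ":(": ("sadness", 0.5)}


def label_sentence(sentence):
    """Return (emotion category, intensity in [0, 1]) for one sentence."""
    scores = {}
    scale = 1.0
    for token in sentence.lower().split():
        if token in EMOTICONS:
            category, intensity = EMOTICONS[token]
            scores[category] = scores.get(category, 0.0) + intensity
            continue
        word = token.strip(".,!?")
        if word in MODIFIERS:
            scale = MODIFIERS[word]
            continue
        if word in EMOTION_WORDS:
            category, intensity = EMOTION_WORDS[word]
            scores[category] = scores.get(category, 0.0) + intensity * scale
        scale = 1.0  # a modifier only affects the word right after it
    if not scores:
        return ("neutral", 0.0)
    best = max(scores, key=scores.get)
    return (best, min(scores[best], 1.0))  # cap the intensity at 1.0
```

For instance, `label_sentence("He was slightly sad.")` yields the sadness category with a reduced intensity, because "slightly" halves the base value of "sad".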

Word level

The sentiment of words is the basis of sentiment analysis at the sentence or passage level. Early text sentiment analysis mainly focused on judging the positive and negative polarity of text. Word-level sentiment analysis methods can be summarized into three main categories: (1) dictionary-based methods; (2) network-based methods; and (3) corpus-based methods.


The dictionary-based method uses the synonym and antonym relations in a dictionary, together with the dictionary's hierarchical structure, to compute the semantic similarity between a word and positive and negative seed words, and classifies the word's sentiment according to semantic proximity.
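One simple way to approximate semantic proximity is the number of synonym links separating a word from each seed word. The sketch below makes several assumptions: it uses a tiny hand-made synonym graph, the seeds "good" and "bad", and path length as the similarity measure, and it ignores antonym links and dictionary hierarchy:

```python
from collections import deque

# Hypothetical toy synonym graph; a real system would read these links
# from a thesaurus or lexical database.
SYNONYMS = {
    "good": ["fine", "nice"],
    "fine": ["good", "superb"],
    "nice": ["good", "pleasant"],
    "superb": ["fine"],
    "pleasant": ["nice"],
    "bad": ["poor", "awful"],
    "poor": ["bad", "shoddy"],
    "awful": ["bad"],
    "shoddy": ["poor"],
}


def distance(start, goal):
    """Fewest synonym links between two words, or None if unconnected."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:  # breadth-first search over synonym links
        word, steps = queue.popleft()
        for neighbor in SYNONYMS.get(word, []):
            if neighbor == goal:
                return steps + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, steps + 1))
    return None


def polarity(word, pos_seed="good", neg_seed="bad"):
    """Classify a word by which seed it is closer to in the synonym graph."""
    d_pos = distance(word, pos_seed)
    d_neg = distance(word, neg_seed)
    if d_pos is None and d_neg is None:
        return "unknown"
    if d_neg is None or (d_pos is not None and d_pos < d_neg):
        return "positive"
    if d_pos is None or d_neg < d_pos:
        return "negative"
    return "neutral"
```

Words that cannot be reached from either seed are reported as unknown rather than guessed.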

The network-based method uses World Wide Web search engines to obtain statistics for queries, computes the degree of semantic association between a word and the positive and negative seed words, and classifies the word's sentiment accordingly.
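One well-known form of this idea is PMI-IR, which compares how often a word co-occurs with a positive seed versus a negative seed. The sketch below replaces real search-engine queries with a table of invented hit counts; the seed words, the counts, and the smoothing constant are assumptions:

```python
import math

# Hypothetical hit counts standing in for search-engine query statistics.
# Single-word keys are hits for the seed alone; tuple keys are hits for
# pages where the word appears near the seed.
HITS = {
    "excellent": 1_000_000,
    "poor": 1_000_000,
    ("reliable", "excellent"): 8_000,
    ("reliable", "poor"): 1_000,
    ("flimsy", "excellent"): 500,
    ("flimsy", "poor"): 4_000,
}


def semantic_orientation(word, pos="excellent", neg="poor", smoothing=0.01):
    """PMI(word, pos) - PMI(word, neg), computed from hit counts.

    A positive value suggests positive sentiment, a negative value
    suggests negative sentiment. Smoothing avoids division by zero.
    """
    near_pos = HITS.get((word, pos), 0) + smoothing
    near_neg = HITS.get((word, neg), 0) + smoothing
    return math.log2((near_pos * HITS[neg]) / (near_neg * HITS[pos]))


def classify(word):
    """Turn the orientation score into a polarity label."""
    return "positive" if semantic_orientation(word) > 0 else "negative"
```

With these counts, "reliable" appears about eight times more often near "excellent" than near "poor", so its orientation is close to +3 (log base 2 of 8).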

The corpus-based method uses machine learning techniques to classify the sentiment of words. Machine learning methods usually first have a classification model learn the patterns in the training data, and then use the trained model to make predictions on the test data.
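The train-then-predict workflow can be illustrated with a small multinomial Naive Bayes classifier written from scratch. This is only one possible model; the text does not name a specific one. For simplicity the sketch labels short texts rather than single words, and the training data is made up:

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Learn per-class word counts from (text, label) training pairs."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        class_counts[label] += 1
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def predict(model, text):
    """Return the most probable label for a text under the trained model."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing for unseen word/class pairs.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


train_data = [
    ("great excellent wonderful", "positive"),
    ("good great enjoyable", "positive"),
    ("terrible awful bad", "negative"),
    ("bad boring awful", "negative"),
]
model = train(train_data)
test_data = ["great enjoyable", "awful boring"]
predictions = [predict(model, text) for text in test_data]
```

The model never sees the test texts during training; it only reuses the word statistics it learned from the labeled examples.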

Information extraction

Sentiment information extraction is the lowest-level task of sentiment analysis. It aims to extract meaningful information units from sentiment review text. Extracting the word or phrase elements that contribute to the sentiment analysis result plays an important role in reducing feature dimensionality and improving system performance. Measures used for this include mutual information, expected cross-entropy, word frequency, and document frequency.
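Two of these measures can be sketched directly from document counts. The example assumes the pointwise form of mutual information commonly used in text categorization, log2 of P(term, class) / (P(term) P(class)); the tiny corpus and the threshold are invented:

```python
import math


def document_frequency(term, docs):
    """Number of documents that contain the term."""
    return sum(1 for tokens, _ in docs if term in tokens)


def mutual_information(term, label, docs):
    """Pointwise mutual information between a term and a class label."""
    n = len(docs)
    with_term = document_frequency(term, docs)
    in_class = sum(1 for _, doc_label in docs if doc_label == label)
    both = sum(
        1 for tokens, doc_label in docs if doc_label == label and term in tokens
    )
    if both == 0:
        return -math.inf  # the term never occurs in this class
    return math.log2(both * n / (with_term * in_class))


def select_features(docs, label, threshold):
    """Keep only terms whose mutual information with the label is high."""
    vocab = set().union(*(tokens for tokens, _ in docs))
    return sorted(
        term for term in vocab if mutual_information(term, label, docs) > threshold
    )


# Each document is a set of tokens plus a sentiment label.
docs = [
    ({"great", "battery", "phone"}, "positive"),
    ({"great", "screen"}, "positive"),
    ({"poor", "battery"}, "negative"),
    ({"poor", "screen", "phone"}, "negative"),
]
```

Here `select_features(docs, "positive", 0.5)` keeps only "great": topic words such as "battery" occur equally in both classes, so their mutual information with either label is zero and they are dropped, which is how the feature space shrinks.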

Extraction and discrimination of evaluation words

This covers recognizing evaluation words and judging their polarity and strength. Extraction and discrimination of evaluation words are often done as one integrated task, and there are two main approaches: corpus-based and dictionary-based.

Corpus-based evaluation word extraction and discrimination:

It mainly relies on the statistical characteristics of a large corpus and on observed regularities to mine evaluation words from the corpus and judge their polarity. Its most important advantages are:

The disadvantage is that the available review corpora are limited, and the distribution of evaluation words in a large corpus is not easy to summarize.

Dictionary-based evaluation word extraction and discrimination methods:

It mainly uses the semantic relations between words in a dictionary to mine evaluation words. The main difficulty is that how up to date the dictionary is determines the quality of the word-sense analysis.

Research Interests:

A basic step in text sentiment analysis is to classify the polarity of a given passage of text, which may be done at the sentence level or at the feature level. The purpose of the classification is to determine whether the view expressed in the text is positive, negative, or neutral. More advanced "beyond polarity" sentiment analysis also looks for more complex emotional states, such as "angry", "sad", and "happy".

In the field of text sentiment analysis, early research contributions include the work of Turney and Pang, who used a variety of methods to detect the polarity of product reviews and movie reviews. This research worked at the document level. Document opinions can also be classified on a multi-level scale, as done by Pang and Snyder, among others: Pang extended the earlier basic polarity work to classify movie reviews and predict their ratings on a 3- to 4-star scale, while Snyder did an in-depth analysis of restaurant reviews, predicting ratings for various aspects of a restaurant, such as food and atmosphere (on a 5-star scale).

The "neutral" class is often ignored in most statistical classification methods, because neutral texts tend to lie near the boundary of a binary classifier. However, many researchers point out that three distinct categories should be identified in every polarity problem. Furthermore, for some existing classification methods such as Max Entropy and SVMs, it can be shown that distinguishing the "neutral" class helps improve the overall accuracy of the classification algorithm.

Another way to determine the sentiment of a text is to use a scaling system. Words commonly associated with negative, neutral, or positive sentiment are assigned a number on a scale from -10 to +10 (most negative to most positive). After a piece of unstructured text has been analyzed with natural language processing, the concepts in it can also be analyzed to derive how sentiment words relate to each concept. Each concept can then be given a score based on the relevance of the sentiment words to that concept and on those words' own scores.
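A minimal sketch of such concept scoring follows. The text does not say how relevance is measured, so the sketch assumes relevance is the inverse of the token distance between a sentiment word and the concept, within a small window; the word scores are also invented:

```python
# Hypothetical word scores on the -10..+10 scale.
WORD_SCORES = {"excellent": 9, "good": 6, "slow": -4, "terrible": -9}


def concept_scores(text, concepts, window=2):
    """Score each concept by a relevance-weighted average of nearby word scores.

    Assumption: relevance = 1 / token distance, and only sentiment words
    within `window` tokens of the concept are considered.
    """
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    results = {}
    for concept in concepts:
        weighted, weights = 0.0, 0.0
        for i, token in enumerate(tokens):
            if token != concept:
                continue
            for j, other in enumerate(tokens):
                dist = abs(i - j)
                if other in WORD_SCORES and 0 < dist <= window:
                    relevance = 1.0 / dist
                    weighted += relevance * WORD_SCORES[other]
                    weights += relevance
        # A concept with no sentiment words nearby gets a neutral score of 0.
        results[concept] = weighted / weights if weights else 0.0
    return results


scores = concept_scores(
    "The food was excellent but the service was terrible.",
    ["food", "service"],
)
```

Because the result is a weighted average of word scores, each concept's score stays within the same -10 to +10 range as the words themselves.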


Existing approaches to text sentiment analysis can be broadly grouped into four categories: keyword spotting, lexical affinity, statistical methods, and concept-level techniques. Keyword spotting classifies text using clearly defined affect words that appear in it, such as "happy", "sad", "scared", and "bored". Lexical affinity goes beyond detecting affect words and also assigns each word an "affinity" value for a particular emotion. Statistical methods draw on elements of machine learning, such as latent semantic analysis, support vector machines (SVM), and bag of words. (See Peter Turney's research in this area.)

Some more sophisticated methods aim to detect the sentiment holder (the person who holds the emotional state) and the sentiment target (the entity that causes the holder's emotion). Mining an opinion in context, or identifying the feature that an opinion is about, requires using the grammatical relationships between words. These relationships usually have to be obtained by deep parsing of the text. Unlike purely semantic techniques, concept-level approaches draw on elements of knowledge representation, such as ontologies and semantic networks, so they can also detect subtle emotional expressions in the text. For example, they can analyze concepts that do not explicitly convey relevant information but are implicitly linked to concepts that do.

There is a lot of open-source software that uses machine learning, statistics, and natural language processing techniques to perform sentiment analysis on large text collections, including web pages, online news, discussion groups, online reviews, blogs, and social media.
