
26-Chinese Weibo Sentiment Analysis Evaluation Outline (Revised Edition) .pdf

author:Informed Dali Little Fish w

Chinese Weibo Sentiment Analysis Evaluation Outline (Revised Version)

1. Evaluation object

This evaluation targets the core technologies of sentiment analysis for Chinese microblogs: opinion sentence recognition, emotional tendency analysis, and emotional element extraction.

2. Task settings

This evaluation has three subtasks. Task 1 is mandatory; Tasks 2 and 3 build on Task 1, and participating teams may choose whether to attempt them.

2.1 Opinion sentence recognition

For each sentence in each microblog, this task requires you to determine whether the sentence is an opinion sentence or a non-opinion sentence.

Submission Format:

id	run-tag	weibo-id	sentence-id	opinionated

Field descriptions:

id: the sequence number of the result

run-tag: the team result identifier

weibo-id: the microblog ID

sentence-id: the sentence ID

opinionated: the opinion sentence flag, Y for an opinion sentence and N for a non-opinion sentence

Note: run-tag has the format "TeamID_RunNumber". The team ID may be chosen freely, and the run number distinguishes multiple result sets submitted by the same team. Fields are separated by \t. The same applies to the formats below.

For example, the following two Weibo posts:

Weibo1:

The incident of Weinan chengguan (urban management officers) tearing down Spring Festival couplets was widely reported by Focus Media on buses in Chengdu!

Weinan chengguan are really perverted!

Weibo2:

#iPad3# Why are so many people still using something this troublesome, with all the jailbreaking and cracking?

By the way, how do you jailbreak it?

There are two sentences in weibo1: the first is a non-opinion sentence and the second is an opinion sentence. There are two sentences in weibo2: the first is an opinion sentence and the second is a non-opinion sentence. The correct output would be:

1	xyz	1	1	N

2	xyz	1	2	Y

3	xyz	2	1	Y

4	xyz	2	2	N
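The output above could be produced with a small formatter. This is a minimal sketch only: the function name `format_opinion_rows` and the input layout (a list of tuples) are assumptions made for illustration, not part of the evaluation outline.

```python
def format_opinion_rows(run_tag, labels):
    """Build Task 1 submission lines (hypothetical helper).

    labels: list of (weibo_id, sentence_id, is_opinion) tuples in output order.
    """
    rows = []
    for seq, (weibo_id, sentence_id, is_opinion) in enumerate(labels, start=1):
        flag = "Y" if is_opinion else "N"
        # Fields are joined with a tab, as the outline requires.
        rows.append("\t".join([str(seq), run_tag, str(weibo_id), str(sentence_id), flag]))
    return rows


# The four sentences of weibo1 and weibo2 from the example above.
example = [(1, 1, False), (1, 2, True), (2, 1, True), (2, 2, False)]
for line in format_opinion_rows("xyz", example):
    print(line)
```

Writing the rows to a file with a newline after each one would give a submission in the stated format.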

Note: In this evaluation, opinion sentences do not include sentences that only express the author's own emotions, wishes, or moods. A sentence such as "I feel very happy" is an emotional sentence, but it is not an opinion sentence under this definition. Opinion sentences are limited to evaluations of specific things or objects (e.g., "I genuinely like the screen look of my iPhone."), excluding the author's inner emotions, wishes, or moods.

Evaluation Criteria:

This task uses precision, recall, and F-measure to evaluate each participating team's opinion sentence recognition results. They are calculated as follows:

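$$\text{Precision} = \frac{\#\text{system\_correct}(\text{opinion}=\text{Y})}{\#\text{system\_proposed}(\text{opinion}=\text{Y})}$$

$$\text{Recall} = \frac{\#\text{system\_correct}(\text{opinion}=\text{Y})}{\#\text{gold}(\text{opinion}=\text{Y})}$$

$$\text{F-measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$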



#gold is the number of manually annotated results, #system_correct is the number of submitted results that match the manual annotation, and #system_proposed is the number of submitted results.
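The three measures can be computed directly from the gold and submitted labels. The sketch below is illustrative only: the name `opinion_prf`, the dictionary input layout, and the choice to return 0.0 when a denominator is zero are assumptions, since the outline does not specify them.

```python
def opinion_prf(gold, system):
    """Precision, recall and F-measure for opinion sentence recognition.

    gold, system: dicts mapping (weibo_id, sentence_id) -> "Y" or "N".
    """
    gold_y = {key for key, flag in gold.items() if flag == "Y"}
    proposed_y = {key for key, flag in system.items() if flag == "Y"}
    correct = len(gold_y & proposed_y)  # submitted Y that the annotation also marks Y

    # Assumption: an empty denominator yields 0.0 rather than an error.
    precision = correct / len(proposed_y) if proposed_y else 0.0
    recall = correct / len(gold_y) if gold_y else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


gold = {(1, 1): "N", (1, 2): "Y", (2, 1): "Y", (2, 2): "N"}
system = {(1, 1): "Y", (1, 2): "Y", (2, 1): "N", (2, 2): "N"}
print(opinion_prf(gold, system))  # (0.5, 0.5, 0.5)
```

Here the hypothetical system proposes two opinion sentences, one of which is correct, and the annotation contains two opinion sentences.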

2.2 Emotional tendency judgment

This task requires determining the emotional tendency of each opinion sentence in a microblog. The evaluation dataset contains every sentence of each microblog, so participating teams must first identify the opinion sentences and then analyze their tendency. The emotional tendency of an opinion sentence is classified as positive (POS), negative (NEG), or other (OTHER).

Submission Format:

id	run-tag	weibo-id	sentence-id	polarity

Field descriptions:

id: the sequence number of the result

run-tag: the team result identifier

weibo-id: the microblog ID

sentence-id: the opinion sentence ID

polarity: the emotional tendency label: POS for positive, NEG for negative, and OTHER for neutral or anything that cannot be clearly classified as positive or negative

For example, in the two posts weibo1 and weibo2 above, the second sentence of weibo1 is an opinion sentence with a negative emotional tendency, and the first sentence of weibo2 is an opinion sentence with a negative emotional tendency. The result should be as follows:

1	xyz	1	2	NEG

2	xyz	2	1	NEG

Evaluation Criteria:

This task also uses precision, recall, and F-measure as the evaluation criteria.

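$$\text{Precision} = \frac{\#\text{system\_correct}(\text{polarity}=\text{POS},\text{NEG},\text{OTHER})}{\#\text{system\_proposed}(\text{opinion}=\text{Y})}$$

$$\text{Recall} = \frac{\#\text{system\_correct}(\text{polarity}=\text{POS},\text{NEG},\text{OTHER})}{\#\text{gold}(\text{opinion}=\text{Y})}$$

$$\text{F-measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$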



#gold(opinion=Y) is the number of opinion sentences in the manual annotation, #system_correct(polarity=POS,NEG,OTHER) is the number of submitted results whose polarity matches the manual annotation, and #system_proposed(opinion=Y) is the number of all opinion sentences submitted.
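As a worked illustration with invented counts (they are hypothetical and not taken from the evaluation data): suppose a team submits 4 opinion sentences, 3 of them carry the same polarity as the manual annotation, and the manual annotation contains 5 opinion sentences. Then:

$$\text{Precision} = \frac{3}{4} = 0.75, \qquad \text{Recall} = \frac{3}{5} = 0.6$$

$$\text{F-measure} = \frac{2 \times 0.75 \times 0.6}{0.75 + 0.6} = \frac{0.9}{1.35} \approx 0.667$$

A sentence wrongly submitted as an opinion sentence can never be counted as correct, so it lowers precision through the denominator.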

2.3 Extraction of emotional elements

This task requires finding the object that the author evaluates in each opinion sentence of a microblog, that is, the emotional object, and also determining the polarity of the opinion toward that emotional object. The evaluation dataset contains every sentence of each microblog; participating teams must first identify the opinion sentences and then extract the emotional elements from them.

Annotation:

1. Extract emotional elements only from the opinion sentences in a microblog.

2. The emotional object should be extracted from the current sentence first; if it does not appear there, it should be extracted from the rest of the microblog. In that case, search the preceding text first, starting from the sentence before the current one (including any hashtag it contains) and moving toward the beginning of the post; if nothing is found, search from the sentence after the current one (including that sentence) toward the end of the post. For opinion sentences whose emotional object does not appear anywhere in the microblog (in some cases the emotional object is implicit), participating teams do not need to extract anything.

3. A sentence may contain multiple emotional objects; the emotional object corresponding to each emotional fragment should be extracted. For example, for "You're not a human being anymore, you're more cold-blooded than a snake, you're more brutish than a beast.", all three occurrences of "you" should be extracted.

4. Emotional objects should be extracted as completely and unambiguously as possible. For example, for "The iPad's screen is great!", the emotional object to extract is "iPad's screen", not just "screen".

5. When a personal pronoun (you, I, he, it, we, they, etc.) appears on its own as an emotional object, coreference resolution should be attempted within the microblog (excluding forwarded and comment content); if the pronoun cannot be resolved, it may be taken as the object itself. For example, in "Xiao Ming studies at Peking University, and he is an outstanding student.", the emotional object is "Xiao Ming", not "he".
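The search order in rule 2 can be made concrete as a list of sentence positions. This is a sketch of one reading of that rule (current sentence, then earlier sentences from nearest to farthest, then later sentences from nearest to farthest); the function name `search_order` and the 0-based indexing are assumptions for illustration.

```python
def search_order(current, n_sentences):
    """Return 0-based sentence indices in the order they would be searched.

    current: index of the opinion sentence being processed.
    n_sentences: number of sentences in the microblog.
    """
    before = list(range(current - 1, -1, -1))      # previous sentence first, back to the start
    after = list(range(current + 1, n_sentences))  # then the following sentences, in order
    return [current] + before + after


print(search_order(2, 5))  # [2, 1, 0, 3, 4]
```

An extractor would stop at the first sentence in this order that contains a suitable emotional object, and would output nothing if none does.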

Submission Format:

id	run-tag	weibo-id	sentence-id	target	begin-offset	end-offset	polarity

Field descriptions:

id: the sequence number of the result

run-tag: the team result identifier

weibo-id: the microblog ID

sentence-id: the sentence ID

target: the emotional object

begin-offset: the start position of the emotional object in the entire post

end-offset: the end position of the emotional object in the entire post

polarity: the polarity of the opinion toward the emotional object: POS for positive, NEG for negative, and OTHER for neutral or other cases that cannot be clearly classified as positive or negative

Note: For emotional objects extracted from the current sentence, the start and end positions must also be calculated relative to the entire microblog post.

For example, the results of the emotional element extraction of weibo1 and weibo2 are as follows:

1	xyz	1	2	渭南城管	26	29	NEG

2	xyz	2	1	iPad3	1	5	NEG

The file is encoded in Unicode (UTF-16), with each character occupying two bytes. In any microblog, the offset of the first character is 0, the offset of the second character is 1, and so on. For example, the four characters "渭南城管" (Weinan chengguan) at the beginning of the second sentence of weibo1 have offsets 26, 27, 28, and 29 in the entire microblog (counted on the original Chinese text). Emotional objects are judged only by begin-offset and end-offset; target does not take part in the evaluation.
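The offset convention can be checked with a short script. This is a sketch under stated assumptions: the helper names `utf16_len` and `target_offsets` are hypothetical, only the first occurrence of the target is located, and the post string is the Chinese opening of weibo2 as it appears in the source. Offsets are counted in UTF-16 code units, which matches one offset per character for text like these examples.

```python
def utf16_len(text):
    """Number of UTF-16 code units in text (two bytes per unit)."""
    return len(text.encode("utf-16-le")) // 2


def target_offsets(post, target):
    """Return inclusive (begin_offset, end_offset) of the first occurrence
    of target in post, or None if target does not occur."""
    index = post.find(target)
    if index == -1:
        return None
    begin = utf16_len(post[:index])      # offset of the first character is 0
    end = begin + utf16_len(target) - 1  # end offset is inclusive
    return begin, end


post = "#iPad3#这么麻烦的东西怎么还有那么多人在用"
print(target_offsets(post, "iPad3"))  # (1, 5)
```

The result agrees with the begin-offset 1 and end-offset 5 given for iPad3 in the example output.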

Evaluation Criteria:

This task uses both strict and lenient (loose) evaluation; both use precision, recall, and F-measure as the evaluation criteria.

In the strict evaluation, a submitted emotional object counts as correct only if its offsets are exactly the same as those in the answer and its polarity is also the same.

$$\text{Precision} = \frac{\#\text{system\_correct}}{\#\text{system\_proposed}}$$

$$\text{Recall} = \frac{\#\text{system\_correct}}{\#\text{gold}}$$

$$\text{F-measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

#gold is the number of emotional objects in the manual annotation, #system_correct is the number of submitted results that match the manual annotation, and #system_proposed is the number of emotional objects submitted.
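A strict scorer only needs exact tuple matching. The sketch below is illustrative: the name `strict_prf`, the tuple layout, the use of sets (which collapses duplicate submissions), and returning 0.0 for an empty denominator are all assumptions not stated in the outline. The target string is left out because it does not take part in the evaluation.

```python
def strict_prf(gold, system):
    """Strict precision, recall and F-measure for emotional objects.

    gold, system: iterables of
    (weibo_id, sentence_id, begin_offset, end_offset, polarity) tuples.
    """
    gold_set, system_set = set(gold), set(system)
    correct = len(gold_set & system_set)  # offsets and polarity must all agree

    precision = correct / len(system_set) if system_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


gold = [(1, 2, 26, 29, "NEG"), (2, 1, 1, 5, "NEG")]
system = [(1, 2, 26, 29, "NEG"), (2, 1, 0, 6, "NEG")]  # second span is off by one on each side
print(strict_prf(gold, system))  # (0.5, 0.5, 0.5)
```

The second submitted object overlaps the answer but is not identical to it, so it earns nothing under strict matching.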

In the lenient (loose) evaluation, each result has four elements that take part in the evaluation: the microblog ID, the sentence ID, the emotional object interval (given by its start and end positions), and the polarity, i.e., r = (wid, sid, s, p). We first define the coverage c between two results r and r':

$$c(r, r') = \begin{cases} \dfrac{|s \cap s'|}{|s'|} & \text{if } p = p' \text{ and } wid = wid' \text{ and } sid = sid' \\ 0 & \text{otherwise} \end{cases}$$

where s and s' are the intervals of the emotional objects in the two results r and r', p and p' are the corresponding polarities, wid and wid' are the microblog IDs, sid and sid' are the sentence IDs, and |·| denotes the length of an interval.
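The pairwise coverage can be sketched as follows. This is a hedged illustration, not the official scorer: it assumes inclusive offsets and assumes that coverage is the length of the overlap divided by the length of the second result's interval. The names `Result` and `coverage` are hypothetical.

```python
from typing import NamedTuple


class Result(NamedTuple):
    wid: int       # microblog ID
    sid: int       # sentence ID
    begin: int     # begin-offset, inclusive
    end: int       # end-offset, inclusive
    polarity: str  # POS, NEG or OTHER


def coverage(r, r_prime):
    """Coverage of r over r_prime; 0.0 unless polarity and both IDs agree."""
    if (r.polarity, r.wid, r.sid) != (r_prime.polarity, r_prime.wid, r_prime.sid):
        return 0.0
    overlap = min(r.end, r_prime.end) - max(r.begin, r_prime.begin) + 1
    if overlap <= 0:
        return 0.0  # the two intervals do not intersect
    # Assumption: normalize by the length of the second interval.
    return overlap / (r_prime.end - r_prime.begin + 1)


answer = Result(1, 2, 26, 29, "NEG")
submitted = Result(1, 2, 27, 29, "NEG")
print(coverage(answer, submitted))  # 1.0
print(coverage(submitted, answer))  # 0.75
```

The measure is asymmetric under this reading: the answer fully covers the shorter submitted span, while the submitted span covers three of the answer's four characters.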

The coverage C between the two result sets R and R' is defined as:

$$C(R, R') = \sum_{r_i \in R} \sum_{r'_j \in R'} c(r_i, r'_j)$$
