PositiveSentiment is an annotation type that covers all words that suggest positivity, such as good, better, and advances, while the opposite annotation (NegativeSentiment) covers all keywords that suggest negativity.
The bolder the line between two words, the stronger the association. To get an idea of how people feel, look at the line that connects NegativeSentiment and the word still: it implies that the strongest sentiment is that the US economy is still in serious trouble.
- The US President says that the economy is getting better, but people don't feel the same.
- The economy cannot be getting better while there are layoffs at the same time.
- People express very negative feelings after losing their jobs.
Notice also the association between NegativeSentiment and people, job, money, and sales. Interesting insights can also be found if brand names and product categories are taken into account: in this analysis, a specific brand was found to be associated with the word sales and a good overall sentiment. Consumers' buying intentions, that is, what they plan to buy when the time is right, can also be identified.
You will also find an association between finance_institution keywords (here, the keyword Fed) and PositiveSentiment. This association exists because a number of Re-Tweets are about the Fed signaling the start of an exit from recession and its impact on housing. Also interesting is the association between the word fool and the annotation PositiveSentiment (...)
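One simple way to approximate the strength of such associations is to count how often an annotation and a keyword appear in the same Tweet. The sketch below is an illustration in Python, not the method SPSS Clementine uses internally; the record layout, the function name `association_counts`, and the sample data are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical annotated Tweets: each record holds the sentiment annotation
# and the keywords found in the Tweet (illustrative data, not the original dataset).
records = [
    {"sentiment": "NegativeSentiment", "words": ["still", "job", "economy"]},
    {"sentiment": "NegativeSentiment", "words": ["still", "money"]},
    {"sentiment": "PositiveSentiment", "words": ["fed", "housing"]},
]

def association_counts(records):
    """Count how often each (annotation, word) pair occurs in the same Tweet."""
    counts = Counter()
    for rec in records:
        for word in set(rec["words"]):  # count each word at most once per Tweet
            counts[(rec["sentiment"], word)] += 1
    return counts

counts = association_counts(records)
# The pair with the highest count corresponds to the boldest line in the graph.
strongest = counts.most_common(1)[0]
print(strongest)
```

A higher count for a pair plays the role of a bolder line between the annotation and the word.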
Specific Tweets were removed, such as spam Tweets (that try to sell investing products). Re-Tweets were kept intact, since we assume that if someone Re-Tweets, say, a positive-sentiment Tweet, then he/she shares that positive sentiment. Tweets that were jokes were identified, marked accordingly, and removed.
As with many examples in the past, the software used consisted of GATE (for annotating unstructured text from Tweets) and SPSS Clementine (now PASW Modeler). Here is the setup from GATE:
Specific JAPE rules were used to identify and annotate negative and positive sentiment. Consider the following sentences:
- The economy is most likely bad at the moment
- If the economy is great then why so many people can't find a job?
The first sentence clearly has a negative sentiment, since it contains the word bad. The second sentence, however, contains the word great, so a specific matching rule should take the word If into consideration and annotate this sentence as having negative sentiment despite the presence of the word great.
After running GATE, here is what the now-structured data look like for a smaller sample of the original dataset (notice the highlighted record and the IfGood flag):
With the data in a structured form like the one depicted above, we are ready to identify which Tweets have a positive or negative sentiment, spot erroneous annotations, take corrective actions, and finally analyze the information and extract knowledge from it.
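The logic of such a rule can be sketched outside GATE. The Python below is not JAPE and is not the rule set used in this analysis; it is a minimal approximation that assumes small illustrative word lists and a hypothetical `classify` function, and it treats "sentence starts with If and contains a positive word" as the IfGood condition.

```python
import re

# Illustrative word lists only; the real gazetteers are much larger.
POSITIVE = {"good", "great", "better"}
NEGATIVE = {"bad", "worse"}

def classify(sentence):
    """Return (label, if_good) for a sentence using simple keyword rules."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    has_pos = any(t in POSITIVE for t in tokens)
    has_neg = any(t in NEGATIVE for t in tokens)
    # IfGood: a conditional opening ("If ...") combined with a positive word
    if_good = bool(tokens) and tokens[0] == "if" and has_pos
    if has_neg or if_good:
        label = "NegativeSentiment"
    elif has_pos:
        label = "PositiveSentiment"
    else:
        label = None  # no sentiment keyword matched
    return label, if_good

print(classify("The economy is most likely bad at the moment"))
print(classify("If the economy is great then why so many people can't find a job?"))
```

Both example sentences come out as NegativeSentiment; only the second one sets the IfGood flag, which makes it easy to find and review records where the conditional rule fired.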
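As a rough illustration of this step, the sketch below tallies Tweets per sentiment and pulls out the records where the IfGood flag is set so they can be checked by hand. The field names `text`, `sentiment`, and `IfGood` mirror the description above but are assumptions about the export layout, and the third record is made up for the example.

```python
from collections import Counter

# Hypothetical structured records, as they might look after annotation.
rows = [
    {"text": "The economy is most likely bad at the moment",
     "sentiment": "NegativeSentiment", "IfGood": False},
    {"text": "If the economy is great then why so many people can't find a job?",
     "sentiment": "NegativeSentiment", "IfGood": True},
    {"text": "Sales look better this month",  # invented example record
     "sentiment": "PositiveSentiment", "IfGood": False},
]

def summarize(rows):
    """Count Tweets per sentiment annotation."""
    return Counter(r["sentiment"] for r in rows)

def needs_review(rows):
    """Return the text of records where the conditional rule fired."""
    return [r["text"] for r in rows if r["IfGood"]]

print(summarize(rows))
print(needs_review(rows))
```

Reviewing the flagged records first is one way to catch erroneous annotations before any further analysis.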