Here's the second half of our discussion with Greg Greenstreet, CTO & SVP of Engineering. In this part, Greg provides a technical perspective on how CI approaches social media and text mining analytics. Want to catch up on Part I of our interview? Read The Technology behind Social Media Analytics.
We left off our discussion about semantic filtering at trait extraction. What do you mean by trait and how does it relate to customer conversations?
Essentially, we are defining traits around conversations, which can then be used to perform standard marketing analyses such as factor analysis or discriminant analysis. Semantic analysis has something called a "meaning space", which is basically where each post is placed. Each post is represented as a vector that defines a point in this space, capturing its meaning within the collected set of documents. Post-to-post similarity can then be measured by how far apart any two posts are within this space. Similar to placing a pushpin on a map, CI can measure how close a post is to various concepts of interest and see which items cluster around that pin.
Let me give you an example. Suppose a company is interested in identifying consumer conversations related to broadband speeds for a cable television provider. Consumers may be discussing their intention to change service providers, weighing attributes like customer service, broadband speed, or modem/router quality to form a combined perspective on value. To identify these conversations, an "internet" trait can be defined as a specific location in the semantic space. Then, as each consumer post is processed, it can be assigned a value for that trait. This is how CI is able to identify and aggregate conversations relating to concerns over service when choosing a cable provider.
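To make the pushpin idea concrete, here is a minimal sketch in Python. It is not CI's engine: it assumes a toy "meaning space" built from word counts over a tiny hand-picked vocabulary, and it assumes cosine similarity as the closeness measure. The names `embed`, `cosine_similarity`, `VOCABULARY`, and the sample posts are all made up for illustration.

```python
import math
from collections import Counter

def embed(text, vocabulary):
    """Map text to a term-count vector over a fixed vocabulary.

    This is a stand-in for a real semantic embedding (an assumption).
    """
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical vocabulary and trait definition.
VOCABULARY = ["broadband", "speed", "internet", "router", "vacation", "beach"]
internet_trait = embed("internet broadband speed router", VOCABULARY)

posts = [
    "my broadband speed dropped again and the router keeps resetting",
    "we spent our vacation at the beach",
]

# Each post gets a value for the "internet" trait: how close it sits to the pin.
trait_scores = [cosine_similarity(embed(p, VOCABULARY), internet_trait) for p in posts]
```

In this toy setup the first post scores high on the "internet" trait and the second scores zero, because it shares no vocabulary with the trait definition.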
The great thing about this type of rigorous analysis is that we can go beyond the conversation. By collecting the traits assigned to posts, we are able to map them back to a specific author, producing an author profile. These profiles are used to segment authors by their demographic and psychographic properties and further define the true voice of the customer.
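One simple way to roll post-level traits up into an author profile is to average each trait over an author's posts. The sketch below assumes that aggregation rule; the interview does not say how CI actually combines them. The function name `build_author_profiles`, the field names, and the scores are hypothetical.

```python
from collections import defaultdict

def build_author_profiles(scored_posts):
    """Average each trait's scores over all posts by the same author."""
    totals = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for post in scored_posts:
        for trait, score in post["traits"].items():
            totals[post["author"]][trait] += score
            counts[post["author"]][trait] += 1
    return {
        author: {
            trait: totals[author][trait] / counts[author][trait]
            for trait in traits
        }
        for author, traits in totals.items()
    }

# Made-up trait scores for illustration only.
scored_posts = [
    {"author": "alice", "traits": {"internet": 0.8, "customer_service": 0.2}},
    {"author": "alice", "traits": {"internet": 0.4}},
    {"author": "bob", "traits": {"customer_service": 0.9}},
]

profiles = build_author_profiles(scored_posts)
```

The resulting per-author trait vectors could then feed a segmentation step such as clustering.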
In the following image, we look at how authors discussing airlines cluster around lifestyle traits, and then compare across brands to reveal which brands cluster most closely to each other across those traits.
We've talked about how CI's semantic engine filters social media conversations, but how do we extract actual customer insights and data?
Once our filters are configured to capture the desired conversations, we can begin doing deeper analysis. Our process works a little differently, as we focus our semantic engine on "snippets" of conversations from posts, which are small chunks of conversation anchored by a unique term. Let's say you are doing some analysis using the term "Comcast". A blog post mentioning the term "Comcast" may also talk about the author's recent vacation and other interesting details that are irrelevant to our interest in "Comcast". By using this idea of anchor terms, we are able to extract the snippet of text directly around the term "Comcast", so you are working with only the text that is relevant to the topic of interest.
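As a rough illustration of anchor-term snippets, the sketch below keeps a fixed window of words on either side of each occurrence of the anchor. The fixed word window is an assumption, since the interview does not say how CI chooses snippet boundaries, and `extract_snippets` and its `window` parameter are hypothetical names.

```python
import re

def extract_snippets(text, anchor, window=5):
    """Return up to `window` words on each side of every anchor occurrence."""
    tokens = re.findall(r"[\w']+", text)  # words only; punctuation is dropped
    snippets = []
    for i, token in enumerate(tokens):
        if token.lower() == anchor.lower():
            start = max(0, i - window)
            snippets.append(" ".join(tokens[start:i + window + 1]))
    return snippets

post = (
    "We just got back from a wonderful vacation in Maine. "
    "Meanwhile my Comcast bill went up again this month. "
    "The kids loved the beach."
)

snippets = extract_snippets(post, "Comcast", window=3)
```

With a window of three words, the vacation and beach text falls outside the snippet, leaving only the words around "Comcast" for later analysis.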
This precise cleaving of content provides more accurate sentiment analysis, theme generation, trending, and term analysis.
In the example above, we applied a "Subjective Filter", which narrows our collection of snippets to first-person-only conversations. This is extremely valuable if you want to derive the opinions and insights customers may have about your products or services.
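A very simple approximation of a first-person filter is to keep only snippets that contain a first-person pronoun. This is a sketch under that assumption, not a description of how the "Subjective Filter" is implemented; the pronoun list and the name `is_first_person` are illustrative.

```python
import re

# Illustrative, non-exhaustive list of first-person forms.
FIRST_PERSON = {
    "i", "i'm", "i've", "i'd", "i'll", "me", "my", "mine",
    "we", "we're", "we've", "us", "our", "ours",
}

def is_first_person(snippet):
    """True if any whole word in the snippet is a first-person form."""
    tokens = re.findall(r"[a-z']+", snippet.lower())
    return any(token in FIRST_PERSON for token in tokens)

snippets = [
    "I switched to Comcast last year and my speeds improved",
    "Comcast announced new pricing for business customers",
]

subjective = [s for s in snippets if is_first_person(s)]
```

Matching on whole words matters here: "business" contains "us" but is not treated as first-person.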
A new function that we recently released is something called a Theme Trend.
Now you can see not only the resulting themes, but how the themes manifest temporally over the time period being analyzed. You can now watch how themes emerge, bifurcate, and merge back with one another over time.
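The temporal view can be approximated by counting how many snippets fall under each theme in each time bucket. The sketch below assumes snippets already carry a theme label and groups them by ISO week. Detecting how themes split and merge would need a clustering step that is not shown. The function name `theme_trend`, the dates, and the themes are made up for illustration.

```python
from collections import Counter, defaultdict
from datetime import date

def theme_trend(tagged_snippets):
    """Count snippets per theme for each ISO (year, week), in chronological order."""
    trend = defaultdict(Counter)
    for snippet_date, theme in tagged_snippets:
        iso = snippet_date.isocalendar()
        trend[(iso[0], iso[1])][theme] += 1
    return {week: dict(trend[week]) for week in sorted(trend)}

# Hypothetical (date, theme) pairs.
tagged = [
    (date(2012, 3, 5), "billing"),
    (date(2012, 3, 6), "billing"),
    (date(2012, 3, 7), "outage"),
    (date(2012, 3, 13), "outage"),
]

trend = theme_trend(tagged)
```

Plotting these weekly counts per theme gives a basic picture of which themes rise and fade over the analysis period.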
Social Media Analytics is in some ways the study of language. Why would the word choice and language customers use be of interest to a company?
We're looking at the terms and themes customers use to describe the products and services they use, and their word choice when expressing an opinion or a recommendation. Essentially, we are analyzing the true voice of the customer to derive specific insights into customer considerations and preferences. We have a specific feature called "Term Analytics" which allows users to examine the actual language being used around their anchor term. This is invaluable information for marketing, tracking campaigns, or understanding customer service issues.
The Term Wheel graph (pictured above) shows not only the top terms by part of speech surrounding our chosen anchor term of "Comcast"; a histogram also shows where each term appears in relation to the anchor term.
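The positional histogram can be sketched by recording, for each nearby term, how many words before or after the anchor it appears. This is an assumed reading of the graph, not CI's implementation. Grouping by part of speech is left out because it would need a tagger, and `term_offsets` and its `window` parameter are hypothetical names.

```python
import re
from collections import Counter, defaultdict

def term_offsets(snippets, anchor, window=3):
    """For each term near the anchor, count occurrences at each signed offset.

    Negative offsets are before the anchor, positive offsets are after it.
    """
    offsets = defaultdict(Counter)
    anchor = anchor.lower()
    for snippet in snippets:
        tokens = re.findall(r"[a-z']+", snippet.lower())
        for i, token in enumerate(tokens):
            if token != anchor:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    offsets[tokens[j]][j - i] += 1
    return offsets

# Made-up snippets for illustration.
snippets = [
    "my comcast bill went up",
    "the comcast bill is late",
    "called comcast about my bill",
]

histogram = term_offsets(snippets, "Comcast")
```

Here "bill" appears twice directly after the anchor and once three words after it, which is the kind of distribution a per-term histogram would display.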
The next step, of course, is exploring the audience of consumers who are generating the content. CI allows you to more fully explore other themes of interest to a specific author. Now, imagine how much more informed a company's engagement strategy with that customer can be, by understanding their other interests and preferences.