Brand marketers and agencies engaged in social media are applying social media “intelligence” using offerings from a growing number of application and service providers in an increasingly differentiated field. Sentiment analysis is just one set of business intelligence vectors we can discover and analyze in social media data.
Previously in this series, we’ve explored how sentiment may be derived from text, including issues related to text-as-data, the integrity of that data and the processes that are applied. Now, let’s look at some of the sentiment analysis factors brands and agencies should consider when choosing social intelligence platforms.
1. It all starts with the data. Bad data cannot yield good analysis. As explored in the last part, text analysis insiders express skepticism about the completeness and integrity of the text documents provided by social media platforms. Brands need to be equally vigilant when selecting social intelligence solutions. While there is no “official” repository of tweets, for example, several vendors tout the value of their stored history. Setting aside the question of the usefulness of long-tail trends in social media as effective business intelligence, stored data should be scrutinized with the same rigor as streamed data.
It may not be a question of whether a particular text stream or set is complete, intact and backed by sufficient history in the abstract, but whether it meets the criteria for the analysis to be conducted. Katie Delahaye Paine of research and consulting company KDPaine & Partners expressed this well at the Sentiment Analysis Symposium. “People are not focusing on what’s relevant to customers,” she said. “They think they need to see everything.”
2. There is no single, perfect secret sauce. I’ve always wondered why, if they were so good at predicting the future, fortune tellers were not richer than Warren Buffett. Just a bit facetiously, I wonder why a sentiment analysis company with the ultimate algorithm couldn’t learn the sentiment of brand marketers about their products and those of their competitors so well that it nails 100% of the market.
Seriously, I’m not asking vendors to disclose every aspect of their analysis processes. As with all things social, though, there is an appeal to those that are more transparent about what they do and how they do it. For example, I was invited to the Lexalytics Users Group meeting the day after the Sentiment Analysis Symposium, where the company introduced Salience 5.0, an upgrade to its flagship product. Lexalytics announced it had created what it calls the “Concept Matrix” by digesting the entirety of Wikipedia. It’s a huge corpus of information that’s human-edited — the company sees Wikipedia as a source of not just information, but of how people organize that information. Yes, it’s good marketing for Lexalytics, but it’s also more revealing about what’s “inside” the product than what many other companies offer. Time will reveal how well it works.
3. The science is evolving rapidly. The body of social media text is not just being mined by brand marketers; value exists for counter-terrorism, finance and a host of other sectors. The demand is accelerating research and implementations. We may be in a buyer’s market in which vendors need to work hard to maintain customer loyalty.
4. Sentiment analysis is difficult and imperfect. Sentiment Analysis Symposium founder Seth Grimes cites expert systems pioneer Edward A. Feigenbaum as observing, “Reading from text in general is a hard problem, because it involves all of common sense knowledge.” Don’t pretend that whatever system you choose will be free of limitations; instead, try to understand those limitations before implementation.
5. Man versus, or with, machine. Corollary to the above, neither fully automated nor fully human sentiment analysis is 100% accurate — of the two, human analysis appears, today, to have an edge. But the enormous volume of data and the speed at which businesses demand information point to machine processing. Between those on one side who feel the accuracy of automated sentiment analysis is sufficient and those on the other who feel we can rely only on human analysis, most in the field concur with Marguerite Leenhardt of Paris 3 University that we “need to define a methodology where the software and the analyst collaborate to get over the noise and deliver accurate analysis.”
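To make the software-plus-analyst idea concrete, here is a minimal sketch of one way such a collaboration could be wired up. Everything in it is illustrative: the word lists, the `score` and `triage` functions, and the confidence threshold are all hypothetical stand-ins, not any vendor’s actual method. The machine labels the documents it is confident about and routes ambiguous ones to a human queue.

```python
# Toy lexicon-based sentiment scorer with a human-review queue.
# Word lists and threshold are arbitrary examples, not a real product's lexicon.

POSITIVE = {"love", "great", "excellent", "happy"}
NEGATIVE = {"hate", "awful", "terrible", "angry"}

def score(text):
    """Return (label, confidence) from simple word counts."""
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    if total == 0:
        return "neutral", 0.0
    label = "positive" if pos >= neg else "negative"
    confidence = abs(pos - neg) / total  # 1.0 = one-sided, 0.0 = mixed
    return label, confidence

def triage(documents, threshold=0.5):
    """Split documents into machine-labeled results and a human-review queue."""
    auto, human = [], []
    for doc in documents:
        label, conf = score(doc)
        if conf >= threshold:
            auto.append((doc, label))
        else:
            human.append(doc)  # the analyst resolves ambiguous cases
    return auto, human

docs = [
    "I love this great product",
    "terrible service, I hate it",
    "the new release is great but the update process is awful",
]
auto, human = triage(docs)
```

In this sketch the first two documents are labeled automatically, while the mixed-sentiment third one lands in the human queue, which is exactly the division of labor Leenhardt describes.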
6. Data may travel a convoluted path from the source to your desktop. There are a small number of companies that are core data providers for sentiment analysis products. (And these data providers may be turning to other companies for source text documents.) The majority of commercial solutions are based on the data of these companies, tailored for the industry sectors they serve.
This in no way implies that brands need to get closer to the source — it’s too expensive and time consuming for most brands and there is an advantage to using a product that is informed and improved by its multiple users. It may be worthwhile, though, for a brand to know all the “ingredients” in the solutions in which it is investing.
7. Correlation isn’t causation. One should not assume “A” caused “B” simply because “B” is observed to occur after “A.” I’ve seen some marketing material from vendors in this space that seems to forget this basic concept. This is not a condemnation of the products, just of their marketing.
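A tiny worked example shows how easy it is to be fooled. The numbers here are fabricated for illustration only: suppose social mentions and sales are both driven by a hidden seasonal factor. The two series then correlate perfectly even though neither causes the other.

```python
# Hypothetical data: mentions and sales both track a hidden seasonal driver,
# so they correlate strongly without either one causing the other.

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

season = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2]       # hidden common driver
mentions = [10 * s + 3 for s in season]        # social mentions follow the season
sales = [100 * s - 7 for s in season]          # so do sales

r = pearson(mentions, sales)  # r = 1.0 here, yet neither series causes the other
```

A vendor chart overlaying these two curves would look like compelling proof that mentions drive sales; the confounding season is what actually moves both.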
In the final part of this series, we’ll explore whether social media may serve as a leading indicator for changes to stock price and news media sentiment.