How Accurate Are Twitter Polls?
Twitter’s website promotes a Political Index score developed by Topsy Labs. Hours before the debate, the index showed just one in four tweets about Obama were likely to be positive while one in three tweets about Romney were likely to lean positive. Topsy doesn’t say which tweets are used to calculate these scores, though the company says its calculation method has been vetted by political polling firms.
Because tweets can contain sarcasm, multiple thoughts at once and several other English idiosyncrasies that computers can’t pick up on yet, it’s difficult for a program to analyze their sentiment, that is, whether they say something positive or negative about a given subject.
“Companies aren’t very transparent about what tweets they are using,” said Alec Go, an engineer at Google who created Sentiment140 with classmates at Stanford. “Part of the reason for that is there are a lot of gray areas for sentiment analysis. Even for human readers, a lot of tweets are ambiguous or borderline.”
The people who design sentiment analysis algorithms also must factor in who’s tweeting. Should tweets from campaigns and the official accounts of candidates be dismissed or weighted differently than tweets from people not connected to campaigns? How should the measurement deal with tweets that are just news headlines with links?
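One way to picture the weighting question is the minimal Python sketch below. It is not how Topsy or any other company scores tweets. The source labels, the weights and the function name are hypothetical, chosen only for illustration, and the sketch assumes some upstream classifier has already marked each tweet as positive or not.

```python
from dataclasses import dataclass

# Hypothetical weights for illustration only; no real index is known to use them.
# Campaign tweets are dismissed, headline-only tweets count half, others count fully.
SOURCE_WEIGHTS = {"campaign": 0.0, "news_headline": 0.5, "public": 1.0}


@dataclass
class ScoredTweet:
    source: str     # assumed label: "campaign", "news_headline" or "public"
    positive: bool  # assumed output of an upstream sentiment classifier


def weighted_positive_share(tweets, weights=SOURCE_WEIGHTS):
    """Return the weighted fraction of tweets classified as positive.

    Tweets with an unrecognized source label get a weight of 1.0.
    Returns None when no tweet carries any weight.
    """
    total = sum(weights.get(t.source, 1.0) for t in tweets)
    if total == 0:
        return None
    positive = sum(weights.get(t.source, 1.0) for t in tweets if t.positive)
    return positive / total


sample = [
    ScoredTweet("campaign", True),        # ignored: weight 0.0
    ScoredTweet("public", True),          # counts fully
    ScoredTweet("public", False),         # counts fully
    ScoredTweet("news_headline", False),  # counts half
]
share = weighted_positive_share(sample)
```

Changing the weights changes the resulting score, which is one reason two indexes built from the same tweets could disagree.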
Francois Bar, an associate professor of communication at the University of Southern California, has run into these questions during the past year as he and researchers at the Annenberg Innovation Lab try to develop an open-source algorithm.
“Increasingly, we’re realizing this is going to be a long-term project,” Bar said. “We’ve had a series of discoveries that is much more complex than we thought.”
Twitter will be used during debates by media networks as a quick polling tool, or at least a poll of a small subset of America. Go said that while there’s value in this form of sentiment analysis, it remains difficult to use the data to make the case for specific actions. Companies such as SocialFlow are measuring sentiment and conversations to tell marketers exactly when to promote a product. But the field is still young.
“They tell you trust us, and we’ll analyze and we’ll give you some summary about how people think about your product,” Bar said.
Go and his team have kept track of most of the companies that offer sentiment analysis solutions. Bar said most of these have been pretty coy about which tweets are used in measurement and how exactly they are scored, so it’s hard to tell if his own algorithm is doing any better than theirs.
One of those data analyzers, Bluefin Labs, worked through tweets during the candidates' recent appearance on "60 Minutes" and found that older people on Twitter had negative views about Romney's healthcare policy proposals.
During the debates, the Annenberg Innovation Lab will try to manually record the sentiment of some tweets. Researchers also will look into which accounts tweet the most and where they are tweeting from.
“We want to get the big picture of what’s happening and then make some small adjustments (to the algorithm) for the next debate,” Bar said.
For example, Bar said that during the nominating conventions at the end of August and early September, a huge number of tweets came from user accounts that had just been created. It was a spike similar to what happened after Oprah mentioned Twitter on her talk show.
And so, he wonders, should the length of time the person has used Twitter be factored into the calculations?
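If account age were factored in, one simple possibility is to discount tweets from very new accounts. The Python sketch below is only an illustration of that idea, not the Annenberg Innovation Lab’s method. The function name, the linear ramp and the 30-day default are all assumptions.

```python
from datetime import datetime


def account_age_weight(created_at, tweeted_at, ramp_days=30):
    """Return a weight between 0.0 and 1.0 based on account age at tweet time.

    Hypothetical rule: the weight grows linearly from 0 for a brand-new
    account to 1 once the account is at least `ramp_days` old.
    """
    age_days = (tweeted_at - created_at).total_seconds() / 86400
    if age_days <= 0:
        return 0.0
    return min(1.0, age_days / ramp_days)


# An account created 15 days before the tweet gets half weight.
new_account = account_age_weight(datetime(2012, 8, 1), datetime(2012, 8, 16))

# An account that is years old gets full weight.
old_account = account_age_weight(datetime(2010, 1, 1), datetime(2012, 8, 16))
```

A rule like this would dampen the effect of a sudden wave of new accounts, but it would also discount genuine first-time users, which is part of what makes the question hard to settle.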