Now I'm running some text analysis on the tweets. I'll be posting code and writing up results here over the next few days. Questions are welcome!
For starters, here are the words most associated with tweets supporting or opposing the #OWS movement, along with each word's classifier weight.
Support | Weight | Oppose | Weight
love | 0.48 | homeless | -0.93
politics | 0.46 | focus | -0.90
opwallstreet | 0.43 | crowd | -0.75
congress | 0.43 | handouts | -0.74
stand | 0.40 | capitalist | -0.71
bank | 0.39 | irony | -0.59
brutal | 0.36 | called | -0.51
class | 0.32 | act | -0.49
strike | 0.31 | scanner | -0.46
poll | 0.31 | happened | -0.44
evict | 0.31 | received | -0.36
p21 | 0.31 | getting | -0.33
stay | 0.29 | quotwe | -0.32
global | 0.29 | dont | -0.32
help | 0.25 | cont | -0.31
justice | 0.23 | john | -0.30
income | 0.23 | home | -0.29
senatorsanders | 0.23 | paul | -0.28
moveon | 0.22 | hear | -0.24
occupywallst | 0.21 | weoccupyamerica | -0.23
solidarity | 0.20 | protests | -0.22
call | 0.20 | free | -0.19
cop | 0.20 | tents | -0.19
allowed | 0.17 | protesting | -0.14
peaceful | 0.16 | occupylsx | -0.13
Here's the same data rendered as word clouds, so it looks artsy. Word sizes are determined by the weights of the classifier (regression betas, for you mathy people out there); color and coordinates are arbitrary. So these word clouds carry exactly the same information as the table above, just in a more visually appealing format.
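For the curious, here is a minimal sketch of how a weight can be turned into a font size. It is not the code that drew the clouds: the linear scaling, the `min_pt`/`max_pt` limits, and the choice to scale each cloud separately are all assumptions made for illustration. Only the weights themselves come from the table.

```python
# Weights copied from the first three rows of the table above.
support = {"love": 0.48, "politics": 0.46, "opwallstreet": 0.43}
oppose = {"homeless": -0.93, "focus": -0.90, "crowd": -0.75}


def font_sizes(weights, min_pt=10, max_pt=72):
    """Map each word's |weight| linearly onto a font size in points.

    min_pt and max_pt are arbitrary choices for this sketch.
    """
    magnitudes = {word: abs(beta) for word, beta in weights.items()}
    lo, hi = min(magnitudes.values()), max(magnitudes.values())
    span = (hi - lo) or 1.0  # avoid dividing by zero if all weights are equal
    return {
        word: round(min_pt + (mag - lo) / span * (max_pt - min_pt), 1)
        for word, mag in magnitudes.items()
    }


# Each cloud is scaled on its own, so the biggest word in each gets max_pt.
support_sizes = font_sizes(support)
oppose_sizes = font_sizes(oppose)
```

Because the sign is dropped before scaling, "homeless" ends up as the largest word in the oppose cloud even though its weight is the most negative.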
In support:
Opposed:
As I peer at these tea leaves, I see a solidarity-oriented "stand together against brutal capitalist injustice" theme in the support words, and a libertarian "quit your whining and get to work" theme in the oppose words. What do you make of it?
Caveats and details of the method
This analysis is based on 1,000 tweets drawn from Monday, Tuesday and Wednesday of this week, so some of the themes might be specific to the events of those days. Also, there was quite a bit of noise in the sentiment coding. That will probably wash out in a large enough sample, but I don't know if 1,000 is large enough. Finally, about 85% of the tweets supported the protests, so the assessments of opposing words rest on far fewer examples and are probably less robust.
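To put rough numbers on that last caveat, here is a small sketch. The 1,000 and 85% figures come from the paragraph above; the label list is a stand-in (the coded tweets themselves are not reproduced here), and downsampling the majority class is just one common way to even things out, not something this analysis did.

```python
import random

n_tweets = 1000        # sample size stated above
support_share = 0.85   # approximate share stated above

n_support = round(n_tweets * support_share)
n_oppose = n_tweets - n_support

# Stand-in labels only; these are not the real coded tweets.
labels = ["support"] * n_support + ["oppose"] * n_oppose


def balanced_indices(labels, seed=0):
    """Return indices with every class downsampled to the smallest class size."""
    rng = random.Random(seed)
    by_class = {}
    for i, label in enumerate(labels):
        by_class.setdefault(label, []).append(i)
    smallest = min(len(indices) for indices in by_class.values())
    keep = []
    for indices in by_class.values():
        keep.extend(rng.sample(indices, smallest))
    return sorted(keep)


balanced = balanced_indices(labels)
print(n_support, n_oppose, len(balanced))
```

So the opposing weights are estimated from roughly 150 tweets, and a balanced subsample would keep only about 300 of the 1,000.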
I confess, I think your word-based evaluations of texts are brilliant, and it's so interesting to try to tease out the context of these words. For example, the fact that "love" and "brutal", "stay" and "evict" are the contrasts evoked by the OWS crowd, while the anti-OWS group is almost entirely negative - with the exception of the word "focus" (which OWS lacks). Speaking of which - is that lack of focus borne out by text analysis? If you were to do a spread of words, would you see the anti-OWS crowd more focused on certain words while the OWS crowd is more dispersed? Regardless of which way it went, I think it would be an interesting reflection on both crowds.
@Margaret - Glad you like it. Your idea on lack of focus is a really good one. I've been trying to think of a way to compare the two sides. It's tough to do well, because we'd have to control for the fact that so many more of the tweets were supporting than opposed.
That said, maybe it's enough to know that the beta values were much larger (on the order of double) for the "opposed" tweets. In other words, it looks like the message of the anti-OWS minority is captured in a smaller set of words than that of the pro-OWS majority. I'd love to come up with a good way to visualize this -- the "on message-ness" of a movement.
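One rough way to put a number on "on message-ness", offered as a sketch rather than a settled method: compare the size of the weights on each side, and ask how much of the total weight sits in the top few words. The weights are copied from the table in the post; the function names and the choice of `k=5` are mine. Since only the 25 listed words per side are used, the shares are relative to those lists, not to the full vocabulary.

```python
# Weights copied from the table in the post, in table order.
support = [0.48, 0.46, 0.43, 0.43, 0.40, 0.39, 0.36, 0.32, 0.31, 0.31,
           0.31, 0.31, 0.29, 0.29, 0.25, 0.23, 0.23, 0.23, 0.22, 0.21,
           0.20, 0.20, 0.20, 0.17, 0.16]
oppose = [-0.93, -0.90, -0.75, -0.74, -0.71, -0.59, -0.51, -0.49, -0.46,
          -0.44, -0.36, -0.33, -0.32, -0.32, -0.31, -0.30, -0.29, -0.28,
          -0.24, -0.23, -0.22, -0.19, -0.19, -0.14, -0.13]


def mean_abs(weights):
    """Average magnitude of the weights, ignoring sign."""
    return sum(abs(w) for w in weights) / len(weights)


def top_k_share(weights, k=5):
    """Fraction of the total |weight| carried by the k largest-magnitude words."""
    magnitudes = sorted((abs(w) for w in weights), reverse=True)
    return sum(magnitudes[:k]) / sum(magnitudes)


ratio_of_means = mean_abs(oppose) / mean_abs(support)
ratio_of_tops = max(abs(w) for w in oppose) / max(abs(w) for w in support)
support_share = top_k_share(support)
oppose_share = top_k_share(oppose)
```

On the listed words, the largest opposing weight is a bit under twice the largest supporting one, the average opposing weight is about 1.4 times the average supporting weight, and the top five opposing words carry a larger share of their side's total than the top five supporting words do. A fairer comparison would still need to account for the 85/15 imbalance.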