The internet consists of billions of opinions. Focusing on unstructured internet chatter, we investigate whether such massive amounts of cheap talk carry any quantitative informational content. We develop a technique that integrates information from the internet by quantifying relative amounts of chatter. We find that the relative frequencies of internet chatter about major social phenomena in a geographic area are highly correlated with the corresponding frequencies in empirical demographic and economic data. We exemplify the power of this technique by computing measures of corruption for countries, states and cities. These measures are not only highly correlated with expert ratings, but also allow us to replicate the results of published papers establishing correlates of corruption. We discuss extensions and limitations of this approach.