TY - JOUR
T1 - Proxying for unobservable variables with internet document-frequency
AU - Saiz, Albert
AU - Simonsohn, Uri
PY - 2013/2
Y1 - 2013/2
AB - The internet contains billions of documents. We show that document frequencies in large decentralized textual databases can capture the cross-sectional variation in the occurrence frequencies of social phenomena. We characterize the econometric conditions under which such proxying is likely. We also propose using recently introduced internet search volume indexes as proxies for fundamental locational traits, and discuss their advantages and limitations. We then successfully proxy for a number of economic and demographic variables in US cities and states. We further obtain document-frequency measures of corruption by country and US state and replicate the econometric results of previous research studying its covariates. Finally, we provide the first measure of corruption in American cities. Poverty, population size, service-sector orientation, and ethnic fragmentation are shown to predict higher levels of corruption in urban America.
UR - http://www.scopus.com/inward/record.url?scp=84872467938&partnerID=8YFLogxK
U2 - 10.1111/j.1542-4774.2012.01110.x
DO - 10.1111/j.1542-4774.2012.01110.x
M3 - Article
AN - SCOPUS:84872467938
SN - 1542-4766
VL - 11
SP - 137
EP - 165
JO - Journal of the European Economic Association
JF - Journal of the European Economic Association
IS - 1
ER -