When will Nonprobability Surveys Mirror Probability Surveys? Considering Types of Inference and Weighting Strategies as Criteria for Correspondence

When will a nonprobability sample look like a probability sample? This key question in survey research has reemerged with the proliferation of new, nonprobability methods for collecting social science data (Baker et al., 2013). Among these, Internet surveys offer myriad advantages over other survey methods, including reduced costs, faster survey administration, and possibly more accurate self-reports (Chang & Krosnick, 2010; Fricker & Schonlau, 2002; Greenlaw & Brown-Welty, 2009; Wright, 2005). These surveys also open new research possibilities, allowing for the presentation of experimental stimuli to broad national samples (Couper, 2000; Iyengar, 2011; Skitka & Sargis, 2006). But many of these advantages can be reaped only if data collected from nonprobability samples can be transformed to reflect the public. To the extent that this is not the case, there is reason for concern. The majority of Internet surveys use opt-in, nonprobability samples, for which generalizability is not assured (Baker et al., 2010; Couper, 2000).

As nonprobability data collection methods continue to propagate, it becomes increasingly important to demarcate conditions under which data from samples with broad but incomplete coverage and potentially problematic sampling frames will produce results similar to those of probability samples. The ability to account for any differences will likely depend on two critical factors: the type of inference we hope to make and the model we use to translate between the data collected and society as a whole. For example, probability and nonprobability samples may not reflect the same distributions of attitudes and behaviors at any given time, and, yet, they may reveal similar patterns of change over time. Nonprobability research might thereby estimate current attitudes by correcting for biases in past measures of those attitudes. Further, correspondence between data sources may depend on the set of survey weights that are applied. If we can identify conditions under which probability and nonprobability samples reliably reach similar results, newfound techniques can be implemented with confidence.

This paper juxtaposes two data sources: A probability-based telephone survey (collected using random digit dialing, or RDD) is compared with a series of simultaneously collected nonprobability opt-in Internet surveys to identify conditions of correspondence. Data streams are compared for each of three types of inference (point estimates, relations between variables, and trends over time) across four weighting strategies. The central question is the extent to which any combination of inferential objective and weighting procedure yields consistent results across vastly different methods.