Each week, Data for Progress surveys thousands of voters, primarily using SMS text-to-web and web panels, across weekly national omnibus surveys and additional polling projects. Although both of these modes are relatively new in a field where live caller interviews were, until recently, the dominant mode of polling, our methodology meets, and often exceeds, existing polling benchmarks for accuracy and respondent pool quality.
Polling methodology is constantly evolving to reach voters where they are — which is increasingly on their cellphones and computers. Data for Progress (DFP) continually adjusts and experiments with new methods of recruitment, weighting, and instrumentation to provide the most accurate picture of the electorate possible.
How We Poll
SMS Text-to-Web
SMS text-to-web is a method of polling that allows for less intrusive outreach than a phone call while maintaining comparability to a registration-based phone study. Data for Progress maintains access to a permanent voter file, supplements the file with cellphone records, and scores every voter with an SMS response propensity score. When sampling, our analysts can therefore stratify on any combination of voter file and commercially appended features (e.g., age, gender, partisanship, income, and urbanicity).
Web Panel
Data for Progress works with a series of web panel respondent marketplaces, such as Cint, to recruit web respondents. To guard against self-selection among respondents, DFP maintains a complex set of quotas and screening questions to ensure each survey draws a genuine, representative sample of respondents.
Web panel survey research requires a delicate balance between disqualifying disingenuous respondents and controlling for frequent participants, while also preserving responses from genuine and low-socioeconomic-status respondents. DFP monitors providers, respondents, and trends among responses to detect anomalies in respondent pools, and automatically disqualifies or controls for respondents based on a series of checks, including attention, truthfulness, consistency, and completion speed. DFP also works with providers to monitor quality and blacklists providers with high failure rates or inconsistent respondents to ensure accuracy.
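To illustrate the kind of automated screening described above, the sketch below flags speeders, failed attention checks, and internally inconsistent answers. The column names and thresholds are hypothetical, not DFP's actual production rules.

```python
import pandas as pd

# Illustrative respondent-level quality screen. Column names and thresholds
# are assumptions for this sketch, not DFP's production configuration.
def flag_low_quality(responses: pd.DataFrame) -> pd.Series:
    median_seconds = responses["duration_seconds"].median()

    # Speeders: completion far faster than the median respondent.
    too_fast = responses["duration_seconds"] < 0.3 * median_seconds
    # Attention: respondent missed a trap/instructed-response question.
    failed_attention = ~responses["attention_check_passed"].astype(bool)
    # Consistency: the same fact asked twice should match.
    contradictory = responses["age_screener"] != responses["age_demographic"]

    return too_fast | failed_attention | contradictory

# Flagged respondents would then be reviewed, disqualified, or controlled for.
```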
Additional Recruitment
Data for Progress also regularly works with other survey modes (e.g., live caller phone recruitment, interactive voice response, and mail-to-web) in addition to our primary modes. In some cases, these could serve as the dominant mode, depending on the project. Their inclusion depends on the efficacy of existing modes in a given geography or among a specific respondent pool. When these modes are used, they are noted in the methods statement accompanying each polling release.
Sampling Methodology
Random Contact
Our sampling frames are generated by selecting records at random from a commercial voter file. We stratify these on personal and regional demographic and political characteristics available in the file. For SMS text-to-web polling, for example, sampling is conducted at the level of individual cellphone numbers. In the case of mail surveys, our sampling is based on address-level data. For other random-contact modes, such as live caller, we use the relevant unique identifying variables for sampling purposes.
Sampling weights can vary from survey to survey depending on the obtained pool of respondents and the target population. When possible, we oversample voters who are less likely to respond based on a voter-level response score trained on past response data.
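As a rough illustration of this kind of design, the sketch below draws a stratified sample from a voter file and upweights low-propensity voters within each stratum. The field names ("age_bucket", "sms_response_score", and so on) and the allocation rule are assumptions for this sketch, not DFP's production sampling code.

```python
import pandas as pd

# Illustrative stratified draw with oversampling of low-response-propensity
# voters. Field names and allocation rules are hypothetical.
def draw_sample(voter_file: pd.DataFrame, n_target: int) -> pd.DataFrame:
    strata = voter_file.groupby(["age_bucket", "party", "urbanicity"])
    draws = []
    for _, stratum in strata:
        # Allocate the target sample size proportionally across strata.
        share = len(stratum) / len(voter_file)
        n_stratum = round(n_target * share)
        # Contact low-propensity voters at higher rates so completed
        # interviews are not dominated by habitual responders.
        weights = 1.0 / stratum["sms_response_score"].clip(lower=0.01)
        draws.append(stratum.sample(n=min(n_stratum, len(stratum)),
                                    weights=weights, random_state=42))
    return pd.concat(draws)
```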
Web Panel
Potential web survey respondents are uniquely identified for survey participation by profile identifiers defined and maintained by a marketplace vendor. Profile selection for inclusion in a web panel survey is determined by DFP via preset quotas based on demographic, partisan, and survey participation characteristics necessary for a representative respondent pool.
Profiles are often sent an email invitation, an in-app notification, or an SMS notification informing them that the survey is for research purposes only, how long the survey is expected to take, and what incentives are available. On occasion, respondents will see surveys they are likely to qualify for upon signing into a panel portal. To avoid self-selection bias, survey invitations do not include specific details about the contents of the survey and are instead kept general.
Respondents receive an incentive based on the length of the survey, their specific panelist profile, and target acquisition difficulty, among other factors. The specific rewards vary and may include cash, airline miles, gift cards, redeemable points, charitable donations, sweepstakes entries, and vouchers.
Weighting and Data Analysis
Our national omnibus surveys are weighted to match the composition of likely voters by age, gender, education, race, geography, 2020 recalled vote, and select joint distributions of those variables from the TargetSmart commercially available voter file and high-quality, third-party turnout modeling. In 2024, we refined our methodology statement from "vote history" to "2020 recalled vote" for enhanced precision and clarity. Some surveys may represent registered or other voter environments similarly defined by the commercial voter file, or adult populations constructed from the most recently available Census Bureau’s American Community Survey (ACS) 1-year data; the targets for these surveys will be listed in each specific survey’s methodology statement.
We use raking with regularization to generate weights for our survey respondents. Raking is a procedure where data points are given a weight so that selected marginal distributions match given targets. Regularization allows us to generate weights which significantly reduce bias while keeping excessive weight variation to a minimum.
Our weighting scheme also requires additional conditions: (a) our weights are as close to uniform as possible, (b) our weights are constrained by upper and lower bounds on weight values, and (c) the weights sum to N (i.e., the average weight is 1). Additionally, our weighting pipeline applies a series of attention and response quality checks to remove low-quality responses while preserving responses from low-socioeconomic-status and low-salience respondents.
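For intuition, here is a minimal sketch of raking (iterative proportional fitting) with bounded weights over two illustrative margins. The targets, the [0.2, 5] bounds, and the simple trimming step are assumptions for the sketch; production weighting covers more variables and joint distributions and uses regularization rather than plain trimming.

```python
import numpy as np
import pandas as pd

# Minimal raking sketch with weight bounds. Targets and bounds are
# illustrative only, not DFP's production values.
def rake(df: pd.DataFrame, targets: dict, n_iter: int = 50,
         lo: float = 0.2, hi: float = 5.0) -> np.ndarray:
    w = np.ones(len(df))
    for _ in range(n_iter):
        # e.g. targets = {"gender": {"female": 0.52, "male": 0.48},
        #                 "age_bucket": {"18-34": 0.30, "35-64": 0.45, "65+": 0.25}}
        for var, margin in targets.items():
            for level, target_share in margin.items():
                mask = (df[var] == level).to_numpy()
                current = w[mask].sum() / w.sum()
                if current > 0:
                    w[mask] *= target_share / current  # match this margin
        w = np.clip(w, lo, hi)          # (b) keep weights within bounds
    return w * len(df) / w.sum()        # (c) weights sum to N (mean weight = 1)
```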
Survey Accuracy
While Data for Progress has demonstrated a strong track record for accuracy – ranking in the top 10% of pollsters in FiveThirtyEight’s Pollster Ratings as of March 2024 – surveys across all modes are subject to a variety of sources of error that may impact final topline results.
Sampling Error
While it is possible to draw a sampling frame at random consisting of individuals we wish to contact, these methods still result in various forms of non-participation error, meaning that certain individuals have low or even no chance of being included in our poll. This includes non-coverage error (e.g., no phone appended on the voter file) and non-response error (people declining to participate when we invite them). These examples are why we prefer the term “random contact” when discussing sampling methods, so as not to imply that randomly selecting individuals from a file ensures that their inclusion in the sample is itself random. We use two approaches to reduce the bias associated with these non-participation errors: (a) raking samples to match target population proportions along key variables, and (b) oversampling low-response-propensity individuals, where propensity is determined by a logistic regression model of response (or of strata, as the case may be).
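For illustration, a response propensity model of the kind referenced in (b) might be fit as follows. The feature names, file names, and outcome column are placeholders, not DFP's actual inputs.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Illustrative response propensity model: predict whether a contacted voter
# completed a past survey from voter file covariates. Feature and file names
# are placeholders for whatever is appended to the file.
features = ["age", "is_female", "is_democrat", "is_republican", "urbanicity_score"]

history = pd.read_csv("past_contact_outcomes.csv")  # hypothetical prior contact attempts
model = LogisticRegression(max_iter=1000)
model.fit(history[features], history["responded"])

# Score the full frame; low-propensity voters can then be oversampled at contact.
frame = pd.read_csv("voter_frame.csv")              # hypothetical sampling frame
frame["response_propensity"] = model.predict_proba(frame[features])[:, 1]
```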
Nonresponse Error
The choice to participate in any survey is nonrandom and often a unique indicator of political engagement that is related to — but not fully explained by — vote history, political activism, and donations. We know response rates are highly correlated to a variety of observable demographic factors, as well as some unobservable factors, and that these biases in survey composition can result in large shifts in polling results.
We have implemented a variety of approaches to mitigate the effects of nonresponse error in our surveys, including controls on observable characteristics associated with nonresponse, such as weighting on 2020 vote recall and controlling for how frequently respondents have taken prior surveys.
Measurement Error
Measurement error in surveys refers to the difference between a respondent’s true opinion on a subject and their stated or interpreted response to a survey question. The cause of measurement error may be due to confusing wording, lack of context or sufficient information on a subject, or acquiescence bias in survey response. At Data for Progress, we mitigate measurement error with careful consideration in the design of survey questions and responses to reduce confusion, bias, and inattentive responses.
For niche subjects, we offer relevant context to inform respondents about the general idea of the subject, reducing confusion and imprecise responses. For example, we offer “Don’t know” response options when the context calls for them to avoid careless responses, and we often ask preceding questions about a respondent’s level of knowledge of or interest in a subject to contextualize their opinions on it.
To promote attentiveness, we also randomize or flip the order of questions’ response options, depending on the question type. We also limit the length of question wording, the number of response options, and the overall length of the survey.
Margins of Error
The margin of error associated with the sample size varies in each survey based on the number of respondents, assuming a 95% confidence interval. Each poll released will display the respondent size (N) and margin of error associated with that sample size in its methods statement. While the majority of our surveys have over 1,000 weighted responses with a margin of error of approximately ±3 percentage points, it is important to note that surveys with larger N will have smaller margins of error and vice versa. Similarly, results for crosstabs of subgroups of the sample are subject to increased margins of error due to the smaller sample size of each subgroup. These calculations are based on the assumption that polls are genuine random samples, with every member of the population having an equal chance of being selected, which is virtually never fully achievable. Margins of error may also understate the total survey error in the presence of nonresponse or other types of bias. We report the margin of error associated with the sample size across various modes of contact, acknowledging that while all statistical measures operate under certain assumptions and uncertainties, the margin of error provides a consistent, standardized framework for quantifying sampling variability.
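For reference, the approximately ±3 point figure follows from the standard margin-of-error formula for a proportion at 95% confidence under the simple random sampling assumption noted above. A quick sketch:

```python
import math

# Margin of error for a proportion at 95% confidence, assuming a simple
# random sample (the idealized assumption discussed above).
def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    return z * math.sqrt(p * (1 - p) / n)

print(round(100 * margin_of_error(1000), 1))  # ~3.1 points for N = 1,000
print(round(100 * margin_of_error(250), 1))   # ~6.2 points for a subgroup of 250
```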
Focus Groups
Data for Progress has expanded its capabilities in qualitative research via the implementation of focus groups that aim to offer more context and deeper conversations with voters. We manage the process from cradle to grave to ensure high-quality results. We recruit, host, moderate, and analyze the results from online and in-person focus groups, and we conduct one-on-one qualitative interviews.
In 2023, we conducted 17 focus groups with participants from nine states and Washington, D.C. DFP works with local recruiters to curate a representative sample of workshop participants given the scope and objectives of the analysis, and can narrow the sample to favor specific industries or demographic groups.
Ahead of each workshop, DFP’s polling and policy teams work collaboratively to draft a discussion guide. We compensate participants generously for their time and budget for potential travel costs to ensure a diverse, high-quality, and reliable sample group. We produce full transcripts after each workshop, which we use to inform our quantitative and qualitative research findings. We generally conduct in-person research for specific geographies or community engagement groups, online research for broader geographies (e.g., statewide research), and one-on-one interviews for groups with unique participants (e.g., members of specific groups).
Message Testing
Data for Progress is committed to making message testing and polling insights accessible and useful to progressive organizations. In 2022, we published a white paper with Priorities USA detailing how our proprietary MaxDiff survey format and modeling produce ordinal rankings of messages or policies that are accurate and match randomized controlled trials. This makes it easy to answer the question, “How should I talk about X?” even among hard-to-reach voter populations.
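As a rough illustration of how MaxDiff responses can be turned into an ordinal ranking, the sketch below uses a simple best-minus-worst count. This baseline is only illustrative; it is not DFP's proprietary modeling approach, and the file and column names are hypothetical.

```python
import pandas as pd

# Simple best-minus-worst counting analysis of MaxDiff tasks. Each row records
# which message a respondent picked as most and least persuasive from the set
# shown. File and column names are hypothetical.
tasks = pd.read_csv("maxdiff_tasks.csv")  # hypothetical columns: best, worst

scores = (tasks["best"].value_counts()
          .subtract(tasks["worst"].value_counts(), fill_value=0)
          .sort_values(ascending=False))

print(scores)  # higher score = stronger message in the ordinal ranking
```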