Max Pellert (https://mpellert.at)
Beloved standard tool of the social sciences
Often considered the gold standard, “ground truth” (especially when working with large representative samples of a population)
But classical survey methodologies increasingly suffer from problems
First line of the 2024 book “Polling at a Crossroads: Rethinking Modern Survey Research”: “Survey research is in a state of crisis”
Most recent example: the 2024 US presidential election
(Generally, I think strong method conservatism in the social sciences does not make sense)
Ann Selzer had an excellent multi-decade track record of accurate polling
For example: in 2008, she predicted that a virtually unknown senator, Barack Obama, would beat frontrunner Hillary Clinton in the Iowa caucuses
The widely publicized final poll of Iowa by Selzer & Company showed Harris leading by 3 percentage points (in Iowa!)
On Election Day, Trump won the state by 13 points
The pollster made a laudable public effort at an error analysis: “To cut to the chase, I found nothing to illuminate the miss.”
“Within the margin of error”, “We said it’s close”, “Predicting it at almost 50-50 means that this can happen”
The 2024 election was not close: it was a decisive victory on all metrics
Relevance of survey research? You don’t need much sophisticated machinery (or money) to predict that a two-party system like the US, with deeply ingrained political beliefs in the population and a very peculiar electoral system, will produce a tight race
Main issue? Non-response
Fewer than 1% of people respond, even in respected, well-established surveys (the New York Times polls, for example)
By now, any change in traditional polls may just reflect a new pattern of non-response
There are many reasons; spam calls and ping calls are a recent one
Statistics can correct for some problems, but you need some basis to work from
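Post-stratification weighting is one example of such a correction: respondents are reweighted so that known demographic cells match their population shares. Below is a minimal sketch, assuming a toy sample with made-up cells, answers, and population shares (none of the numbers come from a real survey):

```python
# Minimal post-stratification weighting sketch (illustrative only;
# the cells, answers and population shares are made up).
import pandas as pd

# Hypothetical respondents, each assigned to one demographic cell
sample = pd.DataFrame({
    "cell":   ["18-34", "18-34", "35-64", "65+", "65+", "65+"],
    "answer": [1, 0, 1, 0, 0, 1],   # e.g., yes/no to a survey question
})

# Known population shares per cell (e.g., from a census)
population_share = {"18-34": 0.30, "35-64": 0.45, "65+": 0.25}

# Weight = population share / sample share for each cell
sample_share = sample["cell"].value_counts(normalize=True)
sample["weight"] = sample["cell"].map(
    lambda c: population_share[c] / sample_share[c]
)

raw_estimate = sample["answer"].mean()
weighted_estimate = (sample["answer"] * sample["weight"]).sum() / sample["weight"].sum()
print(f"raw: {raw_estimate:.2f}, weighted: {weighted_estimate:.2f}")
```

The catch: with sub-1% response rates, these weights rest on the assumption that respondents within each cell resemble the non-respondents, which is exactly the basis that is eroding.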
At the same time that they don’t respond to surveys, people are extremely expressive elsewhere, for example on social media
Britain’s mood, measured weekly
One example of an easily accessible, representative survey (UK) that is not directly in the political domain
We can recreate these dynamics with our approach of longitudinal adapters
Not equally well for all constructs
Remember, our approach is just self-supervised next-token prediction (no labels are involved, unlike, for example, the supervised text classification models of TweetNLP)
Our approach is very flexible: in principle, we can ask any question and get survey-like responses for each week
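To make the “ask any question, get weekly survey-like responses” idea concrete, here is a hedged sketch of how such a query could look: one adapter per week on top of a base language model, a survey-style prompt, and the answer read off the next-token probabilities. The base model name, adapter paths, prompt wording, and answer options are illustrative assumptions, not the actual setup.

```python
# Hedged sketch: querying weekly adapters with a survey-style question and
# reading answer probabilities from next-token prediction.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "gpt2"  # placeholder base model, not the actual one used

question = ("Thinking about the past week, would you say you felt happy? "
            "Answer:")
options = [" yes", " no"]  # candidate answer continuations

tokenizer = AutoTokenizer.from_pretrained(BASE)

def answer_distribution(model, prompt, options):
    """Share of next-token probability mass on each answer option."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]      # logits for the next token
    probs = torch.softmax(logits, dim=-1)
    option_ids = [tokenizer.encode(o)[0] for o in options]
    mass = probs[option_ids]
    return (mass / mass.sum()).tolist()             # renormalise over options

# One adapter per week, trained on that week's text (paths are hypothetical)
for week in ["2023-W01", "2023-W02"]:
    base = AutoModelForCausalLM.from_pretrained(BASE)
    model = PeftModel.from_pretrained(base, f"adapters/{week}")
    p_yes, p_no = answer_distribution(model, question, options)
    print(week, f"p(yes) = {p_yes:.2f}")
```

Repeating this for every week yields a time series of answer probabilities that can then be compared against the published survey curve.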
Why does that work?
I don’t think we should be replacing survey research
Even with complementary synthetic methods, we will still need classical approaches, for example to learn about the sampling frame
But we should be making use of the text that people are producing (and potentially other modalities too)
These are first steps for now, and we have to rigorously validate what we are doing
Huge potential: Low costs, scalability, unobtrusive observation, high temporal resolution, …
Bridging the gap between “qualitative” data and quantitative insights