Can public support for US presidential candidates be accurately estimated by correcting biases in social polls published on X?

Thousands of social polls on X suggest that Trump is leading the election race by a landslide. While many recognize the bias in these polls, there is an unexpected and fascinating twist, reminiscent of Asimov’s sci-fi.

Informal political polls have grown in popularity on X, the platform formerly known as Twitter. For instance, one such poll, conducted recently by billionaire X owner Elon Musk, received over 5 million votes. It showed the Republican nominee, Donald Trump, leading the Democratic nominee, Kamala Harris, by a landslide, 73.2% to 26.8%. Trump publicly featured the results of multiple such polls on his social media platform, Truth Social, presumably to create the impression of overwhelming popularity.

My recent TechPolicyPress article summarizes our research on biases and manipulation in such polls. However, the story of social polls on X has an unexpected twist… 

As we studied the biases in election polls published on X, it became clear that they were driven by the Republican lean and the demographics of poll participants. However, these biases can be corrected to estimate public support for the candidates. The correction process, in essence, makes social poll outcomes representative of US voters using the so-called regression and post-stratification methodology…
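To make the post-stratification idea concrete, here is a minimal Python sketch. It is an illustration only, not the method used in the research: the regression step is replaced by raw per-group vote shares, the function name `poststratify` and the two partisanship groups are hypothetical, and all numbers are invented toy values chosen to show how reweighting works.

```python
from collections import defaultdict


def poststratify(votes, population_shares):
    """Reweight poll votes so each group counts by its population share.

    votes: list of (group, choice) pairs, choice is 1 for the candidate, else 0.
    population_shares: dict mapping group -> share of the target population.
    Groups with no poll participants are skipped and the rest renormalized.
    """
    totals = defaultdict(int)
    in_favor = defaultdict(int)
    for group, choice in votes:
        totals[group] += 1
        in_favor[group] += choice

    estimate = 0.0
    covered = 0.0
    for group, share in population_shares.items():
        if totals[group] == 0:
            continue  # no data for this group in the poll
        estimate += share * in_favor[group] / totals[group]
        covered += share
    if covered == 0:
        raise ValueError("no overlap between poll groups and population groups")
    return estimate / covered


# Hypothetical toy poll: participants lean heavily toward one party.
votes = (
    [("republican", 1)] * 72 + [("republican", 0)] * 8
    + [("democrat", 1)] * 2 + [("democrat", 0)] * 18
)
# Hypothetical population split (an assumption, not real data).
population_shares = {"republican": 0.5, "democrat": 0.5}

raw = sum(choice for _, choice in votes) / len(votes)
corrected = poststratify(votes, population_shares)
print(round(raw, 2), round(corrected, 2))  # raw is about 0.74, corrected about 0.5
```

In this toy case the raw poll overstates support because one group is overrepresented among participants; weighting each group by its assumed population share pulls the estimate back toward an even split.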

You’re probably thinking now: you must be kidding!… 

Bear with me…

Our research team computed a running average of support for the 2024 US presidential candidates, Harris and Trump (see the timeline figure below, taken from our website tracking X polls). Polls on X resulted in an average of 70% to 75% of votes for Trump (dashed red line). However, once we corrected for the biases among X poll participants to estimate support more representative of the US voting population, we got a running average that oscillates around 50% support for each candidate (solid red and blue lines). By correcting biases in X polls, we obtained a drastically different picture of the election: a neck-and-neck horse race.
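For readers unfamiliar with running averages, the sketch below shows one common variant in Python: a trailing-window mean over dated estimates. The window length, the function name `running_average`, and the sample values are all assumptions made for illustration; the article does not specify how the published running average was computed.

```python
from datetime import date, timedelta


def running_average(observations, window_days=7):
    """Trailing-window mean of dated values.

    observations: list of (date, value) pairs, in any order.
    window_days: window length in days (an assumed default, not from the study).
    Returns a list of (date, mean of values within the window ending on that date).
    """
    obs = sorted(observations)
    days = sorted({d for d, _ in obs})
    result = []
    for day in days:
        start = day - timedelta(days=window_days - 1)
        in_window = [v for d, v in obs if start <= d <= day]
        result.append((day, sum(in_window) / len(in_window)))
    return result


# Hypothetical daily support estimates (invented values).
sample = [
    (date(2024, 10, 1), 0.48),
    (date(2024, 10, 3), 0.52),
    (date(2024, 10, 12), 0.50),
]
smoothed = running_average(sample, window_days=7)
for day, value in smoothed:
    print(day.isoformat(), round(value, 3))
```

A longer window gives a smoother line that reacts more slowly to events such as a debate, while a shorter window tracks day-to-day swings more closely.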

One might say: great, but is there something to learn from this time series, or is the situation equivalent to the randomness of a coin flip? Let’s compare this bias-corrected estimate of popular support, grounded in biased X polls, with the well-known forecasting model of 538 (sponsored by ABC News), which uses traditional election polls. The two time series painted starkly different pictures of the presidential election horse race in August and September, when 538 was forecasting that Harris was, in relative terms, 50% more likely to win the election than Trump. In October, however, the two time series started to overlap while oscillating around 50% support for each candidate. Can this be a coincidence?

On October 5, Musk joined a Trump presidential campaign event in Pennsylvania, a key swing state that 538 forecast as the most pivotal state in the election. Musk literally, and famously, jumped onto the stage at that event and into Trump’s presidential election campaign. On October 20, Musk launched a daily lottery, giving away $1 million to a registered swing state voter. These unusual events may be reflected in the two time series, which show support for Harris dipping after October 15 and between October 20 and 25.

The time series of public support estimates based on X polls synchronizes not only with the 538 forecast but also with prediction markets, such as PredictIt and Polymarket, where players bet on the winner of the election (blue lines below). Nor are these the only events around which the four time series move in sync. In fact, right after the September 10 debate between Harris and Trump, support for Harris increased according to all four time series (see the figures above and below).

It is unlikely that all these relationships are accidental, but more research is needed to establish the forecasting potential of bias-corrected estimates based on biased polls from X. For instance, I’m curious how such estimates compare with top forecasting models in the profitability test proposed by Rajiv Sethi (check out his post). This research is fascinating, particularly because the idea of accurately estimating public opinion from X polls is such a drastic twist on the realization that extremely biased X polls, quite frankly, misinform users.

If X were committed to poll accuracy, it could compute and publish such bias-corrected estimates of public support on its platform. Not only that, but X could do so with much higher precision than we can, because the platform has access to much more data. Our bias-corrected estimation method applies various AI components, including a large language model, a demographic classifier, and a partisanship classifier. It is a data-intensive and computationally complex approach. Our approach resembles the vision of the famous sci-fi writer Isaac Asimov (admired by Musk), outlined in his short story Franchise. Asimov envisioned a supercomputer, called Multivac, that would forecast election outcomes by identifying and interviewing an extremely small sample of representative voters.

To do justice to this vision, X may need to commit to distinguishing between freedom of speech and freedom to misinform.

Disclaimer: I recently accepted a faculty position at University College Dublin and transitioned to the role of adjunct professor at the University of Massachusetts Amherst. My move is, in part, motivated by the introduction of the Digital Services Act (DSA) in the European Union, a regulation that facilitates the study of social media platforms and their impact on democratic societies. Without such regulation, it would be practically impossible to continue studying polls on X, because X has limited academic access to data since Musk bought the platform.

1 We’re comparing slightly different things here: the percentage of the population supporting a candidate (estimated from X data) and the probability of the candidate winning the election (estimated by 538 from traditional polls).

2 The price of a prediction market contract betting that a particular candidate will win the election is yet another quantity, distinct from both the percentage of the population supporting a candidate and the probability of that candidate winning the election. Nevertheless, the three are related, as this post shows.