
Problem of the between-state correlations in the Fivethirtyeight election forecast

Elliott writes:

I think we’re onto something with the low between-state correlations [see item 1 of our earlier post]. Someone sent me this collage of maps from Nate’s model that show:

– Biden winning every state except NJ
– Biden winning LA and MS but not MI and WI
– Biden losing OR but winning WI, PA

And someone says that in the 538 simulations where Trump wins CA, he only has a 60% chance of winning the election overall.

Seems like the arrows are pointing to a very weird covariance structure.

I agree that these maps look really implausible for 2020. How’s Biden gonna win Idaho, Wyoming, Alabama, etc. . . . but not New Jersey?

But this does all seem consistent with correlations of uncertainties between states that are too low.
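To see why low correlations would produce maps like these, here is a minimal sketch that treats two state margins as bivariate normal. All the numbers (means, standard deviations, correlations) are made up for illustration and are not taken from the Fivethirtyeight model or ours; `conditional_mean` is a hypothetical helper that just applies the textbook formula for a conditional normal mean.

```python
def conditional_mean(mu_b, sd_b, mu_a, sd_a, rho, a_observed):
    """E[B | A = a] when (A, B) is bivariate normal with correlation rho."""
    return mu_b + rho * (sd_b / sd_a) * (a_observed - mu_a)

# Hypothetical forecasts of Biden's margin, in percentage points.
mu_ca, sd_ca = 30.0, 6.0   # a safe state (assumed numbers)
mu_pa, sd_pa = 5.0, 6.0    # a swing state (assumed numbers)
ca_tied = 0.0              # condition on the safe state being a tie

for rho in (0.2, 0.8):
    pa_given_ca = conditional_mean(mu_pa, sd_pa, mu_ca, sd_ca, rho, ca_tied)
    print(f"rho={rho}: expected swing-state margin given a tied safe state = {pa_given_ca:+.1f}")
```

With the low correlation, a 30-point collapse in the safe state moves the swing state by only 6 points, so the swing state stays close to a toss-up. With the high correlation, the same collapse drags the swing state down by 24 points. The first pattern is the kind of thing that would let a simulation have Trump winning California without being a near lock to win overall.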

Perhaps this is a byproduct of Fivethirtyeight relying too strongly on state polls and not fully making use of the information from national polls and from the relative positions of the states in previous elections.

If you think of the goal as forecasting the election outcome (by way of vote intentions; see item 4 in the above-linked post), then state polls are just one of many sources of information. But if you start by aggregating state polls, and then try to hack your way into a national election forecast, then you can run into all sorts of problems. The issue here is that the between-state correlation is mostly not coming from the polling process at all; it’s coming from uncertainty in public opinion changes among states. So you need some underlying statistical model of opinion swings in the 50 states, or else you need to hack in a correlation just right. I don’t think we did this perfectly either! But I can see how the Fivethirtyeight team could’ve not even realized the difficulty of this problem, if they were too focused on creating simulations based on state polls without thinking about the larger forecasting problem.

There’s a Bayesian point here, which is that correlation in the prior induces correlation in the posterior, even if there’s no correlation in the likelihood.
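A small worked example of that Bayesian point, under assumptions chosen only to keep the algebra simple: two states, a normal prior with a common standard deviation and correlation `prior_rho`, and one independent normal poll per state with standard deviation `poll_sd`. The function name and the numbers are hypothetical. In this conjugate setup the posterior precision is the prior precision plus the likelihood precision, and the likelihood precision is diagonal because the polls are independent.

```python
def inv2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def posterior_correlation(prior_sd, prior_rho, poll_sd):
    """Posterior correlation between two state means (normal-normal model)."""
    var = prior_sd ** 2
    prior_cov = [[var, prior_rho * var], [prior_rho * var, var]]
    prior_prec = inv2(prior_cov)
    # Independent polls: the likelihood adds precision on the diagonal only.
    like_prec = 1.0 / poll_sd ** 2
    post_prec = [
        [prior_prec[0][0] + like_prec, prior_prec[0][1]],
        [prior_prec[1][0], prior_prec[1][1] + like_prec],
    ]
    post_cov = inv2(post_prec)
    return post_cov[0][1] / (post_cov[0][0] * post_cov[1][1]) ** 0.5

print(posterior_correlation(prior_sd=1.0, prior_rho=0.8, poll_sd=1.0))
print(posterior_correlation(prior_sd=1.0, prior_rho=0.0, poll_sd=1.0))
```

The off-diagonal term of the posterior precision comes entirely from the prior, so a correlated prior leaves the posterior correlated (though less so than the prior), while an uncorrelated prior combined with independent polls gives a posterior with no correlation at all.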

And, as we discussed earlier, if your between-state correlations are too low, and at the same time you’re aiming for a realistic uncertainty in the national level, then you’re gonna end up with too much uncertainty for each individual state.
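The arithmetic behind this can be sketched with a deliberately crude setup: treat the national margin as the equally weighted average of `n_states` state margins, each with the same standard deviation and the same pairwise correlation `rho`. Real forecasts weight states by turnout and do not use a single correlation, so this is only an illustration, and `implied_state_sd` is a hypothetical helper. Under these assumptions the variance of the average is `state_sd**2 * (1 + (n_states - 1) * rho) / n_states`, which can be solved for the state-level standard deviation.

```python
import math

def implied_state_sd(national_sd, rho, n_states=50):
    """State-level sd needed to hit a target national sd.

    Assumes equal weights, equal state sds, and one common pairwise
    correlation rho (simplifying assumptions, not any real model).
    """
    return national_sd * math.sqrt(n_states / (1.0 + (n_states - 1) * rho))

target_national_sd = 3.0  # hypothetical national uncertainty, in points
for rho in (0.2, 0.5, 0.8):
    sd = implied_state_sd(target_national_sd, rho)
    print(f"rho={rho}: each state needs sd of about {sd:.1f} points")
```

Holding the national uncertainty fixed, lowering the correlation forces the per-state standard deviation up, which is how a model can end up giving nonnegligible probability to Biden losing New Jersey or winning Wyoming.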

At some level, the Fivethirtyeight team must realize this—earlier this year, Nate Silver wrote that correlated errors are “where often *most* of the work is in modeling if you want your models to remotely resemble real-world conditions”—but recognizing the general principle is not the same thing as doing something reasonable in a live application.

These things happen

Again, assuming the above maps actually reflect the Fivethirtyeight forecast and they’re not just some sort of computer glitch, this does not mean that what they’re doing at that website is useless, nor does it mean that we’re “right” and they’re “wrong” in whatever other disagreements we might have (although I’m standing fast on the Carmelo Anthony thing). Everybody makes mistakes! We made mistakes in our forecast too (see item 3 in our earlier post)! Multivariate forecasting is harder than it looks. In our case, it helped that we had a team of 3 people staring at our model, but of course that didn’t stop us from making our mistakes the first time.

At the very least, maybe this will remind us all that knowing that a forecast is based on 40,000 simulations or 40,000,000 simulations or 40,000,000,000 simulations doesn’t really tell us anything until we know how the simulations are produced.



from Statistical Modeling, Causal Inference, and Social Science https://ift.tt/3lCAwYE
