This is a short post because I’m on a field trip in rural Texas; another post with a graph and explanation will be up soon.

In this statement, Research 2000 head Del Ali seems to state that he manually altered his polling data, though it’s hard to tell from the muddled language:

Yes we weight heavily and I will, using the margin of error adjust the top line and when adjusted under my discretion as both a pollster and social scientist, therefore all sub groups must be adjusted as well.

He also seems to say that this is common among pollsters:

I challenge anyone to then look at comparable data from other firms, not one or two but many others.

This is actually a very good point. In political polling especially, there seems to be an almost deliberate murkiness about methods. I think this is because many published poll numbers may not be raw data reported directly from the sample, but rather estimates computed using outside information and only partially based on the polling sample.

While there is nothing wrong with estimation as opposed to raw sampling, it is important that a clear distinction be made between the two.
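
To make the distinction concrete, here is a minimal sketch of one common kind of estimation, weighting a sample to assumed demographic targets. The sample, the group labels, the 50/50 targets, and the function names are all made up for illustration; this is not a description of Research 2000's method or any other firm's.

```python
from collections import Counter

def raw_share(responses, answer):
    """Unweighted share of respondents who gave `answer`."""
    return sum(1 for _, a in responses if a == answer) / len(responses)

def weighted_share(responses, answer, population_shares):
    """Share of `answer` after weighting each group to an assumed
    population share instead of its share of the sample."""
    group_sizes = Counter(group for group, _ in responses)
    total = 0.0
    for group, pop_share in population_shares.items():
        in_group = group_sizes[group]
        if in_group == 0:
            raise ValueError(f"no respondents in group {group!r}")
        chose = sum(1 for g, a in responses if g == group and a == answer)
        total += pop_share * chose / in_group
    return total

# Hypothetical sample of 100 that over-represents group "F" (70 vs. 30).
sample = (
    [("F", "A")] * 42 + [("F", "B")] * 28
    + [("M", "A")] * 12 + [("M", "B")] * 18
)
# Assumed population targets (illustrative, not real census figures).
targets = {"F": 0.5, "M": 0.5}

raw = raw_share(sample, "A")
weighted = weighted_share(sample, "A", targets)
print(f"raw: {raw:.2f}, weighted: {weighted:.2f}")
```

In this toy case the raw sample puts candidate A at 54% and the weighted estimate puts A at 50%. Both numbers are defensible, but they are different kinds of numbers, and a reader should be told which one is being reported.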

  1. mbweissman says:

    There are further distinctions.
    First, since it’s very hard to use phones these days to get a representative sample of the population, all the pollsters have to use some sort of weighting procedure. Some categories (sex, ‘race’, etc.) have known demographic percents, so in some ways weighting for them is not too problematic. Others (party ID) obviously can change quickly. Here, trying to fix party ID percents at some target levels involves making some dubious assumptions about the very sort of thing the polls are trying to find out. Still, dubious assumptions can sometimes beat noisy data. When these various allowances for the known (and sometimes also the not quite known) properties of the population are done, a “polling result” is obtained. Different pollsters use different mixes of actual polling data and prior knowledge or beliefs about the population.

    What is not done by a legitimate pollster is to intervene, after all the data and algorithms are done, to change the results “at his discretion”. Even if that illegitimate step were taken, if it were done in a completely sincere if dishonest attempt simply to estimate the properties of the population, it would not show the peculiar statistical features exhibited by sequences people make up when they are trying to make things look “random”. Making a number look random is a different goal than making it a best possible estimate, even a best possible subjective guess estimate. So when a sequence has just the properties of a poorly simulated random sequence, that’s evidence it was not a best estimate of any type.

