A test of statistical intuition

Another dot is going to be added to this chart, in line with the distribution you see here. You get to choose what the X value of the dot is — and your aim is to get a Y value of greater than zero. So here’s the question: at what value of X are you going to have a 95% chance of getting a dot above the axis, in positive territory on the Y axis?

Find out how good your statistical intuition is.
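
For anyone who wants to check their guess against a formula, here is a minimal sketch. It assumes something the chart itself does not tell us: that the dots follow a straight line with normally distributed noise, Y = intercept + slope × X + error, with known parameters and an upward slope. The function names and the example numbers are made up for illustration, and the sketch ignores the uncertainty in estimating the line from the dots.

```python
from statistics import NormalDist


def prob_y_positive(x, intercept, slope, sigma):
    """P(Y > 0) at a given x, assuming Y = intercept + slope * x + N(0, sigma^2) noise."""
    mean_y = intercept + slope * x
    return 1.0 - NormalDist(mu=mean_y, sigma=sigma).cdf(0.0)


def x_for_target_prob(intercept, slope, sigma, target=0.95):
    """Smallest x at which P(Y > 0) reaches the target, under the same assumed model."""
    if slope <= 0 or sigma <= 0:
        raise ValueError("this sketch assumes slope > 0 and sigma > 0")
    # P(Y > 0) >= target  <=>  intercept + slope * x >= z * sigma
    z = NormalDist().inv_cdf(target)  # roughly 1.645 for a 95% target
    return (z * sigma - intercept) / slope


if __name__ == "__main__":
    # Hypothetical parameters, not read off the chart in the post.
    x_star = x_for_target_prob(intercept=0.0, slope=1.0, sigma=1.0)
    print(x_star, prob_y_positive(x_star, 0.0, 1.0, 1.0))
```

The point the formula makes is that the answer depends on the spread of the dots as much as on the line: the noisier the scatter, the further past the point where the line crosses zero you have to go.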

The internet makes music better

An interesting piece by Robert Waldfogel on VoxEU attempts to estimate the quality of music over the past half century. He uses a few different measures of quality and asks whether the advent of music sharing online increased or reduced quality. The key chart is:
[Chart: “The Beatles are still the best”]
The spike in the 2000s is interesting, but in interpreting the measure one has to ask whether there’s anything other than quality that could be influencing it.

Essentially, he has used demand measures to proxy quality, so the most obvious bias seems to be demographics. Look, for example, at the continued high demand for ’60s music. That may be partly because The Beatles were great, but could it also be because that’s when the baby boomers were in their formative musical years, and so continue to demand the music of their youth? We know that there was a small echo from the baby boomers’ children in the ’80s and early ’90s, so could that be partially explaining the spike in demand for 2000s music, too?

Of course, the demographic argument works both ways: maybe the large number of young people in the sixties increased the supply of music, too. That may lead to a corresponding increase in the quantity of high-quality music, which would lead to it being disproportionately weighted in the index.

If anyone’s an expert in this field then let us know why you think Waldfogel’s results hold, or not. Alternatively, unfounded speculation is welcome.

Careful with the evidence, son

An article titled “study links cannabis to psychosis” says in the first paragraph:

People who use cannabis in their youth dramatically increase their risk of psychotic symptoms, and continued use of the drug can raise the risk of developing a psychotic disorder in later life, scientists said.

Of course, later on in the article – which no-one will read up to – it says:

But scientists say it is not yet clear whether the link between cannabis and psychosis is causal, or whether it is because people with psychosis use cannabis to self-medicate to calm their symptoms.

This actually pisses me off.  It is true: this isn’t evidence that cannabis use leads to psychosis – just that cannabis use when young and psychosis later in life are positively correlated.  There are a couple of possible causal mechanisms: the cannabis could cause the psychosis, or cannabis use may be preferred by people who are more likely to develop psychosis.
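
To see how the second mechanism alone can generate the correlation, here is a small simulation sketch. Everything in it is invented for illustration: the `vulnerability` variable, the probabilities, and the sample size are assumptions, not figures from the study. In this toy world cannabis has zero causal effect on psychosis, yet users still show a higher psychosis rate, because the same underlying vulnerability raises both the chance of using and the chance of psychosis.

```python
import random


def simulate(n=100_000, seed=0):
    """Return (psychosis rate among users, psychosis rate among non-users).

    Hypothetical model: a latent vulnerability drives both cannabis use and
    psychosis. Cannabis use itself never enters the psychosis probability.
    """
    rng = random.Random(seed)
    users = nonusers = 0
    psych_users = psych_nonusers = 0
    for _ in range(n):
        vulnerability = rng.random()  # uniform on [0, 1), made up for illustration
        uses = rng.random() < 0.1 + 0.4 * vulnerability
        psychosis = rng.random() < 0.02 + 0.2 * vulnerability  # no dependence on `uses`
        if uses:
            users += 1
            psych_users += psychosis
        else:
            nonusers += 1
            psych_nonusers += psychosis
    return psych_users / users, psych_nonusers / nonusers


if __name__ == "__main__":
    rate_users, rate_nonusers = simulate()
    print(f"users: {rate_users:.3f}, non-users: {rate_nonusers:.3f}")
```

A correlation study on data from this toy world would find the same positive link, which is why the link on its own cannot tell the two mechanisms apart.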

I was going to wait till I read the study before commenting – but the scientists themselves appear to have admitted that they have found a correlation, and haven’t sorted out the causal mechanism.  Which simply means that the article itself was purely misleading.

I realise having a headline “cannabis and psychosis are correlated, but the cause is unclear” is less likely to sell papers – but it is honest.  The current headline is exciting, but it is dishonest – and merely feeds into current misconceptions in society, thereby providing misinformation.  And I hate misinformation.

Quote of the day: On theory and data

From Eric Crampton:

I tend to be pretty skeptical of results that aren’t grounded in basic price theory or that aren’t confirmed by a lot of different methods, including both Ocular and Ordinary Least Squares.

A single paper merely helps people to update their priors – it doesn’t provide conclusive evidence of a result.  Often, when it comes to economic discourse, there is too much “faith” put in single results that tend to reinforce whatever the author’s prior is.

That is why the economic method in itself is a useful tool – but the process of reaching a conclusion, or justifying why a result holds, is one of debate and discussion.  It is more of an art form than anything else.

Unless you can say why, a stated result means very little.  No matter how fancy your data analysis or theoretical model is, unless you can explain it, it means nothing.  Luckily, economists out there do go around explaining why – which is why I have much more faith in the analysis of economists than in that of many other disciplines.

Spirit level: A more fundamental concern

I agree with Dim Post that the choice of countries to add to those used in the Spirit Level is a bit arbitrary (although I think Not PC and Kiwiblog also have a point regarding how sensitive the regression results are to the choice of countries that aren’t strictly the largest outliers in the sample) – but I still think that this particular “regression” is a steaming pile of unmentionables.

Let’s ignore the fact that the slope of the “regression line” appears very sensitive to the addition of a few countries.  Let’s instead focus on the fact that it is a poor regression and that there isn’t a clear “theoretical background for causation”.
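
For readers who want to see what that kind of sensitivity looks like, here is a minimal sketch of a leave-one-out check on a simple regression slope. The numbers are entirely made up and are not the Spirit Level data; `ols_slope` and `leave_one_out_slopes` are names chosen for this example.

```python
def ols_slope(xs, ys):
    """Ordinary least squares slope of y on x for a simple regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def leave_one_out_slopes(xs, ys):
    """Slope re-estimated with each observation dropped in turn."""
    return [
        ols_slope(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        for i in range(len(xs))
    ]


if __name__ == "__main__":
    # Hypothetical "countries": five clustered points plus one far from the rest.
    xs = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
    ys = [2.0, 1.0, 3.0, 2.0, 3.0, 12.0]
    print("all points:", ols_slope(xs, ys))
    print("leave one out:", leave_one_out_slopes(xs, ys))
```

If dropping or adding a single observation moves the slope a lot, as it does with the last point here, the fitted line is telling you more about which countries were chosen than about any underlying relationship.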


Data and prediction

Via Scott Sumner we saw the following article that mentions economic data and economic predictions.  The statements that stood out to me were:

(Economic) predictions are, of course, the bread and butter of economic institutions. But can we believe them?

In recent years, some economists have begun to express doubts over predictions made from huge volumes of data, but they are in the minority. Most embrace the idea that more measurements mean better predictive abilities.

Hold up.

For one, as we have mentioned, prediction is not the central element of what economists do – and even when they do predict, the goal of such prediction is to give some view regarding risks and movements, not direct figures (it is more ordinal than cardinal in some sense).

Secondly, ever since the Lucas critique economists have been very nervous about predictions from large amounts of data without theory – I would say that the majority of economists doubt the usefulness of econometric models relying solely on huge amounts of data.

Economists would like data with less measurement error, data that is closer to representing the true economic variables we discuss in theory – we aren’t looking for an infinite number of measures we can stick together to find a result.  An economist who doesn’t use theory to inform their discussions of the economic outlook, but uses lots of data, isn’t an economist – that is all.
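
One way to see why measurement error matters more than sheer volume is the textbook attenuation effect: if the explanatory variable is measured with noise, the estimated slope shrinks towards zero however many observations you collect. The sketch below is a made-up simulation, with a true slope of 2 and measurement noise as large as the variable itself, both chosen purely for illustration.

```python
import random


def ols_slope(xs, ys):
    """Ordinary least squares slope of y on x for a simple regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate_slopes(n=20_000, true_slope=2.0, noise_sd=1.0, seed=0):
    """Return (slope using the true x, slope using a noisily measured x)."""
    rng = random.Random(seed)
    true_x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [true_slope * x + rng.gauss(0.0, 1.0) for x in true_x]
    # The analyst only sees x plus measurement error (a hypothetical setup).
    measured_x = [x + rng.gauss(0.0, noise_sd) for x in true_x]
    return ols_slope(true_x, y), ols_slope(measured_x, y)


if __name__ == "__main__":
    clean, noisy = simulate_slopes()
    print(f"true x: {clean:.2f}, mismeasured x: {noisy:.2f}")
```

With these assumed variances the mismeasured slope should sit near half the true one, and adding more observations of the same noisy measure only makes the wrong answer more precise.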