Lately I’ve been encountering a lot of statistical rhetoric in both scientific and journalistic articles. By this I mean that a significant number of authors use tricks of statistical methodology and jargon to bias the numbers we consume toward particular agendas.

A couple of examples:

(1) Relative Risk

Relative risk is a popular favorite because it makes relatively small changes big enough to be noticeable. For instance, suppose statistics establish that the absolute risk of getting purple polkadots (for anyone in our population) is 5%, but if you jump up and down for 30 minutes a day, the risk goes down to 4%. The absolute change is 1 percentage point, but the relative risk change is 20%! A lot of folks use this kind of framing to make their numbers more noticeable, and it can be, and often is, taken to extremes, especially to promote certain agendas. I will definitely write about this in more detail later.
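To make the arithmetic concrete, here's a minimal Python sketch using the hypothetical polkadot numbers above (the variable names are mine, not from any study):

```python
# Hypothetical example: absolute vs. relative risk reduction.
baseline_risk = 0.05   # absolute risk of purple polkadots, no intervention
treated_risk = 0.04    # absolute risk with 30 min/day of jumping

# Absolute change: a difference of risks (percentage points).
absolute_change = baseline_risk - treated_risk   # 0.01 -> "1 percentage point"

# Relative change: the same difference, expressed as a fraction of baseline.
relative_change = absolute_change / baseline_risk   # 0.2 -> "20% lower risk!"

print(f"Absolute risk reduction: {absolute_change:.1%}")
print(f"Relative risk reduction: {relative_change:.0%}")
```

Same underlying data, but "20% lower" sounds far more dramatic than "1 percentage point lower," which is exactly the rhetorical lever being pulled.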

(2) The Language of Relative Risk

Related to the use of relative risk above, there are a couple of ways to present the change in relative risk between two statistical factors or anomalies. Not only can you use absolute risk or relative risk to help maximize or minimize your results, but you can also use a tricky short word, “by”, to give your statistic another boost. What, for instance, is the difference between these two sentences?

a) Our relative risk was 100% of the original risk.

b) Our relative risk increased by 100%.

The difference between (a) and (b) is 100%. If in (a) a risk is 100% of the original risk, it’s the same as the original risk. It’s equal. 100% = 1. 100% of something is the same as the original something. On the other hand, if in (b) a risk increases **by** 100%, you add 100% of the original on top of it: the result is 200% of the original. The risk has doubled.
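The "of" versus "by" distinction can be sketched in a couple of lines of Python (the 5% starting risk here is just an illustrative number):

```python
# Hypothetical starting risk for illustration.
original_risk = 0.05

# (a) "100% OF the original risk" -> multiply: the risk is unchanged.
risk_a = 1.00 * original_risk                    # 0.05, same as before

# (b) "increased BY 100%" -> add 100% of the original on top: doubled.
risk_b = original_risk + 1.00 * original_risk    # 0.10, twice the original
```

One short preposition, and the reported risk silently doubles.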

There are many other ways to cook the numbers: to make them seem more or less important than they are, or to use statistical methods to steer the conversation toward one goal or another. I’ll try to delve into them as much as I can. But know that I believe the public trust is being abused by these kinds of tricks in any reporting aimed at a public audience. I think we should standardize on one method of reporting risks (I prefer absolute risk if possible) and one method of reporting uncertainty (one that makes the uncertainty clear in context with the risk and the population – I like both error bars and NNT, the number needed to treat). If we can standardize, we should, because it’s unreasonable to ask consumers of the data to know as much about statistics as we do in order to make sense of the reporting.
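For readers unfamiliar with NNT: it's simply the reciprocal of the absolute risk reduction, and it answers "how many people must receive the intervention for one of them to avoid the outcome?" A sketch, reusing the earlier hypothetical numbers:

```python
# NNT (number needed to treat) = 1 / absolute risk reduction.
# Hypothetical numbers from the polkadot example above.
baseline_risk = 0.05
treated_risk = 0.04

arr = baseline_risk - treated_risk   # absolute risk reduction: 0.01
nnt = 1 / arr                        # ~100 people jump for 30 min/day
                                     # so that one avoids polkadots
```

"100 people have to do this for one of them to benefit" keeps the population and the effect size in view together, which is why I like it as a reporting standard.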

My primary goal when looking at the statistics from various articles about health is to convert whatever their method of statistical reporting is to a common framework so we can all get our heads around the numbers.

Most of these links will be my own attempts to get a handle on the statistics I encounter in studies. At the very least I will attempt to provide URL links to the statistics sources I used.

- Gin, M. (2011, September 6). CVD Incidence and Prevalence. Google Docs. Retrieved from https://docs.google.com/spreadsheet/ccc?key=0AswFSN523g2acHhxN2lEUmk3dDJ2enNBWmk1M1kyU2c&hl=en_US#gid=2