Today's post is about statistics, again.
That's because I discovered a really interesting example of how statistics can be misused if they are not understood correctly.
The quotation my follower -DO- used, "Glaube nie einer Statistik, die du nicht selbst gefälscht hast", which translates as "Never believe a statistic that you did not fake yourself", or simply "Lies, damned lies, and statistics",
sums it up nicely.
The German tax office uses the chi-square distribution to uncover irregularities in the billing of companies.
What they do is count how often each of the digits 0 to 9 appears at each position of an amount listed on a bill.
The result they get is the deviation of those counts from what they should be (they expect each digit to come up equally often).
And if the deviation is too high, they assume that the bill has been rigged.
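To make the idea a bit more concrete, here is a minimal sketch of such a digit check in Python. I don't know the exact procedure the tax office uses, so treat everything here as an assumption: the function names, the choice of the last digit, the 5% significance level and the sample amounts are all made up for illustration. The statistic itself is the usual sum of (observed - expected)² / expected over the ten digits.

```python
from collections import Counter

# Critical value of the chi-square distribution with 9 degrees of freedom
# at the 5% significance level (standard table value). The 5% level is an
# assumption; a real audit may use a different threshold.
CRITICAL_5_PERCENT_DF9 = 16.919


def chi_square_uniform_digits(digits):
    """Chi-square statistic of `digits` against a uniform 0-9 distribution."""
    counts = Counter(digits)
    expected = len(digits) / 10  # every digit is expected equally often
    return sum((counts[d] - expected) ** 2 / expected for d in range(10))


def last_digit(amount_in_cents):
    """Last digit of an amount given in whole cents (hypothetical helper)."""
    return amount_in_cents % 10


# Made-up amounts in cents whose last digits are perfectly spread out.
amounts = [1234, 5671, 8902, 4413, 7785, 3356, 9927, 2248, 6619, 1180]
stat = chi_square_uniform_digits([last_digit(a) for a in amounts])

# Made-up extreme case: every amount ends in the digit 0.
stat_all_zero = chi_square_uniform_digits([0] * 10)

print(stat, stat > CRITICAL_5_PERCENT_DF9)
print(stat_all_zero, stat_all_zero > CRITICAL_5_PERCENT_DF9)
```

The first sample gives a statistic of 0 because every digit shows up exactly once. The second one lands far above the threshold, which is the situation in which a bill would be flagged.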
What they often forget is that this deviation only tells you that the values aren't random but have some cause. That cause can be a lot of things, and by no means does it have to be that the bill was manipulated.
Non-random values are common, especially in smaller firms or businesses with an uncommon pricing strategy, where prices may only ever end in .00 or .50, which naturally makes .00 and .50 much more likely than any other ending.
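Here is a small sketch of that effect in Python. The shop and its prices are entirely hypothetical, and the 5% significance level is my own assumption: 100 perfectly honest amounts whose cents are always 00 or 50, checked at the tens-of-cents position against the expectation that all ten digits appear equally often.

```python
from collections import Counter

# Critical value of the chi-square distribution with 9 degrees of freedom
# at the 5% significance level (standard table value, assumed threshold).
CRITICAL_5_PERCENT_DF9 = 16.919


def chi_square_uniform_digits(digits):
    """Chi-square statistic of `digits` against a uniform 0-9 distribution."""
    counts = Counter(digits)
    expected = len(digits) / 10  # every digit is expected equally often
    return sum((counts[d] - expected) ** 2 / expected for d in range(10))


# Hypothetical honest shop: the euro part varies from 1 to 50,
# the cents are always either 00 or 50.
amounts_in_cents = [
    euros * 100 + cents for euros in range(1, 51) for cents in (0, 50)
]

# Digit at the tens-of-cents position (the 5 in 12.50).
tens_of_cents = [(amount // 10) % 10 for amount in amounts_in_cents]

stat = chi_square_uniform_digits(tens_of_cents)
flagged = stat > CRITICAL_5_PERCENT_DF9

print(stat, flagged)
```

Only the digits 0 and 5 ever occur at that position, so the statistic is enormous and the bills get flagged, although nobody manipulated anything.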
But of course, once they have invested the time in applying the almighty chi-square distribution, there must be something wrong, and so you'll get problems with the tax office just because they don't know what they are doing.
Because there are no interesting YouTube clips other than explanations of chi-square, the daily video is a little smooth jazz from the fabulous Avishai Cohen Trio. Enjoy!