
> because not enough people were using the products... because they couldn't.

I don't think this is off topic at all. I think it is explicitly on topic, at least for the underlying one. It's not just that statistics are hard: it's hard to measure things and even harder to determine causality, which is often the underlying goal of statistics and data science, to find out why things happen. Measurements are incredibly difficult and people often think they are simple. The problem is that whatever you're measuring is always a proxy and has uncertainty, often uncertainty you won't know about if you don't have a good understanding of what the metric means. You'll always reap the rewards when you put in the hard work to understand this, but unfortunately if you don't, it can take time before the cracks start to show. I think this asymmetry is often why people get sloppy.



The example I like to use is the confusion around COVID statistics, and how people misinterpreted them.

For example, the rate of infections (or deaths) per day that was reported regularly in the news is actually: true rate of infections * measurement accuracy * rate of measurement.

I.e.:

If more people turn up to be tested, the "rate" would go up.

If the PCR tests improved, the "rate" would go up.
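
A minimal sketch of that product in Python, to show how the reported number moves while the real one stays put. The function name, parameter names, and all the numbers are made up for illustration, and the simple multiplicative model is an assumption, not how any health agency actually computes its figures.

```python
def reported_rate(true_infections_per_day, measurement_accuracy, fraction_tested):
    """Reported daily cases under the simple multiplicative model above.

    All three inputs are hypothetical:
      true_infections_per_day: the real (unobservable) number of new infections
      measurement_accuracy:    chance a test detects an infected person (0..1)
      fraction_tested:         share of infected people who get tested (0..1)
    """
    return true_infections_per_day * measurement_accuracy * fraction_tested


# The true infection rate is held constant at 1000 per day in every scenario.
baseline = reported_rate(1000, 0.5, 0.2)
more_testing = reported_rate(1000, 0.5, 0.4)   # more people turn up to be tested
better_tests = reported_rate(1000, 0.8, 0.2)   # the tests improved

print(f"baseline:     {baseline:.0f}")      # baseline:     100
print(f"more testing: {more_testing:.0f}")  # more testing: 200
print(f"better tests: {better_tests:.0f}")  # better tests: 160
```

In all three cases the same number of people got infected, yet the reported "rate" differs by up to a factor of two.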

A similar thing applies to hospitalisations and deaths. The numbers might go up because one strain is more lethal than another, or because more people are infected with the same strain, or because more deaths are attributed to COVID instead of something else.

It doesn't help that different countries have different reporting standards, or that reporting standards changed over time due to the circumstances!

Etc...

It's complicated!



