Answer:
It is true.
If you start with a spreadsheet of asthma rates, the story might be “people living near factories have higher asthma rates than others.” Or it might not be, because you could be misinterpreting the data in some way. Working with data means selecting and obtaining the relevant data, finding the interesting facts or patterns, putting them in context, and explaining what they mean. There are many ways you might accidentally misinterpret your data. You could choose the wrong data to answer your question, or you might not fully understand how the data was collected and what its limitations are. You could also believe you see a pattern that is really just a coincidence: something so likely to turn up by chance that it would be misleading to present it as fact.
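One way to check whether a pattern could be a coincidence is a permutation test: shuffle the group labels many times and count how often chance alone produces a gap as large as the one observed. The sketch below is only an illustration. The two small samples are made-up data, and the function name `permutation_p_value` is a hypothetical helper, not part of any library.

```python
import random
from fractions import Fraction


def permutation_p_value(group_a, group_b, trials=10_000, seed=0):
    """Share of random relabelings whose gap in rates is at least the observed gap."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    n_a, n_b = len(group_a), len(group_b)
    # Fractions keep the comparison exact (no floating-point rounding).
    observed = abs(Fraction(sum(group_a), n_a) - Fraction(sum(group_b), n_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)  # pretend the group labels carry no information
        a, b = pooled[:n_a], pooled[n_a:]
        gap = abs(Fraction(sum(a), n_a) - Fraction(sum(b), n_b))
        if gap >= observed:
            hits += 1
    return hits / trials


# Hypothetical survey: 1 = has asthma, 0 = does not.
near_factory = [1, 1, 0, 1, 1, 0, 1, 0, 1, 0]      # 6 of 10
far_from_factory = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]  # 3 of 10

p = permutation_p_value(near_factory, far_from_factory)
print(f"share of shuffles with a gap at least this large: {p:.2f}")
```

With groups this small, a gap that looks striking (60% versus 30%) turns up in a substantial share of the shuffles, so it would be misleading to present it as fact without more data.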
Data doesn’t just come from thin air. It’s collected by specific people, or machines, for a specific purpose. There may also be people who have a financial or political interest in the numbers. You must understand the data generation process and the types of errors it is likely to introduce. Before you start working on the data, ask yourself the following questions:
• Where do these numbers come from?
• Who recorded them?
• How?
• For what purpose was this data collected?
• How do we know it is complete?
• What are the demographics?
• Is this the right way to quantify this issue?
• Who is not included in these figures?
• Who is going to look bad or lose money as a result of these numbers?
• Is the data consistent from day to day, or when collected by different people?
• What arbitrary choices had to be made to generate the data?
• Is the data consistent with other sources? Who has already analyzed it?
• Does it have known flaws? Are there multiple versions?
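A few of these questions (completeness, duplicate entries, who recorded what) can be partly checked mechanically before any analysis begins. The following is a minimal sketch under stated assumptions: the records, the field names such as `recorded_by`, and the helper `audit` are all hypothetical, and a real dataset would need checks tailored to how it was collected.

```python
from collections import Counter


def audit(rows, required_fields):
    """Return simple completeness and consistency counts for a list of dict records."""
    # Count empty or absent values per required field.
    missing = Counter()
    for row in rows:
        for field in required_fields:
            value = row.get(field)
            if value is None or str(value).strip() == "":
                missing[field] += 1
    # Exact duplicate records often point to double entry or a bad merge.
    seen = Counter(tuple(sorted(row.items())) for row in rows)
    duplicates = sum(count - 1 for count in seen.values() if count > 1)
    # Records per collector: a lopsided split can hint at inconsistent collection.
    per_recorder = Counter(row.get("recorded_by") or "(unknown)" for row in rows)
    return {
        "rows": len(rows),
        "missing": dict(missing),
        "duplicates": duplicates,
        "per_recorder": dict(per_recorder),
    }


# Hypothetical records, invented purely for illustration.
rows = [
    {"town": "A", "date": "2020-01-01", "cases": "12", "recorded_by": "clinic_1"},
    {"town": "A", "date": "2020-01-01", "cases": "12", "recorded_by": "clinic_1"},
    {"town": "B", "date": "2020-01-01", "cases": "", "recorded_by": "clinic_2"},
    {"town": "C", "date": "2020-01-02", "cases": "7", "recorded_by": None},
]

report = audit(rows, ["town", "date", "cases", "recorded_by"])
print(report)
```

Checks like these only flag candidates for follow-up. Questions about purpose, excluded groups, and who gains or loses from the numbers still require talking to the people who produced the data.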