Yesterday I read the book “Advice for a Young Investigator”. The author says that scientific research proceeds in three steps: observation, hypothesis, and experiment. First, we observe our target, such as stars, animals, or people. Then we propose hypotheses about why certain facts occur and how our targets behave. Last but not least, we conduct experiments to prove or disprove our hypotheses. No matter which science we study, the process is similar.
Interestingly, the author points out that a hypothesis is useless if it cannot be proved or disproved. Although this idea is quite simple, it struck me. Sometimes I feel that my hypotheses are hard to prove, and that annoys me a lot. For example, I study the ranking of web pages and the rating of movies. I think there should be other ways to rank these things; however, ranking methods are hard to evaluate. The fact is that anybody can easily propose a new way to rank, but nobody can evaluate it. Maybe nature always has an absolutely correct answer, but human behavior does not. This problem has bothered me for a long time, but it is also what makes human-generated data interesting.
Another useful piece of advice from this book is to find problems through the analysis of real data. It’s a simple idea again, but it is not that obvious in computer science. In computer science, researchers can easily generate synthetic data with simulators. Sometimes researchers lack real data, such as real-time traffic flow, and can only generate it with their simulators. As a result, they may sometimes claim a new problem that comes only from their own imagination. I have done similar things several times, too. In the end, I cannot be sure what impact my research has on the world, even if my experiments show that my idea works. Some researchers cite many examples to convince us that someday another study will use our results, and that eventually many researchers’ efforts will combine into one big success. I do believe these stories from the history of science, but sometimes I suspect that some studies, including mine, are not useful pieces of any success.