Christina's LIS Rant: Using Google search results as a measure of “fame”

Christina's LIS Rant

Wednesday, April 28, 2004

Using Google search results as a measure of “fame”

Found via Physics Web, 27 April 2004:
"Fame in science is different to fame in other areas of life according to physicists at Clarkson University in the US. Daniel ben-Avraham and colleagues have shown that the fame of a scientist - as measured by the number of hits on Google - is directly proportional to their merit as measured by the number of research papers they have published. Such a relationship is not found for other groups such as sportsmen or actors (J P Bagrow et al. 2004 arXiv/cond-mat/0404515)."

There are several large problems with this article. First, the authors’ definition of fame is “how well linked we are in… the World Wide Web.” Yet their measurement of how well linked a person is relies only on the proprietary Google PageRank feature. If the Google algorithms were published, or relied solely on linking, then this might be justifiable. As it stands, however, one is never certain what tweaks have been made or why top sites have dropped out of the results (see discussion here). Also, there are many documented instances of errors in PageRank, as well as efforts to artificially inflate the rank of a given page (see, for example, this Search Engine Watch article).

Second, this “fame” is compared to achievement, which is then equated with merit. The model for this study used an obvious, less disputable metric: the number of enemy planes shot down by a fighter pilot. What metric is appropriate for measuring the merit or achievement of a physicist? The authors use the number of publications appearing on www.arxiv.org/cond-mat! They do nothing to account for cases in which the head of a research group appends his name to every paper written by his group. Also, wouldn’t it make more sense to weight articles eventually published in Nature or Science or other high-impact journals more heavily than never-published (and never-reviewed) eprints? Using the number of citations in cond-mat is circular reasoning because the first few hits in Google generally come from there.
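To make the weighting complaint concrete, here is a minimal Python sketch of how a raw paper count can diverge from a count adjusted for venue and for shared authorship. Everything in it is an assumption for illustration: the venue categories, the weight values, the `adjusted_count` helper, and the two made-up publication lists. None of it comes from the paper under discussion, and the weights are not real impact factors.

```python
# Hypothetical venue weights, chosen only for illustration.
# These are NOT real impact factors and are not taken from the paper.
VENUE_WEIGHTS = {
    "high_impact_journal": 3.0,
    "peer_reviewed_journal": 1.0,
    "eprint_only": 0.25,
}


def raw_count(papers):
    """Count every paper equally, regardless of venue or author list."""
    return len(papers)


def adjusted_count(papers, weights=VENUE_WEIGHTS):
    """Sum venue weight divided by number of authors for each paper.

    Each paper is a (venue, n_authors) tuple. Splitting credit by
    n_authors is one simple (assumed) way to discount a name that is
    appended to every paper a group produces.
    """
    return sum(weights[venue] / n_authors for venue, n_authors in papers)


# Made-up example: a group head listed on eight unreviewed eprints,
# each with four authors.
group_head = [("eprint_only", 4)] * 8

# Made-up example: a researcher with two reviewed papers and few co-authors.
small_author = [("high_impact_journal", 2), ("peer_reviewed_journal", 1)]

head_raw = raw_count(group_head)
head_adjusted = adjusted_count(group_head)
small_raw = raw_count(small_author)
small_adjusted = adjusted_count(small_author)
```

With these invented numbers the group head leads on raw count (8 versus 2) but trails on the adjusted count (0.5 versus 2.5), so the ranking flips depending on which definition of "merit" is chosen.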

The mathematics of the article are most likely correct, but the entire basis of the study is not. Bibliometrics of science is a fascinating subject, but there are many pitfalls to be avoided; it appears that these authors have hit them all.

¶ 11:15 AM