Christina's LIS Rant
Friday, August 29, 2008
  The meaning of citations
What a grand post title, but actually, what I mean is slightly more like: the meaning of citations: what Garfield said he means in a bunch of articles vs. what people say he means and even worse what people do with his work, plus some commentary on a review chapter.

Today I read the whole Nicolaisen[*] article which I just browsed earlier (ok, so it's been A LOT longer than I intended). This is not a review of how to *do* citation analysis; that's covered in the several ARIST chapters on bibliometrics and informetrics. Rather, this is a review of two streams of literature about citations: why do scientists cite (and theories about that) and, more weakly, one aspect of/model for/theory of how citation patterns "reflect the characteristics of science and scholarship" -- how citing patterns can be used to model science/knowledge... [**]

First, because I always run out of steam at the end, and because it's most important, what Garfield says vs. how his work is used.
L.C. Smith (1981, cited in *) provides these assumptions that underlie citation analysis:
1. Citation of a document implies use of that document by the citing author.
2. Citation of a document (author, journal, etc.) reflects the merit (quality, significance, impact) of that document (author, journal, etc.).
3. Citations are made to the best possible works.
4. A cited document is related in content to the citing document.
5. All citations are equal.
So there's this idea that there's a linear relationship between quality and number of citations (as evidenced by linear regressions used everywhere - also in a note in *). More citations mean better paper, mean better institution, mean more money. BUT, that's not what Garfield said:
A highly cited work is one that has been found useful by a relatively large number of people, or in a relatively large number of experiments. … The citation count of a particular piece of scientific work does not necessarily say anything about its elegance or its relative importance to the advancement of science or society.…The only responsible claim made for citation counts as an aid in evaluating individuals is that they provide a measure of the utility or impact of scientific work. They say nothing about the nature of the work, nothing about the reason for its utility or impact. (Garfield, 1979, p. 246, cited in *)
In fact, Nicolaisen elsewhere provided evidence for Bornstein's suggested J-shaped relationship between quality and citations. Utility could be to illustrate a point, and impact can be negative...

So back to the content of the review article. Why study citation analysis? Because it's used for (as Zunde said and Nicolaisen added to)
1. Qualitative and quantitative evaluation of scientists, publications, and scientific institutions
2. Modeling of the historical development of science and technology
3. Information search and retrieval
and Nicolaisen's addition (here I paraphrase, above I quote) 4. knowledge organization/mapping through bibliographic coupling and co-citation analysis
So it can be pretty important in the life of an individual scientist as well as in the success of institutions (particularly in certain European countries that allocate research funding this way).

But there isn't a cut and dried accepted theory of why people cite. Seems pretty obvious, right? Here are the ones that the author reviews
As for the symbolic nature of citations - this goes to the heart of using citations to map knowledge. What can we say about paper A because it cites B, or about A and C if they both cite B? This is Wouters' Reflexive Citation Theory: citations as indicators that provide a formal representation of science. But look, we don't know why the citation was useful to the author - maybe the context is, "what an idiot Pikas is, see for example Pikas (2008)." So according to the author, Wouters' theory can't handle that.
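To make the "A and C both cite B" question concrete, here is a minimal sketch of the two counts used for knowledge mapping: bibliographic coupling (two citing papers share a reference) and co-citation (two cited papers appear together in one reference list). The paper labels and the toy `cites` data are made up for illustration, and the function names are my own, not from Nicolaisen or any particular tool. Note that neither count knows anything about *why* a paper was cited, which is exactly the limitation discussed above.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each citing paper maps to the set of papers it cites.
cites = {
    "A": {"B", "D"},
    "C": {"B"},
    "E": {"B", "D"},
}

def bibliographic_coupling(cites):
    """For each pair of citing papers, count the references they share."""
    strength = Counter()
    for p, q in combinations(sorted(cites), 2):
        shared = len(cites[p] & cites[q])
        if shared:
            strength[(p, q)] = shared
    return strength

def co_citation(cites):
    """For each pair of cited papers, count the papers that cite both."""
    counts = Counter()
    for refs in cites.values():
        for pair in combinations(sorted(refs), 2):
            counts[pair] += 1
    return counts

coupling = bibliographic_coupling(cites)
cocited = co_citation(cites)
print(coupling[("A", "C")])  # A and C both cite B
print(cocited[("B", "D")])   # B and D are cited together by A and by E
```

A negative citation ("what an idiot Pikas is") adds to these counts just like a positive one; distinguishing them would need citation context, which this sketch does not attempt.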

An interesting (and now on my research questions list) application of all of this is to look at explicit link-love mentions in SCTs used by scientists or well, really anyone. This idea is mentioned in Efimova, L., Hendrick, S., & Anjewierden, A. (2005) but not explicitly researched.

[*]Nicolaisen, J. (2007). Citation Analysis. Annual Review of Information Science and Technology, 41, 609-641.
[**] I do appreciate that research blogging is supposed to make articles more clear not less clear but hopefully I'll get better with practice ;)
Interesting post. I work as a scientist in the private sector, so I no longer have to pay so much attention to what Thomson Scientific says about my numbers... However, it raised a couple of questions for me:
1) How do citation researchers take into account that some journals allow only a limited number of citations per statement that needs support (e.g. typically the original work, some later work expanding on the original work and the most recent work)?
2) The concept that an author who cites a paper has made use of that paper is sketchy at best. I can't count the number of times I have seen (the same) articles cited in various journals, by different authors, but with the exact same misspellings in title or author names. This suggests to me that they have never seen the actual article, but are merely copying the reference from another article and thus haven't actually read (used) it. Is that part of any classical citation theory?
Hi Great - yes, some known issues there. As for 1) that's absolutely a problem with the normative theory and the standard models that are used in citation analysis - I don't believe it is accounted for. I think in many cases, the original work is no longer cited because anyone reading the paper would take it for granted. There could be a study to see if journals with those policies have different citation patterns... but I don't know of one. There are articles discussing problems with peer review in which extra citations are added by reviewers (to the reviewer's semi-relevant work) - I think this might be covered by Nicolaisen's theory from evolutionary biology. As for 2) there are actually a couple of studies posted on arXiv showing how a typo (one so bad it would prevent you from retrieving the article) propagates. This is actually mentioned in the Nicolaisen article. The people who go totally toward citation/name-dropping for persuasion say this supports their model: it wasn't cited because it was used, but so that the author could bask in the glow. Others say that this is rare enough, and that we don't know for sure that the scientist didn't actually retrieve the article from elsewhere and just copied the citation itself (IOW - they read it, but got lazy only at the point of creating their reference list).
I would like to echo some of what greatdane said about restrictions on the number of permitted citations for a journal article. For example, my wife wrote a paper for a journal where the cap was 25, but she probably consulted about 50 papers. What stays? What goes?
I'd be interested in knowing if anybody has studied to what degree editorial restrictions affect citation analyses. How would you even begin to account for the number of papers consulted but not cited?

Christina, there's also the citation typing ontology: http://www.crossref.org/CrossTech/2009/03/citation_typing_ontology.html

This is my blog on library and information science. I'm into Sci/Tech libraries, special libraries, personal information management, sci/tech scholarly comms.... My name is Christina Pikas and I'm a librarian in a physics, astronomy, math, computer science, and engineering library.

