Some advice for the rank beginner in citation analysis, part 1
I keep realizing how much I have to learn in citation analysis and social network analysis. So I think that I have nothing to offer; yet, I do know a lot, and I've learned some lessons the hard way. I'll try to give back a little here because it will probably be good for me and it might help someone else.
This post is for librarians and other information professionals who might want to dabble in or dip a toe into citation analysis and who are a bit lost with all of the massive amounts of advice and help out there.
First, I'm talking about studying the structure that is created through the linking of people or works by co-authorship or citation (in some fashion). It's fairly straightforward to calculate someone's or some organization's h index (using two competing tools and some free things), and it's fairly straightforward to tally up citations, although it is nigh impossible to be comprehensive in just about any field. What is more complicated is building a network of relationships between authors and using this network to understand the collaborations and potential information flows. So that's what I'm talking about. You might use this within an organization to see the patterns of how people in one department write papers with people in another. You might use this to look at some sort of similarity based on who all cites the same paper. This sort of analysis is a value-added service that librarians and information professionals can provide for their organizations.
The data
Where do you get the data? Well, Web of Science (henceforth WoS) from Thomson Reuters is still the best choice. You will need a site license to this or the CD-ROMs because this is way too expensive through DIALOG. Yes, there are competitors, but many of the tools are built to work with WoS data. If you think you're going to find cleaner data, hah! Let me know how that works for you. OK, but it is still a bit dirty and it has those pesky known weaknesses: Western bias, journal articles only (so CS and some areas of engineering are underrepresented), and not intended to be comprehensive. Scopus, in my experience, has crap for data: at one point they messed up and marked a bunch of stuff from the 1980s as from 2008, and they think MPOW is in New Jersey (I mean really, have you not heard of our parent institution???). As for Google Scholar, there are some tools that use it, but we don't know how comprehensive it is (or what it covers), how far back it goes, how frequently it's updated...
Now, for co-authorship, you can really use just about anything or some combination, but we'll talk about that later.
What Software Do I Need?
Preparing the Data
Do your search in WoS. Mark the records and save the full record including citations as plain text. I believe it will only do 500 at a time, so you'll have to paste these files together (take off the preliminaries like FN, and the EF (end of file), for the middle records). You can do it in Sitkis, or you can concatenate at the command line, whatever works. Here's an example of a record (note they do NOT have my zip code and city right, and there's an extra letter in my e-mail, sigh):
FN ISI Export Format
AU Pikas, CK
TI Blog searching for competitive intelligence, brand image, and
C1 Johns Hopkins Univ, Appl Phys Lab, Baltimore, MD 21218 USA.
RP Pikas, CK, Johns Hopkins Univ, Appl Phys Lab, Baltimore, MD 21218 USA.
PU ONLINE INC
PA 213 DANBURY RD, WILTON, CT 06897-4007 USA
SC Computer Science, Information Systems; Information Science & Library
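If you'd rather script the pasting-together step described above than do it by hand, here is a minimal Python sketch. It is only an illustration, not what Sitkis does internally. It assumes each export file starts with file-level lines such as FN (and a VR version line, which is my assumption about the format), and ends with a line containing only EF. The function names and file names are made up for the example.

```python
from pathlib import Path

# File-level preliminaries to keep only once. FN comes from the export shown
# above; treating VR as a second preliminary line is an assumption.
HEADER_TAGS = ("FN ", "VR ")
END_TAG = "EF"  # end-of-file marker


def merge_exports(chunks):
    """Merge several export texts into one with a single header and a single EF."""
    merged = []
    for index, text in enumerate(chunks):
        for raw_line in text.splitlines():
            line = raw_line.lstrip("\ufeff")  # drop a byte-order mark if present
            if line.startswith(HEADER_TAGS):
                # Keep the preliminaries from the first file only.
                if index == 0:
                    merged.append(line)
                continue
            if line.strip() == END_TAG:
                # Drop every EF; one is added back at the very end.
                continue
            merged.append(line)
    merged.append(END_TAG)
    return "\n".join(merged) + "\n"


def merge_files(paths, out_path):
    """Read the export files in the given order and write the merged result."""
    chunks = [Path(p).read_text(encoding="utf-8") for p in paths]
    Path(out_path).write_text(merge_exports(chunks), encoding="utf-8")
```

You would call it with something like merge_files(["savedrecs1.txt", "savedrecs2.txt"], "all_records.txt"), where those file names are placeholders for whatever you saved from WoS. Check the encoding of your own exports before trusting the utf-8 default here.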
When you install Sitkis you'll get a manual and a user guide. The manual helps you with setting up and importing data. I followed the manual closely and didn't have any problems.
- Types of analysis
- exporting from Sitkis
- importing into UCInet or NetDraw (comes with UCInet)
- drawing pretty pictures
- some simple measures
- using a citation manager to get co-authorship data
- where to go for more information (and people who know more about this than I ever will)